What is rPref?
rPref is a package for the statistical computing language R that computes Skylines and some slight generalizations of them ("database preferences").
Skyline computation in rPref is efficient because all performance-critical algorithms are written in C++.
rPref is available on CRAN, so it can be downloaded, installed and loaded with:
install.packages("rPref")  # Install from CRAN
library(rPref)             # Load package
2016-09-01: rPref v1.1 is on CRAN now. The changes include:
- Plotting BTG graphs via plot_btg uses the Rgraphviz package if available.
- Lazy evaluation is supported, i.e., low_("mpg") as an alternative to low(mpg) via the lazyeval package.
- Preference objects can have associated data sets, e.g., high(mpg, df = mtcars) * high(hp) creates a preference associated with mtcars.
- Many changes in the implementation. rPref now uses lightweight R6 classes for preference objects. The C++ algorithms use shared pointers from C++11.
2016-01-19: With pypref 0.0.1 there exists a first Python port of rPref. The preference algorithms are written in Cython.
What is a Skyline?
The Skyline of a data set consists of the tuples that are Pareto-optimal with respect to given optimization goals. Only tuples that are not dominated by any other tuple are returned. A tuple dominates another tuple if it is at least as good in all relevant dimensions and strictly better in at least one dimension.
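The dominance test can be sketched in a few lines of base R. This is only a naive illustration, not the algorithm rPref uses internally: it treats a tuple as dominating another if it is no worse in every dimension and strictly better in at least one. The `products` data frame and the helper names `dominates` and `skyline` are hypothetical and chosen for this example only.

```r
# Hypothetical toy data: lower price is better, higher quality is better.
products <- data.frame(
  name    = c("A", "B", "C", "D", "E"),
  price   = c(10, 20, 15, 30, 20),
  quality = c(3, 5, 2, 5, 4)
)

# TRUE if tuple x dominates tuple y. Both are numeric vectors in which
# every dimension is oriented so that smaller values are better.
dominates <- function(x, y) {
  all(x <= y) && any(x < y)
}

# Naive O(n^2) Skyline: flag each row that no row dominates.
# A row never dominates itself, so it does not need to be excluded.
skyline <- function(m) {
  n <- nrow(m)
  vapply(seq_len(n), function(i) {
    !any(vapply(seq_len(n), function(j) dominates(m[j, ], m[i, ]), logical(1)))
  }, logical(1))
}

# Negate quality so that "smaller is better" holds in both columns.
scores <- cbind(products$price, -products$quality)
products[skyline(scores), ]
```

In this toy data, C is dominated by A (cheaper and of higher quality), while D and E are dominated by B, so only A and B remain in the Skyline.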
Hence, the computation of the Skyline is a powerful tool for prefiltering large data sets under given optimization goals. A typical example from economics is the search for products with low price and high quality. In this case one typically assumes that a product is not interesting if some other product is at least as good in both dimensions (price and quality) and strictly better in one of them. Thus, a Pareto query optimizing for low price and high quality returns only the potentially interesting products. See examples or the linked papers to get a better understanding of Skylines and database preferences.
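As a small illustration of such a Pareto query, the following sketch uses the `mtcars` data set that ships with R and asks for cars that are optimal with respect to high fuel economy (`mpg`) and high horsepower (`hp`). It assumes rPref is installed and that `psel`, `low`, `high` and the `*` operator behave as described in the package documentation. The choice of columns is just an example.

```r
library(rPref)

# high() builds a base preference for large values of a column;
# `*` combines two preferences into a Pareto preference.
pref <- high(mpg) * high(hp)

# psel() returns the rows of the data frame that are not dominated
# under the given preference, i.e. the Skyline.
sky <- psel(mtcars, pref)
sky[, c("mpg", "hp")]
```

A preference for low values is built analogously with `low()`, so a query for low price and high quality on a product table would use `low(price) * high(quality)`.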