
The rPref Package

Database Preferences and Skyline Computation in R

pypref ‐ a Python port of rPref

There is also a Python port of the rPref package, named "pypref" (Database Preferences and Skyline Computation in Python). The preference constructs (low, high, true and the usual complex preference operators) are very similar to those of rPref. The BNL (Block Nested Loop) algorithm for determining the optimal tuples is written in Cython, a compiled language with a Python-like syntax that makes writing C extensions for Python quite simple.
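To illustrate the idea behind a block nested loop, here is a minimal pure-Python sketch. It is not the Cython implementation used by pypref; the function names `dominates` and `bnl_skyline` and the sample values are assumptions made for this illustration. The sketch also keeps the whole window in memory, whereas the classic BNL algorithm works with a window of bounded size.

```python
def dominates(a, b):
    """Return True if tuple a Pareto-dominates tuple b (all dimensions maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def bnl_skyline(points):
    """Return the Pareto-optimal tuples using a simplified in-memory BNL scan."""
    window = []  # candidates not dominated by any tuple seen so far
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated by a candidate, so discard it
        # p survives: remove candidates that p dominates, then keep p
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


# illustrative (mpg, hp) pairs; both dimensions are to be maximized
cars = [(21.0, 110), (22.8, 93), (18.1, 105), (18.7, 175), (14.3, 245), (15.0, 150)]
skyline = bnl_skyline(cars)
print(skyline)
```

Each incoming tuple is compared only against the current candidates, so dominated tuples are dropped as early as possible.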

The pypref package covers only part of the rPref functionality. For example, grouped preferences are not supported at the moment. Note that the development state of pypref is "alpha": in contrast to rPref, there are currently no unit tests and the documentation is less detailed.

Download and install pypref

The current development version is available on GitHub.

Additionally, we provide a wheel file of the pypref package v0.0.1, including binaries for CPython 3.5 on 64-bit Windows: pypref-0.0.1-cp35-none-win_amd64.whl.

To install pypref with pip (which is included, for example, in the Anaconda Python 3.5 distribution), run

pip install pypref-0.0.1-cp35-none-win_amd64.whl

on the Windows command prompt. The downloaded wheel file must be in your working directory. After installing Anaconda, pip should be on your PATH.

To remove pypref, run pip uninstall pypref on the command prompt.
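A quick way to check from within Python whether the installation (or removal) took effect is to ask the import system whether it can locate the package. The helper name `is_installed` is an assumption made for this sketch; only the standard library is used.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name is found on the import path."""
    return importlib.util.find_spec(module_name) is not None


print("pypref installed:", is_installed("pypref"))
```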

Example 1 - Skyline plot

Now we revisit the first two use cases from the rPref example page, this time in pypref.

For the following code snippets we first have to load the pypref package and matplotlib (as a visualization interface for the Skyline plots):

import pypref as p
import matplotlib.pyplot as plt

We consider the mtcars data set from R, which is included in pypref as an example data set. We search for the optimal cars having a high horsepower and simultaneously a low fuel consumption (i.e., a high miles per gallon value).

In the following code snippet the optimal set of cars with respect to the preference "high horsepower and low fuel consumption" is calculated and the result is plotted.

# load mtcars data set given in pypref (motor trends data set from R)
mtcars = p.get_mtcars()

# preference for cars with minimal fuel consumption (high mpg value) and high power
pref = p.high("mpg") * p.high("hp")

def plot_skyline(dataset, pref):
   # plot all points
   plt.plot(dataset['mpg'], dataset['hp'], 'bo', fillstyle = "none")

   # select optimal cars according to this preference (Skyline)
   sky = pref.psel(dataset)

   # highlight Skyline
   plt.plot(sky['mpg'], sky['hp'], 'bo')

   # show plot
   plt.show()

plot_skyline(mtcars, pref)

The result of the visualization is:

Plot of the mpg and hp values of mtcars, where the Pareto-optimal points maximizing both dimensions are filled. This plot was generated using pypref and matplotlib.pyplot.

Example 2 - Level value plot

Again we consider the same preference and data set as in the previous example (i.e., we rely on the objects mtcars and pref from the example above). The Pareto-optimal set is defined as the Level-1 points. The Pareto optima of the remainder (the data set without the Level-1 points) are the Level-2 points. In general, the optimal points of the k-th remainder are the Level-(k+1) points. The level values of all tuples are retrieved by calling the psel function with the top parameter set to the number of tuples in the data set.
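The level definition can be made concrete with a small pure-Python sketch that repeatedly peels off the Pareto optima of the remainder. It does not use pypref; the function names `dominates` and `pareto_levels` and the sample points are assumptions made for this illustration.

```python
def dominates(a, b):
    """Return True if tuple a Pareto-dominates tuple b (all dimensions maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_levels(points):
    """Return the level value of every point, in the order of the input list."""
    remaining = list(range(len(points)))  # indices of the current remainder
    levels = {}
    level = 1
    while remaining:
        # Pareto optima of the remainder: dominated by no other remaining point
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining)]
        for i in front:
            levels[i] = level
        remaining = [i for i in remaining if i not in front]
        level += 1
    return [levels[i] for i in range(len(points))]


# illustrative two-dimensional points; both dimensions are to be maximized
points = [(3, 3), (2, 2), (1, 1), (1, 4), (2, 1)]
print(pareto_levels(points))
```

Because dominance is a strict partial order, every non-empty remainder has at least one optimum, so the loop always terminates.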

In the following code snippet we show the tuples of each level in a different color and plot the Pareto front line for each level.

def plot_levels(dataset, pref):

   # get level values for all tuples from the data set
   res = pref.psel(dataset, top = len(dataset))

   # plot each level front line in a different color
   for level in range(1, res['_level'].max() + 1):
     pts = res.loc[res['_level'] == level].sort_values("mpg")
     # 'o-' draws the markers and the connecting step line
     plt.step(pts['mpg'], pts['hp'], 'o-', label = "Level " + str(level))

   # show legend and plot
   plt.legend()
   plt.show()

# show level plot for data set and preference as given above
plot_levels(mtcars, pref)

This produces the following plot:

Plot of the mpg and hp values of mtcars using pypref, where the Pareto front line for each level is shown.

More examples are given on GitHub.