alpenglow package

Subpackages

Submodules

alpenglow.Getter module

class alpenglow.Getter.Getter[source]

Bases: object

Responsible for creating and managing C++ objects in the alpenglow.cpp package.

collect_ = False
items = []
class alpenglow.Getter.MetaGetter(a, b, c)[source]

Bases: type

Metaclass of alpenglow.Getter.Getter. Provides utilities for creating and managing C++ objects in the alpenglow.cpp package. For more information, see Memory management.

collect()[source]
get_and_clean()[source]
initialize_all(objects)[source]
run_self_test(i)[source]
set_experiment_environment(online_experiment, objects)[source]

alpenglow.OnlineExperiment module

class alpenglow.OnlineExperiment.OnlineExperiment(seed=254938879, top_k=100)[source]

Bases: alpenglow.ParameterDefaults.ParameterDefaults

This is the base class of every online experiment in Alpenglow. It builds the general experimental setup needed to run the online training and evaluation of a model. It also handles default parameters and the ability to override them when instantiating an experiment.

Subclasses should implement the config() method; see that method's documentation for details.

Online evaluation in Alpenglow is done by processing the data row-by-row and evaluating the model on each new record before providing the model with the new information.

_images/online.png

Evaluation is done by ranking the next item on the user’s toplist and saving the rank. If the item is not found in the top top_k items, the evaluation step returns NaN.
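
The evaluate-then-update order described above can be sketched in plain Python. This is a minimal illustration, not Alpenglow's implementation: the function online_evaluate and the popularity-based toy model are hypothetical, and the 1-based rank is an arbitrary choice made for the sketch.

```python
import math
from collections import Counter


def online_evaluate(records, top_k=100):
    """Evaluate-then-update loop over (user, item) records.

    Uses a toy popularity model (hypothetical, not an Alpenglow model):
    items are ranked by how often they have been seen so far.
    Returns one rank per record, or NaN if the item is outside the top_k.
    """
    counts = Counter()
    ranks = []
    for user, item in records:
        # 1) Evaluate: rank the incoming item using only past information.
        toplist = [i for i, _ in counts.most_common(top_k)]
        if item in toplist:
            # 1-based rank; this convention is an assumption of the sketch.
            ranks.append(float(toplist.index(item) + 1))
        else:
            ranks.append(math.nan)
        # 2) Update: only now does the model see the new record.
        counts[item] += 1
    return ranks


records = [("u1", "a"), ("u2", "a"), ("u1", "b"), ("u3", "a")]
ranks = online_evaluate(records, top_k=1)
```

The first record of each item yields NaN here because the model has not seen that item before it is asked to rank it.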

For a brief tutorial on using this class, see Five minute tutorial.

Parameters:
  • seed (int) – The seed used to initialize the RNGs. Should not be 0.
  • top_k (int) – The length of the toplists.
get_predictions()[source]

If the calculate_toplists parameter is set when calling run, this method can be used to acquire the generated toplists.

Returns:DataFrame containing the columns record_id, time, user, item, rank and prediction.
  • record_id is the index of the record being evaluated in the input DataFrame. Generally, there are top_k rows with the same record_id.
  • time is the time of the evaluation
  • user is the user the toplist is generated for
  • item is the item at position rank on the toplist
  • prediction is the prediction given by the model for the (user, item) pair at the time of evaluation.
Return type:pandas.DataFrame
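
To illustrate the layout described above, the following sketch builds a small DataFrame with the same columns by hand and regroups it into one toplist per record. All values are made up for illustration, and the 1-based rank is an assumption; real output comes from get_predictions() after a run.

```python
import pandas as pd

top_k = 2

# Hypothetical toplist output for two evaluated records; values are made up.
predictions = pd.DataFrame({
    "record_id":  [0, 0, 1, 1],
    "time":       [10, 10, 20, 20],
    "user":       [1, 1, 2, 2],
    "item":       [5, 7, 7, 3],
    "rank":       [1, 2, 1, 2],
    "prediction": [0.9, 0.4, 0.8, 0.1],
})

# Collect each record's toplist as an ordered list of items.
toplists = (
    predictions.sort_values(["record_id", "rank"])
    .groupby("record_id")["item"]
    .apply(list)
)
```

Here toplists is a Series indexed by record_id, with each entry holding the items of that record's toplist in rank order.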
run(data, experimentType=None, columns={}, verbose=True, out_file=None, lookback=False, initialize_all=False, max_item=-1, max_user=-1, calculate_toplists=False)[source]
Parameters:
  • data (pandas.DataFrame or str) – The input data, see Five minute tutorial. If this parameter is a string, it has to be in the format specified by experimentType.
  • experimentType (str) – The format of the input file if data is a string.
  • columns (dict) – An optional mapping of the input DataFrame’s column names to the expected ones.
  • verbose (bool) – Whether to write information about the experiment while running.
  • out_file (str) – If set, the results of the experiment are also written to the file located at out_file.
  • lookback (bool) – If set to True, a user’s previously seen items are excluded from the toplist evaluation. The eval columns of the input data should be set accordingly.
  • calculate_toplists (bool or list) – Whether to actually compute the toplists or just the ranks (the latter is faster). It can be specified on a record-by-record basis, by giving a list of booleans as parameter. The calculated toplists can be acquired after the experiment’s end by using get_predictions.
Returns:

Description of return value

Return type:

bool
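
Since calculate_toplists also accepts a list of booleans, one way to limit the cost is to request toplists only for the final records. The helper below is hypothetical and assumes the list holds one entry per input record, in input order.

```python
def toplist_flags(n_records, last_n):
    """Return a list of booleans, True only for the final `last_n` records."""
    start = max(n_records - last_n, 0)
    return [i >= start for i in range(n_records)]


flags = toplist_flags(n_records=5, last_n=2)
# The list could then be passed to run() as calculate_toplists=flags
# (assuming one flag per record, as stated above).
```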

alpenglow.ParameterDefaults module

class alpenglow.ParameterDefaults.ParameterDefaults(**parameters)[source]

Bases: object

Base class of OnlineExperiment and OfflineModel, providing utilities for parameter defaults and overriding.

check_unused_parameters()[source]
parameter_default(name, value)[source]
parameter_defaults(**defaults)[source]
set_parameter(name, value)[source]
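
The methods above are listed without descriptions, so the following is only a sketch of how a defaults-with-overrides helper of this shape might behave. The class ParameterDefaultsSketch and every behaviour in it are assumptions made for illustration, not Alpenglow's actual implementation.

```python
class ParameterDefaultsSketch:
    """Illustrative stand-in; not Alpenglow's actual implementation."""

    def __init__(self, **parameters):
        self.parameters = parameters  # overrides supplied by the caller
        self.used = set()             # names that were actually looked up

    def set_parameter(self, name, value):
        self.parameters[name] = value

    def parameter_default(self, name, value):
        # Return the override if one was given, otherwise the default.
        self.used.add(name)
        return self.parameters.get(name, value)

    def parameter_defaults(self, **defaults):
        # Resolve several parameters at once.
        return {
            name: self.parameter_default(name, value)
            for name, value in defaults.items()
        }

    def check_unused_parameters(self):
        # Assumed behaviour: report overrides that no lookup ever consumed.
        return sorted(set(self.parameters) - self.used)


p = ParameterDefaultsSketch(learning_rate=0.1, typo_param=3)
resolved = p.parameter_defaults(learning_rate=0.05, dimension=10)
unused = p.check_unused_parameters()
```

In this sketch, an override that never matches a looked-up name (such as typo_param) shows up in the unused list, which is one plausible reason for a check_unused_parameters method to exist.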

Module contents