alpenglow.offline.models package

Submodules

alpenglow.offline.models.ALSFactorModel module

class alpenglow.offline.models.ALSFactorModel.ALSFactorModel(dimension=10, begin_min=-0.01, begin_max=0.01, number_of_iterations=3, regularization_lambda=0.0001, alpha=40, implicit=1)

Bases: alpenglow.offline.OfflineModel.OfflineModel

This class implements the well-known matrix factorization recommendation model [Koren2009] and trains it using ALS and iALS [Hu2008].

Parameters:

dimension (int) – The latent factor dimension of the factor model.
begin_min (double) – The factors are initialized randomly, sampling each element uniformly from the interval (begin_min, begin_max).
begin_max (double) – See begin_min.
number_of_iterations (int) – Number of times to optimize the user and the item factors for least squares.
regularization_lambda (double) – The coefficient for the L2 regularization term. See [Hu2008]. This number is multiplied by the number of non-zero elements of the user-item rating matrix before being used, to achieve similar magnitude to the one used in traditional SGD.
alpha (int) – The weight coefficient for positive samples in the error formula in the case of implicit factorization. See [Hu2008].
implicit (int) – Whether to treat the data as implicit (and optimize using iALS) or explicit (and optimize using ALS).
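
The parameters above can be illustrated with a minimal sketch. The helper names `init_factors`, `effective_lambda`, and `confidence` are hypothetical and not part of alpenglow. The confidence weight c = 1 + alpha * r is the form given in [Hu2008]; whether alpenglow applies alpha in exactly this way is an assumption.

```python
import random

def init_factors(n_rows, dimension=10, begin_min=-0.01, begin_max=0.01, seed=0):
    """Hypothetical helper: one factor vector per row, elements drawn uniformly."""
    rng = random.Random(seed)
    return [[rng.uniform(begin_min, begin_max) for _ in range(dimension)]
            for _ in range(n_rows)]

def effective_lambda(regularization_lambda, ratings):
    """Scale lambda by the number of non-zero entries of the rating matrix."""
    nonzero = sum(1 for value in ratings.values() if value != 0)
    return regularization_lambda * nonzero

def confidence(rating, alpha=40):
    """Confidence weight c_ui = 1 + alpha * r_ui as in [Hu2008] (assumed form)."""
    return 1 + alpha * rating

# Toy data: (user, item) -> implicit feedback value.
ratings = {(0, 0): 1, (0, 2): 1, (1, 1): 1}
user_factors = init_factors(n_rows=2)
lam = effective_lambda(0.0001, ratings)
```

With three non-zero entries, the default regularization_lambda of 0.0001 becomes roughly 0.0003 in this sketch.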

alpenglow.offline.models.AsymmetricFactorModel module

class alpenglow.offline.models.AsymmetricFactorModel.AsymmetricFactorModel(dimension=10, begin_min=-0.01, begin_max=0.01, learning_rate=0.05, regularization_rate=0.0, negative_rate=0, number_of_iterations=9)

Bases: alpenglow.offline.OfflineModel.OfflineModel

Implements the recommendation model introduced in [Paterek2007].

Parameters:

dimension (int) – The latent factor dimension of the factor model.
begin_min (double) – The factors are initialized randomly, sampling each element uniformly from the interval (begin_min, begin_max).
begin_max (double) – See begin_min.
learning_rate (double) – The learning rate used in the stochastic gradient descent updates.
regularization_rate (double) – The coefficient for the L2 regularization term.
negative_rate (int) – The number of negative samples generated after each update. Useful for implicit recommendation.
number_of_iterations (int) – Number of times to iterate over the training data.

alpenglow.offline.models.FactorModel module

class alpenglow.offline.models.FactorModel.FactorModel(dimension=10, begin_min=-0.01, begin_max=0.01, learning_rate=0.05, regularization_rate=0.0, negative_rate=0.0, number_of_iterations=9)

Bases: alpenglow.offline.OfflineModel.OfflineModel

This class implements the well-known matrix factorization recommendation model [Koren2009] and trains it via stochastic gradient descent. The model is able to train on implicit data using negative sample generation, see [X.He2016] and the negative_rate parameter.

Parameters:

dimension (int) – The latent factor dimension of the factor model.
begin_min (double) – The factors are initialized randomly, sampling each element uniformly from the interval (begin_min, begin_max).
begin_max (double) – See begin_min.
learning_rate (double) – The learning rate used in the stochastic gradient descent updates.
regularization_rate (double) – The coefficient for the L2 regularization term.
negative_rate (int) – The number of negative samples generated after each update. Useful for implicit recommendation.
number_of_iterations (int) – Number of times to iterate over the training data.
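
As a rough illustration of how learning_rate, regularization_rate, and negative_rate interact, the following sketch performs plain SGD on squared error with negative samples. It is not alpenglow's implementation: the function names `sgd_step` and `train_epoch` are hypothetical, positive samples are assumed to have target 1 and negatives target 0, and negatives are drawn uniformly from all items (possibly including items the user has seen).

```python
import random

def sgd_step(p, q, target, learning_rate=0.05, regularization_rate=0.0):
    """One in-place SGD update for a single (user, item) pair.

    Returns the prediction error measured before the update.
    """
    prediction = sum(pf * qf for pf, qf in zip(p, q))
    error = target - prediction
    for f in range(len(p)):
        pf, qf = p[f], q[f]  # keep old values so both updates use them
        p[f] += learning_rate * (error * qf - regularization_rate * pf)
        q[f] += learning_rate * (error * pf - regularization_rate * qf)
    return error

def train_epoch(positives, user_factors, item_factors,
                negative_rate=0, rng=None, **sgd_params):
    """One pass over the (user, item) pairs, adding negatives after each update."""
    rng = rng or random.Random(0)
    n_items = len(item_factors)
    for user, item in positives:
        sgd_step(user_factors[user], item_factors[item], 1.0, **sgd_params)
        for _ in range(negative_rate):
            negative = rng.randrange(n_items)  # assumed: uniform item sampling
            sgd_step(user_factors[user], item_factors[negative], 0.0, **sgd_params)
```

In this sketch, number_of_iterations would correspond to how many times `train_epoch` is called on the same training data.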

alpenglow.offline.models.NearestNeighborModel module

class alpenglow.offline.models.NearestNeighborModel.NearestNeighborModel(num_of_neighbors=10)

Bases: alpenglow.offline.OfflineModel.OfflineModel

One of the earliest and most popular collaborative filtering algorithms in practice is the item-based nearest neighbor method [Sarwar2001]. These algorithms compute similarity scores between item pairs based on how often the pair co-occurs in the preferences of users. Non-stationarity of the data can be accounted for, for example, by introducing a time decay [Ding2005].

More formally, let $U_i$ denote the set of users that visited item $i$, $I_u$ the set of items visited by user $u$, and $s_{ui}$ the index of item $i$ in the sequence of interactions of user $u$. The frequency-based similarity function is defined by $sim(j,i) = \frac{\sum_{u\in {U_j \cap U_i}} 1}{\left|U_j\right|}$. The score assigned to item $i$ for user $u$ is $score(u,i) = \sum_{j\in{I_u}} sim(j,i)$. The model is represented by the similarity scores, and only the most similar items are stored for each item. When the prediction scores are computed for a particular user, all items visited by that user are considered.

Parameters:

num_of_neighbors (int) – Number of most similar items that will be stored in the model.
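
The frequency-based similarity and the score defined above can be sketched in plain Python. This is an illustration of the formulas only, not alpenglow's implementation: the names `build_similarities` and `score` are hypothetical, an item is assumed not to be its own neighbor, and no time decay is applied.

```python
from collections import defaultdict

def build_similarities(interactions, num_of_neighbors=10):
    """Return {j: {i: sim(j, i)}}, keeping the most similar items per item j.

    interactions is an iterable of (user, item) pairs.
    """
    users_of = defaultdict(set)  # U_i: users that visited item i
    for user, item in interactions:
        users_of[item].add(user)

    model = {}
    for j, users_j in users_of.items():
        sims = {}
        for i, users_i in users_of.items():
            if i == j:
                continue  # assumption: skip self-similarity
            common = len(users_j & users_i)
            if common:
                sims[i] = common / len(users_j)  # sim(j, i)
        # Highest similarity first, ties broken by item id.
        top = sorted(sims.items(), key=lambda kv: (-kv[1], kv[0]))
        model[j] = dict(top[:num_of_neighbors])
    return model

def score(model, items_of_user, i):
    """score(u, i): sum of sim(j, i) over the items j visited by the user."""
    return sum(model.get(j, {}).get(i, 0.0) for j in items_of_user)

interactions = [("u1", "a"), ("u1", "b"), ("u2", "a"),
                ("u2", "b"), ("u2", "c"), ("u3", "a")]
model = build_similarities(interactions)
```

Note that the similarity is not symmetric: here sim(a, b) is 2/3 while sim(b, a) is 1, because the denominator is the number of users of the first item.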

alpenglow.offline.models.PopularityModel module

class alpenglow.offline.models.PopularityModel.PopularityModel

Bases: alpenglow.offline.OfflineModel.OfflineModel

Recommends the most popular item from the set of items.
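
A popularity baseline of this kind can be sketched as follows. The functions `fit_popularity` and `recommend` are hypothetical names for illustration, and excluding items the user has already seen is an assumption rather than documented behavior of this class.

```python
from collections import Counter

def fit_popularity(interactions):
    """Count how often each item occurs in the (user, item) pairs."""
    return Counter(item for _user, item in interactions)

def recommend(popularity, seen_items=(), k=1):
    """Return the k most frequent items that are not in seen_items."""
    seen = set(seen_items)
    # Most frequent first, ties broken by item id for determinism.
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _count in ranked if item not in seen][:k]

interactions = [("u1", "a"), ("u1", "b"), ("u2", "a"),
                ("u2", "b"), ("u2", "c"), ("u3", "a")]
popularity = fit_popularity(interactions)
```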

alpenglow.offline.models.SvdppModel module

class alpenglow.offline.models.SvdppModel.SvdppModel(dimension=10, begin_min=-0.01, begin_max=0.01, learning_rate=0.05, negative_rate=0.0, number_of_iterations=20, cumulative_item_updates=false)

Bases: alpenglow.offline.OfflineModel.OfflineModel

This class implements the SVD++ model [Koren2008]. The model is able to train on implicit data using negative sample generation; see [X.He2016] and the negative_rate parameter.

Parameters:

dimension (int) – The latent factor dimension of the factor model.
begin_min (double) – The factors are initialized randomly, sampling each element uniformly from the interval (begin_min, begin_max).
begin_max (double) – See begin_min.
learning_rate (double) – The learning rate used in the stochastic gradient descent updates.
negative_rate (int) – The number of negative samples generated after each update. Useful for implicit recommendation.
number_of_iterations (int) – Number of times to iterate over the training data.
cumulative_item_updates (boolean) – Cumulative item updates make the model faster but less accurate.

Module contents