spotlight

Deep recommender models using PyTorch.

3,017

421


Top Related Projects

recommenders

1,955

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.

implicit

3,662

Fast Python Collaborative Filtering for Implicit Feedback Datasets

lightfm

4,906

A Python implementation of LightFM, a hybrid recommendation algorithm.

RecBole

3,816

A unified, comprehensive and efficient recommendation library

Surprise

6,597

A Python scikit for building and analyzing recommender systems

Quick Overview

Spotlight is a deep learning recommender system library for Python. It provides factorization and sequence-based recommendation models for both explicit and implicit feedback datasets. The library is built on PyTorch, which allows GPU acceleration and straightforward model customization.

Pros

Easy to use API, similar to scikit-learn
Supports both explicit and implicit feedback datasets
Implements various recommendation models, including matrix factorization and sequential models
Built on PyTorch, enabling GPU acceleration and easy model customization

Cons

Limited documentation and examples compared to some other recommender libraries
Not as actively maintained as some alternatives (last commit was over a year ago)
Smaller community compared to more popular recommender system libraries
May require more manual tuning compared to some auto-ML recommender solutions

Code Examples

Creating and fitting a factorization model:

from spotlight.factorization.explicit import ExplicitFactorizationModel
from spotlight.interactions import Interactions

# Create interactions
interactions = Interactions(user_ids, item_ids, ratings)

# Create and fit the model
model = ExplicitFactorizationModel(n_iter=10)
model.fit(interactions)
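
Preparing the input arrays:

Interactions takes users and items as integer index arrays, so raw identifiers (strings, sparse numeric keys) have to be mapped to contiguous indices first. The following is a minimal sketch of that mapping with NumPy. The helper name `encode_ids` and the sample data are hypothetical and not part of Spotlight.

```python
import numpy as np


def encode_ids(raw_ids, offset=0):
    """Map arbitrary raw identifiers to contiguous int32 indices.

    Returns the index array and the sorted array of unique raw values,
    so that (index - offset) can be mapped back to its raw identifier.
    """
    uniques, codes = np.unique(np.asarray(raw_ids), return_inverse=True)
    return (codes + offset).astype(np.int32), uniques


# Hypothetical raw data: one entry per observed interaction.
raw_users = ["ann", "bob", "ann", "cat"]
raw_items = ["i9", "i3", "i3", "i7"]

user_ids, user_index = encode_ids(raw_users)
# Offsetting item indices by 1 is an assumption made here so that 0
# stays free for use as a padding value in sequence models.
item_ids, item_index = encode_ids(raw_items, offset=1)
```

The returned lookup arrays (`user_index`, `item_index`) allow predictions to be translated back to the original identifiers.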

Making predictions:

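# Predict ratings for user-item pairs
predictions = model.predict(user_ids, item_ids)

# Get top-10 recommendations for a user: with item_ids omitted,
# predict() scores every item for the given user
scores = model.predict(user_id)
recommendations = (-scores).argsort()[:10]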
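
Filtering out items a user has already seen:

Assuming a model returns one score per item for a given user, a top-k list that skips already-seen items can be derived by masking and sorting. This is a sketch only: `top_k_unseen` is a hypothetical helper, and the score vector below is a stand-in for real model output.

```python
import numpy as np


def top_k_unseen(scores, seen_item_ids, k=10):
    """Return the indices of the k highest-scoring items not yet seen."""
    scores = np.asarray(scores, dtype=np.float64).copy()
    # Seen items get the lowest possible score so they sort last.
    scores[np.asarray(seen_item_ids, dtype=np.int64)] = -np.inf
    # Never return more items than there are unseen candidates.
    k = min(k, int(np.isfinite(scores).sum()))
    return np.argsort(-scores)[:k]


# Stand-in for the per-item score vector of one user.
scores = np.array([0.1, 0.9, 0.4, 0.8, 0.3])
top = top_k_unseen(scores, seen_item_ids=[1], k=2)
```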

Using a sequence model:

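from spotlight.interactions import Interactions
from spotlight.sequence.implicit import ImplicitSequenceModel

# Create interactions with timestamps and convert them to sequences
interactions = Interactions(user_ids, item_ids, timestamps=timestamps)
sequences = interactions.to_sequence()

# Create and fit the sequence model
sequence_model = ImplicitSequenceModel(n_iter=10)
sequence_model.fit(sequences)

# Score every item as the next item given a user's history
# (user_history is an array of item ids in chronological order)
scores = sequence_model.predict(user_history)
recommendations = (-scores).argsort()[:10]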
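
What sequence input looks like:

Sequence models consume fixed-length item sequences rather than raw (user, item, timestamp) triples. The sketch below builds one zero-padded sequence per user, padded on the left and ordered by timestamp. It is a simplified illustration with a hypothetical helper name, not Spotlight's own conversion code, which may slice each history into several windows.

```python
import numpy as np


def to_padded_sequences(user_ids, item_ids, timestamps, max_len=5):
    """Build one left-padded sequence per user, ordered by timestamp.

    Keeps each user's most recent max_len items; 0 is the padding value,
    so real item ids are assumed to start at 1.
    """
    order = np.lexsort((timestamps, user_ids))  # sort by user, then time
    users = np.asarray(user_ids)[order]
    items = np.asarray(item_ids)[order]
    unique_users = np.unique(users)
    sequences = np.zeros((len(unique_users), max_len), dtype=np.int64)
    for row, user in enumerate(unique_users):
        history = items[users == user][-max_len:]
        sequences[row, max_len - len(history):] = history
    return unique_users, sequences


# Hypothetical interaction log for two users.
users, seqs = to_padded_sequences(
    user_ids=np.array([1, 2, 1, 1, 2]),
    item_ids=np.array([10, 20, 30, 40, 50]),
    timestamps=np.array([3, 1, 1, 2, 2]),
    max_len=4,
)
```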

Getting Started

To get started with Spotlight:

Install the library:

conda install -c maciejkula -c pytorch spotlight

Import and use the models:

from spotlight.factorization.implicit import ImplicitFactorizationModel
from spotlight.interactions import Interactions

# Prepare your data
interactions = Interactions(user_ids, item_ids, ratings)

# Create and train a model
model = ImplicitFactorizationModel(n_iter=10)
model.fit(interactions)

# Make predictions
predictions = model.predict(user_ids, item_ids)

Competitor Comparisons

recommenders

1,955

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.

Pros of TensorFlow Recommenders

Built on top of TensorFlow, offering seamless integration with the broader TensorFlow ecosystem
Provides pre-built models and layers specifically designed for recommendation tasks
Offers scalability for large-scale recommendation systems

Cons of TensorFlow Recommenders

Steeper learning curve, especially for those not familiar with TensorFlow
More complex setup and configuration compared to Spotlight's simplicity
Potentially overkill for smaller recommendation projects

Code Comparison

Spotlight:

from spotlight.factorization.explicit import ExplicitFactorizationModel

model = ExplicitFactorizationModel(n_iter=1)
model.fit(interactions)
predictions = model.predict(user_ids, item_ids)

TensorFlow Recommenders:

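import tensorflow as tf
import tensorflow_recommenders as tfrs

# Note: a tfrs.Model subclass must also implement compute_loss(); omitted here for brevity.
class MovieLensModel(tfrs.Model):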
    def __init__(self):
        super().__init__()
        self.ranking_model = tf.keras.Sequential([...])
        self.task = tfrs.tasks.Ranking(...)

    def call(self, features):
        return self.ranking_model(features)

model = MovieLensModel()
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(dataset, epochs=5)

implicit

3,662

Fast Python Collaborative Filtering for Implicit Feedback Datasets

Pros of Implicit

Faster performance, especially for large datasets
More comprehensive set of algorithms implemented
Better documentation and examples

Cons of Implicit

Less flexibility for custom model architectures
No built-in support for deep learning models
Limited to implicit feedback scenarios

Code Comparison

Implicit:

from implicit.als import AlternatingLeastSquares
model = AlternatingLeastSquares(factors=50)
model.fit(user_item_matrix)
recommendations = model.recommend(user_id, user_item_matrix[user_id])

Spotlight:

from spotlight.factorization.implicit import ImplicitFactorizationModel
model = ImplicitFactorizationModel(embedding_dim=50)
model.fit(interactions)
predictions = model.predict(user_ids, item_ids)

Both libraries offer easy-to-use interfaces for collaborative filtering, but Implicit focuses on traditional matrix factorization techniques, while Spotlight provides more flexibility for neural network-based approaches. Implicit is generally faster and more suitable for large-scale production environments, whereas Spotlight offers more customization options for researchers and those experimenting with novel architectures. The choice between the two depends on specific use cases, dataset sizes, and the need for customization versus performance.

lightfm

4,906

A Python implementation of LightFM, a hybrid recommendation algorithm.

Pros of LightFM

Supports both explicit and implicit feedback
Offers hybrid recommendation capabilities, combining content and collaborative filtering
More mature project with a larger user base and community support

Cons of LightFM

Less flexible in terms of model architecture compared to Spotlight
May require more manual feature engineering for optimal performance
Limited support for deep learning-based recommendation models

Code Comparison

LightFM:

from lightfm import LightFM
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(interactions, epochs=10)

Spotlight:

from spotlight.factorization.explicit import ExplicitFactorizationModel
model = ExplicitFactorizationModel(n_iter=10, learning_rate=0.01)
model.fit(interactions)

Both libraries offer concise and intuitive APIs for building recommendation models. LightFM's API is slightly more compact, while Spotlight provides more flexibility in model configuration.

Spotlight, being built on PyTorch, offers greater extensibility for custom model architectures and easier integration with deep learning workflows. However, LightFM's maturity and hybrid recommendation capabilities make it a solid choice for many practical applications.

RecBole

3,816

A unified, comprehensive and efficient recommendation library

Pros of RecBole

Offers a wider range of recommendation algorithms and models
Provides more comprehensive documentation and tutorials
Actively maintained with frequent updates and community support

Cons of RecBole

Steeper learning curve due to its extensive features
Potentially slower execution for simpler recommendation tasks
Requires more setup and configuration compared to Spotlight

Code Comparison

Spotlight example:

from spotlight.factorization.explicit import ExplicitFactorizationModel
model = ExplicitFactorizationModel(n_iter=10)
model.fit(interactions)
predictions = model.predict(user_ids, item_ids)

RecBole example:

from recbole.quick_start import run_recbole
# Loads the dataset, trains a BPR model, and evaluates it in one call
run_recbole(model='BPR', dataset='ml-100k')

Both libraries offer concise ways to create and train recommendation models, but RecBole provides more flexibility in model selection and dataset handling. Spotlight's API is simpler for basic tasks, while RecBole's approach allows for more customization and advanced features.

Surprise

6,597

A Python scikit for building and analyzing recommender systems

Pros of Surprise

Wider range of traditional recommendation algorithms (e.g., SVD, KNN, NMF)
Easier to use for beginners with a scikit-learn inspired API
Better documentation and examples for getting started quickly

Cons of Surprise

Limited support for deep learning-based recommendation models
Less flexibility for customizing model architectures
Slower performance for large-scale datasets compared to Spotlight

Code Comparison

Surprise example:

from surprise import SVD, Dataset
data = Dataset.load_builtin('ml-100k')
algo = SVD()
algo.fit(data.build_full_trainset())

Spotlight example:

from spotlight.datasets.movielens import get_movielens_dataset
from spotlight.factorization.explicit import ExplicitFactorizationModel
dataset = get_movielens_dataset(variant='100K')
model = ExplicitFactorizationModel()
model.fit(dataset)

Spotlight focuses on neural network-based models and provides a PyTorch backend, offering more flexibility for advanced users. Surprise, on the other hand, emphasizes traditional collaborative filtering algorithms and is more accessible for beginners. The choice between the two depends on the specific use case, dataset size, and the user's familiarity with deep learning concepts.


README

.. image:: docs/_static/img/spotlight.png

.. inclusion-marker-do-not-remove

.. image:: https://travis-ci.org/maciejkula/spotlight.svg?branch=master
   :target: https://travis-ci.org/maciejkula/spotlight

.. image:: https://ci.appveyor.com/api/projects/status/jq5e76a7a08ra2ji/branch/master?svg=true
   :target: https://ci.appveyor.com/project/maciejkula/spotlight/branch/master

.. image:: https://badges.gitter.im/gitterHQ/gitter.png
   :target: https://gitter.im/spotlight-recommendations/Lobby

.. image:: https://anaconda.org/maciejkula/spotlight/badges/version.svg
   :target: https://anaconda.org/maciejkula/spotlight

.. image:: https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat
   :target: https://maciejkula.github.io/spotlight/

.. image:: https://img.shields.io/badge/progress%20tracker-trello-brightgreen.svg
   :target: https://trello.com/b/G5iFgS1W/spotlight

Spotlight uses `PyTorch <http://pytorch.org/>`_ to build both deep and shallow recommender models. By providing building blocks for loss functions (various pointwise and pairwise ranking losses) and representations (shallow factorization representations, deep sequence models), along with utilities for fetching (or generating) recommendation datasets, it aims to be a tool for rapid exploration and prototyping of new recommender models.

See the `full documentation <https://maciejkula.github.io/spotlight/>`_ for details.

Installation
~~~~~~~~~~~~

.. code-block:: bash

   conda install -c maciejkula -c pytorch spotlight


Usage
~~~~~

Factorization models
====================

To fit an explicit feedback model on the MovieLens dataset:

.. code-block:: python

    from spotlight.cross_validation import random_train_test_split
    from spotlight.datasets.movielens import get_movielens_dataset
    from spotlight.evaluation import rmse_score
    from spotlight.factorization.explicit import ExplicitFactorizationModel

    dataset = get_movielens_dataset(variant='100K')

    train, test = random_train_test_split(dataset)

    model = ExplicitFactorizationModel(n_iter=1)
    model.fit(train)

    rmse = rmse_score(model, test)
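
For reference, the metric reported above is the root mean squared error between observed and predicted ratings. A minimal NumPy sketch of that formula, written independently of Spotlight (the function name `rmse` and the sample ratings are illustrative):

```python
import numpy as np


def rmse(true_ratings, predicted_ratings):
    """Root mean squared error between observed and predicted ratings."""
    true_ratings = np.asarray(true_ratings, dtype=np.float64)
    predicted_ratings = np.asarray(predicted_ratings, dtype=np.float64)
    return float(np.sqrt(np.mean((true_ratings - predicted_ratings) ** 2)))


# Errors are 1, 0 and 2, so the mean squared error is 5 / 3.
value = rmse([4.0, 3.0, 5.0], [3.0, 3.0, 3.0])
```

Lower is better; a value of 0 means every rating was predicted exactly.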



To fit an implicit ranking model with a BPR pairwise loss on the MovieLens dataset:

.. code-block:: python

    from spotlight.cross_validation import random_train_test_split
    from spotlight.datasets.movielens import get_movielens_dataset
    from spotlight.evaluation import mrr_score
    from spotlight.factorization.implicit import ImplicitFactorizationModel

    dataset = get_movielens_dataset(variant='100K')

    train, test = random_train_test_split(dataset)

    model = ImplicitFactorizationModel(n_iter=3,
                                       loss='bpr')
    model.fit(train)

    mrr = mrr_score(model, test)
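
A pairwise loss such as BPR compares the score of an observed (positive) item with the score of a sampled negative item and penalises pairs where the negative scores higher. The sketch below uses the commonly cited formulation, the mean of ``-log(sigmoid(positive - negative))``. It is an illustration in NumPy; Spotlight's own ``bpr`` loss may be implemented differently.

```python
import numpy as np


def bpr_loss(positive_scores, negative_scores):
    """Mean of -log(sigmoid(positive - negative)) over all pairs."""
    diff = (np.asarray(positive_scores, dtype=np.float64)
            - np.asarray(negative_scores, dtype=np.float64))
    # log(1 + exp(-diff)) equals -log(sigmoid(diff)); logaddexp keeps
    # the computation numerically stable for large |diff|.
    return float(np.mean(np.logaddexp(0.0, -diff)))


well_ranked = bpr_loss([5.0], [0.0])   # positive far above negative
mis_ranked = bpr_loss([0.0], [5.0])    # negative far above positive
tied = bpr_loss([0.0], [0.0])          # equal scores give log(2)
```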




Sequential models
=================

Recommendations can be seen as a sequence prediction task: given the items a user
has interacted with in the past, what will be the next item they will interact
with? Spotlight provides a range of models and utilities for fitting next item
recommendation models, including

- pooling models, as in `YouTube recommendations <https://pdfs.semanticscholar.org/bcdb/4da4a05f0e7bc17d1600f3a91a338cd7ffd3.pdf>`_,
- LSTM models, as in `Session-based recommendations... <https://arxiv.org/pdf/1511.06939>`_, and
- causal convolution models, as in `WaveNet <https://arxiv.org/pdf/1609.03499>`_.
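
What makes a convolution causal is that the output at step ``t`` depends only on inputs up to and including ``t``, so a next-item model never sees the future. A minimal single-channel sketch in NumPy (not Spotlight's implementation; ``causal_conv1d`` is a hypothetical name) achieves this by padding on the left only:

```python
import numpy as np


def causal_conv1d(x, kernel):
    """1-D causal convolution: output[t] depends only on x[:t + 1].

    The input is left-padded with len(kernel) - 1 zeros so the output
    has the same length as the input and never looks ahead.
    """
    x = np.asarray(x, dtype=np.float64)
    kernel = np.asarray(kernel, dtype=np.float64)
    padded = np.concatenate([np.zeros(len(kernel) - 1), x])
    # kernel[-1] weights the current step, kernel[0] the oldest one.
    return np.array([padded[t:t + len(kernel)] @ kernel
                     for t in range(len(x))])


out = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel=[1.0, 1.0])
# Changing the last input must leave all earlier outputs untouched.
out_changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel=[1.0, 1.0])
```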

.. code-block:: python

    from spotlight.cross_validation import user_based_train_test_split
    from spotlight.datasets.synthetic import generate_sequential
    from spotlight.evaluation import sequence_mrr_score
    from spotlight.sequence.implicit import ImplicitSequenceModel

    dataset = generate_sequential(num_users=100,
                                  num_items=1000,
                                  num_interactions=10000,
                                  concentration_parameter=0.01,
                                  order=3)

    train, test = user_based_train_test_split(dataset)

    train = train.to_sequence()
    test = test.to_sequence()

    model = ImplicitSequenceModel(n_iter=3,
                                  representation='cnn',
                                  loss='bpr')
    model.fit(train)

    mrr = sequence_mrr_score(model, test)
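
Mean reciprocal rank (MRR) rewards a model for placing the true next item near the top of its ranking: rank 1 scores 1, rank 2 scores 0.5, and so on. The following is a sketch of the metric itself, not of ``sequence_mrr_score``; function names are illustrative, and ties are resolved in favour of the target, which is an assumption.

```python
import numpy as np


def reciprocal_rank(scores, target_item):
    """1 / rank of target_item when items are sorted by descending score."""
    scores = np.asarray(scores, dtype=np.float64)
    # Rank 1 is best; only strictly higher scores push the target down.
    rank = 1 + int(np.sum(scores > scores[target_item]))
    return 1.0 / rank


def mean_reciprocal_rank(score_rows, targets):
    """Average reciprocal rank over several (scores, target) pairs."""
    return float(np.mean([reciprocal_rank(s, t)
                          for s, t in zip(score_rows, targets)]))


# Two test cases: the target is ranked 2nd in the first, 1st in the second.
mrr_value = mean_reciprocal_rank(
    score_rows=[[0.1, 0.9, 0.5], [0.7, 0.2, 0.3]],
    targets=[2, 0],
)
```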


  

Datasets
========

Spotlight offers a slew of popular datasets, including MovieLens 100K, 1M, 10M, and 20M.
It also incorporates utilities for creating synthetic datasets. For example, `generate_sequential`
generates a Markov-chain-derived interaction dataset, where the next item a user chooses is
a function of their previous interactions:

.. code-block:: python

    from spotlight.datasets.synthetic import generate_sequential

    # Concentration parameter governs how predictable the chain is;
    # order determines the order of the Markov chain.
    dataset = generate_sequential(num_users=100,
                                  num_items=1000,
                                  num_interactions=10000,
                                  concentration_parameter=0.01,
                                  order=3)
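
To make the role of the concentration parameter concrete, here is a simplified first-order sketch in NumPy. It is not ``generate_sequential`` itself: the function name, the use of a symmetric Dirichlet prior for each transition row, and the restriction to order 1 are all assumptions made for illustration. A small concentration yields peaked transition rows, so the next item is highly predictable from the current one.

```python
import numpy as np


def generate_markov_sequence(num_items, length, concentration, seed=0):
    """Sample an item sequence from a random first-order Markov chain.

    Each row of the transition matrix is drawn from a symmetric Dirichlet;
    a small concentration gives peaked rows, i.e. a predictable chain.
    """
    rng = np.random.default_rng(seed)
    transitions = rng.dirichlet(np.full(num_items, concentration),
                                size=num_items)
    sequence = [int(rng.integers(num_items))]
    for _ in range(length - 1):
        # The next item is drawn from the row of the current item.
        next_item = rng.choice(num_items, p=transitions[sequence[-1]])
        sequence.append(int(next_item))
    return np.array(sequence), transitions


sequence, transitions = generate_markov_sequence(
    num_items=5, length=20, concentration=0.1)
```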




Examples
~~~~~~~~

1. `Rating prediction on the MovieLens dataset <https://github.com/maciejkula/spotlight/tree/master/examples/movielens_explicit>`_.
2. `Using causal convolutions for sequence recommendations <https://github.com/maciejkula/spotlight/tree/master/examples/movielens_sequence>`_.
3. `Bloom embedding layers <https://github.com/maciejkula/spotlight/tree/master/examples/bloom_embeddings>`_.


How to cite
~~~~~~~~~~~

Please cite Spotlight if it helps your research. You can use the following BibTeX entry:

.. code-block::

   @misc{kula2017spotlight,
     title={Spotlight},
     author={Kula, Maciej},
     year={2017},
     publisher={GitHub},
     howpublished={\url{https://github.com/maciejkula/spotlight}},
   }


Contributing
~~~~~~~~~~~~

Spotlight is meant to be extensible: pull requests are welcome. Development progress is tracked on `Trello <https://trello.com/b/G5iFgS1W/spotlight>`_: have a look at the outstanding tickets to get an idea of what would be a useful contribution.

We accept implementations of new recommendation models into the Spotlight model zoo: if you've just published a paper describing your new model, or have an implementation of a model from the literature, make a PR!
