Top Related Projects
TensorFlow Recommenders is a library for building recommender system models using TensorFlow.
Fast Python Collaborative Filtering for Implicit Feedback Datasets
A Python implementation of LightFM, a hybrid recommendation algorithm.
A unified, comprehensive and efficient recommendation library
A Python scikit for building and analyzing recommender systems
Quick Overview
Spotlight is a deep learning recommender system library for Python. It provides a range of recommendation models, including factorization and sequence-based approaches, with a focus on implicit feedback datasets. The library is built on PyTorch, allowing for GPU acceleration and easy model customization.
Pros
- Easy to use API, similar to scikit-learn
- Supports both explicit and implicit feedback datasets
- Implements various recommendation models, including matrix factorization and sequential models
- Built on PyTorch, enabling GPU acceleration and easy model customization
Cons
- Limited documentation and examples compared to some other recommender libraries
- Not as actively maintained as some alternatives (last commit was over a year ago)
- Smaller community compared to more popular recommender system libraries
- May require more manual tuning compared to some auto-ML recommender solutions
Code Examples
- Creating and fitting a factorization model:
from spotlight.factorization.explicit import ExplicitFactorizationModel
from spotlight.interactions import Interactions
# Create interactions
interactions = Interactions(user_ids, item_ids, ratings)
# Create and fit the model
model = ExplicitFactorizationModel(n_iter=10)
model.fit(interactions)
- Making predictions:
# Predict ratings for user-item pairs
predictions = model.predict(user_ids, item_ids)
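# Get top-10 recommendations for a user: calling predict() with only a
# user id scores every item, and the scores are then ranked
scores = model.predict(user_id)
recommendations = (-scores).argsort()[:10]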
- Using a sequence model:
from spotlight.sequence.implicit import ImplicitSequenceModel
# Create interactions with timestamps and convert them to sequences
sequence = Interactions(user_ids, item_ids, timestamps=timestamps).to_sequence()
# Create and fit the sequence model
sequence_model = ImplicitSequenceModel(n_iter=10)
sequence_model.fit(sequence)
# Score every item as the possible next item, given a user's item history
scores = sequence_model.predict(user_history)
recommendations = (-scores).argsort()[:10]
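Turning a vector of per-item scores into a top-k list is a small ranking step. The following is a minimal sketch in plain Python; `top_k_items` is a hypothetical helper, not part of Spotlight, and `scores` stands in for whatever per-item scores a model returns.

```python
import heapq


def top_k_items(scores, k=10, exclude=()):
    """Return the ids of the k highest-scoring items, best first.

    scores[i] is the predicted score of item i. Ids listed in `exclude`
    (for example, items the user has already interacted with) are skipped.
    """
    excluded = set(exclude)
    candidates = (
        (score, item_id)
        for item_id, score in enumerate(scores)
        if item_id not in excluded
    )
    # nlargest orders by score first; ties are broken by the larger item id.
    best = heapq.nlargest(k, candidates)
    return [item_id for _, item_id in best]


scores = [0.1, 0.9, 0.4, 0.7, 0.2]
print(top_k_items(scores, k=3))               # [1, 3, 2]
print(top_k_items(scores, k=3, exclude=[1]))  # [3, 2, 4]
```

Excluding already-seen items is a common design choice for implicit-feedback recommenders, since the highest scores often belong to items the user has already consumed.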
Getting Started
To get started with Spotlight:
- Install the library:
conda install -c maciejkula -c pytorch spotlight
- Import and use the models:
from spotlight.factorization.implicit import ImplicitFactorizationModel
from spotlight.interactions import Interactions
# Prepare your data
interactions = Interactions(user_ids, item_ids, ratings)
# Create and train a model
model = ImplicitFactorizationModel(n_iter=10)
model.fit(interactions)
# Make predictions
predictions = model.predict(user_ids, item_ids)
Competitor Comparisons
TensorFlow Recommenders is a library for building recommender system models using TensorFlow.
Pros of TensorFlow Recommenders
- Built on top of TensorFlow, offering seamless integration with the broader TensorFlow ecosystem
- Provides pre-built models and layers specifically designed for recommendation tasks
- Offers scalability for large-scale recommendation systems
Cons of TensorFlow Recommenders
- Steeper learning curve, especially for those not familiar with TensorFlow
- More complex setup and configuration compared to Spotlight's simplicity
- Potentially overkill for smaller recommendation projects
Code Comparison
Spotlight:
from spotlight.factorization.explicit import ExplicitFactorizationModel
model = ExplicitFactorizationModel(n_iter=1)
model.fit(interactions)
predictions = model.predict(user_ids, item_ids)
TensorFlow Recommenders:
import tensorflow as tf
import tensorflow_recommenders as tfrs

class MovieLensModel(tfrs.Model):
    def __init__(self):
        super().__init__()
        self.ranking_model = tf.keras.Sequential([...])
        self.task = tfrs.tasks.Ranking(...)

    def call(self, features):
        return self.ranking_model(features)

    # A tfrs.Model subclass must also define compute_loss(); omitted here.

model = MovieLensModel()
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(dataset, epochs=5)
Fast Python Collaborative Filtering for Implicit Feedback Datasets
Pros of Implicit
- Faster performance, especially for large datasets
- More comprehensive set of algorithms implemented
- Better documentation and examples
Cons of Implicit
- Less flexibility for custom model architectures
- No built-in support for deep learning models
- Limited to implicit feedback scenarios
Code Comparison
Implicit:
from implicit.als import AlternatingLeastSquares
model = AlternatingLeastSquares(factors=50)
model.fit(user_item_matrix)
recommendations = model.recommend(user_id, user_item_matrix[user_id])
Spotlight:
from spotlight.factorization.implicit import ImplicitFactorizationModel
model = ImplicitFactorizationModel(embedding_dim=50)
model.fit(interactions)
predictions = model.predict(user_ids, item_ids)
Both libraries offer easy-to-use interfaces for collaborative filtering, but Implicit focuses on traditional matrix factorization techniques, while Spotlight provides more flexibility for neural network-based approaches. Implicit is generally faster and more suitable for large-scale production environments, whereas Spotlight offers more customization options for researchers and those experimenting with novel architectures. The choice between the two depends on specific use cases, dataset sizes, and the need for customization versus performance.
A Python implementation of LightFM, a hybrid recommendation algorithm.
Pros of LightFM
- Supports both explicit and implicit feedback
- Offers hybrid recommendation capabilities, combining content and collaborative filtering
- More mature project with a larger user base and community support
Cons of LightFM
- Less flexible in terms of model architecture compared to Spotlight
- May require more manual feature engineering for optimal performance
- Limited support for deep learning-based recommendation models
Code Comparison
LightFM:
from lightfm import LightFM
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(interactions, epochs=10)
Spotlight:
from spotlight.factorization.explicit import ExplicitFactorizationModel
model = ExplicitFactorizationModel(n_iter=10, learning_rate=0.01)
model.fit(interactions)
Both libraries offer concise and intuitive APIs for building recommendation models. LightFM's API is slightly more compact, while Spotlight provides more flexibility in model configuration.
Spotlight, being built on PyTorch, offers greater extensibility for custom model architectures and easier integration with deep learning workflows. However, LightFM's maturity and hybrid recommendation capabilities make it a solid choice for many practical applications.
A unified, comprehensive and efficient recommendation library
Pros of RecBole
- Offers a wider range of recommendation algorithms and models
- Provides more comprehensive documentation and tutorials
- Actively maintained with frequent updates and community support
Cons of RecBole
- Steeper learning curve due to its extensive features
- Potentially slower execution for simpler recommendation tasks
- Requires more setup and configuration compared to Spotlight
Code Comparison
Spotlight example:
from spotlight.factorization.explicit import ExplicitFactorizationModel
model = ExplicitFactorizationModel(n_iter=10)
model.fit(interactions)
predictions = model.predict(user_ids, item_ids)
RecBole example:
from recbole.quick_start import run_recbole
# Trains and evaluates a BPR model on the built-in ml-100k dataset
run_recbole(model='BPR', dataset='ml-100k')
Both libraries offer concise ways to create and train recommendation models, but RecBole provides more flexibility in model selection and dataset handling. Spotlight's API is simpler for basic tasks, while RecBole's approach allows for more customization and advanced features.
A Python scikit for building and analyzing recommender systems
Pros of Surprise
- Wider range of traditional recommendation algorithms (e.g., SVD, KNN, NMF)
- Easier to use for beginners with a scikit-learn inspired API
- Better documentation and examples for getting started quickly
Cons of Surprise
- Limited support for deep learning-based recommendation models
- Less flexibility for customizing model architectures
- Slower performance for large-scale datasets compared to Spotlight
Code Comparison
Surprise example:
from surprise import SVD, Dataset
data = Dataset.load_builtin('ml-100k')
algo = SVD()
algo.fit(data.build_full_trainset())
Spotlight example:
from spotlight.datasets.movielens import get_movielens_dataset
from spotlight.factorization.explicit import ExplicitFactorizationModel
dataset = get_movielens_dataset(variant='100K')
model = ExplicitFactorizationModel()
model.fit(dataset)
Spotlight focuses on neural network-based models and provides a PyTorch backend, offering more flexibility for advanced users. Surprise, on the other hand, emphasizes traditional collaborative filtering algorithms and is more accessible for beginners. The choice between the two depends on the specific use case, dataset size, and the user's familiarity with deep learning concepts.
README
.. image:: docs/_static/img/spotlight.png
.. inclusion-marker-do-not-remove
.. image:: https://travis-ci.org/maciejkula/spotlight.svg?branch=master
   :target: https://travis-ci.org/maciejkula/spotlight
.. image:: https://ci.appveyor.com/api/projects/status/jq5e76a7a08ra2ji/branch/master?svg=true
   :target: https://ci.appveyor.com/project/maciejkula/spotlight/branch/master
.. image:: https://badges.gitter.im/gitterHQ/gitter.png
   :target: https://gitter.im/spotlight-recommendations/Lobby
.. image:: https://anaconda.org/maciejkula/spotlight/badges/version.svg
   :target: https://anaconda.org/maciejkula/spotlight
.. image:: https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat
   :target: https://maciejkula.github.io/spotlight/
.. image:: https://img.shields.io/badge/progress%20tracker-trello-brightgreen.svg
   :target: https://trello.com/b/G5iFgS1W/spotlight
|
Spotlight uses `PyTorch <http://pytorch.org/>`_ to build both deep and shallow
recommender models. By providing building blocks for loss functions
(various pointwise and pairwise ranking losses) and representations (shallow
factorization representations, deep sequence models), along with utilities for
fetching (or generating) recommendation datasets, it aims to be a tool for rapid
exploration and prototyping of new recommender models.

See the `full documentation <https://maciejkula.github.io/spotlight/>`_ for details.
Installation
~~~~~~~~~~~~

.. code-block:: shell

   conda install -c maciejkula -c pytorch spotlight

Usage
~~~~~
Factorization models
====================
To fit an explicit feedback model on the MovieLens dataset:
.. code-block:: python

    from spotlight.cross_validation import random_train_test_split
    from spotlight.datasets.movielens import get_movielens_dataset
    from spotlight.evaluation import rmse_score
    from spotlight.factorization.explicit import ExplicitFactorizationModel

    dataset = get_movielens_dataset(variant='100K')

    train, test = random_train_test_split(dataset)

    model = ExplicitFactorizationModel(n_iter=1)
    model.fit(train)

    rmse = rmse_score(model, test)

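For readers unfamiliar with the metric, root mean squared error compares predicted ratings with held-out ratings. The sketch below shows the arithmetic in plain Python; the `rmse` function is a hypothetical stand-in written for illustration and is not Spotlight's `rmse_score`, which takes a fitted model and a test set instead of two lists.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equally long rating lists."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one rating is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Three held-out ratings and the corresponding model predictions.
print(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0]))
```

Lower values are better, and the result is in the same units as the ratings themselves.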
To fit an implicit ranking model with a BPR pairwise loss on the MovieLens dataset:
.. code-block:: python

    from spotlight.cross_validation import random_train_test_split
    from spotlight.datasets.movielens import get_movielens_dataset
    from spotlight.evaluation import mrr_score
    from spotlight.factorization.implicit import ImplicitFactorizationModel

    dataset = get_movielens_dataset(variant='100K')

    train, test = random_train_test_split(dataset)

    model = ImplicitFactorizationModel(n_iter=3,
                                       loss='bpr')
    model.fit(train)

    mrr = mrr_score(model, test)

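A pairwise loss such as BPR compares the score of an item the user interacted with against the score of a sampled item they did not. The sketch below implements the textbook BPR objective, the mean of -log(sigmoid(positive - negative)), in plain Python. It is an illustration only: `bpr_loss` here is a hypothetical function, and the loss Spotlight applies for `loss='bpr'` may differ in detail.

```python
import math


def bpr_loss(positive_scores, negative_scores):
    """Mean textbook BPR loss over paired positive and negative scores."""
    if len(positive_scores) != len(negative_scores):
        raise ValueError("score lists must have the same length")
    if not positive_scores:
        raise ValueError("at least one score pair is required")
    losses = []
    for pos, neg in zip(positive_scores, negative_scores):
        diff = pos - neg
        # -log(sigmoid(diff)) equals log(1 + exp(-diff)); the two branches
        # keep the exponent non-positive so exp() cannot overflow.
        if diff > 0:
            loss = math.log1p(math.exp(-diff))
        else:
            loss = -diff + math.log1p(math.exp(diff))
        losses.append(loss)
    return sum(losses) / len(losses)


# The loss shrinks as positives are scored further above negatives.
print(bpr_loss([2.0, 1.5], [0.5, 1.0]))
print(bpr_loss([0.5, 1.0], [2.0, 1.5]))
```

Because only score differences matter, the loss optimizes the relative ordering of items and not the absolute score values, which suits ranking metrics such as MRR.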
Sequential models
=================
Recommendations can be seen as a sequence prediction task: given the items a user
has interacted with in the past, what will be the next item they will interact
with? Spotlight provides a range of models and utilities for fitting next item
recommendation models, including
- pooling models, as in `YouTube recommendations <https://pdfs.semanticscholar.org/bcdb/4da4a05f0e7bc17d1600f3a91a338cd7ffd3.pdf>`_,
- LSTM models, as in `Session-based recommendations... <https://arxiv.org/pdf/1511.06939>`_, and
- causal convolution models, as in `WaveNet <https://arxiv.org/pdf/1609.03499>`_.
.. code-block:: python

    from spotlight.cross_validation import user_based_train_test_split
    from spotlight.datasets.synthetic import generate_sequential
    from spotlight.evaluation import sequence_mrr_score
    from spotlight.sequence.implicit import ImplicitSequenceModel

    dataset = generate_sequential(num_users=100,
                                  num_items=1000,
                                  num_interactions=10000,
                                  concentration_parameter=0.01,
                                  order=3)

    train, test = user_based_train_test_split(dataset)

    train = train.to_sequence()
    test = test.to_sequence()

    model = ImplicitSequenceModel(n_iter=3,
                                  representation='cnn',
                                  loss='bpr')
    model.fit(train)

    mrr = sequence_mrr_score(model, test)

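The next-item framing can be made concrete by turning one user's chronological history into (context, target) training pairs. The following is a minimal sketch under stated assumptions: `next_item_examples` is a hypothetical helper, item id 0 is assumed to be reserved for padding, and this is not a description of how `to_sequence` is implemented.

```python
def next_item_examples(history, max_length=5, pad=0):
    """Build (left-padded context, next item) pairs from one user's history.

    `history` is a chronologically ordered list of item ids. Each pair holds
    at most `max_length` preceding items and the item that followed them.
    """
    examples = []
    for i in range(1, len(history)):
        context = history[max(0, i - max_length):i]
        padded = [pad] * (max_length - len(context)) + context
        examples.append((padded, history[i]))
    return examples


for context, target in next_item_examples([3, 7, 2, 9], max_length=3):
    print(context, "->", target)
# [0, 0, 3] -> 7
# [0, 3, 7] -> 2
# [3, 7, 2] -> 9
```

Left padding keeps the most recent items at the end of every context, so models that read the sequence in order always see the latest interaction last.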
Datasets
========
Spotlight offers a slew of popular datasets, including MovieLens 100K, 1M, 10M, and 20M.
It also incorporates utilities for creating synthetic datasets. For example, `generate_sequential`
generates a Markov-chain-derived interaction dataset, where the next item a user chooses is
a function of their previous interactions:
.. code-block:: python

    from spotlight.datasets.synthetic import generate_sequential

    # Concentration parameter governs how predictable the chain is;
    # order determines the order of the Markov chain.
    dataset = generate_sequential(num_users=100,
                                  num_items=1000,
                                  num_interactions=10000,
                                  concentration_parameter=0.01,
                                  order=3)

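To build intuition for how a concentration parameter controls predictability, the sketch below generates one item sequence from a first-order Markov chain whose transition rows are drawn from a symmetric Dirichlet distribution. Everything here is an assumption made for illustration: `generate_chain` and `sample_transition_row` are hypothetical names, the chain is first order, not order 3, and this is not Spotlight's implementation of `generate_sequential`.

```python
import random


def sample_transition_row(num_items, concentration, rng):
    """Draw one row of transition probabilities from a symmetric Dirichlet."""
    # A Dirichlet sample is a vector of gamma draws divided by their sum.
    weights = [rng.gammavariate(concentration, 1.0) for _ in range(num_items)]
    total = sum(weights)
    if total == 0.0:
        # Extremely small draws can underflow to zero; fall back to uniform.
        return [1.0 / num_items] * num_items
    return [w / total for w in weights]


def generate_chain(num_items, length, concentration=0.05, seed=0):
    """Generate `length` item ids from a random first-order Markov chain."""
    if length <= 0:
        return []
    rng = random.Random(seed)
    transitions = [
        sample_transition_row(num_items, concentration, rng)
        for _ in range(num_items)
    ]
    items = list(range(num_items))
    sequence = [rng.randrange(num_items)]
    while len(sequence) < length:
        row = transitions[sequence[-1]]
        sequence.append(rng.choices(items, weights=row)[0])
    return sequence


print(generate_chain(num_items=20, length=15, concentration=0.05, seed=1))
```

Smaller concentration values put most of each row's probability mass on a few items, so the next item is easier to predict from the current one; larger values push the rows toward uniform.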
Examples
~~~~~~~~
1. `Rating prediction on the Movielens dataset <https://github.com/maciejkula/spotlight/tree/master/examples/movielens_explicit>`_.
2. `Using causal convolutions for sequence recommendations <https://github.com/maciejkula/spotlight/tree/master/examples/movielens_sequence>`_.
3. `Bloom embedding layers <https://github.com/maciejkula/spotlight/tree/master/examples/bloom_embeddings>`_.
How to cite
~~~~~~~~~~~
Please cite Spotlight if it helps your research. You can use the following BibTeX entry:
.. code-block::

   @misc{kula2017spotlight,
     title={Spotlight},
     author={Kula, Maciej},
     year={2017},
     publisher={GitHub},
     howpublished={\url{https://github.com/maciejkula/spotlight}},
   }

Contributing
~~~~~~~~~~~~
Spotlight is meant to be extensible: pull requests are welcome. Development progress is tracked on `Trello <https://trello.com/b/G5iFgS1W/spotlight>`_: have a look at the outstanding tickets to get an idea of what would be a useful contribution.
We accept implementations of new recommendation models into the Spotlight model zoo: if you've just published a paper describing your new model, or have an implementation of a model from the literature, make a PR!