lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.

Top Related Projects

spotlight (2,981 stars): Deep recommender models using PyTorch.
implicit (3,540 stars): Fast Python Collaborative Filtering for Implicit Feedback Datasets
recommenders (18,977 stars): Best Practices on Recommendation Systems
RecBole (3,326 stars): A unified, comprehensive and efficient recommendation library
DeepCTR (7,510 stars): Easy-to-use, modular and extendible package of deep-learning based CTR models.
Surprise (6,376 stars): A Python scikit for building and analyzing recommender systems

Quick Overview

LightFM is a Python implementation of a hybrid recommender system that combines collaborative filtering with content-based approaches. It's designed to handle both implicit and explicit feedback data, making it versatile for various recommendation tasks. LightFM is particularly useful for cold-start problems and can incorporate user and item metadata.

Pros

Handles both implicit and explicit feedback data
Incorporates user and item metadata for better recommendations
Efficient implementation with support for multi-core CPU training
Addresses cold-start problems effectively

Cons

Limited to matrix factorization-based models
Requires careful hyperparameter tuning for optimal performance
May not scale as well for extremely large datasets compared to some distributed systems
Documentation could be more comprehensive for advanced use cases
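
Because tuning matters this much, a random search is a common first step. The sketch below is one possible way to sample candidate settings. The parameter names match LightFM constructor arguments, but the ranges are illustrative assumptions, and `evaluate` is a hypothetical callback that you would write to train a model and return a validation metric (higher is better).

```python
import random


def sample_hyperparameters(rng):
    """Draw one candidate configuration (ranges are illustrative assumptions)."""
    return {
        "no_components": rng.randint(16, 64),
        "learning_rate": 10 ** rng.uniform(-3, -1),  # log-uniform, roughly 0.001 to 0.1
        "loss": rng.choice(["bpr", "warp", "warp-kos"]),
        "item_alpha": 10 ** rng.uniform(-8, -4),     # L2 penalty on item features
        "user_alpha": 10 ** rng.uniform(-8, -4),     # L2 penalty on user features
    }


def random_search(evaluate, n_trials=10, seed=0):
    """Return the best (score, params) pair found over n_trials random draws.

    `evaluate` is a hypothetical callback: it should train a model with the
    given parameters and return a validation metric where higher is better.
    """
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = sample_hyperparameters(rng)
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params
```

With LightFM, `evaluate` would typically build `LightFM(**params)`, fit it on the training interactions, and return the mean of `precision_at_k` on a held-out validation matrix.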

Code Examples

Creating and training a LightFM model:

from lightfm import LightFM
from lightfm.datasets import fetch_movielens

# Load the MovieLens 100k dataset
data = fetch_movielens(min_rating=4.0)

# Create and train the model
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(data['train'], epochs=10, num_threads=2)

Making predictions for a user:

import numpy as np

# Get predictions for a specific user
user_id = 10
n_items = data['train'].shape[1]
scores = model.predict(user_id, np.arange(n_items))

# Get top 5 item recommendations
top_items = np.argsort(-scores)[:5]
print(f"Top 5 recommendations for user {user_id}: {top_items}")
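
The ranking above includes items the user has already interacted with. A minimal sketch of filtering those out is shown below; `top_k_unseen` is a hypothetical helper written for this illustration, not part of LightFM.

```python
import numpy as np


def top_k_unseen(scores, seen_items, k=5):
    """Return indices of the k highest-scoring items not in `seen_items`."""
    masked = np.asarray(scores, dtype=float).copy()
    seen = np.fromiter(set(seen_items), dtype=int)
    masked[seen] = -np.inf                  # seen items sort to the end
    ranked = np.argsort(-masked)
    n_unseen = len(masked) - len(seen)
    return ranked[: min(k, n_unseen)]       # never return a masked item
```

With the data loaded above, the items a user has already seen could be read from `data['train'].tocsr()[user_id].indices`, assuming the training matrix is a SciPy sparse matrix.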

Evaluating the model:

from lightfm.evaluation import precision_at_k

# Compute precision@k for test data
test_precision = precision_at_k(model, data['test'], k=5).mean()
print(f"Test precision@5: {test_precision:.4f}")
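
To make the metric concrete, the following sketch computes precision@k for a single user: the fraction of the k top-scored items that are known positives. It illustrates the definition only and is not LightFM's implementation, which returns one value per user (hence the `.mean()` above). `precision_at_k_single` is a hypothetical name.

```python
import numpy as np


def precision_at_k_single(scores, positive_items, k=5):
    """Fraction of the k top-scored items that are known positives for one user."""
    top_k = np.argsort(-np.asarray(scores, dtype=float))[:k]
    hits = sum(1 for item in top_k if int(item) in positive_items)
    return hits / k
```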

Getting Started

To get started with LightFM, follow these steps:

Install LightFM:

pip install lightfm

Import and use LightFM in your project:

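import numpy as np

from lightfm import LightFM
from lightfm.datasets import fetch_movielens

# Load a sample dataset
data = fetch_movielens()

# Create and train a model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)

# Score every item for one user; user_ids and item_ids must have equal length
n_items = data['train'].shape[1]
item_ids = np.arange(n_items)
user_ids = np.full(n_items, 10)  # user 10 is an arbitrary example
predictions = model.predict(user_ids, item_ids)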

For more detailed usage and advanced features, refer to the LightFM documentation.
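
Outside the bundled sample dataset, interactions usually start as raw (user, item) pairs that must be mapped to contiguous row and column indices. The sketch below shows one way to do that mapping in plain Python; `build_interactions` here is a hypothetical helper written for this illustration.

```python
def build_interactions(events):
    """Map raw (user, item) pairs to contiguous indices and COO-style triplets.

    Returns (rows, cols, values, user_index, item_index).
    """
    user_index, item_index = {}, {}
    rows, cols, values = [], [], []
    for user, item in events:
        # setdefault assigns the next free index the first time an id is seen
        rows.append(user_index.setdefault(user, len(user_index)))
        cols.append(item_index.setdefault(item, len(item_index)))
        values.append(1.0)
    return rows, cols, values, user_index, item_index
```

The triplets can then be turned into a sparse matrix with `scipy.sparse.coo_matrix((values, (rows, cols)), shape=(len(user_index), len(item_index)))`. LightFM also provides a `lightfm.data.Dataset` class for this kind of mapping.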

Competitor Comparisons

spotlight

Deep recommender models using PyTorch.

Pros of Spotlight

Built on PyTorch, allowing for more flexible and dynamic model architectures
Supports GPU acceleration out-of-the-box for faster training and inference
Offers a wider range of recommendation models, including sequence-based models

Cons of Spotlight

Less mature and potentially less stable compared to LightFM
Smaller community and fewer resources available for troubleshooting
May have a steeper learning curve for users not familiar with PyTorch

Code Comparison

LightFM:

from lightfm import LightFM
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(interactions, epochs=10)

Spotlight:

from spotlight.factorization.explicit import ExplicitFactorizationModel
model = ExplicitFactorizationModel(n_iter=10, learning_rate=0.01)
model.fit(interactions)

Both libraries offer concise APIs for creating and training recommendation models. LightFM trains its models with its own multithreaded CPU implementation, while Spotlight defines and trains its models in PyTorch.

implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets

Pros of implicit

Faster performance for large-scale datasets due to optimized C++ implementation
Supports more diverse recommendation models (ALS, BPR, LogisticMF)
Better documentation and examples for quick integration

Cons of implicit

Limited support for explicit feedback datasets
Fewer options for incorporating side information or metadata
Less flexibility in customizing loss functions

Code comparison

implicit:

from implicit.als import AlternatingLeastSquares

model = AlternatingLeastSquares(factors=50)
model.fit(user_item_matrix)
recommendations = model.recommend(user_id, user_item_matrix[user_id])

LightFM:

from lightfm import LightFM

model = LightFM(no_components=50, loss='warp')
model.fit(interactions, user_features=user_metadata, item_features=item_metadata)
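# A model trained with features needs the same feature matrices at prediction time
predictions = model.predict(user_ids, item_ids, user_features=user_metadata, item_features=item_metadata)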

Key differences

implicit focuses on implicit feedback, while LightFM supports both implicit and explicit feedback
LightFM allows incorporation of user and item metadata, which is not directly supported in implicit
implicit offers more specialized models for implicit feedback scenarios
LightFM provides more flexibility in terms of loss functions and model architecture
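
To illustrate what incorporating item metadata can look like, the sketch below builds a small item-feature matrix with one indicator column per item followed by one column per metadata tag. It uses a dense NumPy array for readability, whereas LightFM expects a SciPy sparse matrix of shape [n_items, n_item_features]. `build_item_features` and the tag encoding are assumptions made for this example.

```python
import numpy as np


def build_item_features(n_items, item_tags, n_tags):
    """Dense sketch of an item-feature matrix: one indicator column per item,
    followed by one column per metadata tag."""
    identity = np.eye(n_items)                  # per-item indicator features
    tags = np.zeros((n_items, n_tags))
    for item, tag_ids in item_tags.items():
        for tag in tag_ids:
            tags[item, tag] = 1.0               # mark each tag the item carries
    return np.hstack([identity, tags])
```

Keeping the indicator columns lets each item retain its own latent vector, while the tag columns let items with little or no interaction history share information through their metadata.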

recommenders

Best Practices on Recommendation Systems

Pros of recommenders

Broader range of algorithms and techniques, including deep learning models
More comprehensive documentation and examples
Active development with frequent updates and contributions

Cons of recommenders

Steeper learning curve due to its extensive features
Heavier dependencies, potentially requiring more setup time
May be overkill for simpler recommendation tasks

Code Comparison

LightFM example:

from lightfm import LightFM
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(interactions, epochs=10)

recommenders example:

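from recommenders.models.deeprec.io.iterator import FFMTextIterator
from recommenders.models.deeprec.models.xDeepFM import XDeepFMModel

# hparams comes from prepare_hparams(...); train_file and valid_file are FFM-format data files
model = XDeepFMModel(hparams, FFMTextIterator)
model.fit(train_file, valid_file)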

Summary

LightFM is a lightweight, focused library for hybrid matrix factorization, while recommenders offers a more comprehensive suite of recommendation algorithms. LightFM is easier to get started with and suitable for simpler tasks, whereas recommenders provides more advanced options but requires more setup and learning. Choose based on your project's complexity and requirements.

RecBole

A unified, comprehensive and efficient recommendation library

Pros of RecBole

Offers a wider range of recommendation algorithms and models
Provides comprehensive evaluation metrics and tools
Includes data preprocessing and feature engineering capabilities

Cons of RecBole

Steeper learning curve due to its extensive features
May be overkill for simpler recommendation tasks
Requires more computational resources for large-scale datasets

Code Comparison

RecBole:

from recbole.quick_start import run_recbole

run_recbole(model='BPR', dataset='ml-100k')

LightFM:

from lightfm import LightFM
from lightfm.datasets import fetch_movielens

model = LightFM(loss='warp')
data = fetch_movielens(min_rating=4.0)
model.fit(data['train'], epochs=10, num_threads=2)

RecBole offers a more streamlined approach with its quick_start module, while LightFM requires more manual setup but provides finer control over the model and training process. RecBole's code is more concise, but LightFM's approach may be more familiar to users experienced with scikit-learn-style APIs.

DeepCTR

Easy-to-use, modular and extendible package of deep-learning based CTR models.

Pros of DeepCTR

Offers a wide range of deep learning models for CTR prediction
Provides easy-to-use APIs for model training and prediction
Supports both sparse and dense features

Cons of DeepCTR

May require more computational resources due to deep learning models
Steeper learning curve for users unfamiliar with deep learning concepts

Code Comparison

LightFM example:

from lightfm import LightFM
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(train, epochs=10)

DeepCTR example:

from deepctr.models import DeepFM
model = DeepFM(linear_feature_columns, dnn_feature_columns)
model.compile("adam", "binary_crossentropy", metrics=['AUC'])
model.fit(train_model_input, train[target].values, batch_size=256, epochs=10)

Key Differences

LightFM focuses on hybrid matrix factorization models that combine interactions with user and item metadata
DeepCTR specializes in deep learning models for CTR prediction
LightFM is more suitable for traditional collaborative filtering tasks
DeepCTR excels in scenarios with rich feature interactions and complex patterns

Both libraries have their strengths and are suited for different use cases. LightFM is more lightweight and easier to get started with, while DeepCTR offers more advanced deep learning models for CTR prediction tasks.

Surprise

A Python scikit for building and analyzing recommender systems

Pros of Surprise

More extensive documentation and tutorials
Wider range of built-in algorithms, including non-matrix factorization methods
Easier to use for beginners and researchers

Cons of Surprise

Generally slower performance, especially for large datasets
Less support for implicit feedback datasets
Limited scalability for production environments

Code Comparison

Surprise:

from surprise import SVD, Dataset, accuracy
from surprise.model_selection import train_test_split

data = Dataset.load_builtin('ml-100k')
# Hold out a test set so RMSE is not measured on the training data
trainset, testset = train_test_split(data, test_size=0.25)
algo = SVD()
predictions = algo.fit(trainset).test(testset)
accuracy.rmse(predictions)

LightFM:

from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import auc_score

data = fetch_movielens()
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)
auc_score(model, data['test'], num_threads=2).mean()

Both libraries offer collaborative filtering capabilities, but Surprise is more beginner-friendly and research-oriented, while LightFM is better suited for large-scale production environments and implicit feedback datasets. Surprise provides a wider range of algorithms, while LightFM focuses on hybrid matrix factorization models with better performance for larger datasets.
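
The two snippets report different kinds of metrics: RMSE measures rating-prediction error, while the AUC used in the LightFM snippet measures ranking quality. As a rough illustration of the latter, the sketch below computes AUC for one user as the probability that a randomly chosen positive item outscores a randomly chosen negative one. `auc_single` is a hypothetical helper, not the library's implementation.

```python
def auc_single(scores, positive_items):
    """Probability that a random positive item outscores a random negative one."""
    positives = [s for i, s in enumerate(scores) if i in positive_items]
    negatives = [s for i, s in enumerate(scores) if i not in positive_items]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative item")
    # Count a win as 1, a tie as 0.5, and a loss as 0 over all pairs
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```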

README

LightFM

Build status: Linux, OSX (OpenMP disabled), Windows (OpenMP disabled)

LightFM is a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementations of the BPR and WARP ranking losses. It's easy to use, fast (via multithreaded model estimation), and produces high-quality results.
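
For readers unfamiliar with the two ranking losses, here is a simplified sketch of the ideas behind them. It is not LightFM's own code: `bpr_loss` shows the pairwise BPR objective for one positive/negative pair, and `warp_weight` shows the WARP idea of sampling negatives until one violates a margin, then weighting the update by an estimated rank. The function names, the margin of 1.0, and the log-of-rank weighting are assumptions for illustration.

```python
import math
import random


def bpr_loss(pos_score, neg_score):
    """BPR loss for one (positive, negative) pair: -log(sigmoid(pos - neg))."""
    return math.log1p(math.exp(-(pos_score - neg_score)))


def warp_weight(pos_score, neg_scores, rng, max_sampled=10, margin=1.0):
    """Sample negatives until one violates the margin; weight by estimated rank.

    Returns (index_into_neg_scores, weight), or (None, 0.0) if no violating
    negative was found within `max_sampled` draws.
    """
    for n_sampled in range(1, max_sampled + 1):
        neg = rng.randrange(len(neg_scores))
        if neg_scores[neg] > pos_score - margin:       # ranking violation
            # Finding a violator quickly suggests the positive is ranked low
            est_rank = len(neg_scores) // n_sampled
            return neg, math.log(max(est_rank, 1))
    return None, 0.0
```

The practical difference is that BPR updates on every sampled pair, while WARP spends its updates on pairs that are currently ranked wrongly and scales them by how badly the positive item appears to be ranked.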

It also makes it possible to incorporate both item and user metadata into the traditional matrix factorization algorithms. It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalise to new items (via item features) and to new users (via user features).
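
The following NumPy sketch illustrates the representation described above: each user and item vector is the sum of the latent vectors of its active features, and a score is the dot product of the two sums plus the summed biases. The embeddings are random stand-ins for learned values, and all names and sizes are assumptions for this example rather than LightFM internals.

```python
import numpy as np

rng = np.random.default_rng(0)
n_user_features, n_item_features, dim = 4, 5, 3

# One latent vector and one bias per *feature* (random stand-ins for learned values)
user_feature_emb = rng.normal(size=(n_user_features, dim))
item_feature_emb = rng.normal(size=(n_item_features, dim))
user_feature_bias = rng.normal(size=n_user_features)
item_feature_bias = rng.normal(size=n_item_features)


def represent(feature_weights, embeddings, biases):
    """Sum the embeddings and biases of the features that are active."""
    w = np.asarray(feature_weights, dtype=float)
    return w @ embeddings, float(w @ biases)


def score(user_features, item_features):
    """Dot product of the two summed representations plus both biases."""
    u_vec, u_bias = represent(user_features, user_feature_emb, user_feature_bias)
    i_vec, i_bias = represent(item_features, item_feature_emb, item_feature_bias)
    return float(u_vec @ i_vec) + u_bias + i_bias


# A brand-new item with no interactions is still scorable through its metadata
new_item = [0, 0, 0, 1, 1]   # only two metadata features are active
user = [1, 0, 1, 0]
print(score(user, new_item))
```

Because the new item's vector is built entirely from feature embeddings that other items have already trained, the model can rank it before anyone has interacted with it.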

For more details, see the Documentation.

Need help? Contact me via email, Twitter, or Gitter.

Installation

Install from pip:

pip install lightfm

or Conda:

conda install -c conda-forge lightfm

Quickstart

Fitting an implicit feedback model on the MovieLens 100k dataset is very easy:

from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k

# Load the MovieLens 100k dataset. Only five
# star ratings are treated as positive.
data = fetch_movielens(min_rating=5.0)

# Instantiate and train the model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)

# Evaluate the trained model
test_precision = precision_at_k(model, data['test'], k=5).mean()
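
The `min_rating` argument turns explicit ratings into implicit positives by thresholding. The sketch below shows that idea on a plain list of (user, item, rating) tuples; it illustrates the thresholding only and is not the code inside `fetch_movielens`. `binarize_ratings` is a hypothetical name.

```python
def binarize_ratings(ratings, min_rating=5.0):
    """Keep (user, item) pairs rated at least `min_rating`; drop the rest."""
    return [(user, item) for user, item, rating in ratings if rating >= min_rating]
```

Lowering the threshold yields more positives to learn from at the cost of treating lukewarm ratings as endorsements.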

Articles and tutorials on using LightFM

How to cite

Please cite LightFM if it helps your research. You can use the following BibTeX entry:

@inproceedings{DBLP:conf/recsys/Kula15,
  author    = {Maciej Kula},
  editor    = {Toine Bogers and
               Marijn Koolen},
  title     = {Metadata Embeddings for User and Item Cold-start Recommendations},
  booktitle = {Proceedings of the 2nd Workshop on New Trends on Content-Based Recommender
               Systems co-located with 9th {ACM} Conference on Recommender Systems
               (RecSys 2015), Vienna, Austria, September 16-20, 2015.},
  series    = {{CEUR} Workshop Proceedings},
  volume    = {1448},
  pages     = {14--21},
  publisher = {CEUR-WS.org},
  year      = {2015},
  url       = {http://ceur-ws.org/Vol-1448/paper4.pdf},
}

Development

Pull requests are welcome. To install for development:

Clone the repository: git clone git@github.com:lyst/lightfm.git
Set up a virtual environment: cd lightfm && python3 -m venv venv && source ./venv/bin/activate
Install it for development using pip: pip install -e . && pip install -r test-requirements.txt
Run the tests with ./venv/bin/py.test tests.
LightFM uses black to enforce code formatting and flake8 for linting; see lint-requirements.txt.
[Optional]: You can install pre-commit to enforce formatting and linting locally. Install with:
```
pip install pre-commit
pre-commit install
```

When making changes to the .pyx extension files, first run python setup.py cythonize to regenerate the extension .c files, then run pip install -e . again.
