RecBole

A unified, comprehensive and efficient recommendation library

3,816

674

3,816

333


Top Related Projects

recommenders

20,552

Best Practices on Recommendation Systems

DeepCTR

7,828

Easy-to-use,Modular and Extendible package of deep-learning based CTR models .

librec

3,257

LibRec: A Leading Java Library for Recommender Systems, see

lightfm

4,975

A Python implementation of LightFM, a hybrid recommendation algorithm.

implicit

3,697

Fast Python Collaborative Filtering for Implicit Feedback Datasets

Surprise

6,642

A Python scikit for building and analyzing recommender systems

Quick Overview

RecBole is an open-source, unified framework for developing, reproducing, and evaluating recommendation algorithms. It provides a comprehensive set of recommendation models, evaluation protocols, and data processing tools, making it easier for researchers and practitioners to develop and compare different recommendation algorithms.

Pros

Comprehensive: RecBole implements 94 recommendation algorithms across four categories (general, sequential, context-aware, and knowledge-based recommendation), covering a diverse set of recommendation scenarios.
Unified Framework: The project provides a unified and modular design, allowing for easy integration of new models, datasets, and evaluation metrics, promoting extensibility and flexibility.
Efficient and Scalable: RecBole is designed to be efficient and scalable, with optimized data processing and model training capabilities, enabling the handling of large-scale recommendation tasks.
Reproducibility: The project emphasizes reproducibility, with detailed documentation, standardized evaluation protocols, and the ability to easily replicate published results.

Cons

Limited Real-world Datasets: While RecBole provides a good selection of benchmark datasets, the availability of real-world, large-scale datasets may be limited, which could hinder the evaluation of models in practical scenarios.
Steep Learning Curve: The comprehensive nature of the framework and the wide range of features it offers may present a steep learning curve for new users, especially those unfamiliar with recommender systems.
Potential Maintenance Challenges: As an open-source project, the long-term maintenance and community support for RecBole may pose challenges, especially as the project continues to evolve and new features are added.
Limited Deployment Support: While RecBole focuses on the development and evaluation of recommender systems, it may lack robust deployment and integration capabilities, which could be a concern for practitioners looking to deploy models in production environments.

Code Examples

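from recbole.config import Config
from recbole.data import create_dataset, data_preparation
from recbole.model.general_recommender import BPR
from recbole.trainer import Trainer

# Load configuration
config = Config(model='BPR', dataset='ml-100k')

# Create the dataset and split it into train/valid/test data loaders
dataset = create_dataset(config)
train_data, valid_data, test_data = data_preparation(config, dataset)

# Initialize model
model = BPR(config, dataset).to(config['device'])

# Train model, validating on valid_data during training
trainer = Trainer(config, model)
best_valid_score, best_valid_result = trainer.fit(train_data, valid_data)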

This code demonstrates the basic usage of RecBole, including loading the configuration, creating the dataset, initializing the BPR model, and training the model using the Trainer module.
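
For context on what the BPR model in this example optimizes: BPR (Bayesian Personalized Ranking) learns from pairs of one observed item and one sampled unobserved item per user, and minimizes -log(sigmoid(positive score - negative score)). The following is a minimal, dependency-free sketch of that pairwise loss. It is not RecBole's implementation; the function name `bpr_loss` and the example scores are illustrative assumptions.

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """Mean pairwise BPR loss: -log(sigmoid(pos - neg)), averaged over pairs."""
    if len(pos_scores) != len(neg_scores):
        raise ValueError("pos_scores and neg_scores must have the same length")
    total = 0.0
    for pos, neg in zip(pos_scores, neg_scores):
        diff = pos - neg
        # -log(sigmoid(diff)) == log(1 + exp(-diff)), written in a numerically stable form
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pos_scores)

# Illustrative scores: the loss is smaller when positives outrank negatives
well_ranked = bpr_loss([2.0, 3.0], [0.0, -1.0])
badly_ranked = bpr_loss([0.0, -1.0], [2.0, 3.0])
print(well_ranked < badly_ranked)
```

When the positive and negative scores are equal the loss is log(2), and it approaches zero as the positive item's score pulls ahead.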

from recbole.utils import init_seed
from recbole.data import create_dataset, data_preparation

# `config` is the Config object created in the previous example

# Set random seed for reproducibility
init_seed(2023, reproducibility=True)

# Load dataset
dataset = create_dataset(config)

# Split dataset into train, valid, and test data loaders
train_data, valid_data, test_data = data_preparation(config, dataset)

This code shows how to load a dataset, split it into train, validation, and test sets, and create the corresponding data loaders using the RecBole framework.
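
As an illustration of what a ratio-based split does conceptually, here is a minimal sketch that shuffles each user's interactions and divides them 80/10/10. It is not RecBole's implementation; the function name `split_by_ratio`, the `(user, item)` tuple layout, and the rounding rule are assumptions made for the example.

```python
import random
from collections import defaultdict

def split_by_ratio(interactions, ratios=(0.8, 0.1, 0.1), seed=2023):
    """Split (user, item) pairs into train/valid/test per user after shuffling."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item in interactions:
        by_user[user].append(item)

    train, valid, test = [], [], []
    for user, items in by_user.items():
        rng.shuffle(items)
        n_train = int(len(items) * ratios[0])
        n_valid = int(len(items) * ratios[1])
        # Whatever rounding leaves over goes to the test split
        train += [(user, i) for i in items[:n_train]]
        valid += [(user, i) for i in items[n_train:n_train + n_valid]]
        test += [(user, i) for i in items[n_train + n_valid:]]
    return train, valid, test

# Two hypothetical users with 10 interactions each
data = [(u, i) for u in ("u1", "u2") for i in range(10)]
train, valid, test = split_by_ratio(data)
print(len(train), len(valid), len(test))
```

Splitting per user rather than over the whole interaction list keeps every user represented in each split, as long as the user has enough interactions.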

# `trainer` is the Trainer from the first example; `valid_data` and `test_data`
# are the data loaders returned by data_preparation(config, dataset)

# Evaluate the current model on the validation set
valid_result = trainer.evaluate(valid_data, load_best_model=False)

# Evaluate the best checkpoint saved during training on the test set
test_result = trainer.evaluate(test_data, load_best_model=True)

print('Valid result:', valid_result)
print('Test result:', test_result)

This code demonstrates how to evaluate a trained model on the validation and test sets with the Trainer's evaluate method, and print the evaluation results.

Getting Started

To get started with RecBole, follow these steps:

Install the required dependencies:
```
pip install recbole
```

Import the necessary modules and create a configuration object:

from recbole.config import Config
from recbole.data import create_dataset
from recbole.model.general_recommender import BPR
from recbole.trainer import Trainer

config = Config(model='BPR', dataset='ml-100k')

Load the dataset and create the necessary data loaders:
```
dataset = create_
```

Competitor Comparisons

recommenders

20,552

Best Practices on Recommendation Systems

Pros of recommenders

More comprehensive documentation and examples
Broader range of algorithms, including deep learning models
Active community support and regular updates

Cons of recommenders

Steeper learning curve for beginners
Heavier dependencies, potentially slower setup

Code comparison

RecBole:

from recbole.quick_start import run_recbole

run_recbole(model='BPR', dataset='ml-100k')

recommenders:

from recommenders.models.bpr.bpr_recommender import BPR
from recommenders.datasets.amazon_reviews import download_and_extract

data = download_and_extract()
model = BPR(n_factors=100, n_iterations=10, learning_rate=0.01, lambda_reg=0.1)
model.fit(data)

Summary

RecBole offers a more streamlined approach with a unified interface for various recommendation models, making it easier for beginners to get started. Its models span general, sequential, context-aware, and knowledge-based recommendation.

recommenders provides a wider range of algorithms, including deep learning models, and offers more extensive documentation. However, it may require more setup time and has a steeper learning curve.

Both libraries are actively maintained and offer valuable tools for recommendation system development, with the choice depending on specific project requirements and user expertise.

DeepCTR

7,828

Easy-to-use,Modular and Extendible package of deep-learning based CTR models .

Pros of DeepCTR

Focuses specifically on Click-Through Rate (CTR) prediction models
Offers a wide range of pre-implemented deep learning CTR models
Provides easy-to-use APIs for quick model implementation and experimentation

Cons of DeepCTR

Limited to CTR prediction tasks, less versatile for other recommendation scenarios
Fewer data processing and evaluation tools compared to RecBole
Smaller community and less frequent updates

Code Comparison

DeepCTR:

from deepctr.models import DeepFM
from deepctr.feature_column import SparseFeat, DenseFeat

model = DeepFM(linear_feature_columns, dnn_feature_columns)
model.compile("adam", "binary_crossentropy", metrics=['binary_crossentropy'])

RecBole:

from recbole.quick_start import run_recbole

run_recbole(model='DeepFM', dataset='ml-100k')

DeepCTR provides a more granular approach to model configuration, allowing users to specify feature columns explicitly. RecBole offers a higher-level API for quick experimentation, abstracting away many implementation details. RecBole's approach is more user-friendly for beginners but may offer less flexibility for advanced users compared to DeepCTR.

librec

3,257

LibRec: A Leading Java Library for Recommender Systems, see

Pros of LibRec

More mature project with longer development history
Extensive documentation and user guide
Larger community and more contributors

Cons of LibRec

Less frequent updates and maintenance
Fewer built-in datasets and models
Java-based, which may be less preferred for some data scientists

Code Comparison

LibRec (Java):

DataModel dataModel = new TextDataModel(conf);
RecommenderContext context = new RecommenderContext(conf, dataModel);
Recommender recommender = new ItemKNNRecommender();
recommender.recommend(context);

RecBole (Python):

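from recbole.config import Config
from recbole.data import create_dataset
from recbole.trainer import Trainer
from recbole.utils import get_model

config = Config(model='ItemKNN', dataset='ml-100k')
dataset = create_dataset(config)
model = get_model(config['model'])(config, dataset)
trainer = Trainer(config, model)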

Both libraries offer similar functionality for building recommender systems, but RecBole provides a more modern, Python-based approach with easier integration into data science workflows. LibRec, being Java-based, may be more suitable for enterprise environments or projects with existing Java infrastructure. RecBole offers more frequent updates and a wider range of built-in models and datasets, while LibRec has a more established community and comprehensive documentation.

lightfm

4,975

A Python implementation of LightFM, a hybrid recommendation algorithm.

Pros of LightFM

Lightweight and efficient implementation, particularly suitable for large-scale recommendation tasks
Supports both explicit and implicit feedback
Incorporates both collaborative and content-based filtering approaches

Cons of LightFM

Limited variety of recommendation algorithms compared to RecBole
Less comprehensive documentation and fewer examples
Smaller community and less frequent updates

Code Comparison

LightFM:

from lightfm import LightFM
model = LightFM(learning_rate=0.05, loss='warp')
model.fit(train, epochs=10)

RecBole:

from recbole.quick_start import run_recbole
run_recbole(model='BPR', dataset='ml-100k')

LightFM focuses on a specific hybrid matrix factorization approach, while RecBole offers a wider range of algorithms and more flexibility in configuration. RecBole provides a higher-level API for quick experimentation, whereas LightFM requires more manual setup but offers finer control over the model parameters.

implicit

3,697

Fast Python Collaborative Filtering for Implicit Feedback Datasets

Pros of implicit

Specialized focus on implicit feedback models, offering high performance for this specific use case
Efficient implementation using Cython, resulting in faster computations
Simpler API and easier to get started for users familiar with implicit feedback scenarios

Cons of implicit

Limited to implicit feedback models, lacking support for explicit feedback or other recommendation types
Fewer model options compared to RecBole's extensive collection
Less comprehensive documentation and tutorials

Code comparison

implicit:

import implicit
model = implicit.als.AlternatingLeastSquares()
model.fit(user_item_matrix)
recommendations = model.recommend(user_id, user_item_matrix[user_id])

RecBole:

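from recbole.quick_start import load_data_and_model
from recbole.utils.case_study import full_sort_topk

# 'saved/model.pth' is a placeholder for a checkpoint written by an earlier training run
config, model, dataset, train_data, valid_data, test_data = load_data_and_model(model_file='saved/model.pth')
# Map an external user token (placeholder '196') to RecBole's internal ID
uid_series = dataset.token2id(dataset.uid_field, ['196'])
topk_score, topk_iid_list = full_sort_topk(uid_series, model, test_data, k=10, device=config['device'])

Both libraries offer concise ways to produce recommendations from a trained model, but RecBole provides a more standardized approach across different models. implicit's API is more straightforward for its specific use case, while RecBole offers greater flexibility and consistency across various recommendation algorithms.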

Surprise

6,642

A Python scikit for building and analyzing recommender systems

Pros of Surprise

Simpler and more lightweight, making it easier to learn and use for beginners
Focuses specifically on rating prediction, which can be advantageous for certain use cases
Well-documented with clear examples and tutorials

Cons of Surprise

Limited in scope compared to RecBole, with fewer algorithms and features
Less suitable for large-scale production environments or complex recommendation tasks
Lacks support for deep learning-based recommendation models

Code Comparison

Surprise:

from surprise import SVD
from surprise import Dataset
from surprise import accuracy

data = Dataset.load_builtin('ml-100k')
algo = SVD()
predictions = algo.fit(data.build_full_trainset()).test(data.build_full_testset())
accuracy.rmse(predictions)

RecBole:

from recbole.quick_start import run_recbole

parameter_dict = {
    'model': 'SVD',
    'dataset': 'ml-100k'
}
run_recbole(model=parameter_dict['model'], dataset=parameter_dict['dataset'])


README

RecBole (伯乐)

RecBole is developed based on Python and PyTorch for reproducing and developing recommendation algorithms in a unified, comprehensive and efficient framework for research purposes. Our library includes 94 recommendation algorithms, covering four major categories:

General Recommendation
Sequential Recommendation
Context-aware Recommendation
Knowledge-based Recommendation

We design a unified and flexible data file format, and provide support for 44 benchmark recommendation datasets. A user can apply the provided scripts to process an original copy of the data, or simply download the datasets already processed by our team.

Figure: RecBole Overall Architecture (RecBole v0.1)

In order to support the study of recent advances in recommender systems, we construct an extended recommendation library RecBole2.0 consisting of 8 packages for up-to-date topics and architectures (e.g., debiased, fairness and GNNs).

Feature

General and extensible data structure. We design general and extensible data structures to unify the formatting and usage of various recommendation datasets.
Comprehensive benchmark models and datasets. We implement 94 commonly used recommendation algorithms, and provide the formatted copies of 44 recommendation datasets.
Efficient GPU-accelerated execution. We optimize the efficiency of our library with a number of improved techniques oriented to the GPU environment.
Extensive and standard evaluation protocols. We support a series of widely adopted evaluation protocols or settings for testing and comparing recommendation algorithms.
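
To make the ranking metrics reported by these evaluation protocols concrete, here is a minimal sketch that computes recall, precision, hit, MRR, and NDCG at a cutoff k for a single user with binary relevance. It is an illustration using only the standard library, not RecBole's evaluator; the function name `topk_metrics` and the example items are assumptions. Library-reported values are typically averages of such per-user scores.

```python
import math

def topk_metrics(ranked_items, relevant_items, k=10):
    """Compute common top-k ranking metrics for a single user."""
    relevant = set(relevant_items)
    top_k = ranked_items[:k]
    hits = [1 if item in relevant else 0 for item in top_k]
    n_hits = sum(hits)

    # Reciprocal rank of the first relevant item in the top k (0 if none)
    mrr = next((1.0 / (rank + 1) for rank, h in enumerate(hits) if h), 0.0)
    # DCG with binary relevance, normalised by the best achievable DCG
    dcg = sum(h / math.log2(rank + 2) for rank, h in enumerate(hits))
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))

    return {
        "recall": n_hits / len(relevant) if relevant else 0.0,
        "precision": n_hits / k,
        "hit": 1.0 if n_hits else 0.0,
        "mrr": mrr,
        "ndcg": dcg / idcg if idcg else 0.0,
    }

# Hypothetical ranking for one user: 'b' is the only relevant item in the top 3
metrics = topk_metrics(["a", "b", "c", "d"], {"b", "d", "z"}, k=3)
print(metrics)
```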

RecBole News

02/23/2025: We release RecBole v1.2.1.

11/01/2023: We release RecBole v1.2.0.

11/06/2022: We release the optimal hyperparameters of the model and their tuning ranges.

10/05/2022: We release RecBole v1.1.1.

06/28/2022: We release RecBole2.0 with 8 packages consisting of 65 newly implemented models.

02/25/2022: We release RecBole v1.0.1.

09/17/2021: We release RecBole v1.0.0.

03/22/2021: We release RecBole v0.2.1.

01/15/2021: We release RecBole v0.2.0.

12/10/2020: We release a Chinese-language blog series introducing RecBole for beginners (continuously updated).

12/06/2020: We release RecBole v0.1.2.

11/29/2020: We constructed preliminary experiments to test the time and memory cost on three different-sized datasets and provided the test result for reference.

11/03/2020: We release the first version of RecBole v0.1.1.

Latest Update for SIGIR 2023 Submission

To better meet the user requirements and contribute to the research community, we present a significant update of RecBole in the latest version, making it more user-friendly and easy-to-use as a comprehensive benchmark library for recommendation. We summarize these updates in "Towards a More User-Friendly and Easy-to-Use Benchmark Library for Recommender Systems" and submit the paper to SIGIR 2023. The main contribution in this update is introduced below.

Our extensions are made in three major aspects, namely the models/datasets, the framework, and the configurations. Furthermore, we provide more comprehensive documentation and well-organized FAQ for the usage of our library, which largely improves the user experience. More specifically, the highlights of this update are summarized as:

We introduce more operations and settings to help benchmark the recommendation domain.
We improve the user friendliness of our library by providing more detailed documentation and well-organized frequently asked questions.
We point out several development guidelines for the open-source library developers.

These extensions make it much easier to reproduce the benchmark results and stay up-to-date with the recent advances on recommender systems. The detailed comparison between this update and previous versions is listed below.

Aspect	RecBole 1.0	RecBole 2.0	This update
Recommendation tasks	4 categories	3 topics and 5 packages	4 categories
Models and datasets	73 models and 28 datasets	65 models and 8 new datasets	94 models and 43 datasets
Data structure	Implemented Dataset and Dataloader	Task-oriented	Compatible data module inherited from PyTorch
Continuous features	Field embedding	Field embedding	Field embedding and discretization
GPU-accelerated execution	Single-GPU utilization	Single-GPU utilization	Multi-GPU and mixed precision training
Hyper-parameter tuning	Serial gradient search	Serial gradient search	Three search methods in both serial and parallel
Significance test	-	-	Available interface
Benchmark results	-	Partially public (GNN and CDR)	Benchmark configurations on 94 models
Friendly usage	Documentation	Documentation	Improved documentation and FAQ page

Installation

RecBole works with the following operating systems:

Linux
Windows 10
macOS X

RecBole requires Python version 3.7 or later.

RecBole requires torch version 1.7.0 or later. If you want to use RecBole with GPU, please ensure that CUDA or cudatoolkit version is 9.2 or later. This requires NVIDIA driver version >= 396.26 (for Linux) or >= 397.44 (for Windows10).
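
The version requirements above can be checked before installing. The following is a minimal sketch under stated assumptions: the helper names `parse_version` and `check_requirements` are hypothetical, and the version parsing is deliberately simple (it keeps only the leading digits of the first three components).

```python
import sys

def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into a comparable tuple (1, 13, 1)."""
    core = text.split("+")[0]
    numbers = []
    for piece in core.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits) if digits else 0)
    # Pad short versions such as '1.7' to three components
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)

def check_requirements():
    """Return a list of unmet requirements (empty if everything looks fine)."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or later is required")
    try:
        import torch
    except ImportError:
        problems.append("torch is not installed")
    else:
        if parse_version(torch.__version__) < (1, 7, 0):
            problems.append("torch 1.7.0 or later is required")
    return problems

print(check_requirements() or "environment looks compatible")
```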

Install from conda

conda install -c aibox recbole

Install from pip

pip install recbole

Install from source

git clone https://github.com/RUCAIBox/RecBole.git && cd RecBole
pip install -e . --verbose

Quick-Start

With the source code, you can use the provided script for initial usage of our library:

python run_recbole.py

This script will run the BPR model on the ml-100k dataset.

Typically, this example takes less than one minute. We will obtain some output like:

INFO ml-100k
The number of users: 944
Average actions of users: 106.04453870625663
The number of items: 1683
Average actions of items: 59.45303210463734
The number of inters: 100000
The sparsity of the dataset: 93.70575143257098%
INFO Evaluation Settings:
Group by user_id
Ordering: {'strategy': 'shuffle'}
Splitting: {'strategy': 'by_ratio', 'ratios': [0.8, 0.1, 0.1]}
Negative Sampling: {'strategy': 'full', 'distribution': 'uniform'}
INFO BPRMF(
    (user_embedding): Embedding(944, 64)
    (item_embedding): Embedding(1683, 64)
    (loss): BPRLoss()
)
Trainable parameters: 168128
INFO epoch 0 training [time: 0.27s, train loss: 27.7231]
INFO epoch 0 evaluating [time: 0.12s, valid_score: 0.021900]
INFO valid result:
recall@10: 0.0073  mrr@10: 0.0219  ndcg@10: 0.0093  hit@10: 0.0795  precision@10: 0.0088
...
INFO epoch 63 training [time: 0.19s, train loss: 4.7660]
INFO epoch 63 evaluating [time: 0.08s, valid_score: 0.394500]
INFO valid result:
recall@10: 0.2156  mrr@10: 0.3945  ndcg@10: 0.2332  hit@10: 0.7593  precision@10: 0.1591
INFO Finished training, best eval result in epoch 52
INFO Loading model structure and parameters from saved/***.pth
INFO best valid result:
recall@10: 0.2169  mrr@10: 0.4005  ndcg@10: 0.235  hit@10: 0.7582  precision@10: 0.1598
INFO test result:
recall@10: 0.2368  mrr@10: 0.4519  ndcg@10: 0.2768  hit@10: 0.7614  precision@10: 0.1901
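
Several numbers in this log can be re-derived from the counts it prints. The sketch below assumes that the reported user and item counts each include one reserved padding ID, so the averages divide by 943 and 1682; that assumption is inferred from the figures themselves rather than stated in the log. The embedding size of 64 is read from the printed model structure.

```python
# Counts copied from the log above
n_users, n_items, n_inters = 944, 1683, 100000
embedding_size = 64

# Assumption: one ID in each count is a reserved padding entry
avg_user_actions = n_inters / (n_users - 1)
avg_item_actions = n_inters / (n_items - 1)

# Sparsity is computed over the full reported user-item grid
sparsity = 1 - n_inters / (n_users * n_items)

# The printed model holds one embedding vector per user and per item
trainable_params = (n_users + n_items) * embedding_size

print(f"Average actions of users: {avg_user_actions:.4f}")
print(f"Average actions of items: {avg_item_actions:.4f}")
print(f"Sparsity: {sparsity * 100:.4f}%")
print(f"Trainable parameters: {trainable_params}")
```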

If you want to change the parameters, such as learning_rate, embedding_size, just set the additional command parameters as you need:

python run_recbole.py --learning_rate=0.0001 --embedding_size=128

If you want to change the models, just run the script by setting additional command parameters:

python run_recbole.py --model=[model_name]
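
When many runs with different overrides are needed, the command line can be assembled programmatically. The helper below is a hypothetical convenience, not part of RecBole: it turns a dictionary into the `--key=value` arguments shown above and only builds the argument list, leaving the launch (for example with `subprocess.run`) to the caller.

```python
import sys

def build_command(overrides, script="run_recbole.py"):
    """Build an argument list such as [python, run_recbole.py, --model=BPR]."""
    command = [sys.executable, script]
    for key, value in overrides.items():
        command.append(f"--{key}={value}")
    return command

command = build_command({"model": "BPR", "learning_rate": 0.0001, "embedding_size": 128})
print(" ".join(command[1:]))
```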

Auto-tuning Hyperparameter

Open RecBole/hyper.test and list the hyperparameters to be searched automatically. Two search types are shown here:

loguniform: the logarithm of the parameter is uniformly distributed over the given range, so the parameter itself takes random values from e^{-8} to e^{0}.
choice: the parameter takes discrete values from the given list.

Here is an example for hyper.test:

learning_rate loguniform -8, 0
embedding_size choice [64, 96 , 128]
train_batch_size choice [512, 1024, 2048]
mlp_hidden_size choice ['[64, 64, 64]','[128, 128]']
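
To make the two search types concrete, the sketch below draws values the way the descriptions read: `loguniform -8, 0` draws an exponent uniformly from [-8, 0] and exponentiates it, and `choice` picks one of the listed values. This is an illustration using only the standard library, not RecBole's tuning code; the function names and the `search_space` dictionary are assumptions.

```python
import math
import random

def sample_loguniform(low, high, rng):
    """Draw exp(u) with u uniform on [low, high]."""
    return math.exp(rng.uniform(low, high))

def sample_choice(options, rng):
    """Pick one of the listed discrete values."""
    return rng.choice(options)

# Mirrors the hyper.test example above
search_space = {
    "learning_rate": ("loguniform", (-8, 0)),
    "embedding_size": ("choice", [64, 96, 128]),
    "train_batch_size": ("choice", [512, 1024, 2048]),
    "mlp_hidden_size": ("choice", ["[64, 64, 64]", "[128, 128]"]),
}

def sample_config(space, rng):
    """Draw one value per hyperparameter according to its search type."""
    config = {}
    for name, (kind, spec) in space.items():
        if kind == "loguniform":
            config[name] = sample_loguniform(spec[0], spec[1], rng)
        elif kind == "choice":
            config[name] = sample_choice(spec, rng)
        else:
            raise ValueError(f"unknown search type: {kind}")
    return config

rng = random.Random(0)
print(sample_config(search_space, rng))
```

Sampling the exponent rather than the value itself spreads trials evenly across orders of magnitude, which suits parameters such as the learning rate.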

Set training command parameters as you need to run:

python run_hyper.py --model=[model_name] --dataset=[data_name] --config_files=xxxx.yaml --params_file=hyper.test
e.g.
python run_hyper.py --model=BPR --dataset=ml-100k --config_files=test.yaml --params_file=hyper.test

Note that --config_files=test.yaml is optional; if you have no custom config settings, this parameter can be omitted.

This process may take a long time before it outputs the best hyperparameters and result:

running parameters:                                                                                                                    
{'embedding_size': 64, 'learning_rate': 0.005947474154838498, 'mlp_hidden_size': '[64,64,64]', 'train_batch_size': 512}                
  0%|                                                                                           | 0/18 [00:00<?, ?trial/s, best loss=?]

More information about parameter tuning can be found in our docs.

Time and Memory Costs

We constructed preliminary experiments to test the time and memory cost on three different-sized datasets (small, medium and large). For detailed information, you can click the following links.

NOTE: Our test results only gave the approximate time and memory cost of our implementations in the RecBole library (based on our machine server). Any feedback or suggestions about the implementations and test are welcome. We will keep improving our implementations, and update these test results.

RecBole Major Releases

Releases	Date
v1.2.1	02/23/2025
v1.2.0	11/01/2023
v1.1.1	10/05/2022
v1.0.0	09/17/2021
v0.2.0	01/15/2021
v0.1.1	11/03/2020

Open Source Contributions

As a one-stop framework from data processing, model development, algorithm training to scientific evaluation, RecBole has a total of 11 related GitHub projects including

two versions of RecBole (RecBole 1.0 and RecBole 2.0);
8 benchmarking packages (RecBole-MetaRec, RecBole-DA, RecBole-Debias, RecBole-FairRec, RecBole-CDR, RecBole-TRM, RecBole-GNN and RecBole-PJF);
dataset repository (RecSysDatasets).

In the following table, we summarize the open source contributions of GitHub projects based on RecBole.

Projects	Stars	Forks	Issues	Pull requests
RecBole
RecBole2.0
RecBole-DA
RecBole-MetaRec
RecBole-Debias
RecBole-FairRec
RecBole-CDR
RecBole-GNN
RecBole-TRM
RecBole-PJF
RecSysDatasets

Contributing

Please let us know if you encounter a bug or have any suggestions by filing an issue.

We welcome all contributions from bug fixes to new features and extensions.

We expect all contributions to be discussed in the issue tracker and to go through PRs.

We thank @tszumowski, @rowedenny, @deklanw et al. for their insightful suggestions.

We thank @rowedenny, @deklanw et al. for their nice contributions through PRs.

Cite

If you find RecBole useful for your research or development, please cite the following papers: RecBole[1.0], RecBole[2.0] and RecBole[1.2.1].

@inproceedings{recbole[1.0],
  author    = {Wayne Xin Zhao and Shanlei Mu and Yupeng Hou and Zihan Lin and Yushuo Chen and Xingyu Pan and Kaiyuan Li and Yujie Lu and Hui Wang and Changxin Tian and Yingqian Min and Zhichao Feng and Xinyan Fan and Xu Chen and Pengfei Wang and Wendi Ji and Yaliang Li and Xiaoling Wang and Ji{-}Rong Wen},
  title     = {RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms},
  booktitle = {{CIKM}},
  pages     = {4653--4664},
  publisher = {{ACM}},
  year      = {2021}
}
@inproceedings{recbole[2.0],
  author    = {Wayne Xin Zhao and Yupeng Hou and Xingyu Pan and Chen Yang and Zeyu Zhang and Zihan Lin and Jingsen Zhang and Shuqing Bian and Jiakai Tang and Wenqi Sun and Yushuo Chen and Lanling Xu and Gaowei Zhang and Zhen Tian and Changxin Tian and Shanlei Mu and Xinyan Fan and Xu Chen and Ji{-}Rong Wen},
  title     = {RecBole 2.0: Towards a More Up-to-Date Recommendation Library},
  booktitle = {{CIKM}},
  pages     = {4722--4726},
  publisher = {{ACM}},
  year      = {2022}
}
@inproceedings{recbole[1.2.1],
  author    = {Lanling Xu and Zhen Tian and Gaowei Zhang and Junjie Zhang and Lei Wang and Bowen Zheng and Yifan Li and Jiakai Tang and Zeyu Zhang and Yupeng Hou and Xingyu Pan and Wayne Xin Zhao and Xu Chen and Ji{-}Rong Wen},
  title     = {Towards a More User-Friendly and Easy-to-Use Benchmark Library for Recommender Systems},
  booktitle = {{SIGIR}},
  pages     = {2837--2847},
  publisher = {{ACM}},
  year      = {2023}
}

The Team

RecBole is developed by RUC, BUPT, ECNU, and maintained by RUC.

Here is the list of our lead developers in each development phase. They are the souls of RecBole and have made outstanding contributions.

Time	Version	Lead Developers	Paper
June 2020 ~ Nov. 2020	v0.1.1	Shanlei Mu (@ShanleiMu), Yupeng Hou (@hyp1231), Zihan Lin (@linzihan-backforward), Kaiyuan Li (@tsotfsk)	PDF
Nov. 2020 ~ Jul. 2022	v0.1.2 ~ v1.0.1	Yushuo Chen (@chenyushuo), Xingyu Pan (@2017pxy)	PDF
Jul. 2022 ~ Nov. 2023	v1.1.0 ~ v1.1.1	Lanling Xu (@Sherry-XLL), Zhen Tian (@chenyuwuxin), Gaowei Zhang (@Wicknight), Lei Wang (@Paitesanshi), Junjie Zhang (@leoleojie)	PDF
Nov. 2023 ~ Feb. 2025	v1.2.0	Bowen Zheng (@zhengbw0324), Chen Ma (@Yilu114)	PDF
Feb. 2025 ~ now	v1.2.1	Enze Liu (@BishopLiu), Kesha Ou (@TayTroye), Bingqian Li (@Fotiligner)	PDF

License

RecBole uses MIT License. All data and code in this project can only be used for academic purposes.

Acknowledgments

This project was supported by National Natural Science Foundation of China (No. 61832017).
