facebookresearch/fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Top Related Projects

  • fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
  • transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
  • OpenNMT-py - Open Source Neural Machine Translation and (Large) Language Models in PyTorch
  • tensor2tensor - Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
  • text-to-text-transfer-transformer - Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
  • DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Quick Overview

Fairseq is a sequence modeling toolkit developed by Facebook AI Research. It provides a flexible and extensible framework for training and evaluating various sequence-to-sequence models, particularly for natural language processing tasks such as machine translation, text summarization, and speech recognition.

Pros

  • Highly modular and customizable architecture
  • Supports a wide range of state-of-the-art models and techniques
  • Efficient implementation with support for distributed training
  • Active development and maintenance by Facebook AI Research

Cons

  • Steep learning curve for beginners
  • Documentation can be sparse or outdated in some areas
  • Requires significant computational resources for large-scale tasks
  • Some features may be experimental or not thoroughly tested

Code Examples

  1. Loading a pre-trained model:
from fairseq.models.roberta import RobertaModel

roberta = RobertaModel.from_pretrained('/path/to/roberta/model', checkpoint_file='model.pt')
roberta.eval()  # disable dropout for inference
  2. Tokenizing and encoding text:
tokens = roberta.encode('Hello world!')
assert tokens.tolist() == [0, 31414, 232, 328, 2]
  3. Extracting features from the model:
import torch

# Extract the last layer's features for the given tokens
last_layer_features = roberta.extract_features(tokens)
assert last_layer_features.size() == torch.Size([1, 5, 768])  # [batch, tokens, hidden] for roberta.base
  4. Registering a classification head for fine-tuning:
from fairseq.models.roberta import RobertaModel

model = RobertaModel.from_pretrained('/path/to/roberta/model', checkpoint_file='model.pt')

# register_classification_head builds the head internally; you do not need to
# construct a RobertaClassificationHead yourself
model.register_classification_head('sentence_classification_head', num_classes=2)

# Now you can fine-tune the model on your classification task and call
# model.predict('sentence_classification_head', tokens) to get log-probabilities
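
Beyond the last layer, extract_features can also return the hidden states of every layer via its return_all_hiddens flag (documented in the fairseq RoBERTa examples). Continuing from the feature-extraction snippet above, a minimal sketch:

all_layers = roberta.extract_features(tokens, return_all_hiddens=True)
# For roberta.base this is the embedding output plus the 12 transformer layers
assert len(all_layers) == 13
assert torch.all(all_layers[-1].eq(last_layer_features))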

Getting Started

To get started with Fairseq, follow these steps:

  1. Install Fairseq:
pip install fairseq
  2. Download a pre-trained model:
wget https://dl.fbaipublicfiles.com/fairseq/models/roberta.base.tar.gz
tar -xzvf roberta.base.tar.gz
  3. Use the model in your Python script:
from fairseq.models.roberta import RobertaModel

roberta = RobertaModel.from_pretrained('./roberta.base', checkpoint_file='model.pt')
roberta.eval()

tokens = roberta.encode('Hello world!')
features = roberta.extract_features(tokens)

This will load a pre-trained RoBERTa model and extract features from the input text. You can then use these features for various downstream tasks or fine-tune the model for your specific application.
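
RoBERTa checkpoints loaded this way also expose a fill_mask helper for masked-token prediction (see the fairseq RoBERTa examples). A short illustrative sketch, assuming current fairseq versions where each result is a (filled sentence, score, token) tuple:

# Predict the top candidates for a <mask> token
candidates = roberta.fill_mask('The capital of France is <mask>.', topk=3)
for filled_sentence, score, predicted_token in candidates:
    print(f'{predicted_token!r}: {score:.3f}')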

Competitor Comparisons

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Pros of fairseq

  • Extensive documentation and examples
  • Large community support and active development
  • Wide range of pre-trained models available

Cons of fairseq

  • Can be complex for beginners
  • Requires more computational resources
  • May have a steeper learning curve

Code Comparison

fairseq:

from fairseq.models.transformer import TransformerModel

model = TransformerModel.from_pretrained('/path/to/model', checkpoint_file='model.pt')
model.translate('Hello world!')

Both entries refer to the same project: facebookresearch/fairseq is the main (and only) repository for fairseq, so there is no second codebase to compare against.

Additional Notes

fairseq is a powerful sequence modeling toolkit that supports training custom models for translation, summarization, language modeling, and other text generation tasks. It provides a flexible and modular codebase for developing state-of-the-art natural language processing models.

The project is widely used in both research and industry, offering a comprehensive set of tools and pre-trained models. While it may require some initial effort to understand and set up, fairseq's capabilities and community support make it a valuable resource for NLP practitioners and researchers.

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of Transformers

  • Broader model support across various architectures and tasks
  • More extensive documentation and community support
  • Easier integration with popular deep learning frameworks

Cons of Transformers

  • Can be slower for certain tasks compared to Fairseq
  • May have a steeper learning curve for beginners
  • Less focus on specialized sequence-to-sequence tasks

Code Comparison

Fairseq example:

from fairseq.models.roberta import RobertaModel

roberta = RobertaModel.from_pretrained('path/to/roberta.base')
tokens = roberta.encode('Hello world!')
features = roberta.extract_features(tokens)

Transformers example:

from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaModel.from_pretrained('roberta-base')
inputs = tokenizer('Hello world!', return_tensors='pt')
outputs = model(**inputs)

Both repositories offer powerful tools for working with transformer models, but Transformers provides a more versatile and user-friendly experience across a wider range of tasks and architectures. Fairseq, on the other hand, excels in specific sequence-to-sequence applications and may offer performance advantages in certain scenarios.
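
To make the API difference concrete, this is how the final hidden states line up in each library (a sketch for the base-sized RoBERTa models, reusing the variables from the two snippets above):

# fairseq: extract_features returns the hidden-state tensor directly
features = roberta.extract_features(tokens)      # shape [1, seq_len, 768]

# Transformers: the forward pass returns an output object with named fields
hidden = outputs.last_hidden_state               # shape [1, seq_len, 768]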

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

Pros of OpenNMT-py

  • More user-friendly and easier to get started with for beginners
  • Extensive documentation and tutorials available
  • Flexible and modular architecture for easy customization

Cons of OpenNMT-py

  • Less optimized for large-scale training compared to fairseq
  • Fewer pre-trained models and benchmarks available
  • Limited support for some advanced NLP tasks

Code Comparison

OpenNMT-py:

import onmt

model = onmt.models.build_model(opt, fields, checkpoint)
translator = onmt.translate.Translator(model, fields, opt)
translated = translator.translate(src_data, src_dir=opt.src_dir, batch_size=opt.batch_size)

fairseq:

from fairseq.models.transformer import TransformerModel

en2de = TransformerModel.from_pretrained('/path/to/model', checkpoint_file='model.pt')
en2de.translate('Hello world!')

Both repositories offer powerful tools for neural machine translation and sequence-to-sequence tasks. OpenNMT-py is more beginner-friendly and flexible, while fairseq provides better performance for large-scale training and offers more pre-trained models. The code examples demonstrate the different approaches to loading and using models in each library.
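
As a small illustration of the fairseq side, the hub interface returned by from_pretrained also accepts a list of sentences and a beam-size argument (a sketch based on the fairseq generation interface; the model path is a placeholder):

from fairseq.models.transformer import TransformerModel

en2de = TransformerModel.from_pretrained('/path/to/model', checkpoint_file='model.pt')
# Translate a batch of sentences with beam search
translations = en2de.translate(['Hello world!', 'How are you?'], beam=5)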

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

Pros of tensor2tensor

  • More comprehensive library with a wider range of models and tasks
  • Better integration with TensorFlow ecosystem
  • More active community and frequent updates

Cons of tensor2tensor

  • Steeper learning curve due to complexity
  • Less focus on specific sequence-to-sequence tasks
  • Potentially slower training and inference compared to PyTorch-based fairseq

Code Comparison

tensor2tensor:

import tensorflow as tf
import tensor2tensor as t2t

problem = t2t.problems.problem("translate_ende_wmt32k")
model = t2t.models.transformer.Transformer(problem.get_hparams())

fairseq:

from fairseq.models.transformer import TransformerModel

model = TransformerModel.from_pretrained(
    'data-bin/wmt14_en_de',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='data-bin/wmt14_en_de'
)

Both libraries provide high-level APIs for working with transformer models, but tensor2tensor offers a more generalized approach, while fairseq focuses on sequence-to-sequence tasks. tensor2tensor integrates closely with TensorFlow, whereas fairseq is built on PyTorch, which may influence performance and ease of use depending on your familiarity with these frameworks.

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Pros of text-to-text-transfer-transformer

  • Unified approach for multiple NLP tasks
  • Extensive pre-training on diverse datasets
  • Flexible and adaptable to various downstream tasks

Cons of text-to-text-transfer-transformer

  • Higher computational requirements for training and inference
  • Less specialized for specific NLP tasks
  • Steeper learning curve for newcomers

Code Comparison

text-to-text-transfer-transformer:

import t5
model = t5.load_t5_model("t5-base")
inputs = ["translate English to German: Hello, how are you?"]
outputs = model.predict(inputs)

fairseq:

from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained('transformer.wmt19.en-de')
translated = model.translate('Hello, how are you?')

Both repositories offer powerful tools for natural language processing tasks. text-to-text-transfer-transformer provides a more versatile approach, suitable for a wide range of NLP tasks, while fairseq offers more specialized models and tools for specific tasks like machine translation. The choice between them depends on the specific requirements of your project and the level of customization needed.

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

  • Focuses on optimizing large-scale model training, offering better performance for very large models
  • Provides more advanced distributed training features, including ZeRO optimizer stages
  • Offers a more flexible API for customizing training pipelines

Cons of DeepSpeed

  • Steeper learning curve due to its focus on advanced optimization techniques
  • Less comprehensive documentation compared to Fairseq
  • Focused on training and inference optimization rather than providing the models, tasks, and data pipelines that Fairseq includes

Code Comparison

DeepSpeed:

import deepspeed

# Wrap the model and optimizer in a DeepSpeed engine
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=model.parameters()
)

Fairseq:

# Training is typically driven by the fairseq-train CLI on a binarized dataset
# (paths and flags are illustrative)
fairseq-train data-bin/your_dataset \
    --arch transformer --optimizer adam --max-tokens 4096

DeepSpeed focuses on wrapping an existing model with its optimization engine, while Fairseq drives training end to end through its fairseq-train command and its task and criterion abstractions. DeepSpeed offers fine-grained control over the training loop, whereas Fairseq provides a complete training setup out of the box.
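
After deepspeed.initialize (shown above), the training loop itself is driven through the returned engine. A minimal sketch, assuming the wrapped model's forward pass returns the loss and that train_loader is an ordinary PyTorch data loader:

for batch in train_loader:
    loss = model_engine(batch)      # forward pass through the DeepSpeed-wrapped model
    model_engine.backward(loss)     # DeepSpeed handles loss scaling and ZeRO partitioning
    model_engine.step()             # optimizer step plus learning-rate schedule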

README





Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

We provide reference implementations of various sequence modeling papers:

List of implemented papers

What's New:

Previous updates

Features:

We also provide pre-trained models for translation and language modeling with a convenient torch.hub interface:

import torch

en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'

See the PyTorch Hub tutorials for translation and RoBERTa for more examples.
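
For example, the RoBERTa models are available through the same torch.hub interface (a short sketch following the fairseq RoBERTa examples):

import torch

roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout for evaluation
tokens = roberta.encode('Hello world!')
features = roberta.extract_features(tokens)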

Requirements and Installation

  • PyTorch version >= 1.10.0
  • Python version >= 3.8
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install fairseq and develop locally:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

# on MacOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./

# to install the latest stable release (0.10.x)
# pip install fairseq
  • For faster training install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
  • For large datasets install PyArrow: pip install pyarrow
  • If you use Docker make sure to increase the shared memory size either with --ipc=host or --shm-size as command line options to nvidia-docker run .
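
After installation, a quick sanity check that the package imports and that PyTorch sees your GPU (a trivial sketch, not part of the official instructions):

import fairseq
import torch

print(fairseq.__version__)
print('CUDA available:', torch.cuda.is_available())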

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

We also have more detailed READMEs to reproduce results from specific papers:

Join the fairseq community

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}