# OpenNMT-py

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

## Top Related Projects

- **fairseq** — Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
- **transformers** — 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- **tensor2tensor** — Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
- **seq2seq** — A general-purpose encoder-decoder framework for TensorFlow.
- **MASS** — Masked Sequence to Sequence Pre-training for Language Generation.
- **ParlAI** — A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
## Quick Overview
OpenNMT-py is an open-source toolkit for neural machine translation and sequence modeling. It is the PyTorch version of the OpenNMT project, providing a flexible and modular framework for training and deploying various neural network models for sequence-to-sequence tasks.
### Pros
- Highly customizable and extensible architecture
- Supports a wide range of model architectures and training techniques
- Active community and regular updates
- Comprehensive documentation and examples
### Cons
- Steeper learning curve for beginners compared to some other NMT frameworks
- Can be resource-intensive for large-scale training
- Some advanced features may require in-depth knowledge of PyTorch
## Code Examples
- Loading a pre-trained model and translating a sentence (a sketch following the 2.x-style `build_translator` helper; the exact API differs across OpenNMT-py releases):

```python
import onmt.opts as opts
from onmt.utils.parse import ArgumentParser
from onmt.translate.translator import build_translator

# Build the translation option parser and point it at a checkpoint.
# -src is a required flag of translate_opts; it is a placeholder here
# since we pass the source text directly to translate().
parser = ArgumentParser()
opts.config_opts(parser)
opts.translate_opts(parser)
opt = parser.parse_args(['-model', 'path/to/model.pt', '-src', 'dummy'])

translator = build_translator(opt, report_score=True)

# translate() consumes raw source examples and returns scores and n-best predictions.
scores, predictions = translator.translate(src=["Hello, how are you?"], batch_size=1)
print(predictions[0][0])
```
- Training a new model (flag names follow the legacy 1.x command line; `onmt.bin.train.main()` reads its options from `sys.argv`, so we patch it here):

```python
import sys
from onmt.bin.train import main

# Emulate the `onmt_train` command line. Recent releases (2.x/3.x)
# configure training through a YAML file passed via -config instead.
sys.argv = ["onmt_train",
    "-data", "path/to/data",
    "-save_model", "path/to/save/model",
    "-layers", "6",
    "-rnn_size", "512",
    "-word_vec_size", "512",
    "-transformer_ff", "2048",
    "-heads", "8",
    "-encoder_type", "transformer",
    "-decoder_type", "transformer",
    "-position_encoding",
    "-train_steps", "100000",
    "-max_generator_batches", "2",
    "-dropout", "0.1",
    "-batch_size", "4096",
    "-batch_type", "tokens",
    "-normalization", "tokens",
    "-accum_count", "2",
    "-optim", "adam",
    "-adam_beta2", "0.998",
    "-decay_method", "noam",
    "-warmup_steps", "8000",
    "-learning_rate", "2",
    "-max_grad_norm", "0",
    "-param_init", "0",
    "-param_init_glorot",
    "-label_smoothing", "0.1",
    "-valid_steps", "10000",
    "-save_checkpoint_steps", "10000",
    "-world_size", "1",
    "-gpu_ranks", "0",
]
main()
```
- Preprocessing data for training (again via `sys.argv`; note that `onmt.bin.preprocess` belongs to the legacy 1.x flow and was replaced by on-the-fly transforms and `onmt_build_vocab` in 2.x):

```python
import sys
from onmt.bin.preprocess import main

# Emulate the legacy `onmt_preprocess` command line.
sys.argv = ["onmt_preprocess",
    "-train_src", "path/to/train.src",
    "-train_tgt", "path/to/train.tgt",
    "-valid_src", "path/to/valid.src",
    "-valid_tgt", "path/to/valid.tgt",
    "-save_data", "path/to/processed/data",
    "-src_vocab_size", "50000",
    "-tgt_vocab_size", "50000",
    "-src_seq_length", "100",
    "-tgt_seq_length", "100",
]
main()
```
## Getting Started

- Install OpenNMT-py:

```bash
pip install OpenNMT-py
```

- Prepare your data:
  - Create source and target text files for training and validation
  - Preprocess the data using the `onmt.bin.preprocess` script
- Train a model:
  - Use the `onmt.bin.train` script with appropriate arguments
- Translate using the trained model:
  - Use the `onmt.bin.translate` script with the trained checkpoint

An end-to-end sketch of this pipeline is shown below.
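This sketch assumes the console entry points that `pip install OpenNMT-py` registers for the legacy 1.x pipeline (`onmt_preprocess`, `onmt_train`, `onmt_translate`); paths and the checkpoint name are placeholders:

```bash
# Build vocabularies and serialized training shards from raw parallel text.
onmt_preprocess -train_src data/train.src -train_tgt data/train.tgt \
                -valid_src data/valid.src -valid_tgt data/valid.tgt \
                -save_data data/processed

# Train with default settings; add the Transformer flags from the
# training example above for a Transformer model.
onmt_train -data data/processed -save_model models/demo

# Translate a test file with a saved checkpoint.
onmt_translate -model models/demo_step_100000.pt -src data/test.src \
               -output pred.txt -verbose
```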
## Competitor Comparisons
### fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
#### Pros of fairseq
- More extensive model support, including advanced architectures like Transformer-XL and RoBERTa
- Better scalability for large-scale training and distributed computing
- More active development and frequent updates from Facebook AI Research team
#### Cons of fairseq
- Steeper learning curve and more complex codebase
- Less beginner-friendly documentation compared to OpenNMT-py
- Requires more computational resources for optimal performance
#### Code Comparison
fairseq:

```python
from fairseq.models.transformer import TransformerModel

# Hub-style loading of a trained translation checkpoint.
model = TransformerModel.from_pretrained('/path/to/model', checkpoint_file='model.pt')
translations = model.translate(['Hello world!'])
```

OpenNMT-py:

```python
from onmt.translate.translation_server import TranslationServer

# The REST server wrapper: start() loads the models listed in a JSON
# configuration file; run() takes a list of {"id", "src"} dicts.
server = TranslationServer()
server.start('conf.json')
results = server.run([{"id": 100, "src": "Hello world!"}])
```
Both repositories offer powerful neural machine translation capabilities, but fairseq provides more advanced features and scalability at the cost of increased complexity. OpenNMT-py is more accessible for beginners and smaller projects, while fairseq is better suited for large-scale research and production environments.
### transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
#### Pros of transformers
- Broader scope: Supports a wide range of NLP tasks and models beyond machine translation
- Extensive pre-trained models: Offers a vast library of pre-trained models for various languages and tasks
- Active community: Frequent updates, contributions, and extensive documentation
#### Cons of transformers
- Steeper learning curve: More complex API due to its broader scope
- Potentially slower for specific tasks: OpenNMT-py may be more optimized for machine translation
#### Code Comparison

transformers:

```python
from transformers import MarianMTModel, MarianTokenizer

model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-de")
tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
translated = model.generate(**tokenizer(["Hello, how are you?"], return_tensors="pt", padding=True))
print(tokenizer.decode(translated[0], skip_special_tokens=True))
```

OpenNMT-py (a sketch reusing the `build_translator` pattern from the Code Examples section; `opt` is a parsed option set pointing at a checkpoint):

```python
from onmt.translate.translator import build_translator

translator = build_translator(opt, report_score=True)
scores, predictions = translator.translate(src=["Hello, how are you?"], batch_size=1)
print(predictions[0][0])
```
### tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
#### Pros of tensor2tensor
- Broader scope, covering a wide range of machine learning tasks beyond just neural machine translation
- Tighter integration with TensorFlow ecosystem, potentially offering better performance optimizations
- More extensive library of pre-built models and datasets
#### Cons of tensor2tensor
- Steeper learning curve due to its broader scope and more complex architecture
- Less focused on neural machine translation specifically, which may make it harder to use for NMT-only projects
- Potentially more challenging to customize or extend for specific use cases
#### Code Comparison
tensor2tensor:

```python
import tensorflow as tf
from tensor2tensor import problems
from tensor2tensor.utils import registry

# Sketch of the T2T Estimator workflow; the elided arguments depend on
# the chosen problem and hardware setup.
problem = problems.problem("translate_ende_wmt32k")
model = registry.model("transformer")
hparams = tf.contrib.training.HParams(...)
estimator = tf.estimator.Estimator(model_fn=model.estimator_model_fn, ...)
```

OpenNMT-py:

```python
import onmt
from onmt.model_builder import build_model
from onmt.utils.parse import ArgumentParser

# Sketch only: model_opt, fields and checkpoint come from a loaded
# configuration and vocabulary, as assembled by onmt.bin.train.
parser = ArgumentParser()
opt = parser.parse_args()
model = build_model(model_opt, opt, fields, checkpoint)
trainer = onmt.Trainer(model, ...)
trainer.train()
```
Both repositories offer powerful tools for machine translation and other NLP tasks. tensor2tensor provides a more comprehensive framework for various machine learning tasks, while OpenNMT-py focuses specifically on neural machine translation with a simpler, more straightforward API.
### seq2seq

A general-purpose encoder-decoder framework for TensorFlow.
#### Pros of seq2seq
- More comprehensive documentation and tutorials
- Broader range of pre-implemented models and features
- Tighter integration with TensorFlow ecosystem
#### Cons of seq2seq
- Less active development and community support
- More complex setup and configuration process
- Steeper learning curve for beginners
#### Code Comparison
seq2seq:

```python
import tensorflow as tf
from seq2seq import models

# Sketch: google/seq2seq models take a params dict and a mode, and are
# trained through tf.learn input pipelines rather than a direct train() call.
model = models.BasicSeq2Seq(params, mode=tf.contrib.learn.ModeKeys.TRAIN)
```

OpenNMT-py:

```python
from onmt.model_builder import build_model
from onmt.trainer import Trainer

# Sketch only: these objects are normally assembled by onmt.bin.train
# from a parsed configuration.
model = build_model(model_opt, opt, fields, checkpoint)
trainer = Trainer(model, train_loss, valid_loss, optim)
trainer.train(train_iter, train_steps, valid_iter=valid_iter)
```
Both repositories offer powerful sequence-to-sequence modeling capabilities, but they cater to different user needs. seq2seq provides a wider range of pre-implemented models and features, making it suitable for users who require more advanced options out-of-the-box. However, it has a steeper learning curve and less active community support.
OpenNMT-py, on the other hand, offers a more streamlined experience with easier setup and a more active community. It may be more suitable for users who prioritize simplicity and ongoing support. The code comparison shows that both libraries have relatively straightforward APIs for basic model creation and training, but seq2seq's approach is more tightly integrated with TensorFlow.
### MASS

MASS: Masked Sequence to Sequence Pre-training for Language Generation.
#### Pros of MASS
- Specialized in masked sequence-to-sequence pre-training for language generation tasks
- Supports cross-lingual transfer learning for low-resource languages
- Achieves state-of-the-art results on various NLP tasks like machine translation and text summarization
#### Cons of MASS
- Less flexible compared to OpenNMT-py's modular architecture
- Limited to specific pre-training and fine-tuning approaches
- Smaller community and fewer contributions compared to OpenNMT-py
#### Code Comparison
MASS example (illustrative only; the official MASS release is a set of fairseq-based training scripts, so this pip-style interface is hypothetical):

```python
# Hypothetical generation interface for a pretrained MASS checkpoint.
from mass import MassForConditionalGeneration

model = MassForConditionalGeneration.from_pretrained("microsoft/mass-base-uncased")
outputs = model.generate(input_ids, max_length=50)
```

OpenNMT-py example (a sketch using the `build_translator` pattern shown earlier; `opt` points at a trained checkpoint):

```python
from onmt.translate.translator import build_translator

translator = build_translator(opt, report_score=True)
scores, predictions = translator.translate(src=src_data, batch_size=opt.batch_size)
```
Both repositories provide powerful tools for neural machine translation and sequence-to-sequence tasks. MASS focuses on masked sequence pre-training and cross-lingual transfer, while OpenNMT-py offers a more flexible and modular approach to building various NMT architectures. The choice between them depends on specific project requirements and the desired level of customization.
### ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
#### Pros of ParlAI
- Broader focus on dialogue tasks and multi-agent interactions
- Extensive library of pre-built datasets and agents
- Integrated with modern AI frameworks like PyTorch and Hugging Face
#### Cons of ParlAI
- Steeper learning curve due to more complex architecture
- Less specialized for pure machine translation tasks
- May be overkill for simpler NLP projects
#### Code Comparison
ParlAI example (a sketch of the canonical agent/world loop; `opt` comes from ParlAI's option parser and selects the task and model):

```python
from parlai.core.agents import create_agent
from parlai.core.worlds import create_task

# Build an agent from options, wrap it in a task world, and step once.
agent = create_agent(opt)
world = create_task(opt, agent)
world.parley()
print(world.display())
```

OpenNMT-py example (there is no `Translator.from_file` helper; this reuses the hedged `build_translator` pattern from above):

```python
from onmt.translate.translator import build_translator

translator = build_translator(opt, report_score=True)
scores, predictions = translator.translate(src=['Hello, world!'], batch_size=1)
```
ParlAI offers a more comprehensive framework for dialogue tasks, while OpenNMT-py provides a streamlined approach for machine translation. ParlAI's flexibility comes at the cost of complexity, whereas OpenNMT-py is more focused but potentially limited in scope for broader NLP applications.
## README
Announcement: OpenNMT-py is no longer actively supported.

We have started a new project, Eole, available on GitHub.
It is a spin-off of OpenNMT-py in terms of features, but we have revamped a lot of the internals.
Eole handles NMT, LLMs, and encoders, as well as a new concept of an Estimator within an NMT model; see this post and this news.
If you are a developer, switch now. If you are a user only, we will publish the first PyPI versions shortly.
## OpenNMT-py: Open-Source Neural Machine Translation and (Large) Language Models
OpenNMT-py is the PyTorch version of the OpenNMT project, an open-source (MIT) neural machine translation (and beyond!) framework. It is designed to be research friendly to try out new ideas in translation, language modeling, summarization, and many other NLP tasks. Some companies have proven the code to be production ready.
We love contributions! Please look at issues marked with the contributions welcome tag.
Before raising an issue, make sure you read the requirements and the Full Documentation examples.
Unless there is a bug, please use the Forum or Gitter to ask questions.
For beginners:

There is a step-by-step, fully explained tutorial (thanks to Yasmin Moslem): Tutorial.
Please try to read and/or follow it before raising beginner issues.
Otherwise you can just have a look at the Quickstart steps.
New:

- You will need PyTorch v2, preferably v2.2, which fixes some `scaled_dot_product_attention` issues.
- LLM support with converters for: Llama (+ Mistral), OpenLlama, Redpajama, MPT-7B, Falcon.
- Support for 8-bit and 4-bit quantization along with LoRA adapters, with or without checkpointing (see the configuration sketch after this list).
- You can finetune 7B and 13B models on a single RTX GPU with 24 GB of memory using 4-bit quantization.
- Inference can be forced into 4/8-bit using the same layer quantization as in finetuning.
- Tensor parallelism when the model does not fit in one GPU's memory (both training and inference).
- Once your model is finetuned you can run inference either with OpenNMT-py or, faster, with CTranslate2.
- MMLU evaluation script, see results here.
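A rough sketch of the quantized-LoRA finetuning and CTranslate2 export mentioned above. The YAML keys (`lora_layers`, `lora_rank`, `lora_alpha`, `lora_dropout`, `quant_layers`, `quant_type`) follow the LLM finetuning tutorials, but treat the exact names and values as assumptions to verify against the documentation for your release:

```bash
# Minimal 4-bit + LoRA section of a finetuning config (data, vocab and
# model sections omitted; layer names depend on the architecture).
cat >> finetune.yaml <<'EOF'
lora_layers: ['linear_values', 'linear_query', 'linear_keys', 'final_linear']
lora_rank: 2
lora_dropout: 0.05
lora_alpha: 8
quant_layers: ['linear_values', 'linear_query', 'linear_keys', 'final_linear']
quant_type: "bnb_NF4"
EOF

onmt_train -config finetune.yaml

# After merging the LoRA weights, convert the checkpoint for fast
# CTranslate2 inference (8-bit here; the checkpoint name is a placeholder).
ct2-opennmt-py-converter --model_path finetuned.pt --output_dir ct2_model \
                         --quantization int8
```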
For all use cases, including NMT, you can now use multi-query attention instead of multi-head attention (faster at training and inference) and remove biases from all Linear layers (the QKV projections as well as the FeedForward modules).

If you used previous versions of OpenNMT-py, you can check the Changelog or the Breaking Changes.
Tutorials:

- How to replicate Vicuna with a 7B or 13B Llama (or OpenLlama, MPT-7B, Redpajama) language model: Tuto Vicuna
- How to finetune NLLB-200 with your dataset: Tuto Finetune NLLB-200
- How to create a simple OpenNMT-py REST server (a sample request is sketched after this list): Tuto REST
- How to create a simple web interface: Tuto Streamlit
- Replicate the WMT17 en-de experiment: WMT17 ENDE
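Once the REST server from the tutorial is running, a translation request is a plain JSON POST. The flags and the `/translator` route below match the repository's `server.py` defaults, but treat ports, paths, and the model id as placeholders:

```bash
# Start the server with a model configuration file.
python server.py --ip 0.0.0.0 --port 5000 --url_root /translator \
                 --config available_models/conf.json

# Query it: the payload is a list of {"src", "id"} objects, where "id"
# selects one of the models declared in conf.json.
curl -X POST -H "Content-Type: application/json" \
     -d '[{"src": "Hello world!", "id": 100}]' \
     http://localhost:5000/translator/translate
```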
## Setup
### Using Docker
To facilitate setup and reproducibility, some Docker images are made available via the GitHub Container Registry: https://github.com/OpenNMT/OpenNMT-py/pkgs/container/opennmt-py

You can adapt the workflow and build your own image(s) depending on specific needs by using `build.sh` and `Dockerfile` in the `docker` directory of the repo.

```bash
docker pull ghcr.io/opennmt/opennmt-py:3.4.3-ubuntu22.04-cuda12.1
```

Example one-liner to run a container and open a bash shell within it:

```bash
docker run --rm -it --runtime=nvidia ghcr.io/opennmt/opennmt-py:test-ubuntu22.04-cuda12.1
```
Note: you need to have the NVIDIA Container Toolkit (formerly nvidia-docker) installed to properly take advantage of the CUDA/GPU features.

Depending on your needs you can add various flags (a combined invocation is sketched below):

- `-p 5000:5000` to forward some exposed port from your container to your host;
- `-v /some/local/directory:/some/container/directory` to mount some local directory to some container directory;
- `--entrypoint some_command` to directly run some specific command as the container entry point (instead of the default bash shell).
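Putting those flags together, a typical invocation might look like this (port, directory, and image tag are placeholders):

```bash
# GPU access, a forwarded port for the REST server, and a mounted data directory.
docker run --rm -it --runtime=nvidia \
    -p 5000:5000 \
    -v /home/me/data:/data \
    ghcr.io/opennmt/opennmt-py:3.4.3-ubuntu22.04-cuda12.1
```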
### Installing locally

OpenNMT-py requires:

- Python >= 3.8
- PyTorch >= 2.0, < 2.2

Install OpenNMT-py from `pip`:

```bash
pip install OpenNMT-py
```

or from the source:

```bash
git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py
pip install -e .
```

Note: if you encounter a `MemoryError` during installation, try to use `pip` with `--no-cache-dir`.

(Optional) Some advanced features (e.g. working pretrained models or specific transforms) require extra packages; you can install them with:

```bash
pip install -r requirements.opt.txt
```
### Manual installation of some dependencies

Apex is highly recommended for fast performance (especially the legacy FusedAdam optimizer and FusedRMSNorm):

```bash
git clone https://github.com/NVIDIA/apex
cd apex
pip3 install -v --no-build-isolation --config-settings --build-option="--cpp_ext --cuda_ext --deprecated_fused_adam --xentropy --fast_multihead_attn" ./
cd ..
```
Flash attention:

As of Oct. 2023, flash attention 1 has been upstreamed to PyTorch v2, but it is recommended to use flash attention 2 with v2.3.1 for sliding-window attention support.

When using regular `position_encoding=True` or rotary embeddings with `max_relative_positions=-1`, OpenNMT-py will try to use an optimized dot-product path.

If you want to use flash attention, you need to install it manually first:

```bash
pip install flash-attn --no-build-isolation
```

If flash attention 2 is not installed, `F.scaled_dot_product_attention` from PyTorch 2.x is used instead.

When using `max_relative_positions > 0` or ALiBi with `max_relative_positions=-2`, OpenNMT-py will use its legacy code for matrix multiplications. Flash attention and `F.scaled_dot_product_attention` are a bit faster and save some GPU memory.
AWQ:

If you want to run inference with, or quantize, an AWQ model you will need AutoAWQ:

```bash
pip install autoawq
```
## Documentation & FAQs
## Acknowledgements

OpenNMT-py is run as a collaborative open-source project. The project was incubated by SYSTRAN and Harvard NLP in 2016, originally in Lua, and ported to PyTorch in 2017.

Current maintainers (since 2018):
François Hernandez, Vincent Nguyen (Seedfall)
## Citation

If you are using OpenNMT-py for academic work, please cite the initial system demonstration paper published in ACL 2017:

```bibtex
@misc{klein2018opennmt,
  title={OpenNMT: Neural Machine Translation Toolkit},
  author={Guillaume Klein and Yoon Kim and Yuntian Deng and Vincent Nguyen and Jean Senellart and Alexander M. Rush},
  year={2018},
  eprint={1805.11462},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```