Top Related Projects
- tensorflow/models: Models and examples built with TensorFlow
- huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
- pytorch/vision: Datasets, Transforms and Models specific to Computer Vision
- keras-team/keras: Deep Learning for humans
- microsoft/DeepSpeed: A deep learning optimization library that makes distributed training and inference easy, efficient, and effective
- facebookresearch/fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python
Quick Overview
The Sarasra/models repository is a collection of machine learning models and related utilities, primarily focused on natural language processing (NLP) tasks. It includes implementations of various neural network architectures and pre-trained models for tasks such as text classification, language modeling, and sequence-to-sequence learning.
Pros
- Diverse range of NLP models and techniques
- Well-documented code and examples
- Regular updates and contributions from the community
- Integration with popular deep learning frameworks like TensorFlow
Cons
- Some models may be outdated compared to state-of-the-art techniques
- Limited support for non-NLP tasks
- Inconsistent coding style across different models
- Lack of comprehensive benchmarking results for all models
Code Examples
- Loading a pre-trained BERT model for text classification:
```python
import tensorflow as tf
from official.nlp.bert import bert_models
from official.nlp.bert import configs as bert_configs

# Load the BERT configuration (file paths here are illustrative)
bert_config = bert_configs.BertConfig.from_json_file('bert_config.json')

# Build a BERT classifier with two labels; note that in some TF Models
# releases classifier_model returns a (classifier, encoder) tuple
model = bert_models.classifier_model(bert_config, num_labels=2)

# Restore pre-trained weights from a checkpoint
checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore('bert_model.ckpt').expect_partial()
```
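Once the weights are restored, the classifier consumes the standard three BERT input tensors. A minimal inference sketch with dummy inputs follows; the sequence length and the list-style input packing are assumptions that vary across TF Models versions (some releases expect a dict keyed by input name):

```python
import tensorflow as tf

# Dummy batch: one sequence of length 128 (length is an assumption)
input_word_ids = tf.zeros((1, 128), dtype=tf.int32)
input_mask = tf.ones((1, 128), dtype=tf.int32)
input_type_ids = tf.zeros((1, 128), dtype=tf.int32)

# Run a forward pass; `model` is the classifier built above
logits = model([input_word_ids, input_mask, input_type_ids], training=False)
print(logits.shape)  # (1, 2): one score per label
```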
- Fine-tuning a transformer model for sequence-to-sequence tasks:
```python
import tensorflow as tf
from official.nlp.transformer import transformer
from official.nlp.transformer import optimizer

# `params` is a hyperparameter collection defined elsewhere
# (e.g. the base Transformer configuration shipped with the repo)
model = transformer.create_model(params, is_train=True)

# Learning-rate schedule and optimizer from the original Transformer recipe
learning_rate = optimizer.LearningRateSchedule(params.learning_rate, params.hidden_size)
opt = tf.keras.optimizers.Adam(learning_rate, beta_1=0.9, beta_2=0.98, epsilon=1e-9)

# Compile with the loss and padded-accuracy helpers from this implementation
model.compile(opt, loss=transformer.loss_function, metrics=[transformer.padded_accuracy])

# Fine-tune on pre-built tf.data datasets
model.fit(train_dataset, epochs=params.train_epochs, validation_data=val_dataset)
```
- Using a pre-trained language model for text generation (the helper and class names below are illustrative and vary across releases):
```python
from official.nlp.modeling import models
from official.nlp.data import data_loader_factory

# Illustrative helpers: the tokenizer factory and generator class shown
# here are not part of every TF Models release
tokenizer = data_loader_factory.get_tokenizer(params)
model = models.BertGenerator(params)

# Encode a prompt and generate up to 50 tokens
input_ids = tokenizer.encode("The quick brown fox")
output = model.generate(input_ids, max_length=50)

generated_text = tokenizer.decode(output[0])
print(generated_text)
```
Getting Started
To get started with the Sarasra/models repository:
1. Clone the repository:

```bash
git clone https://github.com/Sarasra/models.git
cd models
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Choose a model and task from the official or research directories.

4. Follow the README and example scripts in the chosen model's directory to train or use the model for your specific task.
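The official models import as a top-level official package, so the repository root must be on your Python path before the code examples above will resolve. A minimal sketch, assuming the clone location below:

```python
import sys

sys.path.append('/path/to/models')  # wherever the repository was cloned

# Imports from the repo should now resolve
from official.nlp.bert import configs as bert_configs
```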
Competitor Comparisons
tensorflow/models: Models and examples built with TensorFlow
Pros of models
- Larger community and more active development
- Wider range of model implementations across various domains
- Better documentation and examples for getting started
Cons of models
- Can be overwhelming due to the large number of models and examples
- May have more dependencies and complexity for simple use cases
- Potentially slower to incorporate cutting-edge research compared to smaller repos
Code comparison
models:

```python
import tensorflow as tf
from official.nlp import bert
import official.nlp.bert.tokenization as tokenization

tokenizer = tokenization.FullTokenizer(vocab_file=vocab_file, do_lower_case=True)
model = bert.bert_models.get_transformer_encoder(...)
```
Sarasra/models:

```python
import tensorflow as tf
from models import modeling
from models import tokenization

tokenizer = tokenization.FullTokenizer(vocab_file=vocab_file, do_lower_case=True)
model = modeling.BertModel(config=bert_config, is_training=is_training, ...)
```
Both repositories provide implementations of BERT and other models, but models offers a more extensive collection of pre-trained models and examples across various domains. Sarasra/models focuses primarily on BERT and related models, which can be simpler for specific use cases but lacks the breadth of the upstream models repository.
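For context, the Sarasra/models snippet follows the original TF1-style BERT API, where inputs are passed to the constructor and features are read from accessor methods. A hedged sketch, assuming that API carries over to this fork (the sequence length and placeholder names are assumptions):

```python
import tensorflow.compat.v1 as tf
from models import modeling  # import path as in the snippet above

tf.disable_v2_behavior()  # the original BERT code runs in TF1 graph mode

# Graph-mode placeholders for a batch of tokenized sequences
input_ids = tf.placeholder(tf.int32, shape=[None, 128])
input_mask = tf.placeholder(tf.int32, shape=[None, 128])
segment_ids = tf.placeholder(tf.int32, shape=[None, 128])

model = modeling.BertModel(
    config=bert_config,  # a modeling.BertConfig loaded elsewhere
    is_training=False,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=segment_ids,
)
pooled_output = model.get_pooled_output()  # [batch, hidden] sentence vector
```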
huggingface/transformers: 🤗 Transformers, state-of-the-art machine learning for PyTorch, TensorFlow, and JAX
Pros of transformers
- Extensive library of pre-trained models for various NLP tasks
- Active community and frequent updates
- Easy-to-use API for fine-tuning and inference
Cons of transformers
- Larger repository size and potentially higher resource requirements
- Steeper learning curve for beginners
- Focused primarily on NLP tasks, less versatile for other domains
Code comparison
models:

```python
import tensorflow as tf
from official.nlp import bert

model = bert.BertModel(config=bert_config)
```
transformers:

```python
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')
```
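To illustrate the inference workflow the pros list mentions, here is a minimal end-to-end sketch with the matching tokenizer (the model name and prompt are arbitrary choices):

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Tokenize a prompt and run a forward pass
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 6, 768])
```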
Key differences
- models is part of TensorFlow's official models collection, while transformers is a standalone library
- transformers supports multiple deep learning frameworks (PyTorch, TensorFlow, JAX), whereas models is primarily TensorFlow-based
- transformers offers a wider range of pre-trained models and tasks, while models provides implementations for various machine learning domains
Use cases
- Choose models for TensorFlow-specific projects or when working with official TensorFlow implementations
- Opt for transformers when focusing on NLP tasks or requiring cross-framework compatibility
pytorch/vision: Datasets, Transforms and Models specific to Computer Vision
Pros of vision
- Focused specifically on computer vision tasks and models
- Tightly integrated with PyTorch ecosystem
- More active development and frequent updates
Cons of vision
- Narrower scope, limited to vision-related models and tools
- Less comprehensive documentation compared to models
Code comparison
models:

```python
import tensorflow as tf
from models import official

model = official.nlp.modeling.models.BertClassifier(...)
```
vision:

```python
import torch
import torchvision.models as models

model = models.resnet50(pretrained=True)
```
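As a usage note, `pretrained=True` is deprecated in recent torchvision releases in favor of explicit weight enums. A minimal classification sketch under that newer API (the image path is a placeholder):

```python
import torch
from PIL import Image
from torchvision import models

# Explicit weights replace pretrained=True in torchvision >= 0.13
weights = models.ResNet50_Weights.IMAGENET1K_V1
model = models.resnet50(weights=weights)
model.eval()

# Preprocess with the transforms the weights were trained with
preprocess = weights.transforms()
img = preprocess(Image.open('cat.jpg')).unsqueeze(0)  # placeholder image

with torch.no_grad():
    logits = model(img)
print(logits.argmax(dim=1))  # predicted ImageNet class index
```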
Summary
models is a more comprehensive repository covering various machine learning domains, while vision specializes in computer vision tasks. models offers a wider range of models and better documentation, but vision benefits from tighter PyTorch integration and more frequent updates. The choice between them depends on the specific project requirements and preferred deep learning framework.
keras-team/keras: Deep Learning for humans
Pros of Keras
- More user-friendly and intuitive API for building neural networks
- Extensive documentation and community support
- Supports multiple backend engines (TensorFlow, Theano, CNTK)
Cons of Keras
- Less flexibility for low-level operations compared to models
- May have slightly slower performance due to higher-level abstractions
- Limited support for certain advanced research-oriented features
Code Comparison
models (TensorFlow):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
```
Keras:

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
```
Both repositories offer ways to build neural networks, but Keras provides a more streamlined API. The models repository is part of the larger TensorFlow ecosystem, offering more low-level control and integration with TensorFlow features. Keras, while now integrated into TensorFlow, still maintains its own repository with a focus on ease of use and quick prototyping. The code comparison shows the similarity in basic model creation, with Keras offering a slightly more concise syntax.
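In either API, training follows the same compile-and-fit pattern. A minimal sketch with random placeholder data (the shapes and hyperparameters are assumptions):

```python
import numpy as np
import tensorflow as tf

# Placeholder data: 1000 samples, 20 features, 10 classes
x_train = np.random.rand(1000, 20).astype('float32')
y_train = np.random.randint(0, 10, size=(1000,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)
```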
microsoft/DeepSpeed: A deep learning optimization library that makes distributed training and inference easy, efficient, and effective
Pros of DeepSpeed
- Focuses on optimizing and scaling deep learning training, especially for large models
- Provides advanced features like ZeRO optimizer and pipeline parallelism
- Actively maintained with frequent updates and improvements
Cons of DeepSpeed
- Steeper learning curve due to its specialized nature
- May be overkill for smaller projects or simpler model training tasks
Code Comparison
DeepSpeed:

```python
import deepspeed

model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
                                                     model=model,
                                                     model_parameters=params)
```
models:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
```
Summary
DeepSpeed is a specialized library for optimizing large-scale deep learning training, offering advanced features for performance and scalability. models is a more general-purpose repository with various pre-built models and examples. DeepSpeed is better suited for projects requiring high-performance training of large models, while models provides a broader range of pre-implemented models and examples for various tasks.
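To show where the engine fits into a training run, here is a hedged training-loop sketch in DeepSpeed's usual style; the model, args, data loader, and loss function are assumed to be defined elsewhere:

```python
import deepspeed

# model, args, and train_loader are defined elsewhere
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args, model=model, model_parameters=model.parameters()
)

for inputs, labels in train_loader:
    inputs = inputs.to(model_engine.local_rank)
    labels = labels.to(model_engine.local_rank)
    loss = loss_fn(model_engine(inputs), labels)  # loss_fn is a placeholder
    model_engine.backward(loss)  # the engine handles gradient scaling
    model_engine.step()          # optimizer step plus ZeRO bookkeeping
```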
facebookresearch/fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python
Pros of fairseq
- More focused on sequence modeling tasks, particularly machine translation
- Extensive documentation and examples for various NLP tasks
- Active development and frequent updates
Cons of fairseq
- Steeper learning curve for beginners
- Less versatile compared to models, which covers a broader range of ML tasks
Code Comparison
fairseq:

```python
from fairseq.models.transformer import TransformerModel

en2de = TransformerModel.from_pretrained(
    '/path/to/checkpoints',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='data-bin/wmt16_en_de_bpe32k'
)
```
models:

```python
import tensorflow as tf
from official.nlp import modeling

model = modeling.networks.TransformerEncoder(
    vocab_size=30522,
    num_layers=12,
    hidden_size=768,
    num_attention_heads=12,
    intermediate_size=3072
)
```
Both repositories provide implementations of popular deep learning models, but fairseq focuses more on sequence-to-sequence tasks, while models offers a broader range of machine learning models and tasks. fairseq is more specialized for NLP researchers, while models caters to a wider audience of machine learning practitioners.
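For completeness, the hub interface returned by from_pretrained exposes a one-line translation call; a short sketch, assuming the checkpoint loaded above exists:

```python
# Continuing from the fairseq snippet above
en2de.eval()
print(en2de.translate('Hello world!'))  # e.g. 'Hallo Welt!'
```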
README
TensorFlow Models
This repository contains a number of different models implemented in TensorFlow:
The official models are a collection of example models that use TensorFlow's high-level APIs. They are intended to be well-maintained, tested, and kept up to date with the latest stable TensorFlow API. They should also be reasonably optimized for fast performance while still being easy to read. We especially recommend that newer TensorFlow users start here.
The research models are a large collection of models implemented in TensorFlow by researchers. It is up to the individual researchers to maintain the models and/or provide support on issues and pull requests.
The samples folder contains code snippets and smaller models that demonstrate features of TensorFlow, including code presented in various blog posts.
The tutorials folder is a collection of models described in the TensorFlow tutorials.