Top Related Projects
- 🤗 Transformers (huggingface/transformers): State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- PyTorch (pytorch/pytorch): Tensors and Dynamic neural networks in Python with strong GPU acceleration.
- TensorFlow (tensorflow/tensorflow): An Open Source Machine Learning Framework for Everyone.
- DeepSpeed (microsoft/DeepSpeed): A deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- BERT (google-research/bert): TensorFlow code and pre-trained models for BERT.
- fairseq (facebookresearch/fairseq): Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Quick Overview
The huggingface/huggingface_hub repository hosts a Python library that provides an interface for interacting with the Hugging Face Hub. It lets users easily download and upload models, datasets, and other artifacts, as well as manage repositories on the Hub, making it a key building block for integrating Hugging Face's ecosystem into machine learning workflows.
Pros
- Seamless integration with Hugging Face's ecosystem of models and datasets
- Easy-to-use API for downloading, uploading, and managing Hub resources
- Supports various file formats and versioning
- Extensive documentation and community support
Cons
- Requires an internet connection for most operations
- May have rate limits or restrictions for heavy usage
- Learning curve for users unfamiliar with Hugging Face's ecosystem
- Dependency on external services (Hugging Face Hub)
Code Examples
- Downloading a model from the Hub:
```python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(repo_id="bert-base-uncased", filename="pytorch_model.bin")
print(f"Model downloaded to: {model_path}")
```
- Uploading a file to the Hub:
```python
from huggingface_hub import HfApi

api = HfApi()
api.upload_file(
    path_or_fileobj="./my_model.safetensors",
    path_in_repo="model.safetensors",
    repo_id="username/my-model",
    repo_type="model",
)
```
- Creating a new repository on the Hub:
```python
from huggingface_hub import create_repo

repo_url = create_repo(repo_id="username/new-repo", private=True)
print(f"New repository created at: {repo_url}")
```
Getting Started
To get started with the huggingface_hub library, follow these steps:
- Install the library:
```bash
pip install huggingface_hub
```
- Import the necessary modules and authenticate:
```python
from huggingface_hub import login

login()  # This will prompt for your Hugging Face access token
```
- Use the library to interact with the Hub:
```python
from huggingface_hub import hf_hub_download, HfApi

# Download a single file from a model repo
model_path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")

# List the files in a repository
api = HfApi()
files = api.list_repo_files("username/my-repo")
print(files)
```
With these steps, you can start using the huggingface_hub library to interact with the Hugging Face Hub, download models and datasets, and manage your repositories.
Competitor Comparisons
🤗 Transformers (huggingface/transformers): State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Pros of transformers
- Comprehensive library for state-of-the-art NLP models
- Extensive documentation and community support
- Seamless integration with popular deep learning frameworks
Cons of transformers
- Larger codebase and potentially steeper learning curve
- Higher computational requirements for some models
- May include unnecessary features for simple use cases
Code comparison
transformers:
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
```
huggingface_hub:
```python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(repo_id="bert-base-uncased", filename="pytorch_model.bin")
```
The transformers library provides a higher-level API for working with pre-trained models, while huggingface_hub focuses on model and dataset management. transformers offers more comprehensive functionality for NLP tasks, but huggingface_hub is lighter and more focused on interacting with the Hugging Face ecosystem.
PyTorch (pytorch/pytorch): Tensors and Dynamic neural networks in Python with strong GPU acceleration.
Pros of PyTorch
- More comprehensive and lower-level deep learning framework
- Wider range of applications beyond NLP, including computer vision and reinforcement learning
- Larger community and ecosystem of tools and extensions
Cons of PyTorch
- Steeper learning curve for beginners
- Less focus on ease of use for specific NLP tasks
- Requires more boilerplate code for common operations
Code Comparison
PyTorch:
```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.matmul(x, y)  # dot product of two 1-D tensors
```
Hugging Face Hub:
```python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(repo_id="bert-base-uncased", filename="pytorch_model.bin")
```
PyTorch provides lower-level tensor operations, while Hugging Face Hub focuses on easy model downloading and management. PyTorch offers more flexibility but requires more code for basic operations, whereas Hugging Face Hub simplifies access to pre-trained models and datasets for NLP tasks.
TensorFlow (tensorflow/tensorflow): An Open Source Machine Learning Framework for Everyone.
Pros of TensorFlow
- Comprehensive ecosystem for machine learning, including tools for deployment and production
- Strong support for distributed computing and GPU acceleration
- Extensive documentation and large community support
Cons of TensorFlow
- Steeper learning curve compared to the Hugging Face Hub
- More complex setup and configuration process
- Less focus on natural language processing tasks specifically
Code Comparison
TensorFlow example:
```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
```
Hugging Face Hub example (loading a Hub-hosted model via transformers):
```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
```
TensorFlow provides a lower-level API for building custom models, while the Hugging Face Hub focuses on easy access to pre-trained models for NLP tasks. TensorFlow offers more flexibility but requires more code, whereas the Hugging Face ecosystem prioritizes simplicity and rapid prototyping for specific NLP applications.
DeepSpeed (microsoft/DeepSpeed): A deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- Focuses on optimizing large-scale model training and inference
- Offers advanced distributed training techniques like ZeRO and 3D parallelism
- Provides built-in support for mixed precision and gradient accumulation
Cons of DeepSpeed
- Steeper learning curve for beginners compared to Hugging Face Hub
- Less integrated ecosystem for model sharing and collaboration
- Requires more manual configuration for optimal performance
Code Comparison
DeepSpeed:
```python
import deepspeed

# args, model, and params are assumed to be defined elsewhere
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args, model=model, model_parameters=params
)
```
Hugging Face Hub:
```python
from huggingface_hub import Repository

repo = Repository(local_dir="./my_model", clone_from="username/my_model")
repo.git_pull()
```
Summary
DeepSpeed excels in optimizing large-scale model training and inference, offering advanced techniques for distributed computing. However, it has a steeper learning curve and requires more manual configuration. Hugging Face Hub, on the other hand, provides a more user-friendly interface for model sharing and collaboration, but may not offer the same level of performance optimization for large-scale models.
BERT (google-research/bert): TensorFlow code and pre-trained models for BERT.
Pros of BERT
- Original implementation of the groundbreaking BERT model
- Focused specifically on BERT, providing a deep dive into its architecture
- Includes pre-training and fine-tuning scripts for various tasks
Cons of BERT
- Less actively maintained compared to Hugging Face Hub
- Limited to BERT and its variants, not covering other model architectures
- Lacks the extensive ecosystem and community support of Hugging Face
Code Comparison
BERT:
```python
import tensorflow as tf
import modeling  # from the google-research/bert repository

bert_config = modeling.BertConfig.from_json_file("bert_config.json")
model = modeling.BertModel(
    config=bert_config,
    is_training=True,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=segment_ids)
```
Hugging Face Hub:
```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
```
The BERT repository provides a lower-level implementation, while Hugging Face Hub offers a more user-friendly interface with pre-trained models and easy integration. Hugging Face Hub also supports a wide range of models beyond BERT, making it more versatile for various NLP tasks.
fairseq (facebookresearch/fairseq): Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pros of fairseq
- Specialized for sequence-to-sequence tasks like machine translation
- Includes pre-trained models and benchmarks for various NLP tasks
- Offers more low-level control and customization options
Cons of fairseq
- Steeper learning curve, especially for beginners
- Less extensive documentation compared to Hugging Face Hub
- Narrower focus on sequence modeling tasks
Code Comparison
fairseq:
```python
from fairseq.models.transformer import TransformerModel

en2de = TransformerModel.from_pretrained(
    '/path/to/model',
    checkpoint_file='model.pt',
    data_name_or_path='data-bin/wmt14_en_de'
)
en2de.translate('Hello world!')
```
huggingface_hub:
```python
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
translator("Hello world!")
```
The fairseq example demonstrates more explicit model loading and data path specification, while the Hugging Face Hub example showcases a simpler, more abstracted approach using pipelines. fairseq offers more granular control, while Hugging Face Hub prioritizes ease of use and quick implementation.
README
The official Python client for the Hugging Face Hub.
English | Deutsch | हिंदी | 한국어 | 中文（简体）
Documentation: https://hf.co/docs/huggingface_hub
Source Code: https://github.com/huggingface/huggingface_hub
Welcome to the huggingface_hub library
The huggingface_hub library allows you to interact with the Hugging Face Hub, a platform democratizing open-source Machine Learning for creators and collaborators. Discover pre-trained models and datasets for your projects, or play with the thousands of machine learning apps hosted on the Hub. You can also create and share your own models, datasets, and demos with the community. The huggingface_hub library provides a simple way to do all these things with Python.
Key features
- Download files from the Hub.
- Upload files to the Hub.
- Manage your repositories.
- Run Inference on deployed models.
- Search for models, datasets and Spaces.
- Share Model Cards to document your models.
- Engage with the community through PRs and comments.
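Under the hood, most of these features are thin wrappers over the Hub's public REST API. As a minimal sketch (the https://huggingface.co/api/models endpoint is real, but the helper function below is a hypothetical illustration, not part of the library), a model search like the one HfApi.list_models performs boils down to a query URL of this shape:

```python
from urllib.parse import urlencode

def build_model_search_url(search: str, limit: int = 5) -> str:
    """Hypothetical helper: build a model-search URL against the Hub's
    public REST API. The huggingface_hub library wraps this endpoint
    for you (e.g. via HfApi.list_models), so you rarely build it by hand."""
    query = urlencode({"search": search, "limit": limit})
    return f"https://huggingface.co/api/models?{query}"

print(build_model_search_url("bert", limit=5))
# https://huggingface.co/api/models?search=bert&limit=5
```

The same pattern applies to datasets and Spaces, which the library exposes through corresponding search methods.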
Installation
Install the huggingface_hub package with pip:
```bash
pip install huggingface_hub
```
If you prefer, you can also install it with conda.
In order to keep the package minimal by default, huggingface_hub comes with optional dependencies useful for some use cases. For example, if you want a complete experience for Inference, run:
```bash
pip install huggingface_hub[inference]
```
To learn more about installation and optional dependencies, check out the installation guide.
Quick start
Download files
Download a single file
```python
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="tiiuae/falcon-7b-instruct", filename="config.json")
```
Or an entire repository
```python
from huggingface_hub import snapshot_download

snapshot_download("stabilityai/stable-diffusion-2-1")
```
Files will be downloaded in a local cache folder. More details in this guide.
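As a rough sketch of that cache layout (the folder-naming scheme below matches the current models--org--name convention, but treat the helper as an illustration, since the cache internals may change between versions):

```python
def repo_cache_folder(repo_id: str, repo_type: str = "model") -> str:
    """Illustrative sketch: how the local Hub cache names a repo's folder.

    By default the cache lives under ~/.cache/huggingface/hub, and each
    repo folder contains per-commit snapshots plus deduplicated blobs.
    """
    return f"{repo_type}s--" + repo_id.replace("/", "--")

print(repo_cache_folder("tiiuae/falcon-7b-instruct"))
# models--tiiuae--falcon-7b-instruct
```

Because the cache is shared, downloading the same file from two different scripts only fetches it once.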
Login
The Hugging Face Hub uses tokens to authenticate applications (see docs). To log in from your machine, run the following CLI command:
```bash
huggingface-cli login
# or using an environment variable
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
Create a repository
```python
from huggingface_hub import create_repo

create_repo(repo_id="super-cool-model")
```
Upload files
Upload a single file
```python
from huggingface_hub import upload_file

upload_file(
    path_or_fileobj="/home/lysandre/dummy-test/README.md",
    path_in_repo="README.md",
    repo_id="lysandre/test-model",
)
```
Or an entire folder
```python
from huggingface_hub import upload_folder

upload_folder(
    folder_path="/path/to/local/space",
    repo_id="username/my-cool-space",
    repo_type="space",
)
```
For more details, check out the upload guide.
Integrating with the Hub
We're partnering with cool open source ML libraries to provide free model hosting and versioning. You can find the existing integrations here.
The advantages are:
- Free model or dataset hosting for libraries and their users.
- Built-in file versioning, even with very large files, thanks to a git-based approach.
- A serverless Inference API for all publicly available models.
- In-browser widgets to play with the uploaded models.
- Anyone can upload a new model for your library; they just need to add the corresponding tag for the model to be discoverable.
- Fast downloads! We use CloudFront (a CDN) to geo-replicate downloads, so they're blazing fast from anywhere on the globe.
- Usage stats and more features to come.
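The git-based versioning above means every file is addressable at a specific revision (a branch, tag, or commit hash). As a sketch, the Hub serves files through resolve URLs of the shape below, which is what hf_hub_download uses when you pass a revision argument; the helper function itself is only an illustration, not part of the library:

```python
def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Illustrative sketch of the Hub's versioned download URL scheme."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

print(resolve_url("bert-base-uncased", "config.json"))
# https://huggingface.co/bert-base-uncased/resolve/main/config.json
```

Pinning a commit hash as the revision gives fully reproducible downloads, even as the repo's main branch moves on.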
If you would like to integrate your library, feel free to open an issue to begin the discussion. We wrote a step-by-step guide with ❤️ showing how to do this integration.
Contributions (feature requests, bugs, etc.) are super welcome 💙💚💛💜🧡❤️
Everyone is welcome to contribute, and we value everybody's contribution. Code is not the only way to help the community. Answering questions, helping others, reaching out, and improving the documentation are immensely valuable to the community. We wrote a contribution guide to summarize how to get started contributing to this repository.