Top Related Projects
huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
facebookresearch/fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
allenai/allennlp: An open-source NLP research library, built on PyTorch.
google-research/bert: TensorFlow code and pre-trained models for BERT
Quick Overview
Simple-llm-finetuner is a GitHub repository that provides a straightforward approach to fine-tuning large language models (LLMs). It offers a user-friendly interface and streamlined process for customizing pre-trained models to specific tasks or domains, making it accessible for developers and researchers who want to experiment with LLM fine-tuning without extensive setup or complexity.
Pros
- Easy to use and understand, with a simple interface for fine-tuning LLMs
- Supports multiple popular LLM architectures and datasets
- Provides clear documentation and examples for getting started
- Lightweight and efficient, suitable for running on consumer-grade hardware
Cons
- Limited advanced features compared to more comprehensive fine-tuning frameworks
- May not be suitable for large-scale or production-level fine-tuning tasks
- Requires some basic understanding of LLMs and fine-tuning concepts
- Limited community support compared to more established tools
Code Examples
Note: the project ships as a Gradio web UI rather than a pip-installable package, so the snippets below sketch the intended workflow with an illustrative simple_llm_finetuner API rather than a documented one.
- Loading a pre-trained model:
from simple_llm_finetuner import load_model
model = load_model("gpt2", device="cuda")
- Preparing a dataset for fine-tuning:
from simple_llm_finetuner import prepare_dataset
dataset = prepare_dataset("custom_data.txt", tokenizer=model.tokenizer)
- Fine-tuning the model:
from simple_llm_finetuner import finetune
finetune(model, dataset, epochs=3, learning_rate=5e-5)
- Generating text with the fine-tuned model:
generated_text = model.generate("Once upon a time", max_length=100)
print(generated_text)
Getting Started
To get started with simple-llm-finetuner, follow these steps (the library-style commands below are illustrative; the README section further down describes the actual clone-and-run setup):
- Install the library:
pip install simple-llm-finetuner
- Import the necessary modules:
from simple_llm_finetuner import load_model, prepare_dataset, finetune
- Load a pre-trained model, prepare your dataset, and start fine-tuning:
model = load_model("gpt2")
dataset = prepare_dataset("your_data.txt", tokenizer=model.tokenizer)
finetune(model, dataset, epochs=3)
- Use the fine-tuned model for text generation:
generated_text = model.generate("Your prompt here", max_length=100)
print(generated_text)
Competitor Comparisons
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Pros of transformers
- Comprehensive library with support for a wide range of models and tasks
- Extensive documentation and community support
- Regular updates and new model implementations
Cons of transformers
- Steeper learning curve due to its extensive features
- Can be resource-intensive for smaller projects
- May require more setup and configuration
Code Comparison
transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Hello, I'm a language model", return_tensors="pt")
outputs = model(**inputs)
simple-llm-finetuner:
from simple_llm_finetuner import Trainer
trainer = Trainer(model_name="gpt2", dataset="custom_dataset.json")
trainer.train()
trainer.save_model("finetuned_model")
The transformers library offers more flexibility and control over the model and tokenization process, while simple-llm-finetuner provides a more streamlined approach for fine-tuning specific to LLMs. transformers is better suited for complex projects requiring customization, whereas simple-llm-finetuner is ideal for quick and straightforward fine-tuning tasks.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- Highly optimized for large-scale distributed training
- Supports a wide range of model architectures and training scenarios
- Extensive documentation and active community support
Cons of DeepSpeed
- Steeper learning curve due to its complexity
- May be overkill for smaller projects or simpler fine-tuning tasks
- Requires more setup and configuration
Code Comparison
DeepSpeed:
import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=params
)
simple-llm-finetuner:
from simple_llm_finetuner import Trainer
trainer = Trainer(model=model, args=training_args)
trainer.train()
Summary
DeepSpeed is a powerful library for large-scale distributed training, offering advanced optimization techniques and broad compatibility. However, it comes with increased complexity and setup requirements. simple-llm-finetuner, on the other hand, provides a more straightforward approach to fine-tuning, making it easier to use for smaller projects or those new to LLM fine-tuning. The choice between the two depends on the scale of your project, available resources, and desired level of optimization.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pros of fairseq
- Comprehensive toolkit for sequence modeling tasks
- Supports a wide range of architectures and models
- Highly customizable and extensible
Cons of fairseq
- Steeper learning curve due to its complexity
- Requires more setup and configuration
- May be overkill for simple fine-tuning tasks
Code Comparison
fairseq:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained('/path/to/model', checkpoint_file='model.pt')
model.eval()
translation = model.translate('Hello world!')
simple-llm-finetuner:
from simple_llm_finetuner import Trainer
trainer = Trainer(model_name="gpt2", dataset="my_dataset.jsonl")
trainer.train()
trainer.save_model("fine_tuned_model")
The fairseq code demonstrates loading a pre-trained model and making predictions, while simple-llm-finetuner focuses on a straightforward fine-tuning process. fairseq offers more flexibility but requires more setup, whereas simple-llm-finetuner provides a more streamlined approach for basic fine-tuning tasks.
An open-source NLP research library, built on PyTorch.
Pros of AllenNLP
- More comprehensive and feature-rich NLP toolkit
- Extensive documentation and community support
- Wider range of pre-built models and datasets
Cons of AllenNLP
- Steeper learning curve for beginners
- Potentially more complex setup and configuration
- May be overkill for simple fine-tuning tasks
Code Comparison
AllenNLP:
from typing import Iterable
from allennlp.data import DatasetReader, Instance
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer

class MyDatasetReader(DatasetReader):
    def _read(self, file_path: str) -> Iterable[Instance]:
        ...  # read file_path and yield Instance objects built from TextFields
simple-llm-finetuner:
from datasets import load_dataset
from transformers import AutoModelForCausalLM, Trainer

dataset = load_dataset("csv", data_files="data.csv")
model = AutoModelForCausalLM.from_pretrained(model_name)
trainer = Trainer(model=model, args=training_args, train_dataset=dataset["train"])
AllenNLP offers a more structured approach with custom dataset readers and extensive configuration options, while simple-llm-finetuner provides a more straightforward implementation for quick fine-tuning tasks using popular libraries like Hugging Face's Transformers.
TensorFlow code and pre-trained models for BERT
Pros of BERT
- Comprehensive implementation of the BERT model with pre-trained weights
- Extensive documentation and examples for various NLP tasks
- Widely adopted and supported by the research community
Cons of BERT
- More complex setup and usage compared to simple-llm-finetuner
- Requires more computational resources for training and fine-tuning
- Less focused on quick and easy fine-tuning of language models
Code Comparison
BERT example:
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification
model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
simple-llm-finetuner example:
from simple_llm_finetuner import Trainer
trainer = Trainer(model_name="gpt2", dataset="my_dataset.jsonl")
trainer.train()
The BERT repository provides a more comprehensive implementation with greater flexibility, while simple-llm-finetuner offers a streamlined approach for quick fine-tuning of language models. BERT is better suited for advanced NLP tasks and research, whereas simple-llm-finetuner is designed for simplicity and ease of use in fine-tuning scenarios.
README
---
title: Simple LLM Finetuner
emoji: 🦙
colorFrom: yellow
colorTo: orange
sdk: gradio
app_file: app.py
pinned: false
---
👻👻👻 This project is effectively dead. Please use one of the following tools instead:
- https://github.com/hiyouga/LLaMA-Factory
- https://github.com/unslothai/unsloth
- https://github.com/oobabooga/text-generation-webui
🦙 Simple LLM Finetuner
Simple LLM Finetuner is a beginner-friendly interface for fine-tuning various language models with the LoRA method via the PEFT library on commodity NVIDIA GPUs. With a small dataset and sample lengths of 256, you can even run it on a regular Colab Tesla T4 instance.
With this intuitive UI, you can easily manage your dataset, customize parameters, train, and evaluate the model's inference capabilities.
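Under the hood, this is standard LoRA fine-tuning with PEFT. The snippet below is a minimal sketch of that workflow, not the app's own code: the base model name ("gpt2"), the LoRA hyperparameters, the toy dataset, and the "lora/my-adapter" output path are all illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model
from datasets import Dataset

base_name = "gpt2"  # placeholder; the app itself targets LLaMA-family models
tokenizer = AutoTokenizer.from_pretrained(base_name)
tokenizer.pad_token = tokenizer.eos_token  # many causal LMs ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base_name)

# Wrap the base model so that only the small LoRA adapter matrices are trained.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

# Tokenize a tiny in-memory dataset to sequences of at most 256 tokens.
samples = ["First training sample.", "Second training sample."]
def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=256, padding="max_length")
    # Use the input ids as labels; mask padding positions so they do not affect the loss.
    out["labels"] = [
        [tok if mask == 1 else -100 for tok, mask in zip(ids, attn)]
        for ids, attn in zip(out["input_ids"], out["attention_mask"])
    ]
    return out
dataset = Dataset.from_dict({"text": samples}).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora/my-adapter", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=3e-4),
    train_dataset=dataset,
)
trainer.train()
model.save_pretrained("lora/my-adapter")  # writes only the small adapter weights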
Acknowledgements
- https://github.com/zphang/minimal-llama/
- https://github.com/tloen/alpaca-lora
- https://github.com/huggingface/peft
Features
- Simply paste datasets in the UI, separated by double blank lines (see the parsing sketch after this list)
- Adjustable parameters for fine-tuning and inference
- Beginner-friendly UI with explanations for each parameter
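To make the input format concrete, here is a hypothetical sketch of how pasted text could be split into samples at double-blank-line boundaries; it mirrors the format described above but is not the app's actual parsing code.
import re

def split_samples(pasted_text: str) -> list[str]:
    # A double blank line corresponds to three consecutive newlines (allowing stray spaces).
    chunks = re.split(r"\n\s*\n\s*\n", pasted_text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]

raw = "First sample line one.\nLine two.\n\n\nSecond sample.\n\n\nThird sample."
print(split_samples(raw))
# ['First sample line one.\nLine two.', 'Second sample.', 'Third sample.']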
Getting Started
Prerequisites
- Linux or WSL
- Modern NVIDIA GPU with >= 16 GB of VRAM (but it might be possible to run with less for smaller sample lengths)
Usage
I recommend using a virtual environment to install the required packages. Conda preferred.
conda create -n simple-llm-finetuner python=3.10
conda activate simple-llm-finetuner
conda install -y cuda -c nvidia/label/cuda-11.7.0
conda install -y pytorch=2 pytorch-cuda=11.7 -c pytorch
On WSL, you might need to install CUDA manually by following these steps, then running the following before you launch:
export LD_LIBRARY_PATH=/usr/lib/wsl/lib
Clone the repository and install the required packages.
git clone https://github.com/lxe/simple-llm-finetuner.git
cd simple-llm-finetuner
pip install -r requirements.txt
Launch it
python app.py
Open http://127.0.0.1:7860/ in your browser. Prepare your training data by separating each sample with 2 blank lines. Paste the whole training dataset into the textbox. Specify the new LoRA adapter name in the "New PEFT Adapter Name" textbox, then click train. You might need to adjust the max sequence length and batch size to fit your GPU memory. The model will be saved in the lora/ directory.
After training is done, navigate to the "Inference" tab, select your LoRA, and play with it.
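Behind the "Inference" tab, using a trained adapter amounts to attaching the saved LoRA weights to the base model. A minimal sketch with PEFT follows, assuming an adapter saved under lora/my-adapter and "gpt2" as a placeholder base model; use whatever base model the adapter was actually trained on.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "gpt2"  # placeholder; must match the adapter's base model
tokenizer = AutoTokenizer.from_pretrained(base_name)
base_model = AutoModelForCausalLM.from_pretrained(base_name)

# Attach the LoRA adapter weights saved during training.
model = PeftModel.from_pretrained(base_model, "lora/my-adapter")
model.eval()

inputs = tokenizer("Your prompt here", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))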
Have fun!
YouTube Walkthrough
https://www.youtube.com/watch?v=yM1wanDkNz8
License
MIT License