
lxe/simple-llm-finetuner

Simple UI for LLM Model Finetuning


Top Related Projects

  • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
  • DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
  • Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
  • An open-source NLP research library, built on PyTorch.
  • TensorFlow code and pre-trained models for BERT

Quick Overview

Simple-llm-finetuner is a GitHub repository that provides a straightforward approach to fine-tuning large language models (LLMs). It offers a user-friendly interface and streamlined process for customizing pre-trained models to specific tasks or domains, making it accessible for developers and researchers who want to experiment with LLM fine-tuning without extensive setup or complexity.

Pros

  • Easy to use and understand, with a simple interface for fine-tuning LLMs
  • Supports multiple popular LLM architectures and datasets
  • Provides clear documentation and examples for getting started
  • Lightweight and efficient, suitable for running on consumer-grade hardware

Cons

  • Limited advanced features compared to more comprehensive fine-tuning frameworks
  • May not be suitable for large-scale or production-level fine-tuning tasks
  • Requires some basic understanding of LLMs and fine-tuning concepts
  • Limited community support compared to more established tools

Code Examples

Note: the repository is driven through a Gradio web UI rather than a published Python package (see the README below), so treat the snippets in this section as an illustrative sketch of the workflow rather than exact library calls.

  1. Loading a pre-trained model:
from simple_llm_finetuner import load_model

model = load_model("gpt2", device="cuda")
  2. Preparing a dataset for fine-tuning:
from simple_llm_finetuner import prepare_dataset

dataset = prepare_dataset("custom_data.txt", tokenizer=model.tokenizer)
  3. Fine-tuning the model:
from simple_llm_finetuner import finetune

finetune(model, dataset, epochs=3, learning_rate=5e-5)
  4. Generating text with the fine-tuned model:
generated_text = model.generate("Once upon a time", max_length=100)
print(generated_text)

Getting Started

To get started with simple-llm-finetuner, the steps below sketch the overall flow; note that the actual, supported setup (cloning the repository, installing requirements.txt, and launching the Gradio app) is described in the README section further down:

  1. Install the library:
pip install simple-llm-finetuner
  2. Import the necessary modules:
from simple_llm_finetuner import load_model, prepare_dataset, finetune
  3. Load a pre-trained model, prepare your dataset, and start fine-tuning:
model = load_model("gpt2")
dataset = prepare_dataset("your_data.txt", tokenizer=model.tokenizer)
finetune(model, dataset, epochs=3)
  4. Use the fine-tuned model for text generation:
generated_text = model.generate("Your prompt here", max_length=100)
print(generated_text)

Competitor Comparisons

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of transformers

  • Comprehensive library with support for a wide range of models and tasks
  • Extensive documentation and community support
  • Regular updates and new model implementations

Cons of transformers

  • Steeper learning curve due to its extensive features
  • Can be resource-intensive for smaller projects
  • May require more setup and configuration

Code Comparison

transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Hello, I'm a language model", return_tensors="pt")
outputs = model(**inputs)

simple-llm-finetuner:

from simple_llm_finetuner import Trainer

trainer = Trainer(model_name="gpt2", dataset="custom_dataset.json")
trainer.train()
trainer.save_model("finetuned_model")

The transformers library offers more flexibility and control over the model and tokenization process, while simple-llm-finetuner provides a more streamlined approach for fine-tuning specific to LLMs. transformers is better suited for complex projects requiring customization, whereas simple-llm-finetuner is ideal for quick and straightforward fine-tuning tasks.


DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

  • Highly optimized for large-scale distributed training
  • Supports a wide range of model architectures and training scenarios
  • Extensive documentation and active community support

Cons of DeepSpeed

  • Steeper learning curve due to its complexity
  • May be overkill for smaller projects or simpler fine-tuning tasks
  • Requires more setup and configuration

Code Comparison

DeepSpeed:

import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=params
)

simple-llm-finetuner:

from simple_llm_finetuner import Trainer
trainer = Trainer(model=model, args=training_args)
trainer.train()

Summary

DeepSpeed is a powerful library for large-scale distributed training, offering advanced optimization techniques and broad compatibility. However, it comes with increased complexity and setup requirements. simple-llm-finetuner, on the other hand, provides a more straightforward approach to fine-tuning, making it easier to use for smaller projects or those new to LLM fine-tuning. The choice between the two depends on the scale of your project, available resources, and desired level of optimization.


Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Pros of fairseq

  • Comprehensive toolkit for sequence modeling tasks
  • Supports a wide range of architectures and models
  • Highly customizable and extensible

Cons of fairseq

  • Steeper learning curve due to its complexity
  • Requires more setup and configuration
  • May be overkill for simple fine-tuning tasks

Code Comparison

fairseq:

from fairseq.models.transformer import TransformerModel

model = TransformerModel.from_pretrained('/path/to/model', checkpoint_file='model.pt')
model.eval()
# encode, generate, and decode in one call
translation = model.translate('Hello world!')

simple-llm-finetuner:

from simple_llm_finetuner import Trainer

trainer = Trainer(model_name="gpt2", dataset="my_dataset.jsonl")
trainer.train()
trainer.save_model("fine_tuned_model")

The fairseq code demonstrates loading a pre-trained model and making predictions, while simple-llm-finetuner focuses on a straightforward fine-tuning process. fairseq offers more flexibility but requires more setup, whereas simple-llm-finetuner provides a more streamlined approach for basic fine-tuning tasks.


An open-source NLP research library, built on PyTorch.

Pros of AllenNLP

  • More comprehensive and feature-rich NLP toolkit
  • Extensive documentation and community support
  • Wider range of pre-built models and datasets

Cons of AllenNLP

  • Steeper learning curve for beginners
  • Potentially more complex setup and configuration
  • May be overkill for simple fine-tuning tasks

Code Comparison

AllenNLP:

from typing import Iterable

from allennlp.data import DatasetReader, Instance
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer

class MyDatasetReader(DatasetReader):
    def _read(self, file_path: str) -> Iterable[Instance]:
        ...  # implementation here

simple-llm-finetuner:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, Trainer

dataset = load_dataset("csv", data_files="data.csv")
model = AutoModelForCausalLM.from_pretrained(model_name)
trainer = Trainer(model=model, args=training_args, train_dataset=dataset["train"])

AllenNLP offers a more structured approach with custom dataset readers and extensive configuration options, while simple-llm-finetuner provides a more straightforward implementation for quick fine-tuning tasks using popular libraries like Hugging Face's Transformers.


TensorFlow code and pre-trained models for BERT

Pros of BERT

  • Comprehensive implementation of the BERT model with pre-trained weights
  • Extensive documentation and examples for various NLP tasks
  • Widely adopted and supported by the research community

Cons of BERT

  • More complex setup and usage compared to simple-llm-finetuner
  • Requires more computational resources for training and fine-tuning
  • Less focused on quick and easy fine-tuning of language models

Code Comparison

BERT example:

import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification

model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

simple-llm-finetuner example:

from simple_llm_finetuner import Trainer

trainer = Trainer(model_name="gpt2", dataset="my_dataset.jsonl")
trainer.train()

The BERT repository provides a more comprehensive implementation with greater flexibility, while simple-llm-finetuner offers a streamlined approach for quick fine-tuning of language models. BERT is better suited for advanced NLP tasks and research, whereas simple-llm-finetuner is designed for simplicity and ease of use in fine-tuning scenarios.


README


title: Simple LLM Finetuner
emoji: 🦙
colorFrom: yellow
colorTo: orange
sdk: gradio
app_file: app.py
pinned: false

👻👻👻 This project is effectively dead. Please use a more actively maintained LLM fine-tuning tool instead.


🦙 Simple LLM Finetuner

Open In Colab Open In Spaces

Simple LLM Finetuner is a beginner-friendly interface designed to facilitate fine-tuning various language models using the LoRA method via the PEFT library on commodity NVIDIA GPUs. With a small dataset and sample lengths of 256, you can even run this on a regular Colab Tesla T4 instance.

With this intuitive UI, you can easily manage your dataset, customize parameters, train, and evaluate the model's inference capabilities.
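
For the curious, here is a minimal, hedged sketch of what LoRA fine-tuning with PEFT and Transformers looks like under the hood; the base model name, target modules, and hyperparameters are illustrative assumptions, not necessarily what app.py uses.

# Minimal LoRA sketch with PEFT + Transformers (illustrative; not the actual app.py logic).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "EleutherAI/pythia-410m"  # hypothetical small base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

lora_config = LoraConfig(
    r=8,                                 # rank of the low-rank update matrices
    lora_alpha=16,
    target_modules=["query_key_value"],  # model-specific; this is the GPT-NeoX attention projection
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # wraps the base model with trainable adapters
model.print_trainable_parameters()          # only the small adapter weights are trainable

Training then proceeds with a standard Transformers Trainer loop over the tokenized samples, and only the adapter weights are saved.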

Acknowledgements

Features

  • Simply paste datasets in the UI, separated by double blank lines
  • Adjustable parameters for fine-tuning and inference
  • Beginner-friendly UI with explanations for each parameter

Getting Started

Prerequisites

  • Linux or WSL
  • Modern NVIDIA GPU with >= 16 GB of VRAM (it might be possible to run with less for smaller sample lengths; a quick VRAM check is sketched after this list)
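
A quick way to sanity-check the VRAM guideline from Python (a hypothetical helper, not part of the repository):

# Hypothetical helper: check whether the local GPU meets the ~16 GB VRAM guideline.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"{props.name}: {vram_gb:.1f} GB VRAM")
    if vram_gb < 16:
        print("Under 16 GB: consider a smaller sample length and batch size.")
else:
    print("No CUDA device detected; a modern NVIDIA GPU is required.")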

Usage

I recommend using a virtual environment to install the required packages. Conda preferred.

conda create -n simple-llm-finetuner python=3.10
conda activate simple-llm-finetuner
conda install -y cuda -c nvidia/label/cuda-11.7.0
conda install -y pytorch=2 pytorch-cuda=11.7 -c pytorch

On WSL, you might need to install CUDA manually by following these steps, then running the following before you launch:

export LD_LIBRARY_PATH=/usr/lib/wsl/lib

Clone the repository and install the required packages.

git clone https://github.com/lxe/simple-llm-finetuner.git
cd simple-llm-finetuner
pip install -r requirements.txt

Launch it

python app.py

Open http://127.0.0.1:7860/ in your browser. Prepare your training data by separating each sample with 2 blank lines. Paste the whole training dataset into the textbox. Specify the new LoRA adapter name in the "New PEFT Adapter Name" textbox, then click train. You might need to adjust the max sequence length and batch size to fit your GPU memory. The model will be saved in the lora/ directory.
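
For illustration, here is a hedged sketch of the expected dataset format and how a pasted blob could be split into samples on double blank lines; the exact parsing inside app.py may differ.

# Illustrative only: split a pasted training blob on double blank lines.
raw_text = """First sample, possibly spanning
several lines.


Second sample.


Third sample."""

samples = [chunk.strip() for chunk in raw_text.split("\n\n\n") if chunk.strip()]
print(len(samples))  # 3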

After training is done, navigate to the "Inference" tab, select your LoRA adapter, and play with it.
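
Outside the UI, a trained adapter saved under lora/ could be loaded for inference roughly like this (a hedged sketch; the adapter name and base model are assumptions and must match what you trained):

# Hedged sketch: load a saved LoRA adapter from the lora/ directory with PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "EleutherAI/pythia-410m"  # must match the base model used during training
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16).cuda()
model = PeftModel.from_pretrained(model, "lora/my-adapter")  # hypothetical adapter name

inputs = tokenizer("Once upon a time", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))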

Have fun!

YouTube Walkthrough

https://www.youtube.com/watch?v=yM1wanDkNz8

License

MIT License