Convert Figma logo to code with AI

nomic-ai logogpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

70,041
7,661
70,041
604

Top Related Projects

66,315

LLM inference in C/C++

17,318

Inference Llama 2 in one file of pure C

Port of OpenAI's Whisper model in C/C++

Python bindings for llama.cpp

18,953

Universal LLM Deployment Engine with ML Compilation

A Gradio web UI for Large Language Models.

Quick Overview

GPT4All is an open-source ecosystem of large language models (LLMs) that can run locally on consumer-grade hardware. It aims to democratize AI by providing powerful language models that can be run without the need for expensive cloud infrastructure or specialized hardware.

Pros

  • Runs locally on consumer hardware, ensuring privacy and reducing costs
  • Supports multiple platforms (Windows, macOS, Linux)
  • Offers a variety of pre-trained models with different capabilities
  • Provides both command-line and GUI interfaces for ease of use

Cons

  • Performance may be slower compared to cloud-based solutions
  • Limited by the capabilities of local hardware
  • May require significant disk space for storing models
  • Some advanced features of larger models may not be available in local versions

Code Examples

  1. Basic usage of GPT4All in Python:
from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
output = model.generate("The capital of France is", max_tokens=3)
print(output)
  1. Using GPT4All with a custom prompt:
from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
prompt = "Translate the following English text to French: 'Hello, how are you?'"
output = model.generate(prompt, max_tokens=20)
print(output)
  1. Streaming output from GPT4All:
from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
prompt = "Write a short story about a robot learning to paint:"

for token in model.generate(prompt, max_tokens=200, streaming=True):
    print(token, end='', flush=True)

Getting Started

To get started with GPT4All, follow these steps:

  1. Install the library:
pip install gpt4all
  1. Download a model (e.g., ggml-gpt4all-j-v1.3-groovy) from the GPT4All website.

  2. Use the following code to initialize and generate text:

from gpt4all import GPT4All

model = GPT4All("path/to/your/model.bin")
output = model.generate("Your prompt here", max_tokens=50)
print(output)

Replace "path/to/your/model.bin" with the actual path to your downloaded model file.

Competitor Comparisons

66,315

LLM inference in C/C++

Pros of llama.cpp

  • Highly optimized C++ implementation for efficient inference
  • Supports quantization for reduced memory usage and faster execution
  • Offers cross-platform compatibility (Windows, macOS, Linux, iOS, Android)

Cons of llama.cpp

  • Limited to LLaMA model architecture
  • Requires more technical expertise to set up and use
  • Fewer built-in features for chat-like interactions

Code Comparison

llama.cpp:

int main(int argc, char ** argv) {
    gpt_params params;
    if (gpt_params_parse(argc, argv, params) == false) {
        return 1;
    }
    llama_init_backend();
    // ... (implementation continues)
}

GPT4All:

from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
output = model.generate("Once upon a time, ", max_tokens=200)
print(output)

Summary

llama.cpp focuses on efficient C++ implementation of the LLaMA model, offering optimizations and cross-platform support. It's ideal for users who need fine-grained control and performance. GPT4All, on the other hand, provides a more user-friendly Python interface with support for multiple models, making it easier to integrate into existing projects. While llama.cpp excels in performance, GPT4All offers greater flexibility and ease of use for a wider range of applications.

17,318

Inference Llama 2 in one file of pure C

Pros of llama2.c

  • Lightweight and minimalistic implementation
  • Focused on single-file C code for simplicity
  • Easier to understand and modify for educational purposes

Cons of llama2.c

  • Limited features compared to GPT4All
  • Less support for different model architectures
  • Fewer pre-trained models available out-of-the-box

Code Comparison

llama2.c:

float* forward(Transformer* transformer, int token, int pos) {
    float* x = transformer->token_embedding_table + token * transformer->dim;
    for (int l = 0; l < transformer->n_layers; l++) {
        // ... (attention and feedforward operations)
    }
    return x;
}

GPT4All:

void LLModel::prompt(const std::string &prompt, std::function<bool(int32_t)> promptCallback,
                     std::function<bool(int32_t, const std::string&)> responseCallback,
                     std::function<bool(bool)> recalculateCallback,
                     PromptContext &promptCtx) {
    // ... (tokenization and generation logic)
}

The code snippets highlight the difference in complexity and abstraction level between the two projects. llama2.c focuses on a simple, low-level implementation, while GPT4All provides a more feature-rich and abstracted interface for language model interactions.

Port of OpenAI's Whisper model in C/C++

Pros of whisper.cpp

  • Focused on speech recognition, providing efficient transcription capabilities
  • Lightweight C++ implementation, suitable for embedded systems and low-resource environments
  • Supports multiple languages and can run offline

Cons of whisper.cpp

  • Limited to speech-to-text functionality, lacking general language understanding capabilities
  • Requires audio input, not suitable for text-based interactions or general-purpose language tasks
  • May have lower accuracy compared to larger, more complex models

Code Comparison

whisper.cpp:

#include "whisper.h"

int main() {
    struct whisper_context * ctx = whisper_init_from_file("ggml-base.en.bin");
    whisper_full_default(ctx, params, pcm, pcm_len);
    whisper_print_timings(ctx);
    whisper_free(ctx);
}

gpt4all:

from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
output = model.generate("Once upon a time", max_tokens=50)
print(output)

The code snippets demonstrate the different focus areas of the two projects, with whisper.cpp handling audio transcription and gpt4all providing text generation capabilities.

Python bindings for llama.cpp

Pros of llama-cpp-python

  • Focused on providing Python bindings for the llama.cpp library, offering a more specialized and potentially efficient implementation
  • Supports GPU acceleration out of the box, which can significantly improve performance
  • Provides a simpler API, making it easier to integrate into existing Python projects

Cons of llama-cpp-python

  • Limited to LLaMA-based models, whereas gpt4all supports a wider range of models
  • Less extensive documentation and community support compared to gpt4all
  • Fewer built-in features and tools for model fine-tuning and customization

Code Comparison

llama-cpp-python:

from llama_cpp import Llama

llm = Llama(model_path="./models/7B/ggml-model.bin")
output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
print(output["choices"][0]["text"])

gpt4all:

from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
output = model.generate("Name the planets in the solar system.", max_tokens=32)
print(output)
18,953

Universal LLM Deployment Engine with ML Compilation

Pros of mlc-llm

  • Focuses on efficient deployment of large language models across various hardware platforms
  • Provides a unified framework for optimizing LLMs on different devices (CPUs, GPUs, mobile)
  • Supports multiple model architectures and quantization techniques

Cons of mlc-llm

  • May have a steeper learning curve due to its focus on low-level optimizations
  • Less emphasis on providing a ready-to-use chatbot interface compared to gpt4all
  • Requires more technical knowledge to implement and customize

Code Comparison

mlc-llm:

import mlc_llm
import tvm

model = mlc_llm.load_model("llama-7b")
output = model.generate("Hello, how are you?")
print(output)

gpt4all:

from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")
output = model.generate("Hello, how are you?")
print(output)

Both repositories aim to make large language models more accessible, but they approach this goal differently. mlc-llm focuses on optimizing LLMs for various hardware platforms, while gpt4all provides a more user-friendly interface for running chatbots locally. The code comparison shows that mlc-llm requires more setup and configuration, while gpt4all offers a simpler API for generating text.

A Gradio web UI for Large Language Models.

Pros of text-generation-webui

  • More extensive model support, including various architectures and quantization methods
  • Rich web-based interface with multiple chat modes and extensions
  • Active development and community contributions

Cons of text-generation-webui

  • Higher system requirements and more complex setup process
  • Steeper learning curve for beginners
  • Less focus on mobile and edge device deployment

Code comparison

text-generation-webui:

def generate_reply(
    question, state, stopping_strings=None, is_chat=False, escape_html=False
):
    # Complex generation logic with multiple parameters and options
    # ...

gpt4all:

def generate(self, prompt, max_tokens=200, temp=0.7):
    # Simpler generation function with fewer parameters
    # ...

The code comparison shows that text-generation-webui offers more advanced and customizable generation options, while gpt4all provides a simpler, more straightforward approach. This reflects the overall design philosophy of each project, with text-generation-webui catering to power users and gpt4all focusing on ease of use and accessibility.

Convert Figma logo designs to code with AI

Visual Copilot

Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.

Try Visual Copilot

README

GPT4All

WebsiteDocumentationDiscordYouTube Tutorial

GPT4All runs large language models (LLMs) privately on everyday desktops & laptops.

No API calls or GPUs required - you can just download the application and get started.

Read about what's new in our blog.

Subscribe to the newsletter

https://github.com/nomic-ai/gpt4all/assets/70534565/513a0f15-4964-4109-89e4-4f9a9011f311

GPT4All is made possible by our compute partner Paperspace.

phorm.ai

Download Links

Windows Installer

macOS Installer

Ubuntu Installer

Windows and Linux require Intel Core i3 2nd Gen / AMD Bulldozer, or better. x86-64 only, no ARM.

macOS requires Monterey 12.6 or newer. Best results with Apple Silicon M-series processors.

See the full System Requirements for more details.



Get it on Flathub
Flathub (community maintained)

Install GPT4All Python

gpt4all gives you access to LLMs with our Python client around llama.cpp implementations.

Nomic contributes to open source software like llama.cpp to make LLMs accessible and efficient for all.

pip install gpt4all
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf") # downloads / loads a 4.66GB LLM
with model.chat_session():
    print(model.generate("How can I run LLMs efficiently on my laptop?", max_tokens=1024))

Integrations

:parrot::link: Langchain :card_file_box: Weaviate Vector Database - module docs :telescope: OpenLIT (OTel-native Monitoring) - Docs

Release History

  • July 2nd, 2024: V3.0.0 Release
    • Fresh redesign of the chat application UI
    • Improved user workflow for LocalDocs
    • Expanded access to more model architectures
  • October 19th, 2023: GGUF Support Launches with Support for:
    • Mistral 7b base model, an updated model gallery on our website, several new local code models including Rift Coder v1.5
    • Nomic Vulkan support for Q4_0 and Q4_1 quantizations in GGUF.
    • Offline build support for running old versions of the GPT4All Local LLM Chat Client.
  • September 18th, 2023: Nomic Vulkan launches supporting local LLM inference on NVIDIA and AMD GPUs.
  • July 2023: Stable support for LocalDocs, a feature that allows you to privately and locally chat with your data.
  • June 28th, 2023: Docker-based API server launches allowing inference of local LLMs from an OpenAI-compatible HTTP endpoint.

Contributing

GPT4All welcomes contributions, involvement, and discussion from the open source community! Please see CONTRIBUTING.md and follow the issues, bug reports, and PR markdown templates.

Check project discord, with project owners, or through existing issues/PRs to avoid duplicate work. Please make sure to tag all of the above with relevant project identifiers or your contribution could potentially get lost. Example tags: backend, bindings, python-bindings, documentation, etc.

Citation

If you utilize this repository, models or data in a downstream project, please consider citing it with:

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}