serge

A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy-to-use API.

Quick Overview

Serge is an open-source, local chat assistant that can be run on your own hardware. It provides a ChatGPT-like experience without relying on external APIs, ensuring privacy and control over your data. Serge supports various language models and offers a user-friendly web interface.

Pros

  • Privacy-focused: Runs locally, ensuring data stays on your own hardware
  • Customizable: Supports multiple language models and can be tailored to specific needs
  • Cost-effective: No subscription fees or API costs
  • User-friendly: Offers a clean web interface for easy interaction

Cons

  • Resource-intensive: Requires significant computational power to run large language models
  • Limited compared to cloud-based alternatives: May not have access to the latest models or features
  • Setup complexity: Requires some technical knowledge to install and configure
  • Potential for lower performance: Depending on hardware, may not match the speed of cloud-based solutions

Getting Started

  1. Install Docker and make sure the daemon is running. Serge ships as a container image, so no Python environment needs to be set up on the host.

  2. Start Serge:

    docker run -d \
        --name serge \
        -v weights:/usr/src/app/weights \
        -v datadb:/data/db/ \
        -p 8008:8008 \
        ghcr.io/serge-chat/serge:latest
    
  3. Access the web interface at http://localhost:8008 in your browser. The API documentation is served at http://localhost:8008/api/docs.

Competitor Comparisons

A Gradio web UI for Large Language Models.

Pros of text-generation-webui

  • More extensive model support, including popular models like GPT-J, LLaMA, and OPT
  • Advanced features such as character creation, chat modes, and instruct mode
  • Highly customizable interface with various extensions and plugins

Cons of text-generation-webui

  • Steeper learning curve due to more complex setup and configuration options
  • Higher system requirements, especially for running larger language models
  • May be overwhelming for users seeking a simple, out-of-the-box chat experience

Code Comparison

text-generation-webui:

def generate_reply(
    question, chatbot, state, stopping_strings=None, is_chat=False, **kwargs
):
    # Complex generation logic with multiple parameters and options
    ...  # body elided in this excerpt

serge:

def generate_response(self, prompt: str) -> str:
    # Simpler generation function focused on basic chat functionality
    return self.model.generate(prompt)

The code comparison highlights the difference in complexity between the two projects. text-generation-webui offers more advanced features and customization options, while serge focuses on providing a straightforward chat experience with simpler code structure.

Stable Diffusion web UI

Pros of stable-diffusion-webui

  • More extensive features for image generation and manipulation
  • Larger community and more frequent updates
  • Better documentation and user guides

Cons of stable-diffusion-webui

  • Higher system requirements and more complex setup
  • Steeper learning curve for new users
  • Built for image generation rather than chat-based interactions

Code Comparison

stable-diffusion-webui:

def create_infotext(p, all_prompts, all_seeds, all_subseeds, comments=None, iteration=0, position_in_batch=0):
    index = position_in_batch + iteration * p.batch_size

    clip_skip = getattr(p, 'clip_skip', opts.CLIP_stop_at_last_layers)
    token_merging_ratio = getattr(p, 'token_merging_ratio', 0)
    token_merging_ratio_hr = getattr(p, 'token_merging_ratio_hr', 0)

serge:

def get_model_path(model_name: str) -> str:
    model_path = os.path.join(MODELS_PATH, model_name)
    if not os.path.exists(model_path):
        raise ValueError(f"Model {model_name} not found in {MODELS_PATH}")
    return model_path

The code snippets show different focuses: stable-diffusion-webui deals with image generation parameters, while serge handles model path management for chat-based interactions.

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Pros of FastChat

  • More comprehensive and feature-rich, offering a wider range of functionalities
  • Better documentation and examples, making it easier for developers to integrate and use
  • Actively maintained with frequent updates and improvements

Cons of FastChat

  • More complex setup and configuration process
  • Higher system requirements due to its extensive features
  • Steeper learning curve for beginners

Code Comparison

Serge (Python, illustrative; the README documents a web UI and HTTP API rather than an importable serge package):

from serge import Serge

chat = Serge()
response = chat.chat("Hello, how are you?")
print(response)

FastChat (Python):

from fastchat.model import load_model, get_conversation_template
from fastchat.serve.inference import chat_loop

model, tokenizer = load_model("vicuna-7b")
conv = get_conversation_template("vicuna")
chat_loop(model, tokenizer, conv)

FastChat offers more flexibility and control over the conversation flow, while Serge provides a simpler, more straightforward interface for basic chatbot functionality. FastChat's code demonstrates its ability to load custom models and conversation templates, making it more versatile for advanced use cases.

The official gpt4free repository | various collection of powerful language models

Pros of gpt4free

  • Offers access to multiple AI models and providers
  • Includes a web interface for easy interaction
  • Provides more frequent updates and active development

Cons of gpt4free

  • Less focus on privacy and self-hosting
  • May have potential legal and ethical concerns
  • Lacks some advanced features present in Serge

Code Comparison

gpt4free:

from g4f import ChatCompletion

response = ChatCompletion.create(model='gpt-3.5-turbo', messages=[
    {'role': 'user', 'content': 'Hello, how are you?'}
])
print(response)

Serge (illustrative; the README documents a web UI and HTTP API rather than an importable serge package):

from serge import ChatBot

bot = ChatBot()
response = bot.chat("Hello, how are you?")
print(response)

Summary

gpt4free offers a wider range of AI models and providers, along with a web interface, making it more versatile for users seeking various AI interactions. However, Serge focuses more on privacy and self-hosting, which may be preferable for users concerned about data security. gpt4free's code appears more complex, allowing for model selection, while Serge's implementation is simpler and more straightforward. Both projects have their strengths, and the choice between them depends on the user's specific needs and priorities.

Interact with your documents using the power of GPT, 100% privately, no data leaks

Pros of private-gpt

  • Focuses on privacy and local data processing
  • Supports multiple document types (PDF, TXT, etc.)
  • Utilizes LangChain for improved language model interactions

Cons of private-gpt

  • Less emphasis on multi-user collaboration
  • May require more setup and configuration
  • Potentially higher resource requirements for local processing

Code Comparison

Serge (Python):

@app.route('/api/chat', methods=['POST'])
def chat():
    data = request.json
    conversation_id = data.get('conversation_id')
    message = data.get('message')
    # ... (processing logic)
    return jsonify(response)

private-gpt (Python):

@app.route("/chat", methods=["POST"])
def chat_endpoint():
    request_data = request.json
    question = request_data["question"]
    history = request_data.get("history", [])
    # ... (processing logic)
    return jsonify({"answer": answer, "history": updated_history})

Both snippets are written as Flask-style route handlers, although Serge's README lists FastAPI for its API. private-gpt focuses on question-answering with history, while Serge emphasizes conversation management.

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

  • Highly optimized for large-scale distributed training of deep learning models
  • Supports a wide range of AI models and architectures
  • Extensive documentation and active community support

Cons of DeepSpeed

  • Steeper learning curve for beginners
  • Primarily focused on training, less emphasis on inference
  • Requires more setup and configuration for optimal performance

Code Comparison

DeepSpeed:

import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
                                                     model=model,
                                                     model_parameters=params)

Serge (illustrative; the README documents a web UI and HTTP API rather than an importable serge package):

from serge import Serge
serge = Serge()
response = serge.chat("Hello, how are you?")

Summary

DeepSpeed is a powerful library for optimizing large-scale AI model training, offering advanced features and broad compatibility. However, it may be more complex for beginners and requires more setup. Serge, on the other hand, appears to be a simpler chat-based interface, potentially easier to use but with fewer advanced optimization features. The choice between them depends on the specific use case and level of expertise required.

README

Serge - LLaMA made easy 🦙

Serge is a chat interface crafted with llama.cpp for running GGUF models. No API keys, entirely self-hosted!

  • 🌐 SvelteKit frontend
  • 💾 Redis for storing chat history & parameters
  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings

🎥 Demo:

demo.webm

⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:

Then visit http://localhost:8008. The API documentation is available at http://localhost:8008/api/docs.
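
To confirm that the container is serving before opening a browser, a small script can probe both URLs. This is a minimal sketch using only the Python standard library. The helper names serge_urls and is_reachable are made up for illustration and are not part of Serge, and the host and port simply mirror the 8008:8008 port mapping used above.

```python
import http.client
import urllib.error
import urllib.request


def serge_urls(host: str = "localhost", port: int = 8008) -> dict:
    """Build the web UI and API docs URLs for a given host and port."""
    base = f"http://{host}:{port}"
    return {"ui": base, "api_docs": f"{base}/api/docs"}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` yields a 2xx or 3xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


if __name__ == "__main__":
    for name, url in serge_urls().items():
        state = "up" if is_reachable(url) else "not reachable"
        print(f"{name}: {url} -> {state}")
```

If the container was started with a different host port, pass that port to serge_urls instead of relying on the default.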

🌍 Environment Variables

The following Environment Variables are available:

Variable Name         Description                                             Default Value
SERGE_DATABASE_URL    Database connection string                              sqlite:////data/db/sql_app.db
SERGE_JWT_SECRET      Key for auth token encryption. Use a random string      uF7FGN5uzfGdFiPzR
SERGE_SESSION_EXPIRY  Duration in minutes before a user must reauthenticate   60
NODE_ENV              Node.js running environment                             production
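
The sketch below shows one way to reason about these variables from a deployment script: overlay whatever is set in the environment on the documented defaults, flag a JWT secret that is still the published default, and generate a random replacement. It is not how Serge itself reads its configuration. The function names are hypothetical and only the variable names and default values come from the table above.

```python
import os
import secrets
from typing import Mapping, Optional

# Defaults copied from the environment variable table above.
DEFAULTS = {
    "SERGE_DATABASE_URL": "sqlite:////data/db/sql_app.db",
    "SERGE_JWT_SECRET": "uF7FGN5uzfGdFiPzR",
    "SERGE_SESSION_EXPIRY": "60",
    "NODE_ENV": "production",
}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Overlay values from `env` (default: os.environ) on the documented defaults."""
    source = os.environ if env is None else env
    settings = {name: source.get(name, default) for name, default in DEFAULTS.items()}
    # The expiry is documented in minutes, so expose it as an integer.
    settings["SERGE_SESSION_EXPIRY"] = int(settings["SERGE_SESSION_EXPIRY"])
    return settings


def uses_default_secret(settings: Mapping[str, object]) -> bool:
    """Return True when the JWT secret is still the published default."""
    return settings["SERGE_JWT_SECRET"] == DEFAULTS["SERGE_JWT_SECRET"]


def new_jwt_secret(num_bytes: int = 32) -> str:
    """Generate a random URL-safe string suitable for SERGE_JWT_SECRET."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    current = resolve_settings()
    if uses_default_secret(current):
        print("SERGE_JWT_SECRET is the default; consider:", new_jwt_secret())
```

A generated value can then be passed to the container with Docker's -e flag or an environment entry in the Compose file.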

🖥️ Windows

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

☁️ Kubernetes

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

Category             Models
Alfred               40B-1023
BioMistral           7B
Code                 13B, 33B
CodeLLaMA            7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python
Codestral            22B v0.1
Gemma                2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct, 2-9B, 2-9B-Instruct, 2-27B, 2-27B-Instruct
Gorilla              Falcon-7B-HF-v0, 7B-HF-v1, Openfunctions-v1, Openfunctions-v2
Falcon               7B, 7B-Instruct, 11B, 40B, 40B-Instruct
LLaMA 2              7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST
LLaMA 3              11B-Instruct, 13B-Instruct, 16B-Instruct
LLaMA Pro            8B, 8B-Instruct
Mathstral            7B
Med42                70B, v2-8B, v2-70B
Medalpaca            13B
Medicine             Chat, LLM
Meditron             7B, 7B-Chat, 70B, 3-8B
Meta-LlaMA-3         3-8B, 3.1-8B, 3.2-1B-Instruct, 3-8B-Instruct, 3.1-8B-Instruct, 3.2-3B-Instruct, 3-70B, 3.1-70B, 3-70B-Instruct, 3.1-70B-Instruct
Mistral              7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca, Nemo-Instruct
MistralLite          7B
Mixtral              8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1
Neural-Chat          7B-v3.3
Notus                7B-v1
Notux                8x7b-v1
Nous-Hermes 2        Mistral-7B-DPO, Mixtral-8x7B-DPO, Mistral-8x7B-SFT
OpenChat             7B-v3.5-1210, 8B-v3.6-20240522
OpenCodeInterpreter  DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B
OpenLLaMA            3B-v2, 7B-v2, 13B-v2
Orca 2               7B, 13B
Phi                  2-2.7B, 3-mini-4k-instruct, 3.1-mini-4k-instruct, 3.1-mini-128k-instruct, 3.5-mini-instruct, 3-medium-4k-instruct, 3-medium-128k-instruct
Python Code          13B, 33B
PsyMedRP             13B-v1, 20B-v1
Starling LM          7B-Alpha
SOLAR                10.7B-v1.0, 10.7B-instruct-v1.0
TinyLlama            1.1B
Vicuna               7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder
WizardLM             2-7B, 13B-v1.2, 70B-v1.0
Zephyr               3B, 7B-Alpha, 7B-Beta

Additional models can be requested by opening a GitHub issue. Other models are also available at Serge Models.

⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model.
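
A rough pre-flight check can catch the most obvious cases before a model is loaded. The sketch below compares a model file's size on disk against an amount of free memory supplied by the caller. The 1.2 headroom factor is an assumption made for illustration, not a figure from Serge or llama.cpp, and the function names are hypothetical. Actual usage also depends on context length and other runtime settings, so treat the result as a hint rather than a guarantee.

```python
import os

GIB = 2**30


def model_size_bytes(path: str) -> int:
    """Return the size of a model file on disk, in bytes."""
    return os.path.getsize(path)


def fits_in_memory(model_bytes: int, available_bytes: int, headroom: float = 1.2) -> bool:
    """Heuristic: the model plus a safety margin must fit in available memory.

    `headroom` is an assumed multiplier covering runtime overhead beyond
    the raw weights; tune it for your own setup.
    """
    return model_bytes * headroom <= available_bytes


if __name__ == "__main__":
    # Example figures only: a 4 GiB model file on a machine with 8 GiB free.
    print(fits_in_memory(4 * GIB, 8 * GIB))
```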

💬 Support

Need help? Join our Discord

🧾 License

Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License and Apache-2.0.

🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build

In development mode, Serge accepts a Python debugger session on port 5678. Example launch.json for VS Code:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Remote Debug",
            "type": "python",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}/api",
                    "remoteRoot": "/usr/src/app/api/"
                }
            ],
            "justMyCode": false
        }
    ]
}
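
Before attaching from the editor, it can help to verify that the debugger port is actually accepting connections. The following is a minimal sketch with the standard library. The function name port_is_open is hypothetical, and the defaults mirror the host and port in the launch configuration above.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5678, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Debugger port 5678 is accepting connections.")
    else:
        print("Nothing is listening on port 5678; is the dev stack running?")
```

A successful connection only shows that something is listening on the port. It does not confirm that the listener is the Python debugger.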