serge

A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

5,707

403

5,707

View on GitHub

Top Related Projects

text-generation-webui

41,681

A Gradio web UI for Large Language Models with support for multiple inference backends.

FastChat

37,484

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

gpt4free

63,126

The official gpt4free repository | various collection of powerful language models

private-gpt

55,497

Interact with your documents using the power of GPT, 100% privately, no data leaks

DeepSpeed

37,573

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Quick Overview

Serge is an open-source, local chat assistant that can be run on your own hardware. It provides a ChatGPT-like experience without relying on external APIs, ensuring privacy and control over your data. Serge supports various language models and offers a user-friendly web interface.

Pros

Privacy-focused: Runs locally, ensuring data stays on your own hardware
Customizable: Supports multiple language models and can be tailored to specific needs
Cost-effective: No subscription fees or API costs
User-friendly: Offers a clean web interface for easy interaction

Cons

Resource-intensive: Requires significant computational power to run large language models
Limited compared to cloud-based alternatives: May not have access to the latest models or features
Setup complexity: Requires some technical knowledge to install and configure
Potential for lower performance: Depending on hardware, may not match the speed of cloud-based solutions

Getting Started

Clone the repository:

git clone https://github.com/serge-chat/serge.git
cd serge

Install dependencies:
```
pip install -r requirements.txt
```
Download a language model (e.g., GPT-J 6B):
```
python3 download-model.py GPT-J-6B
```
Start the Serge server:
```
python3 server.py
```
Access the web interface at http://localhost:8008 in your browser.

Competitor Comparisons

text-generation-webui

41,681

A Gradio web UI for Large Language Models with support for multiple inference backends.

Pros of text-generation-webui

More extensive model support, including popular models like GPT-J, LLaMA, and OPT
Advanced features such as character creation, chat modes, and instruct mode
Highly customizable interface with various extensions and plugins

Cons of text-generation-webui

Steeper learning curve due to more complex setup and configuration options
Higher system requirements, especially for running larger language models
May be overwhelming for users seeking a simple, out-of-the-box chat experience

Code Comparison

text-generation-webui:

def generate_reply(
    question, chatbot, state, stopping_strings=None, is_chat=False, **kwargs
):
    # Complex generation logic with multiple parameters and options

serge:

def generate_response(self, prompt: str) -> str:
    # Simpler generation function focused on basic chat functionality
    return self.model.generate(prompt)

The code comparison highlights the difference in complexity between the two projects. text-generation-webui offers more advanced features and customization options, while serge focuses on providing a straightforward chat experience with simpler code structure.

stable-diffusion-webui

149,793

Stable Diffusion web UI

Pros of stable-diffusion-webui

More extensive features for image generation and manipulation
Larger community and more frequent updates
Better documentation and user guides

Cons of stable-diffusion-webui

Higher system requirements and more complex setup
Steeper learning curve for new users
Less focused on chat-based interactions

Code Comparison

stable-diffusion-webui:

def create_infotext(p, all_prompts, all_seeds, all_subseeds, comments=None, iteration=0, position_in_batch=0):
    index = position_in_batch + iteration * p.batch_size

    clip_skip = getattr(p, 'clip_skip', opts.CLIP_stop_at_last_layers)
    token_merging_ratio = getattr(p, 'token_merging_ratio', 0)
    token_merging_ratio_hr = getattr(p, 'token_merging_ratio_hr', 0)

serge:

def get_model_path(model_name: str) -> str:
    model_path = os.path.join(MODELS_PATH, model_name)
    if not os.path.exists(model_path):
        raise ValueError(f"Model {model_name} not found in {MODELS_PATH}")
    return model_path

The code snippets show different focuses: stable-diffusion-webui deals with image generation parameters, while serge handles model path management for chat-based interactions.

FastChat

37,484

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Pros of FastChat

More comprehensive and feature-rich, offering a wider range of functionalities
Better documentation and examples, making it easier for developers to integrate and use
Actively maintained with frequent updates and improvements

Cons of FastChat

More complex setup and configuration process
Higher system requirements due to its extensive features
Steeper learning curve for beginners

Code Comparison

Serge (Python):

from serge import Serge

chat = Serge()
response = chat.chat("Hello, how are you?")
print(response)

FastChat (Python):

from fastchat.model import load_model, get_conversation_template
from fastchat.serve.inference import chat_loop

model, tokenizer = load_model("vicuna-7b")
conv = get_conversation_template("vicuna")
chat_loop(model, tokenizer, conv)

FastChat offers more flexibility and control over the conversation flow, while Serge provides a simpler, more straightforward interface for basic chatbot functionality. FastChat's code demonstrates its ability to load custom models and conversation templates, making it more versatile for advanced use cases.

gpt4free

63,126

The official gpt4free repository | various collection of powerful language models

Pros of gpt4free

Offers access to multiple AI models and providers
Includes a web interface for easy interaction
Provides more frequent updates and active development

Cons of gpt4free

Less focus on privacy and self-hosting
May have potential legal and ethical concerns
Lacks some advanced features present in Serge

Code Comparison

gpt4free:

from g4f import ChatCompletion

response = ChatCompletion.create(model='gpt-3.5-turbo', messages=[
    {'role': 'user', 'content': 'Hello, how are you?'}
])
print(response)

Serge:

from serge import ChatBot

bot = ChatBot()
response = bot.chat("Hello, how are you?")
print(response)

Summary

gpt4free offers a wider range of AI models and providers, along with a web interface, making it more versatile for users seeking various AI interactions. However, Serge focuses more on privacy and self-hosting, which may be preferable for users concerned about data security. gpt4free's code appears more complex, allowing for model selection, while Serge's implementation is simpler and more straightforward. Both projects have their strengths, and the choice between them depends on the user's specific needs and priorities.

private-gpt

55,497

Interact with your documents using the power of GPT, 100% privately, no data leaks

Pros of private-gpt

Focuses on privacy and local data processing
Supports multiple document types (PDF, TXT, etc.)
Utilizes LangChain for improved language model interactions

Cons of private-gpt

Less emphasis on multi-user collaboration
May require more setup and configuration
Potentially higher resource requirements for local processing

Code Comparison

Serge (Python):

@app.route('/api/chat', methods=['POST'])
def chat():
    data = request.json
    conversation_id = data.get('conversation_id')
    message = data.get('message')
    # ... (processing logic)
    return jsonify(response)

private-gpt (Python):

@app.route("/chat", methods=["POST"])
def chat_endpoint():
    request_data = request.json
    question = request_data["question"]
    history = request_data.get("history", [])
    # ... (processing logic)
    return jsonify({"answer": answer, "history": updated_history})

Both projects use Flask for API endpoints, but private-gpt focuses on question-answering with history, while Serge emphasizes conversation management.

DeepSpeed

37,573

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

Highly optimized for large-scale distributed training of deep learning models
Offers advanced features like ZeRO optimizer and 3D parallelism for improved efficiency
Extensive documentation and integration with popular frameworks like PyTorch

Cons of DeepSpeed

Steeper learning curve due to its complexity and advanced features
Primarily focused on training, while Serge offers a more complete chat application solution
May be overkill for smaller projects or those not requiring distributed training

Code Comparison

DeepSpeed (model initialization):

model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=params
)

Serge (chat message handling):

@app.post("/chat/{conversation_id}")
async def chat(
    conversation_id: str,
    message: ChatMessage,
    background_tasks: BackgroundTasks,
    db: Session = Depends(get_db),
):
    # Message handling logic

While DeepSpeed focuses on optimizing model training, Serge provides a framework for building chat applications. DeepSpeed's code emphasizes distributed training setup, whereas Serge's code showcases API endpoints for chat functionality.

Convert designs to code with AI

Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.

Try Visual Copilot

README

Serge - LLaMA made easy ð¦

License

Serge is a chat interface crafted with llama.cpp for running LLM models. No API keys, entirely self-hosted!

ð SvelteKit frontend
ð¾ Redis for storing chat history & parameters
âï¸ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings

ð¥ Demo:

demo.webm

â¡ï¸ Quick start

ð³ Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

ð Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:

Then, just visit http://localhost:8008, You can find the API documentation at http://localhost:8008/api/docs

ð Environment Variables

The following Environment Variables are available:

Variable Name	Description	Default Value
`SERGE_DATABASE_URL`	Database connection string	`sqlite:////data/db/sql_app.db`
`SERGE_JWT_SECRET`	Key for auth token encryption. Use a random string	`uF7FGN5uzfGdFiPzR`
`SERGE_SESSION_EXPIRY`	Duration in minutes before a user must reauthenticate	`60`
`NODE_ENV`	Node.js running environment	`production`

ð¥ï¸ Windows

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

â ï¸ Memory Usage

LLaMA will crash if you don't have enough available memory for the model

ð¬ Support

Need help? Join our Discord

ð§¾ License

Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License and Apache-2.0.

ð¤ Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build

The solution will accept a python debugger session on port 5678. Example launch.json for VSCode:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Remote Debug",
            "type": "python",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}/api",
                    "remoteRoot": "/usr/src/app/api/"
                }
            ],
            "justMyCode": false
        }
    ]
}

Top Related Projects

Convert designs to code with AI

Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.

Try Visual Copilot

Top Related Projects

Quick Overview

Pros

Cons

Getting Started

Competitor Comparisons

Pros of text-generation-webui

Cons of text-generation-webui

Code Comparison

Pros of stable-diffusion-webui

Cons of stable-diffusion-webui

Code Comparison

Pros of FastChat

Cons of FastChat

Code Comparison

Pros of gpt4free

Cons of gpt4free

Code Comparison

Summary

Pros of private-gpt

Cons of private-gpt

Code Comparison

Pros of DeepSpeed

Cons of DeepSpeed

Code Comparison

Convert designs to code with AI

README

Serge - LLaMA made easy ð¦

â¡ï¸ Quick start

ð Environment Variables

ð¥ï¸ Windows

â ï¸ Memory Usage

ð¬ Support

ð§¾ License

ð¤ Contributing

Top Related Projects

Convert designs to code with AI

Serge - LLaMA made easy ð¦

â¡ï¸ Quick start

ð Environment Variables

ð¥ï¸ Windows

â ï¸ Memory Usage

ð¬ Support

ð§¾ License

ð¤ Contributing