
LearningCircuit/local-deep-research

Local Deep Research achieves ~95% on the SimpleQA benchmark (tested with GPT-4.1-mini) and includes benchmarking tools to test on your own data. It searches 10+ sources: arXiv, PubMed, GitHub, the web, and your private documents.


Top Related Projects

  • transformers - 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
  • faiss - A library for efficient similarity search and clustering of dense vectors.
  • pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
  • tensorflow - An Open Source Machine Learning Framework for Everyone
  • bert - TensorFlow code and pre-trained models for BERT
  • spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

Quick Overview

The LearningCircuit/local-deep-research repository provides an AI-powered research assistant that performs deep, iterative research using multiple LLMs and search engines. It breaks complex questions into focused sub-queries, searches sources such as arXiv, PubMed, GitHub, the general web, and your private documents, and produces comprehensive reports with proper citations. It can run entirely locally (for example with Ollama and SearXNG), so researchers, students, and professionals can do systematic research without relying on cloud services.

Pros

  • Can run entirely locally (e.g., Ollama + SearXNG), keeping research private and reducing dependency on cloud services
  • Works with any LLM, any search engine, and any vector store, including your private documents
  • Produces cited, verifiable reports, with built-in benchmarking and analytics
  • MIT-licensed and open source with an active community

Cons

  • Local models need capable hardware and careful configuration to approach cloud-model accuracy
  • Requires some technical knowledge to set up (Docker, Ollama, SearXNG)
  • Premium search engines (Tavily, Google, Brave) require API keys
  • Benchmark results are preliminary and vary with query types and configuration

Code Examples

Here are a few examples based on the Python API documented in the README later on this page; treat any detail not shown there as illustrative:

  1. Getting a quick research summary:
from local_deep_research.api import quick_summary

result = quick_summary("What are the latest advances in quantum computing?")
print(result["summary"])
  2. Customizing the search configuration:
result = quick_summary(
    query="Impact of AI on healthcare",
    search_tool="searxng",
    search_strategy="focused-iteration",
    iterations=2
)
  3. Searching your own knowledge base through a LangChain retriever:
result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": your_retriever},  # any LangChain-compatible retriever
    search_tool="company_kb"
)
  4. Running a benchmark from the command line:
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50

Getting Started

To get started with the local-deep-research project, follow these steps:

  1. Clone the repository:

    git clone https://github.com/LearningCircuit/local-deep-research.git
    cd local-deep-research
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Run the setup script:

    python setup.py
    
  4. Start experimenting with the provided examples:

    python examples/basic_training.py
    

For more detailed instructions and documentation, refer to the README.md file in the repository.

Competitor Comparisons

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Pros of transformers

  • Extensive library of pre-trained models for various NLP tasks
  • Well-documented and actively maintained by a large community
  • Seamless integration with popular deep learning frameworks

Cons of transformers

  • Larger library size and potentially higher resource requirements
  • Steeper learning curve for beginners due to its comprehensive nature
  • May include unnecessary features for simple projects

Code comparison

transformers:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("I love this library!")[0]
print(f"Label: {result['label']}, Score: {result['score']:.4f}")

local-deep-research:

# No direct equivalent code available
# The repository focuses on local research and doesn't provide
# a similar high-level API for sentiment analysis

Summary

transformers is a comprehensive library for NLP tasks with a wide range of pre-trained models and extensive documentation. It offers more features and community support but may be overkill for simple projects. local-deep-research appears to be a smaller, more focused repository for local deep learning research, potentially offering a simpler approach for specific use cases but with fewer out-of-the-box features compared to transformers.


A library for efficient similarity search and clustering of dense vectors.

Pros of faiss

  • Highly optimized for efficient similarity search and clustering of dense vectors
  • Supports GPU acceleration for improved performance on large datasets
  • Extensive documentation and active community support

Cons of faiss

  • Steeper learning curve due to its complex architecture and numerous options
  • Primarily focused on vector similarity search, less versatile for general research tasks
  • Requires more setup and configuration compared to simpler alternatives

Code comparison

faiss:

import faiss

index = faiss.IndexFlatL2(d)  # d: vector dimensionality
index.add(xb)                 # xb: database vectors to index
D, I = index.search(xq, k)    # xq: query vectors; k: neighbors to return

local-deep-research:

# Illustrative only; the documented entry point is
# local_deep_research.api.quick_summary (see the README below)
from local_deep_research.api import quick_summary

result = quick_summary(query)  # query: a research question string

Summary

faiss is a powerful library for efficient similarity search and clustering of dense vectors, offering GPU acceleration and extensive features. However, it has a steeper learning curve and is more focused on vector operations. local-deep-research appears to be a simpler alternative for general research tasks, potentially easier to set up and use, but with fewer advanced features and optimizations compared to faiss.


Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of pytorch

  • Extensive ecosystem with wide industry adoption
  • Comprehensive documentation and community support
  • Highly optimized for performance and scalability

Cons of pytorch

  • Steeper learning curve for beginners
  • Larger codebase and installation size
  • More complex setup for custom environments

Code Comparison

local-deep-research:

import torch
from torch import nn

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)  # single fully connected layer

    def forward(self, x):
        return self.fc(x)

pytorch:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

local-deep-research is designed for simplicity and ease of use in local research environments, while pytorch offers a comprehensive framework for deep learning tasks. The snippets above are illustrative: the first shows a minimal network definition, while the pytorch one uses more advanced built-in components such as convolutional layers. Note that local-deep-research itself (see the README below) is a research assistant rather than a neural-network framework.


An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Extensive ecosystem with robust tools and libraries
  • Strong community support and extensive documentation
  • Highly scalable for large-scale machine learning projects

Cons of TensorFlow

  • Steeper learning curve for beginners
  • Can be overkill for smaller projects or quick prototyping
  • Slower development cycle compared to more lightweight frameworks

Code Comparison

TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

Local Deep Research:

# Illustrative pseudocode; local-deep-research does not expose a
# neural-network API (its documented entry point is quick_summary)
from local_deep_research import NeuralNetwork, Dense

model = NeuralNetwork([
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

The code comparison shows that TensorFlow uses a more structured approach with its keras API, while Local Deep Research appears to have a simpler, more direct implementation. TensorFlow's code is likely part of a larger, more comprehensive framework, whereas Local Deep Research might be more focused on quick, local experimentation.


TensorFlow code and pre-trained models for BERT

Pros of BERT

  • Widely adopted and extensively researched NLP model
  • Supports multiple languages and tasks out-of-the-box
  • Backed by Google, ensuring ongoing development and support

Cons of BERT

  • Large model size, requiring significant computational resources
  • Complex architecture, potentially challenging for beginners
  • Limited flexibility for customization without extensive modifications

Code Comparison

BERT:

import tensorflow as tf
import bert

bert_params = bert.params_from_pretrained_ckpt(bert_ckpt_dir)
bert_layer = bert.BertModelLayer.from_params(bert_params, name="bert")

input_ids = tf.keras.layers.Input(shape=(max_seq_len,), dtype='int32', name="input_ids")
output = bert_layer(input_ids)

local-deep-research:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)

The local-deep-research repository appears to focus on local implementation of deep learning models, potentially offering more flexibility and customization options. However, it may lack the extensive pre-training and optimization of BERT. The code comparison shows that BERT uses TensorFlow, while local-deep-research uses PyTorch and the Transformers library, which may impact ease of use and integration depending on the user's preferred framework.


💫 Industrial-strength Natural Language Processing (NLP) in Python

Pros of spaCy

  • Comprehensive NLP library with extensive features and pre-trained models
  • Well-documented and actively maintained by a dedicated team
  • Optimized for production use with efficient performance

Cons of spaCy

  • Steeper learning curve due to its extensive feature set
  • Larger footprint and potentially slower for simple NLP tasks
  • Less flexibility for customization compared to lightweight alternatives

Code Comparison

spaCy:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

for ent in doc.ents:
    print(ent.text, ent.label_)

local-deep-research:

# No direct code comparison available
# local-deep-research focuses on local AI research
# and doesn't provide a similar NLP pipeline

Summary

spaCy is a robust, production-ready NLP library with extensive features, while local-deep-research appears to be a project focused on local AI research. spaCy offers a comprehensive solution for various NLP tasks, but may be overkill for simpler applications. local-deep-research likely provides more flexibility for custom research implementations but lacks the out-of-the-box functionality of spaCy.


README

Local Deep Research


AI-powered research assistant that performs deep, iterative research using multiple LLMs and search engines, with proper citations

🚀 What is Local Deep Research?

LDR is an AI research assistant that performs systematic research by:

  • Breaking down complex questions into focused sub-queries
  • Searching multiple sources in parallel (web, academic papers, local documents)
  • Verifying information across sources for accuracy
  • Creating comprehensive reports with proper citations

It aims to help researchers, students, and professionals find accurate information quickly while maintaining transparency about sources.
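Conceptually, the loop behind these steps looks something like the sketch below. This is not LDR's actual implementation; every function here is a hypothetical stand-in for the stages listed above.

# Hedged sketch of the iterative research loop; all helpers are
# hypothetical stand-ins, not part of local_deep_research's API.
from typing import List

def decompose_question(question: str, findings: List[str]) -> List[str]:
    # An LLM would generate focused sub-queries here.
    return [f"{question} (aspect {i})" for i in range(2)]

def search_all(query: str) -> List[str]:
    # Fan out to web, academic, and local-document search engines in parallel.
    return [f"source snippet for: {query}"]

def verify(results: List[str]) -> List[str]:
    # Cross-check claims across sources; keep only consistent findings.
    return results

def research(question: str, iterations: int = 2) -> str:
    findings: List[str] = []
    for _ in range(iterations):
        for query in decompose_question(question, findings):
            findings.extend(verify(search_all(query)))
    # A real system would compose a cited report from the findings.
    return "\n".join(findings)

print(research("What are the latest advances in quantum computing?"))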

🎯 Why Choose LDR?

  • Privacy-Focused: Run entirely locally with Ollama + SearXNG
  • Flexible: Use any LLM, any search engine, any vector store
  • Comprehensive: Multiple research modes from quick summaries to detailed reports
  • Transparent: Track costs and performance with built-in analytics
  • Open Source: MIT licensed with an active community

📊 Performance

~95% accuracy on SimpleQA benchmark (preliminary results)

  • Tested with GPT-4.1-mini + SearXNG + focused-iteration strategy
  • Comparable to state-of-the-art AI research systems
  • Local models can achieve similar performance with proper configuration
  • Join our community benchmarking effort →

✨ Key Features

🔍 Research Modes

  • Quick Summary - Get answers in 30 seconds to 3 minutes with citations
  • Detailed Research - Comprehensive analysis with structured findings
  • Report Generation - Professional reports with sections and table of contents
  • Document Analysis - Search your private documents with AI

🛠️ Advanced Capabilities

  • LangChain Integration - Use any vector store as a search engine
  • REST API - Language-agnostic HTTP access
  • Benchmarking - Test and optimize your configuration
  • Analytics Dashboard - Track costs, performance, and usage metrics
  • Real-time Updates - WebSocket support for live research progress
  • Export Options - Download results as PDF or Markdown
  • Research History - Save, search, and revisit past research
  • Adaptive Rate Limiting - Intelligent retry system that learns optimal wait times
  • Keyboard Shortcuts - Navigate efficiently (ESC, Ctrl+Shift+1-5)

🌐 Search Sources

Free Search Engines

  • Academic: arXiv, PubMed, Semantic Scholar
  • General: Wikipedia, SearXNG, DuckDuckGo
  • Technical: GitHub, Elasticsearch
  • Historical: Wayback Machine
  • News: The Guardian

Premium Search Engines

  • Tavily - AI-powered search
  • Google - Via SerpAPI or Programmable Search Engine
  • Brave Search - Privacy-focused web search

Custom Sources

  • Local Documents - Search your files with AI
  • LangChain Retrievers - Any vector store or database
  • Meta Search - Combine multiple engines intelligently

Full Search Engines Guide →
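As an illustration, the search_tool parameter used in the Python examples later in this README plausibly selects among these engines. The identifiers "arxiv" and "wikipedia" below are assumptions inferred from the engine names above; check the Search Engines Guide for the exact values.

from local_deep_research.api import quick_summary

# Engine identifiers are assumptions; see the Search Engines Guide.
papers = quick_summary(query="diffusion models for protein design", search_tool="arxiv")
background = quick_summary(query="diffusion models", search_tool="wikipedia")
print(papers["summary"])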

⚡ Quick Start

Option 1: Docker (quickstart on Mac/ARM)

# Step 1: Pull and run SearXNG for optimal search results
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 2: Pull and run Local Deep Research (on ARM, please build your own Docker image)
docker run -d -p 5000:5000 --name local-deep-research --volume 'deep-research:/install/.venv/lib/python3.13/site-packages/data/' localdeepresearch/local-deep-research

Option 2: Docker Compose (Recommended)

LDR uses Docker Compose to bundle the web app and all its dependencies so you can get up and running quickly.

Option 2a: Quick Start (One Command)

curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d

Open http://localhost:5000 after ~30 seconds. This starts LDR with SearXNG and all dependencies.

Option 2b: DIY docker-compose

See docker-compose.yml for a compose file with reasonable defaults that runs Ollama, SearXNG, and Local Deep Research together locally.

Things you may want or need to configure (an illustrative snippet follows the list):

  • Ollama GPU driver
  • Ollama context length (depends on available VRAM)
  • Ollama keep-alive (how long an idle model stays loaded in VRAM before being unloaded automatically)
  • Deep Research model (depends on available VRAM and preference)
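As a rough illustration of where these knobs live (not a drop-in file): the service name depends on your generated docker-compose.yml, OLLAMA_KEEP_ALIVE and OLLAMA_CONTEXT_LENGTH are standard Ollama environment variables, and the GPU block uses Docker Compose's standard device-reservation syntax.

services:
  ollama:
    image: ollama/ollama
    environment:
      - OLLAMA_KEEP_ALIVE=10m        # idle time before a model is unloaded from VRAM
      - OLLAMA_CONTEXT_LENGTH=8192   # pick a context length that fits your VRAM
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia         # GPU passthrough for Ollama
              count: all
              capabilities: [gpu]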

Option 2c: Use Cookiecutter to tailor a docker-compose file to your needs:

Prerequisites

Clone the repository:

git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research

Configuring with Docker Compose

Cookiecutter will interactively guide you through creating a docker-compose configuration that meets your specific needs. This is the recommended approach if you are not very familiar with Docker.

In the LDR repository, run the following command to generate the compose file:

cookiecutter cookiecutter-docker/
docker compose -f docker-compose.default.yml up

Docker Compose Guide →

Option 3: Python Package

# Step 1: Install the package
pip install local-deep-research

# Step 2: Setup SearXNG for best results
docker pull searxng/searxng
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 3: Install Ollama from https://ollama.ai

# Step 4: Download a model
ollama pull gemma3:12b

# Step 5: Start the web interface
python -m local_deep_research.web.app

Full Installation Guide →

💻 Usage Examples

Python API

from local_deep_research.api import quick_summary

# Simple usage
result = quick_summary("What are the latest advances in quantum computing?")
print(result["summary"])

# Advanced usage with custom configuration
result = quick_summary(
    query="Impact of AI on healthcare",
    search_tool="searxng",
    search_strategy="focused-iteration",
    iterations=2
)

HTTP API

curl -X POST http://localhost:5000/api/v1/quick_summary \
  -H "Content-Type: application/json" \
  -d '{"query": "Explain CRISPR gene editing"}'

More Examples →

Command Line Tools

# Run benchmarks from CLI
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50

# Manage rate limiting
python -m local_deep_research.web_search_engines.rate_limiting status
python -m local_deep_research.web_search_engines.rate_limiting reset

🔗 Enterprise Integration

Connect LDR to your existing knowledge base:

from local_deep_research.api import quick_summary

# Use your existing LangChain retriever
result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": your_retriever},
    search_tool="company_kb"
)

Works with: FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.
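For example, a FAISS-backed LangChain retriever could be wired in as below. This is a sketch: FakeEmbeddings and the two documents are placeholders (use a real embedding model and your own corpus), while the retrievers and search_tool parameters come from the snippet above.

# Sketch: build a LangChain FAISS retriever and hand it to LDR.
from langchain_community.embeddings import FakeEmbeddings  # placeholder embeddings
from langchain_community.vectorstores import FAISS

from local_deep_research.api import quick_summary

docs = [
    "All deployments go through the staging environment first.",
    "Rollbacks use the blue/green switch in the deploy dashboard.",
]
store = FAISS.from_texts(docs, FakeEmbeddings(size=384))

result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": store.as_retriever()},
    search_tool="company_kb",
)
print(result["summary"])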

Integration Guide →

📊 Performance & Analytics

Benchmark Results

Early experiments on small SimpleQA dataset samples:

| Configuration | Accuracy | Notes |
|---|---|---|
| gpt-4.1-mini + SearXNG + focused_iteration | 90-95% | Limited sample size |
| gpt-4.1-mini + Tavily + focused_iteration | 90-95% | Limited sample size |
| gemini-2.0-flash-001 + SearXNG | 82% | Single test run |

Note: These are preliminary results from initial testing. Performance varies significantly based on query types, model versions, and configurations. Run your own benchmarks →

Built-in Analytics Dashboard

Track costs, performance, and usage with detailed metrics. Learn more →

🤖 Supported LLMs

Local Models (via Ollama)

  • Llama 3, Mistral, Gemma, DeepSeek
  • LLM processing stays local (search queries still go to web)
  • No API costs

Cloud Models

  • OpenAI (GPT-4, GPT-3.5)
  • Anthropic (Claude 3)
  • Google (Gemini)
  • 100+ models via OpenRouter

Model Setup →

📚 Documentation

Getting Started

Core Features

Advanced Features

Development

Examples & Tutorials

🤝 Community & Support

🚀 Contributing

We welcome contributions! See our Contributing Guide to get started.

📄 License

MIT License - see LICENSE file.

Built with: LangChain, Ollama, SearXNG, FAISS

Support Free Knowledge: Consider donating to Wikipedia, arXiv, or PubMed.