khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
Top Related Projects
Examples and guides for using the OpenAI API
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A library for efficient similarity search and clustering of dense vectors.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
An open-source NLP research library, built on PyTorch.
Quick Overview
Khoj is an AI-powered personal search engine and chatbot. It allows users to search and chat with their personal knowledge base, including notes, documents, and images. Khoj aims to provide a privacy-focused, offline-first solution for personal information management and retrieval.
Pros
- Privacy-focused: Runs locally on your device, ensuring data privacy
- Versatile: Supports various file formats and integrations (e.g., Markdown, Org-mode, PDF, images)
- Customizable: Offers different search algorithms and embedding models
- Open-source: Allows for community contributions and transparency
Cons
- Resource-intensive: May require significant computational resources for large knowledge bases
- Setup complexity: Initial configuration and indexing process can be challenging for non-technical users
- Limited natural language understanding: May not always interpret complex queries accurately
- Ongoing maintenance: Requires regular updates and re-indexing to keep the knowledge base current
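The re-indexing concern above can be reduced by only reprocessing files whose content actually changed. A minimal sketch of that idea using content hashes (an illustration, not Khoj's actual implementation):

```python
import hashlib

def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def files_to_reindex(current: dict[str, str], previous_hashes: dict[str, str]) -> list[str]:
    """Compare current file contents against stored hashes; return changed or new paths."""
    return [
        path for path, text in current.items()
        if previous_hashes.get(path) != content_hash(text)
    ]

# First run: everything is new, so everything gets indexed.
docs = {"notes.md": "# Meditation\nBreathe.", "todo.org": "* Buy milk"}
stored = {}
print(files_to_reindex(docs, stored))
stored = {p: content_hash(t) for p, t in docs.items()}

# Second run: only the edited file needs re-indexing.
docs["todo.org"] = "* Buy milk\n* Call mom"
print(files_to_reindex(docs, stored))  # only 'todo.org'
```

Real indexers track this state on disk between runs, but the change-detection logic is the same.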
Code Examples
# Initialize Khoj client
from khoj.utils.khoj_client import KhojClient

client = KhojClient(api_url="http://localhost:8000")

# Perform a search query
results = client.search("What are the benefits of meditation?")
print(results)

# Chat with Khoj
conversation = client.chat("Tell me about the last book I read.")
for message in conversation:
    print(f"{message['role']}: {message['content']}")

# Add a new file to the knowledge base
client.index_file("/path/to/new_document.md")
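Besides a Python client, a self-hosted Khoj server can be queried over plain HTTP. A minimal sketch of constructing such a request, assuming the default port and a `/api/search` endpoint with `q`/`n` query parameters (verify against your Khoj version's API docs):

```python
from urllib.parse import urlencode

KHOJ_URL = "http://localhost:8000"  # default self-hosted address

def build_search_url(query: str, max_results: int = 5) -> str:
    """Build the search endpoint URL; the path and 'q'/'n' parameter names are assumptions."""
    params = urlencode({"q": query, "n": max_results})
    return f"{KHOJ_URL}/api/search?{params}"

url = build_search_url("benefits of meditation")
print(url)
# With a running server, you could then fetch it, e.g.:
#   import requests
#   results = requests.get(url).json()
```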
Getting Started
1. Install Khoj:
   pip install khoj-assistant
2. Start the Khoj server:
   khoj
3. Open a web browser and navigate to http://localhost:8000 to access the Khoj web interface.
4. Configure your knowledge base sources in the settings.
5. Begin searching and chatting with your personal knowledge base!
Competitor Comparisons
Examples and guides for using the OpenAI API
Pros of openai-cookbook
- Comprehensive collection of OpenAI API usage examples and best practices
- Regularly updated with new features and improvements from OpenAI
- Extensive documentation and explanations for various AI tasks
Cons of openai-cookbook
- Focused solely on OpenAI's products, limiting its applicability to other AI platforms
- Requires API keys and potentially significant costs for running examples
- Less emphasis on local, privacy-focused AI solutions
Code Comparison
openai-cookbook:
import openai

# Note: text-davinci-002 and the Completions endpoint are legacy;
# newer OpenAI SDKs use the chat completions endpoint instead.
response = openai.Completion.create(
    engine="text-davinci-002",
    prompt="Translate the following English text to French: '{}'",
    max_tokens=60,
)
khoj:
from khoj.utils.ai import get_ai_response

response = get_ai_response(
    "Translate the following English text to French: '{}'",
    model="gpt-3.5-turbo",
)
Summary
openai-cookbook provides a wealth of information and examples for working with OpenAI's APIs, making it an excellent resource for developers using their services. However, it's limited to OpenAI's ecosystem and may involve costs.
khoj offers a more privacy-focused, local-first approach to AI integration, with support for multiple models and platforms. It may have a steeper learning curve but provides greater flexibility and control over data privacy.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Pros of transformers
- Extensive library of pre-trained models for various NLP tasks
- Well-documented and widely adopted in the AI/ML community
- Regular updates and contributions from a large open-source community
Cons of transformers
- Steep learning curve for beginners due to its comprehensive nature
- Can be resource-intensive, especially for large models
- Primarily focused on NLP tasks, limiting its use in other domains
Code Comparison
Khoj (Python):
from khoj.utils.constants import EMBEDDING_MODEL_NAME
from khoj.utils.rawconfig import RawConfig
from khoj.processor.content.text_to_entries import process_text_files
config = RawConfig(embedding_model=EMBEDDING_MODEL_NAME)
entries = process_text_files(config)
transformers (Python):
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)
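The transformers snippet above stops at per-token outputs; semantic search engines like Khoj typically mean-pool these into one vector per sentence before comparing. A toy illustration of that pooling step, using NumPy arrays as stand-ins for real model tensors:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors into one sentence vector, ignoring padding (mask == 0)."""
    mask = attention_mask[:, :, None]            # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = mask.sum(axis=1)                    # number of real tokens per sentence
    return summed / counts

# Toy batch: 1 sentence, 4 tokens (last one is padding), hidden size 3.
tokens = np.array([[[1., 2., 3.], [3., 2., 1.], [2., 2., 2.], [9., 9., 9.]]])
mask = np.array([[1, 1, 1, 0]])
print(mean_pool(tokens, mask))  # [[2. 2. 2.]]
```

The padding token's values are excluded, so only the three real tokens contribute to the average.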
A library for efficient similarity search and clustering of dense vectors.
Pros of FAISS
- Highly optimized for large-scale similarity search and clustering of dense vectors
- Supports GPU acceleration for faster processing
- Extensive documentation and wide industry adoption
Cons of FAISS
- Steeper learning curve due to its focus on low-level operations
- Limited to vector similarity search, lacking broader AI assistant capabilities
- Requires more setup and integration work for end-user applications
Code Comparison
FAISS (vector indexing and search):
import faiss
import numpy as np

d = 64                                              # vector dimensionality
xb = np.random.random((1000, d)).astype("float32")  # database vectors to index
xq = np.random.random((5, d)).astype("float32")     # query vectors

index = faiss.IndexFlatL2(d)   # exact L2-distance index
index.add(xb)
D, I = index.search(xq, 4)     # distances and indices of the 4 nearest neighbors
Khoj (AI assistant interaction):
from khoj.interface.cli import cli
result = cli.query("What is the capital of France?")
print(result)
Key Differences
FAISS is a specialized library for efficient similarity search and clustering of dense vectors, ideal for large-scale machine learning applications. Khoj, on the other hand, is an AI-powered personal assistant focused on natural language processing and information retrieval from personal knowledge bases.
While FAISS excels in vector operations, Khoj provides a more user-friendly interface for AI-assisted tasks and personal knowledge management. FAISS requires more technical expertise but offers greater flexibility for custom vector search implementations, whereas Khoj aims to provide an out-of-the-box solution for AI-powered personal assistance.
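At its core, what FAISS accelerates is nearest-neighbor search. A brute-force NumPy equivalent of an exact L2 search (what `IndexFlatL2` computes) makes the operation concrete, though it would not scale the way FAISS does:

```python
import numpy as np

def knn_l2(xb: np.ndarray, xq: np.ndarray, k: int):
    """Exact k-nearest-neighbor search under squared L2 distance."""
    # (nq, nb) matrix of squared distances between every query and database vector
    d2 = ((xq[:, None, :] - xb[None, :, :]) ** 2).sum(axis=-1)
    idx = np.argsort(d2, axis=1)[:, :k]          # indices of the k closest vectors
    dist = np.take_along_axis(d2, idx, axis=1)   # their distances
    return dist, idx

xb = np.array([[0., 0.], [1., 0.], [5., 5.]])
xq = np.array([[0.9, 0.1]])
D, I = knn_l2(xb, xq, k=2)
print(I)  # nearest two database vectors for the query
```

This O(nq x nb) scan is what FAISS replaces with optimized (and optionally approximate, GPU-accelerated) index structures.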
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Pros of Haystack
- More comprehensive and feature-rich, offering a wide range of NLP tasks and pipelines
- Larger community and ecosystem, with better documentation and support
- Designed for production-ready, scalable applications
Cons of Haystack
- Steeper learning curve due to its complexity and extensive features
- Heavier resource requirements, which may be overkill for simpler projects
- Less focus on personal knowledge management compared to Khoj
Code Comparison
Khoj (Python):
from khoj.processor.text.semantic_search import SemanticSearch
searcher = SemanticSearch()
results = searcher.search("query", ["file1.txt", "file2.txt"])
Haystack (Python):
from haystack import Pipeline
from haystack.nodes import EmbeddingRetriever, Ranker
pipeline = Pipeline()
pipeline.add_node(component=EmbeddingRetriever(), name="Retriever", inputs=["Query"])
pipeline.add_node(component=Ranker(), name="Ranker", inputs=["Retriever"])
results = pipeline.run(query="query", documents=["file1.txt", "file2.txt"])
The code comparison shows that Khoj offers a simpler, more straightforward API for semantic search, while Haystack provides a more flexible and customizable pipeline approach. Haystack's code demonstrates its ability to chain multiple components together, which can be beneficial for complex NLP tasks but may be unnecessary for basic search functionality.
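The retrieve-then-rank chaining described above can also be sketched framework-free, which clarifies what a pipeline actually wires together; the scoring functions below are toy stand-ins, not Haystack or Khoj internals:

```python
def retrieve(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    """Toy retriever: score documents by how many query words they share."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:top_k]

def rank(query: str, docs: list[str]) -> list[str]:
    """Toy ranker: re-score candidates by the fraction of their words matching the query."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.split())) / len(d.split()), reverse=True)

def pipeline(query: str, docs: list[str]) -> list[str]:
    """Chain retriever and ranker, as an orchestration framework would."""
    return rank(query, retrieve(query, docs))

docs = [
    "paris is the capital of france",
    "france borders spain",
    "the eiffel tower is in paris france",
    "berlin is in germany",
]
print(pipeline("capital of france", docs))
```

In a real framework each component would be a neural model, but the data flow, a cheap broad retriever feeding a more precise ranker, is the same.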
An open-source NLP research library, built on PyTorch.
Pros of AllenNLP
- Comprehensive NLP toolkit with a wide range of pre-built models and components
- Extensive documentation and tutorials for ease of use
- Large and active community support
Cons of AllenNLP
- Steeper learning curve for beginners due to its extensive feature set
- Heavier resource requirements for some models and tasks
Code Comparison
AllenNLP:
from allennlp.predictors.predictor import Predictor
predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz")
result = predictor.predict(sentence="Did Uriah honestly think he could beat the game in under three hours?")
Khoj:
from khoj.utils.rawconfig import RawConfig
from khoj.processor.content.markdown import MarkdownContent
config = RawConfig(content_type="markdown")
processor = MarkdownContent(config)
entries = processor.process_file("path/to/file.md")
AllenNLP offers a more comprehensive set of NLP tools and pre-trained models, making it suitable for a wide range of NLP tasks. It has extensive documentation and community support, but may have a steeper learning curve for beginners. Khoj, on the other hand, is more focused on personal knowledge management and information retrieval, with a simpler API for processing specific content types like Markdown.
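Splitting a Markdown file into indexable entries, as the `MarkdownContent` snippet above does, can be sketched in a few lines; this is a simplification for illustration, not Khoj's actual processor:

```python
def markdown_to_entries(text: str) -> list[dict]:
    """Split markdown into one entry per heading, keeping the body under each."""
    entries, current = [], None
    for line in text.splitlines():
        if line.startswith("#"):
            current = {"heading": line.lstrip("# ").strip(), "body": []}
            entries.append(current)
        elif current is not None:
            current["body"].append(line)
    # Join body lines and drop surrounding blank lines.
    for e in entries:
        e["body"] = "\n".join(e["body"]).strip()
    return entries

doc = "# Books\nRead Dune.\n\n# Ideas\nWrite more notes."
print(markdown_to_entries(doc))
```

Each heading/body pair then becomes one searchable unit, which is what makes heading-structured notes index well.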
README
Docs • Web • App • Discord • Blog
New
- Start any message with /research to try out the experimental research mode with Khoj.
- Anyone can now create custom agents with tunable personality, tools and knowledge bases.
- Read about Khoj's excellent performance on modern retrieval and reasoning benchmarks.
Overview
Khoj is a personal AI app to extend your capabilities. It smoothly scales up from an on-device personal AI to a cloud-scale enterprise AI.
- Chat with any local or online LLM (e.g. llama3, qwen, gemma, mistral, gpt, claude, gemini, deepseek).
- Get answers from the internet and your docs (including image, pdf, markdown, org-mode, word, notion files).
- Access it from your browser, Obsidian, Emacs, desktop, phone or WhatsApp.
- Create agents with custom knowledge, persona, chat model and tools to take on any role.
- Automate away repetitive research. Get personal newsletters and smart notifications delivered to your inbox.
- Find relevant docs quickly and easily using our advanced semantic search.
- Generate images, talk out loud, play your messages.
- Khoj is open-source, self-hostable. Always.
- Run it privately on your computer or try it on our cloud app.
See it in action
Go to https://app.khoj.dev to see Khoj live.
Full feature list
You can see the full feature list here.
Self-Host
To get started with self-hosting Khoj, read the docs.
Enterprise
Khoj is available as a cloud service, on-premises, or as a hybrid solution. To learn more about Khoj Enterprise, visit our website.
Frequently Asked Questions (FAQ)
Q: Can I use Khoj without self-hosting?
Yes! You can use Khoj right away at https://app.khoj.dev, no setup required.
Q: What kinds of documents can Khoj read?
Khoj supports a wide variety: PDFs, Markdown, Notion, Word docs, org-mode files, and more.
Q: How can I make my own agent?
Check out this blog post for a step-by-step guide to custom agents. For more questions, head over to our Discord!
Contributors
Cheers to our awesome contributors!
Made with contrib.rocks.
Interested in Contributing?
Khoj is open source. It is sustained by the community and we'd love for you to join it! Whether you're a coder, designer, writer, or enthusiast, there's a place for you.
Why Contribute?
- Make an Impact: Help build, test and improve a tool used by thousands to boost productivity.
- Learn & Grow: Work on cutting-edge AI, LLMs, and semantic search technologies.
You can help us build new features, improve the project documentation, report issues and fix bugs. If you're a developer, please see our Contributing Guidelines and check out good first issues to work on.