Top Related Projects
- openai-python: The official Python library for the OpenAI API
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- TensorFlow: An Open Source Machine Learning Framework for Everyone
- Haystack: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
- Elasticsearch: Free and Open Source, Distributed, RESTful Search Engine
Quick Overview
Jina is an open-source neural search framework for building cross-modal and multi-modal applications powered by deep learning. It allows developers to build scalable and cloud-native neural search solutions that can handle various data types, including text, images, video, and audio.
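To make the idea of neural search more concrete, here is a minimal sketch that does not use Jina's API at all: documents and queries are mapped to vectors and ranked by cosine similarity. The `embed` function is a toy bag-of-words stand-in for a real deep-learning encoder, and every name below is hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a neural encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, corpus: list, top_k: int = 1) -> list:
    """Rank the corpus by similarity to the query and return the best top_k documents."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]


corpus = ["a cute cat on a sofa", "stock prices fell today", "a dog in the park"]
best = search("cute cat", corpus)
print(best)
```

In a real system the encoder would be a trained model and the ranking would typically be delegated to a vector index, but the index-then-match flow is the same.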
Pros
- Supports multi-modal and cross-modal search capabilities
- Highly scalable and cloud-native architecture
- Provides a rich ecosystem of pre-built executors and integrations
- Easy to use with a pythonic API and comprehensive documentation
Cons
- Steep learning curve for beginners in neural search
- Limited community support compared to more established search frameworks
- May be overkill for simple search use cases
- Requires significant computational resources for large-scale deployments
Code Examples
- Creating a simple text search flow:
from jina import Flow, Document
f = Flow().add(uses='jinahub://SimpleIndexer')
with f:
    f.post('/index', Document(text='Hello, World!'))
    response = f.post('/search', Document(text='Hello'))
    print(response[0].matches[0].text)
- Building an image search pipeline:
from jina import Flow, Document
f = (
    Flow()
    .add(uses='jinahub://CLIPImageEncoder')
    .add(uses='jinahub://SimpleIndexer')
)
with f:
    f.index(Document(uri='path/to/image.jpg'))
    response = f.search(Document(uri='path/to/query_image.jpg'))
    print(response[0].matches[0].uri)
- Creating a multi-modal search flow:
from jina import Flow, Document
f = (
    Flow()
    .add(uses='jinahub://CLIPTextEncoder', name='text_encoder')
    .add(uses='jinahub://CLIPImageEncoder', name='image_encoder')
    .add(uses='jinahub://SimpleIndexer')
)
with f:
    f.index([
        Document(text='A cute cat'),
        Document(uri='path/to/cat_image.jpg')
    ])
    response = f.search(Document(text='Find me a cat picture'))
    print(response[0].matches[0].uri)
Getting Started
To get started with Jina, follow these steps:
- Install Jina:
pip install jina
- Create a new Python file (e.g., app.py) and import Jina:
from jina import Flow, Document
- Define a simple flow and run a search:
f = Flow().add(uses='jinahub://SimpleIndexer')
with f:
    f.post('/index', Document(text='Hello, Jina!'))
    response = f.post('/search', Document(text='Hello'))
    print(response[0].matches[0].text)
- Run the script:
python app.py
For more advanced usage and configurations, refer to the official Jina documentation.
Competitor Comparisons
The official Python library for the OpenAI API
Pros of openai-python
- Focused specifically on OpenAI's API, providing a streamlined interface
- Extensive documentation and examples for various OpenAI services
- Lightweight and easy to integrate into existing projects
Cons of openai-python
- Limited to OpenAI's services, lacking versatility for other AI tasks
- Requires API key and potentially costly usage of OpenAI's resources
- Less flexibility for custom AI model deployment and management
Code Comparison
openai-python:
# Legacy interface from openai-python releases before 1.0
import openai
openai.api_key = "your-api-key"
response = openai.Completion.create(engine="davinci", prompt="Hello, world!")
print(response.choices[0].text)
Jina:
from jina import Flow, Document
f = Flow().add(uses='jinahub://CLIPTextEncoder')
with f:
    resp = f.post('/search', inputs=Document(text='Hello, world!'))
    print(resp[0].matches)
The openai-python code focuses on text completion through OpenAI's API, while the Jina code composes a workflow from reusable components. Jina offers more customization and scalability for complex AI tasks, but requires more setup and familiarity with its ecosystem. openai-python provides a simpler interface, but it is limited to OpenAI's services.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Pros of Transformers
- Extensive library of pre-trained models for various NLP tasks
- Well-documented and widely adopted in the research community
- Seamless integration with PyTorch and TensorFlow
Cons of Transformers
- Focused primarily on NLP tasks, less versatile for other AI domains
- Can be resource-intensive for large models and datasets
- Steeper learning curve for beginners in machine learning
Code Comparison
Transformers:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")[0]
print(f"Label: {result['label']}, Score: {result['score']:.4f}")
Jina:
from jina import Flow, Document
f = Flow().add(uses='jinahub://SimpleIndexer')
with f:
    resp = f.post('/index', Document(text='I love this product!'))
    print(f"Indexed document: {resp[0].id}")
Key Differences
- Transformers focuses on NLP tasks, while Jina is a more general-purpose neural search framework
- Jina offers a microservice architecture for scalable AI applications, whereas Transformers is primarily a model library
- Transformers provides easy access to pre-trained models, while Jina emphasizes building end-to-end search solutions
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Widely adopted in the machine learning community with extensive ecosystem
- Powerful and flexible for deep learning research and production
- Excellent GPU acceleration and distributed training capabilities
Cons of PyTorch
- Steeper learning curve for beginners compared to Jina
- Primarily focused on deep learning, less versatile for general AI applications
- Requires more boilerplate code for certain tasks
Code Comparison
PyTorch example (basic neural network):
import torch.nn as nn
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.fc = nn.Linear(10, 5)
    def forward(self, x):
        return self.fc(x)
Jina example (basic Flow):
from jina import Flow, Document
f = Flow().add(uses='jinahub://SimpleIndexer')
with f:
    f.post('/index', Document(text='Hello, World!'))
PyTorch is a powerful deep learning framework, while Jina is a cloud-native neural search framework. PyTorch offers more flexibility for custom neural network architectures, whereas Jina provides higher-level abstractions for building search and AI applications. PyTorch requires more low-level coding, while Jina emphasizes simplicity and rapid development for specific use cases.
An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Extensive ecosystem with robust tools and libraries
- Highly optimized for large-scale machine learning and deep learning
- Strong support for distributed computing and GPU acceleration
Cons of TensorFlow
- Steeper learning curve, especially for beginners
- Can be overkill for simpler machine learning tasks
- Slower development cycle compared to more lightweight frameworks
Code Comparison
TensorFlow:
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
Jina:
from jina import Flow, Document
f = Flow().add(uses='jinahub://SimpleIndexer')
with f:
    f.post('/index', Document(text='Hello, World!'))
TensorFlow is a comprehensive machine learning framework, while Jina is a neural search framework. TensorFlow excels in building and training complex neural networks, whereas Jina focuses on creating scalable neural search solutions. TensorFlow's code typically involves defining and training models, while Jina's code centers around creating flows and processing documents for search and retrieval tasks.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Pros of Haystack
- More focused on question answering and document retrieval tasks
- Offers a wider range of pre-built pipelines for specific NLP tasks
- Better documentation and tutorials for beginners
Cons of Haystack
- Less flexible for general-purpose AI applications
- Smaller community and fewer contributors compared to Jina
Code Comparison
Haystack example:
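from haystack import Pipeline
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import TfidfRetriever, FARMReader
# Haystack 1.x API; an in-memory document store is assumed here so that document_store is defined
document_store = InMemoryDocumentStore()
pipeline = Pipeline()
pipeline.add_node(component=TfidfRetriever(document_store=document_store), name="Retriever", inputs=["Query"])
pipeline.add_node(component=FARMReader(model_name_or_path="deepset/roberta-base-squad2"), name="Reader", inputs=["Retriever"])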
Jina example:
from jina import Flow, Document
f = Flow().add(uses='jinahub://SimpleIndexer')
with f:
    f.post('/index', Document(text='Hello, World!'))
    f.post('/search', Document(text='Hello'))
Both frameworks offer easy-to-use pipelines for various NLP tasks, but Haystack is more specialized for question answering and document retrieval, while Jina provides a more flexible architecture for general AI applications. Haystack's code tends to be more explicit in defining pipeline components, whereas Jina's approach is more concise and modular.
Free and Open Source, Distributed, RESTful Search Engine
Pros of Elasticsearch
- Mature and battle-tested search engine with extensive documentation
- Powerful full-text search capabilities and advanced querying options
- Large ecosystem with numerous plugins and integrations
Cons of Elasticsearch
- Steep learning curve and complex configuration
- Resource-intensive, especially for large-scale deployments
- Primarily focused on text-based search, less versatile for multimodal data
Code Comparison
Elasticsearch query example:
{
  "query": {
    "match": {
      "title": "search example"
    }
  }
}
Jina query example:
from jina import Client, Document
c = Client()
d = Document(text='search example')
results = c.search(d)
Key Differences
- Jina is designed for multimodal and cross-modal search, while Elasticsearch excels in text-based search
- Elasticsearch uses a RESTful API with JSON queries, whereas Jina uses a Python-native API
- Jina focuses on neural search and deep learning models, while Elasticsearch relies more on traditional information retrieval techniques
Both projects have their strengths, with Elasticsearch being a robust choice for text-based search and Jina offering more flexibility for multimodal and AI-powered search applications.
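As a rough illustration of the "RESTful API with JSON queries" point, the sketch below builds (but does not send) the HTTP request that would carry the match query shown earlier to Elasticsearch's `_search` endpoint. The host `http://localhost:9200` and the index name `articles` are assumptions chosen for the example, and `build_search_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_search_request(base_url: str, index: str, field: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a `match` query as its JSON body."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Host and index name are placeholders for illustration only.
req = build_search_request("http://localhost:9200", "articles", "title", "search example")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running cluster; the Jina client, by contrast, hides the transport behind Python objects.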
README
Jina-Serve
Jina-serve is a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets. Scale your services from local development to production while focusing on your core logic.
Key Features
- Native support for all major ML frameworks and data types
- High-performance service design with scaling, streaming, and dynamic batching
- LLM serving with streaming output
- Built-in Docker integration and Executor Hub
- One-click deployment to Jina AI Cloud
- Enterprise-ready with Kubernetes and Docker Compose support
Comparison with FastAPI
Key advantages over FastAPI:
- DocArray-based data handling with native gRPC support
- Built-in containerization and service orchestration
- Seamless scaling of microservices
- One-command cloud deployment
Install
pip install jina
See guides for Apple Silicon and Windows.
Core Concepts
Three main layers:
- Data: BaseDoc and DocList for input/output
- Serving: Executors process Documents, Gateway connects services
- Orchestration: Deployments serve Executors, Flows create pipelines
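The three layers can be sketched in plain Python to show how they relate. This is a conceptual toy, not Jina's implementation: `TextDoc`, `UppercaseExecutor`, `ExclaimExecutor`, and `MiniFlow` are hypothetical names, and there is no Gateway or network transport here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextDoc:
    """Data layer: a typed document passed between services."""
    text: str


class UppercaseExecutor:
    """Serving layer: one unit of processing logic."""

    def process(self, docs: List[TextDoc]) -> List[TextDoc]:
        return [TextDoc(d.text.upper()) for d in docs]


class ExclaimExecutor:
    """A second processing step, to demonstrate chaining."""

    def process(self, docs: List[TextDoc]) -> List[TextDoc]:
        return [TextDoc(d.text + "!") for d in docs]


class MiniFlow:
    """Orchestration layer: runs executors in the order they were added."""

    def __init__(self) -> None:
        self._steps = []

    def add(self, executor) -> "MiniFlow":
        self._steps.append(executor)
        return self  # returning self allows chained .add() calls

    def post(self, docs: List[TextDoc]) -> List[TextDoc]:
        for executor in self._steps:
            docs = executor.process(docs)
        return docs


flow = MiniFlow().add(UppercaseExecutor()).add(ExclaimExecutor())
result = flow.post([TextDoc("hello")])
print(result[0].text)
```

In the real framework each Executor can run in its own process or container, which is what the orchestration layer manages.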
Build AI Services
Let's create a gRPC-based AI service using StableLM:
from jina import Executor, requests
from docarray import DocList, BaseDoc
from transformers import pipeline

class Prompt(BaseDoc):
    text: str

class Generation(BaseDoc):
    prompt: str
    text: str

class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )
    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            # each output is a list of candidate dicts; keep the text of the first candidate
            generations.append(Generation(prompt=prompt, text=output[0]['generated_text']))
        return generations
Deploy with Python or YAML:
from jina import Deployment
from executor import StableLM
dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)
with dep:
    dep.block()

jtype: Deployment
with:
  uses: StableLM
  py_modules:
    - executor.py
  timeout_ready: -1
  port: 12345
Use the client:
from jina import Client
from docarray import DocList
from executor import Prompt, Generation
prompt = Prompt(text='suggest an interesting image generation prompt')
client = Client(port=12345)
response = client.post('/', inputs=[prompt], return_type=DocList[Generation])
Build Pipelines
Chain services into a Flow:
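from jina import Flow
from executor import StableLM
from text_to_image import TextToImage  # assumes a TextToImage Executor defined in text_to_image.py
flow = Flow(port=12345).add(uses=StableLM).add(uses=TextToImage)
with flow:
    flow.block()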
Scaling and Deployment
Local Scaling
Boost throughput with built-in features:
- Replicas for parallel processing
- Shards for data partitioning
- Dynamic batching for efficient model inference
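The first two ideas can be illustrated with a small standalone sketch. It assumes round-robin dispatch across replicas and hash-based partitioning across shards; Jina's actual routing and sharding policies may differ, and `ReplicaPool` and `shard_for` are hypothetical names.

```python
import itertools
import zlib


class ReplicaPool:
    """Round-robin dispatch across identical workers (replicas)."""

    def __init__(self, num_replicas: int) -> None:
        self._cycle = itertools.cycle(range(num_replicas))

    def pick(self) -> int:
        """Return the index of the replica that should handle the next request."""
        return next(self._cycle)


def shard_for(doc_id: str, num_shards: int) -> int:
    """Stable partitioning: the same document id always maps to the same shard."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


pool = ReplicaPool(2)
assignments = [pool.pick() for _ in range(5)]
shards = [shard_for(f"doc-{i}", 3) for i in range(6)]
print(assignments)
print(shards)
```

Replicas raise throughput because each holds a full copy of the service, while shards split the data so that each worker holds only part of it.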
Example scaling a Stable Diffusion deployment:
jtype: Deployment
with:
  uses: TextToImage
  timeout_ready: -1
  py_modules:
    - text_to_image.py
  env:
    CUDA_VISIBLE_DEVICES: RR
  replicas: 2
  uses_dynamic_batching:
    /default:
      preferred_batch_size: 10
      timeout: 200
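The `preferred_batch_size` and `timeout` settings above can be understood through a simplified model of dynamic batching: requests accumulate until the batch is full or the oldest request has waited long enough, and the model then runs once for the whole batch. The sketch below is not Jina's implementation; `DynamicBatcher` is a hypothetical class, the clock is passed in explicitly to keep the example deterministic, and the timeout is treated as milliseconds.

```python
from typing import Callable, List, Optional


class DynamicBatcher:
    """Collect items and run them through `run_batch` in groups."""

    def __init__(self, run_batch: Callable[[list], list],
                 preferred_batch_size: int, timeout_ms: float) -> None:
        self.run_batch = run_batch
        self.preferred_batch_size = preferred_batch_size
        self.timeout_ms = timeout_ms
        self._pending: list = []
        self._first_arrival_ms: Optional[float] = None

    def submit(self, item, now_ms: float) -> list:
        """Queue an item; flush immediately if the batch is now full."""
        if not self._pending:
            self._first_arrival_ms = now_ms
        self._pending.append(item)
        if len(self._pending) >= self.preferred_batch_size:
            return self.flush()
        return []

    def tick(self, now_ms: float) -> list:
        """Flush a partial batch once its oldest item has waited timeout_ms."""
        if self._pending and now_ms - self._first_arrival_ms >= self.timeout_ms:
            return self.flush()
        return []

    def flush(self) -> list:
        batch, self._pending = self._pending, []
        return self.run_batch(batch)


batch_sizes: List[int] = []


def fake_model(batch: list) -> list:
    """Stand-in for model inference; records how many items each call received."""
    batch_sizes.append(len(batch))
    return [x * 2 for x in batch]


batcher = DynamicBatcher(fake_model, preferred_batch_size=3, timeout_ms=200)
out = []
out += batcher.submit(1, now_ms=0)
out += batcher.submit(2, now_ms=10)
out += batcher.submit(3, now_ms=20)   # batch is full: one model call for three items
out += batcher.submit(4, now_ms=30)
out += batcher.tick(now_ms=100)       # only 70 ms waited: nothing happens
out += batcher.tick(now_ms=230)       # 200 ms waited: the partial batch is flushed
print(out, batch_sizes)
```

Larger batch sizes tend to improve accelerator utilization, while the timeout bounds the extra latency a lone request can incur.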
Cloud Deployment
Containerize Services
- Structure your Executor:
TextToImage/
├── executor.py
├── config.yml
└── requirements.txt
- Configure:
# config.yml
jtype: TextToImage
py_modules:
  - executor.py
metas:
  name: TextToImage
  description: Text to Image generation Executor
- Push to Hub:
jina hub push TextToImage
Deploy to Kubernetes
jina export kubernetes flow.yml ./my-k8s
kubectl apply -R -f my-k8s
Use Docker Compose
jina export docker-compose flow.yml docker-compose.yml
docker-compose up
JCloud Deployment
Deploy with a single command:
jina cloud deploy jcloud-flow.yml
LLM Streaming
Enable token-by-token streaming for responsive LLM applications:
- Define schemas:
from docarray import BaseDoc

class PromptDocument(BaseDoc):
    prompt: str
    max_tokens: int

class ModelOutputDocument(BaseDoc):
    token_id: int
    generated_text: str
- Initialize service:
import torch
from jina import Executor, requests
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# module-level tokenizer used by the streaming endpoint
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

class TokenStreamingExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model = GPT2LMHeadModel.from_pretrained('gpt2')
- Implement streaming:
    # method of TokenStreamingExecutor: yields one document per generated token
    @requests(on='/stream')
    async def task(self, doc: PromptDocument, **kwargs) -> ModelOutputDocument:
        input = tokenizer(doc.prompt, return_tensors='pt')
        input_len = input['input_ids'].shape[1]
        for _ in range(doc.max_tokens):
            output = self.model.generate(**input, max_new_tokens=1)
            if output[0][-1] == tokenizer.eos_token_id:
                break
            yield ModelOutputDocument(
                token_id=output[0][-1],
                generated_text=tokenizer.decode(
                    output[0][input_len:], skip_special_tokens=True
                ),
            )
            # feed the extended sequence back in for the next step
            input = {
                'input_ids': output,
                'attention_mask': torch.ones(1, len(output[0])),
            }
- Serve and use:
# Server
from jina import Deployment
with Deployment(uses=TokenStreamingExecutor, port=12345, protocol='grpc') as dep:
    dep.block()
# Client (run in a separate process while the server is up)
import asyncio
from jina import Client
async def main():
    client = Client(port=12345, protocol='grpc', asyncio=True)
    async for doc in client.stream_doc(
        on='/stream',
        inputs=PromptDocument(prompt='what is the capital of France ?', max_tokens=10),
        return_type=ModelOutputDocument,
    ):
        print(doc.generated_text)
asyncio.run(main())
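The streaming pattern itself does not depend on a model or on Jina. As a minimal sketch, an async generator can yield the text produced so far after every token, and the consumer can act on each partial result as it arrives. Here the "model" is a hard-coded token list, and `stream_tokens` and `collect` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, List


async def stream_tokens(prompt: str, max_tokens: int) -> AsyncIterator[str]:
    """Yield the text generated so far after each new (fake) token."""
    fake_tokens = ["The", " capital", " is", " Paris", "."]  # stand-in for model output
    generated = ""
    for token in fake_tokens[:max_tokens]:
        await asyncio.sleep(0)  # hand control back to the event loop, as a real model call would
        generated += token
        yield generated


async def collect(prompt: str, max_tokens: int) -> List[str]:
    """Consume the stream; a UI would render each partial result instead of storing it."""
    return [partial async for partial in stream_tokens(prompt, max_tokens)]


partials = asyncio.run(collect("what is the capital of France ?", 3))
print(partials)
```

Because each partial result is delivered as soon as it exists, the caller sees the first token without waiting for the full generation to finish.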
Support
Jina-serve is backed by Jina AI and licensed under Apache-2.0.