langchain-serve

⚡ Langchain apps in production using Jina & FastAPI

1,632

139

Top Related Projects

semantic-kernel

25,112

Integrate cutting-edge LLM technology quickly and easily into your apps

langchain

112,752

🦜🔗 Build context-aware reasoning applications

llama_index

42,647

LlamaIndex is the leading framework for building LLM-powered agents over your data.

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

deeplearning4j

14,064

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...

Quick Overview

Langchain-serve is a project for deploying LangChain apps as REST/WebSocket APIs, either on Jina AI Cloud or on your own infrastructure. It wraps an app with Jina and FastAPI so that developers can focus on building LLM-powered applications instead of on deployment and scaling. According to its README, the repository is no longer maintained.

Pros

Simplifies the deployment process for LangChain applications
Supports cloud deployment, making it easy to scale applications
Integrates well with the LangChain ecosystem
Provides the lc-serve CLI for deploying, listing, pausing, resuming, and removing apps

Cons

Limited documentation, which may make it challenging for new users
Potential learning curve for those unfamiliar with cloud deployment concepts
May have limitations in customization options compared to manual deployment
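Managed deployment targets Jina AI Cloud specifically; self-hosting requires exporting Kubernetes or Docker Compose YAML
The repository is no longer maintained, according to its README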

Code Examples

# Example 1: Creating a simple LangChain app
from langchain import PromptTemplate, LLMChain
from langchain.llms import OpenAI

template = "What is a good name for a company that makes {product}?"
prompt = PromptTemplate(template=template, input_variables=["product"])
llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0.9))

# Example 2: Exposing the chain as an API endpoint with langchain-serve
from lcserve import serving

@serving
def generate_company_name(product: str, **kwargs) -> str:
    return llm_chain.run(product)

# Example 3: Calling the endpoint once the app is running locally on port 8080
import requests

response = requests.post(
    "http://localhost:8080/generate_company_name",
    json={"product": "eco-friendly water bottles", "envs": {}},
)
print(response.json())

Getting Started

To get started with langchain-serve:

Install the library:
```
pip install langchain-serve
```
Set up your LangChain application as shown in the code examples above.
Use the @serving decorator (imported from lcserve) to mark functions that should be exposed as endpoints.
Deploy the app locally, where your_script is the name of the Python module without the .py extension:
```
lc-serve deploy local your_script
```
Call the endpoints over HTTP, for example with curl or requests, from other parts of your code or from external applications.

Competitor Comparisons

langserve

2,141

LangServe 🦜️🏓

Pros of langserve

Native integration with LangChain, ensuring seamless compatibility
Simpler setup and configuration for LangChain-based applications
Built-in support for LangChain's templating and chaining features

Cons of langserve

Less flexible for non-LangChain projects or custom implementations
More limited deployment options compared to langchain-serve
Potentially less scalable for large-scale, distributed applications

Code Comparison

langserve:

from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langserve import add_routes

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
model = ChatOpenAI()
chain = prompt | model

app = FastAPI()
add_routes(app, chain, path="/joke")  # serve with: uvicorn <module>:app

langchain-serve:

from lcserve import serving
from langchain.llms import OpenAI

@serving
def joke(topic: str, **kwargs) -> str:
    # The function name becomes the endpoint path (/joke)
    return OpenAI()(f"Tell me a joke about {topic}")

Both repositories aim to simplify the deployment of LangChain applications, but they differ in their approach and integration level. langserve offers tighter integration with LangChain, while langchain-serve provides more flexibility and scalability options.

semantic-kernel

25,112

Integrate cutting-edge LLM technology quickly and easily into your apps

Pros of Semantic Kernel

More comprehensive framework for building AI applications
Better integration with Azure AI services
Stronger focus on enterprise-grade development and scalability

Cons of Semantic Kernel

Steeper learning curve due to more complex architecture
Less flexibility for using multiple LLM providers
Primarily designed for .NET, limiting language options

Code Comparison

Semantic Kernel (C#):

var kernel = Kernel.Builder.Build();
var function = kernel.CreateSemanticFunction("Generate a story about {{$input}}");
var result = await kernel.RunAsync("a brave knight", function);

LangChain-Serve (Python):

from lcserve import serving
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(template="Generate a story about {input}", input_variables=["input"])

@serving
def story(subject: str, **kwargs) -> str:
    return OpenAI()(prompt.format(input=subject))

Summary

Semantic Kernel offers a more comprehensive framework with better Azure integration, ideal for enterprise-grade AI applications. However, it has a steeper learning curve and is primarily designed for .NET. LangChain-Serve provides more flexibility in LLM providers and language options but may lack some of the advanced features and integrations offered by Semantic Kernel.

langchain

112,752

🦜🔗 Build context-aware reasoning applications

Pros of LangChain

More comprehensive and feature-rich, offering a wide range of tools and integrations for building language model applications
Larger and more active community, resulting in frequent updates and extensive documentation
Supports multiple programming languages, including Python and JavaScript

Cons of LangChain

Steeper learning curve due to its extensive feature set
Can be overkill for simpler projects that don't require all the available functionalities
May have higher resource requirements for deployment and execution

Code Comparison

LangChain:

from langchain import OpenAI, LLMChain, PromptTemplate

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(input_variables=["product"], template="What is a good name for a company that makes {product}?")
chain = LLMChain(llm=llm, prompt=prompt)

Langchain-serve:

from lcserve import serving
from langchain.llms import OpenAI

llm = OpenAI(temperature=0.9)

@serving
def generate(prompt: str, **kwargs) -> str:
    # Exposed as the /generate endpoint
    return llm(prompt)

The code comparison shows that LangChain offers more flexibility in chain creation, while Langchain-serve provides a simpler API for serving language models. LangChain's approach allows for more complex prompt engineering and chain composition, whereas Langchain-serve focuses on quick deployment of language models as API endpoints.

llama_index

42,647

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Pros of llama_index

More focused on indexing and querying your own data for use with large language models
Provides a wider range of data connectors and index types
Offers more advanced retrieval methods, including hybrid search

Cons of llama_index

Less emphasis on serving and deploying models
May require more setup and configuration for complex use cases
Limited built-in support for distributed computing

Code Comparison

langchain-serve:

from lcserve import serving

@serving
def my_function():
    return "Hello, World!"

llama_index:

from llama_index import GPTSimpleVectorIndex, Document

documents = [Document('content')]
index = GPTSimpleVectorIndex.from_documents(documents)
response = index.query("query")

The code snippets highlight the different focus areas of each project. langchain-serve emphasizes easy deployment of functions as services, while llama_index focuses on building indexes over your data and querying them with large language models.

haystack

21,304

Pros of Haystack

More comprehensive NLP framework with a wider range of components and pipelines
Better suited for production-ready applications with scalability features
Extensive documentation and community support

Cons of Haystack

Steeper learning curve due to its broader scope and complexity
May be overkill for simpler projects or prototypes

Code Comparison

Haystack:

from haystack import Pipeline
from haystack.nodes import TextConverter, Preprocessor, FARMReader

pipeline = Pipeline()
pipeline.add_node(component=TextConverter(), name="TextConverter", inputs=["File"])
pipeline.add_node(component=Preprocessor(), name="Preprocessor", inputs=["TextConverter"])
pipeline.add_node(component=FARMReader(model_name_or_path="deepset/roberta-base-squad2"), name="Reader", inputs=["Preprocessor"])

Langchain-serve:

from lcserve import serving
from langchain import OpenAI, PromptTemplate, LLMChain

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(input_variables=["product"], template="What is a good name for a company that makes {product}?")
chain = LLMChain(llm=llm, prompt=prompt)

@serving
def company_name(product: str, **kwargs) -> str:
    return chain.run(product)

Both repositories offer unique approaches to NLP tasks. Haystack provides a more comprehensive framework for building end-to-end NLP pipelines, while Langchain-serve focuses on simplifying the deployment of language models and chains. The choice between the two depends on the specific requirements of your project and the level of complexity you're comfortable with.

deeplearning4j

14,064

Pros of deeplearning4j

Comprehensive deep learning library for Java and the JVM
Supports a wide range of neural network architectures and algorithms
Integrates well with existing Java ecosystems and enterprise environments

Cons of deeplearning4j

Steeper learning curve compared to Python-based alternatives
Less frequent updates and smaller community compared to more popular deep learning frameworks
Potentially slower development cycle for rapid prototyping

Code Comparison

deeplearning4j:

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .list()
    .layer(0, new DenseLayer.Builder().nIn(784).nOut(250).build())
    .layer(1, new OutputLayer.Builder().nIn(250).nOut(10).build())
    .build();

langchain-serve:

from lcserve import serving
from langchain import OpenAI, LLMChain
from langchain.prompts import PromptTemplate

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(input_variables=["product"], template="What is a good name for a company that makes {product}?")
chain = LLMChain(llm=llm, prompt=prompt)

@serving
def company_name(product: str, **kwargs) -> str:
    return chain.run(product)

While deeplearning4j focuses on traditional deep learning tasks, langchain-serve is designed for serving and deploying language models and chains. deeplearning4j offers more flexibility for custom neural network architectures, while langchain-serve provides a higher-level abstraction for working with language models and prompts.

README

:warning: IMPORTANT NOTICE: This repository is no longer maintained.

⚡ LangChain Apps on Production with Jina & FastAPI

Jina is an open-source framework for building scalable multimodal AI apps in production. LangChain is another open-source framework for building applications powered by LLMs.

langchain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own infrastructure to ensure data privacy. With langchain-serve, you can craft REST/Websocket APIs, spin up LLM-powered conversational Slack bots, or wrap your LangChain apps into FastAPI packages on cloud or on-premises.

Give us a :star: and tell us what more you'd like to see!

LLM Apps as-a-service

langchain-serve currently wraps the following apps as a service that can be deployed on Jina AI Cloud with one command.

🔮 AutoGPT-as-a-service

AutoGPT is an "AI agent" that, given a goal in natural language, attempts to achieve it by breaking it into sub-tasks and using the internet and other tools in an automatic loop.

Deploy autogpt on Jina AI Cloud with one command

lc-serve deploy autogpt

Command output:

╭──────────────┬───────────────────────────────────────────────────────╮
│ App ID       │ autogpt-6cbd489454                                    │
├──────────────┼───────────────────────────────────────────────────────┤
│ Phase        │ Serving                                               │
├──────────────┼───────────────────────────────────────────────────────┤
│ Endpoint     │ wss://autogpt-6cbd489454.wolf.jina.ai                 │
├──────────────┼───────────────────────────────────────────────────────┤
│ App logs     │ dashboards.wolf.jina.ai                               │
├──────────────┼───────────────────────────────────────────────────────┤
│ Swagger UI   │ https://autogpt-6cbd489454.wolf.jina.ai/docs          │
├──────────────┼───────────────────────────────────────────────────────┤
│ OpenAPI JSON │ https://autogpt-6cbd489454.wolf.jina.ai/openapi.json  │
╰──────────────┴───────────────────────────────────────────────────────╯

Integrate autogpt with external services using APIs. Get a flavor of the integration on your CLI with
```
lc-serve playground autogpt
```

🧠 Babyagi-as-a-service

Babyagi is a task-driven autonomous agent that uses LLMs to create, prioritize, and execute tasks. It is a general-purpose AI agent that can be used to automate a wide variety of tasks.

Deploy babyagi on Jina AI Cloud with one command
```
lc-serve deploy babyagi
```
Integrate babyagi with external services using our Websocket API. Get a flavor of the integration on your CLI with
```
lc-serve playground babyagi
```

:panda_face: pandas-ai-as-a-service

pandas-ai integrates LLM capabilities into Pandas, to make dataframes conversational in Python code. Thanks to langchain-serve, we can now expose pandas-ai APIs on Jina AI Cloud in just a matter of seconds.

Deploy pandas-ai on Jina AI Cloud

lc-serve deploy pandas-ai

Command output:

╭──────────────┬───────────────────────────────────────────────────────╮
│ App ID       │ pandasai-06879349ca                                   │
├──────────────┼───────────────────────────────────────────────────────┤
│ Phase        │ Serving                                               │
├──────────────┼───────────────────────────────────────────────────────┤
│ Endpoint     │ wss://pandasai-06879349ca.wolf.jina.ai                │
├──────────────┼───────────────────────────────────────────────────────┤
│ App logs     │ dashboards.wolf.jina.ai                               │
├──────────────┼───────────────────────────────────────────────────────┤
│ Swagger UI   │ https://pandasai-06879349ca.wolf.jina.ai/docs         │
├──────────────┼───────────────────────────────────────────────────────┤
│ OpenAPI JSON │ https://pandasai-06879349ca.wolf.jina.ai/openapi.json │
╰──────────────┴───────────────────────────────────────────────────────╯

Upload your DataFrame to Jina AI Cloud (Optional - you can also use a publicly available CSV)
- Define your DataFrame in a Python file
```
# dataframe.py
import pandas as pd
df = pd.DataFrame(some_data)
```
- Upload your DataFrame to Jina AI Cloud using <module>:<variable> syntax
```
lc-serve util upload-df dataframe:df
```
Conversationalize your DataFrame using pandas-ai APIs. Get a flavor of the integration with a local playground on your CLI with
```
lc-serve playground pandas-ai <host>
```
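The `<module>:<variable>` syntax passed to `upload-df` above names a Python module and a variable inside it. As a rough illustration only (this is not lc-serve's actual implementation, and `resolve_spec` is a hypothetical helper), such a string can be resolved with the standard library like this:

```python
import importlib
from typing import Any


def resolve_spec(spec: str) -> Any:
    """Resolve a '<module>:<variable>' string to the object it names."""
    module_name, sep, variable_name = spec.partition(":")
    if not sep or not module_name or not variable_name:
        raise ValueError(f"expected '<module>:<variable>', got {spec!r}")
    module = importlib.import_module(module_name)  # e.g. imports dataframe.py
    return getattr(module, variable_name)          # e.g. returns the df object


if __name__ == "__main__":
    # 'math:pi' stands in for 'dataframe:df' so the sketch runs anywhere.
    print(resolve_spec("math:pi"))
```

For `dataframe:df` to resolve this way, `dataframe.py` would need to be importable from the directory where the command is run.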

💬 Question Answer Bot on PDFs

pdf-qna is a simple question answering bot that uses LLMs to answer questions on PDF documents, showcasing how easy it is to integrate LangChain apps on Jina AI Cloud.

Deploy pdf-qna on Jina AI Cloud with one command
```
lc-serve deploy pdf-qna
```
Get a flavor of the integration with Streamlit playground on your CLI with
```
lc-serve playground pdf-qna
```
Expand the Q&A bot to multiple languages, different document types & integrate with external services using simple REST APIs.

https://github.com/jina-ai/langchain-serve/blob/8f7a9272e99490a5357655becfc5da3569655f38/lcserve/apps/pdf_qna/app.py#L8-L12

💪 Features

LLM Apps on production

- Define your API using @serving decorator
- Build, deploy & distribute Slack bots using @slackbot decorator
- Bring your own FastAPI app

🔥 Secure, Scalable, Serverless, Streaming REST/Websocket APIs on Jina AI Cloud.

- Globally available REST/Websocket APIs with automatic TLS certs.
- Stream LLM interactions in real-time with Websockets.
- Enable human in the loop for your agents.
- Build, deploy & distribute Slack bots built with langchain.
- Protect your APIs with API authorization using Bearer tokens.
- Swagger UI, and OpenAPI spec included with your APIs.
- Serverless, autoscaling apps that scale automatically with your traffic.
- Secure handling of secrets and environment variables.
- Persistent storage (EFS) mounted on your app for your data.
- Trigger one-time jobs to run asynchronously, allowing for non-blocking execution.
- Builtin logging, monitoring, and traces for your APIs.
- No need to change your code to manage APIs, or manage dockerfiles, or worry about infrastructure!

Self-host LLM Apps with Docker Compose or Kubernetes

- Export your apps as Kubernetes or Docker Compose YAMLs with a single command: lc-serve export app --kind <kubernetes/docker-compose> --path .
- Deploy your app on your own internal infrastructure with your own security policies.
- Talk to us if you need all the features of Jina AI Cloud on your own infrastructure.

🧰 Usage

Let's first install langchain-serve using pip.

pip install langchain-serve

REST APIs using `@serving` decorator

Let's go through a step-by-step guide to build, deploy and use a REST API using @serving decorator.

🤖💬 Build, Deploy & Distribute Slack bots built with LangChain

langchain-serve exposes a @slackbot decorator to quickly build, deploy & distribute LLM-powered Slack bots without worrying about the infrastructure. It provides a simple interface to any LangChain app and makes it accessible to users on a platform they're already comfortable with.

✨ Ready to dive in?

There's a step-by-step guide in the repository to help you build your own bot for helping with reasoning.
Here's another step-by-step guide to help you chat over your own internal HR-related documents (like onboarding, policies etc.) with your employees right inside your Slack workspace.

Authorize your APIs

To add an extra layer of security, we can integrate any custom API authorization by adding an auth argument to the @serving decorator.

from typing import Any

from lcserve import serving

def authorizer(token: str) -> Any:
    if not token == 'mysecrettoken':            # Change this to add your own authorization logic
        raise Exception('Unauthorized')         # Raise an exception if the request is not authorized

    return 'userid'                             # Return any user id or object

@serving(auth=authorizer)
def ask(question: str, **kwargs) -> str:
    auth_response = kwargs['auth_response']     # This will be 'userid'
    return ...

@serving(websocket=True, auth=authorizer)
async def talk(question: str, **kwargs) -> str:
    auth_response = kwargs['auth_response']     # This will be 'userid'
    return ...

🤔 Gotchas about the `auth` function

- Should accept only one argument, token.
- Should raise an Exception if the request is not authorized.
- Can return any object, which will be passed to the functions as auth_response under kwargs.
- Expects a Bearer token in the Authorization header of the request.
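To make the contract above concrete, here is a minimal sketch of an `auth` function that follows these rules. The `VALID_TOKENS` mapping and the user id are hypothetical placeholders, and the constant-time comparison is a suggestion rather than something langchain-serve requires:

```python
import hmac
from typing import Any

# Hypothetical token store; in practice load this from a secret manager or env vars.
VALID_TOKENS = {"mysecrettoken": "user-123"}


def authorizer(token: str) -> Any:
    """Return the user id for a valid token, raise otherwise."""
    for known_token, user_id in VALID_TOKENS.items():
        # compare_digest avoids leaking token contents through timing differences
        if hmac.compare_digest(token.encode(), known_token.encode()):
            return user_id  # passed to the endpoint as kwargs['auth_response']
    raise Exception("Unauthorized")
```

The function takes a single `token` argument, raises on failure, and returns an object on success, which matches the three requirements listed above.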

Sample HTTP request with curl:

curl -X 'POST' 'http://localhost:8080/ask' -H 'Authorization: Bearer mysecrettoken' -d '{ "question": "...", "envs": {} }'

Sample WebSocket request with wscat:

wscat -H "Authorization: Bearer mysecrettoken" -c ws://localhost:8080/talk
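The same HTTP request can be built from Python. The sketch below uses only the standard library and constructs the request without sending it; the `build_ask_request` helper is hypothetical, and the explicit `Content-Type` header is an assumption that the curl example does not show:

```python
import json
import urllib.request


def build_ask_request(base_url: str, token: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"question": question, "envs": {}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption, not in the curl example
        },
    )


if __name__ == "__main__":
    request = build_ask_request("http://localhost:8080", "mysecrettoken", "...")
    # Sending requires the app to be running locally:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode())
    print(request.get_method(), request.full_url)
```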

Enable streaming & human-in-the-loop (HITL) with WebSockets

HITL for LangChain agents on production can be challenging since the agents are typically running on servers where humans don't have direct access. langchain-serve bridges this gap by enabling websocket APIs that allow for real-time interaction and feedback between the agent and a human operator.

Check out this example to see how you can enable HITL for your agents.

Persistent storage on Jina AI Cloud

Every app deployed on Jina AI Cloud gets persistent storage (EFS) mounted locally, which can be accessed via the workspace kwarg in the @serving function.

import aiofiles
from fastapi import WebSocket
from lcserve import serving

@serving
def store(text: str, **kwargs):
    workspace: str = kwargs.get('workspace')
    path = f'{workspace}/store.txt'
    print(f'Writing to {path}')
    with open(path, 'a') as f:
        f.write(text + '\n')
    return 'OK'


@serving(websocket=True)
async def stream(**kwargs):
    workspace: str = kwargs.get('workspace')
    websocket: WebSocket = kwargs.get('websocket')
    path = f'{workspace}/store.txt'
    print(f'Streaming {path}')
    async with aiofiles.open(path, 'r') as f:
        async for line in f:
            await websocket.send_text(line)
    return 'OK'

Here, we are using the workspace to store the incoming text in a file via the REST endpoint and streaming the contents of the file via the WebSocket endpoint.
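The file logic can be tried locally without deploying anything. In this sketch a temporary directory stands in for the EFS-backed workspace, and `store_text` and `read_lines` are hypothetical helpers that mirror the bodies of `store` and `stream` without the decorator or the WebSocket:

```python
import os
import tempfile
from typing import List


def store_text(workspace: str, text: str) -> str:
    """Append one line to store.txt inside the workspace, as `store` does."""
    path = os.path.join(workspace, "store.txt")
    with open(path, "a") as f:
        f.write(text + "\n")
    return path


def read_lines(workspace: str) -> List[str]:
    """Return the stored lines, mirroring what `stream` sends one by one."""
    with open(os.path.join(workspace, "store.txt")) as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    # A temporary directory stands in for the persistent workspace.
    with tempfile.TemporaryDirectory() as workspace:
        store_text(workspace, "hello")
        store_text(workspace, "world")
        print(read_lines(workspace))
```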

Bring your own FastAPI app

If you already have a FastAPI app with pre-defined endpoints, you can use lc-serve to deploy it on Jina AI Cloud.

lc-serve deploy jcloud --app filename:app

Let's take an example of a simple FastAPI app with directory structure

.
└── endpoints.py

# endpoints.py
from typing import Union

from fastapi import FastAPI

app = FastAPI()


@app.get("/status")
def read_root():
    return {"Hello": "World"}


@app.get("/items/{item_id}")
def read_item(item_id: int, q: Union[str, None] = None):
    return {"item_id": item_id, "q": q}

lc-serve deploy jcloud --app endpoints:app

Using Secrets during Deployment

You can use secrets during app deployment by passing a secrets file with the --secrets flag. The secrets file should be a .env file containing the secrets.

lc-serve deploy jcloud app --secrets .env

Let's take an example of a simple app that uses OPENAI_API_KEY stored as a secret.

This app directory contains the following files:

.
├── main.py             # The app
├── jcloud.yml          # JCloud deployment config file
├── README.md           # This README file
├── requirements.txt    # The requirements file for the app
└── secrets.env         # The secrets file containing the OpenAI API key

Note that secrets.env in this directory is a dummy file. Replace it with your own secrets, such as:

OPENAI_API_KEY=sk-xxx

main.py will look like:

# main.py
from lcserve import serving
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI

prompt = PromptTemplate(
    input_variables=["subject"],
    template="Write me a short poem about {subject}?",
)


@serving(openai_tracing=True)
def poem(subject: str, **kwargs):
    tracing_handler = kwargs.get("tracing_handler")

    chat = ChatOpenAI(temperature=0.5, callbacks=[tracing_handler])
    chain = LLMChain(llm=chat, prompt=prompt, callbacks=[tracing_handler])

    return chain.run(subject)

In the above example, the app will use OPENAI_API_KEY provided by the secrets to interact with OpenAI.

Then you can deploy using the following command and interact with the deployed endpoint.

lc-serve deploy jcloud main --secrets secrets.env
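Before deploying, it can help to check that the secrets file actually defines the keys the app expects. The sketch below is a hypothetical pre-flight check, not part of lc-serve; it assumes simple `KEY=VALUE` lines, whereas full `.env` parsers handle more syntax:

```python
from typing import Dict, Iterable, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env = parse_env_text("# dummy secrets\nOPENAI_API_KEY=sk-xxx\n")
    print(missing_keys(env, ["OPENAI_API_KEY"]))
```

An empty list from `missing_keys` means every required key is present in the file.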

⏱️ Trigger one-time jobs to run asynchronously

Here's a step-by-step guide to trigger one-time jobs to run asynchronously using @job decorator.

💻 `lc-serve` CLI

lc-serve is a simple CLI that helps you to deploy your agents on Jina AI Cloud (JCloud).

| Description | Command |
| --- | --- |
| Deploy your app locally | `lc-serve deploy local app` |
| Export your app as Kubernetes YAML | `lc-serve export app --kind kubernetes --path .` |
| Export your app as Docker Compose YAML | `lc-serve export app --kind docker-compose --path .` |
| Deploy your app on JCloud | `lc-serve deploy jcloud app` |
| Deploy FastAPI app on JCloud | `lc-serve deploy jcloud --app <app-name>:<app-object>` |
| Update existing app on JCloud | `lc-serve deploy jcloud app --app-id <app-id>` |
| Get app status on JCloud | `lc-serve status <app-id>` |
| List all apps on JCloud | `lc-serve list` |
| Remove app on JCloud | `lc-serve remove <app-id>` |
| Pause app on JCloud | `lc-serve pause <app-id>` |
| Resume app on JCloud | `lc-serve resume <app-id>` |

💡 JCloud Deployment

Configurations

For JCloud deployment, you can configure your application infrastructure by providing a YAML configuration file using the --config option. The supported configurations are:

- Instance type (instance), as defined by Jina AI Cloud.
- Minimum number of replicas for your application (autoscale_min). Setting it to 0 enables serverless.
- Disk size (disk_size), in GB. The default value is 1 GB.

For example:

instance: C4
autoscale_min: 0
disk_size: 1.5G

You can alternatively include a jcloud.yaml file in your application directory with the desired configurations. However, please note that if the --config option is explicitly used in the command line interface, the local jcloud.yaml file will be disregarded. The command line provided configuration file will take precedence.

If you don't provide a configuration file or a specific configuration isn't specified, the following default settings will be applied:

instance: C3
autoscale_min: 1
disk_size: 1G
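The precedence rules described above can be summarized in a few lines. This is a sketch of the documented behaviour rather than lc-serve's own code; `effective_config` is a hypothetical helper, and the configurations are assumed to be already parsed from YAML into dictionaries:

```python
from typing import Any, Dict, Optional

# Defaults as documented above.
DEFAULTS: Dict[str, Any] = {"instance": "C3", "autoscale_min": 1, "disk_size": "1G"}


def effective_config(
    cli_config: Optional[Dict[str, Any]] = None,
    local_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Use the --config file if given, else the local jcloud.yaml, then fill in defaults."""
    chosen = cli_config if cli_config is not None else (local_config or {})
    return {**DEFAULTS, **chosen}


if __name__ == "__main__":
    # --config wins, so the local autoscale_min setting is ignored here.
    print(effective_config({"instance": "C4"}, {"autoscale_min": 0}))
```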

💰 Pricing

Applications hosted on JCloud are priced in two categories:

Base credits

Base credits are charged to ensure high availability for your application by maintaining at least one instance running continuously, ready to handle incoming requests. If you wish to stop the serving application, you can either remove the app completely or pause it; pausing lets you resume serving later based on the persisted configuration (refer to the lc-serve CLI section for more information). Both options will halt the consumption of credits.
Actual credits charged for base credits are calculated based on the instance type as defined by Jina AI Cloud.
By default, instance type C3 is used with a minimum of 1 instance and Amazon EFS disk of size 1G, which means that if your application is served on JCloud, you will be charged ~10 credits per hour.
You can change the instance type and the minimum number of instances by providing a YAML configuration file using the --config option. For example, if you want to use instance type C4 with a minimum of 0 replicas, and 2G EFS disk, you can provide the following configuration file:
```
instance: C4
autoscale_min: 0
disk_size: 2G
```

Serving credits

Serving credits are charged when your application is actively serving incoming requests.
Actual credits charged for serving credits are calculated based on the credits for the instance type multiplied by the duration for which your application serves requests.
You are charged for each second your application is serving requests.

Total credits charged = Base credits + Serving credits. (Jina AI Cloud defines each credit as €0.005)

Examples

Example 1

Consider an HTTP application that has served requests for 10 minutes in the last hour and uses a custom config:

instance: C4
autoscale_min: 0
disk_size: 2G

Total credits per hour charged would be 3.538. The calculation is as follows:

C4 instance has an hourly credit rate of 20.
EFS has hourly credit rate of 0.104 per GB.
Base credits = 0 + 2 * 0.104 = 0.208 (since `autoscale_min` is 0)
Serving credits = 20 * 10/60 = 3.33
Total credits per hour = 0.208 + 3.33 = 3.538

Example 2

Consider a WebSocket application that had active connections for 20 minutes in the last hour and uses the default configuration.

instance: C3
autoscale_min: 1
disk_size: 1G

Total credits per hour charged would be 13.434. The calculation is as follows:

C3 instance has an hourly credit rate of 10.
EFS has hourly credit rate of 0.104 per GB.
Base credits = 10 + 1 * 0.104 = 10.104 (since `autoscale_min` is 1)
Serving credits = 10 * 20/60 = 3.33
Total credits per hour = 10.104 + 3.33 = 13.434
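The two calculations above follow the same formula, which can be sketched as a small function. The rates are the ones quoted in the examples; treating the base charge as `autoscale_min` multiplied by the instance rate is an assumption that generalizes the two cases shown (0 and 1 replicas):

```python
# Hourly credit rates quoted in the examples above.
INSTANCE_RATES = {"C3": 10.0, "C4": 20.0}
EFS_RATE_PER_GB = 0.104
EUR_PER_CREDIT = 0.005


def hourly_credits(instance: str, autoscale_min: int, disk_gb: float, serving_minutes: float) -> float:
    """Total credits for one hour: base (warm replicas + disk) plus serving time."""
    rate = INSTANCE_RATES[instance]
    base = autoscale_min * rate + disk_gb * EFS_RATE_PER_GB  # assumption: scales with replicas
    serving = rate * serving_minutes / 60
    return base + serving


if __name__ == "__main__":
    for name, args in [("Example 1", ("C4", 0, 2, 10)), ("Example 2", ("C3", 1, 1, 20))]:
        credits = hourly_credits(*args)
        print(f"{name}: {credits:.3f} credits, about EUR {credits * EUR_PER_CREDIT:.4f}")
```

The results come out a few thousandths higher than the totals in the worked calculations, because those round the serving credits to 3.33 before adding.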

Frequently Asked Questions

lc-serve command not found
My client that connects to the JCloud hosted App gets timed-out, what should I do?
How to pass environment variables to the app?
JCloud deployment failed at pushing image to Jina Hubble, what should I do?
Debug babyagi playground request/response for external integration

`lc-serve` command not found

lc-serve command is registered during langchain-serve installation. If you get command not found: lc-serve error, please replace lc-serve command with python -m lcserve & retry.
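When lc-serve is called from scripts, the fallback can be automated. The sketch below is a hypothetical helper that prefers the `lc-serve` executable when it is on the PATH and otherwise builds the `python -m lcserve` form mentioned above:

```python
import shutil
import sys
from typing import List


def lc_serve_command(*args: str) -> List[str]:
    """Return an argv list that uses lc-serve if available, else python -m lcserve."""
    if shutil.which("lc-serve"):
        return ["lc-serve", *args]
    return [sys.executable, "-m", "lcserve", *args]


if __name__ == "__main__":
    # Pass the result to subprocess.run(...) to actually execute the command.
    print(lc_serve_command("list"))
```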

My client that connects to the JCloud hosted App gets timed-out, what should I do?

If you make long HTTP/WebSocket requests, the default timeout value (2 minutes) might not be suitable for your use case. You can provide a custom timeout value during JCloud deployment by using the --timeout argument.

Additionally, for HTTP, you may also experience timeouts due to limitations in the open-source software used by langchain-serve. While we are working to permanently address this issue, we recommend using HTTP/1.1 in your client as a temporary workaround.

For WebSocket, please note that the connection will be closed if idle for more than 5 minutes.

How to pass environment variables to the app?

We provide two options to pass environment variables:

- Use --env during app deployment to load env variables from a .env file. For example, lc-serve deploy jcloud app --env some.env will load all env variables from the some.env file and pass them to the app. These env variables will be available in the app as os.environ['ENV_VAR_NAME'].
- You can also pass env variables while sending requests to the app, both in HTTP and WebSocket. The envs field in the request body is used to pass env variables. For example:
```
{
    "question": "What is the meaning of life?",
    "envs": {
        "ENV_VAR_NAME": "ENV_VAR_VALUE"
    }
}
```
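A request body of this shape can be assembled in Python as follows. The `build_body` helper is hypothetical, and rejecting non-string values is an assumption based on how `os.environ` behaves rather than a documented rule of the API:

```python
import json
from typing import Dict, Optional


def build_body(question: str, envs: Optional[Dict[str, str]] = None) -> str:
    """Serialize a request body with per-request env variables in the `envs` field."""
    envs = envs or {}
    for name, value in envs.items():
        if not isinstance(value, str):
            # Assumption: env values should be strings, as in os.environ.
            raise TypeError(f"env variable {name} must be a string, got {type(value).__name__}")
    return json.dumps({"question": question, "envs": envs})


if __name__ == "__main__":
    print(build_body("What is the meaning of life?", {"ENV_VAR_NAME": "ENV_VAR_VALUE"}))
```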

JCloud deployment failed at pushing image to Jina Hubble, what should I do?

Please use --verbose and retry to get more information. If you are operating on a computer with an arm64 architecture, please retry with --platform linux/amd64 so the image can be built correctly.
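A deployment script can detect this case up front. The sketch below is a hypothetical helper that returns the extra flag on arm64 hosts; it treats `aarch64` (the name Linux reports for the same architecture) as equivalent:

```python
import platform
from typing import List, Optional


def platform_flags(machine: Optional[str] = None) -> List[str]:
    """Return ['--platform', 'linux/amd64'] on arm64 hosts, else an empty list."""
    machine = (machine or platform.machine()).lower()
    if machine in {"arm64", "aarch64"}:
        return ["--platform", "linux/amd64"]
    return []


if __name__ == "__main__":
    # Append the result to the arguments of the deploy command.
    print(platform_flags())
```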

Debug babyagi playground request/response for external integration

1. Start textual console in a terminal (exclude following groups to reduce the noise in logging)

```bash
textual console -x EVENT -x SYSTEM -x DEBUG
```

2. Start the playground with --verbose flag. Start interacting and see the logs in the console.

```bash
lc-serve playground babyagi --verbose
```

📣 Reach out to us

Want to deploy your LLM apps on your own infrastructure with all capabilities of Jina AI Cloud?

Serverless
Autoscaling
TLS certs
Persistent storage
End to end LLM observability
and more on auto-pilot!

Join us on Discord and we'd be happy to hear more about your use case.
