Top Related Projects
🦜🔗 Build context-aware reasoning applications
Integrate cutting-edge LLM technology quickly and easily into your apps
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
LlamaIndex is a data framework for your LLM applications
Examples and guides for using the OpenAI API
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
Quick Overview
LangServe is a library for deploying LangChain runnables and chains as REST APIs. It provides a simple way to serve LangChain applications, making it easier to integrate language models and chains into various applications and services.
Pros
- Easy deployment of LangChain runnables and chains as REST APIs
- Automatic generation of OpenAPI specs and API documentation
- Supports streaming responses for real-time applications
- Integrates well with the LangChain ecosystem
Cons
- Limited to LangChain-based applications
- May require additional setup for complex deployments
- Documentation could be more comprehensive for advanced use cases
- Potential learning curve for users new to LangChain
Code Examples
- Creating a simple LangServe app:
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langserve import add_routes
app = FastAPI(
title="LangChain Server",
version="1.0",
description="A simple api server using Langchain's Runnable interfaces",
)
model = ChatOpenAI()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(app, prompt | model, path="/joke")
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
- Adding a chain to LangServe:
from langchain.chains import LLMChain
from langchain.llms import OpenAI
llm = OpenAI()
chain = LLMChain.from_string(llm, "Translate the following to {language}: {text}")
add_routes(app, chain, path="/translate")
- Using streaming responses:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
chat = ChatOpenAI(streaming=True)
prompt = ChatPromptTemplate.from_template("Write a story about {topic}")
chain = prompt | chat
add_routes(app, chain, path="/stream_story")
Getting Started
To get started with LangServe:
1. Install the library:
pip install langserve
2. Create a FastAPI app and add LangChain runnables:
from fastapi import FastAPI
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langserve import add_routes
app = FastAPI()
prompt = PromptTemplate.from_template("Tell me a fact about {topic}")
model = OpenAI()
chain = prompt | model
add_routes(app, chain, path="/fact")
3. Run the server:
uvicorn main:app --reload
Your LangServe API is now running and accessible at http://localhost:8000.
Competitor Comparisons
🦜🔗 Build context-aware reasoning applications
Pros of LangChain
- More comprehensive and feature-rich, offering a wide range of tools and integrations
- Extensive documentation and community support
- Flexible and customizable for various AI/ML applications
Cons of LangChain
- Steeper learning curve due to its extensive features
- Can be overwhelming for simple projects or beginners
- Potentially higher resource usage and complexity
Code Comparison
LangChain:
from langchain import OpenAI, LLMChain, PromptTemplate
llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
input_variables=["product"],
template="What is a good name for a company that makes {product}?",
)
chain = LLMChain(llm=llm, prompt=prompt)
LangServe:
from fastapi import FastAPI
from langchain.chat_models import ChatOpenAI
from langserve import add_routes
app = FastAPI()
add_routes(app, ChatOpenAI(), path="/chat")
Key Differences
LangChain is a comprehensive framework for building applications with large language models, offering a wide range of tools and integrations. It's highly flexible but can be complex for simple projects.
LangServe, on the other hand, is specifically designed for serving LangChain chains and agents as APIs. It's more focused and easier to use for deploying LangChain applications, but has a narrower scope compared to the full LangChain framework.
Choose LangChain for complex, customizable AI applications, and LangServe for quickly deploying LangChain models as APIs.
Integrate cutting-edge LLM technology quickly and easily into your apps
Pros of Semantic Kernel
- More comprehensive framework with built-in memory, planning, and reasoning capabilities
- Stronger integration with Azure services and Microsoft ecosystem
- Supports multiple programming languages (C#, Python, Java)
Cons of Semantic Kernel
- Steeper learning curve due to more complex architecture
- Less flexibility in choosing and integrating external LLM providers
- Smaller community and ecosystem compared to LangChain
Code Comparison
Semantic Kernel (C#):
var kernel = Kernel.Builder.Build();
var function = kernel.CreateSemanticFunction("Generate a story about {{$input}}");
var result = await kernel.RunAsync("a brave knight", function);
LangServe (Python):
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
prompt = PromptTemplate.from_template("Generate a story about {input}")
chain = prompt | OpenAI()
result = chain.invoke({"input": "a brave knight"})
Both frameworks aim to simplify working with language models, but LangServe focuses on serving LangChain chains as APIs, while Semantic Kernel provides a more comprehensive toolkit for building AI-powered applications. LangServe offers greater flexibility and ease of use, especially for those already familiar with LangChain, while Semantic Kernel provides deeper integration with Microsoft technologies and supports multiple programming languages.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Pros of Haystack
- More comprehensive framework for building end-to-end NLP applications
- Offers a wider range of pre-built components and pipelines
- Stronger focus on document retrieval and question-answering tasks
Cons of Haystack
- Steeper learning curve due to its broader scope
- Less flexibility for custom integrations compared to LangServe
- Potentially heavier resource requirements for simpler use cases
Code Comparison
Haystack example:
from haystack import Pipeline
from haystack.nodes import TextConverter, Preprocessor, BM25Retriever, FARMReader
pipeline = Pipeline()
pipeline.add_node(component=TextConverter(), name="TextConverter", inputs=["File"])
pipeline.add_node(component=Preprocessor(), name="Preprocessor", inputs=["TextConverter"])
pipeline.add_node(component=BM25Retriever(document_store), name="Retriever", inputs=["Preprocessor"])
pipeline.add_node(component=FARMReader(model_name_or_path="deepset/roberta-base-squad2"), name="Reader", inputs=["Retriever"])
LangServe example:
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langserve import add_routes
prompt = ChatPromptTemplate.from_template("Tell me a {adjective} joke about {topic}")
model = ChatOpenAI()
chain = prompt | model
app = FastAPI()
add_routes(app, chain, path="/joke")
LlamaIndex is a data framework for your LLM applications
Pros of LlamaIndex
- More focused on data indexing and retrieval, making it potentially more efficient for specific use cases
- Offers a wider range of indexing strategies and data structures
- Provides built-in support for various document types and data sources
Cons of LlamaIndex
- Less integrated with other language model tools and frameworks
- May have a steeper learning curve for users not familiar with indexing concepts
- Potentially less flexible for general-purpose language model applications
Code Comparison
LlamaIndex:
from llama_index import GPTSimpleVectorIndex, Document
documents = [Document('content1'), Document('content2')]
index = GPTSimpleVectorIndex.from_documents(documents)
response = index.query("What is the content about?")
LangServe:
from langchain.chains import LLMChain
from langchain.llms import OpenAI
llm = OpenAI()
chain = LLMChain.from_string(llm, "Summarize: {text}")
response = chain.run("content1 content2")
Both repositories offer valuable tools for working with language models, but they focus on different aspects. LlamaIndex excels in data indexing and retrieval, while LangServe focuses on exposing LangChain runnables and chains as REST APIs. The choice between them depends on the specific requirements of your project.
Examples and guides for using the OpenAI API
Pros of openai-cookbook
- Comprehensive collection of OpenAI API usage examples
- Covers a wide range of applications and use cases
- Regularly updated with new features and best practices
Cons of openai-cookbook
- Focused solely on OpenAI's offerings, limiting its scope
- May require more setup and configuration for each example
- Less emphasis on production-ready deployment
Code Comparison
openai-cookbook:
import openai
response = openai.Completion.create(
engine="text-davinci-002",
prompt="Translate the following English text to French: '{}'",
max_tokens=60
)
langserve:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
prompt = PromptTemplate(
input_variables=["text"],
template="Translate the following English text to French: {text}",
)
llm = OpenAI(temperature=0)
chain = prompt | llm
Key Differences
- langserve provides a more abstracted approach, allowing for easier integration of multiple AI services
- openai-cookbook offers direct API calls, giving more control over specific OpenAI parameters
- langserve focuses on serving AI applications, while openai-cookbook is primarily for learning and experimentation
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
Pros of PromptFlow
- Integrated with Azure AI services, offering seamless deployment and scaling
- Provides a visual flow editor for designing and managing prompt workflows
- Supports multiple programming languages (Python, JavaScript) for flexibility
Cons of PromptFlow
- More tightly coupled with Microsoft ecosystem, potentially limiting portability
- Steeper learning curve for users not familiar with Azure services
- Less community-driven development compared to LangServe
Code Comparison
PromptFlow example:
from promptflow import tool
@tool
def my_python_tool(input1: str, input2: int) -> str:
    return f"Processed {input1} with {input2}"
LangServe example:
from fastapi import FastAPI
from langserve import add_routes
app = FastAPI()
add_routes(app, my_chain, path="/my-chain")
Both repositories aim to simplify the deployment of language models and prompt engineering workflows. PromptFlow focuses on integration with Azure services and visual flow design, while LangServe emphasizes easy API creation for LangChain applications. PromptFlow offers broader language support, but LangServe provides a more lightweight and portable solution for LangChain users.
README
🦜️🏓 LangServe
[!WARNING] We recommend using LangGraph Platform rather than LangServe for new projects.
Please see the LangGraph Platform Migration Guide for more information.
We will continue to accept bug fixes for LangServe from the community; however, we will not be accepting new feature contributions.
Overview
LangServe helps developers
deploy LangChain
runnables and chains
as a REST API.
This library is integrated with FastAPI and uses pydantic for data validation.
In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
Features
- Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
- API docs page with JSONSchema and Swagger (insert example link)
- Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
- /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
- New as of 0.0.40: supports /stream_events to make it easier to stream without needing to parse the output of /stream_log
- Playground page at /playground/ with streaming output and intermediate steps
- Built-in (optional) tracing to LangSmith, just add your API key (see Instructions)
- All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
- Use the client SDK to call a LangServe server as if it was a Runnable running locally (or call the HTTP API directly)
- LangServe Hub
⚠️ LangGraph Compatibility
LangServe is designed to primarily deploy simple Runnables and work with well-known primitives in langchain-core.
If you need a deployment option for LangGraph, you should instead be looking at LangGraph Cloud (beta) which will be better suited for deploying LangGraph applications.
Limitations
- Client callbacks are not yet supported for events that originate on the server
- Versions of LangServe <= 0.2.0 will not generate OpenAPI docs properly when using Pydantic V2, as FastAPI does not support mixing pydantic v1 and v2 namespaces. See the Pydantic section below for more details. Either upgrade to LangServe>=0.3.0 or downgrade to Pydantic 1.
Security
- Vulnerability in Versions 0.0.13 - 0.0.15 -- playground endpoint allows accessing arbitrary files on server. Resolved in 0.0.16.
Installation
For both client and server:
pip install "langserve[all]"
or pip install "langserve[client]" for client code, and pip install "langserve[server]" for server code.
LangChain CLI 🛠️
Use the LangChain CLI to bootstrap a LangServe project quickly.
To use the langchain CLI, make sure that you have a recent version of langchain-cli installed. You can install it with pip install -U langchain-cli.
Setup
Note: We use poetry for dependency management. Please follow the poetry docs to learn more about it.
1. Create new app using langchain cli command
langchain app new my-app
2. Define the runnable in add_routes. Go to server.py and edit
add_routes(app, NotImplemented)
3. Use poetry to add 3rd party packages (e.g., langchain-openai, langchain-anthropic, langchain-mistral etc).
poetry add [package-name]  # e.g. `poetry add langchain-openai`
4. Set up relevant env variables. For example,
export OPENAI_API_KEY="sk-..."
5. Serve your app
poetry run langchain serve --port=8100
Examples
Get your LangServe instances started quickly with the examples directory.
Description | Links |
---|---|
LLMs Minimal example that serves OpenAI and Anthropic chat models. Uses async, supports batching and streaming. | server, client |
Retriever Simple server that exposes a retriever as a runnable. | server, client |
Conversational Retriever A Conversational Retriever exposed via LangServe | server, client |
Agent without conversation history based on OpenAI tools | server, client |
Agent with conversation history based on OpenAI tools | server, client |
RunnableWithMessageHistory to implement chat persisted on backend, keyed off a session_id supplied by client. | server, client |
RunnableWithMessageHistory to implement chat persisted on backend, keyed off a conversation_id supplied by client, and user_id (see Auth for implementing user_id properly). | server, client |
Configurable Runnable to create a retriever that supports run time configuration of the index name. | server, client |
Configurable Runnable that shows configurable fields and configurable alternatives. | server, client |
APIHandler Shows how to use APIHandler instead of add_routes . This provides more flexibility for developers to define endpoints. Works well with all FastAPI patterns, but takes a bit more effort. | server |
LCEL Example Example that uses LCEL to manipulate a dictionary input. | server, client |
Auth with add_routes : Simple authentication that can be applied across all endpoints associated with app. (Not useful on its own for implementing per user logic.) | server |
Auth with add_routes : Simple authentication mechanism based on path dependencies. (Not useful on its own for implementing per user logic.) | server |
Auth with add_routes : Implement per user logic and auth for endpoints that use per request config modifier. (Note: At the moment, does not integrate with OpenAPI docs.) | server, client |
Auth with APIHandler : Implement per user logic and auth that shows how to search only within user owned documents. | server, client |
Widgets Different widgets that can be used with playground (file upload and chat) | server |
Widgets File upload widget used for LangServe playground. | server, client |
Sample Application
Server
Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.
#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes
app = FastAPI(
title="LangChain Server",
version="1.0",
description="A simple api server using Langchain's Runnable interfaces",
)
add_routes(
app,
ChatOpenAI(model="gpt-3.5-turbo-0125"),
path="/openai",
)
add_routes(
app,
ChatAnthropic(model="claude-3-haiku-20240307"),
path="/anthropic",
)
model = ChatAnthropic(model="claude-3-haiku-20240307")
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
app,
prompt | model,
path="/joke",
)
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
If you intend to call your endpoint from the browser, you will also need to set CORS headers. You can use FastAPI's built-in middleware for that:
from fastapi.middleware.cors import CORSMiddleware
# Set all CORS enabled origins
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
expose_headers=["*"],
)
Docs
If you've deployed the server above, you can view the generated OpenAPI docs using:
⚠️ If using LangServe <= 0.2.0 and pydantic v2, docs will not be generated for invoke, batch, stream, stream_log. See the Pydantic section below for more details. To resolve, please upgrade to LangServe 0.3.0.
curl localhost:8000/docs
Make sure to add the /docs suffix.
⚠️ Index page / is not defined by design, so curl localhost:8000 or visiting the URL will return a 404. If you want content at /, define an endpoint @app.get("/").
Client
Python SDK
from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable
openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")
joke_chain.invoke({"topic": "parrots"})
# or async
await joke_chain.ainvoke({"topic": "parrots"})
prompt = [
SystemMessage(content='Act like either a cat or a parrot.'),
HumanMessage(content='Hello!')
]
# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)
prompt = ChatPromptTemplate.from_messages(
[("system", "Tell me a long story about {topic}")]
)
# Can define custom chains
chain = prompt | RunnableMap({
"openai": openai,
"anthropic": anthropic,
})
chain.batch([{"topic": "parrots"}, {"topic": "cats"}])
In TypeScript (requires LangChain.js version 0.0.166 or later):
import { RemoteRunnable } from "@langchain/core/runnables/remote";
const chain = new RemoteRunnable({
url: `http://localhost:8000/joke/`,
});
const result = await chain.invoke({
topic: "cats",
});
Python using requests:
import requests
response = requests.post(
"http://localhost:8000/joke/invoke",
json={'input': {'topic': 'cats'}}
)
response.json()
You can also use curl:
curl --location --request POST 'http://localhost:8000/joke/invoke' \
--header 'Content-Type: application/json' \
--data-raw '{
"input": {
"topic": "cats"
}
}'
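For environments without requests or curl, the same call can be sketched with the Python standard library. This is only an illustration of the request shape shown above (a POST to /joke/invoke with an `input` object); it builds the request without sending it, so it does not need a running server.

```python
import json
import urllib.request

# Same URL and JSON body as the requests and curl examples above.
payload = {"input": {"topic": "cats"}}

req = urllib.request.Request(
    "http://localhost:8000/joke/invoke",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Inspect what would be sent; nothing goes over the network here.
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

With the sample server running, passing `req` to `urllib.request.urlopen` would send it.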
Endpoints
The following code:
...
add_routes(
app,
runnable,
path="/my_runnable",
)
adds the following endpoints to the server:
- POST /my_runnable/invoke - invoke the runnable on a single input
- POST /my_runnable/batch - invoke the runnable on a batch of inputs
- POST /my_runnable/stream - invoke on a single input and stream the output
- POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
- POST /my_runnable/stream_events - invoke on a single input and stream events as they are generated, including from intermediate steps
- GET /my_runnable/input_schema - json schema for input to the runnable
- GET /my_runnable/output_schema - json schema for output of the runnable
- GET /my_runnable/config_schema - json schema for config of the runnable
These endpoints match the LangChain Expression Language interface -- please reference this documentation for more details.
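As a rough illustration of this route layout, the sketch below enumerates the URLs exposed for a runnable mounted at a given path. The helper `langserve_endpoints` and the base URL are hypothetical and not part of LangServe. The events route is assumed to be named `stream_events`, matching the Features list.

```python
from urllib.parse import urljoin

# (HTTP method, suffix) pairs for the routes described in this section.
# Assumption: the events route is "stream_events", as in the Features list.
ENDPOINT_SUFFIXES = [
    ("POST", "invoke"),
    ("POST", "batch"),
    ("POST", "stream"),
    ("POST", "stream_log"),
    ("POST", "stream_events"),
    ("GET", "input_schema"),
    ("GET", "output_schema"),
    ("GET", "config_schema"),
]


def langserve_endpoints(base_url, path):
    """Return (method, url) pairs for a runnable mounted at `path`.

    Hypothetical helper for illustration only; not a LangServe API.
    """
    prefix = base_url.rstrip("/") + "/" + path.strip("/") + "/"
    return [(method, urljoin(prefix, suffix)) for method, suffix in ENDPOINT_SUFFIXES]


endpoints = langserve_endpoints("http://localhost:8000", "/my_runnable")
for method, url in endpoints:
    print(method, url)
```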
Playground
You can find a playground page for your runnable at /my_runnable/playground/. This exposes a simple UI to configure and invoke your runnable with streaming output and intermediate steps.
Widgets
The playground supports widgets and can be used to test your runnable with different inputs. See the widgets section below for more details.
Sharing
In addition, for configurable runnables, the playground will allow you to configure the runnable and share a link with the configuration:
Chat playground
LangServe also supports a chat-focused playground that you can opt into and use under /my_runnable/playground/.
Unlike the general playground, only certain types of runnables are supported: the runnable's input schema must be a dict with either:
- a single key, and that key's value must be a list of chat messages.
- two keys, one whose value is a list of messages, and the other representing the most recent message.
We recommend you use the first format.
The runnable must also return either an AIMessage
or a string.
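To make the two accepted input shapes concrete, here is a small sketch that classifies an example payload. The helper `chat_input_format` and its return labels are hypothetical, and chat messages are stood in for by plain dicts. It only mirrors the rule stated above and is not how LangServe itself validates input.

```python
from typing import Optional


def chat_input_format(payload: dict) -> Optional[str]:
    """Classify a payload against the two chat-playground input shapes.

    Hypothetical helper: returns "messages_only" for a single key holding a
    list, "history_plus_latest" for two keys where exactly one holds a list,
    and None when neither shape matches.
    """
    list_keys = [key for key, value in payload.items() if isinstance(value, list)]
    if len(payload) == 1 and len(list_keys) == 1:
        return "messages_only"
    if len(payload) == 2 and len(list_keys) == 1:
        return "history_plus_latest"
    return None


# First (recommended) format: one key whose value is the list of messages.
print(chat_input_format({"messages": [{"type": "human", "content": "Hello!"}]}))
# Second format: a message list plus the most recent message.
print(chat_input_format({"chat_history": [], "question": "Hello!"}))
# Not a chat-style input.
print(chat_input_format({"topic": "cats"}))
```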
To enable it, you must set playground_type="chat" when adding your route. Here's an example:
# Declare a chain
prompt = ChatPromptTemplate.from_messages(
[
("system", "You are a helpful, professional assistant named Cob."),
MessagesPlaceholder(variable_name="messages"),
]
)
chain = prompt | ChatAnthropic(model="claude-2.1")
class InputChat(BaseModel):
    """Input for the chat endpoint."""
    messages: List[Union[HumanMessage, AIMessage, SystemMessage]] = Field(
        ...,
        description="The chat messages representing the current conversation.",
    )
add_routes(
app,
chain.with_types(input_type=InputChat),
enable_feedback_endpoint=True,
enable_public_trace_link_endpoint=True,
playground_type="chat",
)
If you are using LangSmith, you can also set enable_feedback_endpoint=True on your route to enable thumbs-up/thumbs-down buttons after each message, and enable_public_trace_link_endpoint=True to add a button that creates public traces for runs.
Note that you will also need to set the following environment variables:
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_PROJECT="YOUR_PROJECT_NAME"
export LANGCHAIN_API_KEY="YOUR_API_KEY"
Here's an example with the above two options turned on:
Note: If you enable public trace links, the internals of your chain will be exposed. We recommend only using this setting for demos or testing.
Legacy Chains
LangServe works with both Runnables (constructed via LangChain Expression Language) and legacy chains (inheriting from Chain).
However, some of the input schemas for legacy chains may be incomplete/incorrect, leading to errors.
This can be fixed by updating the input_schema property of those chains in LangChain.
If you encounter any errors, please open an issue on this repo, and we will work to address it.
Deployment
Deploy to AWS
You can deploy to AWS using the AWS Copilot CLI
copilot init --app [application-name] --name [service-name] --type 'Load Balanced Web Service' --dockerfile './Dockerfile' --deploy
Click here to learn more.
Deploy to Azure
You can deploy to Azure using Azure Container Apps (Serverless):
az containerapp up --name [container-app-name] --source . --resource-group [resource-group-name] --environment [environment-name] --ingress external --target-port 8001 --env-vars=OPENAI_API_KEY=your_key
You can find more info here
Deploy to GCP
You can deploy to GCP Cloud Run using the following command:
gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key
Community Contributed
Deploy to Railway
Pydantic
LangServe>=0.3 fully supports Pydantic 2.
If you're using an earlier version of LangServe (<= 0.2), then please note that support for Pydantic 2 has the following limitations:
- OpenAPI docs will not be generated for invoke/batch/stream/stream_log when using Pydantic V2. FastAPI does not support mixing pydantic v1 and v2 namespaces. To fix this, use pip install pydantic==1.10.17.
- LangChain uses the v1 namespace in Pydantic v2. Please read the following guidelines to ensure compatibility with LangChain.
Except for these limitations, we expect the API endpoints, the playground and any other features to work as expected.
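The compatibility rule in this section can be summarized as a small check. This is a sketch of the rule as stated here, not a LangServe utility: the function name `openapi_docs_expected` is hypothetical, and the LangServe >= 0.3 with Pydantic 1 combination is not addressed in the text, so it is not modeled.

```python
def openapi_docs_expected(langserve_version: str, pydantic_major: int) -> bool:
    """Return True if OpenAPI docs for invoke/batch/stream/stream_log are expected.

    Encodes only what this section states: LangServe >= 0.3 supports
    Pydantic 2, while earlier releases need Pydantic 1 for full docs.
    Hypothetical helper; LangServe >= 0.3 with Pydantic 1 is not modeled.
    """
    major, minor = (int(part) for part in langserve_version.split(".")[:2])
    if (major, minor) >= (0, 3):
        return True
    return pydantic_major < 2


print(openapi_docs_expected("0.3.0", 2))
print(openapi_docs_expected("0.2.0", 2))
print(openapi_docs_expected("0.2.0", 1))
```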
Advanced
Handling Authentication
If you need to add authentication to your server, please read FastAPI's documentation about dependencies and security.
The examples below show how to wire up authentication logic to LangServe endpoints using FastAPI primitives.
You are responsible for providing the actual authentication logic, the users table, etc.
If you're not sure what you're doing, you could try using an existing solution such as Auth0.
Using add_routes
If you're using add_routes, see examples here.
Description | Links |
---|---|
Auth with add_routes : Simple authentication that can be applied across all endpoints associated with app. (Not useful on its own for implementing per user logic.) | server |
Auth with add_routes : Simple authentication mechanism based on path dependencies. (Not useful on its own for implementing per user logic.) | server |
Auth with add_routes : Implement per user logic and auth for endpoints that use per request config modifier. (Note: At the moment, does not integrate with OpenAPI docs.) | server, client |
Alternatively, you can use FastAPI's middleware.
Using global dependencies and path dependencies has the advantage that auth will be properly supported in the OpenAPI docs page, but these are not sufficient for implementing per-user logic (e.g., making an application that can search only within user-owned documents).
If you need to implement per-user logic, you can use the per_req_config_modifier or APIHandler (below) to implement this logic.
Per User
If you need authorization or logic that is user dependent, specify per_req_config_modifier when using add_routes. Use a callable that receives the raw Request object and can extract relevant information from it for authentication and
authorization purposes.
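A minimal sketch of such a callable follows. It assumes the modifier is called with the run config dict and the request and returns the updated config. The header name `x-user-id`, the `user_id` key, and the `FakeRequest` stand-in (used so the sketch runs without a server) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FakeRequest:
    """Stand-in for the raw request object; only `headers` is modeled."""

    headers: Dict[str, str] = field(default_factory=dict)


def add_user_to_config(config: Dict[str, Any], request: Any) -> Dict[str, Any]:
    """Copy a user id from a (hypothetical) header into the run config."""
    user_id = request.headers.get("x-user-id")
    if user_id is None:
        # A real server would more likely raise an HTTP 401 error here.
        raise PermissionError("missing x-user-id header")
    updated = dict(config)
    updated["configurable"] = {**config.get("configurable", {}), "user_id": user_id}
    return updated


config = add_user_to_config({}, FakeRequest(headers={"x-user-id": "alice"}))
print(config)
```

In a server, a callable like this would be passed as the per_req_config_modifier argument of add_routes, and the chain would read the value from its configurable fields.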
Using APIHandler
If you feel comfortable with FastAPI and python, you can use LangServe's APIHandler.
Description | Links |
---|---|
Auth with APIHandler : Implement per user logic and auth that shows how to search only within user owned documents. | server, client |
APIHandler Shows how to use APIHandler instead of add_routes . This provides more flexibility for developers to define endpoints. Works well with all FastAPI patterns, but takes a bit more effort. | server, client |
It's a bit more work, but gives you complete control over the endpoint definitions, so you can do whatever custom logic you need for auth.
Files
LLM applications often deal with files. There are different architectures that can be made to implement file processing; at a high level:
- The file may be uploaded to the server via a dedicated endpoint and processed using a separate endpoint
- The file may be uploaded by either value (bytes of file) or reference (e.g., s3 url to file content)
- The processing endpoint may be blocking or non-blocking
- If significant processing is required, the processing may be offloaded to a dedicated process pool
You should determine what is the appropriate architecture for your application.
Currently, to upload files by value to a runnable, use base64 encoding for the file (multipart/form-data is not supported yet).
Here's an example that shows how to use base64 encoding to send a file to a remote runnable.
Remember, you can always upload files by reference (e.g., s3 url) or upload them as multipart/form-data to a dedicated endpoint.
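As a rough sketch of uploading by value, the snippet below base64-encodes some bytes and wraps them in an invoke-style JSON body. The field names `file` and `num_chars` mirror the FileProcessingRequest model shown in the File Upload Widget section; the helper name and sample bytes are made up for illustration.

```python
import base64
import json


def build_file_payload(data: bytes, num_chars: int = 100) -> str:
    """Encode raw file bytes as base64 and wrap them in an invoke-style body.

    Hypothetical helper; the field names follow the FileProcessingRequest
    example in this document.
    """
    encoded = base64.b64encode(data).decode("utf-8")
    return json.dumps({"input": {"file": encoded, "num_chars": num_chars}})


body = build_file_payload(b"%PDF-1.4 example bytes")

# Round-trip check: the server side would decode the same way.
decoded = base64.b64decode(json.loads(body)["input"]["file"])
print(decoded)
```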
Custom Input and Output Types
Input and Output types are defined on all runnables.
You can access them via the input_schema and output_schema properties.
LangServe uses these types for validation and documentation.
If you want to override the default inferred types, you can use the with_types method.
Here's a toy example to illustrate the idea:
from typing import Any
from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda
from langserve import add_routes
app = FastAPI()
def func(x: Any) -> int:
    """Mistyped function that should accept an int but accepts anything."""
    return x + 1
runnable = RunnableLambda(func).with_types(
    input_type=int,
)
add_routes(app, runnable)
Custom User Types
Inherit from CustomUserType if you want the data to de-serialize into a pydantic model rather than the equivalent dict representation.
At the moment, this type only works server side and is used to specify desired decoding behavior. If inheriting from this type, the server will keep the decoded type as a pydantic model instead of converting it into a dict.
from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda
from langserve import add_routes
from langserve.schema import CustomUserType
app = FastAPI()
class Foo(CustomUserType):
    bar: int
def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar
# Note that the input and output type are automatically inferred!
# You do not need to specify them.
# runnable = RunnableLambda(func).with_types( # <-- Not needed in this case
#     input_type=Foo,
#     output_type=int,
# )
add_routes(app, RunnableLambda(func), path="/foo")
Playground Widgets
The playground allows you to define custom widgets for your runnable from the backend.
Here are a few examples:
Description | Links |
---|---|
Widgets Different widgets that can be used with playground (file upload and chat) | server, client |
Widgets File upload widget used for LangServe playground. | server, client |
Schema
- A widget is specified at the field level and shipped as part of the JSON schema of the input type
- A widget must contain a key called type with the value being one of a well known list of widgets
- Other widget keys will be associated with values that describe paths in a JSON object
type JsonPath = number | string | (number | string)[];
type NameSpacedPath = { title: string; path: JsonPath }; // Using title to mimic json schema, but can use namespace
type OneOfPath = { oneOf: JsonPath[] };
type Widget = {
  type: string; // Some well known type (e.g., base64file, chat etc.)
  [key: string]: JsonPath | NameSpacedPath | OneOfPath;
};
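To illustrate how a widget's path values might be read, here is a Python sketch that follows a JsonPath (a single key or index, or a list of them) into a JSON-like object. The helpers `resolve_path` and `validate_widget` are hypothetical, and how the playground actually interprets these paths is an assumption. The widget dict mirrors the chat widget example in this document.

```python
from typing import Any, List, Union

# Python counterpart of the JsonPath type above.
JsonPath = Union[int, str, List[Union[int, str]]]


def resolve_path(obj: Any, path: JsonPath) -> Any:
    """Follow a JsonPath into a nested dict/list structure."""
    steps = path if isinstance(path, list) else [path]
    for step in steps:
        obj = obj[step]
    return obj


def validate_widget(widget: dict) -> None:
    """Check the one hard requirement stated above: a string `type` key."""
    if not isinstance(widget.get("type"), str):
        raise ValueError("widget must contain a string 'type' key")


widget = {"type": "chat", "input": "question", "output": ["answer"]}
validate_widget(widget)

request = {"question": "Hi", "chat_history": []}
response = {"answer": "Hello!"}
print(resolve_path(request, widget["input"]))
print(resolve_path(response, widget["output"]))
```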
Available Widgets
There are only two widgets that the user can specify manually right now:
- File Upload Widget
- Chat History Widget
See below for more information about these widgets.
All other widgets on the playground UI are created and managed automatically by the UI based on the config schema of the Runnable. When you create Configurable Runnables, the playground should create appropriate widgets for you to control the behavior.
File Upload Widget
Allows creation of a file upload input in the UI playground for files that are uploaded as base64 encoded strings. Here's the full example.
Snippet:
try:
    from pydantic.v1 import Field
except ImportError:
    from pydantic import Field
from langserve import CustomUserType
# ATTENTION: Inherit from CustomUserType instead of BaseModel otherwise
# the server will decode it into a dict instead of a pydantic model.
class FileProcessingRequest(CustomUserType):
    """Request including a base64 encoded file."""
    # The extra field is used to specify a widget for the playground UI.
    file: str = Field(..., extra={"widget": {"type": "base64file"}})
    num_chars: int = 100
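To make the base64 round trip concrete, here is a minimal sketch using only the standard library. It assumes the uploaded file is UTF-8 text; the function names encode_file_bytes and process_file are hypothetical and are not part of LangServe.

```python
import base64


def encode_file_bytes(data: bytes) -> str:
    """Encode raw file bytes as a base64 string (what the file field would carry)."""
    return base64.b64encode(data).decode("utf-8")


def process_file(file: str, num_chars: int = 100) -> str:
    """Decode a base64 string and return the first num_chars characters.

    Assumes the decoded bytes are UTF-8 text.
    """
    content = base64.b64decode(file.encode("utf-8"))
    return content.decode("utf-8")[:num_chars]


encoded = encode_file_bytes(b"hello world")
preview = process_file(encoded, num_chars=5)
```

A function like process_file could be wrapped in a RunnableLambda that receives a FileProcessingRequest and reads its file and num_chars fields.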
Chat Widget
Look at the widget example.
To define a chat widget, make sure that you pass "type": "chat".
- "input" is the JSONPath to the field in the Request that has the new input message.
- "output" is the JSONPath to the field in the Response that has the new output message(s).
- Don't specify these fields if the entire input or output should be used as-is (e.g., if the output is a list of chat messages).
Here's a snippet:
class ChatHistory(CustomUserType):
    chat_history: List[Tuple[str, str]] = Field(
        ...,
        examples=[[("human input", "ai response")]],
        extra={"widget": {"type": "chat", "input": "question", "output": "answer"}},
    )
    question: str
def _format_to_messages(input: ChatHistory) -> List[BaseMessage]:
    """Format the input to a list of messages."""
    history = input.chat_history
    user_input = input.question
    messages = []
    for human, ai in history:
        messages.append(HumanMessage(content=human))
        messages.append(AIMessage(content=ai))
    messages.append(HumanMessage(content=user_input))
    return messages
model = ChatOpenAI()
chat_model = RunnableParallel({"answer": (RunnableLambda(_format_to_messages) | model)})
add_routes(
    app,
    chat_model.with_types(input_type=ChatHistory),
    config_keys=["configurable"],
    path="/chat",
)
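The sketch below illustrates, in plain Python, how a client might carry the conversation forward between calls to an endpoint whose input looks like ChatHistory. It is only an assumption about client-side bookkeeping: the helpers next_request and record_turn are hypothetical, and the answer text is a stand-in for whatever the model returns.

```python
from typing import Dict, List, Tuple

History = List[Tuple[str, str]]


def next_request(history: History, question: str) -> Dict[str, object]:
    """Build a request body shaped like the ChatHistory input type."""
    return {"chat_history": list(history), "question": question}


def record_turn(history: History, question: str, answer: str) -> History:
    """Return a new history with the finished (human, ai) turn appended."""
    return history + [(question, answer)]


history: History = []
first = next_request(history, "What is LangServe?")

# Stand-in for the model's reply to the first request.
history = record_turn(history, "What is LangServe?", "A library for serving runnables.")
second = next_request(history, "Does it stream?")
```

Each request carries the full history plus the new question, which matches how _format_to_messages rebuilds the message list on every call.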
You can also specify a list of messages as a parameter directly, as shown in this snippet:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant named Cob."),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
chain = prompt | ChatAnthropic(model="claude-2.1")
class MessageListInput(BaseModel):
    """Input for the chat endpoint."""
    messages: List[Union[HumanMessage, AIMessage]] = Field(
        ...,
        description="The chat messages representing the current conversation.",
        extra={"widget": {"type": "chat", "input": "messages"}},
    )
add_routes(
    app,
    chain.with_types(input_type=MessageListInput),
    path="/chat",
)
See this sample file for an example.
Enabling / Disabling Endpoints (LangServe >=0.0.33)
You can enable / disable which endpoints are exposed when adding routes for a given chain.
Use enabled_endpoints if you want to make sure you never get a new endpoint when upgrading langserve to a newer version.
Enable: The code below will only enable invoke, batch and the corresponding config_hash endpoint variants.
add_routes(app, chain, enabled_endpoints=["invoke", "batch", "config_hashes"], path="/mychain")
Disable: The code below will disable the playground for the chain.
add_routes(app, chain, disabled_endpoints=["playground"], path="/mychain")
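The difference between the two options can be sketched as an allowlist versus a denylist. The helper below is hypothetical and does not reproduce LangServe's implementation; the endpoint names in the sets are illustrative, and rejecting both arguments at once is an assumption of this sketch.

```python
from typing import Optional, Sequence, Set


def exposed_endpoints(
    known: Set[str],
    enabled: Optional[Sequence[str]] = None,
    disabled: Optional[Sequence[str]] = None,
) -> Set[str]:
    """Hypothetical helper: compute which of the known endpoints stay exposed."""
    if enabled is not None and disabled is not None:
        raise ValueError("Specify either enabled or disabled endpoints, not both.")
    if enabled is not None:
        return set(enabled) & known  # allowlist: only what was asked for
    if disabled is not None:
        return known - set(disabled)  # denylist: everything except these
    return set(known)


# Illustrative endpoint sets; "new_endpoint" stands for one added by an upgrade.
current = {"invoke", "batch", "playground"}
future = current | {"new_endpoint"}

allow_now = exposed_endpoints(current, enabled=["invoke", "batch"])
allow_later = exposed_endpoints(future, enabled=["invoke", "batch"])
deny_now = exposed_endpoints(current, disabled=["playground"])
deny_later = exposed_endpoints(future, disabled=["playground"])
```

With the allowlist the exposed set is the same before and after the upgrade, whereas the denylist picks up the newly added endpoint.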