
smol-ai / developer

the first library to let you embed a developer agent in your own app!


Top Related Projects

  • openai/openai-cookbook: Examples and guides for using the OpenAI API
  • microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps
  • langchain-ai/langchain (98,623 ⭐): 🦜🔗 Build context-aware reasoning applications
  • Significant-Gravitas/AutoGPT (173,689 ⭐): AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Quick Overview

The smol-ai/developer repository is an AI-powered coding assistant that aims to help developers write entire apps from scratch. It utilizes large language models to generate code, fix errors, and provide explanations, acting as a virtual pair programmer throughout the development process.

Pros

  • Streamlines the development process by generating initial code and helping with iterations
  • Provides explanations and context for generated code, aiding in learning and understanding
  • Capable of working with multiple programming languages and frameworks
  • Integrates with popular development tools and environments

Cons

  • May produce inconsistent or incorrect code in complex scenarios
  • Relies heavily on the quality and capabilities of the underlying language models
  • Potential privacy concerns when sharing code and project details with AI
  • May not always adhere to best practices or follow specific coding standards

Getting Started

To get started with smol-ai/developer:

  1. Clone the repository:

    git clone https://github.com/smol-ai/developer.git
    
  2. Install dependencies:

    cd developer
    pip install -r requirements.txt
    
  3. Set up your OpenAI API key:

    export OPENAI_API_KEY=your_api_key_here
    
  4. Run the main script:

    python main.py
    

Follow the prompts to describe your project and start generating code with AI assistance.

Competitor Comparisons

Examples and guides for using the OpenAI API

Pros of openai-cookbook

  • Comprehensive collection of OpenAI API usage examples and best practices
  • Well-organized with separate notebooks for different tasks and techniques
  • Regularly updated with new features and improvements from OpenAI

Cons of openai-cookbook

  • Focused solely on OpenAI's offerings, limiting its scope for other AI tools
  • May be overwhelming for beginners due to its extensive content
  • Lacks a unified application structure, as it's primarily a collection of examples

Code Comparison

openai-cookbook:

import openai

# translate a short text with the (legacy) Completion endpoint
response = openai.Completion.create(
  engine="text-davinci-002",
  prompt="Translate the following English text to French: 'Hello, world!'",
  max_tokens=60
)
print(response.choices[0].text.strip())

developer:

from smol_dev.prompts import plan, specify_file_paths, generate_code_sync

prompt = "a HTML/JS/CSS Tic Tac Toe Game"
shared_deps = plan(prompt)                            # high-level coding plan
file_paths = specify_file_paths(prompt, shared_deps)  # files to write
code = generate_code_sync(prompt, shared_deps, file_paths[0])

The openai-cookbook example demonstrates direct API usage, while developer exposes higher-level helpers that plan and scaffold an entire codebase.

Integrate cutting-edge LLM technology quickly and easily into your apps

Pros of Semantic Kernel

  • More comprehensive and feature-rich, offering a full SDK for AI orchestration
  • Better documentation and extensive examples for various use cases
  • Stronger integration with Azure AI services and other Microsoft technologies

Cons of Semantic Kernel

  • Steeper learning curve due to its complexity and extensive features
  • Primarily focused on .NET ecosystem, which may limit its appeal to developers using other languages
  • Heavier and more resource-intensive compared to the lightweight Developer project

Code Comparison

Developer:

import openai

def generate_clarification(messages):
    # ask the chat model for a clarification, given the conversation so far
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        max_tokens=300,
    )
    return response.choices[0].message.content.strip()

Semantic Kernel:

var kernel = Kernel.Builder.Build();
kernel.Config.AddOpenAITextCompletionService("davinci", "your-api-key");
var result = await kernel.RunAsync("What is the capital of France?", new OpenAIRequestSettings { MaxTokens = 300 });

Both repositories aim to simplify AI integration, but Developer focuses on a lightweight approach for rapid prototyping, while Semantic Kernel provides a more comprehensive framework for building AI-powered applications, particularly within the Microsoft ecosystem.


🦜🔗 Build context-aware reasoning applications

Pros of langchain

  • More comprehensive and feature-rich framework for building LLM applications
  • Extensive documentation and community support
  • Offers a wide range of integrations with various LLMs and tools

Cons of langchain

  • Steeper learning curve due to its complexity and extensive features
  • May be overkill for simpler projects or quick prototypes
  • Requires more setup and configuration

Code Comparison

langchain:

from langchain import OpenAI, LLMChain, PromptTemplate

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("colorful socks"))

developer:

import openai

openai.api_key = "your-api-key"
response = openai.Completion.create(
  engine="text-davinci-002",
  prompt="What is a good name for a company that makes colorful socks?",
  max_tokens=50
)
print(response.choices[0].text.strip())

The langchain example showcases its abstraction and chaining capabilities, while the developer example demonstrates a more direct approach using the OpenAI API.

Connecting ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting

Pros of TaskMatrix

  • Offers a more comprehensive task planning and execution framework
  • Includes a visual interface for task management and visualization
  • Provides built-in support for multi-modal tasks (text, image, audio)

Cons of TaskMatrix

  • More complex setup and configuration required
  • Less focused on specific developer workflows
  • May have a steeper learning curve for new users

Code Comparison

TaskMatrix:

# illustrative pseudocode of TaskMatrix-style task composition
# (Task, ImageAnalyzer, TextGenerator are hypothetical names)
task = Task("Analyze image and generate description")
image_analyzer = ImageAnalyzer()
text_generator = TextGenerator()
result = task.execute(image_analyzer, text_generator)

Developer:

prompt = "Analyze the following image and generate a description:"
image_path = "path/to/image.jpg"
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": f"{prompt}\n[Image: {image_path}]"}]
)

Summary

TaskMatrix offers a more comprehensive framework for task management and execution, including multi-modal support and a visual interface. However, it may be more complex to set up and use compared to Developer. Developer, on the other hand, provides a more streamlined approach focused specifically on developer workflows, making it potentially easier to integrate into existing projects. The code comparison shows that TaskMatrix uses a more structured task-based approach, while Developer relies on direct API calls to language models.


AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Pros of AutoGPT

  • More comprehensive and feature-rich, offering a wider range of capabilities
  • Supports multiple AI models and has a larger community for support and development
  • Includes a web interface for easier interaction and visualization

Cons of AutoGPT

  • More complex setup and configuration process
  • Requires more computational resources due to its extensive features
  • Steeper learning curve for new users

Code Comparison

AutoGPT:

def start_agent(
    task: str,
    ai_name: str,
    memory: Optional[Memory] = None,
    ...
) -> Agent:
    agent = Agent(task, ai_name, memory, ...)
    return agent

developer:

import openai

def generate_clarification(messages, prompt):
    # append the new user prompt and ask the model for a clarification
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages + [{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return response['choices'][0]['message']['content']

The code snippets show that AutoGPT focuses on creating and managing agents, while developer emphasizes generating clarifications using OpenAI's API. This reflects the different approaches and scopes of the two projects.


README

🐣 smol developer


Human-centric & Coherent Whole Program Synthesis aka your own personal junior developer

Build the thing that builds the thing! a smol dev for every dev in every situation

This is a "junior developer" agent (aka smol dev) that either:

  1. scaffolds an entire codebase out for you once you give it a product spec
  2. gives you basic building blocks to have a smol developer inside of your own app.

Instead of making and maintaining specific, rigid, one-shot starters like create-react-app or create-nextjs-app, this basically is (or helps you make) create-anything-app, where you develop your scaffolding prompt in a tight loop with your smol dev.

After the successful initial v0 launch, smol developer was rewritten to be even smol-ler, and importable from a library!

Basic Usage

In Git Repo mode

# install
git clone https://github.com/smol-ai/developer.git
cd developer
poetry install # install dependencies; pip install poetry first if you need it

# run
python main.py "a HTML/JS/CSS Tic Tac Toe Game" # defaults to gpt-4-0613
# python main.py "a HTML/JS/CSS Tic Tac Toe Game" --model=gpt-3.5-turbo-0613

# other cli flags
python main.py --prompt prompt.md # for longer prompts, move them into a markdown file
python main.py --prompt prompt.md --debug True # for debugging

This lets you develop apps as a human in the loop, as per the original version of smol developer.

engineering with prompts, rather than prompt engineering

The demo example in prompt.md shows the potential of an AI-enabled, but still firmly human-developer-centric, workflow:

  • Human writes a basic prompt for the app they want to build
  • main.py generates code
  • Human runs/reads the code
  • Human can:
    • simply add to the prompt as they discover underspecified parts of it
    • manually run the code and identify errors
    • paste the error into the prompt, just as they would file a GitHub issue
    • for extra help, use debugger.py, which reads the whole codebase to make specific code-change suggestions

Loop until happiness is attained. Notice that AI is only used as long as it is adding value - once it gets in your way, just take over the codebase from your smol junior developer with no fuss and no hurt feelings. (we could also have smol-dev take over an existing codebase and bootstrap its own prompt... but that's a Future Direction)

In this way you can use your clone of this repo itself to prototype/develop your app.

In Library mode

This is the new thing in smol developer v1! Add smol developer to your own projects!

pip install smol_dev

Here you can basically look at the contents of main.py as our "documentation" of how you can use these functions and prompts in your own app:

from smol_dev.prompts import plan, specify_file_paths, generate_code_sync

prompt = "a HTML/JS/CSS Tic Tac Toe Game"

shared_deps = plan(prompt) # returns a long string representing the coding plan

# do something with the shared_deps plan if you wish, for example ask for user confirmation/edits and iterate in a loop

file_paths = specify_file_paths(prompt, shared_deps) # returns an array of strings representing the filenames it needs to write based on your prompt and shared_deps. Relies on OpenAI's new Function Calling API to guarantee JSON.

# do something with the filepaths if you wish, for example display a plan

# loop through file_paths array and generate code for each file
for file_path in file_paths:
    code = generate_code_sync(prompt, shared_deps, file_path) # generates the source code of each file

    # do something with the source code of the file, eg. write to disk or display in UI
    # there is also an async `generate_code()` version of this
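
Since each file is generated independently, the async variant lends itself to concurrent generation. A minimal sketch, assuming `generate_code()` is awaitable and takes the same arguments as `generate_code_sync` (this README only names the function, so treat the details as assumptions):

import asyncio

from smol_dev.prompts import generate_code, plan, specify_file_paths

async def generate_all(prompt):
    shared_deps = plan(prompt)
    file_paths = specify_file_paths(prompt, shared_deps)
    # kick off one generation per file and await them all together;
    # generate_code's signature is assumed to mirror generate_code_sync
    codes = await asyncio.gather(
        *(generate_code(prompt, shared_deps, fp) for fp in file_paths)
    )
    return dict(zip(file_paths, codes))

files = asyncio.run(generate_all("a HTML/JS/CSS Tic Tac Toe Game"))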

In API mode (via Agent Protocol)

To start the server run:

poetry run api

or

python smol_dev/api.py

and then you can call the API using either curl or the Python client library, as shown below.

To create a task run:

curl --request POST \
  --url http://localhost:8000/agent/tasks \
  --header 'Content-Type: application/json' \
  --data '{
	"input": "Write simple script in Python. It should write '\''Hello world!'\'' to hi.txt"
}'

You will get a response like this:

{"input":"Write simple script in Python. It should write 'Hello world!' to hi.txt","task_id":"d2c4e543-ae08-4a97-9ac5-5f9a4459cb19","artifacts":[]}

Then, to execute one step of the task, copy the task_id from the previous response and run:

curl --request POST \
  --url http://localhost:8000/agent/tasks/<task-id>/steps

or you can use the Python client library:

from agent_protocol_client import AgentApi, ApiClient, TaskRequestBody

...

prompt = "Write simple script in Python. It should write 'Hello world!' to hi.txt"

async with ApiClient() as api_client:
    # Create an instance of the API class
    api_instance = AgentApi(api_client)
    task_request_body = TaskRequestBody(input=prompt)

    task = await api_instance.create_agent_task(
        task_request_body=task_request_body
    )
    task_id = task.task_id
    response = await api_instance.execute_agent_task_step(task_id=task_id)

...
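
To drive a task to completion, you can keep executing steps until the agent reports the last one. A minimal sketch (inside the same `async with` block as above), assuming the Agent Protocol `Step` object exposes `output` and an `is_last` flag:

    while True:
        step = await api_instance.execute_agent_task_step(task_id=task_id)
        print(step.output)
        if step.is_last:  # assumption: the final step is flagged via is_last
            break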

examples/prompt gallery

I'm actively seeking more examples, please PR yours!

sorry for the lack of examples, I know that is frustrating, but I wasn't ready for so many of you lol

major forks/alternatives

please send in alternative implementations, and deployment strategies on alternative stacks!

innovations and insights

Please subscribe to https://latent.space/ for a fuller writeup and insights and reflections

  • Markdown is all you need - Markdown is the perfect way to prompt for whole program synthesis because it is easy to mix English and code (whether `variable_names` or entire code-fenced samples)
    • turns out you can specify prompts in code in prompts, and GPT-4 obeys that to the letter
  • Copy and paste programming
    • teaching the program how to code against a new API (Anthropic's API postdates GPT-3's knowledge cutoff) by just pasting in the curl input and output
    • pasting error messages into the prompt and vaguely telling the program how you'd like them handled. It kind of feels like "logbook-driven programming".
  • Debugging by `cat`ing the whole codebase with your error message and getting specific fix suggestions - particularly delightful!
  • Tricks for whole program coherence - our chosen example usecase, Chrome extensions, have a lot of indirect dependencies across files. Any hallucination of cross dependencies causes the whole program to error.
    • We solved this by adding an intermediate step asking GPT to think through shared_dependencies.md, and then insisting on using that in generating each file. This basically means GPT is able to talk to itself... (see the sketch after this list)
    • ... but it's not perfect, yet. shared_dependencies.md is sometimes not comprehensive in understanding what the hard dependencies between files are. So we just solved it by specifying a specific name in the prompt. It felt dirty at first, but it works, and really it's just clear, unambiguous communication at the end of the day.
    • see prompt.md for SOTA smol-dev prompting
  • Low activation energy for unfamiliar APIs
    • we have never really learned css animations, but now can just say we want a "juicy css animated red and white candy stripe loading indicator" and it does the thing.
    • ditto for Chrome Extension Manifest v3 - the docs are an abject mess, but fortunately we don't have to read them now to just get a basic thing done
    • the Anthropic docs (bad bad) were missing guidance on what return signature they have. so just curl it and dump it in the prompt lol.
  • Modal is all you need - we chose Modal to solve 4 things:
    • solve python dependency hell in dev and prod
    • parallelizable code generation
    • simple upgrade path from local dev to cloud hosted endpoints (in future)
    • fault tolerant openai api calls with retries/backoff, and attached storage (for future use)
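
The shared-dependencies trick above can be reproduced with the raw chat API. A minimal sketch (the prompt wording and the chat() helper are illustrative, not the repo's actual prompts):

import openai

def chat(content):
    # small helper around the (legacy) ChatCompletion endpoint used in the
    # snippets above
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content

spec = "a Chrome extension that shows a candy-stripe loading indicator"

# step 1: have the model write shared_dependencies.md first...
shared_deps = chat(
    "List the shared dependencies (filenames, function names, DOM ids, "
    f"message schemas) for this app:\n{spec}"
)

# step 2: ...then insist on that plan while generating each file
popup_js = chat(
    f"Write popup.js for:\n{spec}\n\n"
    f"Strictly follow these shared dependencies:\n{shared_deps}"
)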


caveats

We were working on a Chrome extension, which requires images to be generated, so we added some use-case-specific code to skip destroying/regenerating them; we haven't decided how to generalize this.

We don't have access to GPT-4-32k, but if we did, we'd explore dumping entire API/SDK documentation into context.

The feedback loop is very slow right now (`time` reports about 2-4 minutes to generate a program with GPT-4, occasionally spiking higher, even with Modal's parallelization), but it's a safe bet that it will go down over time (see also "future directions" below).

future directions

things to try/would accept open issue discussions and PRs:

  • specify .md files for each generated file, with further prompts that could finetune the output in each of them
    • so basically like popup.html.md and content_script.js.md and so on
  • bootstrap the prompt.md for existing codebases - write a script to read in a codebase and write a descriptive, bullet pointed prompt that generates it
    • done by smol pm, but it's not very good yet - would love some focused polish/effort until we have a quine smol developer that can generate itself lmao
  • ability to install its own dependencies
  • self-heal by running the code itself and use errors as information for reprompting
    • however, it's a bit hard to get errors from the Chrome extension environment, so we did not try this
  • using anthropic as the coding layer
    • you can run `modal run anthropic.py --prompt prompt.md --outputdir=anthropic` to try it
    • but it doesn't work, because Anthropic doesn't follow instructions to generate file code very well.
  • make agents that autonomously run this code in a loop/watch the prompt file and regenerate code each time, on a new git branch
    • the code could be generated on 5 simultaneous git branches and checking their output would just involve switching git branches