abi / screenshot-to-code

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)


Top Related Projects

  • DiffusionBee-Stable-Diffusion-UI: Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
  • text-generation-webui: A Gradio web UI for Large Language Models.
  • stable-diffusion-webui: Stable Diffusion web UI.
  • whisper: Robust Speech Recognition via Large-Scale Weak Supervision.
  • DALL-E: PyTorch package for the discrete VAE used for DALL·E.

Quick Overview

Screenshot-to-code is an open-source project that uses AI to convert screenshots of user interfaces into functional frontend code (HTML/Tailwind, HTML/CSS, React, Vue, and more). It leverages computer vision and large language models to analyze images and generate corresponding code, aiming to streamline the process of translating designs into web implementations.
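
To make the idea concrete, here is a minimal, hedged sketch of the general approach (not the project's actual implementation): the screenshot is encoded and sent to a vision-capable model along with an instruction to return frontend code. It assumes the openai Python package (v1+), an OPENAI_API_KEY environment variable, and a local file named screenshot.png; the model name and prompt are placeholders.

# Illustrative sketch only; screenshot-to-code's real pipeline is more involved.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable model could be substituted
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Turn this UI screenshot into a single HTML file styled with Tailwind."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)

print(response.choices[0].message.content)  # the generated markup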

Pros

  • Accelerates the design-to-code process, potentially saving developers significant time
  • Provides a useful tool for rapid prototyping and concept visualization
  • Supports multiple frontend frameworks and can generate code for various technologies
  • Continuously improving with community contributions and AI advancements

Cons

  • Generated code may require manual refinement for production-ready implementations
  • Accuracy can vary depending on the complexity of the input screenshot
  • May not capture all nuances of responsive design or advanced UI interactions
  • Reliance on external AI services could raise privacy concerns for sensitive designs

Code Examples

This project is not primarily a code library, but rather a tool that generates code. Here are some examples of the kind of code it produces:

<!-- Example of generated HTML structure -->
<div class="container">
  <header>
    <h1>Welcome to My Website</h1>
    <nav>
      <ul>
        <li><a href="#home">Home</a></li>
        <li><a href="#about">About</a></li>
        <li><a href="#contact">Contact</a></li>
      </ul>
    </nav>
  </header>
  <main>
    <section id="content">
      <p>This is the main content area.</p>
    </section>
  </main>
</div>

/* Example of generated CSS styles */
.container {
  max-width: 1200px;
  margin: 0 auto;
  padding: 20px;
}

header {
  display: flex;
  justify-content: space-between;
  align-items: center;
}

nav ul {
  display: flex;
  list-style-type: none;
}

nav ul li {
  margin-left: 20px;
}

Getting Started

To use screenshot-to-code:

  1. Clone the repository: git clone https://github.com/abi/screenshot-to-code.git
  2. Install backend dependencies with Poetry (poetry install) and frontend dependencies with Yarn (yarn)
  3. Set up your OpenAI API key as an environment variable or in backend/.env
  4. Start the FastAPI backend (uvicorn via Poetry) and the React/Vite frontend (yarn dev)
  5. Upload a screenshot through the web interface
  6. Review and download the generated code

Note: Detailed setup instructions and requirements are available in the project's README file on GitHub.

Competitor Comparisons

Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.

Pros of DiffusionBee-Stable-Diffusion-UI

  • Focuses on image generation using Stable Diffusion models
  • Provides a user-friendly GUI for easy interaction
  • Supports various image generation features like inpainting and outpainting

Cons of DiffusionBee-Stable-Diffusion-UI

  • Limited to image generation tasks, not web development
  • May require more computational resources for running Stable Diffusion models
  • Less versatile in terms of output formats compared to Screenshot-to-Code

Code Comparison

While a direct code comparison is not particularly relevant due to the different nature of these projects, we can highlight some key differences in their implementation:

Screenshot-to-Code:

def generate_code(screenshot, code_template):
    # AI-based code generation from screenshot
    ...

DiffusionBee-Stable-Diffusion-UI:

def generate_image(prompt, model):
    # Stable Diffusion image generation
    ...

These snippets illustrate the fundamental difference in purpose between the two projects. Screenshot-to-Code focuses on generating code from visual input, while DiffusionBee-Stable-Diffusion-UI is designed for image generation based on text prompts.

A Gradio web UI for Large Language Models.

Pros of text-generation-webui

  • Supports a wide range of language models and architectures
  • Offers a user-friendly web interface for text generation tasks
  • Provides extensive customization options and parameters

Cons of text-generation-webui

  • Requires more setup and configuration compared to screenshot-to-code
  • May have a steeper learning curve for users new to language models
  • Focuses solely on text generation, lacking image processing capabilities

Code Comparison

text-generation-webui:

def generate_reply(
    prompt, state, stopping_strings=None, is_chat=False
):
    # Generate text based on prompt and parameters
    # ...

screenshot-to-code:

def generate_code(image_path, model):
    # Process image and generate HTML/CSS code
    # ...

The code snippets highlight the different focus areas of the two projects. text-generation-webui is centered around text generation with various parameters, while screenshot-to-code emphasizes image processing and code generation based on visual input.

Stable Diffusion web UI

Pros of stable-diffusion-webui

  • More comprehensive and feature-rich, offering a wide range of image generation and manipulation tools
  • Highly customizable with a large ecosystem of extensions and models
  • Active community support and frequent updates

Cons of stable-diffusion-webui

  • Steeper learning curve due to its extensive features and options
  • Requires more computational resources and setup time
  • Primarily focused on image generation, not web development or UI creation

Code Comparison

While a direct code comparison isn't particularly relevant due to the different purposes of these projects, here's a brief example of how they might be used:

screenshot-to-code:

from screenshot_to_code import generate_code
code = generate_code("screenshot.png")
print(code)

stable-diffusion-webui:

import modules.scripts as scripts
from modules import sd_samplers
result = scripts.process_images(prompt="A beautiful landscape")
result.images[0].save("output.png")

screenshot-to-code is focused on converting UI designs to code, while stable-diffusion-webui is primarily used for generating and manipulating images using AI models. The choice between them depends on the specific task at hand: UI development vs. image generation.

TaskMatrix

Pros of TaskMatrix

  • Broader scope: Handles a wide range of tasks beyond UI generation
  • Multi-modal capabilities: Integrates vision, language, and action
  • More flexible: Can adapt to various types of inputs and outputs

Cons of TaskMatrix

  • Less specialized: May not produce as refined UI code as Screenshot-to-code
  • Potentially more complex to use due to its broader functionality
  • Might require more computational resources for its diverse capabilities

Code Comparison

TaskMatrix (Python-based approach):

from taskmatrix import TaskMatrix

tm = TaskMatrix()
result = tm.process_image_and_generate_task("image.jpg", "Generate UI code")
print(result)

Screenshot-to-code (JavaScript-based approach):

import { generateCode } from 'screenshot-to-code';

const screenshot = 'path/to/screenshot.png';
const code = await generateCode(screenshot);
console.log(code);

Summary

TaskMatrix offers a more versatile approach to AI-driven tasks, including UI generation, while Screenshot-to-code focuses specifically on translating UI designs into code. TaskMatrix's broader scope may appeal to users needing multi-modal AI capabilities, while Screenshot-to-code might be preferred for its specialized UI code generation. The choice between them depends on the specific use case and desired level of specialization.

Robust Speech Recognition via Large-Scale Weak Supervision

Pros of Whisper

  • Highly accurate speech recognition across multiple languages
  • Versatile model capable of transcription, translation, and language identification
  • Extensive research and development backing from OpenAI

Cons of Whisper

  • Focused solely on audio processing, lacking visual or UI generation capabilities
  • Requires significant computational resources for optimal performance

Code Comparison

While a direct code comparison isn't particularly relevant due to the different nature of these projects, here's a brief example of how each might be used:

Whisper:

import whisper

model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])

Screenshot-to-code:

from screenshot_to_code import generate_code

screenshot = "screenshot.png"
code = generate_code(screenshot)
print(code)

Summary

Whisper excels in audio processing and speech recognition, offering a robust solution for transcription and translation tasks. Screenshot-to-code, on the other hand, focuses on converting visual designs into code, addressing a different set of challenges in the realm of UI development. While both projects showcase impressive AI capabilities, they serve distinct purposes in the developer ecosystem.

PyTorch package for the discrete VAE used for DALL·E.

Pros of DALL-E

  • Generates unique and creative images from text descriptions
  • Capable of producing a wide variety of artistic styles and concepts
  • Useful for brainstorming visual ideas and inspiration

Cons of DALL-E

  • Does not generate functional code or UI elements
  • Limited to image generation, not suitable for web development tasks
  • Requires careful prompt engineering to achieve desired results

Code Comparison

While a direct code comparison is not relevant due to the different nature of these projects, here's a brief overview of how they might be used:

DALL-E (Python API example):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.images.generate(
    prompt="A website homepage for a coffee shop",
    n=1,
    size="1024x1024"
)
image_url = response.data[0].url

Screenshot-to-code (Python usage example):

from screenshot_to_code import generate_code

screenshot_path = "coffee_shop_homepage.png"
generated_code = generate_code(screenshot_path)
print(generated_code)

DALL-E is focused on image generation from text prompts, while Screenshot-to-code aims to convert visual designs into functional code. They serve different purposes in the development process, with DALL-E being more suited for creative ideation and Screenshot-to-code for implementation.

README

screenshot-to-code

A simple tool to convert screenshots, mockups and Figma designs into clean, functional code using AI. Now supporting Claude Sonnet 3.5 and GPT-4o!

https://github.com/abi/screenshot-to-code/assets/23818/6cebadae-2fe3-4986-ac6a-8fb9db030045

Supported stacks:

  • HTML + Tailwind
  • HTML + CSS
  • React + Tailwind
  • Vue + Tailwind
  • Bootstrap
  • Ionic + Tailwind
  • SVG

Supported AI models:

  • Claude Sonnet 3.5 - Best model!
  • GPT-4o - also recommended!
  • GPT-4 Turbo (Apr 2024)
  • GPT-4 Vision (Nov 2023)
  • Claude 3 Sonnet
  • DALL-E 3 for image generation

See the Examples section below for more demos.

We also just added experimental support for taking a video/screen recording of a website in action and turning that into a functional prototype.

Learn more about video here.

Follow me on Twitter for updates.

🚀 Hosted Version

Try it live on the hosted version (paid).

🛠 Getting Started

The app has a React/Vite frontend and a FastAPI backend.
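
As a rough sketch of how that arrangement fits together (illustrative only, not the repository's actual main.py; the /health endpoint below is made up), the backend is a FastAPI app that allows cross-origin requests from the Vite dev server:

# sketch.py - illustrative only, not the project's backend code
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Let the Vite dev server (http://localhost:5173) talk to the backend on port 7001
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:5173"],
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/health")  # hypothetical endpoint, only to show the shape of the app
def health():
    return {"status": "ok"}

# Run with: uvicorn sketch:app --reload --port 7001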

Keys needed:

  • OPENAI_API_KEY (see the FAQ section below for how to get an OpenAI API key)
  • ANTHROPIC_API_KEY (optional, only needed if you want to use the Claude models)

Run the backend (I use Poetry for package management - pip install poetry if you don't have it):

cd backend
echo "OPENAI_API_KEY=sk-your-key" > .env
poetry install
poetry shell
poetry run uvicorn main:app --reload --port 7001

If you want to use Anthropic, add ANTHROPIC_API_KEY to backend/.env. You can also set up the keys using the settings dialog on the front-end (click the gear icon after loading the frontend).

Run the frontend:

cd frontend
yarn
yarn dev

Open http://localhost:5173 to use the app.

If you prefer to run the backend on a different port, update VITE_WS_BACKEND_URL in frontend/.env.local

For debugging purposes, if you don't want to waste GPT4-Vision credits, you can run the backend in mock mode (which streams a pre-recorded response):

MOCK=true poetry run uvicorn main:app --reload --port 7001

Docker

If you have Docker installed on your system, in the root directory, run:

echo "OPENAI_API_KEY=sk-your-key" > .env
docker-compose up -d --build

The app will be up and running at http://localhost:5173. Note that you can't develop the application with this setup as the file changes won't trigger a rebuild.

🙋‍♂️ FAQs

  • I'm running into an error when setting up the backend. How can I fix it? Try this. If that still doesn't work, open an issue.
  • How do I get an OpenAI API key? See https://github.com/abi/screenshot-to-code/blob/main/Troubleshooting.md
  • How can I configure an OpenAI proxy? - If you're not able to access the OpenAI API directly (due to e.g. country restrictions), you can try a VPN, or you can point the app at an OpenAI-compatible proxy: set OPENAI_BASE_URL in backend/.env or directly in the UI via the settings dialog. Make sure the URL includes "v1" in the path, so it looks like this: https://xxx.xxxxx.xxx/v1 (see the short sketch after this list).
  • How can I update the backend host that my front-end connects to? - Configure VITE_HTTP_BACKEND_URL and VITE_WS_BACKEND_URL in frontend/.env.local. For example, set VITE_HTTP_BACKEND_URL=http://124.10.20.1:7001
  • Seeing UTF-8 errors when running the backend? - On Windows, open the .env file with Notepad++, then go to Encoding and select UTF-8.
  • How can I provide feedback? For feedback, feature requests and bug reports, open an issue or ping me on Twitter.
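
To illustrate the proxy and backend-URL points above (a minimal sketch, not code from this repository): an OpenAI-compatible proxy works because the client simply swaps its base URL, which is why the configured value must end in /v1.

# Illustration only: how OPENAI_BASE_URL is typically consumed by the OpenAI client.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    # e.g. OPENAI_BASE_URL=https://my-proxy.example.com/v1  (note the trailing /v1)
    base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
)

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.choices[0].message.content)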

📚 Examples

NYTimes

Original and replica screenshots are shown side by side in the repository README.

Instagram page (with not Taylor Swift pics)

https://github.com/abi/screenshot-to-code/assets/23818/503eb86a-356e-4dfc-926a-dabdb1ac7ba1

Hacker News but it gets the colors wrong at first so we nudge it

https://github.com/abi/screenshot-to-code/assets/23818/3fec0f77-44e8-4fb3-a769-ac7410315e5d

🌍 Hosted Version

🆕 Try it here (paid). Or see Getting Started for local install instructions to use with your own API keys.