Top Related Projects
Robust Speech Recognition via Large-Scale Weak Supervision
Port of OpenAI's Whisper model in C/C++
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Faster Whisper transcription with CTranslate2
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
Code for the paper "Jukebox: A Generative Model for Music"
Quick Overview
OpenV0 is an open-source project aimed at creating a fully open and reproducible vision-language model. It combines a vision encoder and a language model to generate text descriptions of images. The project is in its early stages and seeks community contributions to improve and expand its capabilities.
Pros
- Fully open-source and transparent, allowing for community-driven development and customization
- Combines vision and language models for image-to-text generation
- Provides a foundation for researchers and developers to build upon and experiment with vision-language models
- Encourages collaboration and knowledge sharing in the AI community
Cons
- Still in early development stages, with limited functionality compared to more established models
- May require significant computational resources for training and inference
- Documentation and examples are currently limited, potentially making it challenging for newcomers to get started
- Performance and accuracy may not yet match that of proprietary or more mature vision-language models
Code Examples
Here are a few code examples demonstrating the usage of OpenV0:
- Loading the model and tokenizer:
from openv0 import OpenV0, OpenV0Tokenizer

model = OpenV0.from_pretrained("path/to/model")
tokenizer = OpenV0Tokenizer.from_pretrained("path/to/tokenizer")
- Generating a caption for an image:
from PIL import Image
image = Image.open("path/to/image.jpg")
caption = model.generate_caption(image, tokenizer)
print(caption)
- Fine-tuning the model on custom data:
from openv0 import OpenV0Trainer

# train_dataset and eval_dataset are assumed to be prepared beforehand
trainer = OpenV0Trainer(model, tokenizer, train_dataset, eval_dataset)
trainer.train()
Getting Started
To get started with OpenV0, follow these steps:
- Install the library:
pip install openv0
- Download the pre-trained model and tokenizer:
from openv0 import OpenV0, OpenV0Tokenizer
model = OpenV0.from_pretrained("openv0/openv0-base")
tokenizer = OpenV0Tokenizer.from_pretrained("openv0/openv0-base")
- Generate a caption for an image:
from PIL import Image
image = Image.open("path/to/your/image.jpg")
caption = model.generate_caption(image, tokenizer)
print(caption)
Note: As the project is in its early stages, make sure to check the official repository for the most up-to-date installation and usage instructions.
Competitor Comparisons
Robust Speech Recognition via Large-Scale Weak Supervision
Pros of Whisper
- More mature and widely adopted project with extensive documentation
- Supports a broader range of languages and accents
- Offers pre-trained models for immediate use
Cons of Whisper
- Larger model size, requiring more computational resources
- Less flexibility for customization and fine-tuning
- Primarily focused on speech recognition, with limited additional features
Code Comparison
Whisper:
import whisper
model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])
OpenV0:
from openv0 import OpenV0
model = OpenV0()
transcription = model.transcribe("audio.mp3")
print(transcription)
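For finer-grained control, Whisper also exposes a lower-level API; the snippet below follows its README and shows explicit language detection:

import whisper

model = whisper.load_model("base")

# load audio and pad/trim it to fit 30 seconds of input
audio = whisper.load_audio("audio.mp3")
audio = whisper.pad_or_trim(audio)

# make a log-Mel spectrogram and move it to the model's device
mel = whisper.log_mel_spectrogram(audio).to(model.device)

# detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")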
Key Differences
- Whisper is specifically designed for speech recognition, while OpenV0 aims to be a more general-purpose AI model
- OpenV0 focuses on lightweight implementation and ease of use
- Whisper offers multiple model sizes, whereas OpenV0 provides a single, compact model
Use Cases
- Whisper: Ideal for production-ready speech recognition tasks across various languages
- OpenV0: Suitable for developers seeking a simple, customizable AI model for diverse applications
Port of OpenAI's Whisper model in C/C++
Pros of whisper.cpp
- Highly optimized C++ implementation, offering excellent performance
- Supports various platforms and architectures, including mobile devices
- Provides both command-line and library interfaces for flexibility
Cons of whisper.cpp
- Limited to speech recognition tasks, while OpenV0 offers a broader range of AI capabilities
- Requires more technical expertise to set up and use compared to OpenV0's user-friendly interface
- Less focus on integration with other AI models or services
Code Comparison
whisper.cpp:
#include "whisper.h"
int main(int argc, char ** argv) {
struct whisper_context * ctx = whisper_init_from_file("ggml-base.en.bin");
whisper_full_default(ctx, wparams, pcmf32.data(), pcmf32.size());
whisper_print_timings(ctx);
whisper_free(ctx);
}
OpenV0:
from openv0 import OpenV0
client = OpenV0()
response = client.chat(
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response['choices'][0]['message']['content'])
The code snippets demonstrate the different focus areas of the two projects. whisper.cpp is specifically designed for speech recognition tasks, while OpenV0 provides a more general-purpose AI interface for various tasks, including natural language processing.
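The command-line interface mentioned above needs no code at all; this invocation follows the whisper.cpp README (the example binary is named main in older releases and whisper-cli in newer ones):

# transcribe a 16 kHz WAV file with the base English model
./main -m models/ggml-base.en.bin -f samples/jfk.wav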
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Pros of WhisperX
- Specialized in audio transcription and alignment, offering more advanced features for this specific task
- Provides word-level timestamps and speaker diarization capabilities
- Actively maintained with regular updates and improvements
Cons of WhisperX
- Limited to audio processing tasks, lacking the broader AI capabilities of OpenV0
- May require more computational resources for advanced features like speaker diarization
- Less flexible for general-purpose AI applications compared to OpenV0's modular approach
Code Comparison
WhisperX:
import whisperx

model = whisperx.load_model("large-v2", device="cuda")
audio = whisperx.load_audio("audio.mp3")
result = model.transcribe(audio)
# word-level alignment with a language-specific alignment model
align_model, metadata = whisperx.load_align_model(language_code=result["language"], device="cuda")
result = whisperx.align(result["segments"], align_model, metadata, audio, "cuda")
OpenV0:
from openv0 import OpenV0
ai = OpenV0()
result = ai.run("Transcribe the audio file 'audio.mp3' and provide timestamps.")
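Returning to WhisperX: since speaker diarization is called out above, the diarization step from the WhisperX README looks like the following (HF_TOKEN is a placeholder for a Hugging Face access token, which the pyannote models require):

# assign speaker labels to the aligned transcript from the example above
diarize_model = whisperx.DiarizationPipeline(use_auth_token=HF_TOKEN, device="cuda")
diarize_segments = diarize_model(audio)
result = whisperx.assign_word_speakers(diarize_segments, result)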
While WhisperX offers more specialized audio processing capabilities, OpenV0 provides a more versatile and user-friendly interface for various AI tasks, including audio transcription. WhisperX may be preferred for projects requiring advanced audio analysis, while OpenV0 is better suited for general-purpose AI applications with its modular and extensible architecture.
Faster Whisper transcription with CTranslate2
Pros of faster-whisper
- Optimized for speed, offering faster transcription performance
- Supports multiple languages and provides language detection
- Implements efficient CPU and GPU inference
Cons of faster-whisper
- Focused solely on speech recognition, lacking broader AI capabilities
- May require more setup and dependencies for optimal performance
- Limited to audio input, not designed for multi-modal tasks
Code Comparison
faster-whisper:
from faster_whisper import WhisperModel
model = WhisperModel("large-v2", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.mp3", beam_size=5)
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
openv0:
from openv0 import OpenV0
agent = OpenV0()
response = agent.run("Describe the image and transcribe any speech in it",
                     image="image.jpg", audio="audio.mp3")
print(response)
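The info object returned by transcribe in the faster-whisper snippet above also carries the language-detection result; this line comes from the faster-whisper README:

# language detection results ride along on the TranscriptionInfo object
print("Detected language '%s' with probability %f" % (info.language, info.language_probability))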
Summary
faster-whisper excels in speech recognition tasks, offering optimized performance and multi-language support. However, it's limited to audio processing. openv0, on the other hand, provides a more versatile AI agent capable of handling multiple modalities, including both image and audio inputs. While faster-whisper may offer superior speed for pure transcription tasks, openv0 provides a broader range of AI capabilities in a single package.
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
Pros of Whisper
- Optimized for performance with GPU acceleration
- Supports multiple languages and provides language detection
- Offers both streaming and batch processing capabilities
Cons of Whisper
- Limited to speech recognition and transcription tasks
- Requires more setup and configuration compared to OpenV0
- May have higher system requirements due to GPU optimization
Code Comparison
OpenV0:
from openv0 import OpenV0
client = OpenV0()
response = client.chat("Tell me a joke")
print(response)
Whisper:
#include "whisper.h"
whisper_context * ctx = whisper_init_from_file("model.bin");
whisper_full_params params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);
whisper_full(ctx, params, pcm, n_samples);
Summary
Whisper focuses on high-performance speech recognition with GPU acceleration, supporting multiple languages and offering both streaming and batch processing. OpenV0, on the other hand, provides a more general-purpose AI interface with a simpler setup process. While Whisper excels in speech-related tasks, OpenV0 offers broader functionality for various AI applications. The choice between the two depends on the specific requirements of the project and the desired balance between performance and ease of use.
Code for the paper "Jukebox: A Generative Model for Music"
Pros of Jukebox
- More advanced and established project for AI music generation
- Backed by OpenAI, with extensive research and documentation
- Capable of generating high-quality, multi-instrumental music samples
Cons of Jukebox
- Requires significant computational resources to run
- Less focused on real-time generation or interactive use
- More complex to set up and use for non-technical users
Code Comparison
Jukebox:
import jukebox
from jukebox.make_models import make_vqvae, make_prior

# hps is a hyperparameters object assumed to be set up beforehand;
# the calls are simplified for illustration
vqvae = make_vqvae(hps)
prior = make_prior(hps)
OpenV0:
from openv0 import OpenV0
model = OpenV0()
output = model.generate("Generate a happy melody")
Key Differences
- Jukebox is specifically designed for music generation, while OpenV0 is a more general-purpose AI model
- OpenV0 aims for easier integration and use in various applications
- Jukebox offers more control over musical elements, while OpenV0 focuses on natural language prompts
- OpenV0 is designed for real-time generation, whereas Jukebox typically requires more processing time
README
openv0
project website - openv0.com
openv0 is a generative UI component framework
It lets you generate and iterate on UI components with AI, with live preview.
- openv0 makes use of open source component libraries and icons to build a library of assets for the generative pipeline.
- openv0 is highly modular and structured for elaborate generative processes
- Component generation is a multipass pipeline, where every pass is a fully independent plugin (a minimal sketch follows below)
(say hi @n_raidenai)
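To make the multipass idea concrete, here is a minimal sketch of an independent pass and a pipeline runner. All names here (annotatePass, runPipeline, the context shape) are illustrative assumptions, not the actual openv0 plugin API; see the repository for the real plugin structure.

// Minimal sketch of a multipass-style pass and runner; names and the context
// shape are illustrative assumptions, not the actual openv0 plugin API.
async function annotatePass(context) {
  // Each pass takes the current generation context and returns an updated
  // copy, so passes stay fully independent and can be swapped or reordered.
  return {
    ...context,
    component: `// generated by a multipass pipeline\n${context.component}`,
  };
}

async function runPipeline(passes, context) {
  // Apply each pass in order over the shared context.
  for (const pass of passes) {
    context = await pass(context);
  }
  return context;
}

// Example: run a one-pass pipeline over a generated component string.
runPipeline([annotatePass], { component: "export default () => <div/>;" })
  .then((result) => console.log(result.component));

Because each pass only sees and returns the context, validation, import-fixing, or styling passes can be developed and plugged in independently.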
Currently Supported
- Frontend frameworks
  - React
  - Next.js
  - Svelte
- UI libraries
  - NextUI
  - Flowbite
  - Shadcn
- Icon libraries
  - Lucide
The latest openv0 update makes it easier to integrate new frameworks, libraries and plugins.
Docs & guides on how to do so will be soon posted.
Next updates:
- public explore+share web app on openv0.com (you can use the openv0 share API already)
- multimodal UIray vision model (more details soon)
- better validation passes, more integrations & plugins
Demos
Current version
https://github.com/raidendotai/openv0/assets/127366981/a249cf0d-ae44-4155-a5c1-fc2528bf05b5
Install
- Open your terminal and run
npx openv0@latest
It will download openv0, configure it based on your choices & install dependencies. Then:
- Start the local server + webapp:
  - start the server: cd server && node api.js
  - start the webapp: cd webapp && npm run dev
- Open your web browser and go to http://localhost:5173/
That is all. Have fun!
Alternatively - you can also clone this repo and install manually
To do so :
- Clone repo, run npm i in server/
- Unzip server/library/icons/lucide/vectordb/index.zip into that same folder
- Configure your OpenAI key in server/.env (an example .env is sketched after this list)
- Web app starter templates are in webapps-starters/
  - run npm i in the web app starter of your choice
  - make sure that the WEBAPP_ROOT variable in server/.env matches your webapp folder path
- Start the server with node api.js and the web app with npm run dev
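For reference, server/.env would contain entries along these lines. WEBAPP_ROOT is named in the steps above; the OPENAI_API_KEY variable name and the example path are assumptions, so check the repo's env template:

# server/.env - example sketch; the OPENAI_API_KEY variable name and the
# WEBAPP_ROOT path are assumptions, check the repo's env template
OPENAI_API_KEY=sk-...
WEBAPP_ROOT=../webapps-starters/react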
Try openv0
You can try openv0 (using React as a framework) with minimal configuration below
- Replit
- StackBlitz
How It Works
Multipass Workflow
A simple explanation is given by the multipass workflow diagram in the repository README.
Codebase
A YouTube video by user @elie2222 explains parts of the previous openv0 codebase.