browser-use/web-ui

Run AI Agent in your browser.

Top Related Projects

  • Playwright: A framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
  • Puppeteer (90,111 GitHub stars): JavaScript API for Chrome and Firefox
  • Selenium (31,345 GitHub stars): A browser automation framework and ecosystem.
  • Cypress (47,669 GitHub stars): Fast, easy and reliable testing for anything that runs in a browser.
  • WebdriverIO: Next-gen browser and mobile automation test framework for Node.js
  • Appium (19,258 GitHub stars): Cross-platform automation framework for all kinds of apps, built on top of the W3C WebDriver protocol

Quick Overview

The browser-use/web-ui repository provides a Gradio-based web interface for running browser-use AI agents in your browser. It supports multiple LLM providers, lets you use your own browser, and can keep the browser session open between AI tasks.

Pros

  • Gradio-based interface that supports most browser-use functionality
  • Works with multiple LLM providers (Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, Ollama)
  • Can drive your own browser, avoiding repeated logins
  • Local and Docker installation options

Cons

  • Requires Python 3.11 or higher and a Playwright browser install for local use
  • Requires API keys for the LLM provider you choose
  • Using your own browser requires closing all Chrome windows first

Code Examples

Creating a custom button component:

import { LitElement, html, css } from 'lit';

class CustomButton extends LitElement {
  static styles = css`
    button {
      padding: 10px 20px;
      background-color: #007bff;
      color: white;
      border: none;
      border-radius: 4px;
      cursor: pointer;
    }
  `;

  render() {
    return html`
      <button @click=${this._handleClick}>
        <slot></slot>
      </button>
    `;
  }

  _handleClick() {
    this.dispatchEvent(new CustomEvent('button-click'));
  }
}

customElements.define('custom-button', CustomButton);

Using the modal component:

import { Modal } from '@browser-use/web-ui';

const modal = new Modal({
  title: 'Welcome',
  content: 'This is a modal dialog',
  onClose: () => console.log('Modal closed')
});

modal.open();

Implementing a responsive grid layout:

<div class="grid-container">
  <div class="grid-item">Item 1</div>
  <div class="grid-item">Item 2</div>
  <div class="grid-item">Item 3</div>
</div>

<style>
  .grid-container {
    display: grid;
    grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
    gap: 20px;
  }
  .grid-item {
    background-color: #f0f0f0;
    padding: 20px;
    text-align: center;
  }
</style>

Getting Started

browser-use/web-ui is a Python application, not an npm package. To start using it:

  1. Clone the repository:

    git clone https://github.com/browser-use/web-ui.git
    cd web-ui
    
  2. Create a virtual environment and install the dependencies:

    uv venv --python 3.11
    source .venv/bin/activate
    uv pip install -r requirements.txt
    playwright install --with-deps chromium
    
  3. Copy the example environment file, add your API keys, and start the WebUI:

    cp .env.example .env
    python webui.py --ip 127.0.0.1 --port 7788
    

For more detailed instructions, including Windows and Docker setups, refer to the Installation Guide in the README.

Competitor Comparisons

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

Pros of Playwright

  • Supports multiple browsers (Chromium, Firefox, WebKit) out of the box
  • Provides a powerful API for automating web browsers and testing web applications
  • Has extensive documentation and a large, active community

Cons of Playwright

  • Steeper learning curve due to its comprehensive feature set
  • Requires more setup and configuration for basic tasks
  • May be overkill for simple web scraping or automation tasks

Code Comparison

Playwright:

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await browser.close();
})();

web-ui:

const { launch } = require('web-ui');

(async () => {
  const browser = await launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await browser.close();
})();

Summary

Playwright is a more comprehensive solution for browser automation and testing, offering multi-browser support and a rich API. However, it may be more complex for simple tasks. web-ui appears to be a simpler alternative, potentially easier to set up and use for basic web automation, but with fewer features and less extensive documentation.

Puppeteer (90,111 GitHub stars)

JavaScript API for Chrome and Firefox

Pros of Puppeteer

  • More comprehensive and feature-rich API for browser automation
  • Extensive documentation and large community support
  • Built-in support for generating PDFs and screenshots

Cons of Puppeteer

  • Heavier resource usage due to full browser automation
  • Steeper learning curve for beginners
  • Requires Node.js environment to run

Code Comparison

Web-UI example:

const browser = await launch();
const page = await browser.newPage();
await page.goto('https://example.com');
const title = await page.title();
await browser.close();

Puppeteer example:

const puppeteer = require('puppeteer');

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
const title = await page.title();
await browser.close();

While the basic usage appears similar, Puppeteer offers more advanced features and methods for complex automation tasks. Web-UI focuses on simplicity and ease of use for basic web interactions, making it potentially more accessible for beginners or simpler use cases. However, Puppeteer's extensive capabilities make it a more powerful tool for comprehensive browser automation and testing scenarios.

Selenium (31,345 GitHub stars)

A browser automation framework and ecosystem.

Pros of Selenium

  • Mature and widely adopted framework with extensive documentation and community support
  • Supports multiple programming languages (Java, Python, C#, etc.)
  • Offers cross-browser testing capabilities

Cons of Selenium

  • Can be complex to set up and maintain, especially for beginners
  • Slower execution compared to newer, lightweight alternatives
  • Requires separate WebDriver installations and management

Code Comparison

Selenium (Python):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com")
element = driver.find_element(By.ID, "my-element")
element.click()
driver.quit()

web-ui (JavaScript):

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.click('#my-element');
  await browser.close();
})();

The code comparison shows that Selenium requires more setup and explicit WebDriver management, while web-ui (using Playwright) offers a more streamlined approach with built-in browser automation. Selenium's syntax is more verbose, whereas web-ui provides a more concise and modern API for browser interactions.

Cypress (47,669 GitHub stars)

Fast, easy and reliable testing for anything that runs in a browser.

Pros of Cypress

  • More comprehensive end-to-end testing framework with built-in assertion library
  • Extensive documentation and active community support
  • Real-time reloading and debugging capabilities

Cons of Cypress

  • Limited cross-browser testing (primarily focused on Chrome)
  • Steeper learning curve for beginners
  • Potential performance issues with large test suites

Code Comparison

Cypress example:

describe('Login Test', () => {
  it('should log in successfully', () => {
    cy.visit('/login')
    cy.get('#username').type('testuser')
    cy.get('#password').type('password123')
    cy.get('#submit').click()
    cy.url().should('include', '/dashboard')
  })
})

Web-UI example:

const { test } = require('@playwright/test');

test('Login Test', async ({ page }) => {
  await page.goto('/login');
  await page.fill('#username', 'testuser');
  await page.fill('#password', 'password123');
  await page.click('#submit');
  await page.waitForURL('**/dashboard');
});

The code comparison shows that both frameworks allow for similar test scenarios, but Cypress uses its own custom commands and assertions, while Web-UI relies on Playwright's API. Cypress's syntax is more concise and readable for simple tests, but Web-UI offers more flexibility for complex scenarios and cross-browser testing.

Next-gen browser and mobile automation test framework for Node.js

Pros of WebdriverIO

  • More comprehensive and feature-rich automation framework
  • Larger community and better documentation
  • Supports multiple test frameworks (Mocha, Jasmine, Cucumber)

Cons of WebdriverIO

  • Steeper learning curve for beginners
  • More complex setup and configuration
  • Potentially slower execution due to its extensive feature set

Code Comparison

WebdriverIO:

describe('My Login application', () => {
    it('should login with valid credentials', async () => {
        await browser.url(`https://the-internet.herokuapp.com/login`);
        await $('#username').setValue('tomsmith');
        await $('#password').setValue('SuperSecretPassword!');
        await $('button[type="submit"]').click();
        await expect($('#flash')).toBeExisting();
    });
});

web-ui:

const { browser } = require('web-ui');

(async () => {
    await browser.goto('https://example.com');
    await browser.type('#username', 'user123');
    await browser.click('#submit');
    console.log(await browser.text('h1'));
})();

The code comparison shows that WebdriverIO uses a more structured testing approach with describe and it blocks, while web-ui offers a simpler, more straightforward syntax for browser automation tasks. WebdriverIO's code is more verbose but provides better readability and organization for complex test suites.

Appium (19,258 GitHub stars)

Cross-platform automation framework for all kinds of apps, built on top of the W3C WebDriver protocol

Pros of Appium

  • Broader support for multiple platforms (iOS, Android, Windows)
  • Larger community and more extensive documentation
  • More robust and feature-rich for mobile automation testing

Cons of Appium

  • Steeper learning curve and more complex setup
  • Slower test execution compared to native frameworks
  • Requires more resources and can be less stable in some scenarios

Code Comparison

Appium (JavaScript):

const { remote } = require('webdriverio');

// opts: Appium server address and capabilities (defined elsewhere)
const driver = await remote(opts);
await driver.$('~myButton').click();
await driver.deleteSession();

web-ui (JavaScript):

const browser = await launch();
const page = await browser.newPage();
await page.goto('https://example.com');
await page.click('#myButton');
await browser.close();

Both repositories focus on browser automation and testing, but Appium is primarily designed for mobile app testing across multiple platforms, while web-ui appears to be more focused on web browser automation. Appium offers a more comprehensive solution for cross-platform mobile testing, but may be overkill for simple web automation tasks. web-ui seems to provide a simpler approach for web-specific automation, potentially with a lower barrier to entry for web developers.

README

Browser Use Web UI

This project builds on browser-use, which is designed to make websites accessible for AI agents.

We would like to officially thank WarmShao for his contribution to this project.

WebUI: Built on Gradio, the UI supports most browser-use functionality. It is designed to be user-friendly and enables easy interaction with the browser agent.

Expanded LLM Support: We've integrated support for various Large Language Models (LLMs), including Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama. We plan to add support for even more models in the future.

Custom Browser Support: You can use your own browser with our tool, eliminating the need to re-login to sites or deal with other authentication challenges. This feature also supports high-definition screen recording.

Persistent Browser Sessions: You can choose to keep the browser window open between AI tasks, allowing you to see the complete history and state of AI interactions.

Installation Guide

Prerequisites

  • Python 3.11 or higher
  • Git (for cloning the repository)

Option 1: Local Installation

Read the quickstart guide or follow the steps below to get started.

Step 1: Clone the Repository

git clone https://github.com/browser-use/web-ui.git
cd web-ui

Step 2: Set Up Python Environment

We recommend using uv for managing the Python environment.

Using uv (recommended):

uv venv --python 3.11

Activate the virtual environment:

  • Windows (Command Prompt):
.venv\Scripts\activate
  • Windows (PowerShell):
.\.venv\Scripts\Activate.ps1
  • macOS/Linux:
source .venv/bin/activate
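
After activating the environment, it can help to confirm that the interpreter meets the Python 3.11 prerequisite and that a virtual environment is really active. The following is a minimal sketch using only the standard library; the helper names `meets_minimum` and `in_virtualenv` are illustrative and not part of this project.

```python
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the prerequisites


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment and
    # sys.base_prefix at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", meets_minimum())
    print("Virtual environment active:", in_virtualenv())
```

If either check prints False, repeat the activation step for your shell before installing dependencies.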

Step 3: Install Dependencies

Install Python packages:

uv pip install -r requirements.txt

Install Playwright browsers. To install a specific browser, run:

playwright install --with-deps chromium

To install all browsers:

playwright install

Step 4: Configure Environment

  1. Create a copy of the example environment file:
  • Windows (Command Prompt):
copy .env.example .env
  • macOS/Linux/Windows (PowerShell):
cp .env.example .env
  2. Open .env in your preferred text editor and add your API keys and other settings
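
Once the .env file has been edited, a quick check can show which provider keys are actually filled in. This is a rough sketch, not part of the project: the parser handles only simple KEY=VALUE lines, the key names are the ones listed in the Docker Setup section of this README, and treating `your_key_here` as "unset" is an assumption based on the sample values shown there.

```python
from pathlib import Path

# Key names taken from the Docker Setup section of this README.
EXPECTED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment, then surrounding whitespace and quotes.
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def configured_keys(values, expected=EXPECTED_KEYS, placeholder="your_key_here"):
    """Return the expected keys that hold something other than the placeholder."""
    return [k for k in expected if values.get(k) and values[k] != placeholder]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        found = configured_keys(parse_env(env_file.read_text()))
        print("Configured provider keys:", ", ".join(found) or "none")
    else:
        print(".env not found; copy .env.example first")
```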

Option 2: Docker Installation

Prerequisites

  • Docker and Docker Compose installed

Installation Steps

  1. Clone the repository:
git clone https://github.com/browser-use/web-ui.git
cd web-ui
  2. Create and configure environment file:
  • Windows (Command Prompt):
copy .env.example .env
  • macOS/Linux/Windows (PowerShell):
cp .env.example .env

Edit .env with your preferred text editor and add your API keys.

  3. Run with Docker:
# Build and start the container with default settings (browser closes after AI tasks)
docker compose up --build
# Or run with persistent browser (browser stays open between AI tasks)
CHROME_PERSISTENT_SESSION=true docker compose up --build
  4. Access the Application:
  • Web Interface: Open http://localhost:7788 in your browser
  • VNC Viewer (for watching browser interactions): Open http://localhost:6080/vnc.html
    • Default VNC password: "youvncpassword"
    • Can be changed by setting VNC_PASSWORD in your .env file
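
If neither page loads after the container starts, one way to narrow the problem down is to test whether the two published ports accept TCP connections. The sketch below uses only the standard library; `is_port_open` is a hypothetical helper, and the port numbers are the ones given in the steps above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports from the steps above: 7788 (web interface) and 6080 (noVNC viewer).
    for name, port in (("Web interface", 7788), ("noVNC viewer", 6080)):
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A port that is not reachable usually means the container is still starting or the port mapping differs from the defaults described here.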

Usage

Local Setup

  1. Run the WebUI: After completing the installation steps above, start the application:
    python webui.py --ip 127.0.0.1 --port 7788
    
  2. WebUI options:
    • --ip: The IP address to bind the WebUI to. Default is 127.0.0.1.
    • --port: The port to bind the WebUI to. Default is 7788.
    • --theme: The theme for the user interface. Default is Ocean.
      • Default: The standard theme with a balanced design.
      • Soft: A gentle, muted color scheme for a relaxed viewing experience.
      • Monochrome: A grayscale theme with minimal color for simplicity and focus.
      • Glass: A sleek, semi-transparent design for a modern appearance.
      • Origin: A classic, retro-inspired theme for a nostalgic feel.
      • Citrus: A vibrant, citrus-inspired palette with bright and fresh colors.
      • Ocean (default): A blue, ocean-inspired theme providing a calming effect.
    • --dark-mode: Enables dark mode for the user interface.
  3. Access the WebUI: Open your web browser and navigate to http://127.0.0.1:7788.
  4. Using Your Own Browser (Optional):
    • Set CHROME_PATH to the executable path of your browser and CHROME_USER_DATA to the user data directory of your browser. Leave CHROME_USER_DATA empty if you want to use local user data.
      • Windows
         CHROME_PATH="C:\Program Files\Google\Chrome\Application\chrome.exe"
         CHROME_USER_DATA="C:\Users\YourUsername\AppData\Local\Google\Chrome\User Data"
        

        Note: Replace YourUsername with your actual Windows username.

      • Mac
         CHROME_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
         CHROME_USER_DATA="/Users/YourUsername/Library/Application Support/Google/Chrome"
        
    • Close all Chrome windows
    • Open the WebUI in a non-Chrome browser, such as Firefox or Edge. This is important because the persistent browser context will use the Chrome data when running the agent.
    • Check the "Use Own Browser" option within the Browser Settings.
  5. Keep Browser Open (Optional):
    • Set CHROME_PERSISTENT_SESSION=true in the .env file.
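
The command-line options listed in step 2 can be pictured as a small argparse definition. This is an illustrative sketch only and not the project's actual webui.py: the option names, defaults, and theme list come from this README, while `build_parser` and the help strings are assumptions.

```python
import argparse

# Theme names as listed in this README.
THEMES = ["Default", "Soft", "Monochrome", "Glass", "Origin", "Citrus", "Ocean"]


def build_parser():
    """Build a parser mirroring the documented --ip, --port, --theme and --dark-mode options."""
    parser = argparse.ArgumentParser(description="Illustrative WebUI launcher options")
    parser.add_argument("--ip", default="127.0.0.1", help="IP address to bind the WebUI to")
    parser.add_argument("--port", type=int, default=7788, help="Port to bind the WebUI to")
    parser.add_argument("--theme", default="Ocean", choices=THEMES, help="UI theme")
    parser.add_argument("--dark-mode", action="store_true", help="Enable dark mode")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Would serve on http://{args.ip}:{args.port} "
          f"(theme={args.theme}, dark mode={args.dark_mode})")
```

With no arguments, the parsed values match the defaults documented above: 127.0.0.1, 7788, and the Ocean theme.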

Docker Setup

  1. Environment Variables:

    • All configuration is done through the .env file
    • Available environment variables:
      # LLM API Keys
      OPENAI_API_KEY=your_key_here
      ANTHROPIC_API_KEY=your_key_here
      GOOGLE_API_KEY=your_key_here
      
      # Browser Settings
      CHROME_PERSISTENT_SESSION=true   # Set to true to keep browser open between AI tasks
      RESOLUTION=1920x1080x24         # Custom resolution format: WIDTHxHEIGHTxDEPTH
      RESOLUTION_WIDTH=1920           # Custom width in pixels
      RESOLUTION_HEIGHT=1080          # Custom height in pixels
      
      # VNC Settings
      VNC_PASSWORD=your_vnc_password  # Optional, defaults to "vncpassword"
      
  2. Platform Support:

    • Supports both AMD64 and ARM64 architectures
    • For ARM64 systems (e.g., Apple Silicon Macs), the container will automatically use the appropriate image
  3. Browser Persistence Modes:

    • Default Mode (CHROME_PERSISTENT_SESSION=false):

      • Browser opens and closes with each AI task
      • Clean state for each interaction
      • Lower resource usage
    • Persistent Mode (CHROME_PERSISTENT_SESSION=true):

      • Browser stays open between AI tasks
      • Maintains history and state
      • Allows viewing previous AI interactions
      • Set in .env file or via environment variable when starting container
  4. Viewing Browser Interactions:

    • Access the noVNC viewer at http://localhost:6080/vnc.html
    • Enter the VNC password (default: "vncpassword" or what you set in VNC_PASSWORD)
    • Direct VNC access available on port 5900 (mapped to container port 5901)
    • You can now see all browser interactions in real-time
  5. Container Management:

    # Start with persistent browser
    CHROME_PERSISTENT_SESSION=true docker compose up -d
    
    # Start with default mode (browser closes after tasks)
    docker compose up -d
    
    # View logs
    docker compose logs -f
    
    # Stop the container
    docker compose down
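
The browser settings described under Environment Variables and Browser Persistence Modes can be read along these lines. This is a sketch under stated assumptions: `BrowserSettings`, `parse_resolution`, and `load_browser_settings` are hypothetical names, the fallback resolution is simply the example value from this README, and the container's real startup logic may differ.

```python
import os
from dataclasses import dataclass


@dataclass
class BrowserSettings:
    persistent: bool
    width: int
    height: int
    depth: int


def parse_resolution(value):
    """Split a WIDTHxHEIGHTxDEPTH string such as '1920x1080x24' into three ints."""
    parts = value.lower().split("x")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected WIDTHxHEIGHTxDEPTH, got {value!r}")
    width, height, depth = (int(p) for p in parts)
    return width, height, depth


def load_browser_settings(env=None):
    """Read CHROME_PERSISTENT_SESSION and RESOLUTION from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    # Assumed fallback: the example resolution shown in this README.
    width, height, depth = parse_resolution(env.get("RESOLUTION", "1920x1080x24"))
    # Default mode is non-persistent, as described under Browser Persistence Modes.
    persistent = env.get("CHROME_PERSISTENT_SESSION", "false").strip().lower() == "true"
    return BrowserSettings(persistent, width, height, depth)


if __name__ == "__main__":
    print(load_browser_settings())
```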
    

Changelog

  • 2025/01/26: Thanks to @vvincent1234, browser-use-webui can now be combined with DeepSeek-r1 for deep thinking!
  • 2025/01/10: Thanks to @casistack, we now have a Docker Setup option and support for keeping the browser open between tasks. Video tutorial demo.
  • 2025/01/06: Thanks to @richard-devbot, a new and well-designed WebUI has been released. Video tutorial demo.