
web-infra-dev/midscene

Let AI be your browser operator.


Top Related Projects

Central repository for tools, tutorials, resources, and documentation for robotics simulation in Unity.


A toolkit for developing and comparing reinforcement learning algorithms.

A modular high-level library to train embodied AI agents across a variety of tasks and environments.


Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.

Quick Overview

Midscene is an open-source 3D scene editor designed for web applications. It provides a user-friendly interface for creating and manipulating 3D scenes directly in the browser, making it easier for developers and designers to work with 3D content without the need for complex desktop software.

Pros

  • Browser-based, making it accessible across different platforms
  • User-friendly interface for easy 3D scene creation and editing
  • Integrates well with web technologies and frameworks
  • Open-source, allowing for community contributions and customization

Cons

  • May have performance limitations compared to desktop 3D editing software
  • Potentially limited feature set compared to more established 3D editors
  • Dependency on browser capabilities and WebGL support
  • Learning curve for users new to 3D scene editing

Code Examples

// Initialize Midscene
const scene = new Midscene.Scene();
const renderer = new Midscene.Renderer(document.getElementById('canvas'));

// Add a 3D object to the scene
const cube = new Midscene.Cube();
scene.add(cube);

// Render the scene
renderer.render(scene);

// Add lighting to the scene
const light = new Midscene.PointLight();
light.position.set(0, 5, 10);
scene.add(light);

// Apply material to an object
const material = new Midscene.StandardMaterial({
  color: '#ff0000',
  metalness: 0.5,
  roughness: 0.5
});
cube.material = material;

// Add user interaction
const controls = new Midscene.OrbitControls(renderer.camera, renderer.domElement);

// Animate the scene
function animate() {
  requestAnimationFrame(animate);
  cube.rotation.y += 0.01;
  renderer.render(scene);
}
animate();

Getting Started

  1. Include Midscene in your project:

    <script src="https://cdn.jsdelivr.net/npm/midscene@latest/dist/midscene.min.js"></script>
    
  2. Create a canvas element in your HTML:

    <canvas id="scene-canvas"></canvas>
    
  3. Initialize Midscene and create a basic scene:

    const scene = new Midscene.Scene();
    const renderer = new Midscene.Renderer(document.getElementById('scene-canvas'));
    const cube = new Midscene.Cube();
    scene.add(cube);
    renderer.render(scene);
    
  4. Run your web application and see the 3D scene in action!

Competitor Comparisons

Central repository for tools, tutorials, resources, and documentation for robotics simulation in Unity.

Pros of Unity-Robotics-Hub

  • Comprehensive robotics simulation environment with ROS integration
  • Extensive documentation and tutorials for robotics development
  • Active community support and regular updates

Cons of Unity-Robotics-Hub

  • Steeper learning curve for non-Unity developers
  • Limited to Unity engine, which may not be suitable for all robotics projects
  • Requires more computational resources for complex simulations

Code Comparison

Unity-Robotics-Hub:

using Unity.Robotics.ROSTCPConnector;
using RosMessageTypes.Geometry;

public class RobotController : MonoBehaviour
{
    ROSConnection ros;
    public string topicName = "cmd_vel";
}

Midscene:

import { Scene, PerspectiveCamera, WebGLRenderer } from 'three';
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader';

const scene = new Scene();
const camera = new PerspectiveCamera(75, window.innerWidth / window.innerHeight, 0.1, 1000);

The Unity-Robotics-Hub code snippet demonstrates ROS integration and robot control, while the Midscene code focuses on 3D scene setup using Three.js. Unity-Robotics-Hub is more specialized for robotics, whereas Midscene is a general-purpose 3D visualization tool.


A toolkit for developing and comparing reinforcement learning algorithms.

Pros of gym

  • Well-established and widely used in the reinforcement learning community
  • Extensive documentation and tutorials available
  • Supports a wide range of environments for various RL tasks

Cons of gym

  • Primarily focused on reinforcement learning, limiting its use in other domains
  • Can be complex for beginners to set up and use effectively
  • Some environments may require additional dependencies

Code Comparison

gym:

import gym
env = gym.make('CartPole-v1')
observation = env.reset()
for _ in range(1000):
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)

midscene:

import { Scene, Cube } from 'midscene';
const scene = new Scene();
scene.add(new Cube());
scene.render();
scene.export('scene.gltf');

Key Differences

  • gym is Python-based and focused on reinforcement learning environments
  • midscene is JavaScript-based and designed for 3D scene creation and manipulation
  • gym provides a standardized interface for RL algorithms
  • midscene offers tools for creating and exporting 3D scenes

Use Cases

  • gym: Ideal for researchers and developers working on reinforcement learning projects
  • midscene: Suitable for web developers creating 3D scenes for visualization or game development

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Pros of Habitat-lab

  • Comprehensive 3D simulation platform for embodied AI research
  • Extensive documentation and tutorials for easy onboarding
  • Large community support and active development

Cons of Habitat-lab

  • Steeper learning curve due to complex architecture
  • Higher computational requirements for running simulations
  • Limited flexibility for custom environments outside its predefined scenarios

Code Comparison

Habitat-lab example:

import habitat
env = habitat.Env(
    config=habitat.get_config("benchmark/nav/pointnav/pointnav_gibson.yaml")
)
observations = env.reset()

Midscene example:

from midscene import MidScene
scene = MidScene()
scene.load("path/to/scene.json")

Summary

Habitat-lab is a robust platform for embodied AI research with extensive features and community support, while Midscene appears to be a simpler tool for scene manipulation. Habitat-lab offers more comprehensive simulation capabilities but may require more resources and learning time. Midscene seems more lightweight and potentially easier to integrate for specific scene-related tasks.


Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.

Pros of bullet3

  • More mature and widely adopted physics engine with extensive documentation
  • Supports a broader range of physics simulations, including soft body dynamics
  • Offers better performance for large-scale simulations

Cons of bullet3

  • Steeper learning curve due to its complexity and extensive feature set
  • Larger codebase and potentially higher resource requirements

Code Comparison

bullet3:

btDefaultCollisionConfiguration* collisionConfiguration = new btDefaultCollisionConfiguration();
btCollisionDispatcher* dispatcher = new btCollisionDispatcher(collisionConfiguration);
btBroadphaseInterface* overlappingPairCache = new btDbvtBroadphase();
btSequentialImpulseConstraintSolver* solver = new btSequentialImpulseConstraintSolver;
btDiscreteDynamicsWorld* dynamicsWorld = new btDiscreteDynamicsWorld(dispatcher, overlappingPairCache, solver, collisionConfiguration);

midscene:

import { Scene, Cube } from 'midscene';

const scene = new Scene();
scene.addObject(new Cube({ position: [0, 0, 0], size: [1, 1, 1] }));
scene.render();

Note: The code comparison highlights the difference in complexity and setup between the two libraries. bullet3 requires more detailed configuration for its physics simulation, while midscene offers a simpler, higher-level API for scene creation and rendering.


README

Midscene.js

Let AI be your browser operator.


Midscene.js lets AI be your browser operator 🤖. Just describe what you want to do in natural language, and it will help you operate web pages, validate content, and extract data. Whether you want a quick experience or deep development, you can get started easily.

Showcases

The following demo video was recorded with the UI-TARS 7B SFT model and has not been sped up.

Instructions shown in the demo videos:

  • Post a Tweet
  • Use JS code to drive task orchestration, collect information about Jay Chou's concert, and write it into Google Docs

📢 New open-source model choice - UI-TARS and Qwen2.5-VL

Besides the default model GPT-4o, we have added two new recommended open-source models to Midscene.js: UI-TARS and Qwen2.5-VL. (Yes, open source!) They are dedicated models for image recognition and UI automation, known for performing well in UI automation scenarios. Read more about it in Choose a model.

💡 Features

  • Natural Language Interaction 👆: Just describe your goals and steps, and Midscene will plan and operate the user interface for you.
  • Chrome Extension Experience 🖥️: Start experiencing immediately through the Chrome extension, no coding required.
  • Puppeteer/Playwright Integration 🔧: Supports Puppeteer and Playwright integration, allowing you to combine AI capabilities with these powerful automation tools.
  • Support Open-Source Models 🤖: Supports private deployment of UI-TARS and Qwen2.5-VL, which outperform closed-source models like GPT-4o and Claude in UI automation scenarios while better protecting data security.
  • Support General Models 🌟: Supports general large models like GPT-4o and Claude, adapting to various scenario needs.
  • Visual Reports for Debugging 🎞️: Through our test reports and Playground, you can easily understand, replay and debug the entire process.
  • Support Caching 🔄: The first time a task is executed through AI, the result is cached; subsequent runs of the same task are significantly faster.
  • Completely Open Source 🔥: Enjoy a whole new automation development experience!
  • Understand UI, JSON Format Responses 🔍: You can specify data format requirements and receive responses in JSON format.
  • Intuitive Assertions 🤔: Express your assertions in natural language, and AI will understand and process them.
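The caching feature above can be sketched in plain JavaScript: a minimal, hypothetical `TaskCache` (not Midscene's actual implementation) that keys results by the task description, so the expensive AI planning call runs only once per distinct instruction:

```javascript
// Minimal sketch of instruction-level caching (hypothetical, not Midscene's real cache).
// The first run of a task description pays the full AI planning cost;
// repeated runs of the same description reuse the stored plan.
class TaskCache {
  constructor() {
    this.store = new Map();
    this.hits = 0;
    this.misses = 0;
  }

  async run(taskDescription, planWithAI) {
    if (this.store.has(taskDescription)) {
      this.hits += 1;
      return this.store.get(taskDescription); // replay the cached plan
    }
    this.misses += 1;
    const plan = await planWithAI(taskDescription); // expensive model call
    this.store.set(taskDescription, plan);
    return plan;
  }
}

// Usage: a stand-in "planner" that would normally call the model.
const cache = new TaskCache();
const fakePlanner = async (task) => [{ action: 'click', target: task }];

(async () => {
  await cache.run('click the login button', fakePlanner);
  await cache.run('click the login button', fakePlanner); // cache hit
  console.log(cache.hits, cache.misses); // hits=1, misses=1
})();
```

Midscene's real cache also has to account for page state changing between runs; this sketch assumes the same instruction always maps to the same plan.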

✨ Model Choices

  • You can use general-purpose LLMs like gpt-4o, which works well for most cases. gemini-1.5-pro and qwen-vl-max-latest are also supported.
  • You can also use the UI-TARS model, an open-source model dedicated to UI automation. You can deploy it on your own server, which dramatically improves performance and data privacy.
  • Read more about Choose a model
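Model choice in Midscene.js is driven by environment variables. The fragment below is a sketch of the general shape; treat the exact variable names (OPENAI_BASE_URL, MIDSCENE_MODEL_NAME) as assumptions to verify against the Choose a model guide:

```shell
# Default setup: an OpenAI-compatible general-purpose model such as gpt-4o.
export OPENAI_API_KEY="your-api-key"

# Sketch: point at a different OpenAI-compatible endpoint and select a model
# by name (verify these variable names in the "Choose a model" guide).
export OPENAI_BASE_URL="https://your-endpoint.example.com/v1"
export MIDSCENE_MODEL_NAME="qwen-vl-max-latest"
```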

👀 Comparing to ...

There are so many UI automation tools out there, and each one seems to be all-powerful. What's special about Midscene.js?

  • Debugging Experience: You will soon find that debugging and maintaining automation scripts is the real challenge. No matter how magical the demo looks, you still need to debug the process to make it stable over time. Midscene.js offers a visualized report file, a built-in playground, and a Chrome Extension for debugging the entire process. This is what most developers really need, and we're continuing to improve the debugging experience.

  • Open Source, Free, Deploy as You Want: Midscene.js is an open-source project. It's decoupled from any cloud service and model provider, so you can choose either public or private deployment. There is always a suitable plan for your business.

  • Integrate with JavaScript: You can always bet on JavaScript 😎

📄 Resources

🤝 Community

Citation

If you use Midscene.js in your research or project, please cite:

@software{Midscene.js,
  author = {Zhou, Xiao and Yu, Tao},
  title = {Midscene.js: Let AI be your browser operator.},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/web-infra-dev/midscene}
}

📝 License

Midscene.js is MIT licensed.


If this project helps you or inspires you, please give us a ⭐️