windmill

Open-source developer platform to power your entire infra and turn scripts into webhooks, workflows and UIs. Fastest workflow engine (13x vs Airflow). Open-source alternative to Retool and Temporal.

Top Related Projects

prefect

19,925

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

dagster

13,694

An orchestration platform for the development, production, and observation of data assets.

airflow

41,350

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

great_expectations

10,608

Always know what to expect from your data.

mage-ai

8,434

🧙 Build, run, and manage data pipelines for integrating and transforming data.

ploomber

3,597

The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

Quick Overview

Windmill is an open-source developer platform designed to turn scripts into workflows and UIs. It allows developers to build internal tools and automation quickly using their preferred programming languages. Windmill aims to bridge the gap between development and operations by providing a collaborative environment for creating and managing workflows.

Pros

Supports multiple programming languages (Python, Deno/TypeScript, Bash, Go)
Provides a user-friendly interface for creating and managing workflows
Offers built-in scheduling and triggering capabilities
Enables easy collaboration and sharing of scripts and workflows

Cons

Relatively new project, which may lead to potential stability issues
Limited ecosystem compared to more established workflow automation tools
Steeper learning curve for users not familiar with scripting or workflow concepts
May require additional setup and maintenance compared to fully managed solutions

Code Examples

Creating a simple Python script in Windmill (the entrypoint is a function named main, and its parameters become the script's inputs):

def main(name: str):
    # The return value is serialized as JSON and becomes the job result
    return f"Hello, {name}!"

Defining a TypeScript script whose input schema is inferred from the signature of main:

export async function main(message: string) {
    // No explicit schema is needed: the parameter types are parsed from the signature
    return { result: message }
}

Sketching a simple two-step workflow (illustrative YAML only, not Windmill's exact flow file format):

summary: Simple Workflow Example
schema:
  type: object
  properties:
    name:
      type: string
steps:
  - id: greet
    script: scripts/hello.py
    args:
      name: ${{ inputs.name }}
  - id: log
    script: scripts/log.ts
    args:
      message: ${{ steps.greet.output }}

Getting Started

To get started with Windmill:

Install Windmill using Docker:

docker run -d -p 8000:8000 ghcr.io/windmill-labs/windmill:main

Open http://localhost:8000 in your browser and create an account.
Create a new workspace and start building scripts and workflows using the web interface.

Use the Windmill CLI for local development:

npm install -g windmill-cli
wmill workspace add
wmill sync pull

Start creating and pushing scripts to your Windmill instance.

Competitor Comparisons

prefect

19,925

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

Pros of Prefect

More mature project with a larger community and ecosystem
Extensive documentation and tutorials for easier onboarding
Advanced features like distributed execution and real-time monitoring

Cons of Prefect

Steeper learning curve due to more complex architecture
Requires more setup and configuration for basic workflows
Heavier resource usage, which may be overkill for simpler projects

Code Comparison

Prefect workflow example (Prefect 2+ API):

from prefect import flow, task

@task
def say_hello(name):
    print(f"Hello, {name}!")

@flow(name="My Flow")
def my_flow():
    say_hello("World")

my_flow()

Windmill workflow example:

# A Windmill script is a plain function named main; no decorator or runner call is needed
def main(name: str = "World"):
    print(f"Hello, {name}!")

Both Prefect and Windmill provide workflow orchestration capabilities, but Prefect offers a more comprehensive suite of features for complex, distributed workflows. Windmill, on the other hand, focuses on simplicity and ease of use, making it more suitable for smaller projects or teams new to workflow automation. The code comparison shows that Windmill's syntax is more straightforward, while Prefect's approach allows for more detailed flow definitions and task dependencies.

dagster

13,694

An orchestration platform for the development, production, and observation of data assets.

Pros of Dagster

More mature and established project with a larger community and ecosystem
Offers advanced features like asset-based orchestration and software-defined assets
Provides a rich UI for monitoring and debugging data pipelines

Cons of Dagster

Steeper learning curve due to its comprehensive feature set
Requires more setup and configuration compared to simpler alternatives
Can be overkill for small-scale data projects or simple workflows

Code Comparison

Dagster:

from dagster import job, op

@op
def process_data():
    # Data processing logic here
    pass

@job
def my_job():
    process_data()

Windmill:

# The entrypoint is the function named main
def main():
    # Data processing logic here
    pass

Both Dagster and Windmill are workflow orchestration tools, but they cater to different scales and complexities. Dagster is more suited for large-scale data engineering projects with complex dependencies and requirements. It offers a comprehensive set of features for building, testing, and monitoring data pipelines. Windmill, on the other hand, is designed to be more lightweight and user-friendly, making it easier to get started with for smaller projects or teams new to workflow orchestration. While Dagster provides more advanced features, Windmill focuses on simplicity and ease of use, allowing users to quickly create and deploy workflows without extensive setup or configuration.

airflow

41,350

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Pros of Airflow

Mature and widely adopted project with a large community and extensive ecosystem
Supports a wide range of integrations and operators for various data sources and tools
Offers advanced features like dynamic DAG generation and complex dependency management

Cons of Airflow

Steeper learning curve and more complex setup compared to Windmill
Can be resource-intensive, especially for smaller projects or teams
Configuration and DAG definitions can become verbose and difficult to maintain

Code Comparison

Airflow DAG definition:

from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

def hello_world():
    print("Hello, World!")

dag = DAG('hello_world', start_date=datetime(2023, 1, 1), schedule_interval='@daily')
task = PythonOperator(task_id='hello_task', python_callable=hello_world, dag=dag)

Windmill script:

# The entrypoint is the function named main
def main():
    print("Hello, World!")

The Airflow example demonstrates a more complex setup with explicit DAG and task definitions, while the Windmill script is more concise and focuses on the core functionality. Windmill's approach may be easier for beginners or smaller projects, but Airflow's structure provides more flexibility for complex workflows.

great_expectations

10,608

Always know what to expect from your data.

Pros of Great Expectations

Focused on data quality and validation, providing a comprehensive framework for data testing
Extensive documentation and community support
Integrates well with various data platforms and tools

Cons of Great Expectations

Steeper learning curve due to its specialized focus on data validation
May be overkill for simpler workflow automation tasks
Less flexibility for general-purpose scripting and task automation

Code Comparison

Great Expectations:

import great_expectations as ge

df = ge.read_csv("my_data.csv")
df.expect_column_values_to_be_between("age", min_value=0, max_value=120)

Windmill:

# rows is a list of records such as {"age": 42}
def main(rows: list):
    assert all(0 <= row["age"] <= 120 for row in rows)

Summary

Great Expectations is a specialized tool for data quality and validation, offering robust features for ensuring data integrity. It has a strong community and extensive documentation but may have a steeper learning curve.

Windmill, on the other hand, is a more general-purpose workflow automation tool that can handle a variety of tasks, including data processing. It offers greater flexibility for scripting but may lack some of the specialized data validation features of Great Expectations.

The choice between the two depends on the specific needs of the project, with Great Expectations being more suitable for dedicated data quality assurance, while Windmill is better for broader workflow automation tasks.

mage-ai

8,434

🧙 Build, run, and manage data pipelines for integrating and transforming data.

Pros of Mage-ai

More focused on data integration and ETL processes
Provides a visual interface for building data pipelines
Offers built-in data quality checks and monitoring features

Cons of Mage-ai

Less flexible for general-purpose workflow automation
May have a steeper learning curve for non-data professionals
Limited support for custom scripting languages compared to Windmill

Code Comparison

Mage-ai pipeline definition:

@data_loader
def load_data():
    return pd.read_csv('data.csv')

@transformer
def transform_data(df):
    return df.groupby('category').sum()

@data_exporter
def export_data(df):
    df.to_csv('output.csv')

Windmill script:

import wmill
from mymodule import process_data

def main(arg1: str, arg2: int):
    data = wmill.get_resource("my_data")
    result = process_data(data, arg1, arg2)
    wmill.set_variable("processed_result", result)

Both Mage-ai and Windmill offer workflow automation capabilities, but they cater to different use cases. Mage-ai is more specialized for data engineering tasks, while Windmill provides a more general-purpose automation platform. The choice between the two depends on the specific requirements of your project and the expertise of your team.

ploomber

3,597

The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

Pros of Ploomber

Focused on data science workflows and pipeline orchestration
Supports multiple programming languages (Python, R, SQL)
Integrates well with Jupyter notebooks and existing data science tools

Cons of Ploomber

Less emphasis on general-purpose automation and scripting
May have a steeper learning curve for non-data scientists
Limited built-in integrations compared to Windmill

Code Comparison

Ploomber pipeline definition:

from ploomber import DAG

dag = DAG()

dag.add_task(
    source='clean.py',
    product='clean.parquet',
    name='clean'
)

Windmill script definition:

def main(rows: list):
    # Data cleaning logic here: drop empty records
    cleaned_data = [row for row in rows if row]
    return cleaned_data

Both projects aim to simplify workflow management, but Ploomber is more tailored for data science pipelines, while Windmill offers a broader range of automation capabilities. Ploomber's syntax is designed around DAGs and data products, whereas Windmill uses a more general task-based approach. The choice between them depends on the specific use case and the team's familiarity with data science concepts.

README

Open-source developer infrastructure for internal tools (APIs, background jobs, workflows and UIs). Self-hostable alternative to Retool, Pipedream, Superblocks and a simplified Temporal with autogenerated UIs and custom UIs to trigger workflows and scripts as internal apps.

Scripts are turned into sharable UIs automatically, and can be composed together into flows or used in richer apps built with low-code. Supported script languages are: Python, TypeScript, Go, Bash, SQL, and GraphQL.

Try it - Docs - Discord - Hub - Contributor's guide

Windmill - Developer platform for APIs, background jobs, workflows and UIs

Windmill is fully open-sourced (AGPLv3) and Windmill Labs offers dedicated instance and commercial support and licenses.

https://github.com/windmill-labs/windmill/assets/122811744/0b132cd1-ee67-4505-822f-0c7ee7104252

Main Concepts

Define a minimal and generic script in Python, TypeScript, Go or Bash that solves a specific task. The code can be defined in the provided Web IDE or synchronized with your own GitHub repo (e.g. through VS Code extension):
Your script's parameters are automatically parsed to generate a frontend.
Make it flow! You can chain your scripts or scripts made by the community shared on WindmillHub.
Build complex UIs on top of your scripts and flows.
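
To make the idea of parameters being parsed into a frontend concrete, here is a minimal sketch that derives a JSON-Schema-like description from a Python function signature. It is not Windmill's actual parser: the helper name signature_to_schema and the type mapping are assumptions made for illustration only.

```python
import inspect
from typing import Any, Dict

# Illustrative subset of Python types mapped to JSON Schema type names.
_JSON_TYPES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    dict: "object",
    list: "array",
}


def signature_to_schema(func) -> Dict[str, Any]:
    """Hypothetical helper: build a schema from a function's parameters."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        has_default = param.default is not inspect.Parameter.empty
        # Without an annotation, infer the type from the default value.
        if annotation is inspect.Parameter.empty and has_default:
            annotation = type(param.default)
        prop = {"type": _JSON_TYPES.get(annotation, "object")}
        if has_default:
            prop["default"] = param.default
        else:
            required.append(name)
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def main(name: str, count: int = 1, verbose=False):
    return f"Hello, {name}! " * count


schema = signature_to_schema(main)
print(schema["required"])
```

A form generator could then render one input per entry in properties and mark the entries listed in required as mandatory.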

Scripts and flows can also be triggered by a cron schedule (e.g. '*/5 * * * *') or through webhooks.
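
For readers unfamiliar with cron syntax, the sketch below expands a single cron field into the values it matches, using the minute field of a five-field expression such as */5 * * * *. The function expand_cron_field is a hypothetical helper that handles only the common *, range, list and step forms; it says nothing about how Windmill itself parses schedules.

```python
from typing import List


def expand_cron_field(field: str, low: int, high: int) -> List[int]:
    """Expand one cron field (e.g. '*/5') into the sorted values it matches."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)


# The first of the five fields is the minute field (0-59).
minute_field = "*/5 * * * *".split()[0]
minutes = expand_cron_field(minute_field, 0, 59)
print(minutes)
```

With the minute field set to */5 and every other field set to *, the schedule fires every five minutes.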

You can build your entire infra on top of Windmill!

Show me some actual script code

//import any dependency  from npm
import * as wmill from "windmill-client";
import * as cowsay from "cowsay@1.5.0";

// fill the type, or use the +Resource type to get a type-safe reference to a resource
type Postgresql = {
  host: string;
  port: number;
  user: string;
  dbname: string;
  sslmode: string;
  password: string;
};

export async function main(
  a: number,
  b: "my" | "enum",
  c: Postgresql,
  d = "inferred type string from default arg",
  e = { nested: "object" }
  //f: wmill.Base64
) {
  const email = process.env["WM_EMAIL"];
  // variables are permissioned and by path
  let variable = await wmill.getVariable("f/company-folder/my_secret");
  const lastTimeRun = await wmill.getState();
  // logs are printed and always inspectable
  console.log(cowsay.say({ text: "hello " + email + " " + lastTimeRun }));
  await wmill.setState(Date.now());

  // return is serialized as JSON
  return { foo: d, variable };
}

CLI

We have a powerful CLI to interact with the windmill platform and sync your scripts from local files, GitHub repos and to run scripts and flows on the instance from local commands. See more details.

Running scripts locally

You can easily run your script locally: you only need to pass the right environment variables so that the wmill client library can fetch resources and variables from your instance if necessary. See more: https://www.windmill.dev/docs/advanced/local_development.

To develop and test scripts and flows locally, we recommend using the Windmill VS Code extension: https://www.windmill.dev/docs/cli_local_dev/vscode-extension.
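
As a sketch of what passing the right environment variables can look like, the snippet below checks that the client's connection settings are present before a script is run locally. The variable names BASE_INTERNAL_URL, WM_TOKEN and WM_WORKSPACE are an assumption here, as is the helper missing_client_env; check the local development documentation linked above for the authoritative list.

```python
import os
from typing import List, Mapping, Optional

# Assumed variable names; verify them against the local development docs.
REQUIRED_VARS = ("BASE_INTERNAL_URL", "WM_TOKEN", "WM_WORKSPACE")


def missing_client_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Hypothetical helper: list required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_client_env()
    if missing:
        print("Set these before running locally: " + ", ".join(missing))
    else:
        print("Environment looks complete.")
```

Running this check first gives a clearer error than a failed request from the client library.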

Stack

Postgres as the database.
Backend in Rust with the following highly-available and horizontally scalable architecture:
- Stateless API backend.
- Workers that pull jobs from a queue in Postgres (and later, Kafka or Redis. Upvote #173 if interested).
Frontend in Svelte.
Script executions are sandboxed using Google's nsjail.
JavaScript runtime is the deno_core Rust library (which itself uses rusty_v8 and hence V8 underneath).
TypeScript runtime is Bun and Deno.
Python runtime is python3.
Golang runtime is 1.19.1.
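
The worker model in the list above (stateless workers pulling jobs from a queue table) can be illustrated with a small sketch. This is not Windmill's implementation: the table layout and the functions make_queue, push and pull are invented for illustration, and it uses an in-memory SQLite database from the Python standard library instead of Postgres so that it runs anywhere.

```python
import sqlite3


def make_queue() -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE queue ("
        "id INTEGER PRIMARY KEY, payload TEXT, running INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def push(conn: sqlite3.Connection, payload: str) -> None:
    conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))


def pull(conn: sqlite3.Connection):
    """Claim the oldest job that is not running, or return None if the queue is empty."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM queue WHERE running = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE queue SET running = 1 WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = make_queue()
push(conn, "job-a")
push(conn, "job-b")
first = pull(conn)
second = pull(conn)
third = pull(conn)
print(first, second, third)
```

Selecting and marking a job inside one transaction is what prevents two workers from claiming the same job. Because the API servers keep no job state of their own, adding workers is enough to scale out.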

Fastest Self-Hostable Workflow Engine

We have compared Windmill to other self-hostable workflow engines (Airflow, Prefect & Temporal) and Windmill is the most performant solution for both benchmarks: one flow composed of 40 lightweight tasks & one flow composed of 10 long-running tasks.

All methodology & results on our Benchmarks page.

Security

Sandboxing

Windmill can use nsjail, which is secure enough for production multi-tenant workloads. Do not take our word for it; take fly.io's.

Secrets, credentials and sensitive values

There is one encryption key per workspace to encrypt the credentials and secrets stored in Windmill's K/V store.

In addition, we strongly recommend that you encrypt the whole Postgres database. That is what we do at https://app.windmill.dev.

Performance

Once a job has started, there is no overhead compared to running the same script on the node with its corresponding runner (Deno/Go/Python/Bash). The added latency from a job being pulled from the queue, started, and then having its result sent back to the database is ~50ms. A typical lightweight Deno job will take around 100ms total.

How to self-host

We only provide docker-compose setup here. For more advanced setups, like compiling from source or using without a postgres super user, see Self-Host documentation.

Docker compose

Windmill can be deployed in a single command using three files: docker-compose.yml, Caddyfile and .env.

Make sure Docker is started, and run:

curl https://raw.githubusercontent.com/windmill-labs/windmill/main/docker-compose.yml -o docker-compose.yml
curl https://raw.githubusercontent.com/windmill-labs/windmill/main/Caddyfile -o Caddyfile
curl https://raw.githubusercontent.com/windmill-labs/windmill/main/.env -o .env

docker compose up -d

Go to http://localhost et voilà :)

The default super-admin user is: admin@windmill.dev / changeme.

From there, you can follow the setup app and create other users.

Kubernetes (k8s) and Helm charts

We publish helm charts at: https://github.com/windmill-labs/windmill-helm-charts.

Run from binaries

Each release includes the corresponding binaries for x86_64. You can simply download the latest windmill binary using the following set of bash commands.

BINARY_NAME='windmill-amd64' # or windmill-ee-amd64 for the enterprise edition
LATEST_RELEASE=$(curl -L -s -H 'Accept: application/json' https://github.com/windmill-labs/windmill/releases/latest)
LATEST_VERSION=$(echo $LATEST_RELEASE | sed -e 's/.*"tag_name":"\([^"]*\)".*/\1/')
ARTIFACT_URL="https://github.com/windmill-labs/windmill/releases/download/$LATEST_VERSION/$BINARY_NAME"
wget "$ARTIFACT_URL" -O windmill
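
The same URL construction can be done with a JSON parser instead of sed, which is less sensitive to field order and spacing. This is a sketch under assumptions: the function artifact_url is hypothetical, and the sample response and its version string are made up to stand in for what the releases/latest request returns.

```python
import json


def artifact_url(latest_release_json: str, binary_name: str = "windmill-amd64") -> str:
    """Hypothetical helper: build the download URL from the release JSON."""
    tag = json.loads(latest_release_json)["tag_name"]
    return (
        "https://github.com/windmill-labs/windmill/releases/download/"
        f"{tag}/{binary_name}"
    )


# Made-up sample response; a real one carries the actual latest tag.
sample_response = '{"id": 1, "tag_name": "v0.0.0-example"}'
url = artifact_url(sample_response)
print(url)
```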

OAuth, SSO & SMTP

Windmill Community Edition lets you configure OAuth and SSO (including Google Workspace SSO, Microsoft/Azure and Okta) directly from the UI in the superadmin settings. Note that there is a limit of 10 SSO users on the Community Edition.

See documentation.

Commercial license

See the LICENSE file for the full license text.

The "Community Edition" of Windmill available in the docker images hosted under ghcr.io/windmill-labs/windmill and the github binary releases contains the files under the AGPLv3 and Apache 2 sources but also includes proprietary and non-public code and features which are not open source and under the following terms: Windmill Labs, Inc. grants a right to use all the features of the "Community Edition" for free without restrictions other than the limits and quotas set in the software and a right to distribute the community edition as is but not to sell, resell, serve Windmill as a managed service, modify or wrap under any form without an explicit agreement.

The binary compilable from source code in this repository without the "enterprise" feature flag is open-source under the LICENSE-AGPLv3 License terms and conditions.

To re-expose directly any Windmill parts to your users as a feature of your product, with the exception of iframed public Windmill "apps", or to build a feature on top of "Windmill Community Edition" that you sell commercially or embed in a distributable product or binary, you must get a commercial license. Contact us at sales@windmill.dev if you have any questions. To do the same from the binary compiled from the source code in this repository without the "enterprise" feature flag, you must comply with the AGPLv3 license terms and conditions or get a commercial license from Windmill Labs, Inc.

To use Windmill "Community Edition" as is internally in your organization, or to use its APIs as is, you do NOT need a commercial license.

Integrations

In Windmill, integrations are referred to as resources and resource types. Each Resource has a Resource Type that defines the schema that the resource needs to implement.
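
The relationship between a resource and its resource type can be sketched as a schema check. The example below is illustrative only: it reuses the field names of the Postgresql type from the script example earlier in this README, while validate_resource and the dictionary-based schema are assumptions rather than Windmill's actual format.

```python
from typing import Any, Dict, List

# Field names taken from the Postgresql type in the script example above.
POSTGRESQL_FIELDS: Dict[str, type] = {
    "host": str,
    "port": int,
    "user": str,
    "dbname": str,
    "sslmode": str,
    "password": str,
}


def validate_resource(resource: Dict[str, Any], fields: Dict[str, type]) -> List[str]:
    """Hypothetical helper: return the ways a resource violates its type."""
    errors = []
    for name, expected in fields.items():
        if name not in resource:
            errors.append(f"missing field: {name}")
        elif not isinstance(resource[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return errors


good = {
    "host": "localhost",
    "port": 5432,
    "user": "postgres",
    "dbname": "windmill",
    "sslmode": "disable",
    "password": "changeme",
}
bad = {**good, "port": "5432"}
del bad["password"]

print(validate_resource(good, POSTGRESQL_FIELDS))
print(validate_resource(bad, POSTGRESQL_FIELDS))
```

A script that declares a parameter of a given resource type can then rely on every field of that type being present.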

On self-hosted instances, you might want to import all the approved resource types from WindmillHub. A setup script will prompt you to have them synced automatically every day.

Environment Variables

Environment Variable name	Default	Description	Api Server/Worker/All
DATABASE_URL		The Postgres database url.	All
WORKER_GROUP	default	The worker group the worker belongs to and from which it pulls its configuration	Worker
MODE	standalone	The mode of the binary. Possible values: standalone, worker, server, agent	All
METRICS_ADDR	None	(ee only) The socket addr at which to expose Prometheus metrics at the /metrics path. Set to "true" to expose it on port 8001	All
JSON_FMT	false	Output the logs in json format instead of logfmt	All
BASE_URL	http://localhost:8000	The base url that is exposed publicly to access your instance. Is overridden by the instance settings if any.	Server
ZOMBIE_JOB_TIMEOUT	30	The timeout after which a job is considered to be a zombie if the worker did not send pings about processing the job (each server checks for zombie jobs every 30s)	Server
RESTART_ZOMBIE_JOBS	true	If true then a zombie job is restarted (in-place with the same uuid and some logs), if false the zombie job is failed	Server
SLEEP_QUEUE	50	The number of ms to sleep in between the last check for new jobs in the DB. It is multiplied by NUM_WORKERS such that in average, for one worker instance, there is one pull every SLEEP_QUEUE ms.	Worker
KEEP_JOB_DIR	false	Keep the job directory after the job is done. Useful for debugging.	Worker
LICENSE_KEY (EE only)	None	License key checked at startup for the Enterprise Edition of Windmill	Worker
SLACK_SIGNING_SECRET	None	The signing secret of your Slack app. See Slack documentation	Server
COOKIE_DOMAIN	None	The domain of the cookie. If not set, the cookie will be set by the browser based on the full origin	Server
DENO_PATH	/usr/bin/deno	The path to the deno binary.	Worker
PYTHON_PATH		The path to the python binary if wanting to not have it managed by uv.	Worker
GO_PATH	/usr/bin/go	The path to the go binary.	Worker
GOPRIVATE		The GOPRIVATE env variable to use private go modules	Worker
GOPROXY		The GOPROXY env variable to use	Worker
NETRC		The netrc content to use a private go registry	Worker
PY_CONCURRENT_DOWNLOADS	20	Sets the maximum number of in-flight concurrent python downloads that windmill will perform at any given time.	Worker
PATH	None	The path environment variable, usually inherited	Worker
HOME	None	The home directory to use for Go and Bash , usually inherited	Worker
DATABASE_CONNECTIONS	50 (Server)/3 (Worker)	The max number of connections in the database connection pool	All
SUPERADMIN_SECRET	None	A token that would let the caller act as a virtual superadmin superadmin@windmill.dev	Server
TIMEOUT_WAIT_RESULT	20	The number of seconds to wait before timeout on the 'run_wait_result' endpoint	Worker
QUEUE_LIMIT_WAIT_RESULT	None	The max number of jobs in the queue before immediately rejecting the request in the 'run_wait_result' endpoint. Takes precedence over the query arg. If none is specified, there is no limit.	Worker
DENO_AUTH_TOKENS	None	Custom DENO_AUTH_TOKENS to pass to worker to allow the use of private modules	Worker
DISABLE_RESPONSE_LOGS	false	Disable response logs	Server
CREATE_WORKSPACE_REQUIRE_SUPERADMIN	true	If true, only superadmins can create new workspaces	Server
MIN_FREE_DISK_SPACE_MB	15000	Minimum amount of free space on worker. Sends critical alert if worker has less free space.	Worker
RUN_UPDATE_CA_CERTIFICATE_AT_START	false	If true, runs CA certificate update command at startup before other initialization	All
RUN_UPDATE_CA_CERTIFICATE_PATH	/usr/sbin/update-ca-certificates	Path to the CA certificate update command/script to run when RUN_UPDATE_CA_CERTIFICATE_AT_START is true	All
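
The SLEEP_QUEUE row describes a small piece of arithmetic that is easy to misread, so here is a sketch of it. The function per_worker_sleep_ms is hypothetical, and the fallback of 1 for NUM_WORKERS is an assumption for illustration, since the table does not list a default for it.

```python
import os
from typing import Mapping, Optional


def per_worker_sleep_ms(env: Optional[Mapping[str, str]] = None) -> int:
    """Sleep between queue checks for one worker, per the SLEEP_QUEUE description."""
    env = os.environ if env is None else env
    sleep_queue = int(env.get("SLEEP_QUEUE", "50"))  # default from the table
    num_workers = int(env.get("NUM_WORKERS", "1"))   # assumed fallback
    # Each worker sleeps SLEEP_QUEUE * NUM_WORKERS, so that the instance
    # as a whole averages one pull every SLEEP_QUEUE ms.
    return sleep_queue * num_workers


print(per_worker_sleep_ms({"SLEEP_QUEUE": "50", "NUM_WORKERS": "4"}))  # → 200
```

With four workers on an instance, each one checks the queue every 200 ms, which still averages to one pull every 50 ms for the instance.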

Run a local dev setup

Using Nix (Recommended).

See the ./frontend/README_DEV.md file for all running options.

Frontend only

This will use the backend of https://app.windmill.dev but your own frontend with hot-code reloading. Note that you will need to use a username / password login due to CSRF checks using a different auth provider.

In the frontend/ directory:

install the dependencies with npm install (or pnpm install or yarn)
generate the windmill client:

npm run generate-backend-client
## on mac use
npm run generate-backend-client-mac

Run your dev server with npm run dev
Et voilà, windmill should be available at http://localhost/

Backend + Frontend

See the ./frontend/README_DEV.md file for all running options.

Start a local Postgres database, for instance with the start-dev-db.sh script, which makes a database available at postgres://postgres:changeme@localhost:5432/windmill. Then run the migrations using the following command:
```
cargo install sqlx-cli
env DATABASE_URL=<YOUR_DATABASE_URL> sqlx migrate run
```
This will also avoid compile time issue with sqlx's query! macro.
(optional, linux only) Install nsjail and have it accessible in your PATH
Install bun, deno and python3 (+ any languages you want to use), have the bins at /usr/bin/bun,/usr/bin/deno, and /usr/local/bin/python3 or set the corresponding environment variables.
(optional) Install the lld linker
Go to frontend/:
1. npm install, npm run generate-backend-client then REMOTE=http://localhost:8000 npm run dev
2. You might need to set some extra heap space for the node runtime export NODE_OPTIONS="--max-old-space-size=4096"
3. Create an empty frontend/build folder using mkdir frontend/build
Go to backend/:
1. env DATABASE_URL=<YOUR_DATABASE_URL> RUST_LOG=info cargo run
2. You can specify any feature flag you want to enable, for example cargo run --features python to enable the python executor.
Et voilà, windmill should be available at http://localhost:3000

Copyright

Windmill Labs, Inc 2023
