chronos
Fault-tolerant job scheduler for Mesos that handles dependencies and ISO8601-based schedules
Top Related Projects
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Flink
Apache Spark - A unified analytics engine for large-scale data processing
Apache Beam - A unified programming model for batch and streaming data processing
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Dagster - An orchestration platform for the development, production, and observation of data assets
Quick Overview
Chronos is a distributed and fault-tolerant scheduler that runs on top of Apache Mesos. It's designed for running complex cron-like jobs in a distributed environment, making it ideal for large-scale, data-intensive applications. Chronos can be used to schedule jobs across a cluster of machines, ensuring high availability and efficient resource utilization.
Pros
- Fault-tolerant and highly available, with automatic failover
- Supports complex job dependencies and scheduling patterns
- Integrates well with other Mesos frameworks and Docker containers
- Provides a RESTful API and web interface for easy management
Cons
- Steep learning curve for users new to distributed systems
- Requires a running Mesos cluster, which adds complexity to the setup
- Documentation can be sparse or outdated in some areas
- Development has slowed down in recent years
Getting Started
To get started with Chronos, follow these steps:
1. Set up an Apache Mesos cluster.
2. Install Chronos on your Mesos master node:
   wget https://github.com/mesos/chronos/releases/download/v3.0.2/chronos-3.0.2.tgz
   tar xzf chronos-3.0.2.tgz
   cd chronos-3.0.2
3. Configure Chronos by editing config/chronos.yml.
4. Start Chronos:
   bin/start-chronos.bash
5. Access the Chronos web interface at http://<mesos-master>:8080.
To create a job using the Chronos API:
curl -L -H 'Content-Type: application/json' -X POST -d '{
  "name": "my-job",
  "command": "echo Hello World",
  "schedule": "R/2014-03-08T20:00:00Z/PT24H"
}' http://<chronos-endpoint>:8080/scheduler/iso8601
This creates a job named "my-job" that runs every 24 hours, starting from March 8, 2014, at 20:00 UTC.
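Chronos also exposes a dependency endpoint for jobs that should run when their parent jobs complete, rather than on a schedule. A minimal sketch in Python using the requests library (endpoint and field names follow the Chronos REST API docs; the host and job names are placeholders):

import requests

# A dependent job runs when all of its parent jobs complete successfully.
job = {
    "name": "my-dependent-job",
    "command": "echo 'parent finished'",
    "parents": ["my-job"],
}
resp = requests.post("http://chronos-endpoint:8080/scheduler/dependency", json=job)
resp.raise_for_status()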
Competitor Comparisons
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Pros of Airflow
- More extensive ecosystem with a wide range of plugins and integrations
- Better support for complex workflows with advanced features like branching and dynamic task generation
- Active development and large community support
Cons of Airflow
- Steeper learning curve due to more complex architecture and concepts
- Higher resource requirements, especially for larger deployments
- Potential overkill for simple scheduling needs
Code Comparison
Chronos task definition:
{
  "name": "my-job",
  "command": "echo hello",
  "schedule": "R/2014-03-08T20:00:00Z/PT24H"
}
Airflow DAG definition:
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime, timedelta

dag = DAG('my_dag', start_date=datetime(2023, 1, 1), schedule=timedelta(days=1))

task = BashOperator(
    task_id='my_task',
    bash_command='echo hello',
    dag=dag
)
Airflow offers more flexibility and power in defining workflows, while Chronos provides a simpler JSON-based configuration for basic scheduling needs. Airflow's Python-based DAGs allow for more complex logic and better integration with other Python libraries and tools.
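As a sketch of the dynamic task generation noted in the pros (Airflow 2.x import paths; the table names are purely illustrative):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Generate one task per table, something a static JSON job definition cannot express.
with DAG("dynamic_example", start_date=datetime(2023, 1, 1), schedule=None) as dag:
    for table in ["users", "orders", "events"]:
        BashOperator(
            task_id=f"export_{table}",
            bash_command=f"echo exporting {table}",
        )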
Apache Flink
Pros of Flink
- Powerful stream processing capabilities with low latency and high throughput
- Supports both batch and stream processing in a unified framework
- Large and active community with frequent updates and improvements
Cons of Flink
- Steeper learning curve due to its complex architecture and concepts
- Requires more resources and configuration for optimal performance
- Less integrated with Mesos ecosystem compared to Chronos
Code Comparison
Flink (Java):
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

DataStream<String> text = env.readTextFile("input.txt");

// Tokenizer is a user-defined FlatMapFunction that emits (word, 1) pairs
DataStream<Tuple2<String, Integer>> counts = text
    .flatMap(new Tokenizer())
    .keyBy(0)
    .sum(1);

counts.print();
Chronos (JSON):
{
  "name": "my-job",
  "command": "echo 'Hello World'",
  "schedule": "R/2014-03-08T20:00:00Z/PT2H"
}
Summary
Flink is a more comprehensive data processing framework with advanced stream processing capabilities, while Chronos is primarily focused on job scheduling within the Mesos ecosystem. Flink offers greater flexibility for complex data processing tasks but may require more setup and resources. Chronos provides simpler job scheduling with tighter Mesos integration but lacks advanced data processing features.
Apache Spark - A unified analytics engine for large-scale data processing
Pros of Spark
- More versatile and powerful for large-scale data processing and analytics
- Supports multiple programming languages (Scala, Java, Python, R)
- Offers a wider range of built-in libraries and APIs for machine learning, graph processing, and streaming
Cons of Spark
- Higher resource requirements and complexity for small-scale tasks
- Steeper learning curve, especially for beginners in distributed computing
- May be overkill for simple job scheduling and workflow management
Code Comparison
Chronos (Job Definition):
{
  "name": "my-job",
  "command": "echo 'Hello World'",
  "schedule": "R/2014-03-08T20:00:00Z/PT24H"
}
Spark (Simple Job):
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("SimpleJob").getOrCreate()
// Read a CSV with a header row so the "value" column is named
val data = spark.read.option("header", "true").csv("input.csv")
data.createOrReplaceTempView("myTable")
val result = spark.sql("SELECT * FROM myTable WHERE value > 10")
result.write.csv("output.csv")
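For comparison, the same job in PySpark, one of the language APIs noted in the pros (a sketch under the same input and column assumptions as the Scala version):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SimpleJob").getOrCreate()
# Read a CSV with a header row so the "value" column is named
data = spark.read.option("header", "true").csv("input.csv")
data.createOrReplaceTempView("myTable")
result = spark.sql("SELECT * FROM myTable WHERE value > 10")
result.write.csv("output.csv")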
Summary
Chronos is focused on distributed and fault-tolerant job scheduling, while Spark is a more comprehensive data processing engine. Chronos is simpler for basic task scheduling, whereas Spark excels in complex data analytics and processing pipelines. The choice between them depends on the specific use case and scale of the project.
Apache Beam is a unified programming model for Batch and Streaming data processing.
Pros of Beam
- Supports multiple programming languages (Java, Python, Go)
- Provides a unified model for batch and streaming data processing
- Has a larger and more active community, with frequent updates
Cons of Beam
- Steeper learning curve due to its more complex programming model
- Requires more setup and configuration compared to Chronos
- May be overkill for simple job scheduling tasks
Code Comparison
Chronos (JSON; jobs are defined declaratively and submitted via the REST API rather than written in code):
{
  "name": "my-job",
  "command": "echo hello",
  "schedule": "R/2014-03-08T20:00:00Z/PT24H"
}
Beam (Java):
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.*;

public class MyPipeline {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();
    p.apply(Create.of("Hello", "World"))
     .apply(ParDo.of(new DoFn<String, String>() {
       @ProcessElement
       public void processElement(ProcessContext c) {
         c.output(c.element()); // processing logic here
       }
     }));
    p.run().waitUntilFinish();
  }
}
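The equivalent pipeline in Beam's Python SDK, one of the multiple language SDKs noted in the pros (a minimal sketch; upper-casing stands in for real processing logic):

import apache_beam as beam

# The pipeline executes when the context manager exits.
with beam.Pipeline() as p:
    (p
     | beam.Create(["Hello", "World"])
     | beam.Map(str.upper)  # processing logic here
     | beam.Map(print))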
Chronos is focused on job scheduling and execution in distributed systems, while Beam provides a more comprehensive data processing framework. Chronos offers simplicity for scheduling tasks, whereas Beam excels in complex data processing pipelines. The code examples illustrate the difference in complexity and focus between the two projects.
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Pros of Luigi
- More flexible and extensible, supporting a wider range of task types and data sources
- Better suited for complex data pipelines and workflows
- Larger and more active community, with more frequent updates and contributions
Cons of Luigi
- Steeper learning curve due to its more complex architecture
- Requires more setup and configuration compared to Chronos' simpler approach
- Less integrated with cluster management systems like Mesos
Code Comparison
Luigi task example:
import luigi

class MyTask(luigi.Task):
    def requires(self):
        # SomeOtherTask is another luigi.Task this task depends on
        return SomeOtherTask()

    def run(self):
        pass  # Task logic here
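A task like this can then be run with Luigi's in-process scheduler (a minimal sketch, assuming the MyTask class above):

import luigi

# Resolve dependencies and run MyTask with the local scheduler.
luigi.build([MyTask()], local_scheduler=True)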
Chronos job example:
{
  "name": "my-job",
  "command": "echo hello",
  "schedule": "R/2014-03-08T20:00:00Z/PT2H"
}
Luigi focuses on defining tasks and their dependencies in Python, allowing for more complex workflows. Chronos uses a simpler JSON configuration for defining jobs and schedules, which is easier to set up but less flexible for complex scenarios.
Dagster - An orchestration platform for the development, production, and observation of data assets
Pros of Dagster
- More active development with frequent updates and releases
- Broader scope, supporting full data pipeline orchestration beyond just scheduling
- Extensive documentation and growing community support
Cons of Dagster
- Steeper learning curve due to more complex architecture
- Requires more setup and configuration compared to Chronos' simplicity
- May be overkill for simple scheduling needs
Code Comparison
Chronos job definition:
{
  "name": "my-job",
  "command": "echo Hello",
  "schedule": "R/2014-03-08T20:00:00Z/PT24H"
}
Dagster job definition:
from dagster import job, op, schedule

@op
def say_hello():
    print("Hello")

@job
def my_job():
    say_hello()

@schedule(cron_schedule="0 0 * * *", job=my_job)
def daily_hello_schedule(context):
    return {}
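For local testing, a Dagster job can also be executed directly in-process (a sketch using Dagster's execute_in_process API on the job defined above):

# Run the job without a full Dagster deployment; useful for quick tests.
result = my_job.execute_in_process()
assert result.success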
Summary
Dagster offers a more comprehensive data orchestration solution with active development and community support, while Chronos provides a simpler, Mesos-focused scheduling tool. Dagster's broader scope comes with increased complexity, while Chronos excels in straightforward job scheduling. Choose based on your specific needs and infrastructure requirements.
README
Chronos 
Chronos is a replacement for cron. It is a distributed and fault-tolerant scheduler that runs on top of Apache Mesos and can be used for job orchestration. It supports custom Mesos executors as well as the default command executor; thus, by default, Chronos executes sh (on most systems, bash) scripts.
Chronos can be used to interact with systems such as Hadoop (including EMR), even if the Mesos agents on which execution happens do not have Hadoop installed. Included wrapper scripts allow transferring files and executing them on a remote machine in the background, using asynchronous callbacks to notify Chronos of job completion or failure. Chronos is also natively able to schedule jobs that run inside Docker containers.
Chronos has a number of advantages over regular cron. It allows you to schedule your jobs using ISO8601 repeating interval notation, which enables more flexibility in job scheduling. Chronos also supports the definition of jobs triggered by the completion of other jobs. It supports arbitrarily long dependency chains.
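The repeating-interval notation is compact: an optional repeat count, a start time, and a period. For example (standard ISO 8601; the timestamps are illustrative):

R/2014-03-08T20:00:00Z/PT24H   # repeat forever, starting 2014-03-08 20:00 UTC, every 24 hours
R5/2024-01-01T00:00:00Z/P1D    # repeat 5 times, starting 2024-01-01 00:00 UTC, once per day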
The easiest way to use Chronos is to use DC/OS and install Chronos via the Universe package repository.
Features
- Web UI
- ISO8601 Repeating Interval Notation
- Handles dependencies
- Job Stats (e.g. 50th, 75th, 95th and 99th percentile timing, failure/success)
- Job History (e.g. job duration, start time, end time, failure/success)
- Fault Tolerance (leader/follower)
- Configurable Retries
- Multiple Workers (i.e. Mesos agents)
- Native Docker support (see the example after this list)
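As an illustration of the native Docker support listed above, a job definition can carry a container field (a sketch based on the container options described in the Chronos job configuration docs; the job name and image are placeholders):

{
  "name": "dockerized-job",
  "command": "echo 'Hello from a container'",
  "schedule": "R/2014-03-08T20:00:00Z/PT24H",
  "container": {
    "type": "DOCKER",
    "image": "libmesos/ubuntu"
  }
}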
Documentation and Support
Chronos documentation is available on the Chronos GitHub pages site.
Documentation for installing and configuring the full Mesosphere stack including Mesos and Chronos is available on the Mesosphere website.
For questions and discussions around Chronos, please use the Google Group "chronos-scheduler": Chronos Scheduler Group.
If you'd like to take part in design research and test new features in Chronos before they're released, please add your name to Mesosphere's UX Research list.
Packaging
Mesosphere publishes Docker images for Chronos to Dockerhub, at https://hub.docker.com/r/mesosphere/chronos/.
Contributing
Instructions on how to contribute to Chronos are available on the Contributing docs page.
License
The use and distribution terms for this software are covered by the Apache 2.0 License (http://www.apache.org/licenses/LICENSE-2.0.html) which can be found in the file LICENSE at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.
Contributors
Reporting Bugs
Please see the support page for information on how to report bugs.