alibaba/jstorm

Enterprise Stream Process Engine

Top Related Projects

  • Apache Storm (6,587 stars)
  • Apache Flink (23,783 stars)
  • Apache Spark (39,274 stars): a unified analytics engine for large-scale data processing
  • Apache Beam (7,760 stars): a unified programming model for batch and streaming data processing
  • Apache Heron (Incubating): a realtime, distributed, fault-tolerant stream processing engine from Twitter

Quick Overview

JStorm is an open-source, distributed, fault-tolerant real-time computation system developed by Alibaba. It processes unbounded streams of data at scale and is a Java rewrite of Apache Storm that aims for better performance and easier operation.

Pros

  • High performance and low latency for real-time data processing
  • Improved stability and easier operability compared to Apache Storm
  • Seamless integration with other Alibaba ecosystem tools
  • Developed by Alibaba; the repository is now archived and its improvements have been merged into Apache Storm (see README)

Cons

  • Less widespread adoption compared to Apache Storm or Apache Flink
  • Documentation and community resources primarily in Chinese
  • Steeper learning curve for developers not familiar with Storm-like systems
  • Limited ecosystem of third-party connectors and libraries

Code Examples

  1. Creating a basic topology:
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new SplitSentenceBolt(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCountBolt(), 12).fieldsGrouping("split", new Fields("word"));

Config conf = new Config();
conf.setNumWorkers(3);
StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
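  2. Implementing a custom spout:
// JStorm 2.x keeps Storm's original backtype.storm package names.
import java.util.Map;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class MySpout extends BaseRichSpout {
    private SpoutOutputCollector collector;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        // Called once per task before any tuples are requested.
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        // generateMessage() is a placeholder for your own data source.
        String message = generateMessage();
        collector.emit(new Values(message));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("message"));
    }
}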
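  3. Implementing a custom bolt:
// JStorm 2.x keeps Storm's original backtype.storm package names.
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class MyBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        String message = input.getString(0);
        // processMessage() is a placeholder for your own logic.
        String processedMessage = processMessage(message);
        // Anchor the output to the input, then ack: BaseRichBolt does not ack automatically.
        collector.emit(input, new Values(processedMessage));
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("processed_message"));
    }
}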
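
To see how the three pieces above fit together without a cluster, the following plain-Java sketch imitates the data flow of the word-count topology in a single thread. It does not use the JStorm API: the class WordCountFlowSketch and its methods split and count are hypothetical stand-ins for the split bolt and the count bolt, and the input list plays the role of the spout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-threaded stand-in for the spout -> split -> count flow.
// Nothing here is part of the JStorm API.
class WordCountFlowSketch {

    // Plays the role of the "split" bolt: one sentence in, many words out.
    static List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(word.toLowerCase());
            }
        }
        return words;
    }

    // Plays the role of the "count" bolt: keeps a running total per word.
    static Map<String, Integer> count(List<String> sentences) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String sentence : sentences) {        // each element stands for one spout tuple
            for (String word : split(sentence)) {  // each word stands for one split-bolt tuple
                totals.merge(word, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("the cow jumped", "the moon");
        System.out.println(count(sentences)); // prints {the=2, cow=1, jumped=1, moon=1}
    }
}
```

In a real topology each stage runs as many parallel tasks on different machines, so the count state is partitioned across tasks rather than held in one map.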

Getting Started

  1. Add JStorm dependency to your Maven pom.xml:
<dependency>
    <groupId>com.alibaba.jstorm</groupId>
    <artifactId>jstorm-core</artifactId>
    <version>2.4.0</version>
</dependency>
  2. Create a topology class with spouts and bolts.
  3. Configure and submit the topology:
Config conf = new Config();
conf.setNumWorkers(3);
StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
  4. Package your application as a JAR file and submit it to the JStorm cluster using the jstorm command-line tool.

Competitor Comparisons

Apache Storm

Pros of Storm

  • Larger and more active community support
  • More extensive documentation and resources
  • Better integration with other Apache projects

Cons of Storm

  • Slower than JStorm on some workloads
  • Less tuned for use cases common in the Chinese tech ecosystem
  • Steeper learning curve for beginners

Code Comparison

JStorm:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new TestWordSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

Storm:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

The two snippets are nearly identical because JStorm keeps Storm's topology API: both use TopologyBuilder to set spouts and bolts and to define groupings. Here they differ only in the spout chosen for the example. The real differences lie in the runtime underneath, which each project implements and may optimize differently.
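
The groupings in both snippets decide which task of the downstream bolt receives each tuple. As a rough mental model, and not the routing code either project actually ships, shuffle grouping spreads tuples evenly across tasks, while fields grouping sends tuples with equal field values to the same task. The class GroupingSketch and its methods are hypothetical; round-robin selection and hashing modulo the task count are assumed simplifications.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of tuple routing; not taken from JStorm or Storm.
class GroupingSketch {
    private final AtomicInteger next = new AtomicInteger();

    // Shuffle grouping, modeled here as round-robin over the downstream tasks.
    int shuffleTask(int numTasks) {
        return Math.floorMod(next.getAndIncrement(), numTasks);
    }

    // Fields grouping, modeled as a hash of the grouping field modulo the task count,
    // so equal values always land on the same task index.
    static int fieldsTask(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        GroupingSketch grouping = new GroupingSketch();
        // 8 and 12 mirror the parallelism hints of the "split" and "count" bolts.
        for (String word : new String[] {"storm", "flink", "storm"}) {
            System.out.println(word
                    + " -> shuffle task " + grouping.shuffleTask(8)
                    + ", fields task " + fieldsTask(word, 12));
        }
    }
}
```

Both occurrences of "storm" report the same fields task, which is what lets a counting bolt keep a correct per-word total in local state.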

Apache Flink

Pros of Flink

  • More active development and larger community support
  • Broader ecosystem with extensive libraries and connectors
  • Advanced features like stateful stream processing and event time processing

Cons of Flink

  • Steeper learning curve due to more complex API
  • Higher resource requirements for small-scale applications

Code Comparison

JStorm example:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new TestWordSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

Flink example:

DataStream<String> text = env.addSource(new FlinkKafkaConsumer<>("topic", new SimpleStringSchema(), properties));
DataStream<Tuple2<String, Integer>> wordCounts = text
    .flatMap(new Tokenizer())
    .keyBy(value -> value.f0)
    .sum(1);

Both frameworks offer distributed stream processing capabilities, but Flink provides more advanced features and a wider range of use cases. JStorm, being more lightweight, may be easier to set up for simpler applications. Flink's programming model is more flexible, supporting both stream and batch processing, while JStorm is primarily focused on stream processing.
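
One detail the Flink snippet hides is that, in streaming execution, a keyed sum over an unbounded stream has no final answer: it emits an updated running total for each incoming record. The plain-Java sketch below imitates that behavior for intuition only. RollingSumSketch and process are hypothetical names, and the code does not use the Flink API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical imitation of a keyed rolling sum; not Flink code.
class RollingSumSketch {
    // Per-key state, analogous to the state a keyed operator keeps for each key.
    private final Map<String, Integer> state = new HashMap<>();

    // Handles one record and returns the updated total that would be emitted downstream.
    String process(String word) {
        int total = state.merge(word, 1, Integer::sum);
        return word + "=" + total;
    }

    public static void main(String[] args) {
        RollingSumSketch sum = new RollingSumSketch();
        for (String word : new String[] {"a", "b", "a"}) {
            System.out.println(sum.process(word)); // prints a=1, then b=1, then a=2
        }
    }
}
```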

Apache Spark - A unified analytics engine for large-scale data processing

Pros of Spark

  • Wider ecosystem and community support
  • More extensive documentation and learning resources
  • Better performance for large-scale data processing and machine learning tasks

Cons of Spark

  • Steeper learning curve, especially for complex use cases
  • Higher memory requirements, which can be costly for large datasets
  • Slower startup time compared to JStorm

Code Comparison

Spark (Scala):

val conf = new SparkConf().setAppName("WordCount")
val sc = new SparkContext(conf)
val textFile = sc.textFile("input.txt")
val counts = textFile.flatMap(line => line.split(" "))
                     .map(word => (word, 1))
                     .reduceByKey(_ + _)

JStorm (Java):

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

Both frameworks offer distributed processing capabilities, but Spark provides a more comprehensive ecosystem for big data analytics and machine learning. JStorm, on the other hand, focuses on real-time stream processing with lower latency. The code examples demonstrate the different approaches to defining data processing pipelines in each framework.
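
The Spark snippet is a batch job: it reads a bounded file and produces one final count per word. For readers more at home in Java, the same steps (flatMap, map to pairs, reduceByKey) can be approximated on an in-memory list with the standard java.util.stream API. This is a sketch that does not use Spark; BatchWordCountSketch and countWords are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical single-machine analogue of the Spark word count; not Spark code.
class BatchWordCountSketch {

    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                // flatMap: one line in, many words out
                .flatMap(line -> Arrays.stream(line.split(" ")))
                // map + reduceByKey: group equal words and add 1 for each occurrence
                .collect(Collectors.groupingBy(
                        word -> word,
                        TreeMap::new,
                        Collectors.summingInt(word -> 1)));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or", "not to be");
        System.out.println(countWords(lines)); // prints {be=2, not=1, or=1, to=2}
    }
}
```

The result is available only after the whole input has been consumed, whereas a JStorm topology keeps updating its counts for as long as tuples arrive.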

Apache Beam is a unified programming model for Batch and Streaming data processing.

Pros of Beam

  • Supports multiple programming languages (Java, Python, Go)
  • Provides a unified model for batch and stream processing
  • Offers a rich set of built-in transforms and connectors

Cons of Beam

  • Steeper learning curve due to its abstraction layer
  • May have higher resource requirements for simple use cases
  • Less focused on real-time processing compared to JStorm

Code Comparison

JStorm (Java):

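public class ExampleTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new RandomSentenceSpout(), 5);
        builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
        builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

        // Building the topology does nothing by itself; it has to be submitted to the cluster.
        Config conf = new Config();
        conf.setNumWorkers(3);
        StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
    }
}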

Beam (Java):

public class WordCount {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.create());
        p.apply(TextIO.read().from("input.txt"))
         .apply(FlatMapElements.into(TypeDescriptors.strings()).via((String line) -> Arrays.asList(line.split("\\W+"))))
         .apply(Count.<String>perElement())
         .apply(MapElements.into(TypeDescriptors.strings()).via((KV<String, Long> wordCount) -> wordCount.getKey() + ": " + wordCount.getValue()))
         .apply(TextIO.write().to("output"));
        // The pipeline is only a description until it is run.
        p.run().waitUntilFinish();
    }
}

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter

Pros of Heron

  • Better performance and lower latency compared to JStorm
  • More flexible and modular architecture, allowing easier customization
  • Stronger community support and active development as an Apache project

Cons of Heron

  • Steeper learning curve due to more complex architecture
  • Less mature and potentially less stable than JStorm
  • Smaller ecosystem of third-party integrations and tools

Code Comparison

JStorm topology definition:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new TestWordSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

Heron topology definition:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new TestWordSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

The two definitions are identical. Heron keeps a Storm-compatible topology API, so fieldsGrouping takes a Fields object there as well.

Both projects aim to provide distributed stream processing capabilities, but Heron offers improved performance and flexibility at the cost of increased complexity. JStorm may be a better choice for simpler use cases or when working with existing Alibaba infrastructure.

README

Alibaba Group has donated the JStorm project to the Apache Software Foundation as a subproject of Apache Storm. The improvements and features have been merged into Apache Storm. This is an archived, read-only repository that does not accept new issues. Please use Apache Storm instead and report issues there.