StarRocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.

Top Related Projects

  • Apache Doris: an easy-to-use, high performance and unified analytics database.
  • ClickHouse®: a real-time analytics DBMS.
  • Apache Druid: a high performance real-time analytics database.
  • Trino: official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io).
  • Apache Pinot: a realtime distributed OLAP datastore.

Quick Overview

StarRocks is an open-source, high-performance analytical database designed for real-time analytics on massive-scale data. It combines the benefits of MPP databases and columnar storage engines, offering fast query performance and high concurrency for both multi-dimensional analytics and real-time data analysis.

Pros

  • Excellent query performance, especially for complex analytical queries
  • High concurrency support, allowing multiple users to query simultaneously
  • Flexible data ingestion methods, including batch and real-time options
  • Compatibility with various data ecosystems and BI tools

Cons

  • Relatively new project, still maturing compared to some established alternatives
  • Limited documentation and community resources compared to more established databases
  • Steeper learning curve for users unfamiliar with MPP databases
  • May require more hardware resources for optimal performance compared to some alternatives

Getting Started

To get started with StarRocks:

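  1. Download and extract StarRocks:
       wget https://github.com/StarRocks/starrocks/releases/download/2.5.4/StarRocks-2.5.4.tar.gz
       tar -xzf StarRocks-2.5.4.tar.gz
       cd StarRocks-2.5.4
  2. Start the FE and BE processes. The start scripts live in the fe/ and be/ directories of the extracted package:
       # Create the default metadata and storage directories if they do not exist yet
       mkdir -p fe/meta be/storage
       ./fe/bin/start_fe.sh --daemon
       ./be/bin/start_be.sh --daemon
  3. Connect to the FE with the MySQL client, then register the BE (9050 is the default BE heartbeat port; replace <be_ip> with the address the BE listens on):
       mysql -h127.0.0.1 -P9030 -uroot
       ALTER SYSTEM ADD BACKEND "<be_ip>:9050";
  4. Create a database and table:
       CREATE DATABASE example_db;
       USE example_db;
       CREATE TABLE example_table (
           id INT,
           name VARCHAR(50),
           value DOUBLE
       ) ENGINE=OLAP
       DISTRIBUTED BY HASH(id) BUCKETS 10
       -- A single-BE test cluster cannot hold the default 3 replicas
       PROPERTIES ("replication_num" = "1");
  5. Insert data and run queries:
       INSERT INTO example_table VALUES (1, 'John', 10.5), (2, 'Jane', 20.3);
       SELECT * FROM example_table WHERE value > 15;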

For more detailed instructions and advanced features, refer to the official StarRocks documentation.

Competitor Comparisons

Apache Doris is an easy-to-use, high performance and unified analytics database.

Pros of Doris

  • Mature Apache project with a larger community and longer history
  • Better documentation and more comprehensive user guides
  • Stronger support for SQL compliance and compatibility

Cons of Doris

  • Generally slower query performance, especially for complex analytical queries
  • Less flexible storage architecture, limiting scalability for very large datasets
  • Fewer advanced features for real-time analytics and streaming data ingestion

Code Comparison

Doris query example:

SELECT user_id, SUM(order_amount) AS total_amount
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY user_id
HAVING total_amount > 1000
ORDER BY total_amount DESC
LIMIT 10;

StarRocks query example:

SELECT user_id, SUM(order_amount) AS total_amount
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY user_id
HAVING total_amount > 1000
ORDER BY total_amount DESC
LIMIT 10;

The SQL syntax for both systems is very similar, as they both aim for MySQL compatibility. However, StarRocks may offer better performance for this type of analytical query due to its optimized storage and query engine.

ClickHouse® is a real-time analytics DBMS

Pros of ClickHouse

  • Mature project with a larger community and more extensive documentation
  • Wider range of data types and functions supported
  • Better performance for certain types of analytical queries

Cons of ClickHouse

  • Steeper learning curve and more complex configuration
  • Less optimized for real-time analytics and updates
  • Limited support for distributed transactions

Code Comparison

ClickHouse query example:

SELECT
    toYear(date) AS year,
    sum(amount) AS total_amount
FROM sales
GROUP BY year
ORDER BY year

StarRocks query example:

SELECT
    year(date) AS year,
    sum(amount) AS total_amount
FROM sales
GROUP BY year
ORDER BY year

Both systems use SQL-like syntax, but ClickHouse's native function names follow its own conventions (e.g., toYear rather than year), although it also accepts some MySQL-style aliases such as YEAR. StarRocks generally aims for a more familiar SQL experience, which can be easier for users transitioning from traditional databases.

While both systems excel in analytical processing, StarRocks focuses more on real-time analytics and easier integration with big data ecosystems. ClickHouse, on the other hand, offers more flexibility and power for complex analytical queries, albeit with a steeper learning curve.

Apache Druid: a high performance real-time analytics database.

Pros of Druid

  • Mature project with a large community and extensive documentation
  • Excellent for real-time analytics and time-series data
  • Highly scalable and fault-tolerant architecture

Cons of Druid

  • Steeper learning curve and more complex setup compared to StarRocks
  • Less efficient for ad-hoc queries on large datasets
  • Limited support for complex joins and subqueries

Code Comparison

Druid query example:

SELECT time_floor(__time, 'PT1H') AS hour_bucket, COUNT(*) AS cnt
FROM my_datasource
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY time_floor(__time, 'PT1H')

StarRocks query example:

SELECT DATE_TRUNC('HOUR', event_time) AS hour_bucket, COUNT(*) AS cnt
FROM my_table
WHERE event_time >= DATE_SUB(NOW(), INTERVAL 1 DAY)
GROUP BY DATE_TRUNC('HOUR', event_time)

Both systems use SQL-like syntax, but Druid has some specific functions like time_floor, while StarRocks uses more standard SQL functions like DATE_TRUNC. StarRocks generally offers a more familiar SQL experience for users coming from traditional database backgrounds.

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Pros of Trino

  • More mature project with a larger community and ecosystem
  • Supports a wider range of data sources and connectors
  • Better suited for complex, federated queries across multiple data sources

Cons of Trino

  • Generally slower query performance for analytical workloads
  • Higher resource consumption, especially for memory-intensive operations
  • More complex setup and configuration process

Code Comparison

Trino SQL query:

SELECT
  customer.name,
  SUM(orders.total_price) AS total_spent
FROM
  hive.sales.customer
  JOIN mysql.ecommerce.orders ON customer.id = orders.customer_id
GROUP BY
  customer.name

StarRocks SQL query:

SELECT
  customer.name,
  SUM(orders.total_price) AS total_spent
FROM
  customer
  JOIN orders ON customer.id = orders.customer_id
GROUP BY
  customer.name

The main difference in these examples is that Trino explicitly specifies the data source for each table (hive.sales and mysql.ecommerce), while the StarRocks query reads both tables from the current database. This highlights Trino's strength in federated queries across multiple data sources. StarRocks focuses on optimized performance within a single analytical database, although it can also reach external sources such as Apache Hive through external catalogs.
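
For comparison, a StarRocks query can also span sources when an external catalog is defined. The following is a minimal sketch, assuming a recent StarRocks version with external catalog support; the catalog name hive_catalog, the metastore address, and the database and table names are hypothetical placeholders, not part of the original example.

```sql
-- Hypothetical catalog name and metastore address; adjust to your environment.
CREATE EXTERNAL CATALOG hive_catalog
PROPERTIES (
    "type" = "hive",
    "hive.metastore.uris" = "thrift://<metastore_host>:9083"
);

-- Join a Hive table with a table stored natively in StarRocks.
-- default_catalog is the built-in catalog for StarRocks-managed tables.
SELECT
  c.name,
  SUM(o.total_price) AS total_spent
FROM
  hive_catalog.sales.customer AS c
  JOIN default_catalog.ecommerce.orders AS o ON c.id = o.customer_id
GROUP BY
  c.name;
```

Tables are addressed as catalog.database.table, which plays a role similar to Trino's connector-qualified names.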

Apache Pinot - A realtime distributed OLAP datastore

Pros of Pinot

  • More mature project with a larger community and ecosystem
  • Supports real-time ingestion and near real-time query processing
  • Offers flexible schema design and automatic schema inference

Cons of Pinot

  • Higher complexity in setup and configuration
  • Steeper learning curve for beginners
  • May require more resources for optimal performance

Code Comparison

Pinot query example:

SELECT COUNT(*) FROM myTable
WHERE timeColumn BETWEEN 1000 AND 2000
GROUP BY dimension1, dimension2
TOP 50

StarRocks query example:

SELECT COUNT(*) FROM myTable
WHERE timeColumn BETWEEN 1000 AND 2000
GROUP BY dimension1, dimension2
ORDER BY COUNT(*) DESC
LIMIT 50

Both systems use SQL-like syntax for querying, with minor differences in syntax for certain operations. The Pinot example limits results with TOP, a clause from Pinot's legacy PQL dialect; current Pinot releases query through standard SQL and use LIMIT, as StarRocks does.

StarRocks generally offers a simpler setup process and easier management, making it more suitable for users who prioritize ease of use. Pinot, on the other hand, provides more advanced features and flexibility, which can be beneficial for complex use cases and larger-scale deployments.

README

Download | Docs | Benchmarks | Demo

StarRocks, a Linux Foundation project, is the next-generation data platform designed to make data-intensive real-time analytics fast and easy. It delivers query speeds 5 to 10 times faster than other popular solutions. StarRocks can run real-time analytics while historical records are being updated, and it can easily enrich real-time analytics with historical data from data lakes. With StarRocks, you can do away with denormalized tables and still get the best performance and flexibility.

Learn more 👉🏻 What Is StarRocks: Features and Use Cases



Features

  • 🚀 Native vectorized SQL engine: StarRocks adopts vectorization technology to make full use of the parallel computing power of CPU, achieving sub-second query returns in multi-dimensional analyses, which is 5 to 10 times faster than previous systems.
  • 📊 Standard SQL: StarRocks supports ANSI SQL syntax (with full support for the TPC-H and TPC-DS queries). It is also compatible with the MySQL protocol. Various clients and BI software can be used to access StarRocks.
  • 💡 Smart query optimization: StarRocks can optimize complex queries through CBO (Cost Based Optimizer). With a better execution plan, the data analysis efficiency will be greatly improved.
  • ⚡ Real-time update: The update model of StarRocks can perform upsert/delete operations according to the primary key, and keeps queries efficient while updates run concurrently.
  • 🪟 Intelligent materialized view: The materialized view of StarRocks can be automatically updated during the data import and automatically selected when the query is executed.
  • ✨ Querying data in data lakes directly: StarRocks allows direct access to data from Apache Hive™, Apache Iceberg™, Delta Lake™ and Apache Hudi™ without importing.
  • 🎛️ Resource management: This feature allows StarRocks to limit resource consumption for queries and implement isolation and efficient use of resources among tenants in the same cluster.
  • 💠 Easy to maintain: Simple architecture makes StarRocks easy to deploy, maintain and scale out. StarRocks tunes its query plan agilely, balances the resources when the cluster is scaled in or out, and recovers the data replica under node failure automatically.
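
The real-time update and materialized view features above can be sketched in SQL. This is an illustrative example, not taken from the README: the table and view names (orders_pk, sales_records, store_amt) are hypothetical, and "replication_num" = "1" assumes a single-BE test cluster.

```sql
-- Real-time update: a Primary Key table treats an INSERT with an
-- existing key as an upsert.
CREATE TABLE orders_pk (
    order_id BIGINT NOT NULL,
    status   VARCHAR(20),
    amount   BIGINT
)
PRIMARY KEY (order_id)
DISTRIBUTED BY HASH (order_id) BUCKETS 4
PROPERTIES ("replication_num" = "1");  -- assumption: single-BE test cluster

INSERT INTO orders_pk VALUES (1, 'created', 100), (2, 'created', 250);
INSERT INTO orders_pk VALUES (1, 'paid', 100);  -- replaces the row with order_id = 1
DELETE FROM orders_pk WHERE order_id = 2;       -- delete by primary key
SELECT * FROM orders_pk;                        -- should leave only order 1 with status 'paid'

-- Materialized view: a synchronous view on a Duplicate Key table is
-- maintained as data is loaded into the base table.
CREATE TABLE sales_records (
    sale_date DATE,
    store_id  INT,
    sale_amt  BIGINT
)
DUPLICATE KEY (sale_date, store_id)
DISTRIBUTED BY HASH (store_id) BUCKETS 4
PROPERTIES ("replication_num" = "1");

CREATE MATERIALIZED VIEW store_amt AS
SELECT store_id, SUM(sale_amt)
FROM sales_records
GROUP BY store_id;

-- Once the view has finished building, the plan for a matching query
-- should reference store_amt instead of the base table.
EXPLAIN SELECT store_id, SUM(sale_amt) FROM sales_records GROUP BY store_id;
```

The query itself never names the view; the optimizer decides whether to use it, which is what "automatically selected" refers to.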

Architecture Overview

StarRocks’s streamlined architecture is mainly composed of two modules: Frontend (FE) and Backend (BE). The entire system eliminates single points of failure through seamless and horizontal scaling of FE and BE, as well as replication of metadata and data.

Starting from version 3.0, StarRocks supports a new shared-data architecture, which can provide better scalability and lower costs.
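
As a rough illustration of the FE/BE layout, the statements below inspect and extend a cluster from any MySQL client connected to an FE. The host names are placeholders, and the ports are assumed to be the defaults (9010 for the FE edit log, 9050 for the BE heartbeat); check fe.conf and be.conf if they were changed.

```sql
-- List the FE nodes (metadata and query planning) and their roles.
SHOW FRONTENDS;

-- List the BE nodes (data storage and query execution) and their status.
SHOW BACKENDS;

-- Scale out: add a follower FE and another BE.
-- <fe_host> and <be_host> are placeholders for real addresses.
ALTER SYSTEM ADD FOLLOWER "<fe_host>:9010";
ALTER SYSTEM ADD BACKEND "<be_host>:9050";
```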


Resources

📚 Read the docs

Section | Description
Quick Starts | How-tos and Tutorials.
Deploy | Learn how to run and configure StarRocks.
Docs | Full documentation.
Blogs | StarRocks deep dive and user stories.

❓ Get support


Contributing to StarRocks

We welcome all kinds of contributions from the community, individuals and partners. We owe our success to your active involvement.

  1. See Contributing.md to get started.
  2. Set up the StarRocks development environment.
  3. Understand our GitHub workflow for opening a pull request; use this PR Template when submitting a pull request.
  4. Pick a good first issue and start contributing.

📝 License: StarRocks is licensed under Apache License 2.0.

👥 Community Membership: Learn more about different contributor roles in StarRocks community.


Used By

This project is used by the following companies. Learn more about their use cases: