scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra

Top Related Projects

Apache Cassandra®

30,019

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.

26,228

The MongoDB Database

Free and Open Source, Distributed, RESTful Search Engine

37,055

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/

Quick Overview

ScyllaDB is an open-source distributed NoSQL database that is designed to be a drop-in replacement for Apache Cassandra. It is written in C++ and aims to provide better performance and lower latency than Cassandra while maintaining compatibility with its ecosystem.

Pros

  • High performance and low latency due to its C++ implementation and optimized architecture
  • Seamless compatibility with Apache Cassandra, allowing easy migration and use of existing tools
  • Automatic sharding and replication for improved scalability and fault tolerance
  • Support for both on-premises and cloud deployments

Cons

  • Relatively smaller community compared to more established databases like Cassandra or MongoDB
  • Limited support for advanced features found in some other NoSQL databases
  • Steeper learning curve for developers not familiar with Cassandra-like systems
  • Fewer third-party integrations and tools compared to more popular databases

Code Examples

Here are a few examples of using ScyllaDB with the Python driver:

  1. Connecting to ScyllaDB:
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()
  2. Creating a keyspace and table:
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS example_keyspace
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

session.execute("""
    CREATE TABLE IF NOT EXISTS example_keyspace.users (
        id UUID PRIMARY KEY,
        name TEXT,
        age INT
    )
""")
  3. Inserting and querying data:
import uuid

user_id = uuid.uuid4()
session.execute(
    "INSERT INTO example_keyspace.users (id, name, age) VALUES (%s, %s, %s)",
    (user_id, "John Doe", 30)
)

rows = session.execute("SELECT * FROM example_keyspace.users WHERE id = %s", [user_id])
for row in rows:
    print(f"User: {row.name}, Age: {row.age}")
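
For statements that are executed repeatedly, the Python driver also supports prepared statements, which use `?` bind markers instead of `%s`. The following is a minimal sketch, assuming the example_keyspace.users table above; `insert_users` is a hypothetical helper, and `session` is any object that exposes the driver's `prepare` and `execute` methods:

```python
import uuid


def insert_users(session, users):
    """Insert (name, age) pairs using a single prepared statement.

    `session` is expected to behave like a cassandra-driver Session,
    meaning it provides `prepare` and `execute`. Returns the generated ids.
    """
    # Prepared statements use `?` bind markers, not the `%s` style
    # used for simple string statements.
    insert = session.prepare(
        "INSERT INTO example_keyspace.users (id, name, age) VALUES (?, ?, ?)"
    )
    ids = []
    for name, age in users:
        user_id = uuid.uuid4()
        session.execute(insert, (user_id, name, age))
        ids.append(user_id)
    return ids
```

With the session from the first example, a call might look like `insert_users(session, [("John Doe", 30), ("Jane Roe", 28)])`. The statement is prepared once and then reused for every row.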

Getting Started

To get started with ScyllaDB:

  1. Start a single ScyllaDB node with Docker, publishing the CQL port (9042) so that clients on the host can connect to 127.0.0.1:
docker pull scylladb/scylla
docker run --name scylla-node -p 9042:9042 -d scylladb/scylla
  2. Install the Python driver:
pip install cassandra-driver
  3. Use the code examples above to connect, create a keyspace and table, and perform basic operations.

  4. For more advanced usage, refer to the ScyllaDB documentation and the Python driver documentation.
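
A freshly started node can take a little while before it accepts CQL connections, so a connection attempt made right after `docker run` may fail. The following is a minimal sketch of a readiness check using only the standard library; `wait_for_port` is a hypothetical helper, and it assumes the CQL port 9042 is reachable from the host:

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to (host, port) succeeds,
    or False if `timeout` seconds pass without a successful connect."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A script could call `wait_for_port("127.0.0.1", 9042)` before creating the `Cluster`. An open port only shows that the listener is up, so the first queries may still need a retry.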

Competitor Comparisons

Apache Cassandra®

Pros of Cassandra

  • Mature and battle-tested with a large community and extensive documentation
  • Highly scalable and fault-tolerant, designed for large-scale distributed systems
  • Rich ecosystem of tools and integrations

Cons of Cassandra

  • Written in Java, which can lead to higher memory usage and longer garbage collection pauses
  • Generally slower performance compared to ScyllaDB, especially for read-heavy workloads
  • More complex configuration and tuning required for optimal performance

Code Comparison

Cassandra (CQL):

CREATE TABLE users (
  user_id uuid PRIMARY KEY,
  username text,
  email text
);

ScyllaDB (CQL):

CREATE TABLE users (
  user_id uuid PRIMARY KEY,
  username text,
  email text
) WITH compression = { 'sstable_compression' : 'LZ4Compressor' };

Both ScyllaDB and Cassandra use CQL (Cassandra Query Language), so the same statements generally run on either system. The compression clause in the ScyllaDB example is not specific to ScyllaDB: Cassandra also supports table-level compression options, so the two statements differ only in that the second one sets compression explicitly.

ScyllaDB aims to be a drop-in replacement for Cassandra, focusing on improved performance and reduced operational complexity. While Cassandra has a longer history and wider adoption, ScyllaDB leverages its C++ implementation and shard-per-core architecture to achieve better resource utilization and lower latencies, especially for larger datasets and read-intensive workloads.
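
The shard-per-core idea can be illustrated with a toy routing function: each partition key hashes to a token, and the token alone decides which core owns the data, so cores do not need to share state for it. This is a simplified sketch and not ScyllaDB's actual algorithm (ScyllaDB uses Murmur3-based tokens and its own token-to-shard mapping); `token_for` and `shard_for` are hypothetical names, and SHA-256 is used only because it is available in the standard library:

```python
import hashlib


def token_for(partition_key: bytes) -> int:
    """Map a partition key to an unsigned 64-bit token (illustrative hash)."""
    digest = hashlib.sha256(partition_key).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(token: int, shard_count: int) -> int:
    """Map a 64-bit token to a shard index in [0, shard_count).

    Multiply-shift splits the token range into `shard_count` equal,
    contiguous slices, one per core.
    """
    return (token * shard_count) >> 64


if __name__ == "__main__":
    # Show which of 8 hypothetical shards owns each key.
    for key in (b"user:1", b"user:2", b"user:3"):
        print(key, shard_for(token_for(key), 8))
```

Because the mapping is a pure function of the key, any node or client that knows the shard count can compute the owning core without coordination.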

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.

Pros of CockroachDB

  • Stronger consistency model with serializable isolation
  • Built-in multi-region support for global deployments
  • More mature SQL support with advanced features

Cons of CockroachDB

  • Higher resource consumption and overhead
  • Steeper learning curve for operations and tuning
  • Less predictable performance under high concurrency

Code Comparison

ScyllaDB (CQL):

CREATE TABLE users (
  id UUID PRIMARY KEY,
  name TEXT,
  email TEXT
);

CockroachDB (SQL):

CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name STRING,
  email STRING
);

Both databases use similar syntax for basic operations, but CockroachDB follows standard SQL more closely. ScyllaDB uses CQL (Cassandra Query Language), which is similar to SQL but with some differences in data types and features.

CockroachDB offers SQL features such as foreign keys, joins, and multi-statement transactions, which CQL does not provide, while ScyllaDB focuses on high-performance, low-latency operations for simpler data models.

ScyllaDB's approach is generally more suitable for write-heavy workloads and time-series data, while CockroachDB excels in scenarios requiring strong consistency and complex queries across distributed data.

The MongoDB Database

Pros of MongoDB

  • More mature and widely adopted ecosystem with extensive documentation and community support
  • Flexible schema design allows for easier adaptation to changing data structures
  • Rich query language and aggregation framework for complex data operations

Cons of MongoDB

  • Generally slower performance compared to ScyllaDB, especially for write-heavy workloads
  • Less efficient use of system resources, potentially requiring more hardware
  • Scaling can be more complex and costly, particularly for large datasets

Code Comparison

MongoDB query example:

db.users.find({
  age: { $gte: 18 },
  status: "active"
}).sort({ name: 1 })

ScyllaDB query example (using CQL):

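SELECT * FROM users
WHERE age >= 18 AND status = 'active'
ALLOW FILTERING;

Both databases offer different query languages, with MongoDB using a JSON-like syntax and ScyllaDB using CQL, which resembles SQL. The two queries are not exact equivalents: CQL needs ALLOW FILTERING to restrict columns that are neither part of the primary key nor indexed, and it accepts ORDER BY only on clustering columns within a single partition, so the sort by name has to be handled by the application or designed into the table's clustering key. MongoDB's query language is often considered more flexible, while CQL is more familiar to those with SQL experience.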
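
When results need an ordering that the table's clustering key does not provide, the sort is usually done in the application after the rows are fetched. The following is a minimal sketch, assuming rows expose `name`, `age` and `status` attributes the way the Python driver's default named-tuple rows do; the `User` type and the `active_adults_by_name` helper are hypothetical:

```python
from collections import namedtuple

# Stand-in for the rows a driver query would return.
User = namedtuple("User", ["name", "age", "status"])


def active_adults_by_name(rows, min_age=18):
    """Keep active users at or above `min_age`, sorted by name ascending."""
    matching = [r for r in rows if r.age >= min_age and r.status == "active"]
    return sorted(matching, key=lambda r: r.name)
```

Filtering and sorting on the client reads every candidate row, so for large tables a schema keyed for the query is usually the better choice.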

ScyllaDB is designed for high performance and scalability, particularly suited for large-scale, write-intensive applications. MongoDB, on the other hand, offers more flexibility in data modeling and querying, making it a popular choice for a wide range of applications, especially those with evolving schemas.

Free and Open Source, Distributed, RESTful Search Engine

Pros of Elasticsearch

  • More mature and widely adopted, with a larger ecosystem and community support
  • Powerful full-text search capabilities and advanced querying options
  • Extensive documentation and learning resources available

Cons of Elasticsearch

  • Higher resource consumption and slower performance for large-scale deployments
  • More complex setup and configuration process
  • Licensing changes have caused concerns in the open-source community

Code Comparison

Elasticsearch query example:

GET /my_index/_search
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

ScyllaDB query example:

SELECT * FROM my_table
WHERE title LIKE '%scylladb%'
ALLOW FILTERING;

ScyllaDB focuses on high-performance, low-latency operations for large datasets, while Elasticsearch excels in full-text search and complex querying. The two queries are not equivalent: the Elasticsearch match query runs an analyzed full-text search against an index, whereas the CQL LIKE query is a pattern match evaluated by filtering rows. ScyllaDB uses CQL, whose SQL-like syntax is familiar to developers with relational database experience. Elasticsearch uses a JSON-based query DSL, which is powerful but may involve a steeper learning curve.

Both databases have their strengths and are suited for different use cases. Elasticsearch is ideal for search-heavy applications, while ScyllaDB is better for high-throughput, low-latency workloads with large amounts of data.

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/

Pros of TiDB

  • SQL compatibility: TiDB offers MySQL compatibility, making it easier for users familiar with SQL databases
  • Horizontal scalability: Designed for distributed scaling across multiple nodes
  • HTAP capabilities: Supports both OLTP and OLAP workloads in a single system

Cons of TiDB

  • Higher resource consumption: Generally requires more resources compared to ScyllaDB
  • Complexity: More complex architecture and setup process than ScyllaDB
  • Learning curve: Steeper learning curve for optimization and management

Code Comparison

TiDB SQL query example:

SELECT * FROM users
WHERE age > 25 AND city = 'New York'
ORDER BY name
LIMIT 10;

ScyllaDB CQL query example:

SELECT * FROM users
WHERE age > 25 AND city = 'New York'
ORDER BY name
LIMIT 10;

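While the syntax looks similar, the CQL statement is only accepted for a suitable schema: ORDER BY name requires name to be a clustering column and the partition key (presumably city here) to be restricted by equality, and the age > 25 restriction on a regular column additionally needs ALLOW FILTERING. TiDB supports a wider range of SQL features and functions compared to ScyllaDB's CQL. ScyllaDB focuses on high-performance, low-latency operations for specific use cases, while TiDB aims to provide a more comprehensive SQL-compatible distributed database solution.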

README

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++23 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see HACKING.md.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.
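
Because Alternator exposes the DynamoDB API, an existing DynamoDB client library is typically pointed at a Scylla node by overriding the endpoint URL. The following is a minimal sketch that only builds the client arguments; `alternator_client_kwargs` is a hypothetical helper, and port 8000, the region, and the placeholder credentials are assumptions that depend on how Alternator is configured:

```python
def alternator_client_kwargs(host, port=8000, use_tls=False):
    """Build keyword arguments for a DynamoDB client aimed at Alternator.

    The port, region and credentials are assumed values for illustration;
    use whatever the target cluster is actually configured with.
    """
    scheme = "https" if use_tls else "http"
    return {
        "endpoint_url": f"{scheme}://{host}:{port}",
        # AWS SDKs generally expect a region and credentials to be set,
        # even when the endpoint is not an AWS service.
        "region_name": "us-east-1",
        "aws_access_key_id": "placeholder",
        "aws_secret_access_key": "placeholder",
    }
```

With boto3 installed, the result could be passed along as `boto3.client("dynamodb", **alternator_client_kwargs("127.0.0.1"))`.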

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

  • The community forum and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
  • The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.