yugabyte-db
YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
Top Related Projects
CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
Apache Cassandra®
Vitess is a database clustering system for horizontal scaling of MySQL.
NoSQL data store using the seastar framework, compatible with Apache Cassandra
Quick Overview
YugabyteDB is an open-source, high-performance distributed SQL database designed for global, internet-scale applications. It combines the horizontal scalability of NoSQL databases with the strong consistency and SQL features of traditional relational databases.
Pros
- Highly scalable and distributed architecture
- Strong consistency and ACID compliance
- PostgreSQL-compatible, supporting standard SQL
- Multi-region and multi-cloud deployment capabilities
Cons
- Relatively new compared to established databases
- Limited ecosystem and third-party tool support
- Steeper learning curve for teams unfamiliar with distributed systems
- Resource-intensive for smaller applications
Code Examples
- Connecting to YugabyteDB using Python:
import psycopg2  # standard PostgreSQL driver; YSQL listens on port 5433 by default
conn = psycopg2.connect('host=127.0.0.1 port=5433 dbname=yugabyte user=yugabyte')
cursor = conn.cursor()
- Creating a table and inserting data:
CREATE TABLE users (
id INT PRIMARY KEY,
name TEXT,
email TEXT
);
INSERT INTO users (id, name, email) VALUES
(1, 'John Doe', 'john@example.com'),
(2, 'Jane Smith', 'jane@example.com');
- Performing a distributed transaction:
# With psycopg2, using the connection as a context manager commits on success and rolls back on error
with conn:
    cursor.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
    cursor.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
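Distributed transactions can abort under contention, and the client is then expected to run the whole transaction again. The following is a minimal sketch of such a retry loop, not an official YugabyteDB API. It assumes the driver raises exceptions that carry the PostgreSQL SQLSTATE in a `pgcode` attribute (as psycopg2 does) and that `40001` (serialization_failure) is the only code worth retrying. The names `run_with_retry` and `RETRYABLE_SQLSTATES` are hypothetical.

```python
import time

# Assumption: only serialization_failure (SQLSTATE 40001) is retried.
RETRYABLE_SQLSTATES = {"40001"}


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call txn_fn() until it succeeds or a non-retryable error occurs.

    txn_fn must run one complete transaction (statements plus commit),
    so that calling it again after a failure is safe.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except Exception as exc:
            # psycopg2 errors expose the SQLSTATE as `pgcode`; exceptions
            # without that attribute are treated as non-retryable.
            code = getattr(exc, "pgcode", None)
            if code not in RETRYABLE_SQLSTATES or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

To use it, wrap the two `UPDATE` statements and their commit in a function and pass that function to `run_with_retry`. The `sleep` parameter exists only so the backoff can be replaced in tests.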
Getting Started
To get started with YugabyteDB:
-
Install YugabyteDB:
wget https://downloads.yugabyte.com/yugabyte-2.13.1.0-linux.tar.gz
tar xvfz yugabyte-2.13.1.0-linux.tar.gz
cd yugabyte-2.13.1.0/
-
Start a local cluster:
./bin/yugabyted start
-
Connect using ysqlsh:
./bin/ysqlsh
-
Create a database and table:
CREATE DATABASE myapp;
\c myapp
CREATE TABLE users (id INT PRIMARY KEY, name TEXT);
-
Insert and query data:
INSERT INTO users VALUES (1, 'Alice');
SELECT * FROM users;
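The table, insert, and query steps can also be driven from application code. The sketch below assumes any Python DB-API 2.0 cursor, such as one obtained from psycopg2 connected to YSQL on port 5433. The function names are hypothetical, and the in-memory SQLite run is only a stand-in so the example works without a cluster. `CREATE DATABASE` and `\c` are left out because `\c` is a shell meta-command rather than SQL.

```python
import sqlite3


def smoke_test(cursor):
    """Run the create, insert, and select steps through a DB-API 2.0 cursor."""
    cursor.execute("CREATE TABLE users (id INT PRIMARY KEY, name TEXT)")
    cursor.execute("INSERT INTO users VALUES (1, 'Alice')")
    cursor.execute("SELECT * FROM users")
    return cursor.fetchall()


def demo_with_sqlite():
    """Stand-in run against in-memory SQLite, used here instead of a live cluster."""
    conn = sqlite3.connect(":memory:")
    try:
        return smoke_test(conn.cursor())
    finally:
        conn.close()
```

Against a running cluster, `smoke_test` would receive the cursor of a driver connection instead, followed by a commit on that connection.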
Competitor Comparisons
CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
Pros of CockroachDB
- Better support for global, multi-region deployments with advanced geo-partitioning features
- More mature and battle-tested in production environments
- Stronger consistency guarantees with serializable isolation level by default
Cons of CockroachDB
- Higher resource consumption, especially for smaller deployments
- Steeper learning curve due to more complex architecture and configuration options
- Single built-in storage engine with no pluggable alternatives (Pebble, which replaced RocksDB as the default in v20.2)
Code Comparison
CockroachDB (SQL syntax):
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name STRING,
created_at TIMESTAMP DEFAULT current_timestamp()
);
YugabyteDB (SQL syntax):
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name TEXT,
created_at TIMESTAMP DEFAULT now()
);
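Both databases use similar SQL syntax, with minor differences in function names and data types. CockroachDB uses STRING for text data, while YugabyteDB uses TEXT. The UUID generation functions also differ: gen_random_uuid() in the CockroachDB example and uuid_generate_v4() in the YugabyteDB one. As in PostgreSQL, uuid_generate_v4() comes from the uuid-ossp extension, which has to be enabled with CREATE EXTENSION before the table is created.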
Overall, CockroachDB and YugabyteDB are both distributed SQL databases with similar goals. CockroachDB has a slight edge in maturity and global deployment features, while YugabyteDB may use fewer resources in smaller deployments.
TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
Pros of TiDB
- More mature project with a larger community and ecosystem
- Better support for distributed OLAP workloads
- More advanced query optimizer for complex SQL queries
Cons of TiDB
- Steeper learning curve and more complex architecture
- Less compatibility with PostgreSQL ecosystem
- Higher resource requirements for small-scale deployments
Code Comparison
TiDB (Go):
func (s *tikvStore) Begin() (kv.Transaction, error) {
    txn, err := newTiKVTxn(s)
    if err != nil {
        return nil, errors.Trace(err)
    }
    return txn, nil
}
YugabyteDB (C++):
Status YBClient::OpenTable(const string& table_name,
                           shared_ptr<YBTable>* table) {
  return OpenTable(table_name, {}, table);
}
Both projects implement distributed SQL databases, but TiDB focuses on scalability and HTAP workloads, while YugabyteDB emphasizes PostgreSQL compatibility and ease of use. TiDB uses a more complex architecture with separate storage and computation layers, while YugabyteDB has a more integrated approach. TiDB is written primarily in Go, whereas YugabyteDB uses C++ for its core components.
Apache Cassandra®
Pros of Cassandra
- Mature and battle-tested with a large community and extensive ecosystem
- Highly scalable and designed for massive distributed deployments
- Strong support for multi-datacenter replication and tunable consistency levels
Cons of Cassandra
- Limited support for ACID transactions and complex queries
- Steep learning curve and complex configuration process
- Lacks built-in SQL support, requiring the use of CQL or additional tools
Code Comparison
Cassandra query (CQL):
SELECT * FROM users
WHERE user_id = 123
AND timestamp > '2023-01-01'
LIMIT 10;
YugabyteDB query (YSQL):
SELECT * FROM users
WHERE user_id = 123
AND timestamp > '2023-01-01'
LIMIT 10;
Key Differences
- YugabyteDB offers PostgreSQL-compatible SQL support (YSQL) in addition to Cassandra-compatible APIs
- YugabyteDB provides stronger consistency guarantees and ACID transactions out of the box
- Cassandra has a longer track record and more extensive production deployments
- YugabyteDB aims to combine the scalability of Cassandra with the ease of use and features of traditional RDBMSs
Both databases excel in distributed environments, but YugabyteDB offers a more familiar SQL experience and stronger consistency, while Cassandra provides unparalleled scalability and a proven track record in large-scale deployments.
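The difference between tunable and strong consistency comes down to quorum arithmetic. With a replication factor RF, a read of R replicas is guaranteed to touch at least one replica that acknowledged the latest write of W replicas only when R + W > RF. Cassandra lets each request pick its own level (for example ONE or QUORUM), whereas a Raft-replicated system always commits through a majority. The sketch below is textbook arithmetic with hypothetical function names, not code from either database.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


def tolerated_failures(replication_factor):
    """Replicas that can be lost while a majority is still reachable."""
    return replication_factor - quorum(replication_factor)
```

With RF = 3, reading and writing at ONE gives 1 + 1 = 2, which does not exceed 3, so a read may miss the latest write. Reading and writing at QUORUM gives 2 + 2 = 4, which does, and the cluster can lose one replica while still forming a majority.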
Vitess is a database clustering system for horizontal scaling of MySQL.
Pros of Vitess
- Designed for horizontal scaling of MySQL databases, making it ideal for large-scale deployments
- Provides advanced sharding capabilities, allowing for efficient data distribution
- Offers seamless integration with Kubernetes for orchestration and management
Cons of Vitess
- Steeper learning curve due to its complex architecture and components
- Limited support for non-MySQL databases, focusing primarily on MySQL compatibility
- May introduce additional latency in some scenarios due to its proxy-based architecture
Code Comparison
Vitess (VTGate query execution):
func (e *Executor) Execute(ctx context.Context, session *vtgatepb.Session, sql string, bindVariables map[string]*querypb.BindVariable) (*sqltypes.Result, error) {
    // Query execution logic
}
YugabyteDB (SQL execution):
Status PgSession::ExecuteStatements(const string& query_string,
                                    StatementExecutedCallback cb) {
  // SQL execution logic
}
Both projects implement query execution, but Vitess focuses on routing and proxying MySQL queries, while YugabyteDB directly executes SQL statements in its distributed database engine.
NoSQL data store using the seastar framework, compatible with Apache Cassandra
Pros of ScyllaDB
- Higher performance and lower latency due to its shard-per-core, shared-nothing architecture built on the Seastar framework
- Better resource utilization and scalability, especially for large datasets and high-throughput workloads
- More mature and battle-tested in production environments
Cons of ScyllaDB
- Relies on Cassandra-style tunable, per-request consistency rather than YugabyteDB's Raft-based strong consistency by default
- Limited support for distributed ACID transactions across multiple partitions
- Narrower ecosystem integration and fewer enterprise features than YugabyteDB
Code Comparison
ScyllaDB (CQL):
CREATE TABLE users (
user_id UUID PRIMARY KEY,
name TEXT,
email TEXT
);
YugabyteDB (YSQL):
CREATE TABLE users (
user_id UUID PRIMARY KEY,
name TEXT,
email TEXT
);
Both databases support SQL-like syntax, with ScyllaDB using Cassandra Query Language (CQL) and YugabyteDB offering PostgreSQL-compatible YSQL. The example above shows a simple table creation, which is nearly identical in both systems. However, YugabyteDB's YSQL provides more advanced SQL features and better compatibility with existing PostgreSQL applications.
README
What is YugabyteDB?
YugabyteDB is a high-performance, cloud-native, distributed SQL database that aims to support all PostgreSQL features. It is best suited for cloud-native OLTP (i.e., real-time, business-critical) applications that need absolute data correctness and require at least one of the following: scalability, high tolerance to failures, or globally-distributed deployments.
- Core Features
- Get Started
- Build Apps
- What's being worked on?
- Architecture
- Need Help?
- Contribute
- License
- Read More
Core Features
-
Powerful RDBMS capabilities: Yugabyte SQL (YSQL for short) reuses the query layer of PostgreSQL (similar to Amazon Aurora PostgreSQL), thereby supporting most of its features (datatypes, queries, expressions, operators and functions, stored procedures, triggers, extensions, etc). Here is a detailed list of features currently supported by YSQL.
-
Distributed transactions: The transaction design is based on the Google Spanner architecture. Strong consistency of writes is achieved by using Raft consensus for replication and cluster-wide distributed ACID transactions using hybrid logical clocks. Snapshot, serializable and read committed isolation levels are supported. Reads (queries) have strong consistency by default, but can be tuned dynamically to read from followers and read-replicas.
-
Continuous availability: YugabyteDB is extremely resilient to common outages with native failover and repair. YugabyteDB can be configured to tolerate disk, node, zone, region, and cloud failures automatically. For a typical deployment where a YugabyteDB cluster is deployed in one region across multiple zones on a public cloud, the RPO is 0 (meaning no data is lost on failure) and the RTO is 3 seconds (meaning the data being served by the failed node is available in 3 seconds).
-
Horizontal scalability: Scaling a YugabyteDB cluster to achieve more IOPS or data storage is as simple as adding nodes to the cluster.
-
Geo-distributed, multi-cloud: YugabyteDB can be deployed in public clouds and natively inside Kubernetes. It supports deployments that span three or more fault domains, such as multi-zone, multi-region, and multi-cloud deployments. It also supports xCluster asynchronous replication with unidirectional master-slave and bidirectional multi-master configurations that can be leveraged in two-region deployments. To serve (stale) data with low latencies, read replicas are also a supported feature.
-
Multi API design: The query layer of YugabyteDB is built to be extensible. Currently, YugabyteDB supports two distributed SQL APIs: Yugabyte SQL (YSQL), a fully relational API that reuses the query layer of PostgreSQL, and Yugabyte Cloud QL (YCQL), a semi-relational SQL-like API with document and indexing support that has its roots in Apache Cassandra QL.
-
100% open source: YugabyteDB is fully open-source under the Apache 2.0 license. The open-source version has powerful enterprise features such as distributed backups, encryption of data-at-rest, in-flight TLS encryption, change data capture, read replicas, and more.
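The distributed transactions feature relies on hybrid logical clocks. As a rough illustration of the idea, here is a sketch of the generic hybrid logical clock algorithm from the distributed-systems literature. It is not YugabyteDB's implementation, and the class and method names are hypothetical. Each timestamp is a pair of the largest physical time seen so far and a logical counter that breaks ties, so timestamps keep increasing even when a node's wall clock lags behind a peer's.

```python
import time


class HybridLogicalClock:
    """Generic hybrid logical clock; timestamps are (physical, logical) tuples."""

    def __init__(self, physical_clock=time.time_ns):
        self._now = physical_clock  # injectable so the clock can be faked in tests
        self.physical = 0           # largest physical time observed so far
        self.logical = 0            # tie-breaking counter

    def tick(self):
        """Timestamp a local event or an outgoing message."""
        wall = self._now()
        if wall > self.physical:
            self.physical, self.logical = wall, 0
        else:
            # Wall clock did not advance past what we have seen: bump the counter.
            self.logical += 1
        return (self.physical, self.logical)

    def update(self, remote):
        """Merge a timestamp received from another node, then timestamp the receipt."""
        wall = self._now()
        remote_physical, remote_logical = remote
        new_physical = max(self.physical, remote_physical, wall)
        if new_physical == self.physical and new_physical == remote_physical:
            new_logical = max(self.logical, remote_logical) + 1
        elif new_physical == self.physical:
            new_logical = self.logical + 1
        elif new_physical == remote_physical:
            new_logical = remote_logical + 1
        else:
            new_logical = 0  # the local wall clock is ahead of both
        self.physical, self.logical = new_physical, new_logical
        return (self.physical, self.logical)
```

Because Python compares tuples lexicographically, timestamps returned by one clock instance can be ordered with the ordinary `<` operator.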
Read more about YugabyteDB in our FAQ.
Get Started
- Quick Start
- Try running a real-world demo application:
Cannot find what you are looking for? Have a question? Please post your questions or comments on our Community Slack or Forum.
Build Apps
YugabyteDB supports many languages and client drivers, including Java, Go, NodeJS, Python, and more. For a complete list, including examples, see Drivers and ORMs.
What's being worked on?
This section was last updated in July, 2024.
Current roadmap
Here is a list of some of the key features being worked on for upcoming releases. YugabyteDB v2024.1 was released in June 2024.
Feature | Status | Progress | Comments |
---|---|---|---|
Upgrade to PostgreSQL v15 | PROGRESS | Track | For latest features, new PostgreSQL extensions, performance, and community fixes |
Support PostgreSQL Publication/Replication slot API in CDC | PROGRESS | Track | PostgreSQL has a huge community that needs a PG-compatible API to set up and consume database changes. |
Bitmap scan support | PROGRESS | Track | Bitmap Scan support for using Index Scans, remote filter and enhance Cost Model. |
YSQL-table statistics and cost based optimizer(CBO) | PROGRESS | Track | Improve YSQL query performance |
YSQL parallel query execution | PROGRESS | Track | Devise query plans that can leverage multiple CPUs in order to answer queries faster. |
YSQL-Feature support - ALTER TABLE | PROGRESS | Track | Support for various ALTER TABLE variants |
Connection Management | PROGRESS | Track | Server side connection management |
Recently released features
Feature | Status | Release Target | Docs / Enhancements | Comments |
---|---|---|---|---|
Support for transactions in async xCluster replication | DONE | v2.19 | Docs | Preserve and guarantee transactional atomicity and global ordering when propagating change data from one universe to another |
Support wait-on-conflict concurrency control | DONE | v2.19 | | Support wait-on-conflict concurrency control |
Faster Bulk-Data Loading in YugabyteDB | DONE | v2.15 | Track | Faster Bulk-Data Loading in YugabyteDB |
Change Data Capture | DONE | v2.13 | | Change data capture (CDC) allows multiple downstream apps and services to consume the continuous and never-ending stream(s) of changes to Yugabyte databases |
Support for materialized views | DONE | v2.13 | Docs | A materialized view is a pre-computed data set derived from a query specification and stored for later use |
Geo-partitioning support for the transaction status table | DONE | v2.13 | Docs | Instead of central remote transaction execution metadata, it is now optimized for access from different regions. Since the transaction metadata is also geo-partitioned, it eliminates the need for round trips to remote regions to update transaction statuses. |
Transparently restart transactions | DONE | v2.13 | | Decrease the incidence of transaction restart errors seen in various scenarios |
Row-level geo-partitioning | DONE | v2.13 | Docs | Row-level geo-partitioning allows fine-grained control over pinning data in a user table (at a per-row level) to geographic locations, thereby allowing the data residency to be managed at the table-row level. |
YSQL-Support GIN indexes | DONE | v2.11 | Docs | Support for generalized inverted indexes for container data types like jsonb, tsvector, and array |
YSQL-Collation Support | DONE | v2.11 | Docs | Allows specifying the sort order and character classification behavior of data per-column, or even per-operation according to language and country-specific rules |
YSQL-Savepoint Support | DONE | v2.11 | Docs | Useful for implementing complex error recovery in multi-statement transactions |
xCluster replication management through Platform | DONE | v2.11 | Docs | |
Spring Data YugabyteDB module | DONE | v2.9 | Track | Bridges the gap for learning the distributed SQL concepts with familiarity and ease of Spring Data APIs |
Support Liquibase, Flyway, ORM schema migrations | DONE | v2.9 | Docs | |
Support ALTER TABLE add primary key | DONE | v2.9 | Track | |
YCQL-LDAP Support | DONE | v2.8 | Docs | Support LDAP authentication in the YCQL API |
Platform Alerting and Notification | DONE | v2.8 | Docs | To get notified in real time about database alerts, user-defined alert policies notify you when a performance metric rises above or falls below a threshold you set. |
Platform API | DONE | v2.8 | Docs | Securely deploy YugabyteDB clusters using infrastructure-as-code |
Architecture
Review detailed architecture in our Docs.
Need Help?
-
You can ask questions, find answers, and help others on our Community Slack, Forum, Stack Overflow, as well as Twitter @Yugabyte
-
Please use GitHub issues to report issues or request new features.
-
To troubleshoot YugabyteDB cluster- or node-level issues, refer to the Troubleshooting documentation.
Contribute
As an open-source project with a strong focus on the user community, we welcome contributions as GitHub pull requests. See our Contributor Guides to get going. Discussions and RFCs for features happen on the design discussions section of our Forum.
License
Source code in this repository is variously licensed under the Apache License 2.0 and the Polyform Free Trial License 1.0.0. A copy of each license can be found in the licenses directory.
The build produces two sets of binaries:
- The entire database with all its features (including the enterprise ones) is licensed under the Apache License 2.0.
- The binaries that contain -managed in the artifact name and help run a managed service are licensed under the Polyform Free Trial License 1.0.0.
By default, the build options generate only the Apache License 2.0 binaries.
Read More
- To see our updates, go to The Distributed SQL Blog.
- For an in-depth design and the YugabyteDB architecture, see our design specs.
- Tech Talks and Videos.
- See how YugabyteDB compares with other databases.