opentsdb

A scalable, distributed Time Series Database.

5,039

1,242

531

Top Related Projects

prometheus

59,181

The Prometheus monitoring system and time series database.

influxdb

29,932

Scalable datastore for metrics, events, and real-time analytics

grafana

68,692

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

graphite-web

5,990

A highly scalable real-time graphing system

timescaledb

18,976

A time-series database for high-performance real-time analytics packaged as a Postgres extension

VictoriaMetrics

14,388

VictoriaMetrics: fast, cost-effective monitoring solution and time series database

Quick Overview

OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. It is designed to store and serve massive amounts of time series data without losing granularity, making it ideal for monitoring systems, IoT applications, and financial data analysis.

Pros

Highly scalable and can handle billions of data points per day
Efficient data compression to minimize storage requirements
Flexible querying capabilities with support for aggregations and downsampling
Integration with popular visualization tools like Grafana

Cons

Requires a complex setup with dependencies on HBase and other components
Steep learning curve for newcomers to time series databases
Limited built-in visualization options compared to some newer alternatives
Can be resource-intensive for smaller deployments

Code Examples

Inserting data points:

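import java.util.HashMap;
import java.util.Map;

import net.opentsdb.core.TSDB;
import net.opentsdb.utils.Config;

Config config = new Config(true);  // loads opentsdb.conf from the default locations
TSDB tsdb = new TSDB(config);

// Tags are passed as a map of tag name to tag value, not as a "key=value" string.
Map<String, String> tags = new HashMap<>();
tags.put("host", "webserver01");

long timestamp = System.currentTimeMillis() / 1000;  // Unix time in seconds
// addPoint is asynchronous and returns a Deferred; join() waits for the write.
tsdb.addPoint("sys.cpu.user", timestamp, 42.5, tags).join();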

Querying data:

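import java.util.HashMap;
import java.util.Map;

import net.opentsdb.core.Aggregators;
import net.opentsdb.core.DataPoints;
import net.opentsdb.core.Query;

Map<String, String> queryTags = new HashMap<>();
queryTags.put("host", "webserver01");

Query query = tsdb.newQuery();
query.setStartTime(timestamp - 3600);  // 1 hour ago
query.setEndTime(timestamp);
// Metric, tag filter, aggregator, and rate flag are set in a single call.
query.setTimeSeries("sys.cpu.user", queryTags, Aggregators.SUM, false);

DataPoints[] results = query.run();
for (DataPoints points : results) {
    System.out.println(points);
}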

Downsampling data:

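import java.util.HashMap;

import net.opentsdb.core.Aggregators;

Query query = tsdb.newQuery();
query.setStartTime(timestamp - 86400);  // 24 hours ago
query.setEndTime(timestamp);
// An empty tag map matches every series of the metric.
query.setTimeSeries("sys.cpu.user", new HashMap<String, String>(), Aggregators.SUM, false);
// The interval is in milliseconds in OpenTSDB 2.x; check the unit for your version.
query.downsample(3600 * 1000L, Aggregators.AVG);  // 1-hour average

DataPoints[] results = query.run();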
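
To make the 1-hour average concrete, here is a minimal sketch in plain Java of what averaging into fixed time buckets computes. It is not OpenTSDB's implementation: the class name `DownsampleSketch`, the method `downsampleAvg`, and the choice to label each bucket by rounding the timestamp down to a multiple of the interval are assumptions made for illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative only: averages raw points into fixed-width time buckets. */
final class DownsampleSketch {

    /**
     * Groups points into buckets of intervalSeconds and averages each bucket.
     * Keys of the result are bucket start times (timestamps rounded down).
     */
    static SortedMap<Long, Double> downsampleAvg(long[] timestamps, double[] values,
                                                 long intervalSeconds) {
        // Per bucket we keep {sum, count}; TreeMap keeps buckets in time order.
        SortedMap<Long, double[]> acc = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long bucketStart = timestamps[i] - (timestamps[i] % intervalSeconds);
            double[] sumAndCount = acc.computeIfAbsent(bucketStart, k -> new double[2]);
            sumAndCount[0] += values[i];
            sumAndCount[1] += 1;
        }
        SortedMap<Long, Double> averages = new TreeMap<>();
        for (SortedMap.Entry<Long, double[]> e : acc.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        long[] ts = {1356998400L, 1356998460L, 1357002000L};
        double[] vals = {40.0, 44.0, 50.0};
        // The first two points share an hour bucket; the third starts a new one.
        System.out.println(downsampleAvg(ts, vals, 3600L));
    }
}
```

With the sample input, the first two points fall into the same hour and are averaged, while the third point forms its own bucket.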

Getting Started

Install HBase and configure it for OpenTSDB
Download and install OpenTSDB
Configure OpenTSDB (edit opentsdb.conf)

Create necessary tables in HBase:

env COMPRESSION=NONE HBASE_HOME=/path/to/hbase ./src/create_table.sh

Start OpenTSDB:

./build/tsdb tsd --port=4242 --staticroot=./build/staticroot --cachedir=/tmp/opentsdb

Use the HTTP API or client libraries to insert and query data
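
As a rough illustration of the last step, the sketch below builds the JSON body that the HTTP API's /api/put endpoint accepts for a single data point (fields `metric`, `timestamp`, `value`, and `tags`). The class `PutPayloadSketch` and its methods are hypothetical helpers written for this example, not part of OpenTSDB; sending the body (for instance as a POST to port 4242 as started above) is left out so the code runs without a server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: formats one data point as a JSON body for /api/put. */
final class PutPayloadSketch {

    /** Wraps a string in quotes, escaping backslashes and double quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds {"metric":...,"timestamp":...,"value":...,"tags":{...}}. */
    static String toJson(String metric, long timestampSeconds, double value,
                         Map<String, String> tags) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"metric\":").append(quote(metric))
          .append(",\"timestamp\":").append(timestampSeconds)
          .append(",\"value\":").append(value)
          .append(",\"tags\":{");
        boolean first = true;
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(tag.getKey())).append(':').append(quote(tag.getValue()));
            first = false;
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps tags in insertion order for predictable output.
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("host", "webserver01");
        System.out.println(toJson("sys.cpu.user", 1356998400L, 42.5, tags));
    }
}
```

The sketch does not handle NaN or infinite values, which have no JSON number representation; a real client would normally use a JSON library instead of string concatenation.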

Competitor Comparisons

prometheus

59,181

The Prometheus monitoring system and time series database.

Pros of Prometheus

Built-in alerting and query language (PromQL)
Pull-based architecture, making it easier to monitor ephemeral services
More active development and larger community support

Cons of Prometheus

Limited long-term storage options without additional components
Less scalable for very high cardinality data compared to OpenTSDB

Code Comparison

Prometheus configuration (prometheus.yml):

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'example'
    static_configs:
      - targets: ['localhost:8080']

OpenTSDB configuration (opentsdb.conf):

tsd.core.auto_create_metrics = true
tsd.storage.hbase.zk_quorum = localhost
tsd.storage.fix_duplicates = true
tsd.http.request.enable_chunked = true
tsd.http.request.max_chunk = 65536
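
The OpenTSDB settings above are plain `key = value` lines. Assuming the file follows Java properties syntax, a small sketch like the following can read such settings with `java.util.Properties`. The class `ConfSketch` is a hypothetical helper for illustration, and the configuration text is inlined rather than read from disk so the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative only: parses opentsdb.conf-style text as Java properties. */
final class ConfSketch {

    /** Parses key = value lines; whitespace around '=' is ignored. */
    static Properties parse(String confText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(confText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail in practice.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
            "tsd.core.auto_create_metrics = true",
            "tsd.storage.hbase.zk_quorum = localhost",
            "tsd.http.request.max_chunk = 65536");
        Properties props = parse(conf);
        boolean autoCreate =
            Boolean.parseBoolean(props.getProperty("tsd.core.auto_create_metrics"));
        int maxChunk = Integer.parseInt(props.getProperty("tsd.http.request.max_chunk"));
        System.out.println(autoCreate + " "
            + props.getProperty("tsd.storage.hbase.zk_quorum") + " " + maxChunk);
    }
}
```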

Both Prometheus and OpenTSDB store time series, but their architectures differ: Prometheus scrapes (pulls) metrics from configured targets and keeps them in its own local storage, while OpenTSDB accepts pushed data points and persists them in HBase. Prometheus is often preferred for cloud-native environments and microservices, while OpenTSDB is suited to high-cardinality data and long-term storage. The choice between them depends on specific project requirements and infrastructure.

influxdb

29,932

Scalable datastore for metrics, events, and real-time analytics

Pros of InfluxDB

Built-in HTTP API and query language (InfluxQL) for easier data manipulation
Native support for tags and fields, allowing more flexible data modeling
Better out-of-the-box performance for time-series data

Cons of InfluxDB

Less scalable for extremely large datasets compared to OpenTSDB
More resource-intensive, especially for memory usage

Code Comparison

InfluxDB query example:

SELECT mean("value") FROM "cpu_load" WHERE "host" = 'server01' AND time >= now() - 1h GROUP BY time(5m)

OpenTSDB query example:

/api/query?start=1h-ago&m=avg:5m-avg:cpu.load{host=server01}
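
The `m` parameter in the OpenTSDB example packs the aggregator, an optional downsampling spec, the metric name, and tag filters into one colon-separated string. The sketch below assembles that string from its parts. `MetricQuerySketch` and `subQuery` are hypothetical names for this example, and the output is not URL-encoded, which a real HTTP client may need to do for the braces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: assembles the m= sub-query string for /api/query. */
final class MetricQuerySketch {

    /**
     * Builds aggregator:[rate:][downsample:]metric{tagk=tagv,...}.
     * Pass null for downsample to omit it.
     */
    static String subQuery(String aggregator, boolean rate, String downsample,
                           String metric, Map<String, String> tags) {
        StringBuilder sb = new StringBuilder(aggregator).append(':');
        if (rate) {
            sb.append("rate:");
        }
        if (downsample != null) {
            sb.append(downsample).append(':');
        }
        sb.append(metric);
        if (!tags.isEmpty()) {
            sb.append('{');
            boolean first = true;
            for (Map.Entry<String, String> tag : tags.entrySet()) {
                if (!first) {
                    sb.append(',');
                }
                sb.append(tag.getKey()).append('=').append(tag.getValue());
                first = false;
            }
            sb.append('}');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("host", "server01");
        String m = subQuery("avg", false, "5m-avg", "cpu.load", tags);
        // Reproduces the query path shown in the example above.
        System.out.println("/api/query?start=1h-ago&m=" + m);
    }
}
```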

Additional Notes

Both InfluxDB and OpenTSDB are powerful time-series databases, but they have different strengths. InfluxDB offers a more user-friendly experience with its query language and built-in features, making it suitable for a wide range of applications. OpenTSDB, built on top of HBase, excels in handling extremely large datasets and offers better scalability for big data scenarios. The choice between the two depends on specific project requirements, expected data volume, and desired query flexibility.

grafana

68,692

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

Pros of Grafana

More versatile and supports multiple data sources, not limited to time-series data
Offers a rich, user-friendly interface with customizable dashboards and alerting
Active development with frequent updates and a large community

Cons of Grafana

Requires additional setup and configuration for data sources
Can be resource-intensive for large-scale deployments
Learning curve for advanced features and query languages

Code Comparison

Grafana (JavaScript):

const panel = new PanelModel({
  type: 'graph',
  title: 'CPU Usage',
  datasource: 'Prometheus',
  targets: [{ expr: 'node_cpu_usage' }]
});

OpenTSDB (Java):

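// TSSubQuery setters return void, so they cannot be chained.
TSSubQuery subQuery = new TSSubQuery();
subQuery.setMetric("sys.cpu.user");
subQuery.setAggregator("sum");

TSQuery query = new TSQuery();
query.setStart("1h-ago");
query.setEnd("now");
query.setQueries(new ArrayList<TSSubQuery>(Arrays.asList(subQuery)));
query.validateAndSetQuery();  // parses the time strings and checks the sub-queries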

While OpenTSDB focuses on efficient time-series data storage and querying, Grafana provides a more comprehensive visualization and monitoring solution. OpenTSDB excels in handling large volumes of time-series data, but Grafana offers greater flexibility and ease of use for creating dashboards and alerts across various data sources.

graphite-web

5,990

A highly scalable real-time graphing system

Pros of graphite-web

More user-friendly web interface for data visualization
Easier to set up and configure for smaller-scale deployments
Better integration with existing Python-based ecosystems

Cons of graphite-web

Less scalable for extremely large datasets compared to OpenTSDB
Limited support for high-cardinality metrics
Slower query performance for complex aggregations

Code Comparison

graphite-web

from django.conf.urls import url
from . import views

urlpatterns = [
    url('^render/?$', views.renderView, name='render'),
    url('^metrics/find/?$', views.metricsFindView, name='metrics_find'),
]

opentsdb

public class TsdbQuery {
  private final TSDB tsdb;
  private final List<String> tsuids = new ArrayList<String>();
  private final List<String> metrics = new ArrayList<String>();
  private long startTime;
  private long endTime;
}

The code snippets showcase the different approaches and languages used in each project. graphite-web uses Python with Django for URL routing, while OpenTSDB employs Java for its core query functionality.

timescaledb

18,976

A time-series database for high-performance real-time analytics packaged as a Postgres extension

Pros of TimescaleDB

Built on PostgreSQL, offering full SQL support and relational database features
Automatic partitioning and indexing for improved query performance
Seamless integration with existing PostgreSQL tools and ecosystem

Cons of TimescaleDB

Higher resource requirements compared to OpenTSDB
Steeper learning curve for users not familiar with PostgreSQL

Code Comparison

TimescaleDB:

CREATE TABLE metrics (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT,
  temperature DOUBLE PRECISION,
  cpu_usage   DOUBLE PRECISION
);

SELECT time_bucket('1 hour', time) AS hour,
       avg(temperature) AS avg_temp
FROM metrics
WHERE device_id = 'device1'
GROUP BY hour
ORDER BY hour;

OpenTSDB (data points sent through the TSD's telnet-style put command, followed by a query with the tsdb CLI):

put sys.cpu.user 1356998400 42.5 host=webserver01 cpu=0
put sys.cpu.user 1356998400 43.2 host=webserver01 cpu=1

tsdb query 1356998400 1356998460 sum sys.cpu.user host=webserver01

TimescaleDB offers familiar SQL syntax and relational database features, while OpenTSDB models each data point as a metric name, a timestamp, a value, and a set of tags. TimescaleDB provides more query flexibility and integrates with existing PostgreSQL tools, but may require more resources and has a steeper learning curve for users unfamiliar with PostgreSQL. OpenTSDB's data model is simpler, though it depends on a running HBase cluster.

VictoriaMetrics

14,388

VictoriaMetrics: fast, cost-effective monitoring solution and time series database

Pros of VictoriaMetrics

Higher ingestion and query performance, especially for high-cardinality data
Better storage efficiency, resulting in lower operational costs
Native support for multi-tenancy and horizontal scalability

Cons of VictoriaMetrics

Less mature ecosystem compared to OpenTSDB
Fewer third-party integrations and tools available
Steeper learning curve for users familiar with OpenTSDB's architecture

Code Comparison

VictoriaMetrics query example:

sum(rate(http_requests_total{job="api-server"}[5m])) by (method)

OpenTSDB query example:

/api/query?start=1h-ago&m=sum:rate:http.requests{job=api-server,method=*}

Both systems support similar query capabilities, but VictoriaMetrics uses PromQL-like syntax, which is more expressive and flexible compared to OpenTSDB's query language.

VictoriaMetrics offers better performance and scalability, making it suitable for large-scale deployments. However, OpenTSDB has a more established ecosystem and may be easier to integrate with existing tools. The choice between the two depends on specific requirements, such as performance needs, scalability, and existing infrastructure.

README

   ___                 _____ ____  ____  ____
  / _ \ _ __   ___ _ _|_   _/ ___||  _ \| __ )
 | | | | '_ \ / _ \ '_ \| | \___ \| | | |  _ \
 | |_| | |_) |  __/ | | | |  ___) | |_| | |_) |
  \___/| .__/ \___|_| |_|_| |____/|____/|____/
       |_|    The modern time series database.

OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.

Thanks to HBase's scalability, OpenTSDB allows you to collect thousands of metrics from tens of thousands of hosts and applications, at a high rate (every few seconds). OpenTSDB will never delete or downsample data and can easily store hundreds of billions of data points.

OpenTSDB is free software and is available under both LGPLv2.1+ and GPLv3+. Find more about OpenTSDB at http://opentsdb.net
