thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.


Top Related Projects

prometheus

59,181

The Prometheus monitoring system and time series database.

influxdb

29,932

Scalable datastore for metrics, events, and real-time analytics

VictoriaMetrics

14,388

VictoriaMetrics: fast, cost-effective monitoring solution and time series database

cortex

5,620

A horizontally scalable, highly available, multi-tenant, long term Prometheus.

timescaledb

18,976

A time-series database for high-performance real-time analytics packaged as a Postgres extension

Quick Overview

Thanos is an open-source project that extends Prometheus with high availability and long-term storage for metrics. It allows querying across multiple Prometheus instances and offers a global view of metrics, which makes it well suited to large-scale monitoring systems.

Pros

Enables long-term storage of Prometheus metrics
Provides a global query view across multiple Prometheus instances
Supports high availability and horizontal scalability
Offers downsampling and compaction features for efficient data management
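To make the downsampling idea concrete, here is a minimal sketch in Go that groups raw samples into fixed windows and keeps count, sum, min, and max per window. The `Sample`, `Aggregate`, and `Downsample` names are hypothetical and chosen for illustration; this is not the Thanos implementation, only the general technique of replacing many raw points with a few per-window aggregates.

```go
package main

import "fmt"

// Sample is one raw data point: a Unix-millisecond timestamp and a value.
type Sample struct {
	T int64
	V float64
}

// Aggregate summarizes all raw samples that fall into one window.
type Aggregate struct {
	Start    int64 // window start, Unix milliseconds
	Count    int
	Sum      float64
	Min, Max float64
}

// Downsample groups samples into fixed windows of windowMs milliseconds.
// Samples are assumed to be sorted by timestamp, with non-negative timestamps.
func Downsample(samples []Sample, windowMs int64) []Aggregate {
	var out []Aggregate
	for _, s := range samples {
		start := s.T - s.T%windowMs
		if n := len(out); n > 0 && out[n-1].Start == start {
			// Same window as the previous sample: fold the value in.
			a := &out[n-1]
			a.Count++
			a.Sum += s.V
			if s.V < a.Min {
				a.Min = s.V
			}
			if s.V > a.Max {
				a.Max = s.V
			}
			continue
		}
		// First sample of a new window.
		out = append(out, Aggregate{Start: start, Count: 1, Sum: s.V, Min: s.V, Max: s.V})
	}
	return out
}

func main() {
	raw := []Sample{{0, 1}, {60_000, 3}, {240_000, 2}, {300_000, 10}}
	for _, a := range Downsample(raw, 5*60*1000) {
		fmt.Printf("window=%d count=%d avg=%.1f min=%.0f max=%.0f\n",
			a.Start, a.Count, a.Sum/float64(a.Count), a.Min, a.Max)
	}
}
```

Keeping count and sum rather than a precomputed average lets averages over several windows be combined correctly later.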

Cons

Increases complexity compared to standalone Prometheus
Requires additional infrastructure and resources
Learning curve for setup and configuration
Potential performance overhead for smaller deployments

Code Examples

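Reading series through the Thanos Store API (gRPC). The Store API takes label matchers and a time range rather than a PromQL string, and Series is a server-streaming call; field names follow the storepb package and may differ between Thanos versions:

import (
    "context"
    "fmt"
    "io"

    "github.com/thanos-io/thanos/pkg/store/storepb"
)

// queryMetrics streams every series named metric between minT and maxT
// (Unix milliseconds) from a Store API endpoint and prints its labels.
func queryMetrics(ctx context.Context, client storepb.StoreClient, metric string, minT, maxT int64) error {
    stream, err := client.Series(ctx, &storepb.SeriesRequest{
        MinTime: minT,
        MaxTime: maxT,
        Matchers: []storepb.LabelMatcher{
            {Type: storepb.LabelMatcher_EQ, Name: "__name__", Value: metric},
        },
    })
    if err != nil {
        return err
    }
    for {
        resp, err := stream.Recv()
        if err == io.EOF {
            return nil // stream finished
        }
        if err != nil {
            return err
        }
        // A response frame carries either a series or a warning.
        if s := resp.GetSeries(); s != nil {
            fmt.Printf("Series: %v\n", s.Labels)
        }
    }
}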

Configuring Thanos Sidecar for Prometheus:

sidecar:
  - --tsdb.path=/path/to/prometheus/data
  - --prometheus.url=http://prometheus:9090
  - --objstore.config-file=/path/to/bucket_config.yaml

Setting up Thanos Compactor:

compactor:
  - --data-dir=/var/thanos/compactor
  - --objstore.config-file=/etc/thanos/bucket_config.yaml
  - --retention.resolution-raw=30d
  - --retention.resolution-5m=90d
  - --retention.resolution-1h=1y
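The effect of the three retention flags can be sketched as a small lookup: for data of a given age, which resolutions are still kept? The Go snippet below is an illustration only. `AvailableResolutions` is a hypothetical helper, `1y` is approximated as 365 days, and the sketch ignores the fact that downsampled blocks are only produced once data reaches a certain age.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// retention mirrors the three --retention.resolution-* flags shown above.
// 1y is approximated as 365 days for this illustration.
var retention = map[string]time.Duration{
	"raw": 30 * day,
	"5m":  90 * day,
	"1h":  365 * day,
}

// AvailableResolutions returns the resolutions still retained for data of
// the given age, finest first.
func AvailableResolutions(age time.Duration) []string {
	var out []string
	for _, res := range []string{"raw", "5m", "1h"} {
		if age <= retention[res] {
			out = append(out, res)
		}
	}
	return out
}

func main() {
	for _, d := range []int{7, 45, 200, 400} {
		fmt.Printf("%3d days old: %v\n", d, AvailableResolutions(time.Duration(d)*day))
	}
}
```

With these settings, a query over data older than 30 days can only be answered from downsampled blocks, and nothing older than a year remains.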

Getting Started

To get started with Thanos:

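Install the Thanos binary, either by downloading a release tarball or by building from source (go get no longer installs binaries in current Go versions):

git clone https://github.com/thanos-io/thanos.git
cd thanos
make build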

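Configure Prometheus for use with the Thanos Sidecar. External labels go under global in prometheus.yml and should be unique per Prometheus instance:

global:
  external_labels:
    cluster: demo

The TSDB path and retention are Prometheus command-line flags, not configuration file settings:

prometheus --config.file=prometheus.yml --storage.tsdb.path=/path/to/data --storage.tsdb.retention.time=24h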

Run Thanos Sidecar alongside Prometheus:

thanos sidecar --tsdb.path=/path/to/data --prometheus.url=http://localhost:9090 --objstore.config-file=bucket_config.yaml

Set up Thanos Query to fan out to the gRPC addresses of the sidecars (recent releases use --endpoint; older releases used --store):

thanos query --http-address 0.0.0.0:19192 --endpoint <sidecar1_address> --endpoint <sidecar2_address>
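Once Thanos Query is running, it serves a Prometheus-compatible HTTP API, so an instant query is a GET against /api/v1/query. The following Go sketch only builds such a URL. `BuildQueryURL` is a hypothetical helper, and the base address `http://localhost:19192` is an assumption that matches the --http-address used above. The `dedup` parameter asks the querier to merge results from replicas.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// BuildQueryURL returns an instant-query URL for a Thanos Query endpoint.
// base is an assumption, e.g. "http://localhost:19192".
func BuildQueryURL(base, promQL string, dedup bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"
	q := url.Values{}
	q.Set("query", promQL)
	q.Set("dedup", strconv.FormatBool(dedup))
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	u, err := BuildQueryURL("http://localhost:19192", "sum(up) by (cluster)", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

The resulting URL can be fetched with any HTTP client; the response body uses the same JSON envelope as the Prometheus query API.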

For more detailed setup instructions, refer to the official Thanos documentation.

Competitor Comparisons

prometheus

59,181

The Prometheus monitoring system and time series database.

Pros of Prometheus

Simpler setup and configuration for single-instance monitoring
Lower resource requirements for smaller-scale deployments
Native integration with Kubernetes and cloud-native ecosystems

Cons of Prometheus

Limited long-term storage capabilities
Challenges with horizontal scaling for large-scale deployments
Lack of built-in high availability features

Code Comparison

Prometheus configuration example:

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

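Thanos object storage configuration example (the file passed via --objstore.config-file). Each file holds a single top-level type and config pair. For local filesystem storage:

type: FILESYSTEM
config:
  directory: "/path/to/data"

For S3:

type: S3
config:
  bucket: "thanos"
  endpoint: "s3.amazonaws.com"
  access_key: "..."
  secret_key: "..."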

The code examples highlight the difference in configuration complexity. Prometheus has a simpler setup for basic monitoring, while Thanos requires additional configuration for object storage and distributed components.

Thanos builds upon Prometheus, extending its capabilities for long-term storage, high availability, and global query view. It's particularly useful for large-scale, multi-cluster environments where Prometheus alone may face limitations.

loki

25,277

Like Prometheus, but for logs.

Pros of Loki

Designed specifically for log aggregation, making it more efficient for log data
Simpler setup and lower resource requirements
Better integration with Grafana for visualization

Cons of Loki

Limited to log data, while Thanos can handle various metric types
Less mature ecosystem compared to Prometheus-based solutions
May require additional components for advanced querying and analysis

Code Comparison

Loki query example:

{app="myapp"} |= "error" | json | line_format "{{.message}}"

Thanos query example (using PromQL):

sum(rate(http_requests_total{job="api-server"}[5m])) by (method)

Key Differences

Loki focuses on log aggregation, while Thanos extends Prometheus for long-term storage and global querying
Loki indexes only the labels of each log stream rather than the full log text, whereas Thanos builds on Prometheus' time-series database model
Thanos downsamples historical metrics to speed up long-range queries, which has no direct counterpart in Loki

Use Cases

Choose Loki for centralized log management and analysis
Opt for Thanos when dealing with metrics at scale or requiring long-term storage of Prometheus data

Both projects are open-source and actively maintained, with strong community support. The choice between them depends on specific monitoring and observability requirements.

influxdb

29,932

Scalable datastore for metrics, events, and real-time analytics

Pros of InfluxDB

Purpose-built time series database with optimized data storage and querying
Integrated data collection, visualization, and alerting capabilities
Flexible data retention policies and continuous queries for data management

Cons of InfluxDB

Limited horizontal scalability compared to Thanos' distributed architecture
Lacks long-term storage capabilities without additional components
More complex setup for multi-node clusters

Code Comparison

InfluxDB query example:

SELECT mean("value") FROM "cpu_usage"
WHERE time >= now() - 1h
GROUP BY time(5m)

Thanos query example (using PromQL, evaluated as a range query over the last hour with a 5m step):

avg_over_time(cpu_usage[5m])

Key Differences

InfluxDB is a standalone time series database, while Thanos extends Prometheus for long-term storage and global querying
InfluxDB uses its own query language (InfluxQL or Flux), whereas Thanos uses PromQL
Thanos focuses on high availability and unlimited retention for Prometheus metrics, while InfluxDB provides a more general-purpose time series solution

Both projects offer powerful solutions for time series data management, with InfluxDB excelling in single-node deployments and integrated features, while Thanos shines in distributed Prometheus environments with long-term storage requirements.

VictoriaMetrics

14,388

VictoriaMetrics: fast, cost-effective monitoring solution and time series database

Pros of VictoriaMetrics

Simpler architecture and easier to set up
Lower resource consumption and better performance
Native multi-tenancy support

Cons of VictoriaMetrics

Less mature ecosystem and community compared to Thanos
Fewer integrations with other tools in the observability stack
Limited support for advanced querying features

Code Comparison

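VictoriaMetrics configuration example (single-node VictoriaMetrics is configured with command-line flags; a bare retentionPeriod number is interpreted as months):

-storageDataPath=/victoria-metrics-data
-retentionPeriod=1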

Thanos configuration example:

type: FILESYSTEM
config:
  directory: /var/thanos/store

Both projects aim to provide scalable, long-term storage solutions for Prometheus metrics, but they differ in their approach and feature set. VictoriaMetrics offers a more streamlined, single-binary solution with excellent performance characteristics, while Thanos provides a more modular architecture with a focus on high availability and seamless integration with existing Prometheus deployments.

VictoriaMetrics is well-suited for organizations looking for a simpler, high-performance solution, especially those with multi-tenancy requirements. Thanos, on the other hand, excels in complex, distributed environments where advanced querying capabilities and extensive ecosystem integration are crucial.

cortex

5,620

A horizontally scalable, highly available, multi-tenant, long term Prometheus.

Pros of Cortex

Offers multi-tenancy support out of the box
Provides a more comprehensive query frontend with caching capabilities
Integrates well with cloud-native storage solutions like S3, GCS, and Azure Blob Storage

Cons of Cortex

Generally more complex to set up and operate compared to Thanos
Requires more resources to run effectively, especially for smaller deployments
Less flexible in terms of deployment options, as it's designed primarily for cloud environments

Code Comparison

Cortex configuration example (blocks storage on Azure):

blocks_storage:
  backend: azure
  azure:
    account_name: my_account
    account_key: my_key
    container_name: cortex

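Thanos configuration example (object storage on GCS):

type: GCS
config:
  bucket: thanos-store
  # service_account expects the JSON key contents inline, not a file path.
  # If it is omitted, credentials are taken from GOOGLE_APPLICATION_CREDENTIALS.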

Both projects aim to provide scalable, long-term storage solutions for Prometheus metrics, but they approach the problem differently. Cortex offers a more integrated, cloud-native solution with built-in multi-tenancy, while Thanos provides a more modular approach that can be easier to adopt incrementally. The choice between the two often depends on specific use cases, existing infrastructure, and scalability requirements.

timescaledb

18,976

A time-series database for high-performance real-time analytics packaged as a Postgres extension

Pros of TimescaleDB

Specialized for time-series data, offering optimized performance for time-based queries
Seamless integration with PostgreSQL, leveraging existing ecosystem and tools
Built-in functions for time-series analysis and data retention policies

Cons of TimescaleDB

Limited to PostgreSQL, not as flexible for multi-database environments
Requires more setup and configuration compared to Thanos' out-of-the-box solution
May have higher resource requirements for large-scale deployments

Code Comparison

TimescaleDB (SQL query):

SELECT time_bucket('1 hour', time) AS hour,
       avg(temperature) AS avg_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY hour
ORDER BY hour;

Thanos (PromQL query, evaluated as a range query over the last 24 hours with a 1h step):

avg_over_time(temperature[1h])

TimescaleDB focuses on SQL-based time-series analysis within PostgreSQL, while Thanos extends Prometheus for long-term storage and querying of metrics data. TimescaleDB offers more advanced time-series functionality but is limited to PostgreSQL, whereas Thanos provides a scalable solution for Prometheus metrics across multiple data sources.


README


📢 ThanosCon happened on 19th March 2024 as a co-located half-day event at KubeCon EU in Paris.

Overview

Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments.

Thanos is a CNCF Incubating project.

Thanos leverages the Prometheus 2.0 storage format to cost-efficiently store historical metric data in any object storage while retaining fast query latencies. Additionally, it provides a global query view across all Prometheus installations and can merge data from Prometheus HA pairs on the fly.

Concretely the aims of the project are:

Global query view of metrics.
Unlimited retention of metrics.
High availability of components, including Prometheus.

Getting Started

Features

Global querying view across all connected Prometheus servers
Deduplication and merging of metrics collected from Prometheus HA pairs
Seamless integration with existing Prometheus setups
Any object storage as its only, optional dependency
Downsampling historical data for massive query speedup
Cross-cluster federation
Fault-tolerant query routing
Simple gRPC "Store API" for unified data access across all metric data
Easy integration points for custom metric providers

Architecture Overview

Deployment with Sidecar for Kubernetes:

[Diagram: Thanos deployment with Sidecar]

Deployment with Receive in order to scale out or implement with other remote write compatible sources:

[Diagram: Thanos deployment with Receive]

Thanos Philosophy

The philosophy of Thanos and our community borrows heavily from the UNIX philosophy and the Go programming language.

Each subcommand should do one thing and do it well
- e.g. thanos query proxies incoming calls to known store API endpoints merging the result
Write components that work together
- e.g. blocks should be stored in native prometheus format
Make it easy to read, write, and run components
- e.g. reduce complexity in system design and implementation

Releases

The main branch should be stable and usable. Every commit to main builds a Docker image named main-<date>-<sha>, published to quay.io/thanos/thanos and mirrored to thanosio/thanos on Docker Hub.

We also perform minor releases every 6 weeks.

During each release, we build tarballs for major platforms and publish Docker images.

See release process docs for details.

Contributing

Contributions are very welcome! See our CONTRIBUTING.md for more information.

Community

Thanos is an open source project and we value and welcome new contributors and members of the community. Here are ways to get in touch with the community:

Slack: #thanos
Issue Tracker: GitHub Issues

Adopters

See Adopters List.

Maintainers

See MAINTAINERS.md
