thanos
Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
Top Related Projects
The Prometheus monitoring system and time series database.
Like Prometheus, but for logs.
Scalable datastore for metrics, events, and real-time analytics
VictoriaMetrics: fast, cost-effective monitoring solution and time series database
A horizontally scalable, highly available, multi-tenant, long term Prometheus.
An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
Quick Overview
Thanos is an open-source project that extends Prometheus with high availability and long-term storage for metrics. It allows querying across multiple Prometheus instances and offers a global view of metrics, which makes it a good fit for large-scale monitoring systems.
Pros
- Enables long-term storage of Prometheus metrics
- Provides a global query view across multiple Prometheus instances
- Supports high availability and horizontal scalability
- Offers downsampling and compaction features for efficient data management
Cons
- Increases complexity compared to standalone Prometheus
- Requires additional infrastructure and resources
- Learning curve for setup and configuration
- Potential performance overhead for smaller deployments
Code Examples
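- Querying series through the Thanos Store API (gRPC):
import (
    "context"
    "errors"
    "fmt"
    "io"

    "github.com/thanos-io/thanos/pkg/store/storepb"
)

// querySeries streams every series of one metric between minT and maxT (Unix ms).
// The Store API takes label matchers and a time range, not a PromQL string.
func querySeries(ctx context.Context, client storepb.StoreClient, metric string, minT, maxT int64) error {
    stream, err := client.Series(ctx, &storepb.SeriesRequest{
        MinTime: minT,
        MaxTime: maxT,
        Matchers: []storepb.LabelMatcher{
            {Type: storepb.LabelMatcher_EQ, Name: "__name__", Value: metric},
        },
    })
    if err != nil {
        return err
    }
    for {
        // Series is a server-streaming RPC: read until the stream ends.
        resp, err := stream.Recv()
        if errors.Is(err, io.EOF) {
            return nil
        }
        if err != nil {
            return err
        }
        if s := resp.GetSeries(); s != nil {
            fmt.Printf("Series: %v\n", s.Labels)
        }
    }
}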
- Configuring Thanos Sidecar for Prometheus:
sidecar:
- --tsdb.path=/path/to/prometheus/data
- --prometheus.url=http://prometheus:9090
- --objstore.config-file=/path/to/bucket_config.yaml
- Setting up Thanos Compactor:
compactor:
- --data-dir=/var/thanos/compactor
- --objstore.config-file=/etc/thanos/bucket_config.yaml
- --retention.resolution-raw=30d
- --retention.resolution-5m=90d
- --retention.resolution-1h=1y
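The retention flags above keep raw data for 30 days, 5m-downsampled data for 90 days, and 1h-downsampled data for one year. As a rough illustration of how such a per-resolution policy can be evaluated, here is a minimal sketch. The `Resolution` type, the `retention` table, and the `expired` function are hypothetical names for this example and are not taken from the Thanos code base; `1y` is treated as 365 days.

```go
package main

import (
	"fmt"
	"time"
)

// Resolution identifies the downsampling level of a block (hypothetical type for this sketch).
type Resolution string

const (
	ResRaw Resolution = "raw"
	Res5m  Resolution = "5m"
	Res1h  Resolution = "1h"
)

const day = 24 * time.Hour

// retention mirrors the flag values from the compactor example; 1y is taken as 365 days.
var retention = map[Resolution]time.Duration{
	ResRaw: 30 * day,
	Res5m:  90 * day,
	Res1h:  365 * day,
}

// expired reports whether a block of the given resolution, whose newest
// sample is `age` old, has outlived the retention configured for it.
func expired(res Resolution, age time.Duration) bool {
	limit, ok := retention[res]
	if !ok {
		return false // unknown resolution: keep the block
	}
	return age > limit
}

func main() {
	fmt.Println(expired(ResRaw, 45*day)) // raw block older than its 30d limit
	fmt.Println(expired(Res5m, 45*day))  // 5m block still inside its 90d limit
}
```

With these values a 45-day-old raw block would be dropped while the 5m block covering the same period is kept, which is why longer retention is usually paired with the coarser resolutions.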
Getting Started
To get started with Thanos:
- Install Thanos binary:
go get github.com/thanos-io/thanos/cmd/thanos
- Configure Prometheus for the Thanos Sidecar. External labels go in the Prometheus configuration file:
global:
  external_labels:
    cluster: demo
The TSDB path and retention are set with Prometheus command-line flags, not in the configuration file:
prometheus --storage.tsdb.path=/path/to/data --storage.tsdb.retention.time=24h
- Run Thanos Sidecar alongside Prometheus:
thanos sidecar --tsdb.path=/path/to/data --prometheus.url=http://localhost:9090 --objstore.config-file=bucket_config.yaml
- Set up Thanos Query to access multiple Prometheus instances (recent releases name the --store flag --endpoint):
thanos query --http-address 0.0.0.0:19192 --store <sidecar1_address> --store <sidecar2_address>
For more detailed setup instructions, refer to the official Thanos documentation.
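Once Thanos Query is running, it can be queried over a Prometheus-compatible HTTP API. The sketch below only builds the request URL for an instant query, so it runs without a live server. It assumes the `/api/v1/query` path and the `dedup` parameter as exposed by Thanos Query; the `buildQueryURL` helper and the `localhost:19192` address (taken from the `--http-address` example above) are illustrative.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQueryURL returns an instant-query URL for a Prometheus-compatible
// HTTP API such as the one served by thanos query.
func buildQueryURL(base, promQL string, dedup bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"
	q := url.Values{}
	q.Set("query", promQL)
	q.Set("dedup", fmt.Sprintf("%t", dedup)) // ask the querier to merge HA replicas
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	u, err := buildQueryURL("http://localhost:19192", "up", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(u) // http://localhost:19192/api/v1/query?dedup=true&query=up
}
```

The resulting URL can be passed to `http.Get` or any HTTP client; the response follows the same JSON format as the Prometheus HTTP API.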
Competitor Comparisons
The Prometheus monitoring system and time series database.
Pros of Prometheus
- Simpler setup and configuration for single-instance monitoring
- Lower resource requirements for smaller-scale deployments
- Native integration with Kubernetes and cloud-native ecosystems
Cons of Prometheus
- Limited long-term storage capabilities
- Challenges with horizontal scaling for large-scale deployments
- Lack of built-in high availability features
Code Comparison
Prometheus configuration example:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Thanos configuration example (object storage configuration file, here backed by a local directory):
type: FILESYSTEM
config:
  directory: "/path/to/data"
The same file for an S3 bucket:
type: S3
config:
  bucket: "thanos"
  endpoint: "s3.amazonaws.com"
  access_key: "..."
  secret_key: "..."
The code examples highlight the difference in configuration complexity. Prometheus has a simpler setup for basic monitoring, while Thanos requires additional configuration for object storage and distributed components.
Thanos builds upon Prometheus, extending its capabilities for long-term storage, high availability, and global query view. It's particularly useful for large-scale, multi-cluster environments where Prometheus alone may face limitations.
Like Prometheus, but for logs.
Pros of Loki
- Designed specifically for log aggregation, making it more efficient for log data
- Simpler setup and lower resource requirements
- Better integration with Grafana for visualization
Cons of Loki
- Limited to log data, while Thanos can handle various metric types
- Less mature ecosystem compared to Prometheus-based solutions
- May require additional components for advanced querying and analysis
Code Comparison
Loki query example:
{app="myapp"} |= "error" | json | line_format "{{.message}}"
Thanos query example (using PromQL):
sum(rate(http_requests_total{job="api-server"}[5m])) by (method)
Key Differences
- Loki focuses on log aggregation, while Thanos extends Prometheus for long-term storage and global querying
- Loki indexes only the labels attached to a log stream, not the log content itself, whereas Thanos builds on Prometheus' time-series database model
- Thanos downsamples historical metrics to speed up long-range queries, which has no direct equivalent in Loki
Use Cases
- Choose Loki for centralized log management and analysis
- Opt for Thanos when dealing with metrics at scale or requiring long-term storage of Prometheus data
Both projects are open-source and actively maintained, with strong community support. The choice between them depends on specific monitoring and observability requirements.
Scalable datastore for metrics, events, and real-time analytics
Pros of InfluxDB
- Purpose-built time series database with optimized data storage and querying
- Integrated data collection, visualization, and alerting capabilities
- Flexible data retention policies and continuous queries for data management
Cons of InfluxDB
- Limited horizontal scalability compared to Thanos' distributed architecture
- Lacks long-term storage capabilities without additional components
- More complex setup for multi-node clusters
Code Comparison
InfluxDB query example:
SELECT mean("value") FROM "cpu_usage"
WHERE time >= now() - 1h
GROUP BY time(5m)
Thanos query example (using PromQL, evaluated as a range query over the last hour with a 5m step):
avg_over_time(cpu_usage[5m])
Key Differences
- InfluxDB is a standalone time series database, while Thanos extends Prometheus for long-term storage and global querying
- InfluxDB uses its own query language (InfluxQL or Flux), whereas Thanos uses PromQL
- Thanos focuses on high availability and unlimited retention for Prometheus metrics, while InfluxDB provides a more general-purpose time series solution
Both projects offer powerful solutions for time series data management, with InfluxDB excelling in single-node deployments and integrated features, while Thanos shines in distributed Prometheus environments with long-term storage requirements.
VictoriaMetrics: fast, cost-effective monitoring solution and time series database
Pros of VictoriaMetrics
- Simpler architecture and easier to set up
- Lower resource consumption and better performance
- Native multi-tenancy support
Cons of VictoriaMetrics
- Less mature ecosystem and community compared to Thanos
- Fewer integrations with other tools in the observability stack
- Limited support for advanced querying features
Code Comparison
VictoriaMetrics configuration example (command-line flags; a retentionPeriod without a unit suffix is in months):
-storageDataPath=/victoria-metrics-data
-retentionPeriod=1
Thanos configuration example:
type: FILESYSTEM
config:
  directory: /var/thanos/store
Both projects aim to provide scalable, long-term storage solutions for Prometheus metrics, but they differ in their approach and feature set. VictoriaMetrics offers a more streamlined, single-binary solution with excellent performance characteristics, while Thanos provides a more modular architecture with a focus on high availability and seamless integration with existing Prometheus deployments.
VictoriaMetrics is well-suited for organizations looking for a simpler, high-performance solution, especially those with multi-tenancy requirements. Thanos, on the other hand, excels in complex, distributed environments where advanced querying capabilities and extensive ecosystem integration are crucial.
A horizontally scalable, highly available, multi-tenant, long term Prometheus.
Pros of Cortex
- Offers multi-tenancy support out of the box
- Provides a more comprehensive query frontend with caching capabilities
- Integrates well with cloud-native storage solutions like S3, GCS, and Azure Blob Storage
Cons of Cortex
- Generally more complex to set up and operate compared to Thanos
- Requires more resources to run effectively, especially for smaller deployments
- Less flexible in terms of deployment options, as it's designed primarily for cloud environments
Code Comparison
Cortex configuration example:
storage:
engine: blocks
azure:
account_name: my_account
account_key: my_key
container_name: cortex
Thanos configuration example (object storage configuration file):
type: GCS
config:
  bucket: thanos-store
  # service_account expects the JSON key contents inline rather than a file path
  service_account: ""
Both projects aim to provide scalable, long-term storage solutions for Prometheus metrics, but they approach the problem differently. Cortex offers a more integrated, cloud-native solution with built-in multi-tenancy, while Thanos provides a more modular approach that can be easier to adopt incrementally. The choice between the two often depends on specific use cases, existing infrastructure, and scalability requirements.
An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
Pros of TimescaleDB
- Specialized for time-series data, offering optimized performance for time-based queries
- Seamless integration with PostgreSQL, leveraging existing ecosystem and tools
- Built-in functions for time-series analysis and data retention policies
Cons of TimescaleDB
- Limited to PostgreSQL, not as flexible for multi-database environments
- Requires more setup and configuration compared to Thanos' out-of-the-box solution
- May have higher resource requirements for large-scale deployments
Code Comparison
TimescaleDB (SQL query):
SELECT time_bucket('1 hour', time) AS hour,
avg(temperature) AS avg_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY hour
ORDER BY hour;
Thanos (PromQL query):
avg_over_time(temperature[1h])
TimescaleDB focuses on SQL-based time-series analysis within PostgreSQL, while Thanos extends Prometheus for long-term storage and querying of metrics data. TimescaleDB offers more advanced time-series functionality but is limited to PostgreSQL, whereas Thanos provides a scalable solution for Prometheus metrics across multiple data sources.
README
📢 ThanosCon happened on 19th March 2024 as a co-located half-day on KubeCon EU in Paris.
Overview
Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments.
Thanos is a CNCF Incubating project.
Thanos leverages the Prometheus 2.0 storage format to cost-efficiently store historical metric data in any object storage while retaining fast query latencies. Additionally, it provides a global query view across all Prometheus installations and can merge data from Prometheus HA pairs on the fly.
Concretely the aims of the project are:
- Global query view of metrics.
- Unlimited retention of metrics.
- High availability of components, including Prometheus.
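To make the idea of merging data from Prometheus HA pairs more concrete, here is a deliberately simplified sketch. It treats series that differ only in a replica label as the same series and keeps one sample per timestamp. This is not the algorithm Thanos uses, and the `Sample` and `Series` types, the `dedup` function, and the `replica` label name are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is a single (timestamp, value) point; T is in Unix milliseconds.
type Sample struct {
	T int64
	V float64
}

// Series is a labelled list of samples.
type Series struct {
	Labels  map[string]string
	Samples []Sample
}

// keyWithout builds a stable identity for a series, ignoring the replica label.
func keyWithout(labels map[string]string, replicaLabel string) string {
	names := make([]string, 0, len(labels))
	for n := range labels {
		if n != replicaLabel {
			names = append(names, n)
		}
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "%s=%q,", n, labels[n])
	}
	return b.String()
}

// dedup merges series that differ only in replicaLabel, keeping one sample per timestamp.
func dedup(in []Series, replicaLabel string) map[string][]Sample {
	byTime := map[string]map[int64]float64{}
	for _, s := range in {
		k := keyWithout(s.Labels, replicaLabel)
		if byTime[k] == nil {
			byTime[k] = map[int64]float64{}
		}
		for _, p := range s.Samples {
			if _, seen := byTime[k][p.T]; !seen {
				byTime[k][p.T] = p.V // first replica wins for a given timestamp
			}
		}
	}
	out := map[string][]Sample{}
	for k, m := range byTime {
		for t, v := range m {
			out[k] = append(out[k], Sample{T: t, V: v})
		}
		sort.Slice(out[k], func(i, j int) bool { return out[k][i].T < out[k][j].T })
	}
	return out
}

func main() {
	a := Series{Labels: map[string]string{"__name__": "up", "replica": "a"}, Samples: []Sample{{1000, 1}, {2000, 1}}}
	b := Series{Labels: map[string]string{"__name__": "up", "replica": "b"}, Samples: []Sample{{2000, 1}, {3000, 1}}}
	for k, samples := range dedup([]Series{a, b}, "replica") {
		fmt.Println(k, len(samples))
	}
}
```

In this toy input, replica `a` is missing the sample at 3000 ms and replica `b` is missing the one at 1000 ms, yet the merged series covers all three timestamps. Real replicas rarely scrape at identical timestamps, so a production implementation has to choose between replicas over time rather than match timestamps exactly.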
Getting Started
Features
- Global querying view across all connected Prometheus servers
- Deduplication and merging of metrics collected from Prometheus HA pairs
- Seamless integration with existing Prometheus setups
- Any object storage as its only, optional dependency
- Downsampling historical data for massive query speedup
- Cross-cluster federation
- Fault-tolerant query routing
- Simple gRPC "Store API" for unified data access across all metric data
- Easy integration points for custom metric providers
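The downsampling feature listed above trades resolution for query speed: many raw samples are replaced by a few per-window aggregates. The following is a minimal sketch of that idea, assuming time-sorted samples with non-negative timestamps. The `Aggregate` type and `downsample` function are hypothetical and only illustrate the principle; they do not reproduce the Thanos block format.

```go
package main

import "fmt"

// Sample is a raw (timestamp, value) point; T is in Unix milliseconds.
type Sample struct {
	T int64
	V float64
}

// Aggregate summarises all raw samples that fall into one window.
type Aggregate struct {
	Start int64 // window start, Unix milliseconds
	Count int
	Sum   float64
	Min   float64
	Max   float64
}

// downsample groups time-sorted samples into fixed windows of windowMs
// and keeps count, sum, min and max for each window.
func downsample(samples []Sample, windowMs int64) []Aggregate {
	var out []Aggregate
	for _, s := range samples {
		start := s.T - s.T%windowMs
		if n := len(out); n == 0 || out[n-1].Start != start {
			// First sample of a new window.
			out = append(out, Aggregate{Start: start, Count: 1, Sum: s.V, Min: s.V, Max: s.V})
			continue
		}
		a := &out[len(out)-1]
		a.Count++
		a.Sum += s.V
		if s.V < a.Min {
			a.Min = s.V
		}
		if s.V > a.Max {
			a.Max = s.V
		}
	}
	return out
}

func main() {
	raw := []Sample{{0, 1}, {60000, 3}, {120000, 2}, {300000, 10}}
	for _, a := range downsample(raw, 5*60*1000) {
		fmt.Printf("start=%d count=%d avg=%.1f min=%.1f max=%.1f\n",
			a.Start, a.Count, a.Sum/float64(a.Count), a.Min, a.Max)
	}
}
```

Keeping count and sum separately, instead of a precomputed average, lets a query combine several windows later without losing accuracy.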
Architecture Overview
Deployment with Sidecar for Kubernetes:
Deployment with Receive in order to scale out or implement with other remote write compatible sources:
Thanos Philosophy
The philosophy of Thanos and its community borrows much from the UNIX philosophy and the Go programming language.
- Each subcommand should do one thing and do it well
  - e.g. thanos query proxies incoming calls to known Store API endpoints, merging the results
- Write components that work together
  - e.g. blocks should be stored in native Prometheus format
- Make it easy to read, write, and run components
  - e.g. reduce complexity in system design and implementation
Releases
The main branch should be stable and usable. Every commit to main builds a Docker image named main-<date>-<sha> in quay.io/thanos/thanos and thanosio/thanos (Docker Hub mirror).
We also perform minor releases every 6 weeks.
As part of each release, we build tarballs for major platforms and release Docker images.
See release process docs for details.
Contributing
Contributions are very welcome! See our CONTRIBUTING.md for more information.
Community
Thanos is an open source project and we value and welcome new contributors and members of the community. Here are ways to get in touch with the community:
- Slack: #thanos
- Issue Tracker: GitHub Issues
Adopters
See Adopters List.
Maintainers
See MAINTAINERS.md