Thanos is an open-source CNCF Sandbox project that builds upon Prometheus components to create a global-scale and highly available monitoring system. It seamlessly extends Prometheus in a few simple steps and is already used in production by dozens of companies that aim for high multi-cloud scale for metrics while keeping low maintenance cost. One of the key features of Thanos is Global Querying, which provides a unified view of metrics across multiple Prometheus instances. This is achieved through the Thanos Query component, which aggregates data from multiple Prometheus instances and provides a single endpoint for querying metrics.
The rest of this section covers what the Querier is for, how it deduplicates data from replicated Prometheus instances, and which options its Query API exposes.
Querier use cases: why do I need this component? The Thanos Querier evaluates a PromQL query against data gathered from many Prometheus instances and returns one merged result. Without it, each Prometheus server has to be queried separately and the results combined by hand.
Global View: The Thanos Querier provides a global view of metrics by reading from every Prometheus instance it is connected to. A Thanos Sidecar is deployed alongside each Prometheus instance and exposes that instance's data through the Store API. The Sidecar does not push data to the Querier; the Querier pulls the series it needs from each Sidecar at query time.
Run-time deduplication of HA groups: Prometheus is commonly made highly available by running two or more identical replicas that scrape the same targets. Each replica stores its own copy of every series, so a naive global query would return each series once per replica. The Querier merges these copies at query time, using the external labels that distinguish one replica from another (the replica labels).
Metric Query Flow Overview: The Querier does not store data itself. For each query it fans out over gRPC to the Store API endpoints it knows about (Sidecars and, among others, Store Gateways), fetches the matching series, merges and optionally deduplicates them, and then evaluates the PromQL expression on the combined data.
Deduplication: The Querier deduplicates at query time, based on the replica labels configured with the `--query.replica-label` flag (the flag can be repeated). Series whose label sets are identical once the replica labels are ignored are treated as copies of the same series and merged into one. The replica label should therefore be an external label that uniquely identifies each replica within an HA group.
An example with a single replica label: Suppose two Prometheus instances scrape the same targets and differ only in the external label `replica` (`replica="A"` and `replica="B"`). With `--query.replica-label=replica`, a query for `up` with deduplication enabled returns one `up` series per target instead of two, and the `replica` label is not part of the result. With deduplication disabled, both copies are returned, each carrying its `replica` label.
The same output will be present for this example with multiple replica labels: If some instances identify their replica with a differently named label (for example `replicaX`), every such label can be listed as a replica label. The Querier ignores all of them when comparing series, so the copies are still merged into a single series and none of the replica labels appear in the result.
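The merge described in the two examples above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the algorithm Thanos implements: it assumes series are plain `(labels, samples)` pairs and keeps the first value seen for each timestamp. The function name `deduplicate` and the sample data are hypothetical.

```python
from collections import defaultdict


def deduplicate(series, replica_labels):
    """Merge series that differ only in the given replica labels.

    `series` is a list of (labels, samples) pairs: `labels` is a dict and
    `samples` is a list of (timestamp, value) tuples.
    """
    merged = defaultdict(dict)
    for labels, samples in series:
        # Identity of the series once the replica labels are dropped.
        key = tuple(sorted((k, v) for k, v in labels.items()
                           if k not in replica_labels))
        for ts, value in samples:
            # Simplification: keep the first value seen for a timestamp.
            merged[key].setdefault(ts, value)
    return [(dict(key), sorted(points.items())) for key, points in merged.items()]


series = [
    ({"__name__": "up", "job": "prometheus", "cluster": "1", "replica": "A"},
     [(10, 1.0), (20, 1.0)]),
    ({"__name__": "up", "job": "prometheus", "cluster": "1", "replicaX": "B"},
     [(10, 1.0), (30, 1.0)]),
]

# Both label names are treated as replica labels, so the two copies collapse.
result = deduplicate(series, {"replica", "replicaX"})
```

Here `result` holds a single series whose labels contain neither `replica` nor `replicaX`, and whose samples cover timestamps 10, 20 and 30, so a gap in one replica is filled by the other.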
Query API Overview: The Querier exposes an HTTP Query API that is compatible with the Prometheus HTTP v1 API, so tools that can talk to Prometheus (Grafana, for example) can use the Querier as a data source. It supports both instant queries and range queries, and adds a small number of Thanos-specific request parameters and response fields, described below.
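As a rough illustration of the two query types, the sketch below builds request URLs for the Prometheus-style `/api/v1/query` and `/api/v1/query_range` endpoints. The host name and port in `QUERIER` are placeholders, and the helper names are hypothetical; only the endpoint paths and parameter names come from the Prometheus HTTP API.

```python
from urllib.parse import urlencode

# Hypothetical Querier address; replace with your own deployment.
QUERIER = "http://thanos-query.example:10902"


def instant_query_url(expr, time=None):
    """URL for evaluating `expr` at a single point in time."""
    params = {"query": expr}
    if time is not None:
        params["time"] = time
    return f"{QUERIER}/api/v1/query?{urlencode(params)}"


def range_query_url(expr, start, end, step):
    """URL for evaluating `expr` over [start, end] every `step` seconds."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{QUERIER}/api/v1/query_range?{urlencode(params)}"


instant = instant_query_url("up")
ranged = range_query_url("up", start=0, end=60, step=15)
```

The resulting URLs can be fetched with any HTTP client; the response body is JSON in the same shape Prometheus returns.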
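Partial Response: Because a query fans out to many Store API endpoints, some of them may be unavailable or time out while others answer. A partial response is a result that may be missing data for that reason. When partial response is allowed, the Querier returns whatever it could collect together with warnings describing what failed; when it is not allowed, any failing endpoint causes the whole query to fail. Partial response trades completeness for availability; it is not a way of streaming results early.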
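The trade-off can be illustrated with a small fan-out sketch. It is a toy model under stated assumptions, not the Querier's implementation: stores are plain Python callables, a failing store raises `ConnectionError`, and the names `fan_out`, `healthy` and `broken` are hypothetical.

```python
def fan_out(stores, partial_response=True):
    """Collect series from every store; `stores` maps a name to a callable."""
    series, warnings = [], []
    for name, fetch in stores.items():
        try:
            series.extend(fetch())
        except ConnectionError as err:
            if not partial_response:
                # Partial response disabled: one failing store fails the query.
                raise
            # Partial response enabled: keep going and report what was missed.
            warnings.append(f"store {name}: {err}")
    return series, warnings


def healthy():
    return ['up{cluster="1"}']


def broken():
    raise ConnectionError("unavailable")


stores = {"sidecar-a": healthy, "sidecar-b": broken}
series, warnings = fan_out(stores)
```

With partial response enabled the caller receives the series from `sidecar-a` plus one warning about `sidecar-b`. Calling `fan_out(stores, partial_response=False)` raises instead, which is the safer choice when a silently incomplete result would be misleading, for example in alerting.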
Deduplication replica labels: A request can supply its own list of replica labels through the `replicaLabels` request parameter. When present, this list is used for that request instead of the labels configured with `--query.replica-label`, which is useful when different tenants or dashboards need to deduplicate on different labels.
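Deduplication Enabled: Deduplication is controlled per request by the boolean `dedup` parameter, which defaults to true. Setting `dedup=false` returns every replica's copy of each series with its replica labels intact, which helps when debugging a single replica. Deduplication affects the correctness of results for HA setups; it is not a performance optimization.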
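Auto downsampling: The Querier does not downsample data itself. Downsampled blocks are produced ahead of time (by the Thanos Compactor) and the Querier only chooses which resolution to read. The `max_source_resolution` request parameter sets the coarsest resolution a query is allowed to use. With the `--query.auto-downsampling` flag, the Querier derives this value from the query step, so long-range queries with a large step read downsampled data while short-range queries keep reading raw samples. Cardinality plays no part in this choice.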
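A minimal sketch of how a resolution might be chosen from the query step follows. Two things here are assumptions rather than statements from this document: that auto downsampling sets the maximum source resolution to one fifth of the step, and that the available resolutions are raw, 5 minutes and 1 hour. The function names are hypothetical.

```python
# Assumed resolutions in seconds: raw data, 5-minute and 1-hour downsampling.
RESOLUTIONS_SECONDS = [0, 300, 3600]


def auto_max_source_resolution(step_seconds):
    """Assumed heuristic: allow source data up to one fifth of the step."""
    return step_seconds / 5


def pick_resolution(max_source_resolution):
    """Coarsest available resolution that does not exceed the limit."""
    candidates = [r for r in RESOLUTIONS_SECONDS if r <= max_source_resolution]
    return max(candidates)


# A 1-minute step stays on raw data; a 6-hour step can use 1-hour data.
fine = pick_resolution(auto_max_source_resolution(60))
medium = pick_resolution(auto_max_source_resolution(1800))
coarse = pick_resolution(auto_max_source_resolution(21600))
```

The point of the heuristic is that every step of the rendered graph still covers several source samples, so reading coarser data does not visibly change the result.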
Partial Response Strategy: Whether a given request tolerates partial responses is set with the boolean `partial_response` parameter; its default comes from the `--query.partial-response` flag. With partial response enabled, failures of individual Store API endpoints are reported as warnings and the query still succeeds; with it disabled, any such failure aborts the query with an error.
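The per-request options described above are ordinary URL or form parameters, so they can be added to any Prometheus-style query. The sketch below encodes `dedup`, `partial_response` and `max_source_resolution` into a query string; the helper name `thanos_query_params` and the values used are illustrative only.

```python
from urllib.parse import urlencode


def thanos_query_params(expr, dedup=True, partial_response=True,
                        max_source_resolution=None):
    """Encode a PromQL expression plus Thanos-specific request options."""
    def as_bool(flag):
        # Send booleans in lowercase rather than Python's "True"/"False".
        return "true" if flag else "false"

    params = {
        "query": expr,
        "dedup": as_bool(dedup),
        "partial_response": as_bool(partial_response),
    }
    if max_source_resolution is not None:
        params["max_source_resolution"] = max_source_resolution
    return urlencode(params)


query_string = thanos_query_params("up", dedup=False, max_source_resolution="5m")
```

The resulting string can be appended to the Querier's `/api/v1/query` or `/api/v1/query_range` path. Parameters that are left out fall back to the defaults configured on the Querier.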
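Custom Response Fields: The Querier may add fields to the JSON response that the Prometheus API does not define. The main one is `warnings`, a list of messages describing problems that did not fail the query, such as an endpoint skipped under partial response. Extra fields do not break compatibility with Prometheus clients, but a client that does not know about them will simply ignore them. These fields are set by the Querier; they are not a way for users to customize the response format.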
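A client that wants to surface these warnings only needs to read one extra key. The sketch below assumes a response body shaped like the Prometheus API's JSON envelope with an added `warnings` list; the sample body and the function name `split_response` are made up for illustration.

```python
import json

# Illustrative response body; the warning text is invented.
BODY = """
{
  "status": "success",
  "data": {"resultType": "vector", "result": []},
  "warnings": ["store sidecar-b unavailable"]
}
"""


def split_response(body):
    """Return (data, warnings); warnings is empty when the field is absent."""
    payload = json.loads(body)
    return payload["data"], payload.get("warnings", [])


data, warnings = split_response(BODY)
```

Using `payload.get` keeps the same code working against a plain Prometheus server or a response without warnings.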
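Expose UI on a sub-path: By default the Querier serves its UI and API at the root path. When it runs behind a reverse proxy that publishes it under a sub-path, the `--web.route-prefix`, `--web.external-prefix` and `--web.prefix-header` flags let the Querier serve its routes under a prefix or generate links and redirects that include the prefix, so the UI keeps working at the proxied URL.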
For more information, please refer to the following resources:
- Thanos Query documentation: https://thanos.io/v0.11/query.md
- Thanos gRPC Query API proposal: https://github.com/thanos-io/thanos/blob/main/docs/proposals-done/202203-grpc-query-api.md
- Thanos Distributed Query Execution proposal: https://github.com/thanos-io/thanos/blob/main/docs/proposals-accepted/202301-distributed-query-execution.md
- Thanos Query Logging proposal: https://github.com/thanos-io/thanos/blob/main/docs/proposals-done/202005-query-logging.md