Sample Data Handling - jaegertracing/jaeger-lib

The sample package in Jaeger helps manage the sampling of trace data for efficient analysis. Jaeger provides different sampling strategies to control the amount of data collected, reducing storage and processing requirements while still providing valuable information for debugging and performance optimization.

There are several possible options for configuring sampling in Jaeger:

  1. File-based Sampling Configuration: Jaeger collectors can be started with the --sampling.strategies-file option, which points to a file containing the sampling strategies to be served to Jaeger clients. The option accepts either a path to a local JSON file or an HTTP URL, and the contents can be reloaded periodically. If no configuration is provided, the collectors return the default probabilistic sampling policy with a probability of 0.001 (0.1%) for all services.

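Example strategies.json, which samples every trace for my-service and falls back to the 0.1% probabilistic default for all other services:

    {
      "service_strategies": [
        {
          "service": "my-service",
          "type": "probabilistic",
          "param": 1.0
        }
      ],
      "default_strategy": {
        "type": "probabilistic",
        "param": 0.001
      }
    }
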
  2. Remote Sampling: Client SDKs can be configured to fetch their sampling configuration remotely, so that sampling rates are controlled centrally via Jaeger collectors. In this setup, the collector serves the client SDK a sampling strategy that describes the service's endpoints and their sampling probabilities.

  3. Adaptive Sampling: Jaeger collectors can dynamically calculate sampling strategies based on observed traffic. Adaptive sampling stores the observed traffic data and the computed probabilities in memory or in Cassandra.

  4. Environment Variables: Client libraries can be configured through environment variables, such as JAEGER_SAMPLER_TYPE and JAEGER_SAMPLER_PARAM for the sampler, alongside general tracer settings such as JAEGER_TAGS.

  5. Client Libraries: Jaeger provides client libraries for various programming languages, allowing developers to initialize Jaeger tracers with customized parameters, such as a non-default sampler or a different location for the Jaeger agent.
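How a per-service lookup over a strategies file might behave can be sketched in Python. This is a minimal illustration under stated assumptions, not Jaeger's implementation: the function names resolve_strategy and should_sample are hypothetical, the service_strategies and default_strategy keys are assumed to follow the strategies file layout from the Jaeger documentation, and only the probabilistic type is handled. The fallback mirrors the default of probabilistic sampling at 0.001 described for collectors without a configuration.

```python
import json
import random

# Fallback when nothing is configured: 0.1% probabilistic sampling.
DEFAULT_STRATEGY = {"type": "probabilistic", "param": 0.001}


def resolve_strategy(config: dict, service: str) -> dict:
    """Return the strategy for `service`, falling back to the defaults."""
    for entry in config.get("service_strategies", []):
        if entry.get("service") == service:
            return {"type": entry["type"], "param": entry["param"]}
    return config.get("default_strategy", DEFAULT_STRATEGY)


def should_sample(strategy: dict, rng: random.Random) -> bool:
    """Make a probabilistic decision; other strategy types are out of scope."""
    if strategy["type"] != "probabilistic":
        raise ValueError(f"unsupported strategy type: {strategy['type']}")
    # rng.random() is in [0, 1), so param 1.0 always samples and 0.0 never does.
    return rng.random() < strategy["param"]


# Hypothetical file contents, inlined so the sketch is self-contained.
config = json.loads("""
{
  "service_strategies": [
    {"service": "my-service", "type": "probabilistic", "param": 1.0}
  ],
  "default_strategy": {"type": "probabilistic", "param": 0.5}
}
""")

rng = random.Random(42)
mine = resolve_strategy(config, "my-service")
print(mine, should_sample(mine, rng))
print(resolve_strategy(config, "unknown-service"))
print(resolve_strategy({}, "any-service"))
```

Keeping the lookup separate from the sampling decision means the same resolved strategy can be handed to a client unchanged, which is roughly the division of labor that remote sampling implies.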

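Reading the sampler settings from the environment can be sketched as follows. The helper sampler_from_env is hypothetical and only illustrates parsing JAEGER_SAMPLER_TYPE and JAEGER_SAMPLER_PARAM. Real client libraries do this internally, and the fallback values used here are assumptions for the sketch rather than any particular client's defaults.

```python
import os


def sampler_from_env(env=None) -> dict:
    """Build a sampler description from the JAEGER_SAMPLER_* variables."""
    env = os.environ if env is None else env
    # Assumed fallbacks for this sketch only.
    sampler_type = env.get("JAEGER_SAMPLER_TYPE", "probabilistic")
    raw_param = env.get("JAEGER_SAMPLER_PARAM", "0.001")
    try:
        param = float(raw_param)
    except ValueError:
        raise ValueError(f"JAEGER_SAMPLER_PARAM is not a number: {raw_param!r}")
    # A probability outside [0, 1] is almost certainly a configuration mistake.
    if sampler_type == "probabilistic" and not 0.0 <= param <= 1.0:
        raise ValueError("probabilistic param must be within [0, 1]")
    return {"type": sampler_type, "param": param}


# Pass a plain dict instead of os.environ to keep the example reproducible.
print(sampler_from_env({"JAEGER_SAMPLER_TYPE": "const", "JAEGER_SAMPLER_PARAM": "1"}))
print(sampler_from_env({}))
```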
For more information, refer to the following links: