Monitoring and Logging for fluxcd/flux2

This guide walks through monitoring Flux v2 (fluxcd/flux2) in production environments. It focuses on the monitoring configuration introduced in Flux v2.2, which builds on kube-prometheus-stack.

Overview

As of Flux v2.2, the Flux monitoring configuration has been reworked. Users are encouraged to move to the new setup, which is built on kube-prometheus-stack and uses its kube-state-metrics component to export custom Prometheus metrics for Flux resources.

To implement monitoring in a Flux production deployment, follow the structured approach outlined below.

Step 1: Setting Up kube-prometheus-stack

The kube-prometheus-stack packages Prometheus, Grafana, and various exporters including kube-state-metrics. To deploy this stack for monitoring Flux, you can utilize Helm or Kustomize.

Example using Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus prometheus-community/kube-prometheus-stack

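These commands add the chart repository and install kube-prometheus-stack with its default configuration, under the release name kube-prometheus and in the current namespace. To customize the installation, pass your own values.yaml to helm install with the -f flag.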

Step 2: Configuring Flux Custom Metrics

To monitor Flux's state and performance, Prometheus needs metrics that describe the resources Flux manages. Flux v2 supports custom Prometheus metrics for this purpose, and alerting rules can then be written against them to track the health and status of those resources.

Create a Custom Metrics Configuration

Create a new YAML file (e.g., flux-metrics.yaml) containing a PrometheusRule that alerts on a Flux metric.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: flux-metrics
  namespace: flux-system
spec:
  groups:
  - name: flux-metrics
    rules:
    - alert: FluxHelmReleaseFailed
      expr: flux_helm_release_failed_count > 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Flux Helm Release {{ $labels.release }} has failed"
        description: "Flux Helm Release {{ $labels.release }} in namespace {{ $labels.namespace }} has failed"

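This example defines an alert for failed Helm releases; it does not create a metric by itself. The expression assumes that a metric named flux_helm_release_failed_count, carrying release and namespace labels, is exported in your cluster, so confirm the metric and label names against what Prometheus actually scrapes before relying on the alert. Note also that a default kube-prometheus-stack installation typically loads only PrometheusRule objects labeled with the Helm release name (here, release: kube-prometheus); add that label under metadata if the rule does not show up in Prometheus.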

Step 3: Deploying Custom Metrics

Apply the custom metrics configuration to your cluster:

kubectl apply -f flux-metrics.yaml
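
Once the rule is applied, one way to check that the expression behind the alert evaluates at all is to run it as an instant query against the Prometheus HTTP API. The following is a minimal sketch, not part of Flux: the PROM_URL environment variable is a hypothetical input (for example, the address of a port-forwarded Prometheus service), and the expression is copied from the rule above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// queryResponse mirrors the small part of the Prometheus instant-query
// response (GET /api/v1/query) that this sketch needs.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []json.RawMessage `json:"result"`
	} `json:"data"`
}

// countSeries parses an instant-query response body and returns how many
// series the expression matched.
func countSeries(body []byte) (int, error) {
	var r queryResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, err
	}
	if r.Status != "success" {
		return 0, fmt.Errorf("query status %q", r.Status)
	}
	return len(r.Data.Result), nil
}

// buildQueryURL builds the instant-query URL for a PromQL expression.
func buildQueryURL(base, expr string) string {
	return base + "/api/v1/query?" + url.Values{"query": {expr}}.Encode()
}

func main() {
	// PROM_URL is a hypothetical variable, e.g. http://localhost:9090
	// after port-forwarding the Prometheus service.
	base := os.Getenv("PROM_URL")
	expr := "flux_helm_release_failed_count > 0" // expression from the rule
	if base == "" {
		fmt.Println("set PROM_URL to run the query for:", expr)
		return
	}
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(buildQueryURL(base, expr))
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	n, err := countSeries(body)
	if err != nil {
		fmt.Println("unexpected response:", err)
		return
	}
	fmt.Printf("%d series currently match %q\n", n, expr)
}
```

A result of zero series is expected while no release is failing. To tell a healthy cluster apart from a metric that does not exist, run the same query without the "> 0" comparison.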

Step 4: Monitoring Resources

Beyond metrics, the readiness of resources managed by Flux can be checked programmatically with the StatusChecker from the pkg/status package of the flux2 codebase, which polls Kubernetes objects until they are ready or a timeout expires.

Implementation of StatusChecker

The StatusChecker can be used in production monitoring scripts or integrated into existing CI/CD pipelines. Below is a sample function that checks the status of a set of resources.

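package main

import (
  "time"

  "github.com/fluxcd/cli-utils/pkg/object"
  "github.com/fluxcd/flux2/v2/pkg/log"
  "github.com/fluxcd/flux2/v2/pkg/status"
  "k8s.io/apimachinery/pkg/runtime/schema"
  "k8s.io/client-go/rest"
)

// checkResourceStatus blocks until every listed object is ready or the timeout expires.
// Assumption: the import paths and types used here (object.ObjMetadata, log.Logger,
// *rest.Config) follow the flux2 v2 module layout; verify them against your Flux version.
func checkResourceStatus(cfg *rest.Config, pollInterval, timeout time.Duration, logger log.Logger) error {
  checker, err := status.NewStatusChecker(cfg, pollInterval, timeout, logger)
  if err != nil {
    return err
  }

  identifiers := []object.ObjMetadata{
    {
      Name:      "example-resource",
      Namespace: "flux-system",
      // The group and kind identify the resource type; a Kustomization is used as an example.
      GroupKind: schema.GroupKind{Group: "kustomize.toolkit.fluxcd.io", Kind: "Kustomization"},
    },
    // Additional resources can be added here.
  }

  return checker.Assess(identifiers...)
}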

In this function, the StatusChecker polls the listed resources at the given interval until they are all ready or the timeout expires, and reports their readiness through the supplied logger.

Step 5: Visualizing Metrics in Grafana

You can visualize the collected metrics in Grafana. After deploying kube-prometheus-stack, Grafana runs as a Service inside the cluster, and you can import custom dashboards to display Flux metrics.

Importing Dashboards

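Access Grafana through its Service, for example with kubectl port-forward (with the chart defaults, the Service typically listens on port 80 and forwards to Grafana's container port 3000).
Navigate to the “Dashboards” section and click on “Import”.
Upload or paste the dashboard JSON file that defines the panels you want, and select your Prometheus data source when prompted.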

Step 6: Testing Monitoring Configuration

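To check that the Flux code your monitoring relies on behaves correctly, you can run the project's tests through the Makefile provided in the fluxcd/flux2 repository. These tests exercise the Flux codebase itself; they do not validate the Prometheus or Grafana configuration of a running cluster.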

Running Tests

Use the test target from the Makefile to run the tests, respecting the build constraints defined in the project.

make test

Which tests run depends on the Go build constraints (tags such as unit and e2e) that the Makefile passes to go test.
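
Since the repository tests do not touch a live cluster, a complementary smoke check is to confirm that a controller's /metrics payload actually contains Flux metrics. The sketch below parses Prometheus text-format output and lists metric names with a given prefix. It assumes Flux controller metrics use the gotk_ prefix (for example gotk_reconcile_duration_seconds); metricNames and the sample payload are hypothetical and hand-written for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricNames returns the distinct sample names in a Prometheus
// text-format payload that start with prefix, in order of appearance.
func metricNames(exposition, prefix string) []string {
	seen := map[string]bool{}
	var names []string
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		// The name ends at the first '{' (labels) or space (value).
		end := strings.IndexAny(line, "{ ")
		if end < 0 {
			continue
		}
		name := line[:end]
		if strings.HasPrefix(name, prefix) && !seen[name] {
			seen[name] = true
			names = append(names, name)
		}
	}
	return names
}

func main() {
	// Hand-written sample; a real payload would come from a Flux
	// controller's /metrics endpoint.
	sample := `# TYPE gotk_reconcile_duration_seconds histogram
gotk_reconcile_duration_seconds_count{kind="Kustomization",name="flux-system"} 12
gotk_reconcile_duration_seconds_sum{kind="Kustomization",name="flux-system"} 3.4
go_goroutines 42
`
	fmt.Println(metricNames(sample, "gotk_"))
	// prints "[gotk_reconcile_duration_seconds_count gotk_reconcile_duration_seconds_sum]"
}
```

An empty result from a real controller endpoint suggests the endpoint is wrong or the controller is not exporting metrics, which is worth ruling out before debugging dashboards or alert rules.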

Conclusion

By following these steps, you will have a monitoring setup for Flux v2 in your production environment: Prometheus and Grafana to collect and visualize the state of your Flux-managed resources, alert rules built on Flux metrics, and programmatic readiness checks for the resources you deploy.

For further details on the new Flux monitoring configurations, refer to the Flux Monitoring Documentation.

Note: Make sure your monitoring configuration matches the Flux version you run, as updates may introduce new features or deprecate old ones.