Enhancing Kubernetes Observability with Cilium

Scenario: A developer running workloads on Kubernetes wants better visibility into what is happening on the network, with a particular interest in deep packet inspection and traffic monitoring. To get there, they will use Cilium’s observability features.

Background: Kubernetes environments can be complex, with many applications and services communicating with each other. Managing and monitoring them effectively requires a clear picture of both network traffic and application behavior. Traditional observability tools may cover only part of this, and focusing solely on metrics narrows the view further. Effective observability instead takes a holistic view of the whole system, including the underlying infrastructure.

Cilium is an open-source project that uses eBPF (extended Berkeley Packet Filter) to give platform teams secure cloud native connectivity. Kubernetes offerings from major cloud providers, including Google Cloud, AWS, and Microsoft Azure, have adopted Cilium as a networking and security layer, in some cases as the default. Cilium’s eBPF-powered approach enables an efficient connectivity and security fabric with observability built in as a first-class citizen.

Cilium generates in-kernel eBPF programs based on the identity of the workload. The flow data these programs collect is surfaced through Hubble, Cilium’s observability layer, and can be exported to the Grafana Labs LGTM open source observability stack (Loki for logs, Grafana for visualization, Tempo for traces, Mimir for metrics). This integration provides rich connectivity observability data, which is essential for monitoring and troubleshooting applications on Kubernetes.
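
To make the identity model concrete: on a cluster where Cilium is already running, the identities it allocates and the per-pod endpoints it tracks are visible as Kubernetes custom resources. The following is a minimal sketch, assuming kubectl is configured for such a cluster. The helper name show_identities is hypothetical; ciliumidentities and ciliumendpoints are the resource names registered by Cilium’s CRDs.

```shell
# Sketch only: assumes kubectl points at a cluster that already runs Cilium.
# show_identities is a hypothetical helper name, not part of any tool.
show_identities() {
  # Security identities Cilium has allocated; pods with the same
  # security-relevant labels share one identity.
  kubectl get ciliumidentities || return 1
  # Per-pod endpoints, each listed with the identity assigned to it.
  kubectl get ciliumendpoints --all-namespaces
}
```

Calling show_identities prints both listings, which makes it easier to relate a pod to the identity that Cilium’s policy and flow data refer to.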

Steps:

  1. Install and configure Cilium in your Kubernetes environment. Refer to the official Cilium documentation for installation instructions: https://docs.cilium.io/en/stable/gettingstarted/

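  2. Configure Cilium to generate observability data. With a Helm-based install, this is done in a Helm values file that turns on Hubble, Cilium’s observability component, and is then applied with helm upgrade. For example, the following values enable Hubble, Hubble Relay, the Hubble UI, and a set of flow metrics that includes HTTP (L7). Note that HTTP metrics are only populated for traffic that Cilium parses at L7, which typically requires an L7-aware network policy for the workloads in question:

# Helm values for the cilium/cilium chart
hubble:
  enabled: true          # collect flow data on every agent
  relay:
    enabled: true        # cluster-wide flow API
  ui:
    enabled: true        # service map UI
  metrics:
    enabled:             # flow metrics exposed in Prometheus format
      - dns
      - drop
      - tcp
      - flow
      - icmp
      - http
prometheus:
  enabled: true          # Cilium agent metrics
operator:
  prometheus:
    enabled: true        # Cilium operator metrics

  3. Verify that Cilium is generating observability data. Check the logs of a Cilium agent pod to confirm that the eBPF programs are being loaded and that observability data is being exported: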
kubectl logs <cilium-pod-name> -n <cilium-namespace>

  4. Configure the Grafana Labs LGTM stack to consume the Cilium observability data. Install the LGTM components you need and configure them to collect from Cilium, for example by scraping the Prometheus-format metrics endpoints that Cilium and Hubble expose. Refer to the official Grafana Labs documentation for installation and configuration instructions: https://grafana.com/docs/loki/latest/getting-started/

  5. Create Grafana dashboards to visualize the Cilium observability data. Pre-built Cilium and Hubble dashboards are available as a starting point and can be customized to suit your use case. For example, you can build a dashboard that shows service-to-service traffic, or one that tracks application latency and error rates.
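
The steps above can be sketched as a few shell helpers. This is a minimal sketch under stated assumptions, not a definitive procedure: it assumes Cilium was installed with Helm as release cilium from the cilium/cilium chart in the kube-system namespace, that the agent runs as a DaemonSet named cilium, and that the agent image ships the hubble CLI. The function names and the CILIUM_NAMESPACE variable are hypothetical.

```shell
# Assumptions (not from the article): Helm release "cilium", chart
# "cilium/cilium", namespace "kube-system", agent DaemonSet "cilium".
NS="${CILIUM_NAMESPACE:-kube-system}"

# Apply a Helm values file (path given as $1) that turns on Hubble,
# keeping all other values of the existing release.
apply_observability_values() {
  helm upgrade cilium cilium/cilium --namespace "$NS" --reuse-values -f "$1"
}

# Print the most recent flows seen by one agent (default: last 20).
show_recent_flows() {
  kubectl -n "$NS" exec ds/cilium -- hubble observe --last "${1:-20}"
}

# Reduce Prometheus text-format metrics read from stdin to the distinct
# Hubble metric names, as a quick inventory before building dashboards.
list_hubble_metrics() {
  grep '^hubble_' | sed 's/[{ ].*//' | sort -u
}
```

A typical sequence would be apply_observability_values with your values file, then show_recent_flows 50 to confirm that flows appear. list_hubble_metrics expects the output of a metrics scrape on stdin, for instance from curl against an agent’s Hubble metrics port (9965 in default installs, though this is worth confirming for your version).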

Tests:

To verify the solution, perform the following tests:

  1. Check that Cilium is generating observability data by examining the Cilium logs.
  2. Verify that the Grafana Labs LGTM stack is correctly consuming the Cilium observability data by checking the LGTM logs and the Grafana dashboards.
  3. Test the functionality of the Grafana dashboards by simulating traffic between applications and verifying that the dashboards accurately reflect the network behavior.
  4. Perform a stress test on the Kubernetes environment by deploying a large number of applications and verifying that the observability data accurately reflects the network traffic and application behavior.
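
The traffic simulation in test 3 can be approximated with a throwaway client pod. The sketch below is one possible approach rather than a prescribed method: it assumes the public curlimages/curl image is pullable from your cluster, uses a hypothetical pod name traffic-gen, and takes the target URL from you. The helper names are also made up for illustration.

```shell
# Sketch only: pod name, image, and helper names are illustrative assumptions.

# Send COUNT HTTP requests (default 20) to URL from a temporary pod and
# print one status code per request. kubectl may add its own status lines.
generate_traffic() {
  url=$1
  count=${2:-20}
  kubectl run traffic-gen --rm -i --restart=Never \
    --image=curlimages/curl --command -- \
    sh -c "i=0; while [ \$i -lt $count ]; do curl -s -o /dev/null -w '%{http_code}\n' '$url'; i=\$((i+1)); done"
}

# Summarize status codes read from stdin as "<code> <count>" lines,
# ignoring anything that is not a bare three-digit code.
summarize_codes() {
  grep -E '^[0-9]{3}$' | sort | uniq -c | awk '{print $2, $1}'
}
```

For example, piping generate_traffic for an in-cluster service URL with a count of 50 into summarize_codes gives a per-status-code tally that can be compared with what the dashboards report for the same time window.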

Conclusion: By utilizing Cilium’s observability features, developers can gain deep insights into the connectivity, security, and performance of applications running on Kubernetes. This holistic approach to observability enables effective monitoring and troubleshooting, ensuring that issues are identified and resolved efficiently.