This documentation covers production scaling of applications using the Kubernetes Client for C#. It provides a step-by-step guide detailing the implementation of Horizontal Pod Autoscalers (HPAs) to manage scalability in production environments.

Overview

Production scaling allows applications to handle fluctuations in user demand by dynamically adjusting the number of pod replicas. The Kubernetes Horizontal Pod Autoscaler automatically increases or decreases the number of pods based on observed CPU utilization or other supported metrics, such as memory usage or custom application metrics.

Before implementing scaling, ensure your Kubernetes cluster has the necessary metrics server and that your application is built to expose the required metrics.

Prerequisites

Ensure the following prerequisites are met:

  1. Access to a Kubernetes cluster.
  2. Appropriate permissions to create or update resources.
  3. Metrics server installed and running in the cluster (a verification sketch follows this list).
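
Cluster access and the metrics server (prerequisites 1 and 3) can be verified from C# before any HPA work. The sketch below assumes the metrics server runs as the metrics-server Deployment in the kube-system namespace, which is the usual default; adjust the names if your installation differs. Depending on the client library version, the read call may instead be grouped, e.g. client.AppsV1.ReadNamespacedDeploymentAsync.

using k8s;
using System;
using System.Threading.Tasks;

public class PrerequisiteCheck {
    public static async Task Main(string[] args) {
        var config = KubernetesClientConfiguration.BuildConfigFromConfigFile();
        IKubernetes client = new Kubernetes(config);

        // Assumes the metrics server is deployed as "metrics-server" in "kube-system".
        var metricsServer = await client.ReadNamespacedDeploymentAsync("metrics-server", "kube-system");
        Console.WriteLine($"metrics-server ready replicas: {metricsServer.Status?.ReadyReplicas ?? 0}");
    }
}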

Step 1: Define Horizontal Pod Autoscaler

The first step in production scaling is to define the Horizontal Pod Autoscaler. Below is a C# code example that demonstrates how to create an HPA using the Kubernetes C# client.

Code Example

using k8s;
using k8s.Models;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class HpaExample {
    public static async Task Main(string[] args) {
        var config = KubernetesClientConfiguration.BuildConfigFromConfigFile();
        IKubernetes client = new Kubernetes(config);

        var hpa = new V2HorizontalPodAutoscaler {
            ApiVersion = "autoscaling/v2",
            Kind = "HorizontalPodAutoscaler",
            Metadata = new V1ObjectMeta {
                Name = "example-hpa",
                NamespaceProperty = "default"
            },
            Spec = new V2HorizontalPodAutoscalerSpec {
                ScaleTargetRef = new V2CrossVersionObjectReference {
                    ApiVersion = "apps/v1",
                    Kind = "Deployment",
                    Name = "example-deployment"
                },
                MinReplicas = 1,
                MaxReplicas = 10,
                Metrics = new List<V2MetricSpec> {
                    new V2MetricSpec {
                        Type = "Resource",
                        Resource = new V2ResourceMetricSource {
                            Name = "cpu",
                            Target = new V2MetricTarget {
                                Type = "Utilization",
                                AverageUtilization = 50
                            }
                        }
                    }
                }
            }
        };

        // Create the HPA. Depending on the client library version, this operation may instead be
        // grouped by API version, e.g. client.AutoscalingV2.CreateNamespacedHorizontalPodAutoscalerAsync.
        var createdHpa = await client.CreateNamespacedHorizontalPodAutoscalerAsync(hpa, "default");
        Console.WriteLine($"HPA {createdHpa.Metadata.Name} created.");
    }
}

Explanation

  • ScaleTargetRef: Points to the deployment to be scaled.
  • MinReplicas: The minimum number of pods that the HPA can scale down to.
  • MaxReplicas: The maximum number of pods that the HPA can scale up to.
  • Metrics: Defines what metrics the autoscaler should monitor and the thresholds for triggering scaling; a sketch adding a second metric follows this list.
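
The Metrics list accepts more than one entry. As a sketch built on the hpa object above, the fragment below adds a memory utilization metric next to the CPU one; the 70 percent target is an illustrative value, not a recommendation.

// Add a second resource metric: scale on average memory utilization as well (illustrative 70% target).
hpa.Spec.Metrics.Add(new V2MetricSpec {
    Type = "Resource",
    Resource = new V2ResourceMetricSource {
        Name = "memory",
        Target = new V2MetricTarget {
            Type = "Utilization",
            AverageUtilization = 70
        }
    }
});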

Step 2: Monitor Metrics

Metrics allow the autoscaler to make informed scaling decisions. Kubernetes typically uses CPU and memory metrics, but custom metrics can also be used. Ensure your application is instrumented to expose the relevant metrics.

Exposing Metrics

The built-in System.Diagnostics APIs in .NET can be used to instrument application code. The example below wraps an outgoing HttpClient call in an ActivitySource activity; a sketch that records numeric metrics follows further below:

using System.Diagnostics;
using System.Net.Http;

var httpClient = new HttpClient();
var source = new ActivitySource("MyCompany.MyProduct.MyComponent");

// StartActivity returns null when no listener is registered, so guard the Stop call.
var activity = source.StartActivity("HttpClient Request");

// Example HttpClient call wrapped by the activity
var response = await httpClient.GetAsync("https://your-api-endpoint");
activity?.Stop();

Refer to this documentation for more details on built-in metrics: Built-in .NET Metrics.
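
For numeric application metrics, .NET also offers the System.Diagnostics.Metrics APIs. The sketch below records a simple request counter; the meter and instrument names are illustrative placeholders. Keep in mind that surfacing such custom metrics to the HPA additionally requires a metrics pipeline and a custom metrics adapter in the cluster, which is outside the scope of this guide.

using System.Diagnostics.Metrics;

// Meter and instrument names below are illustrative placeholders.
var meter = new Meter("MyCompany.MyProduct");
var requestCounter = meter.CreateCounter<long>("myproduct.requests.handled",
    unit: "requests", description: "Number of requests handled");

// Call this wherever a request is processed.
requestCounter.Add(1);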

Step 3: Test the HPA

After deploying your HPA, it is crucial to test its functionality. This can often be achieved by simulating high traffic or CPU usage. Monitor the cluster to ensure the HPA scales pods up and down as expected.
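
A quick way to exercise the HPA from C# is to send a burst of concurrent requests to the service in front of the deployment. In the sketch below, the service URL and the request count are placeholders; for anything beyond a smoke test, prefer a dedicated load-testing tool.

using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Placeholder URL for the service exposed by example-deployment; replace with your own endpoint.
var endpoint = "http://example-service.default.svc.cluster.local/";
var httpClient = new HttpClient();

// Fire 500 concurrent GET requests (arbitrary number) to drive up CPU usage.
var requests = Enumerable.Range(0, 500).Select(_ => httpClient.GetAsync(endpoint));
await Task.WhenAll(requests);

Console.WriteLine("Load burst completed.");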

Commands

Use the following command to view the status of the HPA:

kubectl get hpa

This command will display the current status, including the number of replicas and metrics being used.
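
The same status information can be read through the client. The fragment below is a sketch that reuses the client from Step 1 to read the HPA and print its replica counts; depending on the client library version, the call may instead be grouped, e.g. client.AutoscalingV2.ReadNamespacedHorizontalPodAutoscalerAsync.

// Read back the HPA created in Step 1 and inspect its status.
var current = await client.ReadNamespacedHorizontalPodAutoscalerAsync("example-hpa", "default");
Console.WriteLine($"Current replicas: {current.Status?.CurrentReplicas}, " +
                  $"desired replicas: {current.Status?.DesiredReplicas}");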

Step 4: Adjust HPA Configuration

Based on insights from monitoring your HPA, you may find that adjustments to your scale policies and metrics are necessary. This can be done by updating the HPA resource:

// Update the existing HPA
var updatedHpa = await client.ReplaceNamespacedHorizontalPodAutoscalerAsync(hpa, "example-hpa", "default");
Console.WriteLine($"HPA {updatedHpa.Metadata.Name} updated.");

Best Practices

  • Monitor Your Metrics: Continuously observe the metrics being monitored to ensure the scaling thresholds suit workload patterns.
  • Set Reasonable Constraints: Define minReplicas and maxReplicas conservatively to avoid excessive resource consumption.
  • Test Before Production: Always perform load testing in a staging environment to fine-tune the HPA configuration.

By following these steps and using the provided code snippets, you can achieve effective production scaling with the Kubernetes Client for C#. Monitor your applications and infrastructure regularly to keep performance at optimal levels.