Kubernetes Security Best Practices for Multi-Tenant Clusters

This repository shows how to apply Kubernetes security best practices to multi-tenant clusters managed with Flux. The goal is to give each tenant a secure, isolated environment and to protect sensitive data and resources.

Tenant Isolation

  • Flux Multi-Tenancy Lockdown: Use Flux’s built-in multi-tenancy lockdown features to isolate tenants at the control-plane level, without depending on an external admission controller such as Kyverno.

  • Enforce Controller Restrictions:

    • Cross-Namespace References: Prevent a tenant’s Flux objects from referencing resources or receiving events in other tenants’ namespaces.
    • Remote Bases: Require Kustomize overlays to reference only local files within the tenant’s Git repository, so tenants cannot pull in Kustomize remote bases from other sources.
  • Default Service Account:

    • Set a default service account with minimal privileges for kustomize-controller and helm-controller.
    • Each tenant should create a service account with the necessary permissions for its operations, following the principle of least privilege.
  • Service Account Impersonation: When applying changes, Flux impersonates the service account named in spec.serviceAccountName and falls back to the default service account in the tenant namespace, so a tenant’s changes are limited to what that account’s RBAC permissions allow.

  • Service Account Validation: As an optional extra safeguard, use a validating admission webhook (e.g., Kyverno or OPA Gatekeeper) to require the spec.serviceAccountName field in Kustomization and HelmRelease resources, rejecting deployments that lack a dedicated service account.
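
The controller restrictions above can be sketched as a Kustomize patch set applied to the Flux components. This is a minimal sketch, assuming the standard bootstrap files gotk-components.yaml and gotk-sync.yaml and a file location under clusters/staging/flux-system/; the flag values shown are one reasonable choice, not the only one.

```yaml
# clusters/staging/flux-system/kustomization.yaml (location assumed from the repository layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  # Deny cross-namespace references in every controller that supports the flag.
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --no-cross-namespace-refs=true
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller|notification-controller|image-reflector-controller|image-automation-controller)"
  # Deny Kustomize remote bases.
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --no-remote-bases=true
    target:
      kind: Deployment
      name: "kustomize-controller"
  # Impersonate the namespace's "default" service account when none is specified.
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --default-service-account=default
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller)"
  # Keep the platform's own flux-system Kustomization running with admin rights.
  - patch: |
      - op: add
        path: /spec/serviceAccountName
        value: kustomize-controller
    target:
      kind: Kustomization
      name: "flux-system"
```

With these flags in place, a tenant Kustomization or HelmRelease that omits spec.serviceAccountName runs as the unprivileged default account and cannot deploy anything until the tenant names a service account that has been granted permissions.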

Secure Git Repository Integration

  • SOPS Encryption: Use the SOPS CLI (originally developed at Mozilla) to encrypt Kubernetes secrets with OpenPGP, age, or a cloud KMS so they can be stored safely in Git repositories.

  • Secret Generation:

    • GPG Key Management:

      • Generate a GPG key without a passphrase, since Flux must be able to use the private key to decrypt secrets non-interactively.
      • Create a Kubernetes secret in the flux-system namespace to store the GPG private key.
      • Store the GPG private key securely for disaster recovery.
      • Share the GPG public key with the platform team to enable encryption.
    • SSH Keys: Generate a Kubernetes secret containing the SSH private key and the Git server’s known_hosts entries for secure access to Git repositories.

    • Basic Auth Credentials: Generate a Kubernetes secret with basic auth credentials (a username and a password or access token) for accessing Git repositories, ensuring the credentials grant read-only access.
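
As an illustration of the encryption setup described above, the sketch below pairs a SOPS creation rule with a basic auth secret as it might look before encryption. The secret name dev-team-auth, the apps namespace, the username, and both placeholders in angle brackets are assumptions made for the example.

```yaml
# File: .sops.yaml (tells the sops CLI which fields to encrypt and with which key)
creation_rules:
  - path_regex: '.*\.yaml$'
    # Encrypt only the secret payload so the rest of the manifest stays readable.
    encrypted_regex: '^(data|stringData)$'
    pgp: "<GPG-KEY-FINGERPRINT>"   # placeholder: fingerprint of the public key
---
# File: dev-team-auth.yaml (hypothetical), shown BEFORE encryption.
# Never commit it in this form.
apiVersion: v1
kind: Secret
metadata:
  name: dev-team-auth
  namespace: apps
stringData:
  username: git
  password: "<READ-ONLY-TOKEN>"    # placeholder: token limited to read access
```

Running `sops --encrypt --in-place dev-team-auth.yaml` in a directory covered by the creation rule would then replace the stringData values with ciphertext, leaving a file that is safe to commit.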

Image Provenance and Security

  • Image Verification:
    • Kyverno Policies: Implement Kyverno policies (e.g., verify-flux-images) to enforce the use of signed Flux images, ensuring the authenticity and integrity of container images.
    • Attestors: Configure the policies’ attestors to check image signatures against Sigstore’s Rekor transparency log (https://rekor.sigstore.dev).

Cluster Bootstrap and Tenant Workload Reconciliations

  • Infrastructure Setup: Define the order of reconciliation for infrastructure components and tenant workloads using Kustomizations within the clusters directory.

  • Kyverno Deployment: Reconcile the Kyverno Custom Resource Definitions and controllers using Kustomizations.

  • Tenant Configuration: Configure tenant namespaces, service accounts, and role bindings.

    • Decrypt Git Credentials: Decrypt tenant Git credentials using the GPG private key.
    • Git Credentials Secret: Create a Kubernetes secret in the tenant namespace to store the Git credentials.
    • Repository Cloning: Clone the tenant repository using the provided credentials.
    • Workload Reconciliation: Apply the tenant’s Kubernetes manifests (from the specified directory within the repository) using the tenant’s service account.
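
The reconciliation order described above could be expressed with dependsOn between Flux Kustomizations. The following is a sketch only: the object names, intervals, and the sops-gpg secret name are assumptions, while the paths follow the repository layout.

```yaml
# Hypothetical contents of files under clusters/staging/, combined here for brevity.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kyverno
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  # Explicit account, needed when a restricted default service account is enforced.
  serviceAccountName: kustomize-controller
  path: ./infrastructure/kyverno
  prune: true
  wait: true            # report ready only once the Kyverno workloads are healthy
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kyverno-policies
  namespace: flux-system
spec:
  dependsOn:
    - name: kyverno     # policies need the Kyverno CRDs to exist first
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  serviceAccountName: kustomize-controller
  path: ./infrastructure/kyverno-policies
  prune: true
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: tenants
  namespace: flux-system
spec:
  dependsOn:
    - name: kyverno-policies   # tenants are onboarded only after policies are active
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  serviceAccountName: kustomize-controller
  path: ./tenants/staging
  prune: true
  decryption:
    provider: sops
    secretRef:
      name: sops-gpg    # assumed name of the secret holding the GPG private key
```

Ordering the tenants after the policies means that no tenant workload is reconciled before the admission rules that constrain it are in place.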

Continuous Integration and Validation

  • GitHub CI Workflows:
    • Test Workflow: Validate Kubernetes manifests and Kustomize overlays with kubeconform.
    • E2E Workflow: Test the multi-tenant setup end to end in a kind (Kubernetes in Docker) cluster.
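
A test workflow along these lines might look like the sketch below. The workflow file name, job name, and branch are hypothetical, and it assumes that kubectl is available on the runner and that the tenant overlays live in tenants/staging and tenants/production.

```yaml
# .github/workflows/test.yaml (hypothetical file name)
name: test
on:
  pull_request:
  push:
    branches: [main]
jobs:
  manifests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: stable
      - name: Install kubeconform
        run: go install github.com/yannh/kubeconform/cmd/kubeconform@latest
      - name: Validate tenant overlays
        shell: bash
        run: |
          set -euo pipefail
          for dir in tenants/staging tenants/production; do
            echo "Validating ${dir}"
            # Build each overlay and validate the rendered manifests from stdin.
            # Custom resources without a published schema are skipped, not failed.
            kubectl kustomize "${dir}" |
              "$(go env GOPATH)/bin/kubeconform" -strict -ignore-missing-schemas -summary
          done
```

The pipefail option matters here: without it, a failed overlay build would feed empty input to kubeconform and the step could pass silently.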

Example: Staging Tenant Configuration

    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    namespace: apps
    resources:
      - ../base/dev-team
    patches:
      - path: dev-team-patch.yaml

This overlay, together with the base in tenants/base/dev-team and the patch in dev-team-patch.yaml, makes the Flux instance running on the staging cluster clone the dev-team repository and reconcile the ./staging directory from the tenant’s repo using the dev-team service account. Because the dev-team service account is restricted to the apps namespace, the repository can only deploy Kubernetes objects scoped to that namespace.
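
To make the pieces concrete, here is a sketch of what the base and the patch might contain. All of it is illustrative: the repository URL, the dev-team-auth secret name, the dev-team-reconciler binding name, the intervals, the branch, and the choice of the built-in admin role are assumptions, not the repository's actual contents.

```yaml
# tenants/base/dev-team/ (hypothetical contents): service account and RBAC
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dev-team
  namespace: apps
---
# A RoleBinding (not a ClusterRoleBinding) confines the granted role to the apps namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-reconciler
  namespace: apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin            # built-in role; substitute a narrower Role if less is needed
subjects:
  - kind: ServiceAccount
    name: dev-team
    namespace: apps
---
# tenants/base/dev-team/ (hypothetical contents): source and reconciliation
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: dev-team
  namespace: apps
spec:
  interval: 1m
  url: https://example.com/org/dev-team.git   # placeholder URL
  ref:
    branch: main
  secretRef:
    name: dev-team-auth  # decrypted Git credentials
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: dev-team
  namespace: apps
spec:
  serviceAccountName: dev-team   # Flux impersonates this account when applying
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: dev-team
  prune: true
---
# tenants/staging/dev-team-patch.yaml (hypothetical contents)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: dev-team
  namespace: apps
spec:
  path: ./staging        # the only cluster-specific setting in this sketch
```

Keeping the cluster-specific path in a patch lets the production overlay reuse the same base and change only the directory it reconciles.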

Example: Kyverno Policy for Image Verification

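    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: verify-flux-images
    spec:
      # Audit only reports violations; switch to Enforce to block unsigned images.
      validationFailureAction: Audit
      background: false
      webhookTimeoutSeconds: 30
      failurePolicy: Fail
      rules:
        - name: verify-cosign-signature
          match:
            any:
              - resources:
                  kinds:
                    - Pod
          verifyImages:
            - imageReferences:
                - "ghcr.io/fluxcd/source-controller:*"
                - "ghcr.io/fluxcd/kustomize-controller:*"
                - "ghcr.io/fluxcd/helm-controller:*"
                - "ghcr.io/fluxcd/notification-controller:*"
                - "ghcr.io/fluxcd/image-reflector-controller:*"
                - "ghcr.io/fluxcd/image-automation-controller:*"
                - "docker.io/fluxcd/source-controller:*"
                - "docker.io/fluxcd/kustomize-controller:*"
                - "docker.io/fluxcd/helm-controller:*"
              # Assumed keyless attestor: the subject and issuer are illustrative values
              # for images signed from GitHub Actions; adjust them to the real signer.
              attestors:
                - entries:
                    - keyless:
                        subject: "https://github.com/fluxcd/*"
                        issuer: "https://token.actions.githubusercontent.com"
                        rekor:
                          url: https://rekor.sigstore.dev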

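This policy checks that Pods running the listed Flux controller images from ghcr.io and docker.io use images with a valid Cosign signature, so that only images published by the Flux team are trusted. With validationFailureAction set to Audit, violations are reported but not blocked; set it to Enforce to reject unsigned images.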

Top-Level Directory Explanations

clusters/ - This directory contains the per-cluster configuration, including the Flux Kustomizations that define the reconciliation order of infrastructure components and tenant workloads.

clusters/production/ - This directory contains configuration and scripts for managing the production Kubernetes cluster.

clusters/production/flux-system/ - This directory contains configuration and scripts for the FluxCD system in the production cluster.

clusters/staging/ - This directory contains configuration and scripts for managing the staging Kubernetes cluster.

clusters/staging/flux-system/ - This directory contains configuration and scripts for the FluxCD system in the staging cluster.

infrastructure/ - This directory contains infrastructure-related configuration files and scripts.

infrastructure/kyverno-policies/ - This directory contains the actual policy files for Kyverno.

infrastructure/kyverno/ - This directory contains configuration files and scripts for Kyverno, an open-source Kubernetes policy engine.

scripts/ - This directory contains scripts used for various tasks, such as automation and deployment.

tenants/ - This directory contains the configuration for tenants, each of which is isolated in its own namespace within the cluster.

tenants/base/ - This directory contains the base tenant definitions that the per-cluster overlays build on.

tenants/base/dev-team/ - This directory contains the base definition of the dev-team tenant.

tenants/production/ - This directory contains the tenant overlays for the production cluster.

tenants/staging/ - This directory contains the tenant overlays for the staging cluster.