failed to garbage collect required amount of images. Wanted to free 473842483 bytes, but freed 0 bytes

Issue Summary: Kubernetes Garbage Collection Failing to Free Disk Space

Issue Details

Title: Failed to garbage collect required amount of images
Reported By: samuela
Date: December 8, 2018
Kubernetes Version: 1.10.7 (Client and Server)
Environment: GKE; nodes using Container-Optimized OS; minimum disk size set to 10 GB

Description

The kubelet on GKE nodes repeatedly fails to free disk space through image garbage collection (GC). The kubelet logs show it trying to free hundreds of megabytes (473842483 bytes in one instance) and reclaiming nothing. The recurring error message has the form:

“failed to garbage collect required amount of images. Wanted to free X bytes, but freed 0 bytes.”

As a result, several pods have been evicted for disk pressure, as shown by the eviction warnings and kubelet events.
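When checking kubelet logs for this failure, it can help to pull the byte counts out of each matching line. The following is a minimal sketch, assuming the log lines contain the message exactly as quoted above; the names `GC_FAILURE`, `GCFailure`, and `find_gc_failures` are hypothetical and not part of Kubernetes.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Pattern built from the error message quoted in this issue.
# Assumption: the kubelet emits the message in this exact wording.
GC_FAILURE = re.compile(
    r"failed to garbage collect required amount of images\. "
    r"Wanted to free (?P<wanted>\d+) bytes, but freed (?P<freed>\d+) bytes"
)


class GCFailure(NamedTuple):
    """Byte counts reported by one failed image GC attempt."""
    wanted_bytes: int
    freed_bytes: int


def find_gc_failures(lines: Iterable[str]) -> Iterator[GCFailure]:
    """Yield one GCFailure for every log line that contains the message."""
    for line in lines:
        match = GC_FAILURE.search(line)
        if match:
            yield GCFailure(int(match["wanted"]), int(match["freed"]))


if __name__ == "__main__":
    sample = [
        "failed to garbage collect required amount of images. "
        "Wanted to free 473842483 bytes, but freed 0 bytes",
        "an unrelated log line",
    ]
    for failure in find_gc_failures(sample):
        print(failure.wanted_bytes, failure.freed_bytes)
```

Feeding the function the output of a node's kubelet log would give a quick view of how often GC fails and how large the shortfall is each time.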

Expected Behavior

Image GC should successfully reclaim disk space, or the system should prevent scheduling new pods on nodes with insufficient disk space.

Reproduction Steps

Deploy and delete multiple pods on a node to observe disk pressure.
Monitor the kubelet’s garbage collection attempts and related events.

Additional Context

The issue seems reproducible on other environments, with users on GKE, AWS, and AKS reporting similar garbage collection failures and disk pressure incidents.
Cordoning and draining nodes temporarily relieves pressure, but the root cause remains unresolved.
The problem may relate to misconfigured GC thresholds, bugs in the kubelet’s image management logic, or node disks that are simply too small (the nodes here use a 10 GB minimum disk size).
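To see how the GC thresholds interact with a small disk, the arithmetic can be sketched as follows. This is a simplified model, not the kubelet's code: it assumes image GC triggers once disk usage reaches a high threshold and then tries to bring usage down to a low threshold, and it assumes the values 85 and 80 for the kubelet settings `imageGCHighThresholdPercent` and `imageGCLowThresholdPercent`. The function name `image_gc_amount_to_free` is hypothetical.

```python
def image_gc_amount_to_free(
    capacity_bytes: int,
    available_bytes: int,
    high_threshold_percent: int = 85,  # assumed value, see lead-in
    low_threshold_percent: int = 80,   # assumed value, see lead-in
) -> int:
    """Return how many bytes image GC would try to free in this model.

    Returns 0 when usage is below the high threshold, meaning GC does
    not run at all.
    """
    # Integer percentage of the disk currently in use.
    usage_percent = 100 - (available_bytes * 100) // capacity_bytes
    if usage_percent < high_threshold_percent:
        return 0
    # Free space that should exist once usage is back at the low threshold.
    target_available = capacity_bytes * (100 - low_threshold_percent) // 100
    return target_available - available_bytes


if __name__ == "__main__":
    GIB = 1024 ** 3
    # A 10 GiB disk with 1 GiB free is 90% used, above the high threshold.
    print(image_gc_amount_to_free(10 * GIB, 1 * GIB))
```

Under this model, a 10 GB disk leaves little headroom: if most of the used space belongs to images of running containers or to non-image data, GC has nothing it can delete, which is consistent with the "freed 0 bytes" message.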

Recent Activity

Users are encouraged to investigate kubelet logs for insights.
Contributors have been assigned and discussions continue regarding root causes and possible solutions.

Labels

kind/bug
sig/node
good first issue
help wanted
triage/accepted

This issue remains active, with ongoing efforts to investigate and resolve the image GC failures reported on GKE and other platforms. Contributions toward a fix are welcome, given the impact on Kubernetes reliability and performance.