If you run queue processors, batch workers, or event-driven workloads that sit idle for hours between bursts, you're paying for compute you don't need. Kubernetes HPA can scale these deployments to zero replicas — no KEDA, no Knative, no external controllers required. You need one feature gate, an external metrics source, and about twenty minutes of setup. When the queue is empty your pods disappear, and if you pair this with cluster autoscaler, the nodes disappear too. Real scale-to-zero, using nothing but native Kubernetes primitives.

Quick Reference

Requirement

Detail

Feature gate