Kafka compression waste is usually a batch depth problem, not a codec problem. Better batching improves producer compression, which reduces consumer CPU and cross-AZ cost downstream.

In one production deployment, changing only batch sizing and linger settings cut the consumer fleet in half and moved compression from under 10% to over 50%. The codec stayed the same throughout. What changed was how much data each batch held by the time the codec saw it.

Why batch depth controls what the codec sees

Kafka producers compress batches, not individual messages. The codec sees only what the producer has accumulated for a given partition by the time it sends the batch. linger.ms sets how long the producer waits for more records before sending. batch.size caps, in bytes, how large that accumulation can grow.
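To make the effect concrete, here is a minimal sketch that compresses the same synthetic records at different batch depths. It is not Kafka code: zlib stands in for the producer's codec, and the record schema and the make_records and space_saved helpers are hypothetical, chosen only for illustration. Real numbers depend on your payloads and codec.

```python
import json
import random
import zlib


def make_records(n, seed=0):
    """Build n synthetic JSON event records (hypothetical schema)."""
    rng = random.Random(seed)
    return [
        json.dumps({
            "user_id": rng.randrange(1_000_000),
            "event": rng.choice(["click", "view", "purchase"]),
            "region": rng.choice(["us-east-1", "eu-west-1"]),
            "ts": 1_700_000_000 + i,
        }).encode("utf-8")
        for i in range(n)
    ]


def space_saved(records, depth):
    """Fraction of bytes saved when records are compressed `depth` at a time."""
    raw = sum(len(r) for r in records)
    compressed = 0
    for start in range(0, len(records), depth):
        # Each slice plays the role of one producer batch.
        batch = b"".join(records[start:start + depth])
        compressed += len(zlib.compress(batch))
    return 1 - compressed / raw


records = make_records(1000)
for depth in (1, 10, 100):
    print(f"depth={depth:>3}  space saved={space_saved(records, depth):.1%}")
```

With one record per batch, the codec has almost no repetition to exploit and pays its framing overhead on every record. As depth grows, the repeated field names and values across records become visible to the codec, and the savings climb without any change to the codec itself.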

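Both settings are conservative by default: batch.size is 16384 bytes (16 KB), and linger.ms is 0 in Kafka releases before 4.0 and 5 ms from 4.0 onward. When per-producer throughput is low - because traffic is light, or because it's spread across too many producer instances - the linger window closes before much data has arrived. Each batch then holds only a handful of records, and the codec has little redundancy to work with.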
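A rough back-of-the-envelope estimate shows how quickly depth collapses. The sketch below assumes traffic is spread evenly across producers and partitions, that records arrive at a steady average rate, and that the batch.size cap is never reached. The expected_batch_depth function and the traffic numbers are hypothetical, not measurements from the deployment above.

```python
def expected_batch_depth(records_per_sec, producers, partitions, linger_ms):
    """Estimate average records per batch under even-spread assumptions.

    A batch holds the record that opens it plus whatever arrives for the
    same partition, on the same producer, during the linger window.
    """
    per_partition_rate = records_per_sec / producers / partitions
    return 1 + per_partition_rate * (linger_ms / 1000)


# Hypothetical workload: 20,000 records/s over 50 partitions.
wide = expected_batch_depth(20_000, producers=100, partitions=50, linger_ms=5)
narrow = expected_batch_depth(20_000, producers=10, partitions=50, linger_ms=100)
print(f"100 producers, linger 5 ms:   {wide:.2f} records per batch")
print(f"10 producers, linger 100 ms:  {narrow:.2f} records per batch")
```

In the first case each producer sends about 4 records per second to each partition, so a 5 ms window almost never catches a second record. Fewer producers and a longer linger both raise the per-batch count, which is the lever the rest of this article is about.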