Context pruning removes low-value tokens from the prompt before inference to cut LLM costs and improve output quality. Learn the core techniques and where semantic caching fits in.