Getting more from each token: How Copilot improves context handling and model routing

As Copilot takes on more agentic work, from planning and editing to debugging, reviewing, and calling tools across longer sessions, efficiency means more than using fewer tokens. It means being smarter about how you use them.

Increasing efficiency starts with reducing what Copilot has to repeat from turn to turn, including context, tool definitions, and cached state. It continues with choosing the right model for the job. A quick explanation, a focused edit, and a complex multi-file change should not all be treated the same way.
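To make the routing idea concrete, here is a minimal sketch of matching a task to a model tier. It does not show how Auto actually decides. The `Task` type, the task kinds, the tier names, and the one-file threshold are all hypothetical, chosen only to mirror the three examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    kind: str           # hypothetical labels: "explain", "edit", "multi_file_change"
    files_touched: int  # number of files the task is expected to modify


def pick_model_tier(task: Task) -> str:
    """Return a hypothetical model tier name for the given task."""
    # A quick explanation does not need the most capable model.
    if task.kind == "explain":
        return "small"
    # A focused edit in a single file gets a mid-sized model.
    if task.kind == "edit" and task.files_touched <= 1:
        return "medium"
    # Everything else, including multi-file changes, goes to the largest tier.
    return "large"
```

A real router would likely weigh more signals than these two fields. The sketch only shows that the three kinds of request can take different paths.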

We are working on both: improving the Copilot harness so more of each session goes toward the task itself, and expanding Auto so Copilot can pick the model that fits the work without asking developers to make that choice every time. This post focuses on harness improvements in GitHub Copilot for VS Code and on ongoing work to expand Auto across Copilot surfaces.

Increased prompt caching and deferred tools

In longer GitHub Copilot sessions in VS Code, the harness prepares a lot of recurring information for the model: instructions, repository context, conversation history, available tools, and the current state of the task. Some of that context is needed on every turn. The rest can be cached, deferred, or loaded only when it becomes relevant.
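As an illustration of caching and deferral, the sketch below keeps the recurring part of a prompt byte-for-byte stable so it can be keyed for reuse, and sends a full tool schema only once that tool becomes relevant. This is an assumption-laden sketch, not the Copilot harness: the tool registry, the `ALWAYS_LOADED` set, and every function name here are hypothetical.

```python
import hashlib

# Hypothetical tool registry; real tool definitions are typically much larger.
TOOLS = {
    "read_file": {
        "description": "Read a file from the workspace",
        "schema": {"path": "string"},
    },
    "run_tests": {
        "description": "Run the project's test suite",
        "schema": {"pattern": "string", "timeout_seconds": "number"},
    },
}

# Assumption: a small set of tools is always sent in full.
ALWAYS_LOADED = {"read_file"}


def stable_prefix(instructions: str, repo_context: str) -> str:
    """Join the parts of the prompt that do not change from turn to turn."""
    return instructions + "\n" + repo_context


def cache_key(prefix: str) -> str:
    """Identical prefixes hash to the same key, so a cache can reuse them."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


def tool_payload(requested: set) -> list:
    """Send full schemas only for always-loaded or explicitly requested tools."""
    payload = []
    for name in sorted(TOOLS):  # fixed ordering keeps the payload stable
        tool = TOOLS[name]
        entry = {"name": name, "description": tool["description"]}
        if name in ALWAYS_LOADED or name in requested:
            entry["schema"] = tool["schema"]
        payload.append(entry)
    return payload
```

Calling `tool_payload(set())` on an early turn omits the `run_tests` schema, while `tool_payload({"run_tests"})` includes it once the task needs it. In both cases `cache_key` returns the same value as long as the instructions and repository context are unchanged.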
