Last week I asked Claude Code to implement something relatively trivial in my codebase. Three turns in, the conversation had used more than 80K tokens and Claude was still missing some crucial information I'd forgotten to include. That's the loop you fall into without retrieval: paste too little and the agent guesses, paste too much and you pay for context the agent never uses.
Most teams solve this with RAG over the codebase. The typical setup is a vector database plus a custom MCP server process bridging the two. Weaviate simplifies this: the MCP server is built into the database, at /v1/mcp on the same port as the REST API. One env var enables it. The same hybrid search you'd use for any other Weaviate workload powers code retrieval, with the BM25 half keeping function identifiers like connect_to_local matchable and the vector half finding semantic intent like "how do I init a client."
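To make the two halves concrete, here is a toy sketch of hybrid ranking. It is not Weaviate's implementation: the keyword scorer is a crude stand-in for BM25, the three-dimensional vectors are hand-made stand-ins for real embeddings, and the snippet ids and the `rank` helper are hypothetical. The only thing borrowed from Weaviate is the meaning of `alpha`, where 0 is keyword-only and 1 is vector-only.

```python
import math
import re

# Hypothetical three-snippet "codebase". The 3-d vectors are hand-made stand-ins
# for real embeddings (axes roughly: client setup, batch import, schema).
DOCS = [
    {"id": "client_init", "text": "client = weaviate.connect_to_local()", "vec": [0.9, 0.1, 0.0]},
    {"id": "batch_import", "text": "with collection.batch.dynamic() as batch: batch.add_object(obj)", "vec": [0.1, 0.9, 0.0]},
    {"id": "schema", "text": "client.collections.create(name)", "vec": [0.2, 0.2, 0.9]},
]


def tokenize(text):
    """Split into lowercase identifier-like tokens, keeping underscores intact."""
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def keyword_score(query, text):
    """Toy stand-in for BM25: fraction of query tokens found verbatim in the text."""
    query_tokens = tokenize(query)
    doc_tokens = set(tokenize(text))
    return sum(token in doc_tokens for token in query_tokens) / len(query_tokens)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, query_vec, alpha):
    """Order snippet ids by a weighted mix of keyword and vector scores.

    alpha=0 uses only the keyword score, alpha=1 only the vector score.
    """
    scored = [
        ((1 - alpha) * keyword_score(query, doc["text"]) + alpha * cosine(query_vec, doc["vec"]), doc["id"])
        for doc in DOCS
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


if __name__ == "__main__":
    # Identifier query with an uninformative (assumed) embedding.
    print(rank("connect_to_local", [0.3, 0.3, 0.3], alpha=0.5))
    # Natural-language query whose (assumed) embedding points at client setup.
    print(rank("how do I init a client", [1.0, 0.0, 0.0], alpha=0.5))
```

With these toy numbers, the identifier query puts `client_init` first at `alpha=0.5` but `schema` first at `alpha=1.0`, because the exact-token match is the only signal that points to the right snippet. The natural-language query lands on `client_init` mostly through the vector score, since "init" never appears in the code.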
This post walks through building a coding assistant on top of that built-in MCP server: ingest a codebase, ingest its docs, connect Claude Code, Cursor, and VS Code, and run real queries. Topics covered:
Why your coding assistant needs more than its training data
Why Weaviate MCP fits this job
