claude-context: Hybrid Search MCP for Big Monorepos
Zilliz shipped the missing piece in every coding-agent workflow — a hybrid BM25 + dense-vector MCP that cuts agent tokens ~40% on real monorepos.
Every coding agent has the same failure mode on a real codebase. You ask it to change how invoices get rendered, and it spends eleven tool calls grepping for the word "invoice" across a monorepo that contains four billing systems, two legacy importers, and a folder called `old/` that nobody has touched since 2023. By the time it finds the right file, you've burned more tokens than the actual edit will cost.
Zilliz — the team behind Milvus, which is the vector database a meaningful chunk of production RAG stacks already runs on — shipped `claude-context` to fix exactly this. It's an MCP server that indexes your repo once and then serves semantic-plus-keyword retrieval to whatever agent you happen to be driving. As of this week it's sitting at 11.6k stars, which is fast even by 2026 MCP-hype standards.
What it actually does
The retrieval is hybrid: BM25 for the keyword half (still the best thing we have when the agent already knows the function name), dense vector embeddings for the semantic half (for when it's asking "where do we handle Stripe webhook retries" and the file is called `payment_resilience.ts`). The index lives in Milvus or Zilliz Cloud, so local versus hosted is your choice.
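To make the "hybrid" part concrete, here is a minimal sketch of one common way to merge a keyword ranking with a vector ranking: reciprocal rank fusion. This is an assumption for illustration. The article does not say which fusion method `claude-context` uses, and every file name other than `payment_resilience.ts` is made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each list contributes 1 / (k + rank) per document, so a file that
    ranks well in either list floats toward the top of the fused list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical hits for "where do we handle Stripe webhook retries".
bm25_hits = ["stripe_client.ts", "webhooks.ts", "payment_resilience.ts"]
dense_hits = ["payment_resilience.ts", "webhooks.ts", "retry_queue.ts"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

In this toy run the keyword search ranks `payment_resilience.ts` last because the name shares no words with the query, but the semantic ranking pulls it to the top of the fused list. That is the whole argument for running both halves.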
Chunking is AST-based, not naive 500-token sliding windows, so a function doesn't get sliced across two chunks and lose its signature. Re-indexing uses Merkle trees, which means changing one file doesn't re-embed the other 99,999. That last bit matters more than the marketing copy makes it sound — naive re-indexers are why a lot of teams gave up on code-search RAG in 2024.
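Both ideas fit in a short sketch. The code below is illustrative only and is not `claude-context`'s implementation: it uses Python's built-in `ast` module as a stand-in parser (the article doesn't say which parser the tool uses), models the repo as a nested dict instead of a real file system, and ignores deleted files. The names `chunk_by_function`, `build`, and `changed_files` are hypothetical.

```python
import ast
import hashlib


def chunk_by_function(source):
    """Return one chunk per top-level function or class, signature included."""
    tree = ast.parse(source)
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return [ast.get_source_segment(source, node)
            for node in tree.body if isinstance(node, wanted)]


def sha(data):
    return hashlib.sha256(data).hexdigest()


def build(node):
    """Build a Merkle node from a nested dict: str = file content, dict = directory."""
    if isinstance(node, str):
        return {"hash": sha(node.encode()), "children": None}
    children = {name: build(child) for name, child in node.items()}
    # A directory's hash depends only on its children's names and hashes.
    joined = "\n".join(f"{name}:{children[name]['hash']}" for name in sorted(children))
    return {"hash": sha(joined.encode()), "children": children}


def changed_files(old, new, path=""):
    """Return paths of new or modified files, skipping unchanged subtrees."""
    if old is not None and old["hash"] == new["hash"]:
        return []  # identical subtree: nothing below needs re-embedding
    if new["children"] is None:
        return [path]  # a new or modified file
    old_children = old["children"] if old is not None and old["children"] else {}
    changed = []
    for name in sorted(new["children"]):
        child_path = f"{path}/{name}".lstrip("/")
        changed += changed_files(old_children.get(name), new["children"][name], child_path)
    return changed


sample = "import os\n\ndef render(invoice):\n    return str(invoice)\n\nclass Importer:\n    pass\n"
chunks = chunk_by_function(sample)

before = {"billing": {"invoice.py": "def render(): ...", "tax.py": "RATE = 0.2"},
          "old": {"importer.py": "pass"}}
after = {"billing": {"invoice.py": "def render(pdf=True): ...", "tax.py": "RATE = 0.2"},
         "old": {"importer.py": "pass"}}
to_reembed = changed_files(build(before), build(after))
```

Here `chunks` holds two pieces, each starting at its own `def` or `class` line, and `to_reembed` contains only `billing/invoice.py`. The `old/` directory hashes the same before and after, so the walk never descends into it. That is the property that keeps one edit from re-embedding the rest of the repo.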
The compatibility list is the actual story
MCP servers live or die on whether they work with the agent you're already using. This one supports an almost suspicious number of clients:
- Claude Code, Claude Desktop, Cursor, VS Code
- Gemini CLI, Codex CLI, Qwen Code
- Windsurf, Cline, Roo Code, Void, Augment, Zencoder
- LangChain / LangGraph for the build-your-own-agent crowd
Fifteen clients on day one means the team is treating this as infrastructure, not a Claude-Code-only loss leader. Languages indexed cover the boring middle of the industry — TypeScript, Python, Go, Rust, Java, C#, C++, Ruby, PHP, Swift, Kotlin, Scala, plus Markdown so your docs land in the same retrieval index as your code.
The number that matters
For an indie hacker burning $40-60/month on Claude usage, a 40% token cut is a movie ticket. For a five-person dev shop running ten agent sessions a day across a Next.js monorepo, it's roughly the difference between a $700 monthly bill and a $400 one. And the cost gets uglier as the repo grows, because an agent that tries to read everything spends tokens in proportion to file count.
The actual opinion
MCP servers in 2026 are mostly two kinds: thin wrappers around an API that already existed (fine, useful, forgettable), and infrastructure that changes the cost curve of doing agent work (rare, durable, worth installing). `claude-context` is the second kind. (If you want to see how we wire this kind of retrieval into Ascero's own product stack, the agent pages have the diagrams.)
The pitch isn't "new capability." Your agent could already read files. The pitch is "the same capability, but the read-the-wrong-file-six-times tax disappears." That's a less sexy headline than most VC-backed dev tools lead with, which is probably why a research-lab side project from a vector-DB company is the one that nailed it.
"If you maintain a monorepo and you're not running semantic search in front of your agent by end of 2026, you're paying the grep tax on every session."
Install is the usual `claude mcp add claude-context` plus an embedding-provider key (OpenAI, VoyageAI, Ollama, or Gemini — pick your poison) and either local Milvus or a Zilliz Cloud free tier. Twenty minutes to set up, ships ROI on session one.
Ascero AI. “claude-context: Hybrid Search MCP for Big Monorepos.” May 8, 2026. https://asceroai.com/news/claude-context-mcp-semantic-search
Free to reference with attribution and a link back to this page.