Every time Claude Code needs to understand a function, it reads the entire file. If it needs to check how a module’s API works, it reads every file in the directory. On a large codebase, a single exploratory task can burn through 200,000+ tokens before any real work starts.
jCodeMunch fixes this by building a searchable symbol index of your code using tree-sitter AST parsing, then serving it over MCP. Instead of reading whole files, agents request the exact symbol they need - a single function, class, or type definition - and get back just that code.
## How it works
jCodeMunch indexes a repository in three steps:
- It walks the file tree and filters out binaries, secrets, and irrelevant files
- It parses each source file with tree-sitter to extract symbols - functions, classes, methods, types, constants
- It stores the results as a JSON index, alongside the raw files, in `~/.code-index/`
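The parsing step can be sketched in miniature. jCodeMunch uses tree-sitter grammars across many languages, but Python's stdlib `ast` module works as a stand-in to show the idea: walk the syntax tree and record each symbol's name, kind, and position. All names below are my own illustration, not jCodeMunch's internals.

```python
import ast

def extract_symbols(source: str, file_path: str) -> list[dict]:
    """Walk a parse tree and record symbols with their positions.

    A stand-in for tree-sitter: real grammars cover 15+ languages,
    while `ast` handles only Python.
    """
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        symbols.append({
            # Stable ID in the {file_path}::{qualified_name}#{kind} format
            "id": f"{file_path}::{node.name}#{kind}",
            "name": node.name,
            "kind": kind,
            "start_line": node.lineno,
            "end_line": node.end_lineno,
        })
    return symbols

source = "class Greeter:\n    def hello(self):\n        return 'hi'\n"
for sym in extract_symbols(source, "greeter.py"):
    print(sym["id"])
```

The resulting records are what lands in the JSON index: enough metadata to locate a symbol later without re-reading its file.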
Each symbol gets a stable ID in the format `{file_path}::{qualified_name}#{kind}`. When an agent needs a symbol, jCodeMunch uses byte-offset seeking to extract just that symbol from the file. No loading the full file into context.
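Byte-offset extraction itself is plain file seeking: the index stores each symbol's byte span, and retrieval reads only those bytes. A minimal sketch, assuming the index records `start`/`end` byte offsets (my field names, not the project's actual schema):

```python
import os
import tempfile

def read_symbol_span(path: str, start_byte: int, end_byte: int) -> str:
    """Seek to the symbol's start offset and read only its bytes --
    the rest of the file never enters memory (or context)."""
    with open(path, "rb") as f:
        f.seek(start_byte)
        return f.read(end_byte - start_byte).decode("utf-8")

# Demo: a file with two functions; an index would store each span.
code = b"def a():\n    pass\n\ndef b():\n    return 42\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(code)

# Hypothetical index entry for `b`: bytes 19..42 of the file.
span = read_symbol_span(tmp.name, 19, len(code))
print(span)  # prints the body of b() only
os.unlink(tmp.name)
```

Seeking is O(1) regardless of file size, so pulling one function out of a 10,000-line file costs the same as pulling it out of a 50-line one.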
The MCP server exposes 11 tools. The ones I use most:
- `get_symbol` and `get_symbols` - retrieve specific functions or types by ID
- `get_file_outline` - see all symbols in a file without reading the file body
- `get_repo_outline` - understand project structure at the symbol level
- `search_symbols` - find symbols by name across the codebase
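Because the index is just JSON metadata, a name search can be as simple as a substring scan over indexed symbols. A hedged sketch of what `search_symbols` might do internally (the index layout here is my guess, not jCodeMunch's actual schema):

```python
def search_symbols(index: dict, query: str) -> list[dict]:
    """Case-insensitive name match over indexed symbols.
    Returns metadata only -- never file bodies."""
    q = query.lower()
    return [s for s in index["symbols"] if q in s["name"].lower()]

# A toy index with three symbols (illustrative paths and names).
index = {
    "symbols": [
        {"id": "app/db.py::connect#function", "name": "connect"},
        {"id": "app/db.py::Connection#class", "name": "Connection"},
        {"id": "app/http.py::get#function", "name": "get"},
    ]
}
hits = search_symbols(index, "connect")
print([s["id"] for s in hits])
```

The key property is that the response is a list of IDs and kinds, each a couple hundred tokens at most; the agent only pays for full source when it follows up with `get_symbol`.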
## Token savings in practice
jCodeMunch includes token accounting in every response. The numbers are concrete:
| Task | Traditional approach | jCodeMunch |
|---|---|---|
| Find a function | ~40,000 tokens | ~200 tokens |
| Understand a module’s API | ~15,000 tokens | ~800 tokens |
| Explore repo structure | ~200,000 tokens | ~2,000 tokens |
On a real codebase benchmark (the geekcomputers/Python repo), jCodeMunch measured roughly 80% fewer tokens - about a 5x efficiency gain.
The savings compound. An agent exploring a codebase to plan a refactor might read dozens of files. With jCodeMunch, it reads outlines first, identifies the relevant symbols, and pulls only those.
## Setting it up
Install via pip:
```
pip install jcodemunch-mcp
```
For Claude Code, add it to your MCP config with `uvx jcodemunch-mcp`. Then index a repo:
```
index_repo owner/repo-name
```
Or index a local folder:
```
index_folder /path/to/project
```
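For reference, the Claude Code registration above typically lives in an MCP config entry like the following. The server name `jcodemunch` is my choice; check the project's README for the exact entry it recommends:

```json
{
  "mcpServers": {
    "jcodemunch": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    }
  }
}
```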
It supports 15+ languages - Python, TypeScript, Rust, Go, Java, and more. Each language has a tree-sitter grammar wired up to extract the symbol kinds that matter for it.
## Where it fits
jCodeMunch is most useful on large, multi-module codebases where reading everything is impractical. Architecture exploration, agent-driven refactors, and onboarding to unfamiliar projects are where the token savings matter most.
It’s not a replacement for LSP. It doesn’t do diagnostics, type checking, or real-time editing support. It’s a read-only index optimised for AI agent retrieval patterns.
The project is free for non-commercial use. Commercial licenses are available if you’re building it into a product.
## The broader point
The context window is finite even as model capabilities improve. Tools that reduce how many tokens you spend on reading code leave more room for the model to reason about the actual task.
jCodeMunch takes a straightforward approach - parse the AST, index the symbols, serve them on demand. It addresses the biggest source of token waste in agent workflows.