jCodeMunch: Use 80% Less Context

Every time Claude Code needs to understand a function, it reads the entire file. If it needs to check how a module’s API works, it reads every file in the directory. On a large codebase, a single exploratory task can burn through 200,000+ tokens before any real work starts.

jCodeMunch fixes this by building a searchable symbol index of your code using tree-sitter AST parsing, then serving it over MCP. Instead of reading whole files, agents request the exact symbol they need - a single function, class, or type definition - and get back just that code.

How it works

jCodeMunch indexes a repository in three steps:

  1. It walks the file tree and filters out binaries, secrets, and irrelevant files
  2. It parses each source file with tree-sitter to extract symbols - functions, classes, methods, types, constants
  3. It stores the results as a JSON index alongside raw files in ~/.code-index/
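The three steps above can be sketched roughly like this. This is an illustrative stand-in, not jCodeMunch's actual code: the filter sets are examples, and the "parser" is a crude regex-free scan standing in for tree-sitter.

```python
import json
from pathlib import Path

SKIP_SUFFIXES = {".png", ".jpg", ".zip", ".pyc"}  # illustrative binary filter
SKIP_NAMES = {".env", "id_rsa"}                   # illustrative secret filter

def walk_sources(root: Path):
    """Step 1: walk the file tree, filtering binaries and secrets."""
    for path in root.rglob("*"):
        if (path.is_file()
                and path.suffix not in SKIP_SUFFIXES
                and path.name not in SKIP_NAMES):
            yield path

def extract_symbols(path: Path):
    """Step 2: crude stand-in for tree-sitter - records name, kind, line."""
    symbols = []
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
        if line.lstrip().startswith("def "):
            name = line.split("def ", 1)[1].split("(")[0].strip()
            symbols.append({"name": name, "kind": "function", "line": lineno})
    return symbols

def build_index(root: Path, out: Path):
    """Step 3: store the results as a JSON index."""
    index = {str(p): extract_symbols(p) for p in walk_sources(root)}
    out.write_text(json.dumps(index, indent=2))
    return index
```

A real indexer would record byte ranges and qualified names per symbol; the shape of the pipeline is the point here.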

Each symbol gets a stable ID in the format {file_path}::{qualified_name}#{kind}. When an agent needs a symbol, jCodeMunch uses byte-offset seeking to extract just that symbol from the file. No loading the full file into context.
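The ID format and byte-offset seeking can be illustrated with a minimal sketch. The index layout here (symbol ID mapped to a byte range) is my assumption, not jCodeMunch's actual schema:

```python
def parse_symbol_id(symbol_id: str):
    """Split '{file_path}::{qualified_name}#{kind}' into its parts."""
    path, rest = symbol_id.split("::", 1)
    name, kind = rest.rsplit("#", 1)
    return path, name, kind

def read_symbol(index: dict, symbol_id: str) -> str:
    """Seek to the symbol's byte range instead of loading the whole file."""
    path, _name, _kind = parse_symbol_id(symbol_id)
    start, end = index[symbol_id]  # byte offsets recorded at index time
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start).decode()
```

The win is that the cost of a lookup is proportional to the symbol's size, not the file's.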

The MCP server exposes 11 tools. The ones I use most:

  • get_symbol and get_symbols - retrieve specific functions or types by ID
  • get_file_outline - see all symbols in a file without reading the file body
  • get_repo_outline - understand project structure at the symbol level
  • search_symbols - find symbols by name across the codebase
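A typical retrieval flow chains two of these tools: search for candidates by name, then fetch only the matching definitions. The parameter names below are my guesses from the tool names, not a documented schema, and `call_tool` stands in for whatever MCP client the agent host provides:

```python
def find_and_fetch(call_tool, name: str):
    """Locate symbols by name, then pull just those definitions."""
    # 1. Search the index instead of grepping file bodies.
    hits = call_tool("search_symbols", {"query": name})
    # 2. Retrieve only the matching symbols, not their files.
    ids = [h["id"] for h in hits]
    return call_tool("get_symbols", {"ids": ids})
```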

Token savings in practice

jCodeMunch includes token accounting in every response. The numbers are concrete:

Task                         Traditional approach   jCodeMunch
Find a function              ~40,000 tokens         ~200 tokens
Understand a module’s API    ~15,000 tokens         ~800 tokens
Explore repo structure       ~200,000 tokens        ~2,000 tokens

On a real codebase benchmark (the geekcomputers/Python repo), jCodeMunch measured roughly 80% fewer tokens than whole-file reads - about a 5x reduction, since 1 / (1 - 0.8) = 5.

The savings compound. An agent exploring a codebase to plan a refactor might read dozens of files. With jCodeMunch, it reads outlines first, identifies the relevant symbols, and pulls only those.
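To make the compounding concrete, here is illustrative arithmetic using the per-task figures from the table above. The module and symbol counts are made up for the example:

```python
# Hypothetical refactor plan touching 12 modules; per-task token costs
# are the rough figures from the comparison table.
modules = 12
whole_file_reads = modules * 15_000            # read every module in full
outline_then_fetch = modules * 800 + 5 * 200   # outline all, fetch 5 symbols
reduction = 1 - outline_then_fetch / whole_file_reads  # fraction of tokens saved
```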

Setting it up

Install via pip:

pip install jcodemunch-mcp

For Claude Code, register it in your MCP config with the command uvx jcodemunch-mcp. Then index a GitHub repo:

index_repo owner/repo-name

Or index a local folder:

index_folder /path/to/project
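For reference, the uvx registration can live in a project-level .mcp.json, Claude Code's standard MCP config file. The server name "jcodemunch" here is my choice:

```json
{
  "mcpServers": {
    "jcodemunch": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    }
  }
}
```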

It supports 15+ languages - Python, TypeScript, Rust, Go, Java, and more. Each language has its own tree-sitter grammar, so the indexer extracts the symbol kinds that matter for that language.

Where it fits

jCodeMunch is most useful on large, multi-module codebases where reading everything is impractical. Architecture exploration, agent-driven refactors, and onboarding to unfamiliar projects are where the token savings matter most.

It’s not a replacement for LSP. It doesn’t do diagnostics, type checking, or real-time editing support. It’s a read-only index optimised for AI agent retrieval patterns.

The project is free for non-commercial use. Commercial licenses are available if you’re building it into a product.

The broader point

The context window is finite even as model capabilities improve. Tools that reduce how many tokens you spend on reading code leave more room for the model to reason about the actual task.

jCodeMunch takes a straightforward approach - parse the AST, index the symbols, serve them on demand. It addresses one of the biggest sources of token waste in agent workflows: reading whole files to find a few relevant lines.