mcpproxy benchmark

Token cost of loading tools into an agent's context, by routing mode.

Corpus corpus_v1 · 45 tools · encoding cl100k_base

Mode	Tools in context	Context tokens	Savings vs. baseline
`baseline`	45	1730	—
`retrieve_tools`	10	1431	17.3%
`code_execution`	6	986	43.0%
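
The savings column can be rechecked from the context-token counts alone. A minimal sketch, assuming savings are defined as the relative reduction against the baseline row and rounded to one decimal place; the function name `savings_pct` is hypothetical and not part of mcpproxy:

```python
def savings_pct(baseline_tokens: int, mode_tokens: int) -> float:
    """Percentage of context tokens saved relative to the baseline mode."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return round((baseline_tokens - mode_tokens) / baseline_tokens * 100, 1)


# Context-token counts copied from the table above.
table_tokens = {"baseline": 1730, "retrieve_tools": 1431, "code_execution": 986}

for mode, tokens in table_tokens.items():
    print(f"{mode}: {savings_pct(table_tokens['baseline'], tokens)}%")
# baseline: 0.0%
# retrieve_tools: 17.3%
# code_execution: 43.0%
```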

Methodology notes

Token counts use the tiktoken cl100k_base encoding as a reproducible, model-agnostic estimator; exact counts for a pinned model may differ.
Proxy-mode tool sets are the full per-mode catalogs derived from the live server builders (internal/server.ProxyModeToolDefs), including the shared management tool set (upstream_servers, quarantine_security, search_servers, list_registries).
Counts cover tool name and description only. JSON input schemas are excluded uniformly from both the baseline and the proxy modes, so the savings are not guaranteed to be a conservative estimate. See bench/README.md for the live run with full schemas.
Corpus is the frozen Spec 065 snapshot (specs/065-evaluation-foundation/datasets/corpus_v1.tools.json).
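
The name+description metric described in these notes can be approximated as follows. This is a sketch under stated assumptions, not the benchmark's actual implementation: it assumes each tool is a mapping with `name` and `description` keys, that the two fields are joined with a newline before encoding, and that any schema field is simply ignored. The helper names `cl100k_counter` and `count_context_tokens` are hypothetical. Only `tiktoken.get_encoding` and `Encoding.encode` are real library calls.

```python
from typing import Callable, Iterable, Mapping


def cl100k_counter() -> Callable[[str], int]:
    """Return a token counter backed by tiktoken's cl100k_base encoding."""
    import tiktoken  # imported lazily so the counting logic is testable without it

    enc = tiktoken.get_encoding("cl100k_base")
    return lambda text: len(enc.encode(text))


def count_context_tokens(
    tools: Iterable[Mapping[str, object]],
    count: Callable[[str], int],
) -> int:
    """Sum token counts over tool name + description; schemas are not counted."""
    total = 0
    for tool in tools:
        # Assumption: name and description are joined with a newline.
        text = f"{tool['name']}\n{tool.get('description', '')}"
        total += count(text)
    return total
```

Passing the counter in as a parameter keeps the metric independent of the encoding, so the same loop could be rerun with a model-specific tokenizer. With tiktoken installed, `count_context_tokens(tools, cl100k_counter())` gives a total for a loaded tool list; the result depends on the join convention assumed above and may not reproduce the table exactly.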