Coverage Model

ToolWitness verifies tool outputs through three complementary mechanisms, each with its own trust level and coverage.

Verification Sources

MCP Proxy (highest trust)

The toolwitness proxy command wraps an MCP server and independently records every tools/call request and response. The agent cannot influence what gets recorded.

Catches: fabrication, embellishment, tool skipping
Coverage: any MCP server wrapped with toolwitness proxy
Trust: highest — proxy observes independently; agent cannot fake the record
Setup: configure in MCP settings (toolwitness proxy -- <server command>)
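
As a rough illustration of the Setup line, wrapping a server amounts to prefixing its launch command with toolwitness proxy --. The Python sketch below builds such an entry. The mcpServers layout follows the convention used by common MCP clients; the helper name, the server name "filesystem", and the example server command are assumptions for illustration and are not taken from ToolWitness documentation.

```python
import json
from typing import Dict, List


def wrap_with_proxy(server_command: List[str]) -> Dict[str, object]:
    """Build an MCP server entry that launches the server through the proxy.

    Hypothetical helper: it only mirrors the documented form
    `toolwitness proxy -- <server command>`.
    """
    if not server_command:
        raise ValueError("server_command must not be empty")
    return {
        "command": "toolwitness",
        "args": ["proxy", "--", *server_command],
    }


# Example server command (an assumption; substitute your own MCP server).
original = ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"]

settings = {"mcpServers": {"filesystem": wrap_with_proxy(original)}}
print(json.dumps(settings, indent=2))
```

The original server command is kept intact after the -- separator, so the only change to an existing configuration is the command it is launched through.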

Self-Report (medium trust)

The agent includes raw tool outputs in the tw_verify_response call. ToolWitness runs the same classifier against the self-reported data.

Catches: fabrication, embellishment (accidental — context rot, hallucination, lossy summarization)
Coverage: all tools the agent uses, including Cursor native tools (Read, Shell, Grep, Glob, SemanticSearch)
Trust: medium; the agent reports on itself, so self-report cannot catch intentional tool skipping or intentional output falsification
Setup: install the full-coverage Cursor rule (toolwitness init --cursor-rule)

SDK Wrapping (highest trust)

For custom Python agents, ToolWitnessDetector wraps tool functions directly at the code level.

Catches: fabrication, embellishment, tool skipping
Coverage: any tool function wrapped with the SDK
Trust: highest — interception at the code level; agent cannot bypass
Setup: wrap tool functions with the SDK (see getting-started guide)
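
The idea of code-level interception can be sketched without the SDK. The example below is not the ToolWitness API: RecordingWrapper and its methods are hypothetical stand-ins that show why a wrapped function's real output is recorded in the host process, where the agent cannot alter it. For actual ToolWitnessDetector usage, see the getting-started guide.

```python
import functools
from typing import Any, Callable, List, Tuple


class RecordingWrapper:
    """Hypothetical stand-in for an SDK detector; not the ToolWitness API."""

    def __init__(self) -> None:
        # Each entry is (tool name, args, kwargs, actual return value).
        self.calls: List[Tuple[str, tuple, dict, Any]] = []

    def wrap(self, func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = func(*args, **kwargs)
            # Recorded in the host process, outside the model's control.
            self.calls.append((func.__name__, args, kwargs, result))
            return result

        return wrapper

    def was_called(self, tool_name: str) -> bool:
        return any(name == tool_name for name, *_ in self.calls)


recorder = RecordingWrapper()


@recorder.wrap
def get_weather(city: str) -> dict:
    # Toy tool with a fixed result for the example.
    return {"city": city, "temp_c": 18}


get_weather("Oslo")
print(recorder.was_called("get_weather"))  # True
print(recorder.was_called("get_stock"))    # False
```

Because the record is written when the function actually runs, a claim about a tool that was never invoked has no matching entry, which is what makes tool skipping detectable at this level.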

What each source can and cannot see

Tool type	MCP Proxy	Self-Report	SDK
MCP server tools (filesystem, etc.)	Yes	Yes	N/A
Cursor native Read	No	Yes	N/A
Cursor native Shell	No	Yes	N/A
Cursor native Grep/Glob	No	Yes	N/A
Cursor native Write/StrReplace	No	Skipped (actions, not data)	N/A
Custom Python tool functions	No	N/A	Yes
LangChain/CrewAI tools	Via MCP or SDK adapter	N/A	Yes

Why self-report works

Most agent fabrication is accidental, not adversarial:

Context rot — the agent loses track of tool output as the context window fills
Hallucination — the agent confabulates data it never received
Lossy summarization — the agent misrepresents what a tool returned

In all of these cases, the agent genuinely believes its response is accurate. Self-report catches these failures because the tool output is still in context when tw_verify_response is called: the agent can pass the raw output accurately even though its summary of that output is wrong.

Self-report does not catch:

Intentional tool skipping — the agent claims it ran a tool but never did (no output to report)
Intentional output falsification — the agent passes fake output (adversarial scenario)

For MCP tools, the proxy provides the independent backstop for these scenarios.
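
To make the distinction concrete, here is a toy check in Python. It is not the ToolWitness classifier: it only flags numbers in a response that never appear in the raw tool output, and every name in it is hypothetical. It illustrates both why accidental fabrication is caught and why falsified output is not.

```python
import re
from typing import Set


def unsupported_numbers(response: str, raw_output: str) -> Set[str]:
    """Return numbers quoted in the response that are absent from the raw output."""
    number = re.compile(r"\d+(?:\.\d+)?")
    return set(number.findall(response)) - set(number.findall(raw_output))


raw = "3 files changed, 42 insertions(+), 7 deletions(-)"

# Accidental fabrication: the summary drifts from the output still in context.
print(unsupported_numbers("The commit added 420 lines across 3 files.", raw))  # {'420'}

# Faithful summary: nothing to flag.
print(unsupported_numbers("3 files changed with 42 insertions.", raw))  # set()

# Adversarial case: the agent reports fake output that matches its claim,
# so a check based only on self-reported data has nothing to contradict it.
fake_raw = "3 files changed, 420 insertions(+)"
print(unsupported_numbers("The commit added 420 lines across 3 files.", fake_raw))  # set()
```

The last case passes only because the check trusts whatever output it is given; an independently recorded copy of the output, as the proxy keeps for MCP tools, would expose the mismatch.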

Token cost

Full-coverage self-report adds approximately 5-10% more tokens per turn, because the agent passes the tool outputs it has already received a second time, to the verification call.

Scenario	Additional tokens
Read a short file (50 lines)	~500-1,000
Read a long file (500 lines)	~5,000-8,000
Shell command (git status)	~100-200
Grep search (20 results)	~1,000-2,000
Typical turn (2-3 tools)	~1,000-3,000

We recommend sending full tool outputs. Truncation reduces verification accuracy — if the agent's response references data in the truncated section, ToolWitness cannot verify it.
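
The effect of truncation can be shown with a small Python sketch. The substring check and the character limit are illustrative assumptions, not a description of how ToolWitness verifies responses.

```python
from typing import Optional


def is_verifiable(claim: str, tool_output: str, max_chars: Optional[int] = None) -> bool:
    """Toy check: a claim is verifiable only if it appears in what was sent."""
    sent = tool_output if max_chars is None else tool_output[:max_chars]
    return claim in sent


# A 500-line file whose interesting line is at the very end.
lines = [f"line {i}: ok" for i in range(1, 500)] + ["line 500: ERROR disk full"]
output = "\n".join(lines)

print(is_verifiable("ERROR disk full", output))                  # True
print(is_verifiable("ERROR disk full", output, max_chars=2000))  # False
```

With the full output, the claim can be matched; once the tail is cut off, the same correct claim can no longer be confirmed from the data that was sent.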

Coverage levels

Level	Command	What's verified
Full coverage (recommended)	`toolwitness init --cursor-rule`	All tools: MCP proxy + native self-report
MCP only	`toolwitness init --cursor-rule --minimal`	Only MCP proxy tools

Dashboard indicators

The dashboard shows which verification source produced each verdict:

Proxy Verified (green) — independently observed by the MCP proxy
Self-Reported (blue) — agent-reported tool output, verified in memory

When only proxy verifications are present, the dashboard shows a warning banner suggesting full-coverage setup.
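
The banner condition can be read as a simple predicate over the sources of the recorded verdicts. The sketch below is only an interpretation of the sentence above; the source labels "proxy" and "self_report" and the function name are hypothetical, and the dashboard's real data model is not described here.

```python
from typing import List


def needs_full_coverage_banner(verdict_sources: List[str]) -> bool:
    """True when there are verdicts and every one came from the MCP proxy."""
    return bool(verdict_sources) and all(s == "proxy" for s in verdict_sources)


print(needs_full_coverage_banner(["proxy", "proxy"]))        # True
print(needs_full_coverage_banner(["proxy", "self_report"]))  # False
print(needs_full_coverage_banner([]))                        # False
```

Treating an empty list as "no banner" is a design choice made for this sketch: with no verdicts at all there is nothing to suggest that native tools are going unverified.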

Multi-environment support

The verification instruction can be delivered through different mechanisms depending on your environment:

Environment	Command	Output
Cursor	`toolwitness init --cursor-rule`	`.cursor/rules/toolwitness-verify.mdc`
Claude Desktop	`toolwitness init --claude-desktop`	System prompt snippet
Any LLM	`toolwitness init --system-prompt`	Generic instruction text