Coverage Model

ToolWitness verifies tool outputs through three complementary mechanisms, each with different trust levels and coverage.

Verification Sources

MCP Proxy (highest trust)

The toolwitness proxy command wraps an MCP server and independently records every tools/call request and response. The agent cannot influence what gets recorded.

  • Catches: fabrication, embellishment, tool skipping
  • Coverage: any MCP server wrapped with toolwitness proxy
  • Trust: highest — proxy observes independently; agent cannot fake the record
  • Setup: configure in MCP settings (toolwitness proxy -- <server command>)
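
As a rough sketch of that setup step, the snippet below rewrites MCP server entries so each server is launched through toolwitness proxy. It assumes the common mcpServers JSON layout with command and args keys. The helper name wrap_with_proxy and the example filesystem server are illustrative only, so check your client's MCP settings format before relying on it.

```python
import json


def wrap_with_proxy(entry: dict) -> dict:
    """Return a copy of an MCP server entry that launches via `toolwitness proxy`."""
    original = [entry["command"], *entry.get("args", [])]
    wrapped = dict(entry)  # keep other keys (for example "env") untouched
    wrapped["command"] = "toolwitness"
    wrapped["args"] = ["proxy", "--", *original]
    return wrapped


# Example settings (assumed layout; the filesystem server is just a stand-in).
settings = {
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        }
    }
}

settings["mcpServers"] = {
    name: wrap_with_proxy(entry) for name, entry in settings["mcpServers"].items()
}
print(json.dumps(settings, indent=2))
```

The resulting entry runs toolwitness proxy -- npx -y @modelcontextprotocol/server-filesystem /tmp, which matches the command shape given in the setup bullet.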

Self-Report (medium trust)

The agent includes the raw tool outputs in its tw_verify_response call. ToolWitness then runs the same classifier it applies to independently recorded data against this self-reported output.

  • Catches: fabrication, embellishment (accidental — context rot, hallucination, lossy summarization)
  • Coverage: all tools the agent uses, including Cursor native tools (Read, Shell, Grep, Glob, SemanticSearch)
  • Trust: medium (the agent reports on itself); cannot catch intentional skipping or intentionally falsified output
  • Setup: install the full-coverage Cursor rule (toolwitness init --cursor-rule)

SDK Wrapping (highest trust)

For custom Python agents, ToolWitnessDetector wraps tool functions directly at the code level.

  • Catches: fabrication, embellishment, tool skipping
  • Coverage: any tool function wrapped with the SDK
  • Trust: highest — interception at the code level; agent cannot bypass
  • Setup: wrap tool functions with the SDK (see getting-started guide)
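
To illustrate what interception at the code level means, here is a minimal sketch of a recording wrapper. It is not the ToolWitnessDetector API (see the getting-started guide for that). The names witnessed, CALL_LOG, was_called, and get_weather are hypothetical and only show the general pattern.

```python
import functools
from typing import Any, Callable

# Hypothetical in-memory record of every real tool execution.
CALL_LOG: list[dict[str, Any]] = []


def witnessed(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool function so each call and its output is recorded."""

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = fn(*args, **kwargs)
        CALL_LOG.append(
            {"tool": fn.__name__, "args": args, "kwargs": kwargs, "output": result}
        )
        return result

    return wrapper


@witnessed
def get_weather(city: str) -> dict:
    # Stand-in tool; a real one would call an external service.
    return {"city": city, "temp_c": 21}


def was_called(tool_name: str) -> bool:
    """True if the log holds at least one recorded execution of the tool."""
    return any(record["tool"] == tool_name for record in CALL_LOG)


get_weather("Paris")
print(was_called("get_weather"))   # True
print(was_called("get_forecast"))  # False: a claimed call with no record
```

Because the record is written by the wrapper rather than by the agent, a tool the agent claims to have run but never executed leaves no entry, which is how tool skipping becomes detectable.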

What each source can and cannot see

| Tool type | MCP Proxy | Self-Report | SDK |
|---|---|---|---|
| MCP server tools (filesystem, etc.) | Yes | Yes | N/A |
| Cursor native Read | No | Yes | N/A |
| Cursor native Shell | No | Yes | N/A |
| Cursor native Grep/Glob | No | Yes | N/A |
| Cursor native Write/StrReplace | No | Skipped (actions, not data) | N/A |
| Custom Python tool functions | No | N/A | Yes |
| LangChain/CrewAI tools | Via MCP or SDK adapter | N/A | Yes |

Why self-report works

Most agent fabrication is accidental, not adversarial:

  • Context rot — the agent loses track of tool output as the context window fills
  • Hallucination — the agent confabulates data it never received
  • Lossy summarization — the agent misrepresents what a tool returned

In all these cases, the agent genuinely believes its response is accurate. Self-report catches these failures because the tool output is still in context when tw_verify_response is called — the agent can accurately pass the raw output even though its summary of that output is wrong.

Self-report does not catch:

  • Intentional tool skipping — the agent claims it ran a tool but never did (no output to report)
  • Intentional output falsification — the agent passes fake output (adversarial scenario)

For MCP tools, the proxy provides the independent backstop for these scenarios.
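
The accidental case can be made concrete with a toy check. The sketch below is not ToolWitness's classifier. It is a deliberately simple stand-in, with a hypothetical function name, that flags numbers cited in a response but absent from the raw tool output the agent passed along.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(response: str, raw_outputs: list[str]) -> list[str]:
    """Return numbers cited in the response that appear in none of the raw outputs."""
    evidence: set[str] = set()
    for output in raw_outputs:
        evidence.update(NUMBER.findall(output))
    return [n for n in NUMBER.findall(response) if n not in evidence]


raw = ["3 files changed, 42 insertions(+), 7 deletions(-)"]

faithful = "The commit changed 3 files with 42 insertions and 7 deletions."
drifted = "The commit changed 5 files with 42 insertions."

print(unsupported_numbers(faithful, raw))  # []
print(unsupported_numbers(drifted, raw))   # ['5']
```

The check is only as good as the raw output it receives. If the agent supplies fabricated output, the comparison passes, which is why the adversarial scenarios above need an independent record.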

Token cost

Full-coverage self-report adds approximately 5-10% to token usage per turn, because the agent passes the tool outputs it has already received a second time, as arguments to the verification call.

| Scenario | Additional tokens |
|---|---|
| Read a short file (50 lines) | ~500-1,000 |
| Read a long file (500 lines) | ~5,000-8,000 |
| Shell command (git status) | ~100-200 |
| Grep search (20 results) | ~1,000-2,000 |
| Typical turn (2-3 tools) | ~1,000-3,000 |

We recommend sending full tool outputs. Truncation reduces verification accuracy — if the agent's response references data in the truncated section, ToolWitness cannot verify it.
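
For a back-of-the-envelope estimate of this overhead in your own sessions, the sketch below sums the approximate token count of each tool output. It assumes the rough rule of thumb of about 4 characters per token, which varies by model and content. The function names are illustrative and not part of ToolWitness.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate token count from character length (heuristic, not a tokenizer)."""
    return round(len(text) / chars_per_token)


def self_report_overhead(tool_outputs: list[str]) -> int:
    """Extra tokens spent resending every tool output to the verification call."""
    return sum(estimate_tokens(output) for output in tool_outputs)


def overhead_percent(extra_tokens: int, turn_tokens: int) -> float:
    """Overhead as a percentage of the tokens the turn would use anyway."""
    return 100.0 * extra_tokens / turn_tokens


# Two stand-in tool outputs: 400 and 2,000 characters.
outputs = ["x" * 400, "y" * 2000]
extra = self_report_overhead(outputs)

print(extra)                            # 600
print(overhead_percent(extra, 10_000))  # 6.0
```

For exact figures, replace estimate_tokens with the tokenizer of the model you actually use.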

Coverage levels

| Level | Command | What's verified |
|---|---|---|
| Full coverage (recommended) | toolwitness init --cursor-rule | All tools: MCP proxy + native self-report |
| MCP only | toolwitness init --cursor-rule --minimal | Only MCP proxy tools |

Dashboard indicators

The dashboard shows which verification source produced each verdict:

  • Proxy Verified (green) — independently observed by the MCP proxy
  • Self-Reported (blue) — agent-reported tool output, verified in memory

When only proxy verifications are present, the dashboard shows a warning banner suggesting full-coverage setup.

Multi-environment support

The verification instruction can be delivered through different mechanisms depending on your environment:

| Environment | Command | Output |
|---|---|---|
| Cursor | toolwitness init --cursor-rule | .cursor/rules/toolwitness-verify.mdc |
| Claude Desktop | toolwitness init --claude-desktop | System prompt snippet |
| Any LLM | toolwitness init --system-prompt | Generic instruction text |