CLI Reference¶

ToolWitness provides a command-line interface for inspecting verification results, running reports, and managing configuration.

Commands¶

`toolwitness check`¶

Show recent verification results.

toolwitness check                                  # All recent results
toolwitness check --last 10                        # Last 10 results
toolwitness check --classification fabricated      # Filter by classification
toolwitness check --fail-if "failure_rate > 0.05"  # CI gate mode
toolwitness check --fail-if "fabricated_count > 0" # Fail on any fabrication

CI gate: When --fail-if is provided, the command exits with code 1 if the condition is met. Use this in CI pipelines to block deployments when the failure rate or fabrication count crosses a threshold.

Option	Description
`--last N`	Show the last N results
`--classification TYPE`	Filter by classification (verified, embellished, fabricated, skipped)
`--fail-if CONDITION`	Exit with code 1 if condition is met
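
As a rough illustration of the CI gate, the sketch below wraps `toolwitness check --fail-if` in a small Python helper. It relies only on what this page states: the command exits with code 1 when the condition is met. The helper names (`build_gate_command`, `gate_tripped`, `run_gate`) are hypothetical, and treating any other non-zero exit code as an unrelated error is an assumption.

```python
import subprocess
from typing import List


def build_gate_command(condition: str, executable: str = "toolwitness") -> List[str]:
    """Build the argv for `toolwitness check --fail-if CONDITION`."""
    return [executable, "check", "--fail-if", condition]


def gate_tripped(returncode: int) -> bool:
    """Exit code 1 means the --fail-if condition was met (per this page)."""
    return returncode == 1


def run_gate(condition: str, executable: str = "toolwitness") -> bool:
    """Run the gate; return True when the build should be blocked.

    Assumption: any other non-zero exit code is an unrelated error, so it is
    raised rather than silently treated as a pass.
    """
    result = subprocess.run(build_gate_command(condition, executable))
    if gate_tripped(result.returncode):
        return True
    result.check_returncode()  # raises CalledProcessError on other non-zero codes
    return False
```

In a pipeline script, `run_gate("fabricated_count > 0")` would return `True` when the gate condition is met, at which point the script can stop the deployment.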

`toolwitness stats`¶

Show per-tool failure rates and classification counts.

toolwitness stats

Output:

Tool              Total  Verified  Fabricated  Skipped  Fail %
get_weather         12        10           1        1   16.7%
search_web           8         8           0        0    0.0%
get_customer         5         3           2        0   40.0%
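
The Fail % values in the sample output above are consistent with counting every non-verified result as a failure, that is (Total - Verified) / Total. That formula is inferred from the sample figures rather than documented, so the sketch below is a sanity check of the arithmetic, not ToolWitness's implementation. The function name is hypothetical.

```python
def fail_percent(total: int, verified: int) -> str:
    """Format the share of non-verified results as a percentage.

    Assumption: a failure is any result that is not verified. This matches
    the sample rows but is not confirmed by the documentation.
    """
    if total == 0:
        return "0.0%"
    return f"{100 * (total - verified) / total:.1f}%"


# (tool, total, verified) taken from the sample output above
rows = [("get_weather", 12, 10), ("search_web", 8, 8), ("get_customer", 5, 3)]
for tool, total, verified in rows:
    print(f"{tool:<14} {fail_percent(total, verified):>6}")
```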

`toolwitness watch`¶

Live-tail verification results as they happen.

toolwitness watch

Streams new verifications to the terminal in real time. Press Ctrl+C to stop.

`toolwitness report`¶

Generate a verification report.

toolwitness report --format html    # Self-contained HTML report
toolwitness report --format json    # JSON data export

The HTML report includes:

KPI summary cards
Classification breakdown
Session timelines with color-coded nodes
Failure detail cards with evidence
Remediation suggestions
Per-tool failure rates

`toolwitness dashboard`¶

Start the local web dashboard.

toolwitness dashboard                    # Default: 127.0.0.1:8321
toolwitness dashboard --port 9000        # Custom port
toolwitness dashboard --host 0.0.0.0     # Bind to all interfaces

The dashboard serves:

Overview (/) — KPI cards, classification breakdown, recent verifications
Report (/report) — full HTML report with session timelines and failure details
About (/about) — product information and install instructions
API (/api/verifications, /api/executions, /api/stats, /api/sessions, /api/health)

The dashboard auto-refreshes every 5 seconds.

Option	Default	Description
`--host`	`127.0.0.1`	Host to bind to
`--port`	`8321`	Port to listen on
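
The API routes listed above can also be polled from a script. The sketch below builds a URL from the documented default host and port and returns the raw response body. Plain HTTP is assumed, and the response format is not described on this page, so nothing is parsed. The function names are hypothetical.

```python
from urllib.request import urlopen


def api_url(path: str, host: str = "127.0.0.1", port: int = 8321) -> str:
    """Join the dashboard's host and port (documented defaults) with a route."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def fetch_raw(path: str, host: str = "127.0.0.1", port: int = 8321,
              timeout: float = 5.0) -> bytes:
    """GET a dashboard route and return the undecoded body.

    Requires `toolwitness dashboard` to be running; the body is returned
    as-is because its schema is not documented here.
    """
    with urlopen(api_url(path, host, port), timeout=timeout) as response:
        return response.read()
```

For example, `fetch_raw("/api/health")` could serve as a liveness probe once the dashboard is up.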

`toolwitness verify`¶

Verify agent text against recent proxy-recorded tool executions. This command closes the MCP proxy gap: the proxy records what the tools returned (Conversation 1), and this command compares the agent's text (Conversation 2) against those recordings.

toolwitness verify --text "The file is 6169 bytes, modified March 27"
toolwitness verify --file response.txt --since 10
echo "agent output" | toolwitness verify --file -

Output:

Verified against 2 recent tool execution(s):

  VERIFIED   get_file_info                  confidence=99%
  FABRICATED get_weather                    confidence=78%
    ↳ temp_f: expected=72, found_in_response=False

⚠ Failures detected — agent response may not accurately reflect tool outputs.

Option	Default	Description
`--text TEXT`	—	Agent response text to verify
`--file PATH`	—	File containing the response (use `-` for stdin)
`--since MINUTES`	`5`	Look back window for matching executions
`--no-persist`	off	Don't save results to the database

Pair with the MCP proxy

Run toolwitness proxy to record tool calls, then use toolwitness verify to check whether an agent's response accurately reflects what the tools returned. Results appear on the dashboard alongside other verifications.

`toolwitness serve`¶

Start the ToolWitness MCP verification server. This exposes tw_verify_response, tw_recent_executions, and tw_session_stats as MCP tools that agents can call to self-check their responses in real time.

toolwitness serve                    # Default database
toolwitness serve --db /path/to.db   # Custom database

Configure in your MCP host alongside the proxy:

The same configuration applies to both Cursor (~/.cursor/mcp.json) and Claude Desktop:

{
  "mcpServers": {
    "filesystem-monitored": {
      "command": "/full/path/to/toolwitness",
      "args": ["proxy", "--", "npx", "-y", "@modelcontextprotocol/server-filesystem", "/path"]
    },
    "toolwitness": {
      "command": "/full/path/to/toolwitness",
      "args": ["serve"]
    }
  }
}

Pair with a Cursor rule to make verification automatic — see examples/cursor-rule-verify.md in the repo.

Option	Default	Description
`--db PATH`	`~/.toolwitness/toolwitness.db`	SQLite database path

Requires the MCP SDK

Install with pip install 'toolwitness[serve]' or pip install mcp.

MCP tools exposed:

Tool	Description
`tw_verify_response`	Verify response text against recent executions. Returns per-tool classifications.
`tw_recent_executions`	List recently recorded tool calls with receipt IDs and output previews.
`tw_session_stats`	Aggregate verification statistics (verified/fabricated/skipped counts).

`toolwitness digest`¶

Generate a verification activity digest for a time period. Designed for daily reports and team notifications.

toolwitness digest                              # Last 24h, text to stdout
toolwitness digest --period 7d --format json    # Last 7 days, JSON
toolwitness digest --send                       # Deliver via Slack/webhook
toolwitness digest --period 1h --format slack   # Last hour, Slack blocks

Output (text):

ToolWitness Digest — last 24h
==================================================

  Total verifications:  47
  Failures:             3
  Failure rate:         6.4%

  Breakdown:
    verified          44
    fabricated         2
    skipped            1

  Top offending tools:
    read_file                           2 failures / 15 total
    get_file_info                       1 failures / 12 total

Cron setup: Schedule with --send to deliver daily reports via configured channels:

# Run at 6pm daily
0 18 * * * /path/to/toolwitness digest --send --period 24h

Delivery requires slack_webhook_url or webhook_url to be set in toolwitness.yaml (or via environment variables).

Option	Default	Description
`--period DURATION`	`24h`	Time window: `1h`, `24h`, `7d`, etc.
`--format FORMAT`	`text`	Output: `text`, `json`, or `slack`
`--send`	off	Deliver via configured Slack/webhook channels

`toolwitness executions`¶

Show recorded tool executions — especially useful for MCP Proxy users whose tool calls are recorded as executions (not verifications).

toolwitness executions                    # Last 10 executions
toolwitness executions --last 20          # Last 20 executions
toolwitness executions --tool read_file   # Filter by tool name
toolwitness executions --session abc123   # Filter by session ID

Output:

  12:34:56 RECORDED   read_file                      receipt=2e7614db-d6a…  session=e65c6897b7
  12:34:55 RECORDED   list_directory                  receipt=cbb6612c-7a6…  session=e65c6897b7
  12:34:54 ERROR      read_file                       receipt=b94328c7-4ba…  session=e65c6897b7

Option	Description
`--last N`	Show the last N executions (default: 10)
`--tool NAME`	Filter by tool name
`--session ID`	Filter by session ID

Executions vs verifications

toolwitness check shows verifications (VERIFIED, FABRICATED, etc.) from the SDK path. toolwitness executions shows raw tool calls with receipts — this is what the MCP Proxy records. Both are viewable in the dashboard.

`toolwitness proxy`¶

Run as a transparent MCP proxy. Wraps any MCP server to record tool calls for the dashboard and CLI — zero code changes.

toolwitness proxy -- npx -y @modelcontextprotocol/server-filesystem /path/to/folder
toolwitness proxy --db /path/to/custom.db -- python my_server.py
toolwitness proxy --session-id my-session -- npx your-server

The -- separator is required — everything after it is the real MCP server command.

Typical usage: Add to your MCP host config (Cursor, Claude Desktop) so the proxy launches automatically.

Use the full path to toolwitness

MCP hosts don't inherit your shell's PATH. Use which toolwitness to find the full path, then use that in your config.

Cursor: use the global config

Add the server to ~/.cursor/mcp.json (global), not the project-level .cursor/mcp.json. Project-level configs may not load reliably in all Cursor versions. After editing, reload: Cmd+Shift+P → "Developer: Reload Window".

The same configuration applies to both Cursor (~/.cursor/mcp.json) and Claude Desktop:

{
  "mcpServers": {
    "my-server": {
      "command": "/full/path/to/toolwitness",
      "args": ["proxy", "--", "npx", "-y", "@modelcontextprotocol/server-filesystem", "/path"]
    }
  }
}

Option	Default	Description
`--db PATH`	`~/.toolwitness/toolwitness.db`	SQLite database path
`--session-id ID`	auto-generated	Custom session identifier

All tool calls are recorded with HMAC-signed receipts and stored locally. View results with toolwitness executions, toolwitness dashboard, or the /api/executions endpoint.
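
For readers unfamiliar with HMAC-signed receipts, the sketch below shows the general technique using Python's standard library. The receipt fields, the SHA-256 choice, the JSON serialization, and the key are all assumptions made for illustration; this page does not describe ToolWitness's actual receipt format or key handling.

```python
import hashlib
import hmac
import json


def sign_receipt(receipt: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON form of the receipt."""
    payload = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_receipt(receipt: dict, key: bytes, signature: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_receipt(receipt, key), signature)


# Hypothetical receipt and key, for illustration only
key = b"example-secret"
receipt = {"tool": "read_file", "output": "hello"}
tag = sign_receipt(receipt, key)

tampered = dict(receipt, output="goodbye")
print(verify_receipt(receipt, key, tag))    # True
print(verify_receipt(tampered, key, tag))   # False
```

The point of the signature is that a recorded tool output cannot be altered after the fact without the tag no longer matching.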

`toolwitness export`¶

Export verification data.

toolwitness export --format json     # JSON export
toolwitness export --format csv      # CSV export

`toolwitness purge`¶

Remove old or demo data from the database. ToolWitness stores all data locally in SQLite — this command helps you manage that data over time.

toolwitness purge --demo              # Remove all demo sessions
toolwitness purge --before 7d         # Remove data older than 7 days
toolwitness purge --source demo       # Remove by source type
toolwitness purge --before 24h --dry-run  # Preview what would be deleted
toolwitness purge --all --yes         # Wipe everything (skip confirmation)

Purge deletes matching sessions and all related data (executions, verifications, alerts, false-positive annotations).

Option	Description
`--demo`	Shorthand for `--source demo`
`--source TYPE`	Remove sessions by source: `demo`, `sdk`, `mcp_proxy`, `test`
`--before DURATION`	Remove sessions older than duration: `24h`, `7d`, `2w`, `30d`
`--all`	Remove everything (requires confirmation)
`--dry-run`	Show what would be deleted without deleting
`-y, --yes`	Skip the confirmation prompt
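
The duration values accepted by `--before` (`24h`, `7d`, `2w`, `30d`) follow a number-plus-unit pattern. A minimal sketch of how such strings could be converted into a cutoff timestamp is shown below. The parser, its function names, and the restriction to `h`, `d`, and `w` are assumptions based on the examples on this page, not ToolWitness's own code.

```python
import re
from datetime import datetime, timedelta, timezone

# Units inferred from the documented examples: hours, days, weeks
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_duration(text: str) -> timedelta:
    """Convert strings such as '24h', '7d', or '2w' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def cutoff(text: str, now: datetime) -> datetime:
    """Sessions older than the returned instant would match --before."""
    return now - parse_duration(text)


now = datetime(2025, 1, 8, 12, 0, tzinfo=timezone.utc)
print(cutoff("7d", now).isoformat())  # 2025-01-01T12:00:00+00:00
```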

Session sources

Every session is tagged with a source that identifies how it was created:

sdk — from ToolWitnessDetector in your agent code
mcp_proxy — from the toolwitness proxy MCP wrapper
verification — from the verification bridge (toolwitness verify or tw_verify_response)
demo — from demo/seed scripts
test — from test harnesses

The dashboard shows these as colored badges (Bridge, MCP Proxy, SDK, etc.). Use --source to purge a specific type.

`toolwitness init`¶

Create a configuration file with commented defaults.

toolwitness init                     # Creates toolwitness.yaml

Data Lifecycle¶

ToolWitness stores all data in a local SQLite database, by default at ~/.toolwitness/toolwitness.db. Data accumulates over time as you run agents or use the MCP Proxy.

Dashboard time filter: The dashboard includes a time range dropdown (1h / 24h / 7d / 30d / All) so you can focus on recent data without deleting anything.

Source badges: Sessions in the dashboard show their source (SDK, MCP Proxy, Demo) so you always know what you're looking at.

Cleanup: Use toolwitness purge to remove old data. Common patterns:

After demoing: toolwitness purge --demo removes demo data
Weekly cleanup: toolwitness purge --before 7d keeps the last week
Fresh start: toolwitness purge --all wipes everything

Demo data: The scripts/seed_demo_data.py and scripts/demo_data.py scripts write to demo/toolwitness-demo.db (not your production database). View demo data with toolwitness dashboard --db demo/toolwitness-demo.db.

Global options¶

Option	Description
`--db PATH`	Path to SQLite database (default: `~/.toolwitness/toolwitness.db`)
`--config PATH`	Path to config file (default: `toolwitness.yaml`)
`--verbose`	Enable debug logging
`--help`	Show help for any command

Configuration precedence¶

Environment variables (TOOLWITNESS_*) — highest priority
YAML file (toolwitness.yaml)
Code defaults — lowest priority
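
The precedence order above can be pictured as a three-layer lookup. The sketch below is illustrative only: the `resolve` helper, the mapping from a key such as `slack_webhook_url` to a `TOOLWITNESS_SLACK_WEBHOOK_URL` variable, and the default value are assumptions, since this page specifies only the `TOOLWITNESS_*` prefix.

```python
from typing import Any, Mapping, Optional


def resolve(key: str,
            env: Mapping[str, str],
            yaml_config: Mapping[str, Any],
            defaults: Mapping[str, Any]) -> Optional[Any]:
    """Look a setting up in env vars, then the YAML file, then code defaults."""
    env_name = "TOOLWITNESS_" + key.upper()  # assumed naming scheme
    if env_name in env:
        return env[env_name]
    if key in yaml_config:
        return yaml_config[key]
    return defaults.get(key)


# Hypothetical values for each layer
defaults = {"slack_webhook_url": None}
yaml_config = {"slack_webhook_url": "https://example.invalid/from-yaml"}
env = {"TOOLWITNESS_SLACK_WEBHOOK_URL": "https://example.invalid/from-env"}

print(resolve("slack_webhook_url", env, yaml_config, defaults))  # https://example.invalid/from-env
print(resolve("slack_webhook_url", {}, yaml_config, defaults))   # https://example.invalid/from-yaml
```

In real use the `env` argument would be `os.environ` and `yaml_config` the parsed contents of toolwitness.yaml.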

Next¶

Getting Started — install and first verification
How It Works — verification engine details
