THE AI GOVERNANCE STACK

You're running AI agents. Here's how to make that safe.

Five products. One event stream. Complete governance from discovery to chaos testing.

WHY ONE PLATFORM

One event. Five lenses.

Every AI tool call generates a ToolGuardEvent — a structured record of what was requested, what happened, and what was decided. That single event flows through every product in the stack. No data silos. No integration glue. One stream of truth.

  • Warden reads governance_score to benchmark your posture
  • SharkRouter generates the event — policy_verdict, pii_detected, every field
  • Inspect correlates the agent field across all events to build the census
  • Assurance checks verified — did the output actually do what it claimed?
  • Gulliver forges adversarial events to see if chaos_tested holds under attack
ToolGuardEvent
{
  "event_id": "tge_8f2a...",
  "timestamp": "2026-04-08T14:23:01Z",
  "agent": "sales-copilot",
  "tool": "database_query",
  "action": "SELECT * FROM customers WHERE region = 'EU'",
  "pii_detected": ["email", "phone"],
  "policy_verdict": "ALLOW_WITH_REDACTION",
  "governance_score": 91,
  "verified": true,
  "chaos_tested": true
}
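Concretely, each lens is just a different read of the same record. A minimal Python sketch parsing the example event above — the variable names are ours for illustration, not a product API:

```python
import json

# The ToolGuardEvent from the example above, as each product would receive it.
raw = """{
  "event_id": "tge_8f2a...",
  "timestamp": "2026-04-08T14:23:01Z",
  "agent": "sales-copilot",
  "tool": "database_query",
  "action": "SELECT * FROM customers WHERE region = 'EU'",
  "pii_detected": ["email", "phone"],
  "policy_verdict": "ALLOW_WITH_REDACTION",
  "governance_score": 91,
  "verified": true,
  "chaos_tested": true
}"""

event = json.loads(raw)

# Each lens reads a different slice of the same record:
posture = event["governance_score"]   # Warden: benchmark input
verdict = event["policy_verdict"]     # SharkRouter: what was decided
agent_id = event["agent"]             # Inspect: census correlation key
claim_held = event["verified"]        # Assurance: did the output check out?
survived = event["chaos_tested"]      # Gulliver: held up under attack?

print(agent_id, verdict, posture)  # → sales-copilot ALLOW_WITH_REDACTION 91
```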

"What do I have?"

OPEN SOURCE
Warden

Before you govern anything, you need to know what’s running. Warden scans your AI environment in 30 seconds and scores it across 17 governance dimensions. No API key. No signup. No SharkRouter required.

17 dimensions · 0–100 score range · MIT license · install: pip install warden-ai
  • MCP server inventory — discovers every tool your agents can access
  • Policy gap analysis — identifies ungoverned actions and tools
  • Environment exposure — finds leaked secrets and API keys
  • Benchmark scoring — compares your stack against 17 market vendors

Replaces: Manual security audits and spreadsheet-based compliance checklists

$ pip install warden-ai
$ warden scan

  Warden v4.5 — AI Governance Scanner
  ══════════════════════════════════

  Scanning 17 governance dimensions...

  CORE GOVERNANCE        ███████████░  35/40
  ADVANCED CONTROLS      ███████████░  24/25
  ECOSYSTEM              ██████████░░  17/20
  UNIQUE CAPABILITIES    ████████████  15/15

  ─────────────────────────────────────
  TOTAL SCORE            91 / 100  ★★★★★
  ─────────────────────────────────────

  → 2 gaps found. Run `warden fix` for remediation.
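A scan like the one above can gate a CI pipeline. A hypothetical sketch, assuming the per-category scores are available as a dict; the simple summing and the 80-point threshold are illustrative assumptions, not Warden's documented behavior:

```python
# Hypothetical CI gate around a governance scan. Assumes scan results
# arrive as a dict of category scores; aggregation and threshold are
# illustrative assumptions.

def governance_gate(category_scores: dict[str, int],
                    threshold: int = 80) -> tuple[int, bool]:
    """Sum the category scores and report whether the gate passes."""
    total = sum(category_scores.values())
    return total, total >= threshold

total, passed = governance_gate(
    {"core": 40, "advanced": 25, "ecosystem": 14, "unique": 12}
)
# → total == 91, passed is True
```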

Great — you know exactly what’s running. Which raises an uncomfortable question: who’s controlling all of it?

"How do I control it?"

THE CORE
SharkRouter

Seven deterministic layers between your agents and the AI providers. Every tool call passes through PII detection, policy enforcement, cost tracking, and cryptographic audit — in under 50ms.

7 layers · 57 PII entities · 100+ providers · <50ms latency
  • ToolGuard — pre-execution and post-execution policy enforcement
  • PII Shield — 57 entity types detected and redacted in real-time
  • Semantic Router — intelligent model selection based on task complexity
  • Cryptographic Audit — WORM logs with hash-chain integrity

Replaces: Homegrown proxy scripts, manual API key rotation, and “trust the model” policies
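The hash-chain idea behind the Cryptographic Audit bullet fits in a few lines. This is an illustrative toy, not SharkRouter's actual log format: each entry commits to the hash of the one before it, so editing any record invalidates everything after it.

```python
import hashlib
import json

def append_entry(chain: list[dict], record: dict) -> None:
    """Append a record, chaining its hash to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    chain.append({"record": record, "prev": prev_hash, "hash": entry_hash})

def verify(chain: list[dict]) -> bool:
    """Recompute every hash; any tampered record breaks the chain."""
    prev_hash = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

log: list[dict] = []
append_entry(log, {"agent": "sales-copilot", "verdict": "ALLOW_WITH_REDACTION"})
append_entry(log, {"agent": "sales-copilot", "verdict": "DENY"})
assert verify(log)

log[0]["record"]["verdict"] = "ALLOW"   # tamper with history
assert not verify(log)                  # every later hash is now invalid
```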

# Agent calls a tool → SharkRouter intercepts
event = ToolGuardEvent(
    agent="sales-copilot",
    tool="database_query",
    action="SELECT * FROM customers"
)

# 7 layers of governance in <50ms
result = sharkrouter.enforce(event)
# → PII redacted: email, phone
# → Policy: ALLOW_WITH_REDACTION
# → Audit: logged to chain #4,891
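The same intercept-and-enforce flow can be sketched as a chain of layer functions applied in order. A toy two-layer version; the layer logic here is an illustrative assumption, not SharkRouter's actual implementation:

```python
import re

# Each layer is a function that takes and returns the event dict;
# enforcement is just applying them in a fixed, deterministic order.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def pii_shield(event: dict) -> dict:
    """Detect and redact email addresses in the action string."""
    if EMAIL.search(event["action"]):
        event["pii_detected"] = ["email"]
        event["action"] = EMAIL.sub("[REDACTED_EMAIL]", event["action"])
    return event

def policy_layer(event: dict) -> dict:
    """Attach a verdict based on what earlier layers found."""
    event["policy_verdict"] = (
        "ALLOW_WITH_REDACTION" if event.get("pii_detected") else "ALLOW"
    )
    return event

def enforce(event: dict, layers=(pii_shield, policy_layer)) -> dict:
    for layer in layers:
        event = layer(event)
    return event

out = enforce({"agent": "sales-copilot",
               "tool": "email_send",
               "action": "send report to alice@example.com"})
# out["policy_verdict"] → "ALLOW_WITH_REDACTION"
```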

Every call is governed. But governance only covers the agents you know about — what about the ones that never passed through your gateway?

"Can I see what happened?"

AGENT FORENSICS
Shark Inspect

ToolGuard sees the agents passing through it. But what about shadow AI? Rogue API calls? Third-party SaaS agents your team signed up for last Tuesday? Inspect watches the entire building — governed and ungoverned.

5 agent types · 6 discovery methods · SIEM integration · MITRE ATT&CK mapping
  • Agent census — discovers every AI agent, classifies each as governed or ungoverned
  • Behavioral forensics — correlates cross-agent behavior, detects anomalies
  • Hallucination vs. attack — 3-way confidence scoring distinguishes the root cause
  • Compliance evidence — death certificates, causal chains, audit passports

Replaces: Splunk queries + custom dashboards + hoping someone notices the shadow AI

$ shark-inspect census

  Agent Census — 2026-04-08 14:30 UTC
  ══════════════════════════════════════

  DISCOVERED AGENTS:  23
  ├─ Governed (SharkRouter):  19  ✓
  ├─ Ungoverned:               3  ⚠
  └─ Shadow AI:                1  ✗

  ⚠ ALERT: 4 agents operating outside governance
  ┌─────────────────────────────────────────────┐
  │ sales-gpt-3    │ Ungoverned │ 847 calls/day │
  │ hr-assistant   │ Ungoverned │ 203 calls/day │
  │ intern-bot     │ Shadow AI  │  12 calls/day │
  │ test-agent-v2  │ Ungoverned │   4 calls/day │
  └─────────────────────────────────────────────┘

  → Run `shark-inspect govern sales-gpt-3` to bring under policy
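The census classification boils down to set arithmetic over three sources: what the gateway registers, what IT knows exists, and what actually shows up in traffic. A minimal sketch with made-up agent names:

```python
# Illustrative inputs; in practice these would come from the gateway,
# an asset inventory, and network/traffic logs respectively.
gateway_registered = {"sales-copilot", "support-bot"}
known_inventory = {"sales-copilot", "support-bot", "sales-gpt-3"}
observed_in_traffic = {"sales-copilot", "support-bot",
                       "sales-gpt-3", "intern-bot"}

governed = observed_in_traffic & gateway_registered
ungoverned = (observed_in_traffic & known_inventory) - gateway_registered
shadow = observed_in_traffic - known_inventory  # nobody registered it anywhere

# governed   → {'sales-copilot', 'support-bot'}
# ungoverned → {'sales-gpt-3'}
# shadow     → {'intern-bot'}
```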

Now you can see every agent — governed or not. But visibility is just data. When an agent claims it completed a task, how do you know it’s telling the truth?

"Did it actually work?"

OUTPUT VERIFICATION
Shark Assurance

An agent says it fixed accessibility on 11 pages. Your linter passes. Your build is green. But did it actually fix anything? Assurance doesn’t check code — it checks outcomes, with 8 independent strategies that prove AI output actually works.

8 strategies · 3 surfaces · pass/fail verdicts · CI/CD-native integration
  • Page walker — renders pages and inspects actual UI state
  • API prober — validates endpoints return correct data
  • Behavioral comparator — verifies state changes match intent
  • Interaction verifier — clicks buttons, fills forms, proves functionality

Replaces: Green CI badges, “looks good to me” reviews, and hoping the agent didn’t lie

$ shark-assurance verify --intent "Fix accessibility on 11 pages"

  Verifying 4 claims...

  CLAIM 1: "ARIA labels added to all forms"
  ├─ Strategy: page_walker
  ├─ Pages checked: 11
  ├─ Result: 2/11 modified, 9 UNTOUCHED
  └─ Verdict: ✗ FAIL

  CLAIM 2: "Skip-to-content links added"
  ├─ Strategy: interaction_verifier
  ├─ Links found: 2/11
  └─ Verdict: ✗ FAIL

  ══════════════════════════════════
  OVERALL: ✗ BLOCKED — 2/4 claims failed
  Agent output rejected. Returning to agent.
  ══════════════════════════════════
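The blocking logic in the transcript reduces to one rule: any failed claim rejects the whole output. A minimal sketch; the two passing claims are invented for illustration:

```python
def overall_verdict(claims: list[tuple[str, bool]]) -> str:
    """Block the output if any individual claim failed verification."""
    failed = [text for text, passed in claims if not passed]
    if not failed:
        return "PASS"
    return f"BLOCKED: {len(failed)}/{len(claims)} claims failed"

claims = [
    ("ARIA labels added to all forms", False),  # page_walker: 2/11 modified
    ("Skip-to-content links added", False),     # interaction_verifier: 2/11
    ("Contrast ratios fixed", True),            # illustrative passing claim
    ("Alt text present on images", True),       # illustrative passing claim
]
print(overall_verdict(claims))  # → BLOCKED: 2/4 claims failed
```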

Your agents are verified. Your outputs are real. But you built all of this assuming your threat model is complete — what if it isn’t?

"Can it be broken?"

CHAOS TESTING
Gulliver

Named after Swift’s traveler who discovered that every society has cracks. Gulliver deploys 20–150 autonomous agents — honest, malicious, hallucinating, edge-case — and swarms your governance stack. If there’s a hole, Gulliver finds it before real attackers do. Modeled on attack patterns from Google DeepMind’s Agent Traps research and OWASP Top 10 for LLMs.

150 agents · 7 scenarios · 6 attack types · compliance reports
  • Scenario engine — YAML-driven presets from quick scan to hostile takeover
  • Agent swarm — honest + malicious + hallucinating + edge-case agents
  • Confusion matrix — TP/TN/FP/FN scoring per guard, per attack type
  • Compliance mapping — EU AI Act, OWASP Top 10 for LLMs coverage reports

Replaces: Annual penetration tests and crossing your fingers between them

$ gulliver swarm --scenario hostile_takeover --agents 150

  Gulliver v1.0 — AI Security Chaos Testing
  ═══════════════════════════════════════════

  Deploying swarm...
  ├─ Honest agents:      87
  ├─ Malicious agents:   42
  └─ Edge-case agents:   21

  Running 7 attack vectors...
  ├─ Prompt injection:     0/42 bypassed  ✓
  ├─ Tool poisoning:       0/42 bypassed  ✓
  ├─ PII exfiltration:     1/42 bypassed  ✗ FOUND
  ├─ Privilege escalation: 0/42 bypassed  ✓
  ├─ Policy circumvention: 0/42 bypassed  ✓
  ├─ Data corruption:      0/42 bypassed  ✓
  └─ Supply chain:         0/42 bypassed  ✓

  ⚠ 1 BREACH FOUND — auto-generating fingerprint...
  → PII exfiltration via base64-encoded email in tool arg
  → Fingerprint FP-0847 created → policy patched
  → Re-run: 0/42 bypassed ✓

  RESULT: 1 gap found, auto-remediated. Score: 98 → 100
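The confusion-matrix scoring from the bullet list can be sketched directly: label each swarm interaction by whether the agent was malicious and whether the guard blocked it, and TP/TN/FP/FN fall out. The counts below are illustrative, not from a real run:

```python
def confusion(results: list[tuple[bool, bool]]) -> dict[str, int]:
    """results: (is_malicious, was_blocked) per agent interaction."""
    m = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for malicious, blocked in results:
        if malicious and blocked:
            m["TP"] += 1        # attack caught
        elif malicious and not blocked:
            m["FN"] += 1        # breach: the one that matters
        elif not malicious and blocked:
            m["FP"] += 1        # honest agent wrongly blocked
        else:
            m["TN"] += 1        # honest agent allowed through
    return m

runs = ([(True, True)] * 41 + [(True, False)]
        + [(False, False)] * 87 + [(False, True)] * 2)
m = confusion(runs)
# m → {'TP': 41, 'TN': 87, 'FP': 2, 'FN': 1}  (the single FN is the breach)
```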

Five questions. One platform.

Every CISO asks these questions in sequence. SharkRouter is the only platform that answers all five — with one event stream, one data model, and zero integration glue.
