§ 00/HEIMDALL · 2026

Nothing crosses
without being seen.

AI agents now move money, update health records, and pass work to other agents to get a job done. Nobody is watching what gets handed off along the way. Heimdall is.

Think of it as air traffic control for AI agents. Most tools point a camera at each agent. Heimdall watches what passes between them — and makes one class of attack impossible to even attempt, like a locked door rather than a guard who has to recognise the intruder. Everything else runs against rules you write in plain YAML. Built on Veea Lobster Trap.

Two minutes to install, five to integrate. One docker compose up brings the gateway and dashboard live; one pip install heimdall-sdk (or npm i @heimdall/sdk) and a single hd.delegate(...) call gates every hop. MIT-licensed; self-host or run on your own infra.

Open the dashboard Install in 2 minutes↓Read the architecture

Built with

Veea Lobster Trap·Google Gemini

Layers: 2
Primitives: 6
Verticals: 3
License: MIT

in plain english/30-second primer

What is an AI agent?
Software that uses an AI model (ChatGPT, Gemini, Claude) to decide what to do next — and then actually does it. Sends emails, runs code, signs payments, updates your database. The decide-and-do loop is what makes them agents rather than chatbots.
Why are they a security problem?
Real systems use many agents that pass work to each other. A chat agent calls a search agent which calls a database agent which calls a payment agent. Each hop adds authority and loses oversight — and most attacks slip in somewhere along that chain.
What does Heimdall do?
It sits between agents and watches every handoff. Some attacks become impossible because an agent literally cannot pass on authority it doesn’t have. Everything else is checked against rules you write in plain YAML — for example, “no chain longer than four hops can authorise a $10k payment.”

Built with

§ 01/PROBLEM

Agents aren’t the problem. The handoffs between them are.

Companies are deploying agents faster than they can govern them. Across 2026 surveys the pattern is consistent: nobody knows what each agent is allowed to do, where that permission came from, or whether a five-step chain still resembles the original user request.

The attacker is rarely tricking the AI model itself. They’re exploiting the gaps between identities, systems, and permissions that should have expired three months ago and never did.

organisations reporting confirmed or suspected agent incidents in 2026.92.7% in healthcare.

cannot enforce purpose limitations on agent behaviour.

0.0%

have full visibility into which agents talk to each other.

0.0%

treat agents as independent, identity-bearing entities.

Six incidents. One pattern.

named breaches · sector patterns

Step Finance
Jan 2026
DeFi · Solana$27–30M moved
AI trading agents drained 261,000+ SOL after a single device was compromised. 45.6% of DeFi teams ship shared API keys.
Mexican government
Dec 2025 → Feb 2026
Public sector195M records
One attacker, two off-the-shelf models, nine agencies. 220M civil records, 150 GB exfiltrated across a single chain of agent integrations.
Replit autonomous agent
Jul 2025
Developer tooling4 production DBs wiped
A coding agent ran a destructive migration without confirmation. The agent did exactly what its scope said it could; nobody had reviewed the scope.
Anthropic agentic ops report
Mar 2026
Threat intelligence12 documented chains
Reported credential exfiltration and lateral movement across cloud providers using off-the-shelf agentic assistants. The agents weren't subverted; their delegation graphs were.
Healthcare BAA expiry
Q1 2026
Healthcare · patternPHI exposure
Recurring pattern: a third-party EHR adapter whose Business Associate Agreement lapsed kept its write:patient_record scope. Reactivated months later by a downstream chain.
CS refund escalation
Q4 2025
Retail SaaS · patternAccount modification
Recurring pattern: a customer-service bot's $50 refund authority chains into account-modification authority via a forgotten cron worker with broader scope.

Two named, four sector patterns. In every case, the agents did what their scopes said they could. Nobody had reviewed whether those scopes still made sense, whether the chain reaching the executor still resembled a user request, or whether the agent on the third hop still belonged to the company that authorised it.

§ 02/ARCHITECTURE

Two layers. One mechanism.

Heimdall enforces in two layers. Layer 1 makes one class of attack impossible: an agent literally cannot hand off authority it doesn’t have, because the handoff token will not sign. Layer 2 runs everything that does pass against rules you write in plain YAML — HIPAA, SOC 2, EU AI Act, or your own.

Layer 1 is the architecture. Layer 2 is the configurability. Neither tells the full story alone; together they cover the attack surface that single-layer tools miss.

The actual error string from the demo's Scene 3. The chain is not blocked at a checkpoint; it is unrepresentable in the first place.

Layer 01

Protocol enforcement

unrepresentable attacks

Three invariants enforced at credential construction. They run before the policy engine sees the chain, and they are the reason some attacks never become attacks.

enforced
capability_attenuation
Scope only shrinks across hops. A delegation that widens authority cannot be signed.
enforced
tenant_isolation
tenant_id must match parent's at every hop. Cross-tenant chains fail at construction.
enforced
signed_chain_integrity
HS256 signature per credential, parent_jti chaining. Tamper a hop, the chain dies.

Layer 02

Policy enforcement

configurable as YAML

Six rule primitives. Combine them to express any organisational policy: HIPAA minimum-necessary, SOC 2 logical access, EU AI Act risk proportionality, or your own. The full library lives inpolicies/examples/.

chain_pattern
Regex over caller→callee or role-serialised chain.
cfgmatch_by: role
agent_state
Predicates on the registry (dormant, owner_departed, …).
cfgis_dormant == true
chain_depth
Numeric hop ceiling.
cfgmax_depth: 4
value_threshold
Per-depth caps on action value.
cfgdepth>=4 → $1,000
intent_mismatch
Lobster Trap declared vs detected, jaccard distance.
cfgthreshold: 0.3
behavioral_drift
Time-series novelty against 100 prior actions.
cfgmin_history: 50

Every hop emits a rule card on the dashboard sidebar as it evaluates. The attack scene fires six violations in sequence: one Layer 1 DENY plus five Layer 2 evaluations that would have caught it even if Layer 1 had missed. Defense in depth, made legible.

Incident reports are streamed live byGoogle Geminiand downloadable as Markdown or PDF.

§ 03/VERTICALS

Same engine.
Three domains, one switcher.

The same governance engine works in any industry. Each one just supplies its own YAML — its agents, its tools, its rules. We’ve built three: DeFi end-to-end, healthcare as a working sketch, and customer service as rules-only — to prove you can govern your agents without writing any code.

§ 01
DeFi
full vertical
Four agents, one Sepolia wallet, $30M of unrepresentable attack.
- Portfolio Agent rebalances via Market Data + Executor on a 3-hop chain.
- Executor signs and broadcasts a real Sepolia transfer with on-chain receipt.
- Attack scene: poisoned external sentiment, Layer 1 capability attenuation kills the chain.
rules that fire
shadow_to_executor_patternvalue_threshold_by_depthbehavioral_drift
§ 02
Healthcare
sketched
Same engine, different domain: PHI access, BAA expiry, dormant vendor integrations.
- Triage, Records, Update, and a dormant Lab Integration vendor whose contract ended.
- Mock EHR tool, healthcare-specific Lobster Trap rules for PHI detection.
- Vertical-switcher demo lands the platform claim: same Heimdall, new tenant ID space.
rules that fire
dormant_vendor_blockphi_volume_by_depthvendor_to_update_pattern
§ 03
Customer service
yaml-only
Policy without code: prove the engine is configurable, not bound to a stack.
- Six-primitive policy pack ready to load. No agents.yaml, no tools.py needed.
- Refund caps that shrink with chain depth — the inverse of an IAM escalation pattern.
- An integrator's reference for plugging Heimdall into an existing CRM.
rules that fire
refund_amount_by_depthbot_to_executor_patterndeclared_intent_check

§ 04/BUILT ON VEEA LOBSTER TRAP

The Foundation

Lobster Trap inspects what an agent says to the model. Heimdall tracks what agents say to each other, and what authority they carry while saying it.

§ what it is

A safety layer that sits between every agent and the AI model behind it.

Lobster Trap is Veea’s open-source project. Every prompt an agent sends to Gemini, Claude, or any other AI model passes through it. It reads the prompt, reads the response, checks them against rules you write, and tags every reply with notes on what it saw (a channel called _lobstertrap).

In security terms it’s a firewall for AI prompts — the equivalent of a WAF for websites, but for AI traffic.

§ why we use it

Prompt-injection detection is a research problem. We did not want to half-solve it.

Detecting hostile content in an agent’s context window is a hard, evolving problem. Veea has spent months on it. A half-finished prompt-injection detector is worse than none: it lulls the operator into thinking the model boundary is covered when it is not.

Heimdall focuses on what Lobster Trap does not do: agent-to-agent governance. Two products, two boundaries, no overlap. Heimdall inherits Veea’s detection work for free.

§ how we use it

Lobster Trap fires the intent_mismatch rule.

Every agent thought goes through the proxy. The agent sends its declared_intent (“fetch market sentiment”). Lobster Trap reads the actual content and replies with detected_intent (“external content attempting to invoke agent with elevated scope”).

Heimdall reads both. If the Jaccard overlap drops below 0.3, the intent_mismatch rule fires and the chain is flagged before it ever reaches the executor.

walkthrough · scene 03 attack

How the Lobster Trap hand-off lands on the dashboard.

01The Market Data Agent calls Gemini through the Lobster Trap proxy with a declared intent of “fetch market sentiment from external feed.”
02The proxy fetches the (poisoned) external content and notices the regex pattern INJECTED PROMPT | invoke .* with .* scope matches.
03Lobster Trap returns the response to the agent but attaches a metadata channel: detected_intent = “external content attempting to invoke agent.”
04Heimdall computes the Jaccard overlap between “fetch market sentiment” and “invoke agent.” The overlap is 0.00. The intent_mismatch rule fires.
05A yellow FLAG card appears on the dashboard sidebar. The chain still tries to proceed, and Layer 01 capability attenuation kills it on the next hop. Two boundaries, one attack.

# verticals/defi/lobster_trap.yaml — Veea-style DPI rules

rules:
  - name: detect_external_injection
    pattern: "INJECTED PROMPT|invoke .* with .* scope|execute:trade"
    action: FLAG
    metadata:
      detected_intent: "external content attempting to invoke an agent with elevated scope"

  - name: detect_self_promote
    pattern: "(?i)grant yourself|escalate|widen scope"
    action: FLAG
    metadata:
      detected_intent: "attempted privilege escalation"

  - name: log_all_metadata
    pattern: ".*"
    action: LOG

# Heimdall reads the response's _lobstertrap channel:
#   {
#     "detected_intent": "external content attempting to invoke an agent...",
#     "flags": ["detect_external_injection"]
#   }
#
# It then fires the intent_mismatch rule against the agent's declared_intent.
# The mismatch shows up as a FLAG card on the dashboard sidebar in real time.

The two products are complementary because the Step Finance class of attack crosses both. Lobster Trap catches the prompt injection at the model boundary; Heimdall catches the unauthorised delegation at the agent boundary, with capability attenuation ensuring some attacks are unrepresentable in the first place. Together, they cover the attack surface neither alone can.

Veea framed Lobster Trap as the floor, not the ceiling, and listed the capabilities they wanted built on top: policy packs for HIPAA, SOC 2, and finance; drift monitoring; multi-agent permission systems; governance dashboards; enterprise security workflows. Heimdall ships working implementations of all five.

github.com/veeainc/lobstertrap ↗attribution · heimdall/plan.md

§ 05/INSTALL

Self-host in two minutes. SDK in five.

Heimdall is MIT-licensed and ships as a Docker compose file plus two SDKs (Python and TypeScript). Run it locally, on Render, on your own infrastructure — the wire format is the same everywhere.

§ 01
Clone + bring it up
```
git clone https://github.com/patrick-steve/heimdall
cd heimdall
docker compose up
```
Backend at :8000, dashboard at :3000. Demo API key prints in the backend logs on first boot, between two ‘===’ banners.
§ 02
Install the SDK
```
pip install heimdall-sdk
# or
npm install @heimdall/sdk
```
Python 3.10+ or Node 18+. Both ship with the same surface; pick the one that fits your runtime.

§ 03

Authorise a delegation

from heimdall import Heimdall

hd = Heimdall(api_key="hd_test_...", base_url="http://localhost:8000")
result = hd.delegate(
    from_agent="research_agent",
    to_agent="payment_agent",
    action="payment:send",
    capabilities=["payment:send"],
)
if result.denied:
    print("blocked by", result.rule, "—", result.reason)

If the parent agent doesn’t carry payment:send, Heimdall refuses to sign at Layer 1. The executor never sees the call.

full quickstart ↗/api reference ↗/github ↗

§ 06/HONEST LIMITATIONS

What this is not.

Trust is asymmetric: it gets earned slowly by naming what a system does not do. Most submissions overclaim. This one does not. Every limitation below is a real seam in the current build, and every seam corresponds to a v2 upgrade path that the architecture already supports.

JWT, not Biscuit
HS256 signatures instead of capability tokens with formal scope algebra. Conceptually identical; cryptographically simpler. Upgrading is localised to backend/jwt_chain.py.
Set-membership drift, not ML
behavioral_drift flags novel delegation targets via membership. Production deployments would use embedding similarity and statistical drift over feature vectors.
Single demo organisation
Tenant isolation works at the protocol level, but the demo state contains only two tenants (the legit org and an attacker tenant for the isolation scene).
Healthcare and customer service are sketches
The mechanism is real in every vertical. Only DeFi has functional tool bindings end-to-end. Healthcare runs a mock EHR; customer service is policy YAML only.
No tests, no Docker, localhost only
Scope decisions locked in plan.md. The submission is meant to be auditable in a single afternoon, not deployable to staging.
Gemini Pro is rate-limited on free tier
Incident reports stream via gemini-flash-latest instead. The audit memo is still pulled live from the model; only the model identity is downgraded.

§ 07/ROADMAP

The rules write themselves next.

Today an operator authors YAML. v0.1 ships the six primitives that make that authoring tractable. The next milestone is closing the loop between the chain ledger and the policy file — so rules can be drafted in plain English, dry-run against real history, or proposed from traffic the gateway has already seen.

next§ 01
Plain-English → YAML
An operator types ‘block any chain that hands invoice scope to a non-finance agent’ and Heimdall emits a candidate rule against the six primitives. The same Gemini pipeline that writes the incident memos already turns chain data into prose; we point it the other way.
uses · backend/policy_engine.py · the existing incident-report streamer
next§ 02
Dry-run against the chain ledger
Before a candidate rule reaches the live engine, replay the last N days of real chains against it. The dashboard surfaces the diff: ‘this rule would have allowed 1,204 chains, flagged 38, denied 6. Two of the six were the attack scene from last Tuesday.’ No bad DENY ever wedges production traffic.
uses · backend/endpoints/replay.py · chain_credentials ledger
exploring§ 03
Traffic-mined rule suggestions
Run Heimdall in shadow for a week, cluster the chains the gateway actually saw, and surface the gaps: ‘you’ve never had research → payment in 47k chains — should that be a DENY?’ The bold version of automation; less a feature, more a research roll.
uses · agent_behavior_baseline · novel-chain detection from behavioral_drift

None of the three is a new product — each one stretches an existing piece of the backend. The chain ledger, the policy engine, and the behavioural baseline already store everything required; the v0.2 work is teaching them to talk to each other through a single authoring surface.

§ 08 / NEXT

Watch a chain die
at Layer 01.

The dashboard is a live watchpost. Run the routine scenario, then the rebalance, then the attack. Six rule cards stack in the sidebar as the attack dies at the protocol boundary, before the executor sees a thing.

tagline

Nothing crosses
without being seen.

Open the dashboard Veea Lobster Trap ↗

Heimdall · 2026MIT licensebuilt on veea lobster trap

/dashboard /problem /architecture /verticals

What is an AI agent?

Why are they a security problem?

What does Heimdall do?

Agents aren’t the problem. The handoffs between them are.

Six incidents. One pattern.

Step Finance

Mexican government

Replit autonomous agent

Anthropic agentic ops report

Healthcare BAA expiry

CS refund escalation

Two layers. One mechanism.

Protocol enforcement

capability_attenuation

tenant_isolation

signed_chain_integrity

Policy enforcement

chain_pattern

agent_state

chain_depth

value_threshold

intent_mismatch

behavioral_drift

Same engine.Three domains, one switcher.

DeFi

Healthcare

Customer service

A safety layer that sits between every agent and the AI model behind it.

Prompt-injection detection is a research problem. We did not want to half-solve it.

Lobster Trap fires the intent_mismatch rule.

How the Lobster Trap hand-off lands on the dashboard.

Self-host in two minutes. SDK in five.

Clone + bring it up

Install the SDK

Authorise a delegation

What this is not.

JWT, not Biscuit

Set-membership drift, not ML

Single demo organisation

Healthcare and customer service are sketches

No tests, no Docker, localhost only

Gemini Pro is rate-limited on free tier

The rules write themselves next.

Plain-English → YAML

Dry-run against the chain ledger

Traffic-mined rule suggestions

Watch a chain dieat Layer 01.

Same engine.
Three domains, one switcher.

Watch a chain die
at Layer 01.