§ 00/HEIMDALL · 2026

Nothing crosses without being seen.

AI agents now move money, update health records, and pass work to other agents to get a job done. Nobody is watching what gets handed off along the way. Heimdall is.

Think of it as air traffic control for AI agents. Most tools point a camera at each agent. Heimdall watches what passes between them — and makes one class of attack impossible to even attempt, like a locked door rather than a guard who has to recognise the intruder. Everything else runs against rules you write in plain YAML. Built on Veea Lobster Trap.

Two minutes to install, five to integrate. One docker compose up brings the gateway and dashboard live; one pip install heimdall-sdk (or npm i @heimdall/sdk) and a single hd.delegate(...) call gates every hop. MIT-licensed; run it locally or self-host on your own infra.

Layers: 2
Primitives: 6
Verticals: 3
License: MIT
in plain english/30-second primer
  • What is an AI agent?

    Software that uses an AI model (ChatGPT, Gemini, Claude) to decide what to do next — and then actually does it. Sends emails, runs code, signs payments, updates your database. The decide-and-do loop is what makes them agents rather than chatbots.

  • Why are they a security problem?

    Real systems use many agents that pass work to each other. A chat agent calls a search agent which calls a database agent which calls a payment agent. Each hop adds authority and loses oversight — and most attacks slip in somewhere along that chain.

  • What does Heimdall do?

    It sits between agents and watches every handoff. Some attacks become impossible because an agent literally cannot pass on authority it doesn’t have. Everything else is checked against rules you write in plain YAML — for example, “no chain longer than four hops can authorise a $10k payment.”

§ 01/PROBLEM

Agents aren’t the problem. The handoffs between them are.

Companies are deploying agents faster than they can govern them. Across 2026 surveys the pattern is consistent: nobody knows what each agent is allowed to do, where that permission came from, or whether a five-step chain still resembles the original user request.

The attacker is rarely tricking the AI model itself. They’re exploiting the gaps between identities, systems, and permissions that should have expired three months ago and never did.

0%

organisations reporting confirmed or suspected agent incidents in 2026. 92.7% in healthcare.

0%

cannot enforce purpose limitations on agent behaviour.

0.0%

have full visibility into which agents talk to each other.

0.0%

treat agents as independent, identity-bearing entities.

Six incidents. One pattern.

named breaches · sector patterns
  1. Step Finance

    Jan 2026
    DeFi · Solana · $27–30M moved

    AI trading agents drained 261,000+ SOL after a single device was compromised. 45.6% of DeFi teams ship shared API keys.

  2. Mexican government

    Dec 2025 → Feb 2026
    Public sector · 195M records

    One attacker, two off-the-shelf models, nine agencies. 220M civil records, 150 GB exfiltrated across a single chain of agent integrations.

  3. Replit autonomous agent

    Jul 2025
    Developer tooling · 4 production DBs wiped

    A coding agent ran a destructive migration without confirmation. The agent did exactly what its scope said it could; nobody had reviewed the scope.

  4. Anthropic agentic ops report

    Mar 2026
    Threat intelligence · 12 documented chains

    Reported credential exfiltration and lateral movement across cloud providers using off-the-shelf agentic assistants. The agents weren't subverted; their delegation graphs were.

  5. Healthcare BAA expiry

    Q1 2026
    Healthcare · pattern · PHI exposure

    Recurring pattern: a third-party EHR adapter whose Business Associate Agreement lapsed kept its write:patient_record scope. Reactivated months later by a downstream chain.

  6. CS refund escalation

    Q4 2025
    Retail SaaS · pattern · Account modification

    Recurring pattern: a customer-service bot's $50 refund authority chains into account-modification authority via a forgotten cron worker with broader scope.

Four named incidents, two sector patterns. In every case, the agents did what their scopes said they could. Nobody had reviewed whether those scopes still made sense, whether the chain reaching the executor still resembled a user request, or whether the agent on the third hop still belonged to the company that authorised it.

§ 02/ARCHITECTURE

Two layers. One mechanism.

Heimdall enforces in two layers. Layer 1 makes one class of attack impossible: an agent literally cannot hand off authority it doesn’t have, because the handoff token will not sign. Layer 2 runs everything that does pass against rules you write in plain YAML — HIPAA, SOC 2, EU AI Act, or your own.

Layer 1 is the architecture. Layer 2 is the configurability. Neither tells the full story alone; together they cover the attack surface that single-layer tools miss.

Layer 01

Protocol enforcement

unrepresentable attacks

Three invariants enforced at credential construction. They run before the policy engine sees the chain, and they are the reason some attacks never become attacks.

  • enforced

    capability_attenuation

    Scope only shrinks across hops. A delegation that widens authority cannot be signed.

  • enforced

    tenant_isolation

    tenant_id must match parent's at every hop. Cross-tenant chains fail at construction.

  • enforced

    signed_chain_integrity

    HS256 signature per credential, parent_jti chaining. Tamper with a hop and the chain dies.

Layer 02

Policy enforcement

configurable as YAML

Six rule primitives. Combine them to express any organisational policy: HIPAA minimum-necessary, SOC 2 logical access, EU AI Act risk proportionality, or your own. The full library lives in policies/examples/.

  • chain_pattern

    Regex over caller→callee or role-serialised chain.

    cfg · match_by: role
  • agent_state

    Predicates on the registry (dormant, owner_departed, …).

    cfg · is_dormant == true
  • chain_depth

    Numeric hop ceiling.

    cfg · max_depth: 4
  • value_threshold

    Per-depth caps on action value.

    cfg · depth>=4 → $1,000
  • intent_mismatch

    Lobster Trap declared vs detected intent, Jaccard overlap.

    cfg · threshold: 0.3
  • behavioral_drift

    Time-series novelty against 100 prior actions.

    cfg · min_history: 50
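To make the primitives concrete, here is a rough sketch of how three of them (chain_pattern, chain_depth, value_threshold) might be evaluated against one chain. Everything structural here is an assumption for illustration: the Chain dataclass, the rule dictionaries, the field names type, regex, min_depth, and max_value, and the choice to count depth as the number of handoffs. Only max_depth, match_by, and the example values 4 and $1,000 appear in the text.

```python
import re
from dataclasses import dataclass


@dataclass
class Chain:
    roles: list[str]      # role-serialised chain, e.g. ["chat", "search", "payment"]
    action_value: float   # dollar value of the action the last agent wants to run

    @property
    def depth(self) -> int:
        # Assumption: depth counts handoffs, so two agents make one hop.
        return len(self.roles) - 1


def evaluate(chain: Chain, rules: list[dict]) -> list[str]:
    """Return the names of the primitives this chain violates, in rule order."""
    violations = []
    serialised = "→".join(chain.roles)
    for rule in rules:
        kind = rule["type"]
        if kind == "chain_depth":
            if chain.depth > rule["max_depth"]:
                violations.append(kind)
        elif kind == "value_threshold":
            deep_enough = chain.depth >= rule["min_depth"]
            if deep_enough and chain.action_value > rule["max_value"]:
                violations.append(kind)
        elif kind == "chain_pattern":
            if re.search(rule["regex"], serialised):
                violations.append(kind)
    return violations


# Hypothetical policy mirroring the cfg values listed above.
RULES = [
    {"type": "chain_depth", "max_depth": 4},
    {"type": "value_threshold", "min_depth": 4, "max_value": 1_000},
    {"type": "chain_pattern", "match_by": "role", "regex": r"research→payment"},
]
```

For example, evaluate(Chain(["chat", "search", "db", "worker", "payment"], 5_000), RULES) reports only value_threshold: the chain is exactly four hops deep, so the depth ceiling holds, but the per-depth value cap does not.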

Every hop emits a rule card on the dashboard sidebar as it evaluates. The attack scene fires six violations in sequence: one Layer 1 DENY plus five Layer 2 evaluations that would have caught it even if Layer 1 had missed. Defense in depth, made legible.

Incident reports are streamed live by Google Gemini and are downloadable as Markdown or PDF.

§ 03/VERTICALS

Same engine.
Three domains, one switcher.

The same governance engine works in any industry. Each one just supplies its own YAML — its agents, its tools, its rules. We’ve built three: DeFi end-to-end, healthcare as a working sketch, and customer service as rules-only — to prove you can govern your agents without writing any code.

  1. § 01

    DeFi

    full vertical

    Four agents, one Sepolia wallet, $30M of unrepresentable attack.

    • Portfolio Agent rebalances via Market Data + Executor on a 3-hop chain.
    • Executor signs and broadcasts a real Sepolia transfer with on-chain receipt.
    • Attack scene: poisoned external sentiment, Layer 1 capability attenuation kills the chain.
    rules that fire
    shadow_to_executor_pattern · value_threshold_by_depth · behavioral_drift
  2. § 02

    Healthcare

    sketched

    Same engine, different domain: PHI access, BAA expiry, dormant vendor integrations.

    • Triage, Records, Update, and a dormant Lab Integration vendor whose contract ended.
    • Mock EHR tool, healthcare-specific Lobster Trap rules for PHI detection.
    • Vertical-switcher demo lands the platform claim: same Heimdall, new tenant ID space.
    rules that fire
    dormant_vendor_block · phi_volume_by_depth · vendor_to_update_pattern
  3. § 03

    Customer service

    yaml-only

    Policy without code: prove the engine is configurable, not bound to a stack.

    • Six-primitive policy pack ready to load. No agents.yaml, no tools.py needed.
    • Refund caps that shrink with chain depth — the inverse of an IAM escalation pattern.
    • An integrator's reference for plugging Heimdall into an existing CRM.
    rules that fire
    refund_amount_by_depth · bot_to_executor_pattern · declared_intent_check
§ 04/BUILT ON VEEA LOBSTER TRAP
Veea Lobster Trap
The Foundation
Lobster Trap inspects what an agent says to the model. Heimdall tracks what agents say to each other, and what authority they carry while saying it.
§ what it is

A safety layer that sits between every agent and the AI model behind it.

Lobster Trap is Veea’s open-source project. Every prompt an agent sends to Gemini, Claude, or any other AI model passes through it. It reads the prompt, reads the response, checks them against rules you write, and tags every reply with notes on what it saw (a channel called _lobstertrap).

In security terms it’s a firewall for AI prompts — the equivalent of a WAF for websites, but for AI traffic.

§ why we use it

Prompt-injection detection is a research problem. We did not want to half-solve it.

Detecting hostile content in an agent’s context window is a hard, evolving problem. Veea has spent months on it. A half-finished prompt-injection detector is worse than none: it lulls the operator into thinking the model boundary is covered when it is not.

Heimdall focuses on what Lobster Trap does not do: agent-to-agent governance. Two products, two boundaries, no overlap. Heimdall inherits Veea’s detection work for free.

§ how we use it

Lobster Trap fires the intent_mismatch rule.

Every agent thought goes through the proxy. The agent sends its declared_intent (“fetch market sentiment”). Lobster Trap reads the actual content and replies with detected_intent (“external content attempting to invoke agent with elevated scope”).

Heimdall reads both. If the Jaccard overlap drops below 0.3, the intent_mismatch rule fires and the chain is flagged before it ever reaches the executor.

walkthrough · scene 03 attack

How the Lobster Trap hand-off lands on the dashboard.

  1. The Market Data Agent calls Gemini through the Lobster Trap proxy with a declared intent of “fetch market sentiment from external feed.”
  2. The proxy fetches the (poisoned) external content and notices the regex pattern INJECTED PROMPT | invoke .* with .* scope matches.
  3. Lobster Trap returns the response to the agent but attaches a metadata channel: detected_intent = “external content attempting to invoke agent.”
  4. Heimdall computes the Jaccard overlap between “fetch market sentiment” and “invoke agent.” The overlap is 0.00. The intent_mismatch rule fires.
  5. A yellow FLAG card appears on the dashboard sidebar. The chain still tries to proceed, and Layer 01 capability attenuation kills it on the next hop. Two boundaries, one attack.
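The overlap check in the walkthrough can be sketched as follows. This assumes word-level tokenisation on lowercase alphanumerics, which the text does not specify, and the helper names tokens, jaccard_overlap, and intent_mismatch are hypothetical. The regex is the one quoted in step 2, with the whitespace around the alternation dropped so it compiles to the two intended alternatives.

```python
import re

# Pattern from the walkthrough, tidied: spaces around "|" removed.
INJECTION = re.compile(r"INJECTED PROMPT|invoke .* with .* scope")


def tokens(text: str) -> set[str]:
    """Lowercased word set; an assumed tokenisation, not a documented one."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_overlap(a: str, b: str) -> float:
    """|A ∩ B| / |A ∪ B| over the two token sets (1.0 if both are empty)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def intent_mismatch(declared: str, detected: str, threshold: float = 0.3) -> bool:
    """Fire when the overlap between declared and detected intent is too low."""
    return jaccard_overlap(declared, detected) < threshold


declared = "fetch market sentiment"
detected = "invoke agent"
print(jaccard_overlap(declared, detected))   # 0.0, no shared words
print(intent_mismatch(declared, detected))   # True, the rule fires
```

Token-set overlap is a deliberately blunt measure: paraphrases of the same intent can share few words, so the 0.3 threshold trades some false flags for a check that is cheap and easy to audit.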

The two products are complementary because the Step Finance class of attack crosses both. Lobster Trap catches the prompt injection at the model boundary; Heimdall catches the unauthorised delegation at the agent boundary, with capability attenuation ensuring some attacks are unrepresentable in the first place. Together, they cover the attack surface neither alone can.

Veea framed Lobster Trap as the floor, not the ceiling, and listed the capabilities they wanted built on top: policy packs for HIPAA, SOC 2, and finance; drift monitoring; multi-agent permission systems; governance dashboards; enterprise security workflows. Heimdall ships working implementations of all five.

github.com/veeainc/lobstertrap ↗ · attribution · heimdall/plan.md
§ 05/INSTALL

Self-host in two minutes. SDK in five.

Heimdall is MIT-licensed and ships as a Docker compose file plus two SDKs (Python and TypeScript). Run it locally, on Render, on your own infrastructure — the wire format is the same everywhere.

  1. § 01

    Clone + bring it up

    git clone https://github.com/patrick-steve/heimdall
    cd heimdall
    docker compose up

    Backend at :8000, dashboard at :3000. Demo API key prints in the backend logs on first boot, between two ‘===’ banners.

  2. § 02

    Install the SDK

    pip install heimdall-sdk
    # or
    npm install @heimdall/sdk

    Python 3.10+ or Node 18+. Both ship with the same surface; pick the one that fits your runtime.

  3. § 03

    Authorise a delegation

    from heimdall import Heimdall
    
    hd = Heimdall(api_key="hd_test_...", base_url="http://localhost:8000")
    result = hd.delegate(
        from_agent="research_agent",
        to_agent="payment_agent",
        action="payment:send",
        capabilities=["payment:send"],
    )
    if result.denied:
        print("blocked by", result.rule, "—", result.reason)

    If the parent agent doesn’t carry payment:send, Heimdall refuses to sign at Layer 1. The executor never sees the call.

§ 06/HONEST LIMITATIONS

What this is not.

Trust is asymmetric: it gets earned slowly by naming what a system does not do. Most submissions overclaim. This one does not. Every limitation below is a real seam in the current build, and every seam corresponds to a v2 upgrade path that the architecture already supports.

  • JWT, not Biscuit

    HS256 signatures instead of capability tokens with formal scope algebra. Conceptually identical; cryptographically simpler. Upgrading is localised to backend/jwt_chain.py.

  • Set-membership drift, not ML

    behavioral_drift flags novel delegation targets via membership. Production deployments would use embedding similarity and statistical drift over feature vectors.

  • Single demo organisation

    Tenant isolation works at the protocol level, but the demo state contains only two tenants (the legit org and an attacker tenant for the isolation scene).

  • Healthcare and customer service are sketches

    The mechanism is real in every vertical. Only DeFi has functional tool bindings end-to-end. Healthcare runs a mock EHR; customer service is policy YAML only.

  • No tests, no Docker, localhost only

    Scope decisions locked in plan.md. The submission is meant to be auditable in a single afternoon, not deployable to staging.

  • Gemini Pro is rate-limited on free tier

    Incident reports stream via gemini-flash-latest instead. The audit memo is still pulled live from the model; only the model identity is downgraded.
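The set-membership drift check named in the limitations above is simple enough to sketch directly. The class and method names are hypothetical; the 100-action window and the min_history value of 50 are taken from the behavioral_drift primitive, and reading "novel" as "not among the targets this agent recently delegated to" is an assumption.

```python
from collections import deque


class DriftDetector:
    """Flags a delegation target the calling agent has not used recently."""

    def __init__(self, window: int = 100, min_history: int = 50):
        self.window = window            # how many prior actions to remember
        self.min_history = min_history  # stay silent until this many are seen
        self.history: dict[str, deque] = {}

    def observe(self, agent: str, target: str) -> bool:
        """Record one delegation and return True if the target is novel."""
        seen = self.history.setdefault(agent, deque(maxlen=self.window))
        novel = len(seen) >= self.min_history and target not in seen
        seen.append(target)
        return novel
```

Because the check is plain membership, one delegation to a new target is enough to make that target look familiar afterwards, which is part of why the text treats embedding similarity and statistical drift as the production upgrade.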

§ 07/ROADMAP

The rules write themselves next.

Today an operator authors YAML. v0.1 ships the six primitives that make that authoring tractable. The next milestone is closing the loop between the chain ledger and the policy file — so rules can be drafted in plain English, dry-run against real history, or proposed from traffic the gateway has already seen.

  1. next · § 01

    Plain-English → YAML

    An operator types ‘block any chain that hands invoice scope to a non-finance agent’ and Heimdall emits a candidate rule against the six primitives. The same Gemini pipeline that writes the incident memos already turns chain data into prose; we point it the other way.

    uses · backend/policy_engine.py · the existing incident-report streamer

  2. next · § 02

    Dry-run against the chain ledger

    Before a candidate rule reaches the live engine, replay the last N days of real chains against it. The dashboard surfaces the diff: ‘this rule would have allowed 1,204 chains, flagged 38, denied 6. Two of the six were the attack scene from last Tuesday.’ No bad DENY ever wedges production traffic.

    uses · backend/endpoints/replay.py · chain_credentials ledger

  3. exploring · § 03

    Traffic-mined rule suggestions

    Run Heimdall in shadow mode for a week, cluster the chains the gateway actually saw, and surface the gaps: ‘you’ve never had research → payment in 47k chains. Should that be a DENY?’ The bold version of automation; less a feature, more a research bet.

    uses · agent_behavior_baseline · novel-chain detection from behavioral_drift
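The dry-run idea in item 02 reduces to replaying stored chains through a candidate rule and tallying verdicts. The sketch below is speculative, since the feature is on the roadmap rather than shipped: the ledger shape (a list of dicts with a roles key), the candidate_rule function, and the verdict strings are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable


def candidate_rule(chain: dict) -> str:
    """Hypothetical candidate: deny research → payment, flag very long chains."""
    roles = chain["roles"]
    hands_to_payment = any(
        caller == "research" and callee == "payment"
        for caller, callee in zip(roles, roles[1:])
    )
    if hands_to_payment:
        return "DENY"
    if len(roles) - 1 > 4:  # more than four handoffs
        return "FLAG"
    return "ALLOW"


def dry_run(ledger: Iterable[dict], rule: Callable[[dict], str]) -> Counter:
    """Replay historical chains through a rule without enforcing anything."""
    return Counter(rule(chain) for chain in ledger)


# A stand-in for a few rows of the chain ledger.
ledger = [
    {"roles": ["chat", "search"]},
    {"roles": ["research", "payment"]},
    {"roles": ["a", "b", "c", "d", "e", "f"]},
]
print(dry_run(ledger, candidate_rule))
```

Keeping the rule a pure function of the chain is what makes the replay safe: the same callable can be scored against history first and wired into live enforcement only once the tallies look right.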

None of the three is a new product — each one stretches an existing piece of the backend. The chain ledger, the policy engine, and the behavioural baseline already store everything required; the v0.2 work is teaching them to talk to each other through a single authoring surface.

§ 08 / NEXT

Watch a chain die
at Layer 01.

The dashboard is a live watchpost. Run the routine scenario, then the rebalance, then the attack. Six rule cards stack in the sidebar as the attack dies at the protocol boundary, before the executor sees a thing.

tagline

Nothing crosses
without being seen.

Heimdall · 2026 · MIT license · built on Veea Lobster Trap