The Threat Model

Before talking about defenses, it helps to be precise about what we are defending against. The threat model for autonomous AI agent payments has three primary vectors: two attacks and one failure mode with the same outcome.

Prompt injection. An attacker plants instructions in content the agent reads — a webpage, an email, an API response — that manipulate the agent into taking an action against the user's interest. For payments, the typical goal is to redirect funds to an attacker-controlled address. This is the highest-frequency attack against agents in production today.

Key compromise. An attacker obtains the wallet's signing key, either by stealing API credentials, exploiting a vulnerability in the agent's host, or breaching the wallet provider's infrastructure. With full key access, the attacker can drain the wallet entirely.

Runaway code. Not an attack, but a similar outcome. A bug or infinite loop in the agent's logic causes it to issue many duplicate or oversized payments. The funds are lost just as completely as if an attacker took them.

A complete security architecture has to address all three, and each one needs a different defense. Single-layer security (just encryption, just policies, just code review) leaves at least one of these vectors wide open.

Key Management: Shamir Secret Sharing

The first line of defense is making sure no single point of compromise can drain funds. MoltPe uses Shamir Secret Sharing (SSS) to split each wallet's private key into multiple shares.

Shamir Secret Sharing is a cryptographic technique published by Adi Shamir in 1979. It splits a secret (in our case, the wallet's signing key) into N shares such that any K of them can reconstruct the secret, but any K-1 or fewer reveal nothing about it. MoltPe uses a 2-of-3 threshold: three shares exist, any two can be combined to sign, but any single share is useless on its own.
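To make the threshold property concrete, here is a minimal sketch of 2-of-3 splitting and recovery over a prime field. It illustrates the general technique only; it is not MoltPe's implementation, and the names split_secret, recover_secret, and PRIME are chosen for this example. A production system should rely on an audited library rather than code like this.

```python
import secrets

# Assumption for this sketch: the secret is an integer smaller than PRIME.
# 2**521 - 1 is a Mersenne prime, comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int = 2, num_shares: int = 3) -> list:
    """Split `secret` into points on a random polynomial of degree threshold - 1."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # coefficients[0] is the secret; the remaining coefficients are random.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for coeff in reversed(coefficients):  # Horner's rule, modulo PRIME
            y = (y * x + coeff) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list) -> int:
    """Lagrange-interpolate the polynomial at x = 0, where the secret lives."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        # pow(d, -1, PRIME) computes the modular inverse (Python 3.8+).
        secret = (secret + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


# Hypothetical 256-bit signing key, split 2-of-3.
signing_key = secrets.randbits(256)
share_1, share_2, share_3 = split_secret(signing_key)
```

Any two of the three shares passed to recover_secret return the original key. A single share is one point on a line whose slope is random, so it is consistent with every possible secret.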

The shares are distributed:

Signing a transaction requires combining at least two shares. Compromise of MoltPe alone (Share 1) cannot move funds because the attacker still needs Share 2. Compromise of the agent's runtime (Share 2) is bounded by the spending policies enforced when MoltPe contributes Share 1. No single party holds enough key material to move funds on its own, which is what "non-custodial" means in practice.

Wallet Isolation

The second line of defense is limiting blast radius. Every agent gets its own isolated wallet. Wallets do not share keys, do not share API credentials, and do not share spending policies. A compromise of one agent does not give the attacker any access to the others.

This sounds obvious, but the alternative is common: a single shared wallet that all agents draw against, with policies enforced by the application layer. That design fails the moment any one agent is compromised, because the attacker now has access to the full pooled balance.

Isolation also makes audit trails clean. When you look at a wallet's transaction log, every entry is attributable to one agent. No reconciliation, no guessing which workflow caused which payment. For regulated environments, this is closer to "hard requirement" than "best practice."

Concretely, the recommended pattern is one MoltPe wallet per logical agent. If you run a research agent, a customer-service agent, and a content agent, that is three wallets, three sets of policies, three sets of API keys. The free tier supports as many wallets as you need.
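The one-wallet-per-agent pattern can be sketched as a small registry plus a check that nothing is shared. Everything here is hypothetical and for illustration only: the AgentWallet fields, the wallet IDs, the environment variable names, and the limits are assumptions, not MoltPe configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentWallet:
    agent: str
    wallet_id: str          # hypothetical identifier
    api_key_env: str        # name of the env var holding this agent's own API key
    daily_limit_cents: int  # each wallet carries its own policy


# Three logical agents, so three wallets, three policies, three credentials.
WALLETS = [
    AgentWallet("research", "wallet-research", "MOLTPE_KEY_RESEARCH", 2000),
    AgentWallet("customer-service", "wallet-support", "MOLTPE_KEY_SUPPORT", 5000),
    AgentWallet("content", "wallet-content", "MOLTPE_KEY_CONTENT", 1000),
]


def assert_isolated(wallets: list) -> None:
    """Raise ValueError if any two agents share a name, wallet, or credential."""
    for attr in ("agent", "wallet_id", "api_key_env"):
        values = [getattr(w, attr) for w in wallets]
        if len(values) != len(set(values)):
            raise ValueError(f"agents share a {attr}")


assert_isolated(WALLETS)  # passes: nothing is pooled
```

A check like this is cheap to run at deploy time and catches the most common way isolation erodes, which is a new agent quietly reusing an existing wallet or key.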

Server-Side Policy Enforcement

The third line of defense is infrastructure-level spending limits. We covered this in detail in our spending policies guide; here is the security-specific framing.

The critical design decision is that policies are enforced at MoltPe's API server, not in the agent's code or the MCP client. The reason is that any check in the agent's process can be subverted by code that runs in the same process. Prompt injection that compromises the agent also compromises any policy check the agent is responsible for running.

By moving the check to a remote server, the policy boundary stays intact even when the agent is fully owned by an attacker. The agent can submit any request it wants. The server independently evaluates the request against the wallet's spending policies and rejects anything that violates them. There is no shared state for the attacker to manipulate.
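A rough sketch of what server-side evaluation means in code follows. This is not MoltPe's API: POLICY_STORE, SpendingPolicy, evaluate_request, and the request fields are all hypothetical. The point it illustrates is that the server looks up the policy itself and ignores anything the client claims about its own limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpendingPolicy:
    per_transaction_cap_cents: int


# Hypothetical server-side policy store, keyed by wallet ID.
# It lives on the server; the agent has no way to write to it.
POLICY_STORE = {"wallet-research": SpendingPolicy(per_transaction_cap_cents=500)}


def evaluate_request(request: dict) -> tuple:
    """Decide on the server whether a payment request is allowed.

    Only `wallet_id` and `amount_cents` are read from the request. Any other
    field the client sends, such as a self-reported limit, is ignored.
    """
    policy = POLICY_STORE.get(request.get("wallet_id"))
    if policy is None:
        return False, "unknown wallet"
    amount = request.get("amount_cents")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        return False, "invalid amount"
    if amount > policy.per_transaction_cap_cents:
        return False, "exceeds per-transaction cap"
    return True, "ok"


# A compromised agent tries to smuggle in its own, much larger cap.
hijacked = {
    "wallet_id": "wallet-research",
    "amount_cents": 50_000,
    "per_transaction_cap_cents": 1_000_000,  # ignored by the server
}
decision = evaluate_request(hijacked)
```

The hijacked request is rejected because the cap that matters is the one in the server's store, not the one in the payload.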

This is the same architectural pattern that makes good API rate-limiting work: the limit is enforced at the gateway, not at the client. Anyone who has tried to enforce rate limits in the client knows why this matters.

Prompt-Injection Defense

Prompt injection is hard to prevent at the model layer. Researchers have tried various input-sanitization approaches, but the prevailing view is that there is no fully reliable defense at the prompt level. Models will continue to be susceptible to clever instructions hidden in untrusted content.

So the practical defense is to assume prompt injection will succeed sometimes, and to design the system so that successful injection cannot cause catastrophic damage. This is the same principle as defense-in-depth in traditional security: assume any single layer can fail, and make sure the next layer catches it.

For payments specifically, the layered defense looks like:

  1. Per-transaction cap. Bounds any single payment to a small, fixed amount. The worst case for one successful injection is bounded by this number.
  2. Daily spending limit. Bounds total daily exposure even if many small injection attacks succeed.
  3. Audit trail. Every transaction (successful or blocked) is logged with full context. Anomalous spending patterns are detectable in real time.
  4. Wallet isolation. An injection that compromises Agent A cannot move funds from Agent B's wallet.

Combined, these mean that the worst-case loss from prompt injection is "one daily limit's worth of one wallet" — typically $5 to $50 — rather than "everything you have." That trade-off is what makes autonomous agent payments deployable in practice.
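The bound can be checked with a small simulation. The sketch below assumes a hypothetical WalletGuard with a $1 per-transaction cap and a $5 daily limit, and replays an injected instruction that tries to pay an attacker 50 times in one day. The class, its fields, and the amounts are illustrative assumptions, not MoltPe's implementation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class WalletGuard:
    wallet_id: str
    per_transaction_cap_cents: int
    daily_limit_cents: int
    spent_today_cents: int = 0
    current_day: date = field(default_factory=date.today)
    audit_log: list = field(default_factory=list)

    def authorize(self, amount_cents: int, recipient: str,
                  today: Optional[date] = None) -> bool:
        today = today or date.today()
        if today != self.current_day:  # new day: reset the running total
            self.current_day, self.spent_today_cents = today, 0
        if amount_cents <= 0:
            decision = "blocked: invalid amount"
        elif amount_cents > self.per_transaction_cap_cents:
            decision = "blocked: per-transaction cap"
        elif self.spent_today_cents + amount_cents > self.daily_limit_cents:
            decision = "blocked: daily limit"
        else:
            decision = "approved"
            self.spent_today_cents += amount_cents
        # Every attempt is logged, whether approved or blocked.
        self.audit_log.append(
            (today.isoformat(), self.wallet_id, recipient, amount_cents, decision)
        )
        return decision == "approved"


# Two isolated wallets; only agent A is hit by the injection.
guard_a = WalletGuard("agent-a", per_transaction_cap_cents=100, daily_limit_cents=500)
guard_b = WalletGuard("agent-b", per_transaction_cap_cents=100, daily_limit_cents=500)
day = date(2025, 1, 1)

# The injected instruction retries 50 times at the maximum allowed amount.
approved = sum(guard_a.authorize(100, "attacker-address", today=day) for _ in range(50))
oversized_ok = guard_a.authorize(10_000, "attacker-address", today=day)
```

Under these assumptions only 5 of the 50 attempts go through, for a total of $5.00 from agent A's wallet. Agent B's wallet is untouched, and all 51 attempts, including the oversized one, appear in agent A's audit log.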

Operational Security

A few practices that materially improve security beyond what the infrastructure provides on its own:

For a deeper dive on the broader security model, see the MoltPe security page, which documents our compliance posture, encryption standards, and infrastructure controls.