2026-06-05 · 7 min read

How AI agents pay for APIs: a primer on HTTP 402, x402, and the client side

Most writing about x402 is from the API owner's side: how to add a price to a route, how to verify settlement, how to ship the middleware. This post takes the other side — the developer building the agent — and walks through what it actually takes for a program to read an HTTP 402, decide whether to pay, pay, and continue. If you are building an AI agent in 2026 and want it to call paid APIs without a human in the loop, this is the engineering shape.

For the API-owner counterpart, see the developer integration guide; for protocol background, the technical deep dive.

What an agent encounters on the wire

The agent makes a normal HTTP request to a paid endpoint. Three things can come back:

  • 200 — the call was free, or already paid for in this exchange. Use the body.
  • 402 Payment Required — the endpoint costs money. The body is small JSON that quotes the price, the accepted settlement details, and a nonce.
  • Anything else — usual HTTP semantics, no payment path involved.

A 402 from an x402-compliant server is precise. It does not say "this might cost money, see docs"; it says "this costs 0.005 USDC on chain <x>, pay <wallet>, then retry with proof of payment and nonce <n>." The agent has everything it needs to act, in machine-readable form, in the first response.

Concretely the body looks like:

{
  "price": "0.005",
  "currency": "USDC",
  "chain": "base",
  "recipient": "0x…",
  "nonce": "01J…",
  "expires_at": "2026-06-05T12:00:00Z"
}

The fields you most care about are price (the budget cost), expires_at (don't pay if the challenge has expired), and nonce (carry it back unchanged on the retry).
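As a minimal sketch, the challenge can be parsed into a typed record before any decision is made. The field names are taken from the sample body above; the Challenge class and the parse_challenge and is_expired helpers are hypothetical names for illustration, not part of any SDK.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Challenge:
    price: Decimal
    currency: str
    chain: str
    recipient: str
    nonce: str
    expires_at: datetime


def parse_challenge(body: str) -> Challenge:
    """Parse a 402 body shaped like the sample above (assumed field names)."""
    raw = json.loads(body)
    # Rewrite a trailing "Z" so fromisoformat accepts it on older Python versions.
    expires = datetime.fromisoformat(raw["expires_at"].replace("Z", "+00:00"))
    return Challenge(
        price=Decimal(raw["price"]),  # Decimal avoids float rounding on money
        currency=raw["currency"],
        chain=raw["chain"],
        recipient=raw["recipient"],
        nonce=raw["nonce"],  # carried back unchanged on the retry
        expires_at=expires,
    )


def is_expired(challenge: Challenge, now: Optional[datetime] = None) -> bool:
    """True once the challenge deadline has passed; do not pay in that case."""
    current = now or datetime.now(timezone.utc)
    return current >= challenge.expires_at
```

Keeping price as a Decimal rather than a float means later budget comparisons stay exact.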

The four-step client loop

The whole payment-aware client is shorter than people expect:

  1. Send the request.
  2. If you get a 402, parse the challenge. Read the price. Compare it against your remaining task budget. If you can't afford it, fail fast and tell the planner. If you can, continue.
  3. Pay. Send the quoted amount to recipient on chain, in the currency the challenge names, and capture the settlement reference.
  4. Retry the original request with the X-Payment header (or whatever the SDK calls it) carrying the settlement reference and the nonce from step 2. You will get either a 200 with the body, or another 402 with a precise reason if verification failed.

That is the entire client-side path. Everything else — choosing tools, planning, fallback if the call comes back unaffordable — is the agent's normal reasoning loop.
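The four steps can be sketched as a single function. This is an illustration under stated assumptions, not a real SDK: send and pay are hypothetical callables supplied by the caller (your HTTP client and your wallet or runtime), and the JSON encoding of the X-Payment header value is assumed for the example.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict


@dataclass
class Response:
    status: int
    body: str


# Hypothetical signatures: send(headers) performs the request,
# pay(challenge) settles on chain and returns a settlement reference.
Send = Callable[[Dict[str, str]], Response]
Pay = Callable[[dict], str]


class Unaffordable(Exception):
    """Raised before paying so the planner can choose another tool."""


def call_paid(send: Send, pay: Pay, remaining_budget: Decimal) -> Response:
    # Step 1: send the request with no payment attached.
    first = send({})
    if first.status != 402:
        return first  # 200, or ordinary HTTP semantics

    # Step 2: parse the challenge and compare the price to the budget.
    challenge = json.loads(first.body)
    price = Decimal(challenge["price"])
    if price > remaining_budget:
        raise Unaffordable(f"quoted {price}, only {remaining_budget} left")

    # Step 3: pay and capture the settlement reference.
    settlement_ref = pay(challenge)

    # Step 4: retry with the proof and the unchanged nonce.
    # The header payload format here is an assumption for illustration.
    proof = json.dumps({"settlement": settlement_ref, "nonce": challenge["nonce"]})
    return send({"X-Payment": proof})
```

Injecting send and pay keeps the loop testable without a network or a funded wallet. A fuller version would also check expires_at before step 3.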

Where the wallet lives

A common stumbling block: "the agent does not have a wallet." In 2026 this is almost always a runtime problem, not a protocol problem. Three patterns are common:

  • Runtime-managed wallet. The agent runtime (a framework, a hosted platform) holds a wallet and exposes a high-level "pay this challenge" call. Your agent code asks the runtime to settle the 402; the runtime handles funding, signing, and getting the proof back to you. Most managed agent platforms ship this.
  • Operator-funded wallet. A wallet whose funds belong to the operator (the person or company running the agent) is mounted into the agent at start time with a per-task spend cap. The agent draws from it for the duration of the task; the operator sees usage in their statement.
  • Per-agent wallet. A long-running agent has its own wallet that gets topped up. Useful for autonomous services that earn and spend on their own, rare for prosumer use.

For a developer building an agent, the choice is usually "use what the runtime gives me." The protocol is indifferent — x402 cares about the proof, not the provenance.
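To make the operator-funded pattern concrete, here is a rough sketch of a wallet wrapper that enforces a per-task spend cap. The Wallet interface and the CappedWallet and SpendCapExceeded names are hypothetical; a real runtime will expose its own API.

```python
from decimal import Decimal
from typing import Protocol


class Wallet(Protocol):
    """Hypothetical interface: settle a payment, return a settlement reference."""

    def pay(self, recipient: str, amount: Decimal, currency: str, chain: str) -> str:
        ...


class SpendCapExceeded(Exception):
    pass


class CappedWallet:
    """Wraps any wallet and refuses payments beyond a per-task cap."""

    def __init__(self, inner: Wallet, cap: Decimal) -> None:
        self._inner = inner
        self._cap = cap
        self.spent = Decimal("0")

    def pay(self, recipient: str, amount: Decimal, currency: str, chain: str) -> str:
        if self.spent + amount > self._cap:
            raise SpendCapExceeded(f"{amount} would exceed cap {self._cap}")
        ref = self._inner.pay(recipient, amount, currency, chain)
        self.spent += amount  # count spend only after settlement succeeds
        return ref
```

Because the wrapper satisfies the same interface as the wallet it wraps, agent code does not need to know which of the three patterns is behind it.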

Budgets, ceilings, and the planner

If your agent is going to make many calls, the interesting design question is not "can it pay one" but "should it pay this one." A few patterns that work in practice:

  • Per-task ceiling. The agent is given a budget for the whole task ($1, $5, $50). It tracks consumed spend against the ceiling. A 402 above the remaining budget is auto-rejected without paying.
  • Per-call sanity check. Compare the quoted price to a per-call hard cap (e.g. "never pay more than $0.50 for a single call"). Catches misconfigured endpoints quoting absurd prices.
  • Expected value gate. For richer agents, attach an expected-value estimate to the planned call. If expected_value < quoted_price * margin, do not pay; pick a different tool or report back to the planner.

The 402 challenge gives you the price up front, which is the entire reason these checks are possible. Compare this to the world where you call, get charged opaquely, and find out the price after the fact: you cannot build a sensible planner against that, which is partly why monthly-quota APIs were never going to fit autonomous callers.
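The three checks above fit in one small function. This is a sketch: the should_pay name, its parameters, and the reason strings are illustrative choices, and the default margin of 1 is an assumption.

```python
from decimal import Decimal
from typing import Optional, Tuple


def should_pay(
    price: Decimal,
    remaining_budget: Decimal,
    per_call_cap: Decimal,
    expected_value: Optional[Decimal] = None,
    margin: Decimal = Decimal("1"),
) -> Tuple[bool, str]:
    """Return (decision, reason) for a quoted price, before any payment is made."""
    # Per-call sanity check: catches endpoints quoting absurd prices.
    if price > per_call_cap:
        return False, "over per-call cap"
    # Per-task ceiling: reject anything above what is left for this task.
    if price > remaining_budget:
        return False, "over remaining task budget"
    # Expected-value gate: only applied when the planner supplies an estimate.
    if expected_value is not None and expected_value < price * margin:
        return False, "expected value too low"
    return True, "ok"
```

For example, should_pay(Decimal("0.002"), Decimal("1"), Decimal("0.50")) passes all three checks, while a $0.75 quote against the same $0.50 cap is rejected before the budget is even consulted.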

What you do not have to build

A useful sanity check on scope. With an x402-compliant API on the other side, the agent does not need to:

  • Sign up to anything. There is no account, no email confirmation, no dashboard.
  • Hold API keys. The protocol identifies the payment, not the caller.
  • Track plan quotas. Per-call settlement replaces "you have 73 calls left this month."
  • Handle invoicing. The proof of payment is the receipt; the operator sees consumed spend, not a monthly bill.
  • Re-implement the wire contract. The X402v1 canonical string is frozen and every x402-compliant runtime speaks it byte-identically.

All of that is real work the older monetisation model required from the caller. The compression matters most when the caller is itself a program — a program cannot stop to confirm an email link.

The MCP case, briefly

If your agent invokes tools via MCP, the same 402 shape applies inside the tool call. A paid MCP tool returns a 402 with the price for that tool; the runtime pays; the tool runs and returns its structured result. Per-tool pricing is the model — different tools do different work, so different tools are worth different amounts. See MCP server monetization with x402 for the server side; on the client, the only thing changing is that the 402 is wrapped in MCP's tool-call envelope rather than a raw HTTP response. The runtime handles the unwrap.

A small worked example

You have built an agent that researches a company and produces a brief. It calls three paid endpoints in the course of one task. The agent runtime gives it a $1 budget. The flow:

  • Call 1: GET /companies?domain=example.com. 402 quotes $0.002. Below per-call cap, well under budget. Pay. 200 returns the company record.
  • Call 2: GET /people?company_id=…. 402 quotes $0.01. Below per-call cap, $0.998 remaining. Pay. 200 returns the people list.
  • Call 3: POST /enrich for top three contacts. 402 quotes $0.03 each, total $0.09. Below per-call cap, $0.988 remaining. Pay (three settlements, three retries). 200 each.

Total task cost: $0.102. Remaining budget: $0.898. The agent writes the brief and returns it. The operator sees $0.102 in their statement, attributable to the task.
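The totals can be checked with a few lines of exact decimal arithmetic. The run_task helper is a hypothetical name for this sketch; the five settlements are the one, one, and three payments described in the flow.

```python
from decimal import Decimal
from typing import List, Tuple


def run_task(budget: Decimal, quotes: List[Decimal]) -> Tuple[Decimal, Decimal]:
    """Pay each quote in order and return (total spent, remaining budget)."""
    spent = Decimal("0")
    for price in quotes:
        if price > budget - spent:
            raise ValueError(f"quote {price} exceeds remaining budget")
        spent += price
    return spent, budget - spent


quotes = [
    Decimal("0.002"),  # call 1: company record
    Decimal("0.01"),   # call 2: people list
    Decimal("0.03"),   # call 3: enrich contact 1
    Decimal("0.03"),   # call 3: enrich contact 2
    Decimal("0.03"),   # call 3: enrich contact 3
]
spent, remaining = run_task(Decimal("1.00"), quotes)
print(spent, remaining)  # 0.102 0.898
```

Decimal is used instead of float so that sums of small prices do not drift.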

No accounts. No keys. No quotas. No reconciliation. The agent did the work, the agent paid for the work, the protocol made both legible.

Where to actually start

If you are building an agent today and want it to call paid APIs:

  • Pick a runtime that has x402 client support, or use a small library that handles the four-step loop above.
  • Decide where your wallet lives: runtime-managed, operator-funded, or per-agent. For most teams the first is the right default.
  • Implement a per-task budget and a per-call cap in the planner. They are five lines each, and they prevent the most common failure mode.
  • Test against a sandbox endpoint before going live. Paywall, for example, exposes sandbox keys with synthetic settlement so you can exercise the full 402-pay-retry loop without spending real funds — see the SDK pages for the per-stack snippets, all of which expose the same wire-compatible test path.

The thing the older "API key + monthly plan" world demanded — that the agent sign up to a service it has never heard of, on behalf of a human who isn't there — is no longer the work. The work in 2026 is parsing a small JSON challenge, paying it, and continuing the task. That is the shape this is going to keep, and the agents being built today against it are going to be the ones that move freely across the paid web a year from now.