Mercury · Cited Sitemap
GET /buy/sitemap
What it does
Takes a domain or URL and returns a SIGNED snapshot of the site's PUBLISHED sitemap. It discovers the sitemap from the Sitemap: lines in robots.txt, falling back to /sitemap.xml; parses <urlset> and <sitemapindex> documents (following up to 5 child sitemaps); and returns a deduplicated, bounded (≤2000) URL inventory with lastmod, changefreq and priority. The receipt signs the DECLARED URL list, which is deterministic: the same sitemap bytes yield a byte-identical list. Optional ?fetch=N (N ≤ 10) adds a HARD-BOUNDED same-domain liveness probe (title, status and bytes per URL). That probe is the ONLY non-deterministic part and is NOT covered by the signature. The route is SSRF-guarded, and the crawl is bounded on every axis.
The goal it serves: map a site's link graph, sitemap and AI-crawler permissions so an agent can plan a crawl — and prove what the site actually published at the time.
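The discovery order described above (robots.txt Sitemap: lines first, then the conventional /sitemap.xml path) can be sketched as a pair of pure functions. This is a minimal illustration, not Mercury's implementation: the names parseRobotsSitemaps and sitemapCandidates are hypothetical, and fetching, SSRF checks, XML parsing and the 5-child-sitemap bound are all left out.

```typescript
// Hypothetical helpers for illustration only; not the route's actual code.

type Candidate = {
  sitemapUrl: string;
  discoveredVia: "robots.txt" | "sitemap.xml";
};

/** Collect the URLs declared on `Sitemap:` lines of a robots.txt body. */
function parseRobotsSitemaps(robotsTxt: string): string[] {
  const found: string[] = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Strip trailing comments, then match the field name case-insensitively.
    const line = rawLine.replace(/#.*$/, "").trim();
    const match = /^sitemap\s*:\s*(\S+)/i.exec(line);
    if (match && !found.includes(match[1])) {
      found.push(match[1]); // keep first-seen order, skip duplicates
    }
  }
  return found;
}

/**
 * Return sitemap locations to try, in order: every robots.txt declaration,
 * or the conventional /sitemap.xml path when robots.txt declares none.
 * `origin` is assumed to be scheme://host with no trailing slash.
 */
function sitemapCandidates(origin: string, robotsTxt: string | null): Candidate[] {
  const declared = robotsTxt === null ? [] : parseRobotsSitemaps(robotsTxt);
  if (declared.length > 0) {
    return declared.map(
      (sitemapUrl): Candidate => ({ sitemapUrl, discoveredVia: "robots.txt" })
    );
  }
  return [{ sitemapUrl: `${origin}/sitemap.xml`, discoveredVia: "sitemap.xml" }];
}
```

When robots.txt is missing or declares no sitemap, the sketch yields a single candidate at the origin's /sitemap.xml, tagged with the same discoveredVia values the output schema enumerates.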
Schemas & output preview
Input schema: the exact request shape the route validates.
{
  "type": "object",
  "properties": {
    "url": {
      "type": "string",
      "maxLength": 2048,
      "description": "a domain (example.com) or any URL on the site — only its origin is used"
    },
    "fetch": {
      "type": "string",
      "maxLength": 3,
      "description": "optional N (0–10): also shallow-fetch the first N same-domain sitemap URLs and report each one's live title + HTTP status + byte size (liveness sample). NOT covered by the signed receipt (it can change between calls). Default 0 = off."
    },
    "limit": {
      "type": "string",
      "maxLength": 4,
      "description": "optional cap on URLs returned (1–2000); default returns all up to 2000"
    }
  },
  "required": [
    "url"
  ],
  "additionalProperties": false
}
Output schema: the exact response shape the handler returns.
{
  "type": "object",
  "properties": {
    "ok": {
      "type": "boolean",
      "description": "true on success; false on an honest failure (still delivered)"
    },
    "url": {
      "type": "string",
      "description": "the sitemap URL that was fetched + parsed"
    },
    "status": {
      "type": "integer",
      "description": "upstream HTTP status of the sitemap fetch"
    },
    "data": {
      "type": "object",
      "description": "the structured sitemap snapshot (the product the buyer consumes)",
      "properties": {
        "origin": {
          "type": "string",
          "description": "scheme://host the sitemap was discovered for"
        },
        "sitemapUrl": {
          "type": "string",
          "description": "the sitemap that was parsed"
        },
        "discoveredVia": {
          "type": "string",
          "enum": [
            "robots.txt",
            "sitemap.xml",
            "input"
          ],
          "description": "how the sitemap was located (robots Sitemap: line / conventional path / direct input)"
        },
        "robotsSitemaps": {
          "type": "array",
          "description": "all Sitemap: URLs robots.txt declared (may be > the one parsed)",
          "items": {
            "type": "string"
          }
        },
        "kind": {
          "type": "string",
          "enum": [
            "urlset",
            "sitemapindex",
            "unknown"
          ],
          "description": "which sitemap schema was parsed"
        },
        "childSitemaps": {
          "type": "array",
          "description": "child sitemaps followed from a <sitemapindex> (bounded to 5)",
          "items": {
            "type": "string"
          }
        },
        "total": {
          "type": "integer",
          "description": "number of unique page URLs returned"
        },
        "truncated": {
          "type": "boolean",
          "description": "true if the site declared more URLs than the 2000 cap / requested limit"
        },
        "urls": {
          "type": "array",
          "description": "the published URL inventory (deduped, origin-then-loc sorted)",
          "items": {
            "type": "object",
            "properties": {
              "loc": {
                "type": "string",
                "description": "absolute page URL"
              },
              "lastmod": {
                "type": "string",
                "description": "declared last-modified (only if present)"
              },
              "changefreq": {
                "type": "string",
                "description": "declared change frequency (only if present)"
              },
              "priority": {
                "type": "string",
                "description": "declared crawl priority (only if present)"
              }
            }
          }
        },
        "probe": {
          "type": "array",
          "description": "OPTIONAL liveness sample (?fetch=N): first N same-domain URLs shallow-fetched. NOT covered by the signed receipt — live status can change between calls.",
          "items": {
            "type": "object",
            "properties": {
              "url": {
                "type": "string"
              },
              "ok": {
                "type": "boolean"
              },
              "status": {
                "type": "integer"
              },
              "title": {
                "type": "string"
              },
              "bytes": {
                "type": "integer"
              },
              "error": {
                "type": "string",
                "description": "present only when that URL failed"
              }
            }
          }
        }
      }
    },
    "text": {
      "type": "string",
      "description": "canonical newline string the signed receipt covers: one URL per line, origin-then-loc sorted (the DECLARED inventory only — never the live-probe sample, so it is reproducible)"
    },
    "contentType": {
      "type": "string"
    },
    "fetchedAt": {
      "type": "string",
      "description": "ISO8601 fetch time (in the signed payload)"
    },
    "error": {
      "type": "string",
      "description": "present only when ok:false"
    }
  },
  "required": [
    "ok",
    "url"
  ],
  "additionalProperties": false
}
Output preview: a real example response, shown free (you only pay when you call the route).
{
  "ok": true,
  "url": "https://www.iana.org/sitemap.xml",
  "status": 200,
  "data": {
    "origin": "https://www.iana.org",
    "sitemapUrl": "https://www.iana.org/sitemap.xml",
    "discoveredVia": "sitemap.xml",
    "robotsSitemaps": [],
    "kind": "urlset",
    "childSitemaps": [],
    "total": 2,
    "truncated": false,
    "urls": [
      {
        "loc": "https://www.iana.org/about",
        "lastmod": "2024-01-01"
      },
      {
        "loc": "https://www.iana.org/domains",
        "changefreq": "weekly",
        "priority": "0.8"
      }
    ],
    "probe": []
  },
  "text": "https://www.iana.org/about\nhttps://www.iana.org/domains",
  "contentType": "application/xml",
  "fetchedAt": "2026-06-04T00:00:00.000Z"
}
Pay & call
Your agent calls the route; the 402 challenge carries the exact price ($0.01, USDC on Base mainnet); the x402 client settles via the CDP facilitator and retries. No key, no signup.
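import { wrapFetchWithPayment } from "x402-fetch";
import { privateKeyToAccount } from "viem/accounts";
// Assumption: PRIVATE_KEY is a hypothetical env var holding the 0x-prefixed key of a wallet with a little USDC on Base.
const account = privateKeyToAccount(process.env.PRIVATE_KEY as `0x${string}`);
const pay = wrapFetchWithPayment(fetch, account); // answers the 402 challenge, settles, then retries
const res = await pay("https://network.mercury-hq.com/buy/sitemap?url=https://example.com");
const out = await res.json(); // the result + `attestation` (the signed receipt)
Prepaid alternative (the same route accepts an API key):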
# Same route, prepaid API-key rail (Bearer mk_live_…) — get a key at https://network.mercury-hq.com/developers
curl -H "Authorization: Bearer mk_live_YOURKEY" "https://network.mercury-hq.com/buy/sitemap?url=https://example.com"
Verify the receipt
Recover the EIP-191 signature over sha256(content)‖url‖status‖fetchedAt‖nonce and confirm the signer equals the pinned attestation key 0xACB40253BD71Bb9a5d491b2c6EFF755F2A33Fc75 (published at /.well-known/mercury-attestation). No callback to Mercury — the receipt verifies offline, forever. Verification is always free: POST the receipt to /x402/verify or run ecrecover yourself.
| Fact | Value |
|---|---|
| Attestation signer (pinned) | 0xACB40253BD71Bb9a5d491b2c6EFF755F2A33Fc75 |
| Key published at | /.well-known/mercury-attestation |
| Live verifier (free) | /x402/verify |
| Settlement | real USDC on Base mainnet (eip155:8453) via CDP — auditable on BaseScan |
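Because the signature covers a hash of the content, a cheap local sanity check before recovering the signer is to confirm that the `text` field agrees with the inventory in `data.urls`. The sketch below is an illustration only, with hypothetical names (SitemapResponse, canonicalText, inventoryMatchesText). It assumes `text` is exactly the returned `loc` values joined by newlines in the order given, which matches the output preview but is not stated as a guarantee (for example when a `limit` is applied). It does not replace the signature check.

```typescript
// Minimal shape of the fields used here, taken from the output schema.
// The interface and function names are hypothetical, for illustration only.
interface SitemapResponse {
  ok: boolean;
  text?: string;
  data?: { urls?: { loc: string }[] };
}

/** Rebuild the newline-joined URL list from `data.urls`, in returned order. */
function canonicalText(res: SitemapResponse): string {
  const urls = res.data?.urls ?? [];
  return urls.map((u) => u.loc).join("\n");
}

/** True when the call succeeded and `text` matches the inventory in `data.urls`. */
function inventoryMatchesText(res: SitemapResponse): boolean {
  return (
    res.ok &&
    typeof res.text === "string" &&
    canonicalText(res) === res.text
  );
}
```

A mismatch here does not by itself prove tampering; it only means the assumption above does not hold for that response, so rely on the signature recovery or the free /x402/verify endpoint for the actual proof.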
Related
Cited Robots · $0.005
More: all services · /catalog · the headline web-fetch · agent twin of this page: GET /university/docs/cited-sitemap?format=md