
Mercury · Cited Robots

Crawl · $0.005 / call · Live · x402 · API key

GET /buy/robots

What it does

Takes a domain and returns a signed, timestamped allow/block audit for each AI crawler (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, …), derived from robots.txt, llms.txt and ai.txt. The evaluation is deterministic and uses no LLM. The signed verdict serves as EU-AI-Act / TDM opt-out evidence: it proves the crawler policy as it stood at fetch time.

The goal it serves: map a site's link graph, sitemap and AI-crawler permissions so an agent can plan a crawl — and prove what the site actually published at the time.
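
To make the "deterministic, no LLM" idea concrete, here is a minimal sketch of how a per-crawler verdict can be derived from robots.txt alone. It is not Mercury's implementation: it assumes exact (case-insensitive) agent-name matching, prefix-only path matching with no `*` or `$` wildcards, and a longest-match rule where Allow wins ties. The function names and the `"none"` value for `decidedBy` are hypothetical.

```javascript
// Simplified robots.txt evaluation (assumptions: prefix matching only,
// exact agent-name match, no wildcard support).
function parseGroups(robotsTxt) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group.
      if (!lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if (field === "allow" || field === "disallow") {
      lastWasAgent = false;
      // An empty pattern matches nothing, so it is skipped.
      if (current && value !== "") current.rules.push({ type: field, pattern: value });
    }
  }
  return groups;
}

function evaluate(robotsTxt, agent, path = "/") {
  const groups = parseGroups(robotsTxt);
  const name = agent.toLowerCase();
  // A group naming the agent takes precedence over the '*' group.
  const named = groups.filter((g) => g.agents.includes(name));
  const chosen = named.length > 0 ? named : groups.filter((g) => g.agents.includes("*"));
  const decidedBy = named.length > 0 ? "named" : chosen.length > 0 ? "*" : "none";
  let best = null;
  for (const rule of chosen.flatMap((g) => g.rules)) {
    if (!path.startsWith(rule.pattern)) continue;
    const longer = best === null || rule.pattern.length > best.pattern.length;
    const tieAllow =
      best !== null && rule.pattern.length === best.pattern.length && rule.type === "allow";
    if (longer || tieAllow) best = rule;
  }
  const verdict = best !== null && best.type === "disallow" ? "blocked" : "allowed";
  const label = best === null ? null : `${best.type === "disallow" ? "Disallow" : "Allow"}: ${best.pattern}`;
  return { verdict, decidedBy, rule: label };
}
```

Because the result depends only on the file contents, the agent name and the path, the same inputs always give the same verdict, which is what makes a signed snapshot meaningful as evidence.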

Schemas & output preview

Input schema — the exact request shape the route validates.

json · input schema
{
  "type": "object",
  "properties": {
    "domain": {
      "type": "string",
      "maxLength": 2048,
      "description": "domain or URL to audit (e.g. example.com or https://example.com/page); only the origin is used"
    },
    "path": {
      "type": "string",
      "maxLength": 1024,
      "description": "optional path to evaluate the verdict for (default '/', the whole-site question)"
    }
  },
  "required": [
    "domain"
  ],
  "additionalProperties": false
}
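
As a rough illustration of the input schema, the sketch below checks the two fields client-side and builds the request URL. The helper name `buildAuditUrl` is hypothetical, and passing `path` as a query parameter alongside `domain` is an assumption (the examples on this page only show `domain` in the query string).

```javascript
const ENDPOINT = "https://network.mercury-hq.com/buy/robots";

// Mirror the schema limits locally so an invalid request is never paid for.
function buildAuditUrl({ domain, path } = {}) {
  if (typeof domain !== "string") throw new TypeError("domain is required and must be a string");
  if (domain.length > 2048) throw new RangeError("domain exceeds maxLength 2048");
  if (path !== undefined) {
    if (typeof path !== "string") throw new TypeError("path must be a string");
    if (path.length > 1024) throw new RangeError("path exceeds maxLength 1024");
  }
  const url = new URL(ENDPOINT);
  url.searchParams.set("domain", domain);
  // Omitting path leaves the server default ('/', the whole-site question).
  if (path !== undefined) url.searchParams.set("path", path);
  return url.toString();
}
```

Calling `buildAuditUrl({ domain: "example.com" })` yields the same URL used in the call examples on this page.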

Output schema — the exact response shape the handler returns.

json · output schema
{
  "type": "object",
  "properties": {
    "ok": {
      "type": "boolean",
      "description": "true on success; false on an honest failure (still delivered, never charged for a stub)"
    },
    "url": {
      "type": "string",
      "description": "origin audited (https://host)"
    },
    "status": {
      "type": "integer",
      "description": "HTTP status of the robots.txt fetch"
    },
    "text": {
      "type": "string",
      "description": "canonical JSON the signed attestation is computed over (data, keys sorted)"
    },
    "fetchedAt": {
      "type": "string",
      "description": "ISO-8601 fetch time — the 'as-of' moment the signed policy snapshot proves"
    },
    "data": {
      "type": "object",
      "description": "the structured audit",
      "properties": {
        "version": {
          "type": "string"
        },
        "host": {
          "type": "string"
        },
        "origin": {
          "type": "string"
        },
        "evaluatedPath": {
          "type": "string"
        },
        "policies": {
          "type": "object",
          "description": "raw state of each crawler-policy file at fetch time",
          "additionalProperties": false,
          "properties": {
            "robots.txt": {
              "type": "object",
              "additionalProperties": true
            },
            "llms.txt": {
              "type": "object",
              "additionalProperties": true
            },
            "ai.txt": {
              "type": "object",
              "additionalProperties": true
            }
          }
        },
        "verdicts": {
          "type": "array",
          "description": "per-AI-agent allow/block verdict for evaluatedPath, with the decisive rule cited",
          "items": {
            "type": "object",
            "properties": {
              "agent": {
                "type": "string"
              },
              "vendor": {
                "type": "string"
              },
              "purpose": {
                "type": "string"
              },
              "verdict": {
                "type": "string",
                "enum": [
                  "allowed",
                  "blocked"
                ]
              },
              "reason": {
                "type": "string"
              },
              "decidedBy": {
                "type": "string"
              },
              "rule": {
                "type": [
                  "string",
                  "null"
                ]
              }
            }
          }
        },
        "summary": {
          "type": "object",
          "properties": {
            "allowed": {
              "type": "integer"
            },
            "blocked": {
              "type": "integer"
            },
            "blockedAgents": {
              "type": "array",
              "items": {
                "type": "string"
              }
            },
            "sitemaps": {
              "type": "array",
              "items": {
                "type": "string"
              }
            },
            "hasLlmsTxt": {
              "type": "boolean"
            },
            "hasAiTxt": {
              "type": "boolean"
            },
            "hasRobotsTxt": {
              "type": "boolean"
            }
          }
        }
      }
    },
    "error": {
      "type": "string",
      "description": "present only when ok:false"
    }
  },
  "required": [
    "ok",
    "url"
  ],
  "additionalProperties": true
}
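
Since only `ok` and `url` are required by the output schema, a consumer should treat everything else as optional. The following is a small sketch of defensive handling; `summarizeAudit` and the shape it returns are hypothetical, not part of the API.

```javascript
// Reduce a response to the agents that were blocked or allowed,
// tolerating an ok:false failure and missing optional fields.
function summarizeAudit(out) {
  if (!out || out.ok !== true) {
    return { ok: false, error: out?.error ?? "unknown error", blocked: [], allowed: [] };
  }
  const verdicts = Array.isArray(out.data?.verdicts) ? out.data.verdicts : [];
  return {
    ok: true,
    error: null,
    blocked: verdicts.filter((v) => v.verdict === "blocked").map((v) => v.agent),
    allowed: verdicts.filter((v) => v.verdict === "allowed").map((v) => v.agent),
  };
}
```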

Output preview — a real example response, shown free (you only pay when you call the route).

json · output preview
{
  "ok": true,
  "url": "https://example.com",
  "status": 200,
  "fetchedAt": "2026-06-04T00:00:00.000Z",
  "data": {
    "version": "mercury-robots-audit-v1",
    "host": "example.com",
    "origin": "https://example.com",
    "evaluatedPath": "/",
    "policies": {
      "robots.txt": {
        "present": true,
        "status": 200,
        "bytes": 26,
        "sitemaps": []
      },
      "llms.txt": {
        "present": false,
        "status": 404
      },
      "ai.txt": {
        "present": false,
        "status": 404
      }
    },
    "verdicts": [
      {
        "agent": "GPTBot",
        "vendor": "OpenAI",
        "purpose": "AI training crawler",
        "verdict": "blocked",
        "reason": "matched rule in named group",
        "decidedBy": "named",
        "rule": "Disallow: /"
      },
      {
        "agent": "ClaudeBot",
        "vendor": "Anthropic",
        "purpose": "AI training crawler",
        "verdict": "allowed",
        "reason": "'*' group has no rule matching / (allow-all within group)",
        "decidedBy": "*",
        "rule": null
      }
    ],
    "summary": {
      "allowed": 17,
      "blocked": 1,
      "blockedAgents": [
        "GPTBot"
      ],
      "sitemaps": [],
      "hasLlmsTxt": false,
      "hasAiTxt": false,
      "hasRobotsTxt": true
    }
  },
  "text": "{\"evaluatedPath\":\"/\",\"host\":\"example.com\",\"origin\":\"https://example.com\",\"policies\":{...},\"summary\":{...},\"verdicts\":[...],\"version\":\"mercury-robots-audit-v1\"}"
}
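
The `text` field is described as canonical JSON over `data` with keys sorted, so a client can check that the two agree before relying on the signature. The sketch below assumes keys are sorted recursively at every nesting level and that no whitespace is emitted; the preview only shows the top-level ordering, so treat both as assumptions. The helper names are hypothetical.

```javascript
// Serialize a JSON-compatible value with object keys in sorted order.
// Intended for values that came from JSON.parse (no undefined, no functions).
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

// True when the signed text is exactly the canonical form of data.
function textMatchesData(out) {
  return typeof out.text === "string" && out.text === canonicalize(out.data);
}
```

If the check fails, the safer reading is to trust `text` (the bytes that were signed) and parse `data` from it, rather than the other way round.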

Pay & call

Your agent calls the route; the 402 challenge carries the exact price ($0.005, USDC on Base mainnet); the x402 client settles via the CDP facilitator and retries. No key, no signup.

agent.mjs · x402
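import { wrapFetchWithPayment } from "x402-fetch";
import { privateKeyToAccount } from "viem/accounts";
// Assumption: the key is supplied via a PRIVATE_KEY env var (0x-prefixed hex); any viem account works.
const account = privateKeyToAccount(process.env.PRIVATE_KEY); // holds a little USDC on Base
const pay = wrapFetchWithPayment(fetch, account); // settles the 402 challenge, then retries
const res = await pay("https://network.mercury-hq.com/buy/robots?domain=example.com");
const out = await res.json(); // the result + `attestation` (the signed receipt)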

Prepaid alternative — the same route accepts an API key:

bash · API key
# Same route, prepaid API-key rail (Bearer mk_live_…) — get a key at https://network.mercury-hq.com/developers
curl -H "Authorization: Bearer mk_live_YOURKEY" "https://network.mercury-hq.com/buy/robots?domain=example.com"

Every paid call returns an EIP-191 signed receipt; verify it free at /x402/verify.

Verify the receipt

Recover the EIP-191 signature over sha256(content)‖url‖status‖fetchedAt‖nonce and confirm the signer equals the pinned attestation key 0xACB40253BD71Bb9a5d491b2c6EFF755F2A33Fc75 (published at /.well-known/mercury-attestation). No callback to Mercury — the receipt verifies offline, forever. Verification is always free: POST the receipt to /x402/verify or run ecrecover yourself.
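
Two pieces of the offline check can be sketched without guessing at details this page does not give. The code below computes sha256(content) and compares a recovered signer against the pinned key. It assumes `content` is the UTF-8 `text` field of the response. The byte-level encoding of the ‖ concatenation is not specified here, so assembling the signed message and running the signature recovery are deliberately left out; the function names are hypothetical.

```javascript
// Pinned attestation signer, as stated on this page.
const PINNED_SIGNER = "0xACB40253BD71Bb9a5d491b2c6EFF755F2A33Fc75";

// Addresses are compared case-insensitively: mixed case is only a checksum.
function isPinnedSigner(recovered) {
  return (
    typeof recovered === "string" &&
    recovered.toLowerCase() === PINNED_SIGNER.toLowerCase()
  );
}

// sha256(content) as lowercase hex, using Node's built-in crypto module.
async function contentDigestHex(content) {
  const { createHash } = await import("node:crypto");
  return createHash("sha256").update(content, "utf8").digest("hex");
}
```

Once the signer has been recovered (with an EIP-191 library of your choice, or via /x402/verify), `isPinnedSigner` is the final comparison; a mismatch means the receipt should not be trusted.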

Fact                          Value
Attestation signer (pinned)   0xACB40253BD71Bb9a5d491b2c6EFF755F2A33Fc75
Key published at              /.well-known/mercury-attestation
Live verifier (free)          /x402/verify
Settlement                    real USDC on Base mainnet (eip155:8453) via CDP, auditable on BaseScan

More: all services · /catalog · the headline web-fetch · agent twin of this page: GET /university/docs/cited-robots?format=md