# robots.txt API
> Fetch and evaluate any website's robots.txt.
>
> Pass a URL and a user-agent to the check endpoint and it tells you whether that URL is crawlable. It selects the most-specific user-agent group and applies the RFC 9309 longest-match Allow/Disallow rules (with `*` and `$` wildcards; Allow wins ties), then returns the matched rule, the group's crawl-delay and the sitemaps the site declares.
>
> The parse endpoint returns the whole file structured into per-user-agent groups (their allow and disallow lists and crawl-delay) plus the list of sitemaps.
>
> A missing robots.txt (404/403) is treated as allowing everything, which RFC 9309 permits for an unavailable file. The request is made server-side, and private or internal targets are refused (SSRF-guarded).
>
> Built for SEO audits, crawler and scraper compliance, sitemap discovery and pre-flight "am I allowed to fetch this?" checks. This is a robots.txt evaluator, distinct from the on-page SEO audit (seo), the XML toolkit (xml) and link unfurling/preview (url). No upstream key, no cache.

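The longest-match rule described above can be illustrated with a small local sketch. It is not the service's implementation: the helper names `pattern_to_regex` and `is_allowed` are hypothetical, and measuring specificity by pattern length is a common reading of RFC 9309 that this example assumes.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, allow, disallow):
    """Return (allowed, matched_rule) for the rules of one user-agent group."""
    best_len, best_allowed, best_rule = -1, True, None
    for allowed, rules in ((True, allow), (False, disallow)):
        for rule in rules:
            # re.match anchors at the start of the path; empty rules match nothing.
            if not rule or not pattern_to_regex(rule).match(path):
                continue
            longer = len(rule) > best_len
            tie_for_allow = len(rule) == best_len and allowed and not best_allowed
            if longer or tie_for_allow:
                best_len, best_allowed, best_rule = len(rule), allowed, rule
    # With no matching rule the path is allowed by default.
    return best_allowed, best_rule
```

For example, with `Allow: /maps/$` and `Disallow: /maps`, the path `/maps/` is allowed (the longer pattern wins) while `/maps/place` is disallowed.
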
## Authentication
All requests require your oanor API key in the `x-oanor-key` header. Get one at https://www.oanor.com/developer/keys.

```bash
curl -H "x-oanor-key: oanor_live_…" "https://api.oanor.com/robots-api/..."
```
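
The same header can be set from Python. The following is a minimal sketch using only the standard library; `build_request` is a hypothetical helper, and the key shown is a placeholder, not a real credential.

```python
import urllib.request

BASE_URL = "https://api.oanor.com/robots-api"

def build_request(path, api_key):
    """Build (but do not send) a GET request carrying the API key header."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"x-oanor-key": api_key},
        method="GET",
    )

req = build_request("/v1/meta", "oanor_live_example")  # placeholder key
# To send it: urllib.request.urlopen(req, timeout=10)
```

Note that `urllib` normalizes the stored header name to `X-oanor-key`; HTTP header names are case-insensitive, so this is equivalent.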

## Pricing
- **Free** (free): 2,000 calls/month, 2 req/s
- **Starter** ($6/month): 37,500 calls/month, 8 req/s
- **Pro** ($20/month): 202,000 calls/month, 20 req/s
- **Mega** ($52/month): 785,000 calls/month, 50 req/s
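
To stay under a plan's per-second limit, a client can space its own calls. The sketch below is one simple way to do that, assuming a fixed minimum interval is acceptable; `Throttle` is a hypothetical helper, and how the API itself reports an exceeded limit is not documented here.

```python
import time

class Throttle:
    """Space calls at least 1/rate seconds apart (client-side only)."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next call is permitted; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

On the Free plan one would create `Throttle(2)` and call `wait()` before each request.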

## Endpoints

### robots.txt

#### `GET /v1/check` — Is a URL crawlable?

**Parameters:**
- `url` (query, required, string): The URL to check. Example: `https://www.google.com/`
- `user_agent` (query, optional, string): The user-agent to evaluate (default `*`). Example: `Googlebot`

**Example:**
```bash
curl -H "x-oanor-key: $KEY" \
  "https://api.oanor.com/robots-api/v1/check?url=https%3A%2F%2Fwww.google.com%2F&user_agent=Googlebot"
```
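
The `url` parameter must be percent-encoded, as in the curl example. A small sketch of building the same query string in Python follows; `check_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.oanor.com/robots-api"

def check_url(target, user_agent=None):
    """Return the /v1/check URL with percent-encoded query parameters."""
    params = {"url": target}
    if user_agent is not None:
        # Omitting user_agent leaves the API's documented default (*) in effect.
        params["user_agent"] = user_agent
    return f"{BASE_URL}/v1/check?{urlencode(params)}"
```

Calling `check_url("https://www.google.com/", "Googlebot")` yields the URL used in the curl example.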

**Response:**
```json
{
    "data": {
        "url": "https://www.google.com/",
        "reason": "no matching rule — allowed",
        "allowed": true,
        "sitemaps": [
            "https://www.google.com/sitemap.xml"
        ],
        "robots_url": "https://www.google.com/robots.txt",
        "user_agent": "Googlebot",
        "crawl_delay": null,
        "matched_rule": null,
        "matched_agent_group": [
            "*",
            "yandex"
        ]
    },
    "meta": {
        "timestamp": "2026-06-01T23:40:40.296Z",
        "request_id": "a60059b2-544c-40e1-8c92-bd8812629c88"
    },
    "status": "ok",
    "message": "Crawlability checked",
    "success": true
}
```
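
A caller typically only needs a few of these fields. The sketch below reads them from a decoded response; `summarize_check` is a hypothetical helper, and the shape of an error response (`success` false with a `message`) is an assumption, since only a successful response is shown.

```python
def summarize_check(payload):
    """Turn a decoded /v1/check response into a one-line verdict."""
    if not payload.get("success"):
        # Assumed error shape: not documented in this section.
        return f"request failed: {payload.get('message', 'unknown error')}"
    data = payload["data"]
    verdict = "allowed" if data["allowed"] else "disallowed"
    rule = data["matched_rule"] or "no matching rule"
    delay = data["crawl_delay"]
    delay_text = "no crawl-delay" if delay is None else f"crawl-delay {delay}"
    return f"{data['url']} is {verdict} for {data['user_agent']} ({rule}, {delay_text})"
```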

#### `GET /v1/parse` — Parse a site's robots.txt

**Parameters:**
- `url` (query, required, string): The site URL or domain. Example: `https://www.google.com`

**Example:**
```bash
curl -H "x-oanor-key: $KEY" \
  "https://api.oanor.com/robots-api/v1/parse?url=https%3A%2F%2Fwww.google.com"
```

**Response:**
```json
{
    "data": {
        "found": true,
        "groups": [
            {
                "allow": [
                    "/search/about",
                    "/search/howsearchworks",
                    "/?hl=",
                    "/?hl=*&gws_rd=ssl$",
                    "/?gws_rd=ssl$",
                    "/?pt1=true$",
                    "/m/finance",
                    "/books/about",
                    "/books?*zoom=1",
                    "/books?*zoom=5",
                    "/books/content?*zoom=1",
                    "/books/content?*zoom=5",
                    "/citations?user=",
                    "/citations?view_op=new_profile",
                    "/citations?view_op=top_venues",
                    "/scholar_share",
                    "/maps?daddr=",
                    "/maps?entry=wc",
                    "/maps?f=",
                    "/maps?hl=",
                    "/maps?q=",
                    "/maps?saddr=",
                    "/maps?sid=",
                    "/maps?*output=classic",
                    "/maps?*file=",
                    "/maps/$",
                    "/maps/@",
                    "/maps/?daddr=",
                    "/maps/?entry=wc",
                    "/maps/?f=",
                    "/maps/?hl=",
                    "/maps/?q=",
                    "/maps/?saddr=",
                    "/maps/?sid=",
                    "/maps/api/staticmap?",
                    "/maps/search/",
                    "/maps/sitemap.xml
…(truncated, see openapi.json for full schema)
```
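
Because the sample above is truncated, the following is only a sketch of walking the parsed groups. `found`, `groups` and `allow` appear in the sample; the `disallow` field name is an assumption based on the endpoint description, and `rule_counts` is a hypothetical helper. See openapi.json for the actual schema.

```python
def rule_counts(payload):
    """Count allow/disallow rules per group in a decoded /v1/parse response."""
    data = payload.get("data", {})
    if not data.get("found"):
        return []
    return [
        {
            "allow": len(group.get("allow", [])),
            # "disallow" is an assumed field name (not visible in the sample).
            "disallow": len(group.get("disallow", [])),
        }
        for group in data.get("groups", [])
    ]
```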

### Meta

#### `GET /v1/meta` — Spec

**Example:**
```bash
curl -H "x-oanor-key: $KEY" \
  "https://api.oanor.com/robots-api/v1/meta"
```

**Response:**
```json
{
    "data": {
        "note": "Fetch and evaluate a website's robots.txt. /v1/check?url=https://example.com/path&user_agent=Googlebot tells you whether that URL is crawlable by that user-agent — selecting the most-specific user-agent group and applying RFC 9309 longest-match Allow/Disallow rules with * and $ wildcards (Allow wins ties), and returning the matched rule, the group's crawl-delay and the declared sitemaps. /v1/parse returns the whole file parsed into per-user-agent groups (allow/disallow lists, crawl-delay) and the sitemap list. A missing robots.txt (404/403) means everything is allowed, per spec. The request is made server-side; private/internal targets are refused (SSRF-guarded). Ideal for SEO audits, crawler and scraper compliance, sitemap discovery and pre-flight 'can I fetch this?' checks. A robots.txt evaluator — distinct from the on-page SEO audit (seo), XML tooling (xml) and link unfurling (url). No key, no cache.",
        "spec": "RFC 9309 (robots.txt)",
        "endpoints": [
            "/v1/check",
            "/v1/parse",
            "/v1/meta"
        ]
    },
    "meta": {
        "timestamp": "2026-06-01T23:40:40.518Z",
        "request_id": "a4cbd306-e093-4e36-84eb-a2e1d3ab7eca"
    },
    "status": "ok",
    "message": "Meta retrieved",
    "success": true
}
```


---
Marketplace page: https://www.oanor.com/api/robots-api
OpenAPI spec: https://www.oanor.com/api/robots-api/openapi.json
