Crawler Documentation
OctogenBot
Last updated: May 20, 2026
OctogenBot is the public crawler identity used by Octogen Systems, Inc. Our customers are brands, retailers, and agentic-shopping applications. Octogen's crawler reads ecommerce product pages. The platform then restructures the product data into a common, enriched schema so that LLMs can more easily understand the crawled catalog.
User Agent
Requests signed with Octogen's Web Bot Auth identity use this user-agent string:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OctogenBot/1.0; +https://octogen.ai/bots
Unsigned crawler traffic may use ordinary browser user agents. The OctogenBot identity is reserved for configured Web Bot Auth traffic.
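As a rough illustration, a site operator could flag requests that claim the OctogenBot identity by looking for the OctogenBot/<version> token in the user-agent header. This is a minimal sketch: the helper name and the regular expression are assumptions made for this example, not part of Octogen's tooling. A user-agent string is only a claim and can be spoofed, so a match should be paired with signature verification.

```python
import re
from typing import Optional

# Assumption: the "OctogenBot/<version>" product token identifies the crawler,
# as in the user-agent string shown above. The helper name is hypothetical.
OCTOGENBOT_TOKEN = re.compile(r"\bOctogenBot/(\d+(?:\.\d+)*)")


def claimed_octogenbot_version(user_agent: Optional[str]) -> Optional[str]:
    """Return the OctogenBot version a request claims, or None if absent."""
    match = OCTOGENBOT_TOKEN.search(user_agent or "")
    return match.group(1) if match else None
```

A request whose user agent returns None here may still be unsigned Octogen traffic, since unsigned crawls may use ordinary browser user agents.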
Bot Details
Verification
Octogen uses Cloudflare Web Bot Auth for eligible crawls. Signed requests include HTTP Message Signature headers and a Signature-Agent header that points to Octogen's public key directory.
https://bots.octogen.ai/.well-known/http-message-signatures-directory
Site operators should prefer Web Bot Auth verification over static IP allowlists, since crawl egress can change as infrastructure changes.
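The sketch below shows a cheap pre-check an operator might run before full verification: it confirms that the HTTP Message Signature headers (Signature and Signature-Input) and the Signature-Agent header are present, and that Signature-Agent names the bots.octogen.ai host. The function name is hypothetical, and the code assumes the Signature-Agent value is a URL, optionally wrapped in double quotes. It does not verify the signature cryptographically. Real verification means fetching the key directory and validating the signature, or relying on a provider that does so.

```python
from typing import Mapping
from urllib.parse import urlparse

# Host of the public key directory named in the documentation above.
OCTOGEN_KEY_DIRECTORY_HOST = "bots.octogen.ai"
REQUIRED_HEADERS = ("signature", "signature-input", "signature-agent")


def looks_like_signed_octogen_request(headers: Mapping[str, str]) -> bool:
    """Pre-check only: header presence and Signature-Agent host.

    This does NOT validate the signature itself.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if any(name not in lowered for name in REQUIRED_HEADERS):
        return False
    # Assumption: the value is a URL, possibly quoted as a structured-field string.
    agent = lowered["signature-agent"].strip().strip('"')
    return urlparse(agent).hostname == OCTOGEN_KEY_DIRECTORY_HOST
```

Requests that pass this check are only candidates for verification. Requests that fail it can be treated as unsigned traffic.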
Crawler Purpose
OctogenBot reads publicly available ecommerce pages so Octogen can understand how AI shopping agents interpret product catalogs, identify missing product data, and help merchants make their catalogs easier for agents to understand.
- Public product page URLs and page metadata
- Product names, descriptions, images, prices, availability, variants, and attributes
- Public structured data such as Schema.org Product markup
- Public sitemap and robots.txt signals used to scope crawl behavior
Boundaries
- OctogenBot does not try to access authenticated, paywalled, or private content.
- OctogenBot does not collect personal shopper account data from retailer sites.
- OctogenBot does not use its signed bot identity unless the crawl is configured for Web Bot Auth.
- OctogenBot is intended for commercial product-data workflows, not search-engine indexing.
Robots.txt
To block OctogenBot, add this rule to your robots.txt file:
User-agent: OctogenBot
Disallow: /
To limit crawling to public product pages, use path-specific rules:
User-agent: OctogenBot
Disallow: /account/
Disallow: /checkout/
Crawl-delay: 10
Contact
For crawl questions, allowlist requests, robots.txt issues, or urgent operational concerns, email crawler@octogen.ai.
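Before emailing about a robots.txt issue, it may help to check how the path-specific rules from the Robots.txt section are parsed. The sketch below uses Python's standard urllib.robotparser module. The sample paths and the helper name are assumptions for illustration, and the result shows how this parser reads the rules, not a guarantee of how any particular crawler applies them.

```python
from urllib.robotparser import RobotFileParser

# The path-specific rules from the Robots.txt section, inlined for the example.
ROBOTS_TXT = """\
User-agent: OctogenBot
Disallow: /account/
Disallow: /checkout/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def octogenbot_may_fetch(path: str) -> bool:
    """Return True if the parsed rules allow OctogenBot to fetch the path."""
    return parser.can_fetch("OctogenBot", path)


# Crawl-delay value parsed for the OctogenBot group.
crawl_delay = parser.crawl_delay("OctogenBot")

if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for path in ("/products/example-item", "/account/orders", "/checkout/cart"):
        print(path, octogenbot_may_fetch(path))
    print("crawl delay:", crawl_delay)
```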