ChiroBot - Website Indexing Guide

Achiral AI is a privacy-first AI platform for businesses with enterprise-grade, self-hosted, secure infrastructure. ChiroBot is Achiral's intelligent web crawler that indexes your organization's website to provide contextual knowledge to your Chiro AI assistant.

Overview

When you provision Chiro for your organization, ChiroBot automatically crawls your website to understand:

  • Your products and services
  • Your industry and market
  • Your company size and structure
  • Key personnel and team information
  • Target markets and customer segments

Chiro then uses this intelligence to give your team more relevant, context-aware responses.

Intelligence vs. Capabilities

ChiroBot provides intelligence—the knowledge and context about your organization. This is different from capabilities, which define what Chiro can do:

  • Intelligence (from ChiroBot): What Chiro knows about your company
  • Capabilities: What Chiro can do and how it behaves

Think of intelligence as the foundation and capabilities as the skills built on top of it. For more about capabilities, see Assistant Capabilities.

Crawling Behavior

  • Frequency: ChiroBot crawls your website once during initial provisioning, then refreshes every 90-120 days
  • Scope: Up to 20 pages per site, maximum depth of 3 levels
  • Rate Limiting: 1 request per second to minimize server impact
  • User Agent: ChiroBot/1.0 (+https://achiral.ai/chirobot)
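
The limits above can be pictured as a bounded breadth-first crawl. The following is a minimal sketch, not ChiroBot's actual implementation: the function name `plan_crawl`, the `get_links` callback, and the choice to count the homepage as depth 0 are all assumptions made for illustration.

```python
import time
from collections import deque

MAX_PAGES = 20       # "Up to 20 pages per site"
MAX_DEPTH = 3        # "maximum depth of 3 levels" (homepage counted as depth 0: an assumption)
DELAY_SECONDS = 1.0  # "1 request per second"


def plan_crawl(start, get_links, max_pages=MAX_PAGES, max_depth=MAX_DEPTH,
               delay=DELAY_SECONDS, sleep=time.sleep):
    """Return the pages a bounded breadth-first crawl would visit, in order.

    get_links is a hypothetical callback that returns the links found on a page.
    """
    visited = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        if visited:
            sleep(delay)  # pause between consecutive requests
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Under these assumptions, pages that are only reachable through more than three clicks from the homepage, or that fall outside the first 20 pages discovered, would not be indexed.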

Whitelisting ChiroBot

If your website has security restrictions, you may need to whitelist ChiroBot to ensure successful indexing.

robots.txt

Add the following to your robots.txt file:

# Allow Achiral's ChiroBot
User-agent: ChiroBot
Allow: /
# Optional: Set crawl delay (in seconds)
Crawl-delay: 1

To allow ChiroBot while blocking other bots:

# Block all bots by default
User-agent: *
Disallow: /

# Allow ChiroBot specifically
User-agent: ChiroBot
Allow: /
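
Before deploying either variant, the rules can be checked locally. The sketch below uses Python's standard `urllib.robotparser`; it is only one parser implementation, so ChiroBot's own matching may differ in edge cases. The page URL and the `OtherBot` name are placeholders. Note that the `Crawl-delay` line is kept in the same group as the `User-agent` line, because this parser treats a blank line as the end of a group.

```python
from urllib.robotparser import RobotFileParser

# First variant: allow ChiroBot and suggest a crawl delay.
ALLOW_CHIROBOT = """\
# Allow Achiral's ChiroBot
User-agent: ChiroBot
Allow: /
Crawl-delay: 1
"""

# Second variant: block every bot except ChiroBot.
ONLY_CHIROBOT = """\
User-agent: *
Disallow: /

User-agent: ChiroBot
Allow: /
"""

PAGE = "https://www.example.com/products"  # placeholder URL


def parse_robots(text):
    """Parse robots.txt content supplied as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(text, user_agent, url):
    """Report whether user_agent may fetch url under the given rules."""
    return parse_robots(text).can_fetch(user_agent, url)
```

With these inputs, `is_allowed(ONLY_CHIROBOT, "ChiroBot", PAGE)` is `True`, `is_allowed(ONLY_CHIROBOT, "OtherBot", PAGE)` is `False`, and `parse_robots(ALLOW_CHIROBOT).crawl_delay("ChiroBot")` is `1`.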

Content Security Policy (CSP)

ChiroBot renders pages in a real browser (headless Chrome driven by Puppeteer), so standard CSP headers should not interfere with crawling. No special configuration is typically needed.

Cloudflare & Edge Protection

If your site is behind Cloudflare or similar edge protection, use one of the following options.

Option 1: Firewall Rule

Create a Cloudflare firewall rule to allow ChiroBot:

  1. Go to Cloudflare Dashboard → Security → WAF
  2. Create a new firewall rule:
    • Rule name: Allow ChiroBot
    • Field: User Agent
    • Operator: Contains
    • Value: ChiroBot
    • Action: Allow

Option 2: Bypass Bot Fight Mode

  1. Go to Cloudflare Dashboard → Security → Bots
  2. Add ChiroBot to your Verified Bots list (if using Cloudflare for SaaS)
  3. Or temporarily disable Bot Fight Mode during crawl periods

Option 3: IP Allowlist

Contact Achiral support to get ChiroBot's IP addresses and add them to your allowlist.

Other Security Systems

For other security systems (Imperva, Akamai, AWS WAF, etc.), add a rule to allow the user agent ChiroBot/1.0 to access your site.

Troubleshooting

Crawl Failed

If ChiroBot cannot successfully crawl your website, you'll see an error in your organization dashboard. Common causes:

  • robots.txt blocking: Verify ChiroBot is allowed in your robots.txt
  • Cloudflare challenge: Add a firewall rule to allow ChiroBot
  • Authentication required: ChiroBot cannot crawl password-protected pages
  • JavaScript-heavy site: Contact support for manual review
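
One rough way to narrow down these causes is to request a page with ChiroBot's user agent string and look at the status code that comes back. The sketch below uses only the Python standard library; the helper names `chirobot_request` and `probe` are hypothetical, and the 30-second timeout simply mirrors the page-load limit mentioned in this guide. A firewall may also filter on IP address or browser behavior, so a successful response here does not guarantee that ChiroBot itself gets through.

```python
import urllib.error
import urllib.request

# User agent string as documented in this guide.
CHIROBOT_UA = "ChiroBot/1.0 (+https://achiral.ai/chirobot)"


def chirobot_request(url):
    """Build a GET request that identifies itself with ChiroBot's user agent."""
    return urllib.request.Request(url, headers={"User-Agent": CHIROBOT_UA})


def probe(url, timeout=30):
    """Return the HTTP status code the site sends for a ChiroBot-like request."""
    try:
        with urllib.request.urlopen(chirobot_request(url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report the code instead.
        return err.code
```

A `403` or `503` from `probe(...)` while the same page loads normally in a browser suggests a firewall or bot-protection rule, whereas a `401` points to authentication.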

Partial Crawl

If only some pages were indexed:

  • Check that all important pages are linked from your homepage or sitemap
  • Verify no broken links or redirects
  • Ensure pages load within 30 seconds
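
To check the first point, you can list the same-site links present in your homepage HTML and compare them with the pages you expect to be indexed. This is a minimal sketch using the Python standard library; `internal_links` and `missing_pages` are hypothetical helpers, and it only sees links in the static HTML, not links added later by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute, same-site links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        same_site = urlparse(absolute).netloc == urlparse(self.base_url).netloc
        if same_site and absolute not in self.links:
            self.links.append(absolute)


def internal_links(html, base_url):
    """Return same-site links found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def missing_pages(html, base_url, important_paths):
    """Return the important paths that the page does not link to."""
    linked = {urlparse(link).path for link in internal_links(html, base_url)}
    return [path for path in important_paths if path not in linked]
```

Any path returned by `missing_pages` is a candidate to add to your homepage navigation or sitemap.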

Manual Refresh

To trigger a manual crawl:

  1. Go to your organization Settings → AI Configuration
  2. Click Refresh Website Intelligence
  3. ChiroBot will re-crawl your site within a few minutes

Privacy & Security

  • ChiroBot only crawls publicly accessible pages (no authentication)
  • Crawled content is stored securely in your organization's isolated Weaviate tenant
  • Content is never shared with other organizations
  • You can request deletion of indexed content at any time

Contact Support

If you need assistance with ChiroBot, contact Achiral support.

Technical Details

  • Technology: Puppeteer (headless Chrome)
  • Fallback: Simple HTTP fetch for static sites
  • User Agent: ChiroBot/1.0 (+https://achiral.ai/chirobot)
  • Respect for Standards: Honors robots.txt, rate limits, and crawl delays
  • Open Source: ChiroBot's crawler code is available in our GitHub repository