A spec-correct llms.txt file plus a matching robots.txt block for 10 AI bots. Honest note: llms.txt is a proposal, not a standard — but it costs nothing and AI tools are starting to read it.
Your most important pages — services, pricing, best guides. Blurb optional (falls back to the URL path).
Checked = allowed. Untick a bot to add a Disallow block for it in the robots.txt output.
llms.txt + robots.txt policy + the schema and page types AI engines actually cite — one checklist to make your site quotable by ChatGPT, Claude and Perplexity.
Name, URL + a one-line description — the H1 and blockquote of the llms.txt spec.
Up to 10 pages worth citing, plus allow/block per AI bot.
Upload llms.txt to the site root, append the robots.txt block to your existing robots.txt, then verify both in a browser.
A plain-markdown file at your site root (yoursite.com/llms.txt) that gives AI systems a curated summary of who you are and which pages matter. Proposed by Jeremy Howard in September 2024. Honest framing: it is still a proposal — no major AI vendor has committed to reading it — but it costs five minutes and several AI tools already check for it.
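To make the file format concrete, here is a minimal Python sketch that assembles an llms.txt from a name, a one-line description and a list of pages. It follows the shape of the proposal (an H1, a blockquote, then an H2 section of markdown links). The function name `build_llms_txt`, the "Key pages" heading, and the choice to use the URL path as link text when no blurb is given are my assumptions, not necessarily what this generator emits.

```python
from urllib.parse import urlparse

MAX_PAGES = 10  # the tool accepts up to 10 pages


def build_llms_txt(name, description, pages):
    """Return llms.txt content.

    pages is a list of (url, blurb) tuples; blurb may be None or "".
    """
    if len(pages) > MAX_PAGES:
        raise ValueError(f"At most {MAX_PAGES} pages, got {len(pages)}")

    lines = [f"# {name}", "", f"> {description}", "", "## Key pages", ""]
    for url, blurb in pages:
        # Assumption: with no blurb, fall back to the URL path as link text.
        link_text = blurb or urlparse(url).path or "/"
        lines.append(f"- [{link_text}]({url})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_llms_txt(
        "Acme Anvils",
        "We sell anvils and publish anvil buying guides.",
        [
            ("https://yoursite.com/pricing/", None),
            ("https://yoursite.com/guides/anvils/", "Anvil buying guide"),
        ],
    ))
```

Save the output as llms.txt. The site name, description and URLs above are placeholders.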
robots.txt controls ACCESS — which bots may crawl which paths. llms.txt is a DESCRIPTION — a readable summary pointing AI systems at your best pages. They are complementary, which is why this tool generates both: the llms.txt file plus a matching robots.txt AI-crawler block.
No, blocking GPTBot does not affect your Google rankings. Googlebot is a separate crawler and is unaffected by GPTBot rules. Even Google-Extended only controls Gemini training; blocking it does not touch Google Search rankings or AI Overviews eligibility.
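If you want to see that per-bot rules really are independent, Python's standard-library robots.txt parser gives a rough local check. This is a sketch: real crawlers each apply their own matching logic, so treat the result as an approximation. The sample rules and the `is_allowed` helper are illustrative, not taken from any live site.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: two AI bots blocked, nothing said about Googlebot.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url="https://yoursite.com/pricing/"):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Google-Extended", "Googlebot"):
        print(agent, "allowed" if is_allowed(ROBOTS_TXT, agent) else "blocked")
```

With these rules the two named AI bots are reported as blocked, while Googlebot, which no group mentions, stays allowed.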
GPTBot (OpenAI training), OAI-SearchBot (ChatGPT search), ChatGPT-User (on-demand fetches), ClaudeBot + Claude-User (Anthropic), PerplexityBot (Perplexity), Google-Extended (Gemini training), CCBot (Common Crawl), Bytespider (ByteDance) and Applebot-Extended (Apple Intelligence training). Each is labelled with what it powers so you can decide per-bot.
Should you block AI bots at all? My take: usually no. AI answers are becoming a referral channel: when ChatGPT or Perplexity cites you, that is a warm visitor. If you must protect content, block the training-only bots (GPTBot, CCBot, Bytespider) and keep the search and user-request bots open so you stay citable.
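The "block training-only, keep the rest open" policy can be sketched in a few lines of Python. The bot names come from the list above; the `robots_block` helper and the choice to write an explicit Allow group for every open bot are my assumptions about a reasonable output, not a statement of what this generator produces.

```python
# The 10 user-agents covered by the tool.
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "Claude-User",
    "PerplexityBot", "Google-Extended", "CCBot", "Bytespider",
    "Applebot-Extended",
]

# Bots named above as training-only.
TRAINING_ONLY = {"GPTBot", "CCBot", "Bytespider"}


def robots_block(blocked):
    """Build a robots.txt block with one group per bot.

    Bots in `blocked` get "Disallow: /"; every other bot gets "Allow: /".
    """
    unknown = set(blocked) - set(AI_BOTS)
    if unknown:
        raise ValueError(f"Unknown bots: {sorted(unknown)}")

    groups = []
    for bot in AI_BOTS:
        rule = "Disallow: /" if bot in blocked else "Allow: /"
        groups.append(f"User-agent: {bot}\n{rule}")
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(robots_block(TRAINING_ONLY))
```

Append the printed block to your existing robots.txt rather than replacing the file, so rules for other crawlers stay in place.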
Site root, so it resolves at yoursite.com/llms.txt, served as plain text. WordPress on shared hosting: upload to public_html via file manager or FTP. Shopify is the awkward one — you cannot write root files directly, so you need an app or an edge/proxy rule.
Open yoursite.com/llms.txt in a browser or run curl -sL https://yoursite.com/llms.txt (the -L follows an http-to-https or www redirect). You should see the markdown, not a 404 or an HTML page. For the robots.txt block, recheck yoursite.com/robots.txt and run it through a robots.txt tester for one of the listed user-agents.
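The same check can be scripted. The sketch below fetches the file with Python's standard library and flags the three failure modes mentioned above: a non-200 status, an HTML response, and a body that does not start with a markdown H1. The helper names `check_llms_txt` and `fetch_and_check` are hypothetical, and the "starts with # " test is a simple heuristic rather than a full validation against the proposal.

```python
import urllib.error
import urllib.request


def check_llms_txt(status, content_type, body):
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    if status != 200:
        problems.append(f"HTTP status {status}, expected 200")
    if "html" in content_type.lower():
        problems.append(f"Content-Type is {content_type!r}, expected plain text")
    if not body.lstrip().startswith("# "):
        problems.append("Body does not start with a markdown H1 ('# Name')")
    return problems


def fetch_and_check(url):
    """Fetch url (following redirects) and run check_llms_txt on the response."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            # utf-8-sig drops a byte-order mark if the file was saved with one.
            body = resp.read().decode("utf-8-sig", errors="replace")
            content_type = resp.headers.get("Content-Type", "")
            return check_llms_txt(resp.status, content_type, body)
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx, so report the status as the only problem.
        return [f"HTTP status {err.code}, expected 200"]


if __name__ == "__main__":
    # Replace with your own domain before running.
    for problem in fetch_and_check("https://yoursite.com/llms.txt"):
        print("PROBLEM:", problem)
```

No output means none of the three checks failed.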
No, llms.txt does not replace schema markup. Schema (JSON-LD) is structured data that search engines parse on each page; llms.txt is a site-level summary aimed at AI assistants. Do both, because they answer different questions about your site.
Mandeep Singh, Sprout Sage Solutions. I run AI-visibility work for medspa and Shopify clients, and writing llms.txt + robots.txt policies by hand got old — so I built the generator I use myself.
I install AI receptionists, no-show recovery flows, and review automation for medspas, dental, and aesthetic clinics. Six flows. 60 days. Average client lift: 30% revenue.