There is no universal agreement yet; each crawler may or may not respect it. Some LLM providers document that their bots check robots.txt, and sometimes a specific ai.txt file, but llms.txt is not an official convention. robots.txt remains the authority.
For decades, websites have relied on robots.txt as the rulebook for crawlers. It tells search engines and bots where they're welcome and where they're not. If you want to block access, limit indexing, or set clear boundaries, this is still the file that matters.
Every major crawler — including AI bots like OpenAI’s GPTBot — checks it.
Learn more about robots.txt: developers.google.com
OpenAI GPTBot crawler rules: openai.com
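For example, a site that wants to keep GPTBot out while leaving other crawlers alone can say so directly in robots.txt. This is a minimal sketch; GPTBot is the user-agent token OpenAI documents, and the rules here are placeholders to adapt to your own site.

```
# Block OpenAI's GPTBot from the entire site.
User-agent: GPTBot
Disallow: /

# All other crawlers remain unrestricted.
User-agent: *
Disallow:
```

Whether a given AI crawler actually honors these rules is up to that crawler, which is why the documentation links above are worth checking.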
Despite the name similarity, llms.txt is not a replacement or extension of robots.txt. It doesn't block crawlers, dictate indexing, or restrict access. Instead, it acts more like a menu: a curated map that guides AI models straight to the most valuable content without making them dig through the entire site.
Think of it as a hand-crafted sitemap for AI tools rather than a set of crawling rules.
Proposal and format: llmstxt.org
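To make the "menu" idea concrete, here is a minimal sketch of what an llms.txt file looks like under the llmstxt.org proposal: a Markdown file served at the site root, with a title, a short summary, and curated links. The site name and URLs below are placeholders, not part of the proposal itself.

```
# Example Docs

> A short plain-language summary of the site, aimed at LLMs fetching /llms.txt.

## Docs
- [Quickstart](https://example.com/docs/quickstart.md): how to install and run
- [API reference](https://example.com/docs/api.md): endpoints and parameters

## Optional
- [Changelog](https://example.com/changelog.md): release history
```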
What’s happening today
Here’s the catch: this is not a standard. No major LLM actually looks for llms.txt today. Adoption is tiny, and most of the buzz comes from SEO discussions and experiments.
In practice, AI crawlers still follow the same norm as search engines: they check your robots.txt. So while llms.txt is an interesting idea and might grow into something useful, for now it’s more hype than reality.
If you want to control or guide AI today, the only file that truly matters is still robots.txt.