llms.txt
An llms.txt file lists your canonical URLs so AI answer engines (Perplexity, ChatGPT search, Claude) index the right pages first. The convention emerged in late 2024 as a complement to robots.txt and the XML sitemap, shaped for how language-model crawlers actually read a site.
What it means in operation
The file lives at mobitaste.com/llms.txt. Unlike an XML sitemap, it is plain-text Markdown, grouped by topic, with a short description per URL. A typical block lists the homepage, the pricing page, the feature deep-dives, a few guides, and the glossary index, with one sentence each so the engine knows what to expect at each URL. MobiTaste also publishes a fuller llms-full.txt that includes excerpted content per URL, so engines that prefer ingesting context can fetch a single file instead of crawling page by page. Both files are generated at build time from the same URL registry that feeds the sitemap.
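As a rough illustration of that build step, here is a minimal Python sketch that renders an llms.txt body from a URL registry. It is not MobiTaste's actual generator: the Page type, the render_llms_txt function, the registry entries, and every URL path other than the site root are hypothetical, chosen only to show the topic grouping and the one-sentence descriptions. The layout (a title line, a quoted summary, then one section per topic with a link list) follows the Markdown shape the llms.txt proposal describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One registry entry. Field names are illustrative, not a real schema."""
    topic: str        # section the page is grouped under
    title: str
    url: str
    description: str  # one sentence telling the engine what to expect


def render_llms_txt(site_name: str, summary: str, pages: list[Page]) -> str:
    """Render pages as Markdown, grouped by topic in first-seen order."""
    groups: dict[str, list[Page]] = {}
    for page in pages:
        groups.setdefault(page.topic, []).append(page)

    lines = [f"# {site_name}", "", f"> {summary}"]
    for topic, members in groups.items():
        lines += ["", f"## {topic}", ""]
        lines += [f"- [{p.title}]({p.url}): {p.description}" for p in members]
    return "\n".join(lines) + "\n"


# Hypothetical registry; the paths below are placeholders, not real pages.
REGISTRY = [
    Page("Product", "Home", "https://mobitaste.com/",
         "Overview of the product."),
    Page("Product", "Pricing", "https://mobitaste.com/pricing",
         "Plans and what each one includes."),
    Page("Guides", "Getting started", "https://mobitaste.com/guides/start",
         "Step-by-step setup for a new account."),
    Page("Reference", "Glossary", "https://mobitaste.com/glossary",
         "Index of defined terms."),
]

if __name__ == "__main__":
    print(render_llms_txt("MobiTaste", "Canonical pages, grouped by topic.",
                          REGISTRY), end="")
```

Feeding the same registry to the sitemap generator keeps the two outputs from drifting apart. An llms-full.txt variant could reuse the grouping and append an excerpt under each link.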
Why it matters
AI answer engines are starting to drive a meaningful share of inbound traffic, and their crawling patterns differ from Google’s. They prefer concise, structured content, and they often request llms.txt before the sitemap. A site that publishes one tells the engine which pages are canonical, which are duplicates, and which are not worth surfacing. A site that does not leaves the engine to guess. For B2B SaaS, where buyers ask Perplexity or ChatGPT for comparisons, being the page the engine cites is worth the 30 minutes of setup.
Related terms
- robots.txt: the file llms.txt complements.
- Sitemap: the XML companion for search engines.
- Schema markup: the JSON-LD AI engines also read.