
llms.txt – What is it? Why do we need it?
Well …
You might have missed it, but the Large Language Models (LLMs) have landed: Google AI Search, ChatGPT, Claude, Gemini and their kin.
In all the fuss, the idea that these new constructs are genuinely generative – that is, they are creating something genuinely new – has fallen by the wayside somewhat. Whether anything is entirely new is beyond the scope of this post to determine, but it is abundantly clear that, in the process of generating content, the LLMs are happy to make free with your existing, hard-wrought content and thinly re-interpret it as “Generative Content”.
This is not ideal. As things stand, although Google AI Overviews are currently allowed in certain EU countries, copyright concerns mean they face ongoing scrutiny and legal challenges under various EU regulations. The UK has blithely accepted this challenge to creators’ copyright without a murmur.
Personally, I wish things were otherwise.
Also, AI is not as “I” as it purports to be (yet):
- We’ve seen it entirely contradict the intent of the text it interprets.
- ChatGPT 4.0 was reputed to be wildly inaccurate with its citations – the numbers have yet to arrive for ChatGPT 5.
- LLM Hallucinations persist and proliferate – just ask ChatGPT to do some maths!
All these things are improving, but as of now issues of copyright, sustainability and accuracy mean a content creator’s relationship with these beasts should, at best, be cautious.
But they are here, like tanks on the lawn. We cannot make them go away. You can use robots.txt to keep them out of your content: it might work. But, like it or not, AI is the future of search:
Increase in use of ChatGPT as primary search platform in the 12 months to March 25
AI overviews mean that more searches are completed without users clicking through to other websites.
AI searches have reduced organic web traffic by somewhere between 15% and 25%.
In the future, if you want website visibility, you’ll need to get on with those tanks. If you cannot beat the LLMs, the best you can do is accommodate them and work with them.
llms.txt looks like the first step in that process.
In this post, we’ll explore what llms.txt is, what it does, and why you should seriously consider using it.
What is llms.txt?
The llms.txt file is a plain text file placed at the root of a website (i.e., https://yourdomain.com/llms.txt) that provides guidance to AI models and agents about how they may interact with your website’s content.
It was proposed by leaders in the AI and internet governance space – most notably from projects such as the LLM Permissions Framework – as a way to give publishers and site owners more agency over how their content is used by AI tools.
The inspiration comes directly from robots.txt, the decades-old standard used by websites to instruct search engine bots (like Googlebot) on what to crawl and what to avoid. Similarly, llms.txt aims to become a machine-readable but human-understandable declaration of your preferences, telling AI systems how to use, summarise or reference your content.
What Does llms.txt Do?
The llms.txt file sets policies and preferences to dictate how your content should be accessed, used and represented by LLMs. It cannot enforce these rules by itself (just as robots.txt doesn’t physically block crawlers), but instead acts as a clear signal to responsible AI providers.
Some of the things you can do with llms.txt include:
- Allow or disallow AI access to parts of your site.
- Specify attribution requirements, e.g. how a model should cite or link back to your content.
- Declare commercial usage terms, e.g. whether your content may be used to train models for profit.
- Provide metadata about your site’s content, such as licensing or ownership.
- Set preferred summaries or descriptions of your brand, product, or service.
- Link to other rights-related documents, like terms of service or copyright notices.
Here’s a basic example of what an llms.txt file might look like:
# llms.txt for example.com
llm-access: disallow
summary: "Example.com is a leading source of independent tech news and reviews."
attribution: required
link: https://example.com/terms
This tells LLMs: don’t use our content unless attribution is provided, here’s how to summarise us, and read our terms for more information.
Here’s ours.
It’s important to note that llms.txt is voluntary – AI companies aren’t legally required to comply (yet), but many responsible providers are starting to pay attention.
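To make the example file above a little more concrete, here is a minimal Python sketch of how a tool might read it. It assumes the simple `key: value` layout shown in the example, with `#` marking comments; the function name `parse_llms_txt` is our own invention for illustration, and the directive names are just those from the example, not a ratified syntax.

```python
def parse_llms_txt(text: str) -> dict[str, str]:
    """Read 'key: value' lines, skipping blank lines and # comments.

    Assumes the layout of the example file; this is an illustrative
    sketch, not an official parser.
    """
    directives: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in the value survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon, so not a directive line
        directives[key.strip().lower()] = value.strip().strip('"')
    return directives


EXAMPLE = '''# llms.txt for example.com
llm-access: disallow
summary: "Example.com is a leading source of independent tech news and reviews."
attribution: required
link: https://example.com/terms
'''

directives = parse_llms_txt(EXAMPLE)
print(directives["llm-access"])
print(directives["link"])
```

Splitting on the first colon only is the one design choice worth noting: it keeps the `https://` in the `link` value intact.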
Why Should You Use llms.txt?
Whether you run a blog, a business website, a news outlet or an e-commerce platform, there are compelling reasons to add llms.txt to your site today.
1. Control Over How Your Content is Represented
Without your input, AI systems might summarise or paraphrase your content inaccurately, out of context or in a way that doesn’t reflect your voice. With llms.txt, you can suggest preferred summaries or point to canonical descriptions, helping to ensure better representation across AI platforms.
2. Protect Your Intellectual Property
If your content is being used by AI tools – especially if it’s being included in training datasets – you deserve a say. The idea is that llms.txt will help you assert boundaries, especially regarding commercial use or derivative works. While it doesn’t enforce copyright, it strengthens your position if disputes arise later.
3. Support Transparency and Ethical AI
By using llms.txt, you are participating in a broader movement towards responsible AI. You’re making your preferences clear and supporting a more transparent, accountable system where content creators are respected – not just mined for data.
4. Reduce Misinformation or Hallucinations
AI models are prone to confidently asserting false information. If your website is summarised poorly or taken out of context, users may encounter misinformation. With llms.txt, you can reduce the chance of this happening by guiding how content is used and framed.
5. Easy to Implement
Unlike complex backend changes or costly integrations, llms.txt is simply a plain text file. Anyone with basic website access can implement it. There’s no API, no installation – just clarity.
6. Future-Proofing Your Website
LLMs are only going to grow in use and influence. Having an llms.txt file today better positions your site for the evolving ecosystem. Future regulations or standards may draw on this file, and early adopters will be ahead of the curve.
7. Signal to AI Developers
Many AI developers and platforms are actively looking for ways to respect publisher intent – don’t make them guess (or give them the excuse that they “had to guess”).
Adding an llms.txt tells them how to handle your content responsibly.
What llms.txt Cannot Do
To be realistic, there are limits to what llms.txt can achieve:
- It doesn’t guarantee enforcement. Bad actors may ignore it.
- It’s not a legal standard and it never will be. Though it’s a well-respected standard, robots.txt isn’t legally enforceable and that’s been in use since 1994! But active uptake of llms.txt could influence future policy.
- It relies on adoption by both content creators and AI developers to gain real traction.
But these limitations also applied to robots.txt, which is now ubiquitous. What starts as a social convention can quickly become a de facto standard.
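The robots.txt comparison is easy to see in code. The sketch below uses Python's standard `urllib.robotparser` to show what "voluntary" means in practice: a well-behaved crawler asks before it fetches, and nothing stops a badly behaved one from skipping the check. The robots.txt content, the helper name `may_crawl` and the `SomeOtherBot` user agent are assumptions made up for this illustration; `GPTBot` is used as an example of an AI crawler's user-agent token.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that turns one AI crawler away and allows the rest.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(user_agent: str, url: str) -> bool:
    """Return what a polite crawler would conclude before fetching url."""
    return parser.can_fetch(user_agent, url)


print(may_crawl("GPTBot", "https://example.com/blog/post"))
print(may_crawl("SomeOtherBot", "https://example.com/blog/post"))
```

The check lives entirely on the crawler's side, which is exactly why both files depend on goodwill and reputation rather than enforcement.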
If It’s Not Enforceable, Is It Worth It?
Well, there are no guarantees.
But the LLMs are ultimately owned by someone and reputations matter.
Technology may well be outpacing legislation but legislation will advance.
In any copyright dispute it is helpful to demonstrate that you have attempted to defend your copyright in the past. The exact benefits of acting now may not be entirely code based.
Robots.txt has been an enormously successful standard, almost universally observed; there is no reason to believe this won’t work just as well.

The chances of llms.txt gaining currency will be massively increased if it is widely used.
It is cheap and straightforward to implement – you’d be foolish not to try it out.
How to Get Started
- Create a plain text file named llms.txt.
- Add your preferred directives (you can find templates at llm-permissions.org).
- Upload it to the root directory of your website.
- Monitor developments as adoption grows and standards evolve.
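If you would rather script the steps above than do them by hand, something like the following Python sketch would do it. It assumes the `key: value` layout from the earlier example; the helper names `render_llms_txt` and `llms_txt_url` are hypothetical, and a temporary directory stands in for your real web root.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from urllib.parse import urljoin


def render_llms_txt(site_name: str, directives: dict[str, str]) -> str:
    """Build the file body: a comment header, then one 'key: value' per line."""
    lines = [f"# llms.txt for {site_name}"]
    lines += [f"{key}: {value}" for key, value in directives.items()]
    return "\n".join(lines) + "\n"


def llms_txt_url(site_root: str) -> str:
    """Return where the file should be served: the root of the site."""
    return urljoin(site_root, "/llms.txt")


content = render_llms_txt(
    "example.com",
    {
        "llm-access": "disallow",
        "attribution": "required",
        "link": "https://example.com/terms",
    },
)

# A temporary directory stands in for the web root in this sketch.
with TemporaryDirectory() as web_root:
    target = Path(web_root) / "llms.txt"
    target.write_text(content, encoding="utf-8")
    written = target.read_text(encoding="utf-8")

print(llms_txt_url("https://example.com/blog/"))
```

Once the file is uploaded for real, visiting the URL that `llms_txt_url` returns in a browser is a quick way to confirm it is being served from the root rather than from a subfolder.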
Final Thoughts
llms.txt offers a lightweight, accessible and ethical way to define how large language models should treat your content.
Even though it’s early days, the signal you send today can help shape the future of AI content rights tomorrow.
In a world where content is king but context is everything, llms.txt helps ensure your message is heard the way you intended.
We have the tools to create an accurate llms.txt file for you, and we have the skills to promote your content in the world of AI search. Don’t wait until it’s too late: contact us now.
Need Our Help?
We reckon you do. Talk is free – try some.
