Robots.txt Generator

Details

We'll use your email to send you a robots.txt file that implements the usage choices you select now.

We'll also email you an updated robots.txt whenever we spot new crawlers or usage changes that impact your choices.

We might also send you information related to our complimentary services.

You can unsubscribe at any time via a link in the email.

Email *

Licensing

Terms Document Locator (TDL) lines reference immutable legal documents that define the terms under which crawlers may access your site. Reference a standard agreement, your own, or both. Selected entries are emitted as TDL: lines in the generated robots.txt between User-agent: and Allow: / Disallow:.
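As a rough illustration of where TDL: lines sit in the output, the sketch below renders a single robots.txt group. The function name render_group, the crawler name ExampleBot and the example.com URL are hypothetical placeholders, not values used by the generator.

```python
def render_group(user_agent, tdl_urls, allow):
    """Render one robots.txt group with TDL: lines after User-agent:."""
    lines = [f"User-agent: {user_agent}"]
    # TDL: lines go between User-agent: and the Allow: / Disallow: rule.
    lines += [f"TDL: {url}" for url in tdl_urls]
    lines.append("Allow: /" if allow else "Disallow: /")
    return "\n".join(lines)


# Hypothetical crawler name and TDL URL, for illustration only.
print(render_group("ExampleBot", ["https://example.com/terms/v1.html"], allow=True))
```

Selecting both a standard agreement and a custom one would simply add a second TDL: line to the same group.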

Standard TDLs are maintained by external organisations. We check daily for updated versions and use the latest available automatically.

Agreement	Your choice
Movement for an Open Web (MOW) standard Search Only Contract for Web (SOCW)	Exclude / Include

For a custom TDL, enter one URL per line or separate URLs with a comma. Each URL must be an absolute http or https URL pointing to an immutable legal document defining the terms under which crawlers may access your site.
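The input rule above could be checked roughly as follows. This is a minimal sketch, assuming the field is split on commas and newlines; parse_tdl_urls is a hypothetical helper, and it cannot verify that a document is actually immutable, only that each entry is an absolute http or https URL.

```python
import re
from urllib.parse import urlparse


def parse_tdl_urls(raw):
    """Split on newlines or commas and keep only absolute http(s) URLs."""
    valid, rejected = [], []
    for item in re.split(r"[,\n]", raw):
        url = item.strip()
        if not url:
            continue  # skip blank lines and trailing commas
        parts = urlparse(url)
        # An absolute URL needs both a scheme and a host.
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(url)
        else:
            rejected.append(url)
    return valid, rejected


valid, rejected = parse_tdl_urls("https://example.com/a, /relative\nftp://example.com/c")
print(valid)     # accepted entries
print(rejected)  # entries that are not absolute http(s) URLs
```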

Custom TDL URL(s)

Allowed Crawler Categories

Select crawler categories to allow. Crawlers in the selected categories will be allowed access. Read more about each category and how to set these values in the crawler documentation.

Permitted use	Description	Your choice
Train	Indicates that the crawler is used to train AI models.	Block / Allow
Input	Indicates that the crawler is used to collect content for generative AI and search summaries.	Block / Allow
Index	Indicates that the crawler is used for internal indexing of AI models.	Block / Allow
Search	Indicates that the crawler is used to build search indexes and provide search results.	Block / Allow
Monitor	Indicates that the crawler is used for monitoring websites.	Block / Allow
Archiving	Indicates that the crawler is used for archiving data and websites.	Block / Allow
Preview	Indicates that the crawler is used to create content previews.	Block / Allow
Security	Indicates that the crawler is a security-focused web crawler that scans domains for vulnerabilities.	Block / Allow
Analytics	Indicates that the crawler is used to gather data for marketing analytics.	Block / Allow
Feed	Indicates that the crawler is used for aggregating news, information, or data.	Block / Allow
Discovery	Indicates that the crawler is used to gain an understanding of the discoverability or search ranking of the crawled website or web page.	Block / Allow

Handling Conflicts

Sometimes your choices can't be expressed in robots.txt alone, for example when a single crawler uses content for both general search and AI training. The robots.txt we generate will allow such crawlers and include a comment warning of the conflict. Where possible, the comment also includes a URL for contacting the crawler operator, so you can ask them to restrict their use of your content via a contractual agreement.

Where we observe such conflicts from many users, we may attempt to contact the operator and ask them to provide a way of resolving the conflict via robots.txt.
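The conflict rule described above can be sketched as follows. This is an illustration under stated assumptions, not the generator's actual logic: the function decide, the crawler names, the comment wording and the contact URL are all hypothetical.

```python
def decide(crawler, uses, allowed, contact_url=None):
    """Return robots.txt lines for one crawler given the allowed categories."""
    uses = set(uses)
    permitted = uses & set(allowed)  # uses the site owner allows
    blocked = uses - set(allowed)    # uses the site owner wants blocked
    lines = []
    if permitted and blocked:
        # Conflict: robots.txt cannot allow one use and block another
        # for the same crawler, so the crawler is allowed with a warning.
        lines.append(
            f"# WARNING: {crawler} also uses content for: {', '.join(sorted(blocked))}"
        )
        if contact_url:
            lines.append(f"# Contact the operator: {contact_url}")
    lines.append(f"User-agent: {crawler}")
    lines.append("Allow: /" if permitted else "Disallow: /")
    return lines


# Hypothetical crawler that serves both search and AI training.
print("\n".join(decide("ExampleBot", ["Search", "Train"], {"Search"},
                       "https://example.com/contact")))
```

A crawler whose uses are all blocked gets a plain Disallow: / with no warning, and one whose uses are all allowed gets a plain Allow: /.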

Please read our Privacy policy before submitting.