How to block common AI crawlers with the Raptive Ads WordPress plugin

The Raptive Ads WordPress plugin (version 3.6.0 and above) makes it easy to manage AI crawlers on your site by adding rules to your robots.txt file.

This guide provides step-by-step instructions for enabling these settings to protect your content.

Not using WordPress or the Raptive Ads plugin? Check out our guide on how to manually block common AI crawlers.

Should I block or allow AI crawlers on my site?

It depends on your goals. Read our latest article, which explains which bots to block, which to allow, and how those choices can impact your traffic and revenue. It also breaks down how these decisions fit into the bigger picture of AI, search, and future compensation for creators.

Read: How blocking AI bots paves the way for fair compensation

How to block AI crawlers

  1. Log in to your WordPress dashboard
  2. Navigate to your Raptive Ads plugin settings
  3. Find the Block AI Crawlers section
  4. To block a crawler, toggle it On
  5. To leave a crawler unblocked, toggle it Off
  6. Save your changes

Once enabled, the plugin will automatically add the appropriate disallow rules to your site’s robots.txt file.
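As a rough illustration of what this step produces, the sketch below builds robots.txt rules from a set of on/off toggles. It is not the plugin's actual code: the `toggles` dictionary and the `build_rules` function are hypothetical names, and the exact text the plugin writes may differ. It assumes the conventional form of one `User-agent` line followed by `Disallow: /` for each blocked crawler.

```python
# Hypothetical toggle states, keyed by the user agents described in this article.
# True means "block this crawler"; False means "leave it unblocked".
toggles = {
    "anthropic-ai": True,
    "Claude-Web": True,
    "CCBot": True,
    "FacebookBot": True,
    "Google-Extended": False,  # left off, following the Google-Extended note
    "GPTBot": True,
    "PiplBot": True,
}


def build_rules(toggles):
    """Return robots.txt text that disallows every crawler toggled on."""
    blocks = []
    for agent, blocked in toggles.items():
        if blocked:
            # "Disallow: /" asks the named crawler not to crawl any path.
            blocks.append(f"User-agent: {agent}\nDisallow: /")
    return "\n\n".join(blocks) + "\n"


rules = build_rules(toggles)
print(rules)
```

Each crawler that is toggled on gets its own two-line entry, and crawlers that are toggled off are simply omitted, so they remain subject to whatever other rules the file contains.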

Screenshot: Block AI Crawlers plugin settings

A note about Google-Extended

We do not recommend blocking Google-Extended. While Google claims that blocking it doesn’t impact search, there’s a lack of transparency in how these controls are actually enforced, so it’s not something we can confidently advise. Additionally, blocking Google-Extended doesn’t stop your content from being used in AI features like AI Overviews, and blocking Googlebot altogether would carry too great a risk to your search visibility. Read more here

What are the user agents?

  • anthropic-ai: The user agent for AI company Anthropic.
  • Claude-Web: Anthropic’s general-purpose web crawler.
  • CCBot: The user agent for Common Crawl, which maintains an open repository of web crawl data.
  • FacebookBot: Meta’s user agent that crawls public web pages to improve language models for Facebook’s speech recognition technology.
  • Google-Extended: The user agent Google uses to access content to train Gemini and other AI products.
  • GPTBot: The user agent for OpenAI’s web crawler, which crawls web pages that may be used to improve future models.
  • PiplBot: A user agent that collects documents from the Web to build a searchable index.

What does it mean to "block" AI crawlers?

Technically, we're helping you add an entry to your site’s robots.txt file that tells these crawlers not to crawl your site going forward. It doesn’t physically prevent them from accessing your site, because compliance with robots.txt is voluntary. However, it’s the method each company has documented for a site to “opt out,” and it’s the industry-standard way to declare which crawlers you permit to access your site.

Screenshot: Blocking AI crawlers, robots.txt entries
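To see how a well-behaved crawler interprets such entries, here is a minimal sketch using `urllib.robotparser` from Python's standard library. The robots.txt text and the `example.com` URL are made-up examples, not your site's actual file. The check shown here is one the crawler performs on itself; nothing in robots.txt enforces it.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (an assumption for illustration only).
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant crawler asks this question before fetching a page.
for agent in ("GPTBot", "CCBot", "Googlebot"):
    allowed = parser.can_fetch(agent, "https://example.com/recipes/")
    print(agent, "allowed" if allowed else "disallowed")
```

In this example GPTBot and CCBot are told not to crawl, while Googlebot, which has no entry, is unaffected.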

Many large publishers, including The New York Times, Wall Street Journal, Vox, and Reuters, have already blocked most or all of these AI crawlers using this method.

What if I don't see these options in my plugin settings?

If you see this message in your Raptive Ads WordPress plugin, you already have an existing robots.txt file on your site, so you will need to update the file manually. Follow these instructions to add new entries to your robots.txt file to block AI crawlers.
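If you are editing an existing file by hand, a small script can help you work out which entries are still missing. The sketch below is only an illustration under stated assumptions: the `existing` text, the `wanted` list, and the `missing_rules` function are hypothetical, and it treats a crawler as "not yet blocked" if it is still allowed to fetch the site root.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical contents of an existing robots.txt file.
existing = """\
User-agent: *
Disallow: /wp-admin/

User-agent: GPTBot
Disallow: /
"""

# Crawlers you want blocked (a subset, for illustration).
wanted = ["GPTBot", "CCBot", "anthropic-ai"]


def missing_rules(existing, agents):
    """Return robots.txt entries for agents that may still crawl the site root."""
    parser = RobotFileParser()
    parser.parse(existing.splitlines())
    # An agent still needs an entry if the current file lets it fetch "/".
    todo = [agent for agent in agents if parser.can_fetch(agent, "/")]
    return "".join(f"\nUser-agent: {agent}\nDisallow: /\n" for agent in todo)


updated = existing + missing_rules(existing, wanted)
print(updated)
```

Here GPTBot already has an entry, so only CCBot and anthropic-ai are appended. Review the result before uploading it, since this simple check does not account for crawlers that are only partially restricted.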
