
How to fix Blocked by robots.txt on WordPress


Websites are visited both by humans and by robots (crawlers) such as Googlebot. In 1994, a convention called robots.txt was created to let website owners ask robots not to crawl parts of their sites. WordPress, for example, generates a default robots.txt file to keep sensitive areas from being crawled by bots.
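On a standard install, the virtual robots.txt that WordPress serves at /robots.txt typically looks like this (an SEO plugin or a physical robots.txt file in the site root can override it, so your own file may differ):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php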

Google indicates files blocked by robots.txt in Google Search Console. To find the report, log in to Google Search Console and check the Page indexing report. Standard WordPress setups are unlikely to see any issues, but if pages are blocked by robots.txt, you will see a line “Blocked by robots.txt” in the report.

Blocked by robots.txt report in Google Search Console

This report shows a graph with the number of affected pages over time, followed by a list of example pages. The graph indicates that this site has recently experienced an increase in the number of pages blocked by robots.txt.

Below the graph, there is a list of pages blocked by robots.txt. All the examples in the list are PDF files.

Examples of pages blocked by robots.txt

Clicking one of these rows opens a pop-up window that shows the robots.txt file and highlights the rule blocking access to that URL. In this case, “Disallow: /*.pdf$” blocks all files ending in “.pdf”.

Detail of a page blocked by robots.txt
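In robots.txt pattern matching as Google applies it, * matches any sequence of characters and $ anchors the pattern to the end of the URL, so any URL ending in .pdf is caught by a rule like:

User-agent: *
Disallow: /*.pdf$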

In the case of seopress.org, we want to block robots from crawling our PDF files, so this report does not contain any errors. You may, however, see pages in this list that you would prefer Google to index and rank. If that is the case, it is an error you will want to correct.

You may also spot an “Indexed, though blocked by robots.txt” status in the Page indexing report. Google says that “If someone else links to your page, Google won’t request and crawl the page, but we can still index it using the information from the page that links to your blocked page.” Google goes on to explain that robots.txt is not a good way of asking Google not to index a page. To keep a page out of Google’s index, do not block it with robots.txt; instead, let Google crawl it and use a robots meta tag indicating that the page is “noindex”.
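For an HTML page, that means adding a robots meta tag to the page’s head:

<meta name="robots" content="noindex">

For a file such as a PDF, which has no HTML head, the same instruction is sent as an HTTP response header (shown here as a generic server header, not a specific SEOPress setting):

X-Robots-Tag: noindex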

Correcting Blocked by robots.txt errors on WordPress

Google’s resource Unblock a page blocked by robots.txt suggests using an external robots.txt validator to check individual URLs blocked by robots.txt. A tool like the robots.txt Validator and Testing Tool from Dentsu may be useful.

Once you know how the robots.txt file needs to change to stop blocking these files, see this tutorial on how to set up and edit the robots.txt file with SEOPress.
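As a sketch of what such an edit can look like: removing the Disallow line lets Google crawl all PDFs again, or you can keep it and add a more specific Allow rule for just the files you want crawled (the /whitepapers/ folder below is purely an example path):

User-agent: *
Disallow: /*.pdf$
Allow: /whitepapers/*.pdf$

Google applies the most specific matching rule, so the longer Allow rule takes precedence for PDFs under /whitepapers/ while other PDFs stay blocked.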

Done fixing? Validate fix

If you have corrected the robots.txt file and want Google to confirm that the affected pages are no longer blocked, go back to the report and click the VALIDATE FIX button. Validation can take up to two weeks, so be patient. Validating fixes is not compulsory, but it is a good way to get feedback on the changes you have made to improve your SEO.

By Benjamin Denis

CEO of SEOPress. 15 years of experience with WordPress. Founder of WP Admin UI & WP Cloudy plugins. Co-organizer of WordCamp Biarritz 2023 & WP BootCamp. WordPress Core Contributor.