Google's John Mueller says it's normal for 30-40% of the URLs in a website's Search Console report to return 404 errors.
This was stated during the Google Search Central SEO hangout on February 25th, where we also learned that it is impossible to prevent Google from crawling URLs that no longer exist.
Google may keep trying to crawl URLs for years after they have been deleted, and there is nothing website owners can do to prevent it.
Hence, 404s are inevitable, even for the most diligent SEO.
An SEO named Robb Young asked the series of questions that drew this information out of Mueller this week.
Young has a website whose Search Console report shows 404s for URLs that have been inactive for 8 years. The URLs previously returned 410 and have no links pointing to them.
He wants to know if this is normal or not. Here is Mueller's answer.
John Mueller on Googlebot crawling old URLs
Mueller says 8 years is a long time to keep crawling non-existent URLs, but it cannot be ruled out.
If Google has determined that a URL was active in the past, it may try to crawl the URL again from time to time.
If you know the URL doesn't exist, you can just ignore it in the Search Console report.
"Seven or eight years sounds like a really long time … if it's something we've seen in the past, we'll try to recrawl it every now and then.
We're going to tell you, 'Oh, that URL didn't work.' And if you say, 'Well, it's not supposed to work,' then that's perfectly fine."
In a follow-up question, Young asks if there is any way he could send a stronger signal to Google that these URLs no longer exist.
Will Google ever stop crawling the removed URLs?
"I don't think you can guarantee that we won't at least try [to crawl] these URLs. It's one of those things where we have them in our system and we know they have been useful at some point. So when we have time we'll just try again.
It doesn't cause any problems. It's just, we try again and we show you a report and tell you, 'Oh, we tried again and it didn't work.'"
Young is concerned about the volume of 404s in his Search Console report and asks Mueller one more question.
He clarifies that it's not just a handful of URLs returning 404 errors, but 30-40% of the URLs in the report.
Is that normal?
"That's perfectly fine. This is natural, especially for sites with high churn. If this is a classifieds site where you have classified listings that are valid for a month, expect those listings to drop out. And then over the years we collect a bunch of those URLs and try again, and if they return 404s or 410s, whatever, perfectly fine.
I don't think that would look unusual to us. It's not that we would see this as a quality signal or anything. The only time I think 404s would look problematic to us is when the homepage returns 404s. Then this could be a situation where we go, 'Oh, I don't know if this website is actually still active.'
But if parts of the website are 404, whatever. It's like a technical thing, it doesn't matter."
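For site owners who want to act on this advice, the triage Mueller describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report rows, the example.com URLs, the `removed_paths` set, and the function name `triage_not_found` are all hypothetical, and Search Console does not provide this function. The sketch separates 404/410 rows into those that are expected (pages you removed on purpose), those that may deserve a look, and any hit on the homepage, which is the one case Mueller flags as potentially problematic.

```python
from urllib.parse import urlsplit


def triage_not_found(report, removed_paths):
    """Split 404/410 rows into expected, unexpected, and homepage hits.

    report: iterable of (url, status_code) pairs, e.g. taken from an export.
    removed_paths: set of URL paths the owner knows were removed on purpose.
    Both inputs are hypothetical shapes chosen for this sketch.
    """
    expected, unexpected, homepage = [], [], []
    for url, status in report:
        if status not in (404, 410):
            continue  # only "not found" and "gone" responses are triaged
        path = urlsplit(url).path or "/"  # a bare domain counts as the homepage
        if path == "/":
            homepage.append(url)
        elif path in removed_paths:
            expected.append(url)
        else:
            unexpected.append(url)
    return expected, unexpected, homepage


# Hypothetical sample data for a classifieds-style site.
report = [
    ("https://example.com/", 200),
    ("https://example.com/listing/123", 404),
    ("https://example.com/listing/456", 410),
    ("https://example.com/pricing", 404),
    ("https://example.com/about", 200),
]
removed_paths = {"/listing/123", "/listing/456"}

expected, unexpected, homepage = triage_not_found(report, removed_paths)
not_found_share = (len(expected) + len(unexpected) + len(homepage)) / len(report)

print("Safe to ignore:", expected)
print("Worth a look:", unexpected)
print("Homepage not found:", homepage)
print(f"Share of report returning 404/410: {not_found_share:.0%}")
```

A high share on its own is not a warning sign according to Mueller. The lists that matter are the unexpected URLs and, above all, the homepage.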
Google can remember URLs long after they are removed and may try to crawl them again at any time. However, you don't need to worry if Search Console shows 404 errors for URLs that shouldn't exist anyway.
Hear Mueller's full answer in the video below: