SEO How-to, Part 9: Diagnosing Crawler Issues
Search engines must crawl and index your site before it can rank in organic search. Thus, optimizing your content is pointless if search engines cannot access it.
This is the ninth installment in my “SEO How-to” series. Previous installments are:
In “Part 2,” I discussed how search engines crawl and index content. Anything that limits crawlable pages can kill your organic search performance.
It’s a worst-case scenario in search engine optimization: Your company has redesigned its site, and suddenly organic performance crashes. Your web analytics indicate that home page traffic is relatively stable. But product page traffic is lower, and your new browse grid pages are nowhere to be found in Google.
What happened? You likely have a crawling or indexation issue.
Your browser is a lot more forgiving than bots. Content that renders on the screen and functions correctly in your browser may not be crawlable for bots. Examples include the inability of bots to recognize internal links (orphaning entire sections) or correctly render page content.
The most advanced bots interpret a page as a human would see it in an up-to-date browser. They then send that information back to the search engine, which renders the page's different states to find additional content and links.
But that relies on the most advanced search bot (i) crawling your pages, (ii) identifying and triggering elements such as non-standard link coding in navigation, and (iii) assessing a page’s function and meaning.
In short, invest the time to identify and resolve the crawl blockers on your site.
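To make the "non-standard link coding" problem concrete, here is a minimal Python sketch of how a simple crawler discovers links. It assumes a crawler that follows only standard anchor tags with real href values. The sample markup, the `AnchorLinkExtractor` class, and the `/products/...` paths are hypothetical illustrations, not taken from any real site or search bot.

```python
from html.parser import HTMLParser


class AnchorLinkExtractor(HTMLParser):
    """Collects href values from standard <a> tags only, as a basic crawler might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip empty hrefs, javascript: pseudo-links, and bare fragments.
            if name == "href" and value and not value.lower().startswith(("javascript:", "#")):
                self.links.append(value)


def extract_links(html):
    """Return the crawlable links found in an HTML string."""
    parser = AnchorLinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical navigation: one standard link and two script-driven ones.
SAMPLE_NAV = """
<nav>
  <a href="/products/shoes">Shoes</a>
  <span onclick="window.location='/products/hats'">Hats</span>
  <a href="javascript:void(0)" data-url="/products/bags">Bags</a>
</nav>
"""

print(extract_links(SAMPLE_NAV))
```

All three items work for a shopper clicking in a browser, but this simple parser finds only the first. If a bot behaves this way, the hats and bags pages would be orphaned unless they are linked elsewhere.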
Unfortunately, publicly available tools such as DeepCrawl and Screaming Frog’s SEO Spider cannot perfectly replicate modern search bots. The tools can report content as inaccessible even when a search bot might be able to reach it.
The first step in testing whether search bots can crawl your entire site is to check Google’s index. In Google’s search bar, type “site:” before any URL you want to check, such as:
Site queries return a list of pages that Google has indexed that start with the URL string you entered. If the pages missing from your analytics are also missing from Google’s index, you could have a crawl block. However, if the pages are indexed but not driving organic traffic, you likely have a relevance or link authority issue.
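The triage logic above can be sketched in a few lines of Python. This is only an illustration: the function names, the `example.com` domain, and the page paths are hypothetical, and the lists of indexed and underperforming pages are assumed to have been gathered by hand from site queries and your analytics.

```python
from urllib.parse import quote_plus


def site_query_url(url_prefix):
    """Build a Google search URL for a site: query on the given URL prefix."""
    return "https://www.google.com/search?q=" + quote_plus("site:" + url_prefix)


def triage(underperforming_pages, indexed_pages):
    """Split underperforming pages by whether they appear in Google's index."""
    indexed = set(indexed_pages)
    return {
        # Not indexed at all: a crawl block is a plausible cause.
        "possible_crawl_block": sorted(
            p for p in underperforming_pages if p not in indexed
        ),
        # Indexed but not driving traffic: look at relevance or link authority.
        "possible_relevance_or_authority_issue": sorted(
            p for p in underperforming_pages if p in indexed
        ),
    }


# Hypothetical data for illustration only.
print(site_query_url("example.com/products"))
print(triage(["/browse/shoes", "/products/shoe-1"], ["/", "/products/shoe-1"]))
```

The output is a starting point for investigation, not a verdict. A page can be missing from the index for reasons other than a crawl block.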
You can also check indexation in Google Search Console with the “URL inspection” tool — but only one page at a time.
If the site query fails to unearth pages, try crawling your site with Screaming Frog or DeepCrawl. Let the crawler run on your site, and look for missing areas of a certain type — browse grids, product detail pages, articles.
If you don’t see holes in the crawl, your site is likely crawlable. Search bots, again, are more capable than crawler tools. If a tool can get through a site’s content, so can search bots. And problems flagged by a crawler tool could be false alarms: content the tool cannot reach but search bots can.
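One way to spot holes in a crawl is to group the crawled URLs by section and see which expected sections never appear. The sketch below assumes you have exported the crawled URLs from your crawler tool as a plain list and that sections map to the first path segment. The section names and URLs are hypothetical, so adapt them to your own URL structure.

```python
from collections import Counter
from urllib.parse import urlparse


def section_counts(crawled_urls):
    """Count crawled URLs by first path segment ('' stands for the home page)."""
    counts = Counter()
    for url in crawled_urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        counts[segments[0] if segments else ""] += 1
    return counts


def missing_sections(crawled_urls, expected_sections):
    """Return the expected sections that have no crawled URLs at all."""
    counts = section_counts(crawled_urls)
    return [s for s in expected_sections if counts[s] == 0]


# Hypothetical crawl export and site sections.
crawled = [
    "https://www.example.com/",
    "https://www.example.com/products/shoe-1",
    "https://www.example.com/articles/fit-guide",
]
print(missing_sections(crawled, ["browse", "products", "articles"]))
```

Here the browse grid section is absent from the crawl, which matches the symptom described earlier. A section with far fewer URLs than expected deserves the same scrutiny as one with none.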
Also, use crawler tools in preproduction environments to identify crawl problems before launch, or at least to get an idea of what you’ll be dealing with when the site goes live.