Fixing crawl budget issues
Posted: Tue Jan 28, 2025 6:13 am
We're not really going to talk in this video about how to increase your crawl budget. We're going to focus on how to make the best use of the crawl budget you have, which is generally an easier lever to pull in any case.

Causes of crawl budget issues
So how do issues with crawl budget actually come about?

Facets
Now I think the main sorts of issues on sites that can lead to crawl budget problems are, firstly, facets.
So you can imagine on an e-comm site, imagine we've got a laptops page. We might be able to filter that by spec, say a 15-inch screen and 16 gigabytes of RAM. There might be a lot of different permutations there that could lead to a very large number of URLs when actually we've only got one page or one category as we think about it, the laptops page. Similarly, those could then be reordered to create other URLs that do the exact same thing but have to be separately crawled.
Similarly, they might be sorted differently. There might be pagination and so on and so forth. So you could have one category page generating a vast number of URLs.

Search results pages
A few other things that often come about are search results pages from an internal site search, which can, especially if they're paginated, generate a lot of different URLs.

Listings pages
Listings pages. If you allow users to upload their own listings or content, then that can over time build up to be an enormous number of URLs; think about a job board, or something like eBay, which probably has a huge number of pages.
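To make that concrete, here's a minimal sketch in Python, with made-up filter values that aren't from the video, of how a single category page multiplies into crawlable URLs once facets, sort orders, and pagination combine:

# Minimal sketch: hypothetical facet values for a single "laptops" category.
from itertools import product

screen_sizes = ["13-inch", "14-inch", "15-inch", "16-inch"]  # 4 options
ram_options = ["8GB", "16GB", "32GB"]                        # 3 options
sort_orders = ["price-asc", "price-desc", "newest"]          # 3 options
pages = range(1, 11)                                         # 10 pages deep

urls = [
    f"/laptops?size={size}&ram={ram}&sort={order}&page={page}"
    for size, ram, order, page in product(
        screen_sizes, ram_options, sort_orders, pages
    )
]

# One category, 4 * 3 * 3 * 10 = 360 distinct URLs for Googlebot to crawl.
print(len(urls))  # 360

And if the same parameters can appear in a different order, ?ram=...&size=... instead of ?size=...&ram=..., each of those 360 URLs can exist twice over, even though every one of them is really just the laptops page.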
[Chart: crawl budget issue solutions and whether they allow crawling, indexing, and PageRank]

Tools to address crawl budget issues
So what are some of the tools that you can use to address these issues and to get the most out of your crawl budget? As a baseline, if we think about how a normal URL behaves with Googlebot, we say, yes, it can be crawled; yes, it can be indexed; and yes, it passes PageRank. So with URLs like these, if I link to them somewhere on my site and then Google follows those links and indexes the pages, those pages probably still have the top nav and the site-wide navigation on them.
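As a rough illustration of the crawling lever specifically, here's a sketch, again with hypothetical rules and a hypothetical example.com site rather than anything from the video, that uses Python's standard-library robots.txt parser to check that internal search results pages are blocked from crawling while a normal category page is not. Note that the stdlib parser only understands simple prefix rules, whereas Googlebot's own robots.txt handling also supports wildcard patterns:

# Sketch: checking hypothetical robots.txt rules with Python's stdlib parser.
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking internal search results pages from crawling.
rules = """
User-agent: *
Disallow: /search
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# The normal category page stays crawlable...
print(rp.can_fetch("Googlebot", "https://example.com/laptops"))  # True

# ...while internal search results URLs, paginated or not, are blocked.
print(rp.can_fetch("Googlebot", "https://example.com/search?q=laptops&page=2"))  # False

Bear in mind that blocking crawling this way means Google never sees the links on those pages, so they can't pass PageRank onward, which is exactly the trade-off the crawled/indexed/PageRank baseline above is setting up.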