How to make search engines discover more posts on your blog
As I explained in my tutorial “How to analyse your visitors’ traffic figures”, search engine traffic is an important way for people to discover your blog. So we have to do everything we can to help search engines discover our blog’s content, through SEO, or Search Engine Optimization.
While much has been written about SEO, any honest expert will confess that many mysteries surround the subject. That’s why, on BlogTips, I only share techniques I have seen work myself. In previous posts, I showed how changing from one blog platform to another, and generating a site map, can dramatically increase the number of posts a crawler “discovers”. And as crawlers like fast sites, I showed the difference in crawlers’ access speed when switching from a shared to a dedicated server.
But… these techniques still let crawlers decide how deep they want to dig into your site. If you have a website, or a blog, with a lot of content, much of it might remain undiscovered. And if crawlers don’t check for your content, people won’t be able to find it through the search engines.
By default, Google automatically determines the crawl rate for your site. The “crawl rate” defines how many pages from your site, or posts from your blog, it will check on “each visit”. If the crawl rate is too low, a lot of content might go undiscovered and unindexed.
So how can we force the Google crawler (or gently ask it) to crawl more frequently?
1. How many pages does Google crawl on your blog?
As explained in Five things to do after creating a new blog, it is good practice to submit your site to Google Webmasters.
In the Webmasters “Diagnostics” menu, check “Crawl stats” to find out how many pages Google crawls per day. If this is too low for the number of blog posts you have, you will need to increase the crawl rate.
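To judge whether the figure is “too low”, a quick back-of-the-envelope calculation helps. The Python sketch below is only an illustration: the function name and the numbers (5,000 posts, 250 pages per day) are made up, and it assumes the crawler visits each post exactly once at a constant daily rate, which real crawlers do not do.

```python
import math


def days_for_full_crawl(total_posts: int, pages_per_day: float) -> int:
    """Days needed to visit every post once at a constant daily crawl rate."""
    if pages_per_day <= 0:
        raise ValueError("pages_per_day must be positive")
    return math.ceil(total_posts / pages_per_day)


# Hypothetical blog: 5,000 posts, "Crawl stats" shows 250 pages per day.
print(days_for_full_crawl(5000, 250))  # 20
```

If the answer runs into months, a good part of your archive will probably stay invisible for a long time.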
2. Using Google Webmasters tools to increase the crawl rate
For any site, even if you don’t self-host your blog, you can adjust the Google crawl rate using the same Google Webmasters tool. Go to “Site Configuration” – “Settings” and click on “Set custom crawl rate”.
Adjust the slider to increase the crawl rate, save the setting and you are done.
This is not a permanent setting: it is only valid for 90 days, after which you will have to re-adjust it. Also, if you have a robots.txt in your self-hosted blog’s root (see point 3), then the most restrictive setting will apply.
Beware: this ONLY changes the Google crawl rate, not the crawl rate for any other search engine.
3. Using robots.txt to increase the crawl rate
This solution is only valid if you self-host your blog, running it on your own server.
“robots.txt” is a simple text file with “instructions” for crawlers. It can restrict access (for all crawlers or only for specific ones) to parts of your website, such as non-public parts, scripts, etc. But it can also define the rate at which your site is crawled. SearchEngineLand has a simple tutorial if you want to learn more about robots.txt.
In short, to make a robots.txt that changes the crawl delay for all crawlers (say, to 60 seconds), something like this is sufficient (“#” lines are comments):
# Change the crawl rate for any crawler
User-agent: *
Crawl-delay: 60
Save it in a text file (“robots.txt” – all lower case), put it in the root directory of your blog, and you’re done!
Attention! Not every crawler honours the robots.txt “crawl-delay” setting, and Google ignores it entirely. The Google crawl rate can only be adjusted using Google’s webmaster tools (as described in point 2).
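Before uploading the file, you may want to check that it says what you think it says. Here is a minimal sketch using the robots.txt parser from Python’s standard library. The file content follows the 60-second example for all crawlers, and “ExampleBot” is a made-up crawler name. Note that this only shows how a parser that follows the directive reads the file; it says nothing about whether a given search engine will respect it.

```python
from urllib.robotparser import RobotFileParser

# Assumed file content: a 60-second crawl delay for every crawler.
ROBOTS_TXT = """\
# Change the crawl rate for any crawler
User-agent: *
Crawl-delay: 60
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the crawl delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# "ExampleBot" is hypothetical; it falls under the "*" rule.
print(crawl_delay_for(ROBOTS_TXT, "ExampleBot"))  # 60
```

If this prints None, the crawl-delay line is probably misspelled or not placed under a “User-agent” line.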
And does it work?
You bet! AidNews, one of my humanitarian news aggregators, has over 100,000 posts. While I have a lot of traffic on this site, the number of visitors coming in via Google was way too low. I changed the crawl delay in robots.txt from 300 to 60, and you can see how, the same day, the crawler responded in the graph at the top of this post.
Now, the only thing I can do is wait until the search engine traffic picks up.
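To put the change from 300 to 60 in perspective, here is a rough Python calculation of the ceiling a crawl delay puts on a single crawler. It is a sketch under simple assumptions: the delay is in seconds, the crawler waits exactly that long between two requests, and fetching a page takes no time. Real crawlers will stay below these numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_fetches_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on pages one crawler can fetch per day with this delay."""
    return SECONDS_PER_DAY // crawl_delay_seconds


before = max_fetches_per_day(300)
after = max_fetches_per_day(60)
print(before, after, after // before)  # 288 1440 5
```

In other words, a delay of 300 caps a crawler at 288 pages a day, which is very little for a site with over 100,000 posts, while a delay of 60 allows up to five times as many.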