Bringing home the bacon

Written by Phil Windley, Contributor

Last week, I took Bacon's Information to task for a Web crawler that wasn't behaving as nicely as I'd have liked. I'm happy to say that, contrary to my bias, they've been very helpful and responsive. I think we're both better off for the exchange.

After I blogged about the problem, I got a very polite email from Chris Thilk, a Senior Digital Media Specialist at Bacon's Information. Chris originally thought I didn't want my site indexed. Actually, nothing could be further from the truth. I'm happy to have my site indexed so that the content is findable in multiple contexts. I just wanted it indexed more politely.

Chris pointed out that even though I had a robots.txt file, it wasn't blocking the primary culprit, /mt-search.cgi. This is my bad for not updating my robots.txt file as my site changed.  This is a tough thing to keep track of since it requires anticipating problems that changes to your site might cause and then determining a simple way to block them. 
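One way to catch this kind of gap is to test the rules against the URLs you care about before a crawler does. The following is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the ExampleBot user agent, and the example.com URLs are all assumptions for illustration; they are not my actual file or the crawler's real name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption, not the site's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /mt-search.cgi
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one search request and one ordinary archive page.
    for url in (
        "http://example.com/mt-search.cgi?search=crawler",
        "http://example.com/archives/",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

Running a check like this whenever the site gains a new script is a cheap way to notice that a rule is missing.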

I asked Chris to change their crawler to observe the rel="nofollow" attribute on hypertext anchors and to consider interleaving requests to multiple sites so that a single site sees its requests spread out over time--the biggest problem I had was that there were hundreds of requests in just a few minutes. Chris reports that they've made some changes to how they index, and since then I haven't had a problem.
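To make those two requests concrete, here is a rough sketch of what I had in mind. It is not how Bacon's crawler works (I have no knowledge of their code); the class and function names and the example hosts are all hypothetical. The first part skips anchors marked rel="nofollow", and the second reorders a crawl list so that consecutive requests go to different hosts where possible.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urlsplit


class FollowableLinks(HTMLParser):
    """Collect href values from anchors that are not marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if href and "nofollow" not in rel_tokens:
            self.links.append(href)


def followable_links(html):
    """Return the hrefs a polite crawler would be willing to follow."""
    parser = FollowableLinks()
    parser.feed(html)
    return parser.links


def interleave_by_host(urls):
    """Round-robin across hosts so no host gets a burst while others wait."""
    queues = {}
    for url in urls:
        queues.setdefault(urlsplit(url).netloc, deque()).append(url)
    ordered = []
    while queues:
        # Take one URL from each host that still has work, in first-seen order.
        for host in list(queues):
            ordered.append(queues[host].popleft())
            if not queues[host]:
                del queues[host]
    return ordered
```

Interleaving only helps while several hosts still have pending URLs. Once one host is all that remains, its requests run back to back again, so a real crawler would presumably also need a per-host delay.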

My hat's off to Chris and Bacon's Information for being responsive and helpful.
