Electricmonk

Ferry Boender

Programmer, DevOpper, Open Source enthusiast.

Blocking VoilaBot

Tuesday, August 19th, 2008

There’s a web crawler out there called VoilaBot that is hammering my site with needless crawls and appears to ignore robots.txt files completely. Apparently it’s the crawler for a French portal/search engine. If you need to block this bot from your site, there are two things you can do:

Firewall

If you’ve got a firewall on your box, you can deny access to the two IP ranges 81.52.143.0/24 and 193.252.149.0/24. That’ll get them off your back permanently. On Linux machines with an iptables firewall, the following will do the trick:

iptables -A INPUT --source 193.252.149.0/24 -j DROP
iptables -A INPUT --source 81.52.143.0/24 -j DROP
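Running those commands more than once leaves duplicate rules in the chain. Here's a minimal sketch that appends a rule only when an identical one isn't there yet, using iptables' -C (check) option. The block_range function name and the IPTABLES variable are made up for illustration; the sketch assumes an iptables version that supports -C and that you run it as root.

```shell
# Command used to manage rules; overridable so the logic can be tried
# without touching the real firewall (hypothetical convenience variable).
IPTABLES="${IPTABLES:-iptables}"

# block_range CIDR: append a DROP rule for CIDR unless it already exists.
block_range() {
    # -C exits non-zero when no matching rule is present in the chain.
    if ! "$IPTABLES" -C INPUT --source "$1" -j DROP 2>/dev/null; then
        "$IPTABLES" -A INPUT --source "$1" -j DROP
    fi
}
```

Call it once per range, for example block_range 193.252.149.0/24 followed by block_range 81.52.143.0/24.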

.htaccess

If you don’t want to firewall the bot, you can deny them access to your website by putting a .htaccess file in your web root directory with the following contents:

# deny,allow: requests are allowed by default; only the ranges below are denied
order deny,allow
deny from 81.52.143.
deny from 193.252.149.
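To see how often the bot is actually knocking, you can count the requests from the two ranges in your access log. This is just a sketch: count_voilabot_hits is a made-up helper name, and it assumes the client IP is the first field on each log line (as in Apache's common and combined log formats). The log file location is up to your setup.

```shell
# count_voilabot_hits LOGFILE: print the number of requests whose client IP
# falls in 81.52.143.0/24 or 193.252.149.0/24.
# Assumes the IP address is the first field on every line.
count_voilabot_hits() {
    grep -cE '^(81\.52\.143|193\.252\.149)\.[0-9]+ ' "$1"
}
```

Pass it the path to your access log, e.g. count_voilabot_hits /path/to/access.log. Once the .htaccess block is in place, Apache still logs those requests, but they should be answered with a 403 status.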

Don’t trust VoilaBot to honour your robots.txt file; it won’t.

The text of all posts on this blog, unless specifically mentioned otherwise, is licensed under this license.