Bots are used all over the internet for various reasons and may also be referred to as spiders or crawlers. Search engines, for example, use crawlers to index web pages and update search results. If you operate a website, it's important to be aware of which bots are crawling your site and why.

Block Proxies with Blocked

Bad Bots and Scrapers

Blocking bad bots, scrapers, and crawlers from accessing your website is important. Automated traffic from these bad bots can hamper a website's performance, consume bandwidth, generate spam, or copy your entire website.

Bots are also used to scrape content from websites or to monitor a website for changes. It is just as important to let legitimate bots access and crawl your site as it is to block the bad ones. Bots will sometimes present a User Agent to identify themselves, present no User Agent at all, or spoof the User Agent of a legitimate bot or spider, such as a search engine's.
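A first pass at the User Agent analysis described above can be sketched as a small classifier. The patterns and category names below are illustrative assumptions, not a real product's API; note that a "claims-good" result is only a claim, since spoofed bots present the same string.

```python
import re

# Hypothetical example patterns; a real deployment would use a maintained list.
KNOWN_BAD_PATTERNS = [r"scrapy", r"python-requests", r"curl"]
KNOWN_GOOD_PATTERNS = [r"googlebot", r"bingbot"]

def classify_user_agent(user_agent):
    """Roughly classify a visitor by its User Agent string.

    Returns "missing", "known-bad", "claims-good", or "unknown".
    "claims-good" still needs verification (e.g. by IP), because
    bad bots can spoof a search engine's User Agent.
    """
    if not user_agent:
        return "missing"
    ua = user_agent.lower()
    if any(re.search(p, ua) for p in KNOWN_BAD_PATTERNS):
        return "known-bad"
    if any(re.search(p, ua) for p in KNOWN_GOOD_PATTERNS):
        return "claims-good"
    return "unknown"
```

Keeping the pattern lists separate from the classifier makes it easy to swap in a regularly updated source of bot signatures.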

Competitive intelligence is another popular use for bots. For example, a competitor may crawl your website to collect your pricing or inventory data and gain an edge in a competitive market.


Tracking and Blocking the Bots

There are many ways to detect and block bots, which can complicate the process. Some methods involve analyzing the User Agents presented by your visitors, while others involve maintaining blacklists of known bot IP addresses. The key to staying on top of detection is to use these methods in conjunction with each other and to find a reliable source of blacklist data.
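Using the two methods in conjunction might look like the following sketch, which blocks a visitor if either its IP falls in a blacklisted range or its User Agent contains a known bad token. The function name, the token list, and the sample network are assumptions for illustration (the network is from the documentation-only 203.0.113.0/24 range).

```python
import ipaddress

def should_block(ip, user_agent, blacklist, bad_ua_tokens=("scrapy", "curl")):
    """Combine two detection methods: an IP blacklist lookup and a
    simple User Agent token check. Either signal alone is enough to block.
    """
    addr = ipaddress.ip_address(ip)
    # Method 1: is the visitor's IP inside any blacklisted network?
    if any(addr in net for net in blacklist):
        return True
    # Method 2: does the User Agent contain a known bad token?
    ua = (user_agent or "").lower()
    return any(tok in ua for tok in bad_ua_tokens)

# Example blacklist with one network (a reserved documentation range).
blacklist = [ipaddress.ip_network("203.0.113.0/24")]
```

In practice the blacklist would be loaded from a regularly refreshed feed rather than hard-coded.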

With the multitude of blacklists available on the internet, it is imperative to avoid the false positives that come from blocking legitimate users. An IP address associated with bot activity one day may no longer be associated with bot activity three months later. Combining detection methods with accurate, regularly updated blacklists is the most effective way to stay on top of the bots; the next step is implementing these technologies to work in concert as an effective tool for protecting your website.
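One way to keep stale entries from producing false positives is to let blacklist entries expire. The class below is a minimal sketch, assuming a roughly three-month freshness window to match the point above; the interface and names are hypothetical, not any real product's API.

```python
import time

STALE_AFTER = 90 * 24 * 3600  # ~3 months, in seconds (illustrative window)

class ExpiringBlacklist:
    """IP blacklist whose entries expire, so an address flagged once
    is not blocked forever after the bot activity stops."""

    def __init__(self, stale_after=STALE_AFTER):
        self.stale_after = stale_after
        self._entries = {}  # ip -> timestamp of the last bot report

    def report(self, ip, now=None):
        """Record that this IP was seen behaving like a bot."""
        self._entries[ip] = time.time() if now is None else now

    def is_blocked(self, ip, now=None):
        """Block only if the most recent report is still fresh."""
        now = time.time() if now is None else now
        seen = self._entries.get(ip)
        return seen is not None and (now - seen) < self.stale_after
```

Feeding this structure from an external, regularly updated blacklist keeps blocking decisions current without manual cleanup.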


The best all-in-one solution for detecting and blocking bad bots, scrapers, and crawlers is Blocked. Blocked maintains regularly updated blacklists of bot IP addresses, which can be used alongside the detection methods built into the Blocked software.