Web bots have been around for a long time and we all benefit from many of them.
There are good bots (like Googlebot or Bingbot), and there are bad bots that automatically attempt to hack a web application or inject spam into websites. The good ones are generally welcome, and the bad ones can often be dealt with by a solution such as a Web Application Firewall (WAF) or an enterprise bot manager that recognizes malicious requests and blocks them.
The problematic bots are often those that sit between good and bad. These can be hard to detect, as they will often impersonate a normal user and make requests that, taken in isolation, are perfectly legitimate and seemingly harmless.
Although their intention is normally something other than a DDoS attack, the effect can sometimes be the same, either when a single bot is too aggressive or when too many instances of a bot hit a website at once.
These bots are used commercially for a number of reasons, from analyzing competitors' prices to automatically purchasing limited-stock products.
A real-world example of a commercial bot causing serious issues
We have a client that often sells limited edition products which are very sought after. These products can often fetch three times the RRP when resold on eBay, and the retailer will only have a limited supply to sell. Most of these products have a coordinated worldwide launch, so the exact launch time is publicly known.
Over the last few years, we have increasingly seen extremely aggressive bots, deployed in their many thousands, attempting to purchase these products, to the point where the performance of the e-commerce platform can be seriously compromised.
In this instance, the bots have been specifically designed for this retailer’s website, and know the exact requests that need to be made to add the product to the basket and go through the checkout. They don’t even need to visit the product display page. They are normally distributed across multiple cloud servers with multiple instances of the bot installed on each server. Because the launch time is public and coordinated, the bots all start to attempt to add the product to the basket and go through the checkout at the exact same time, normally many thousands at once.
The record we have seen is 3 million attempts to purchase a single product in a 12-hour period.
Because the requests are all legitimate and the bot is impersonating a real user, it is hard to block the bots quickly enough, before they do the damage, without also blocking real users. There is little point in a detection approach that waits a minute to count how many requests a particular IP has made and then blocks it once a threshold is exceeded: by that point, the damage has already been done and you have tens of thousands of bots simultaneously in your checkout.
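To illustrate why reacting per request matters, here is a minimal sketch of a per-IP token bucket, a standard rate-limiting technique (not any specific vendor's implementation). Unlike a fixed one-minute window, it starts rejecting an aggressive client as soon as its burst allowance is spent. The IP address and thresholds below are invented for the example:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-IP token bucket: reacts on every request instead of
    waiting for a fixed counting window to elapse."""

    def __init__(self, rate=5.0, burst=10.0):
        self.rate = rate    # tokens refilled per second
        self.burst = burst  # maximum bucket size (allowed burst)
        self.buckets = defaultdict(
            lambda: {"tokens": burst, "ts": time.monotonic()}
        )

    def allow(self, ip):
        b = self.buckets[ip]
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last request
        b["tokens"] = min(self.burst, b["tokens"] + (now - b["ts"]) * self.rate)
        b["ts"] = now
        if b["tokens"] >= 1.0:
            b["tokens"] -= 1.0
            return True
        return False  # over budget: block, delay, or serve decoy content

limiter = TokenBucket(rate=5.0, burst=10.0)
# A burst of 50 near-instant requests from one IP: only the first
# `burst` requests get through; the rest are rejected immediately.
results = [limiter.allow("203.0.113.7") for _ in range(50)]
print(sum(results))  # prints 10
```

In practice this check would sit at the edge (CDN, load balancer, or WAF) rather than in the application, but the principle is the same: the decision is made on the request that exceeds the budget, not a minute later.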
So how do you manage good bots and bad bots?
Many organizations, such as CDNs, have been rapidly developing bot management solutions over the last year in response to the increasing problems that retailers are facing. Some, such as Akamai's bot manager solution, can be very sophisticated in the way they attempt to identify a bot, as well as in the options they give the retailer for dealing with it. Tools like this constantly adapt and learn to keep up with bots that are always evolving in an attempt to evade detection.
Simply blocking a bot is not always the answer. If it knows it has been blocked, it can simply jump to another IP or evolve to fool the bot manager. A better approach is to deceive the bot by serving it the wrong content (perhaps inflated prices, in the case of a bot used to analyze a competitor's prices) or simply to slow it down. Slowing bots down is also a useful technique for bots that are harmful only because they crawl too aggressively: you don't want to block them altogether, but you do want to slow them enough to reduce the impact on your infrastructure.
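The slow-down tactic can be sketched as a simple piece of request-handling middleware. Everything here is illustrative: `looks_like_bot` is a hypothetical stand-in for whatever signal a real bot manager or CDN would supply, and the delay value is arbitrary:

```python
import time

def looks_like_bot(request_headers):
    """Hypothetical detection stub. In reality this decision would
    come from a bot manager's score, not a User-Agent check."""
    ua = request_headers.get("User-Agent", "")
    return ua == "" or "python-requests" in ua.lower()

def throttle(request_headers, delay_seconds=3.0):
    """Slow suspected bots down instead of blocking them outright.
    A blocked bot just rotates IPs; a slowed one wastes its own time
    and hits the infrastructure far less often."""
    if looks_like_bot(request_headers):
        time.sleep(delay_seconds)  # tarpit: respond, but slowly
        return "delayed"
    return "served"

print(throttle({"User-Agent": "python-requests/2.31"}, delay_seconds=0.01))  # prints delayed
print(throttle({"User-Agent": "Mozilla/5.0"}))  # prints served
```

The same idea extends to serving decoy content: instead of sleeping, the handler would return a response with misleading data (such as the wrong prices) so the bot never learns it has been detected.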
Although a bot manager solution is certainly a useful tool, it is unlikely to identify and stop all bots and, in the real-world instance detailed above, by the time it identifies a user as a bot it may already be too late: the damage is done. Bots constantly adapt and evolve to evade bot managers, so detection is a moving target.
The solution to effectively managing these bots is multi-faceted. There is no single solution that will catch everything and give you all of the control you need. Different services and solutions protect different areas against different types of bots, and only by deploying multiple defences can you manage bots effectively.
It is important to implement several of these solutions rather than relying on just one, as each defends against bots in a slightly different way. For example, if you relied solely on an application change to defeat purchasing bots, they would still hammer the rest of your infrastructure and could even fill your Apache or Varnish log files to the point where your server runs out of disk space.
Good bot vs bad bot: Don’t ignore the signs
In summary, bots are becoming an increasing commercial threat to e-commerce retailers, and dealing with them effectively can be very complex. We have a customer that very recently implemented an enterprise bot manager and found that over 65% of their web traffic comes from bots. The bot manager filters every request made to the website and uses various complex techniques and algorithms to identify and block requests that it deems to be from a bot. We knew they were being hit hard by bots at times, but that number still took us by surprise.
Consider the bandwidth and server capacity required to serve that traffic, and the fact that around 75% of it comes from 'bad' or malicious bots, and it becomes clear that this is not something any retailer can afford to ignore.