Imperva Report Claims That 50% Of The World Wide Web Is Now Bots

Automation has been a part of the Internet since long before the appearance of the World Wide Web and the first web browsers, but it’s become a significantly larger part of total traffic the past decade. A recent report by cyber security services company Imperva pins the level of automated traffic (‘bots’) at roughly fifty percent of total traffic, with about 32% of all traffic attributed to ‘bad bots’, meaning automated traffic that crawls and scrapes content to e.g. train large language models (LLMs) and generate automated content as well as perform automated attacks on the countless APIs accessible on the internet.

According to Imperva, this is the fifth year of rising ‘bad bot’ traffic, with the 2023 report noting again a few percent increase. Meanwhile ‘good bot’ traffic also keeps increasing year over year, yet while these are not directly nefarious, many of these bots can throw off analytics and of course generate increased costs for especially smaller websites. Most worrisome are the automated attacks by the bad bots, which ranges from account takeover attempts to exploiting vulnerable web-based APIs. It’s not just Imperva who is making these claims, the idea that automated traffic will soon destroy the WWW has floated around since the late 2010s as the ‘Dead Internet theory‘.

Although the idea that the Internet will ‘die’ is probably overblown, the increase in automated traffic makes it increasingly harder to distinguish human-generated content and human commentators from fake content and accounts. This is worrisome due to how much of today’s opinions are formed and reinforced on e.g. ‘social media’ websites, while more and more comments, images and even videos are manipulated or machine-generated.

Bots That Snag The Hottest Fashion While Breaking Social Trust In Commerce

Scarcity on the Internet is the siren song of bot writers. Maybe you’ve lost an eBay bid in the last milliseconds, or missed out on a hacker con when tickets sold out in under a minute — your corporeal self has been outperformed by a bot. But maybe you didn’t know bots are on a buying frenzy in the hyped-up world of fashion. From limited-run sneakers to anything with the word Supreme printed on it, people who will not accept any substitute in wearing the rarest and most sought after are turning to resellers who use bots to snag unobtanium items and profit on the secondary market.

At DEF CON 27 [FinalPhoenix] took the stage to share her adventures in writing bots and uncovering a world that buys and sells purchasing automation, forming groups much like cryptocurrency mining pools to generate leads on when the latest fashion is about to drop. This is no small market either. If your bots are leet enough, you can make a ton of cash. Let’s take a look at what it takes to write a bot, and at the bots-for-sale economy that has grown up around these concepts.

The internet is built with bots in mind and we have Google to thank for this. Their major innovation was moving us off of a curated internet to one that is machine crawled. Everyone wants good Google juice and that means building a site that is friendly to the Google bots that crawl and index the internet. This makes automation for your own purposes quite a bit easier. Namely, the monitor-bots that are used to detect when a retailer has the latest in stock. [FinalPhoenix] demonstrated a simple script that grabs the XML site map, parsing it for newly in-stock items, flagging them when found. But here’s the killer — if your monitor bot is a good one, you can turn it into a discord channel and sell subscriptions to others playing the reseller game, to the tune of $15-30 a month per subscriber.

Example slide of code used in a web-based buy-bot

Once your bot reports stock, the race is on to buy it before anyone else can. For this, you could use the APIs of the site, but that’s time-consuming and a lot easier for retailers to detect and block bot usage. For this part of her botting tools [FinalPhoenix] likes to use web-based bots that go through a browser framework like Chromium and allow obfuscation techniques like scrolling, clicking other items, random pauses, and other simple-minded actions that make your bot appear to be only human. In the examples for this talk, the Puppeteer framework was used for this purpose. In the end, the main role of this part of the bot is to use a verified account to complete the purchase as fast as robotically possible, which is why they’re called buy-bots. Retailers do have some tricks to combat these web-based attacks like adding secret keys in the DOM that need to be sent with the next post, but these are easy to discover and incorporate into the scripts.

This raises up another interesting part of the scheme, the verified accounts. For the best chance at profit, you need multiple accounts, each used just one time to avoid your buy-bot being detected by the retailer. For this, [FinalPHoenix] turns to services that sell accounts in packages of 500-10,000 and cost around just $5-10 per batch.

But wait, here’s where it gets really wild as recursion takes hold. Yes, these buy-bots are for sale (from sites like AIO Bot and usually around $300-1500), but they’re sold in limited quantities so that it’s harder for retailers to notice and take countermeasures. Just like how the clothing was limited release and incentivized bots-wielding resellers to enter the market, there is a secondary market for the bots themselves. [FinalPhoenix] reports that reselling one of these bots can yield $1000-1500 in profit. The same principles apply, and so what we’ve ended up with is bots buying bots to buy clothes. Who knows how many levels of bot-bot transactions there are, but it certainly feels like turtles all the way down.

Bot-based high-speed trading is the real way to make major bank on the securities market. Your average hacker is shut out of that “legitimate” business, but any enterprising programmer has the option of automating whichever reseller market they find most interesting. This breaks the public trust in commerce — buying quality products from a seller connected to their production for a reasonable price. If frustrates the manufacturer, alienates the consumer, but there appears to be little in place preventing it.