
We need a project in the spirit of Spamhaus to actively maintain a list of perpetrating IPs. If they're cycling through IPs and IP blocks I don't know how sustainable a CAPTCHA-like solution is.


Just block all of AWS, Alibaba, GCP and Azure, or throttle them aggressively. If you have clients/customers that need more requests per second then have them provide you with their IPs.
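A minimal sketch of that policy, assuming you already have the providers' published ranges loaded as CIDR blocks. The ranges below are documentation placeholders, not real cloud ranges, and `CLOUD_RANGES`, `CUSTOMER_ALLOWLIST`, and `classify` are hypothetical names for illustration:

```python
import ipaddress

# Placeholder CIDRs from the RFC 5737 documentation ranges, NOT real cloud ranges.
# In practice these would be loaded from each provider's published range list.
CLOUD_RANGES = [ipaddress.ip_network(c) for c in ("192.0.2.0/24", "198.51.100.0/24")]

# Hypothetical allowlist of IPs that customers have provided to you.
CUSTOMER_ALLOWLIST = [ipaddress.ip_network(c) for c in ("192.0.2.10/32",)]


def classify(ip: str) -> str:
    """Return "allow" or "throttle" for a client IP."""
    addr = ipaddress.ip_address(ip)
    # Customer-provided IPs win even when they sit inside a cloud range.
    if any(addr in net for net in CUSTOMER_ALLOWLIST):
        return "allow"
    # Everything else coming from a known cloud range is throttled (or blocked).
    if any(addr in net for net in CLOUD_RANGES):
        return "throttle"
    return "allow"
```

The allowlist is checked first so that a customer hosted on one of the throttled clouds is not caught by the broader rule.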

The problem is that these companies are fairly well funded and renting infrastructure isn't an issue.


Exactly. They're renting infrastructure on well-known clouds, not cycling through consumer IPs like yesterday's botnets. Block all web traffic from well-known cloud IPs, and you can keep 99% of the LLM bots away. Alibaba seems to be the most common source of bot traffic on my infrastructure lately, and I also see Huawei Cloud from time to time. Not much AWS, probably because of their high IPv4 pricing.

You can allow API access from cloud IPs, as long as you don't do anything expensive before you've authenticated the client.
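Roughly what I mean by authenticating first, as a sketch only. The key store, `handle_request`, and the `from_cloud_ip` flag are made up for illustration, and a real setup would keep keys in a proper secret store:

```python
import hmac
from typing import Callable, Optional, Tuple

# Hypothetical API key store, for illustration only.
API_KEYS = {"demo-client": "s3cret"}


def handle_request(
    client_id: Optional[str],
    api_key: Optional[str],
    from_cloud_ip: bool,
    expensive: Callable[[], str],
) -> Tuple[int, Optional[str]]:
    """Run the expensive handler only after the cheap checks pass."""
    if from_cloud_ip:
        expected = API_KEYS.get(client_id) if client_id else None
        # Reject before doing any expensive work; compare in constant time.
        if expected is None or api_key is None or not hmac.compare_digest(
            expected.encode(), api_key.encode()
        ):
            return 401, None
    return 200, expensive()
```

An unauthenticated request from a cloud IP costs one dictionary lookup and one comparison, so the bots never reach the expensive path.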


From the article:

“…they do so using random User-Agents that overlap with end-users and come from tens of thousands of IP addresses - mostly residential, in unrelated subnets, each one making no more than one HTTP request over any time period we tried to measure - actively and maliciously adapting and blending in with end-user traffic and avoiding attempts to characterize their behavior or block their traffic.”

So it looks like much of the traffic, particularly from China, is indeed using consumer IPs to disguise itself. That’s why they blocked based on browser type (MS Edge, in this case).


This matches exactly with what I'm seeing on my own sites too and it's from all over the world, not just China.

(I described my bot woes a few weeks ago at https://news.ycombinator.com/item?id=43208623. The "just block bots!" replies were well-intentioned but naive -- I've still found no signal that works reliably well to distinguish bots from real traffic.)


I saw a fair amount of that kind of behavior, too, mostly around the summer of last year. At some point it dropped off sharply. Over the last few months, at least for the servers I keep an eye on, most of the trouble has been from Chinese cloud IPs.

Either the LLM devs got more funding, or maybe the authorities took down the botnet they were using.


Why only in the "spirit of Spamhaus"? Spamhaus still exists. Add the Google and Microsoft ASes to the DROP/NOROUTE list; that would be hilarious.


Because while this is clearly related to spam, it's not the same thing, and presumably if Spamhaus themselves felt it was within their wheelhouse, they'd already be doing it.


This sounds backwards to me. If you maintain a list of IPs but they are constantly cycling them, the list will get out of date quickly, whereas a CAPTCHA-like system will (hopefully) always stop bot traffic.


While some of the residential IPs come from malware, a lot of them come from residential IP proxies, where people are paid to run proxy software from their home. If word gets around that people who run this software quickly end up blocked by the majority of the internet, that will lessen that part of the problem.


Only if your CAPTCHA-like check is hurled at every client indiscriminately. Otherwise you'll end up right back where Spamhaus started: maintaining your own list of good and bad actors.

The advantage of a third party service is that you're sharing intel of bad actors.


I can't confirm it, but I believe it is applied to every client.



