> An LLM scraper is operating in a hostile environment [...] because you can't particularly tell a JavaScript proof of work system from JavaScript that does other things. [..] for people who would like to exploit your scraper's CPU to do some cryptocurrency mining, or [...] want to waste as much of your CPU as possible).
That's a valid reason why being served JS-based PoW systems scares LLM operators: there's a chance the code might actually be malicious.
That's not a valid reason to serve JS-based PoW systems to human users: the entire reason those proofs work against LLMs is the threat that the code is malicious.
In other words, PoW works against LLM scrapers not because of PoW, but because they could contain malicious code. Why would you threaten your users with that?
And if you can apply the threat only to LLMs, then why don't you cut the PoW garbage and start with that instead?
I know, it's because it's not so easy. So instead of wielding the sword of Damocles of malware, why not standardize on some PoW algorithm that people can honestly apply without the risks?
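To make the "standardized PoW" idea concrete, here is a minimal hashcash-style sketch in Python. It is only an illustration: the `challenge:nonce` string format, SHA-256, and the function names are my own assumptions, not an existing standard. The point is that the work is a fixed, inspectable function rather than arbitrary JavaScript.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (hypothetical 'challenge:nonce' format)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    nonce = solve("example-challenge", 12)
    print(nonce, verify("example-challenge", nonce, 12))
```

The server picks the challenge and difficulty; the client can only find a nonce by trying hashes, and nothing site-specific ever has to execute.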
I don't know. A sandbox escape from a browser is a big deal, a million-dollar-bounty kind of deal. I feel safe putting an automated browser in a container or a VM and letting it run with a timeout.
And if a site pulls something like that on me, then I just don't take their data. The joke is on them: soon, if something is not visible to AI it will not 'exist', just as it is now when you are delisted from Google.
Your users - we, browsing the web - are already threatened with this. Adding a PoW changes nothing here.
My browser already has several layers of protection in place. It even allows me to improve this protection with add-ons (uBlock etc.), and my OS adds even more protection on top. This is enough to allow PoW that's legit but block malicious code.
"This movie, built with data collected during the European Space Agency's Huygens probe on Jan. 14, 2005, shows the operation of the Descent Imager/Spectral Radiometer camera during its descent and after touchdown. The camera was funded by NASA.
The almost four-hour-long operation of the camera is shown in less than five minutes. That's 40 times the actual speed up to landing and 100 times the actual speed thereafter.
The first part of the movie shows how Titan looked to the camera as it acquired more and more images during the probe's descent. Each image has a small field of view, and dozens of images were made into mosaics of the whole scene."
I like how clear this visualization is despite being packed with data. Once you start paying attention to the different parameters, you'll find yourself restarting the 5-minute video to watch some other property.
I'm not sure I'd call anything using libhybris "Linux-based". Their low-level elements come from Android with all the problems that implies, including Android being Linux only in the most irrelevant technical sense.
Sailfish is based on the Linux kernel and on Mer, which is itself a fork of MeeGo, a mobile Linux distro.
Hybris is just one component that Mer supports: a compatibility layer which allows the use of Android libraries and drivers. But that doesn't make it the backbone of Sailfish...
There were calculators in the '90s where the display would go faint when you covered the tiny solar panel. Perhaps the battery was already drained. Quite common I would say.
RPN is definitely easier to implement. I helped someone do that as a student project and while it was minimally complex, there were no edge cases with the operators.
You pay for that by having a stack rather than a small fixed number of variables.
> You pay for that by having a stack rather than a small fixed number of variables.
You can easily add variables to your RPN calculator. For example, ">x" pops the top of the stack into the variable x, and "<x" pushes the value of x onto the stack.
You can also interpret parentheses as whitespace to enable users to group parts of the computation (but this may become confusing when they write nonsensical parentheses).
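A minimal sketch of that in Python, assuming whitespace-separated tokens and only the four basic operators (the function name and the choice of floats are mine):

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(source, variables=None):
    """Evaluate an RPN string and return the final stack as a list."""
    variables = {} if variables is None else variables
    stack = []
    # Parentheses are treated as plain whitespace, so they only group visually.
    tokens = source.replace("(", " ").replace(")", " ").split()
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        elif tok.startswith(">") and len(tok) > 1:
            # ">x" pops the top of the stack into variable x.
            variables[tok[1:]] = stack.pop()
        elif tok.startswith("<") and len(tok) > 1:
            # "<x" pushes the value of variable x onto the stack.
            stack.append(variables[tok[1:]])
        else:
            stack.append(float(tok))
    return stack


if __name__ == "__main__":
    print(eval_rpn("(3 4 +) >x (<x <x *)"))  # [49.0]
```

Since the parentheses are just dropped, something like "3 (4 +)" evaluates exactly like "3 4 +", which is where the confusion can come from.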
My HP RPN calculator only has four positions available in its stack, which I imagine makes the implementation a bit simpler than a stack of arbitrary size.
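For comparison, a fixed four-register stack can be sketched like this. The register names X, Y, Z, T and the "T is copied down when the stack drops" behavior are my assumption of how the classic HP models work, not something I checked against a manual:

```python
import operator


class FourLevelStack:
    """Fixed-size RPN stack: four registers, no dynamic allocation."""

    def __init__(self):
        # Registers in order X, Y, Z, T (X is the displayed value).
        self.regs = [0.0, 0.0, 0.0, 0.0]

    def push(self, value):
        # Stack lift: everything moves up one slot and the old T is lost.
        x, y, z, _t = self.regs
        self.regs = [value, x, y, z]

    def binary(self, fn):
        # Combine Y and X into X, drop the stack, and duplicate T (assumed).
        x, y, z, t = self.regs
        self.regs = [fn(y, x), z, t, t]


if __name__ == "__main__":
    s = FourLevelStack()
    for v in (1.0, 2.0, 3.0, 4.0, 5.0):
        s.push(v)
    print(s.regs)  # [5.0, 4.0, 3.0, 2.0]: the 1.0 fell off the top
    s.binary(operator.add)
    print(s.regs)  # [9.0, 3.0, 2.0, 2.0]
```

There is no overflow or underflow to handle: pushing too much silently loses the oldest value, and operators always find four registers to work with.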
The typical 4-function calculator doesn't even allow multiple subtrees of computation, so I think it works out to having something like 2 entries on the stack.
The environmental impact grows a lot more with the number of products sold than the maintenance burden does.
For a business that sold a thousand units with a handful remaining, the calculation is going to be a lot less dominated by impact than for a giant who sold millions and has thousands still in use.
And if there's a giant in this industry, that's Apple.