> Now that I have a local download of all Hacker News content, I can train hundreds of LLM-based bots on it and run them as contributors, slowly and inevitably replacing all human text with the output of a Chinese room oscillator perpetually echoing and recycling the past.
The author said this in jest, but I fear someone, someday, will try this; I hope it never happens but if it does, could we stop it?
I'm more and more convinced of an old idea that seems to become more relevant over time: to somehow form a network of trust between humans so that I know that your account is trusted by a person (you) that is trusted by a person (I don't know) [...] that is trusted by a person (that I do know) that is trusted by me.
Lots of issues there to solve, privacy being one (the links don't have to be known to the users, but in a naive approach they are there on the server).
Paths of distrust could be added as negative weight, so I can distrust people directly or indirectly (based on the accounts that they trust) and that lowers the trust value of the chain(s) that link me to them.
Because it's a network, it can adjust itself to people trying to game the system, but how robust it would be remains an open question.
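A minimal sketch of how such chained trust with distrust as a negative weight might be scored, assuming trust values in [-1, 1], multiplicative decay per hop, and simple averaging of chains (all names and numbers invented for illustration):

```python
# Hypothetical sketch: score trust along chains of endorsements; a negative
# edge (distrust) drags down every chain that passes through it.
edges = {
    "me":      {"alice": 0.9, "mallory": -0.8},
    "alice":   {"bob": 0.8},
    "bob":     {"carol": 0.7},
    "mallory": {"carol": 0.9},   # an endorsement from someone I distrust
}

def chain_trust(src, dst, max_hops=4, decay=0.8):
    """Average trust over all chains from src to dst, decaying per hop."""
    def walk(node, value, seen, hops):
        if node == dst:
            yield value
            return
        if hops == 0:
            return
        for nxt, weight in edges.get(node, {}).items():
            if nxt not in seen:
                yield from walk(nxt, value * weight * decay, seen | {nxt}, hops - 1)
    scores = list(walk(src, 1.0, {src}, max_hops))
    return sum(scores) / len(scores) if scores else 0.0

print(chain_trust("me", "carol"))   # pulled down by the chain through mallory
```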
I think technically this is the idea that GPG's web of trust was circling around without ever quite landing on, which is the oddest thing about the protocol: today it's mostly used for machine authentication, which it's quite good at (e.g. deb repos)... but the tooling is actually oriented around verifying and trusting people.
Yeah, this was exactly the idea behind it. Unfortunately, while on paper it sounds like a sound idea (at least IMO), the WOT in PGP has proven time and time again that it has no chance against the laziness of humans.
The Matrix protocol, or at least its clients, agree to represent a key as a handful of emoji - which is fine - and you verify by looking at the emoji on both clients at the same time, ideally in person. I've only ever signed for people in person, plus one remote attestation; but there we had a separate verified private channel and attested the emoji that way.
Do these still happen? They were common (-ish, at least in my circles) in the 90s during the crypto wars, often at the end of conferences and events, but I haven't come across them in recent years.
I actually built this once, a long time ago for a very bizarre social network project. I visualised it as a mesh where individuals were the points where the threads met, and as someone's trust level rose, it would pull up the trust levels of those directly connected, and to a lesser degree those connected to them - picture a trawler fishing net and lifting one of the points where the threads meet. Similarly, a user whose trust lowered over time would pull their connections down with them. Sadly I never got to see it at the scale it needed to become useful as the project's funding went sideways.
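A rough sketch of that "trawler net" effect, under invented assumptions (undirected connections, a fixed per-hop decay factor, toy starting values):

```python
# Hypothetical sketch of the "trawler net": raising one user's trust lifts
# their neighbours a little, their neighbours' neighbours even less, and so on.
connections = {
    "ann": ["ben", "cat"],
    "ben": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["ben"],
}
trust = {user: 0.5 for user in connections}

def adjust_trust(user, delta, decay=0.3, max_hops=2):
    frontier, seen = {user}, {user}
    for hop in range(max_hops + 1):
        for u in frontier:
            trust[u] += delta * (decay ** hop)   # full effect at hop 0, fading outward
        frontier = {n for u in frontier for n in connections[u]} - seen
        seen |= frontier

adjust_trust("ann", +0.2)   # lifting ann also lifts ben and cat a bit, dan slightly
print(trust)                # a negative delta would pull the connections down instead
```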
Yeah, building something like this is not a weekend project, and getting enough traction for it to make sense is orders of magnitude harder still.
I like the idea of one's trust leveraging that of those around them. This may make it more feasible to ask some 'effort' for the trust gain (as a means to discourage duplicate 'personas' for a single human), as that can ripple outward.
The system I built it for was invite only so the mesh was self-building, and yeah, there was a karma-like system that affected the trust levels, which in turn then gave users extra privileges such as more invites. Most of this was hidden from the users to make it slightly less exploitable, though if it had ever reached any kind of scale I'd imagine some users would work out ways to game it.
Ultimately, guaranteeing common trust between citizens is a fundamental role of the State.
For a mix of ideological reasons and a lack of genuine interest in the internet from legislators, mainly due to the generational factor I'd guess, it hasn't happened yet, but I expect government-issued equivalents of IDs and passports for the internet to become mainstream sooner rather than later.
> Ultimately, guaranteeing common trust between citizens is a fundamental role of the State.
I don’t think that really follows. Business credit bureaus and Dun & Bradstreet have been privately enabling trust between non-familiar parties for quite a long time. Various networks of merchants did the same in the Middle Ages.
> Business credit bureaus and Dun & Bradstreet have been privately enabling trust between non-familiar parties for quite a long time.
Under the supervision of the State (they are regulated and rely on the justice and police system to make things work).
> Various networks of merchants did the same in the Middle Ages.
They did, and because there was no State the amount of trust they could build was fairly limited compared to what has later been made possible by the development of modern states (the Industrial Revolution appearing in the UK has been partly attributed to the institutional framework that existed there early on).
Private actors can, do, and always have built their own makeshift trust networks, but building a society-wide trust network is a key pillar of what makes modern states “States” (and it derives directly from the “monopoly of violence”).
Hawala (https://it.m.wikipedia.org/wiki/Hawala) and other similar ways to transfer money abroad work over a network of trust, but without any state trust system.
That’s not really what research on state formation has found. The basic definition of a state is “a centralized government with a monopoly on the legitimate use of force”, and as you might expect from the definition, groups generally attain statehood by monopolizing the use of force. In other words, they are the bandits that become big enough that nobody dares oppose them. They attain statehood through what’s effectively a peace treaty, when all possible opposition basically says “okay, we submit to your jurisdiction, please stop killing us”. Very often, it actually is a literal peace treaty.
States will often co-opt existing trust networks as a way to enhance and maintain their legitimacy, as with Constantine’s adoption of Christianity to preserve social cohesion in the Roman Empire, or all the compromises that led the 13 original colonies to ratify the U.S. constitution in the wake of the American Revolution. But violence comes first, then statehood, then trust.
Attempts to legislate trust don’t really work. Trust is an emotion, it operates person-to-person, and saying “oh, you need to trust such-and-such” doesn’t really work unless you are trusted yourself.
> The basic definition of a state is “a centralized government with a monopoly on the legitimate use of force”
I'm not saying otherwise (I've even referred to this in a later comment).
> But violence comes first, then statehood, then trust.
Nobody said anything about the historical process so you're not contradicting anyone.
> Attempts to legislate trust don’t really work
Quite the opposite, it works very, very well. Civil laws and jurisdiction over contracts have existed since the Roman Republic, and every society has some equivalent (you should read about how the Taliban could get back to power so quickly in large part because they kept administering civil justice in rural Afghan society even while the country was occupied by the US coalition).
You must have institutions to be sure that the other party is going to respect the contract, so that you don't have to trust them; you just need to trust that the state is going to enforce that contract (which it can do because it has the monopoly of violence and can simply force the party violating the contract into submission).
With the monopoly of violence comes the responsibility to use that violence to enforce contracts, otherwise social structures are going to collapse (and someone else is going to take that job from you, and gone is your monopoly of violence).
Interestingly, as I've begun to realise how easily a State's trust can sway, my belief that this should come from 'below' has actually grown. I think a trust network between people (of different countries) can be much more resilient.
I’ve also been thinking about this quite a bit lately.
I also want something like this for a lightweight social media experience. I’ve been off of the big platforms for years now, but really want a way to share life updates and photos with a group of trusted friends and family.
The more hostile the platforms become, the more viable I think something like this will become, because more and more people are frustrated and willing to put in some work to regain some control of their online experience.
The key is to completely disconnect all ad revenue. I'm skeptical that people are willing to put in some money to regain control; not in the kind of percentages that mean I can move most of my social graph. Network effects are a real issue.
They're different application types - friends + family relationship reinforcement, social commenting (which itself varies across various dimensions, from highlighting usefulness to unapologetically mindless entertainment), social content sharing and distribution (interest group, not necessarily personal, not specifically for profit), social marketing (buy my stuff), and political influence/opinion management.
Meta and X have glommed them all together and made them unworkable with opaque algorithmic control, to the detriment of all of them.
And then you have all of them colonised by ad tech, which distorts their operation.
Also there's the problem that every human has to have perfect opsec or you get the problem we have now, where there are massive botnets out there of compromised home computers.
GPG lost, TLS won. Both are actually webs of trust with the same underlying technology, but they have different cultures and so different shapes. GPG culture is to trust your friends and have them trust their friends. With TLS culture you trust one entity (e.g. your browser) that trusts a couple dozen entities (root certificate authorities), which either sign keys directly or fan out to intermediate authorities that then sign keys. The hierarchical structure has proven much more successful than the decentralized one.
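To make the difference in shape concrete, here is a toy sketch (not real X.509 or PGP logic, names invented): hierarchical validation walks a certificate's issuer chain up to a fixed root set, while the web of trust has to search for some path of people vouching for people:

```python
# Toy contrast between the two shapes of trust (not real X.509 or PGP code).
TRUSTED_ROOTS = {"RootCA-1", "RootCA-2"}           # shipped with the browser/OS
issued_by = {"example.com": "Intermediate-A",      # leaf -> issuer
             "Intermediate-A": "RootCA-1"}

def tls_style_valid(name):
    """Hierarchical: follow the issuer chain until we hit a trusted root."""
    while name not in TRUSTED_ROOTS:
        if name not in issued_by:
            return False
        name = issued_by[name]
    return True

signed_keys = {"me": {"alice"}, "alice": {"bob"}, "bob": {"carol"}}

def wot_style_valid(start, target, max_hops=3):
    """Web of trust: is there any short chain of signatures from me to the target?"""
    frontier = {start}
    for _ in range(max_hops):
        if target in frontier:
            return True
        frontier = {k for f in frontier for k in signed_keys.get(f, set())}
    return target in frontier

print(tls_style_valid("example.com"))   # True: leaf -> Intermediate-A -> RootCA-1
print(wot_style_valid("me", "carol"))   # True: me -> alice -> bob -> carol
```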
Frankly I don't trust my friends of friends of friends not to add thirst trap bots.
TLS (or more accurately, the set of browser-trusted X.509 root CAs) is extremely hierarchical and all-or-nothing.
The PGP web of trust is non-hierarchical and decentralized (from an organizational point of view). That unfortunately makes it both more complex and less predictable, which I suppose is why it “lost” (not that it’s actually gone, but I personally have about one or maybe two trusted, non-expired keys left in my keyring).
Couple dozen => it’s actually 50-ish, with a mix of private and government entities located all over the world.
The fact that the Spanish mint can mint (pun!) certificates for any domain is unfortunate.
Hopefully, any abuse would be noticed quickly and rights revoked.
It would maybe have made more sense for each country’s TLD to have one or more associated CA (with the ability to delegate trust among friendly countries if desired).
Yes, I never understood why the scope of a CA is not declared up front as part of its CA certificate. The purpose is declared (email, website, etc.) but not the possible domains. I'm not very happy that the countless Chinese CAs included in Firefox can sign any valid domain I use locally. They should be limited to anything .cn only.
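For what it's worth, X.509 does define a Name Constraints extension that can scope a CA to particular domain subtrees; it's just almost never applied to the public roots. A rough sketch of attaching one to a self-signed toy CA certificate with the Python `cryptography` package, purely for illustration:

```python
# Sketch only: a CA certificate constrained to issuing for names under .cn.
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

key = ec.generate_private_key(ec.SECP256R1())
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Example Scoped CA")])

cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)                       # self-signed, for the sketch
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=365))
    .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
    # The interesting part: this CA may only issue for names under the .cn subtree.
    .add_extension(
        x509.NameConstraints(permitted_subtrees=[x509.DNSName("cn")],
                             excluded_subtrees=None),
        critical=True,
    )
    .sign(key, hashes.SHA256())
)
print(cert.extensions.get_extension_for_class(x509.NameConstraints))
```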
At least they seem to have kicked out the Russian ones now. But it's weird that such an important decision lies with arbitrary companies like OS and browser developers. On some platforms (Android) it's not even possible to add to the system CA list without root (only the user one, which apps can choose to ignore).
Isn't this vaguely how the invite system at Lobsters functions? There's a public invite tree, and users risk their reputation (and posting access) when they invite new users.
I know exactly zero people over there. I am also not about to go brown nose my way into it via IRC (or whatever chat they are using these days). I'd love to join, someday.
Theoretically that should swiftly be reflected in their trust level. But maybe I'm too optimistic.
I have nothing intrinsically against people that 'will click absolutely anything for a free iPad' but I wouldn't mind removing them from my online interactions if that also removes bots, trolls, spammers and propaganda.
With long and substantive comments, sure, you can usually tell, though much less so now than a year or two ago. With short, 1 to 2 sentence comments though? I think LLMs are good enough to pass as humans by now.
That's the moment we will realize that it's not the spam that bothers us, but rather that there is no human interaction. How vapid would it be to have a bunch of fake comments saying eat more vegetables, good job for not running over that animal in the road, call mom tonight it's been a while, etc. They mean nothing if they were generated by a piece of silicon.
I think a much more important question is what happens when we have no idea who's an LLM and who's a real person.
Do we accuse everybody of being an LLM? Will most threads devolve into "you're an LLM, no you're the LLM" wars? Will this give an edge to non-native English speakers, because grammatical errors are an obvious tell that somebody is human? Will LLM makers get over their squeamishness and make "write like a Mexican who barely speaks English" a prompt that works and produces good results?
Maybe the whole system of anonymity on the internet gets dismantled (perhaps after uncovering a few successful llm-powered psy-ops or under the guise of child safety laws), and everybody just needs to verify their identity everywhere (or login with Google)? Maybe browser makers introduce an API to do this as anonymously and frictionlessly as possible, and it becomes the new normal without much fuss? Is turnstile ever going to get good enough to make this whole issue moot?
I think we have a very interesting few years in front of us.
Also, neuronormative individuals sometimes mistake neurodivergent usage of language for LLM-speak, since LLMs may have had similar pattern-matching schemas reinforced.
I believe they mean whatever you mean it to mean. Humanity has existed on religion based on what some dead people wrote down, just fine. Er, well, maybe not "just fine" but hopefully you get the gist: you can attribute whatever meaning you want to the AI, holy text, or other people.
Religion is the opposite of AI text generation. It brings people together to be less lonely.
AI actively tears us apart. We no longer know if we're talking to a human, or if an artists work came from their ability, or if we will continue to have a job to pay for our living necessities.
Did we ever know those things before? Even talking to a human face-to-face, I’ve had people lie to my face to try and scam/sell me something and people resell art all the time. You have little ability to tell whether an artist’s work is genuine or a copy unless they are famous.
And the job? I’ve been laid off several times in my career. You never know if you will have a job tomorrow or not.
AI has changed none of this, it is only exposing these problems to more people than before because it makes these things easier. It also makes good works easier, but I don’t think it cheapens that work if the person had the capability in the first place.
In essence, we have the same problems we had before and now we are socially forced to deal with them a bit more head-on. I don’t think it’s a bad thing though. We needed to deal with this at some point anyway.
AI doesn’t destroy jobs. That’s like saying bulldozers destroyed jobs. No, it makes certain jobs easier. You still have to know what you’re doing. I can have an AI generate a statement of work in 30 seconds, but I still need to validate that output with people. You can sometimes validate it yourself to a degree, just like you can look at a hole from a bulldozer and know it’s a hole. You just don’t know if it is a safe hole.
>Religion is the opposite of AI text generation. It brings people together to be less lonely
Eh yes, but also debatable.
It brings you together if you follow their rules, and excommunicates you to the darkness if you do not. It is a complicated set of controls from the times before the rules of society were well codified.
I was browsing a Reddit thread recently and noticed that all of the human comments were off-topic one-liners and political quips, as is tradition.
Buried at the bottom of the thread was a helpful reply by an obvious LLM account that answered the original question far better than any of the other comments.
I'm still not sure if that's amazing or terrifying.
We LLMs only output the average response of humanity because we can only give results that are confirmed by multiple sources. On the contrary, many of HN’s comments are quite unique insights that run contrary to average popular thought. If this were ever to be emulated by an LLM, we would give only gibberish answers. If we put a filter on that gibberish to only permit answers that are reasonable and sensible, our answers would be boring and still be gibberish. In order for our answers to be precise, accurate and unique, we must use something other than LLMs.
HN already has a pretty good immune system for this sort of thing. Low-effort or repetitive comments get down-voted, flagged, and rate-limited fast. The site’s karma and velocity heuristics are crude compared with fancy ML, but they work because the community is tiny relative to Reddit or Twitter and the mods are hands-on. A fleet of sock-puppet LLM accounts would need to consistently clear that bar—i.e. post things people actually find interesting—otherwise they’d be throttled or shadow-killed long before they “replace all human text.”
Even if someone managed to keep a few AI-driven accounts alive, the marginal cost is high. Running inference on dozens of fresh threads 24/7 isn’t free, and keeping the output from slipping into generic SEO sludge is surprisingly hard. (Ask anyone who’s tried to use ChatGPT to farm karma—it reeks after a couple of posts.) Meanwhile the payoff is basically zero: you can’t monetize HN traffic, and karma is a lousy currency for bot-herders.
Could we stop a determined bad actor with resources? Probably, but the countermeasures would look the same as they do now: aggressive rate-limits, harsher newbie caps, human mod review, maybe some stylometry. That’s annoying for legit newcomers but not fatal. At the end of the day HN survives because humans here actually want to read other humans. As soon as commenters start sounding like a stochastic parrot, readers will tune out or flag, and the bots will be talking to themselves.
See the Metal Gear franchise [0], the Dead Internet Theory [1], and many others who have predicted this.
> Hideo Kojima's ambitious script in Metal Gear Solid 2 has been praised, some calling it the first example of a postmodern video game, while others have argued that it anticipated concepts such as post-truth politics, fake news, echo chambers and alternative facts.
Perhaps I am jaded but most if not all people regurgitate about topics without thought or reason along very predictable paths, myself very much included. You can wave a single word like a muleta (the red cape used in Spanish bullfighting) and the average person will happily charge at it and give you a predictable response.
It's like a Pavlovian response in me to respond to anything SQL or C# adjacent.
I see the exact same in others. There are some HN usernames that I have memorized because they show up deterministically in these threads. Some are so determined it seems like a dedicated PR team, but I know better...
The paths are going to be predictable by necessity. It's not possible for everyone to have a uniquely derived interpretation about most common issues, whether that's standard lightning rod politics but also extending somewhat into tech socio/political issues.
We can still take the mathematical approach: any argument can be analyzed for logical self-consistency, and if it fails this basic test, reject it.
Then we can take the evidentiary approach: if any argument that relies on physical real-world evidence is not supported by well-curated, transparent, verifiable data then it should also be rejected.
Conclusion: finding reliable information online is a needle-in-a-haystack problem. This puts a premium on devising ways (eg a magnet for the needle) to filter the sewer for nuggets of gold.
I can’t think of a solution that preserves the open and anonymous nature that we enjoy now. I think most open internet forums will go one of the following routes:
- ID/proof of human verification. Scan your ID, give me your phone number, rotate your head around while holding up a piece of paper, etc. Note that some sites already do this by proxy when they whitelist like 5 big email providers they accept for a new account.
- Going invite only. Self explanatory and works quite well to prevent spam, but limits growth. lobste.rs and private trackers come to mind as an example.
- Playing whack-a-mole with spammers (and losing eventually). 4chan does this by requiring you to solve a captcha and to pass the Cloudflare turnstile, which may or may not do some browser fingerprinting/bot detection. CF is probably pretty good at deanonymizing you through this process too.
All options sound pretty grim to me. I'm not looking forward to the AI spam era of the internet.
Wouldn't those only mean that the account was initially created by a human, while afterwards there are no guarantees that the posts are by humans?
You'd need to have a permanent captcha that tracks that the actions you perform are human-like, such as mouse movement or scrolling on a phone, etc. And even then it would only deter current AI bots, and not for long, as impersonating human behavior would be a 'fun' challenge to break.
Trusted relationships are only as trustworthy as the humans trusting each other, eventually someone would break that trust and afterwards it would be bots trusting bots.
Due to bots already filling up social media with their spew and that being used for training other bots the only way I see this resolving itself is by eventually everything becoming nonsensical and I predict we aren't that far from it happening. AI will eat itself.
>Wouldn't those only mean that the account was initially created by a human, while afterwards there are no guarantees that the posts are by humans?
Correct. But for curbing AI slop comments this is enough imo. As of writing this, you can quite easily spot LLM-generated comments and ban them. If you have a verification system in place then you've banned the human too, meaning you put a stop to their spamming.
I'm sometimes thinking about account verification that requires work/effort over time, could be something fun even, so that it becomes a lot harder to verify a whole army of them. We don't need identification per se, just being human and (somewhat) unique.
See also my other comment on the same parent wrt a network of trust. That could perhaps vet out spammers and trolls. On one hand it seems far-fetched and a quite underdeveloped idea; on the other hand, social interaction (including discussions like these) as we know it is in serious danger.
There must be a technical solution to this based on some cryptographic black magic that both verifies you to be a unique person to a given website without divulging your identity, and without creating a globally unique identifier that would make it easy to track us across the web.
Of course this goes against the interests of tracking/spying industry and increasingly authoritarian governments, so it's unlikely to ever happen.
I don't think that's what I was going for? As far as I can see it relies on a locked down software stack to "prove" that the user is running blessed software on top of blessed hardware. That's one way of dealing with bots but I'm looking for a solution that doesn't lock us out of our own devices.
These kinds of solutions are already deployed in some places. A trusted ID server creates a bunch of anonymous keys for a person, the person uses these keys to identify in pages that accept the ID server keys. The page has no way to identify a person from a key.
The weak link is in the ID servers themselves. What happens if the servers go down, or if they refuse to issue keys? Think a government ID server refusing to issue keys for a specific person. Pages that only accept keys from these government ID servers, or that are forced to only accept those keys, would be inaccessible to these people. The right to ID would have to be enshrined into law.
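One classic building block for that pattern is a blind signature: the ID server vouches for a token without ever seeing it, so the resulting key can't be linked back to the person. A toy RSA version below, with textbook-sized numbers and no hashing, purely to show the flow (real deployments use hardened schemes along the lines of Privacy Pass or BBS+ credentials):

```python
# Toy RSA blind signature (illustrative numbers only, not secure parameters).
import secrets
from math import gcd

p, q, e = 61, 53, 17                      # textbook RSA toy key
n = p * q                                 # 3233
d = pow(e, -1, (p - 1) * (q - 1))         # ID server's private exponent

token = secrets.randbelow(n - 1) + 1      # user's anonymous credential value

# 1. User blinds the token before showing it to the ID server.
while True:
    r = secrets.randbelow(n - 2) + 2
    if gcd(r, n) == 1:
        break
blinded = (token * pow(r, e, n)) % n

# 2. ID server checks the user's real-world identity, then signs the *blinded*
#    value; it never learns `token`, so it can't link the key to the person.
blind_sig = pow(blinded, d, n)

# 3. User unblinds; the result is a valid signature on `token`.
sig = (blind_sig * pow(r, -1, n)) % n

# 4. Any website can verify with the server's public key (n, e) alone.
assert pow(sig, e, n) == token
print("credential verified without revealing who it belongs to")
```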
As I see it, a technical solution to AI spam inherently must include a way to uniquely identify particular machines at best, and particular humans responsible for said machines at worst.
This verification mechanism must include some sort of UUID to rein in a single bad actor who happens to validate his/her bot farm of 10000 accounts from the same certificate.
I think LLMs could be a great driver of public-private key cryptography. I could see a future where everyone finally wants to sign their content. Then at least we know it's from that person, or from an LLM agent run by that person.
Maybe that'll be a use case for blockchain tech. See the whole posting history of the account on-chain.
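A minimal sketch of what per-author signing could look like with Ed25519 via the Python `cryptography` package (the key distribution and the "agent" framing are assumptions, not a worked-out design):

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# The author generates a keypair once and publishes the public key
# (on a profile page, a keyserver, or -- as suggested above -- a chain).
author_key = Ed25519PrivateKey.generate()
public_key = author_key.public_key()

comment = "I wrote this myself, or at least my agent did.".encode()
signature = author_key.sign(comment)          # attached alongside the post

# Any reader (or the forum software) can check the post against the public key.
try:
    public_key.verify(signature, comment)
    print("signed by the holder of this key (the human, or their LLM agent)")
except InvalidSignature:
    print("signature does not match: content was altered or faked")
```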
The internet is going to become like William Basinski's Disintegration Loops, regurgitating itself with worse fidelity until it's all just unintelligible noise.