Hacker News | peterlk's comments

LLMs can now detect garbage much more cheaply than humans can. This might increase cost slightly for the companies that own the AIs, but it almost certainly will not result in hiring human reviewers.


> LLMs can now detect garbage much more cheaply than humans can.

Off the top of my head, I don't think this is true for training data. I could be wrong, but it seems very fallible to let GPT-5 be the source of ground truth for GPT-6.


I don't think an LLM can even detect garbage during a training run. While training, the system is only tasked with predicting the next token in the training set; it isn't trying to reason about the validity of the training set itself.


LLM-as-a-judge has been working well for years now.

RL from LLMs works.


You can triage with an LLM, at least. Throw away the obvious junk, have a human look at anything doubtful.
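
A rough sketch of that triage idea, in case it helps make it concrete. Everything here is an assumption for illustration: `junk_score` stands in for whatever model call you would actually use (it just has to return a number between 0 and 1), the thresholds are made up, and `toy_junk_score` is a trivial placeholder, not an LLM.

```python
from typing import Callable, Dict, Iterable, List


def triage(
    docs: Iterable[str],
    junk_score: Callable[[str], float],  # hypothetical scorer, e.g. a wrapper around an LLM call
    discard_above: float = 0.9,          # assumed threshold: at or above this is "obvious junk"
    keep_below: float = 0.1,             # assumed threshold: at or below this is "obviously fine"
) -> Dict[str, List[str]]:
    """Split documents into discard / review / keep buckets by junk score."""
    buckets: Dict[str, List[str]] = {"discard": [], "review": [], "keep": []}
    for doc in docs:
        score = junk_score(doc)
        if score >= discard_above:
            buckets["discard"].append(doc)
        elif score <= keep_below:
            buckets["keep"].append(doc)
        else:
            # Anything in between is doubtful and goes to a human.
            buckets["review"].append(doc)
    return buckets


def toy_junk_score(text: str) -> float:
    """Placeholder scorer: share of characters that are not letters, digits, or whitespace."""
    if not text:
        return 1.0
    odd = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    return odd / len(text)


if __name__ == "__main__":
    sample = ["A plain sentence.", "Hm... odd?!? text", "$$$ ### !!!"]
    print(triage(sample, toy_junk_score, discard_above=0.5, keep_below=0.1))
```

The only real design choice is having two thresholds instead of one, so the expensive human time is spent on the middle band only.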


There are multiple people claiming this in this thread, but with no more than a "it doesn't work stop". Would be great to hear some concrete information.


What about garbage that are difficult to tell from truth?

For example, say I have an AD&D website, how does AI tell whether a piece of FR history is canon or not? Yeah I know it's a bit extreme, but you get the idea.


If the same garbage is repeated enough all over the net, the AIs will suffer brain rot. GIGO and https://news.ycombinator.com/item?id=45656223

Next step will be to mask the real information with typ0canno. Or parts of the text, otherwise search engines will fail miserably. Also squirrel anywhere so dogs look in the other direction. Up.

Imagine filtering the meaty parts with something like /usr/games/rasterman:

> what about garbage thta are dififult to tell from truth?

> for example.. say i have an ad&d website.. how does ai etll whether a piece of fr history is canon ro not? yeah ik now it's a bit etreme.. but u gewt teh idea...

or /usr/games/scramble:

> Waht aobut ggaabre taht are dficiuflt to tlel form ttruh?

> For eapxlme, say I hvae an AD&D wisbete, how deos AI tlel wthheer a pciee of FR hsiotry is caonn or not? Yaeh I konw it's a bit emxetre, but you get the ieda.

Sadly punny humans will have a harder time decyphering the mess and trying to get the silly references. But that is a sacrifice Titans are willing to make for their own good.

ElectroBuffoon over. bttzzzz
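
For anyone wondering what the scramble-style output quoted above amounts to, here is a minimal sketch. It is not the actual /usr/games/scramble program, just an assumed reimplementation of the visible effect: keep the first and last letter of each word and shuffle the letters in between. The function names and the `seed` parameter are made up for the example.

```python
import random
import re
from typing import Optional


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of a word, keeping the first and last in place."""
    if len(word) <= 3:
        return word  # nothing to shuffle with zero or one interior letter
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text: str, seed: Optional[int] = None) -> str:
    """Scramble every run of ASCII letters; punctuation, digits, and spacing are untouched."""
    rng = random.Random(seed)
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)


if __name__ == "__main__":
    print(scramble_text("What about garbage that is difficult to tell from truth?", seed=42))
```

Passing a seed makes the output reproducible; leaving it as `None` gives a different scramble on each run.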


You realise that LLMs are already better at deciphering this than humans?


What cost do they incur while tokenizing highly mistyped text? Woof. To later decide real crap or typ0 cannoe.

Trying to remember the article that tested small inlined weirdness to get surprising output. That was the inspiration for the up up down down left right left right B A approach.

So far LLMs still mix command and data channels.


There are multiple people claiming this in this thread, but with no more than a "it doesn't work stop". Would be great to hear some concrete information.



I think OP is claiming that if enough people are using these obfuscators, the training data will be poisoned. The LLM being able to translate it right now is not a proof that this won't work, since it has enough "clean" data to compare against.


If enough people are doing that, then vernacular English has changed to be like that.

And it still isn't a problem for LLMs. There is sufficient history for them to learn from, and in any case low-resource language learning shows they are better than humans at learning language patterns.

If it follows an approximate grammar then an LLM will learn from it.


I don't mean people actually conversing like this on the internet, but using programs like what is in the article to feed it to the bots only.


This is exactly like those search engine traps people implemented in the late 90s and is roughly as effective.

But sure.


Was saying this 3x in this thread necessary?


I'm just interested in opinions from all 3


I thought it was a bot


They can’t easily detect garbage; they can easily detect things that are outside the dataset (for some value of such).

Which means that real “new” things and random garbage could look quite similar.


You're missing the point. The goal of garbage production is not to break the bots or poison LLMs, but to remove load from your own site. The author writes it in the article. He found that feeding bots garbage is the cheapest strategy, that's all.


You’re absolutely right!


Does this target any particular area? Or does it make people more hairy everywhere? If it targets a particular area, can anyone explain to me how scientists discover such specific pathways?


It's the same drug used in Rogaine, but oral form. Grows hair everywhere to some degree.


Counterargument: I can learn things at record speed (for me) because I can learn things in the order that makes sense to me. I find it much more motivating to start with: “why is AI bad at playing video games?”, than to start with “what is the chain rule?”

I will certainly need to learn about the chain rule eventually, but I find that I get lost in the details (and unmotivated to continue) without an end goal that is interesting to me.

AI loves to make vibe charts (sometimes with lots of extra steps), but that’s part of the process. It is nontrivial to wrangle LLMs through larger projects, and people need to learn that too.


Of course! We still have computers the size of mainframes that ran on vacuum tubes. They are just built with vastly more powerful hardware and are used for specialized tasks that supercomputing facilities care about.

But it has the potential to alter the economics of AI quite dramatically


Because, similar to the US, they have authoritarian tendencies - strong nationalism and anti-immigration. How are you going to round up the bad people if you don't have surveillance everywhere?


I am unclear on how strong nationalism is an authoritarian signal. Can you go into more detail there?


Because it makes it easier to create scapegoats, and excuses for why restrictions must be created.

Blame the Jews, the immigrants, the trans, and then people will grudgingly accept the Gestapo, ICE, prosecution without proof or courts.

Which then allows you to target the opposition without proof.


Because it’s a fake nationalism where they decide who and what is considered part of the nation and who and what not.


Well the Axis powers from World War II are the most obvious demonstrations of nationalism begetting authoritarianism. Germany, Italy, and Japan were nationalist in the extreme. And Italy from that time is such a clear example that it's basically the canonical example used to teach how fascism emerges.

Contemporary examples include the Philippines, Hungary, Poland's Law and Justice Party, and arguably Russia, Turkey and India. Modi is a Hindu nationalist. The United States unfortunately is shaping up to count as an example as well.

Extreme forms of nationalism tend to have a narrative of grievance, a desire to restore a once-great national identity, and a tendency to divide the world into loyal citizens and enemies without and within, against whom authoritarian powers must be mobilized.

So there's a conceptual basis, in terms of setting the stage for rationalizing authoritarianism, as well as abundant historical examples demonstrating the marriage of nationalism and authoritarianism in action. There's nothing wrong with not knowing, but I would say there's an extremely strong and familiar historical canon to those who study the topic.


But that would only be something nationalism signaled if the converse weren’t also true — eg, totalitarian states like the USSR, CCP, etc.

Those also had:

- grievance narratives;

- a tendency to divide the world into loyal citizens and enemies; and,

- use the above to justify authoritarian powers.

You haven’t shown that nationalism played a particular part in that cycle; just that it also happened in nationalist states. Almost like the problem is those factors, rather than nationalism.


The USSR absolutely used a nationalist view in their propaganda [0]

As did the CCP [1]:

> Ideals and convictions are the spiritual banners for the united struggle of a country, nation and party, wavering ideals and convictions are the most harmful form of wavering.

[0] https://www.cambridge.org/core/journals/nationalities-papers...

[1] https://en.wikipedia.org/wiki/Ideology_of_the_Chinese_Commun...


I actually considered listing them as additional examples, but I had to stop somewhere and they had their own distinct wrinkles.

I think the major difference in their respective cases pertains to the ideological dynamics of the particular strains of communism that manifested in those countries. What they lack is a fixation on the purity of national heritage as a primary source of moral truth and a foundation for a self-conception. Instead they tended to regard themselves as part of a universal, international struggle and understood conflict in economic and ideological terms. What they had in common was the sense that conflict with this chosen enemy necessitated authoritarianism.

There's more than one path to authoritarianism, and they overlap. Different mechanisms don't disprove one another, they exist side by side.


Here is an interesting review of how the two are historically strongly correlated[1].

Their conclusion is that "[...] ethnic and elitist forms of nationalism, which combine to forge exclusive nationalism, help to perpetuate autocratic regimes by continually legitimating minority exclusions [...]"

Right-wing nationalism as we're currently experiencing it is exclusive. It broadly advocates for restoring revised historical cultural narratives of a particular ethnic group, for immigration restriction and immigrant removal, for further minority culture erasure, and so on.

1: https://ora.ox.ac.uk/objects/uuid:859c6af4-d4fd-461e-b605-42...


Do you know the history of nationalism in Europe, and Germany in particular? Hint: it’s the “Na” part of “Nazi.”

You are getting downvoted because this is pretty basic stuff. Either you’re part of today’s lucky 10k, or your post reads very much like far-right Gish galloping.


I don't know. It seems from what you are saying that you, and honestly an enormous number of people, need to actually learn about 20th century European history and WWII. People are throwing around these terms, Nazi and Gestapo and all of this, and I think they have no idea what they mean. The left is not against authoritarianism. The left does not even really want to eliminate the police. They just want to be the ones to decide who the thought-criminals are and what to do with them. Also, that is not what Gish galloping is. I don't know what is happening here.


The "far left" is anarchy. I don't really see anarchists wanting to keep police around.


Far left has traditionally meant communism.


I do know my history. The Nazi party was a pan-German nationalist party. I'm not sure why this is controversial.

Germans, and Germany are obviously quite sensitive to the dangers of nationalism and authoritarianism. Not just because of WW2, but also the experience of East Germany.


Didn't know the term "Gish Galloping". Thanks! I've experienced this so often in discussions with far right people.


Authoritarian? You're saying this because of immigration; this comes from a position that is basically open borders. It is an interesting double standard. The people that hold this position would not consider non-Western countries that don't want to have open borders or have dramatic demographic shifts in their population and culture to be "authoritarian." This whole notion of "rounding up the bad people" is just infantile leftist stuff. How do you have a sovereign country if you are not able to have a policy that prevents unfettered 'immigration' or unable to deport those that immigrated contrary to law?


The whole concept of a country as a related group of people from one ethnicity or historical origin is relatively recent.

Feudalism did not have this concept; a country was the land belonging to a king (or equivalent), mediated through a set of nobles. There was no concept of illegal or legal immigration; the population of a country were the people who worked for, or were owned by, the nobles ruling that country. There were land rights granted to peasants who had historically lived in that place, but these could be, and often were, overruled by nobles.

European nobility had no such idea of ethnicity or national grouping; the English monarchy is a German family, and most of European nobility were related to each other much more closely than to the citizens of their country.

Early post-monarchy states didn't have this concept. The English Civil War and the French Revolution didn't create states that had a defined concept of the citizen as a member of any ethnic grouping. Again, there's no mention of immigration in any of the documents from this period. It just wasn't a concept they thought about.

The whole concept that a nation-state is a formalisation of a historical grouping of ethnically related people is a very recent one, only a couple of hundred years old.

So to answer your question: It is very easy to have a sovereign country without a policy that prevents unfettered immigration; you just don't care about your population being ethnically diverse. Your citizens are the people who live in your country, and have undergone whatever ceremony and formality you decide makes them citizens.

This is, after all, how America historically did this; if you arrived in America and pledged allegiance, you became a citizen of America.


> We also discovered an extra-pair-paternity event in Beethoven’s paternal line

I’m not a Beethoven scholar. Is this a novel discovery?


No idea! But I found this part of the paper confusing: "between the conception of Aert van Beethoven’s son Hendrik in Kampenhout, Belgium, in c.1572, and the conception of Ludwig van Beethoven seven generations later in 1770, in Bonn, Germany". I think that means that there were 8 generations from Aert to the famous composer, so Aert was the great...grandfather with 6 greats. (How many people know their family history that far back?)

The Wikipedia article for the composer's grandfather (who was also called Ludwig) says: in 1712 two boys named Ludwig van Beethoven were born. The two families were distantly related. [...] it is not certain "which Ludwig" actually settled in Bonn in 1733

But presumably both of those boys were officially descended from Aert so it doesn't matter for the purposes of this analysis.


> How many people know their family history that far back?

That is not so uncommon in Europe. On both my maternal and paternal lines, the oldest ancestors I am aware of lived almost 400 years ago, in the mid 17th century. Before that the registers from my region are very patchy, because of the devastation of the Thirty Years' War. Where this is not the case, it is not uncommon that one is able to continue the line back to the 16th century, especially for people living in towns (not to speak of the aristocracy).

Or, as another example, here is a photo of the reunion of the "descendants and sides relatives of Dr. Martin Luther and Katharina von Bora" (not related to myself): https://www.lutheriden.de/the-lutheriden.html


> How many people know their family history that far back?

Very common to be able to do this in Europe. I've traced my family back to the early 1500s via Ancestry.com, and my family tree isn't anything special. Before that period, records get very sparse among the common folk, though, and you pretty much need to have a strong connection with nobility to go further back in time.


The whole, "no idea!" leading the comment is Norm level digital oration. Take your goddamned upvote. You've earned it slick.


The article says:

> One Beethoven biographer [63] has previously suggested, on circumstantial grounds, that Ludwig senior may not have been Johann van Beethoven’s biological father

> [63] Canisius, C. Beethoven “Sehnsucht und Unruhe in der Musik”: Aspekte zu Leben und Werk Originalausgabe. Schott, 1992

So it's been hypothesized but presumably not demonstrated genetically.


This is not true any more. The vaccine has been shown to lower cancer risk for those who already carry the virus, so it is recommended even for people who are HPV positive.


That's interesting and I would like to take it. Can you give me a link or reference for citation?


This is a totally predictable result: there are dozens of HPV strains, and the vaccine will immunize against 9 of them that are high risk.

So, unless you are a sex worker or similar, it's unlikely you "have" all of them to the point where the vaccine is completely useless. You might later get infected with a strain that you didn't yet have, and it's precisely the one that kills you.


So how would one in the US go about getting it? Gardasil?


As a side note, I find it really interesting and a bit disheartening to follow the history of “web 3”. The original “web 3” was a beautiful vision including the semantic web and its cousins. The term was hijacked by cryptobros through what could have been a genuine or malicious misunderstanding of the original web 3


There's a general phenomenon where terms get annexed, not by meaning that makes most sense, but by meaning that is most salient, most eye catching or emotionally riveting. This leads to Flanderization of many many important cultural concepts.

I fear this is the reason why intellectual discourse seems to have stagnated. We no longer pursue the spirit of important ideas but try to endlessly put them in uninteresting boxes.


Weird. The first time I came across "web3" was as an interface with Ethereum nodes. This is my first time hearing about the term ever having an earlier meaning.


You're right, I didn't know that. Thanks!

In the US, this could be a municipal service ([email protected]), and then we treat everyone in a geography as an organization. This feels a bit more likely to represent an "organization", especially for smaller municipalities. I would expect service availability and quality to vary widely between different municipalities - similarly to how parks, bike lanes, and other infrastructure quality vary widely between municipalities.

