It turns in top-level performance on original, out-of-distribution problems given in international math and programming competitions, but it's "not comparable to a human level." Got it.
Yes, it's officially atheist because there's only room for one god figure, who happens to be a man. Christianity and Islam are "officially atheist" in the same absurd way. In NK the one permissible exception is not called Allah or Yahweh but Kim.
You know, the guy whose portrait hangs in everyone's home in the exact same spot where you'd find a crucifix in a southern American home.
And no, the religious nature of personality cults is not a fallacy. If anything, No True Scotsman applies to claims that a personality cult is not a "real religion." They are absolutely indistinguishable from theistic religions, except for the minor, ignorable detail that the god is alive and walking around.
Of course there's also a strong component of ancestor worship in the cult of the Kims. The portrait or other object of veneration is as likely to feature Il-sung as one of the other two.
Great read, but he undersells the weight of von Neumann's EDVAC report. If you haven't read that (which I imagine you have), it's crazy how prescient some of the lesser-known ideas are. He seemed to assume that we'd end up with some kind of neural architecture, and it's easy to imagine him being surprised that it took us this long to get serious about the idea.
Apropos of that, I couldn't resist telling Gemini 3 to run with your story prompt from the earlier thread: https://gemini.google.com/share/ac122aba6f7f. Thanks for the inspiration, apologies for following it. :-P
(Also thanks for posting the material you wrote back in the 1980s on the SCP initiative. I had heard of it as an SDI connection or component, but that was all. Reading through it now.)
What generation had CarPlay disabled? It works very nicely in 95B.2 and .3, and the pre-facelift 95B models with PCM 3 didn't support CarPlay at all without an adapter, did they?
I know full-screen CarPlay isn't supported without a jailbreak, but I don't care about that myself so haven't done it.
At this point they've contributed a reasonably fair share of open-source code themselves.
No one benefits from locking up 99.999% of all source code, including most of Microsoft's proprietary code and all GPL code.
No one.
When it comes to AI, the only foreseeable outcome to copyright maximalism is that humans will have to waste their time writing the same old shit, over and over, forever less one day [1], because muh copyright!!!1!
Clearing those rights, which don't actually exist yet, would have been utterly impossible for any amount of money. Thousands of lawyers would tie up the process in red tape until the end of time.
The basic premise of the economy is people do stuff for money. Any rights holder debating with their publishing house or whatever just means they don't get paid. Some trivial number of people would opt out, but most authors or their estates would happily take an extra few hundred dollars per book.
YouTube on the other hand has permission from everyone uploading videos to make derivative works barring some specific deal with a movie studio etc.
Now there are a few exceptions, like large GPL works, but again, diminishing returns here: you don't need to train on literally everything.
The GPL arose from Stallman's frustration at not having access to the source code for a printer driver that was causing him grief.
In a world where he could have just said "Please create a PDP-whatever driver for an IBM-whatever printer," there never would have been a GPL. In that sense AI represents the fulfillment of his vision, not a refutation or violation.
I'd be surprised if he saw it that way, of course.
The safeguards will prevent the AI from reproducing the proprietary drivers for the IBM-whatever printer, and it will not provide code that breaks the DRM that exists to prevent third-party drivers from working with the printer. There will, however, be no such safeguards or filters to prevent IBM from writing a proprietary driver for their next printer, using existing GPL drivers as a building block.
I wish you luck. The music industry basically won their fight to force safeguards against AI music. The film industry is gaining laws regulating AI film actors. Code-generating AIs are only training on freely accessible code and not on proprietary code. There are multiple laws being made against AI porn all over the world (or possibly already on the books).
What we should fight is Rules For Thee but Not for Me.
> The music industry basically won their fight to force safeguards against AI music. The film industry is gaining laws regulating AI film actors. Code-generating AIs are only training on freely accessible code and not on proprietary code. There are multiple laws being made against AI porn all over the world (or possibly already on the books).
Yeah, well, we'll see what our friends in China have to say about all that.
That's the inverse. Mass surveillance is bad so it should be banned, vs. using AI to thwart proprietary lock-in is good and so shouldn't be banned.
But also, is the inverse even wrong? If some store has a local CCTV that keeps recordings for a month in case someone robs them, there is no central feed/database and no one else can get them without a warrant, that's not really that objectionable. If Amazon pipes the feed from every Ring camera to the government, that's very different.
By "everywhere" I obviously don't mean "on your private property", I mean "everywhere" as in "on every street corner and so on".
If people are OK with their government putting CCTVs on every lamp post on the promise that they are "secure" and "not used to aggregate data and track people" and "only with warrant" then it's kind of "I told you so" when (not if) all of those things turn out to be false.
> using AI to thwart proprietary lock-in is good and so shouldn't be banned.
It's shortsighted because whoever runs the LLMs isn't doing it to help you thwart lock-in. It might for now, but right now they don't care about anything: they steal content as fast as they can and lose billions yearly to make sure they become too big to fail. Once they are too big, they will tighten the screws, and they will have the freedom to do literally whatever they want as long as it's legal.
And, surprise, helping people thwart lock-in is much less legal (in addition to much less profitable) than preventing people from thwarting lock-in.
It's kind of bizarre to see people thinking these LLM operators will be somehow on the side of freedom and copyleft considering what they are doing.
> By "everywhere" I obviously don't mean "on your private property", I mean "everywhere" as in "on every street corner and so on".
If they're on each person's private property then they're on every street corner and so on. The distinction you're really after is between decentralized and centralized control/access, which is rather the point.
> It's kind of bizarre to see people thinking these LLM operators will be somehow on the side of freedom and copyleft considering what they are doing.
You're conflating the operators with the thing itself.
LLMs exist and nobody can un-exist them now because they're really just code and data. The only question is, are they a thing that does what you want because there are good published models that anybody can run on their own hardware, or are the only up-to-date ones corporate and censored and politically compromised by every clodpoll who can stir up a mob?
You really try hard to misunderstand it. A small shop has its own CCTV to catch intruders = one thing. A local company installing CCTV everywhere = a different thing. In practice they can both be supplied by one company, centralized, unified, and sold, and fighting ANY CCTV is ultimately the winning move.
> LLMs exist and nobody can un-exist them now because they're really just code and data
"Malware exists and nobody can unexist it now because it's just code and data"
> A small shop has its own CCTV to catch intruders = one thing. A local company installing CCTV everywhere = a different thing.
But that's the thing you were implying couldn't be distinguished. Every small shop having its own CCTV is different from one company having cameras everywhere, even if they both result in cameras all over the place.
> "Malware exists and nobody can unexist it now because it's just code and data"
Which is accurate. Even if you tried to ban malware, or LLMs, they would still be produced by China et al. And malware is by definition bad, so you're also omitting the thing that matters again, which is that we should not ban the LLMs that aren't bad.
You don't get to unilaterally make laws for the rest of us, which is what you are trying to do when you throw around terms like "stealing" in contexts where they have no legal meaning. Sorry.
If the incumbent copyright interests insist on picking an unnecessary fight with LLMs or AI in general, they will and must lose decisively. That applies to all of the incumbents, from FSF to Disney. Things are different now.
I see; the laws aren't in question or in flux, but it's the judges who are wrong. Enlightening.
I still don't understand how copyright maximalism has suddenly become so popular on a site called "Hacker News." But it's early here, and I'm sure I'm not done learning exciting new things today.
> like LLM or NFT or killer drones, malware isn't bad for somebody.
Malware isn't bad for Russian crime syndicates, but we're generally content to regard them as the adversary and not care about their satisfaction. That isn't the case for someone who wants to use an LLM to fix a bug in their printer. They're doing the good work and people trying to stop them are the adversary.
> which LLM is not made by stealing copyleft code?
Let's drive a stake through this one by going completely the other way. Suppose you train an LLM only on GPL code, and all the people distributing and using it are only distributing its output under the GPL. Regardless of whether that's required, it's allowed, right? How would you accuse any of those people of a GPL violation?
But that isn't the same code that you were running before. And like, let's not forget GPLv3: "please give me the code for a mobile OS that could run on an iPhone" does not in any way help me modify the code running on MY iPhone.
Sure it does. Just tell the model to change whatever you want changed. You won't need access to the high-level code, any more than you need access to the CPU's microcode now.
We're a few years away from that, but it will happen unless someone powerful blocks it.
I believe the point was that iPhones don't allow running custom code even if you have the code; whereas GPLv3 mandates that any conveyed form of a work must be replaceable by the user. So unless LLMs easily spit out an infinite stream of 0days to exploit to circumvent that, they won't help here.
In said hypothetical world, though, the whatever-driver would also have been written by LLMs; and, if the printer or whatever is non-trivial and made by a typical large company, by many LLM instances with a sizable amount of token spending over a long period of time.
So getting your own LLM rewrite to an equivalent point (or, rather, less buggy as that's the whole point!) would be rather expensive; at the absolute very least, certainly more expensive than if you still had the original source code to reference or modify (even if an LLM is the thing doing those). Having the original source code is still just strictly unconditionally better.
Never mind the question of how you even get your LLM to reverse-engineer & interact with & observe the physical hardware of your printer, and whatever ink gets wasted while debugging the reinvention of what the original driver already did correctly.
Now I'm kind of curious: if you give an LLM the disassembly of a proprietary firmware blob and tell it to turn it into human-readable source code, how good is it at that?
You could probably even train one to do that in particular. Take existing open source code and its assembly representations as training data and then treat it like a language translation task. Use the context to guess what the variable names were before the original compiler discarded them etc.
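As a rough sketch of what generating that training data could look like (assuming gcc is available and a local `corpus/` directory of standalone .c files; the directory name, output filename, and JSONL layout are placeholders, not anything a particular trainer requires):

```python
import json
import pathlib
import subprocess

# Sketch: turn a corpus of open-source C files into (assembly, source) pairs
# that a seq2seq / instruction fine-tune can treat as a translation task.
corpus = pathlib.Path("corpus")  # hypothetical directory of .c files

with open("asm_to_c_pairs.jsonl", "w") as out:
    for src in corpus.rglob("*.c"):
        asm_path = src.with_suffix(".s")
        # Compile with optimizations so the model sees realistic assembly,
        # not the near-literal translation that -O0 would produce.
        result = subprocess.run(
            ["gcc", "-O2", "-S", "-o", str(asm_path), str(src)],
            capture_output=True,
        )
        if result.returncode != 0:
            continue  # skip files that don't build standalone
        pair = {
            "input": asm_path.read_text(errors="replace"),   # model input: asm
            "target": src.read_text(errors="replace"),       # target: original C
        }
        out.write(json.dumps(pair) + "\n")
```

From there it's an ordinary fine-tuning job; the fiddly parts are splitting at function granularity so examples fit in the context window, and stripping or randomizing symbol names so the model actually has to guess them rather than copy them.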
The most difficult parts of getting readable code would be dealing with inlined functions and otherwise-duplicated code from macros or similar, and dealing with in-memory structure layouts; both pretty complicated very-global tasks. (never mind naming things, but perhaps LLMs have a good shot at that)
All of them recognized the thrM exception path, although I didn't review them for correctness.
That being said, I imagine the major showstopper in real-world disassembly tasks would simply be the limited context size. As you suggest, a standard LLM isn't really the best tool for the job, at least not without assistance to split up the task logically.
Those first two indeed look correct (the third link is not public); free ChatGPT is understandably not the best, but I did give it basically the smallest function in my codebase that does something meaningful, instead of any of the actually-non-trivial multi-kilobyte functions doing realistic things needing context.
Would be interesting to push the models with a couple of larger functions, if you have some links you'd like me to try.
I have paid pro accounts on all three, but for some reason Gemini is no longer allowing links to be shared on some queries including this one. All it would let me do is export it to Docs, which I thought would be publicly visible but evidently isn't.
Actually, even finding a larger function that would by itself have a meaningful disassembly is proving problematic; basically every function deals with in-memory data structures non-trivially, and a bunch do indirect jumps (function pointers, but also lookup-table-based switches, which require table data from memory in addition to assembly to disassemble).
(I'm keeping the other symbol names there even though they'd likely not be there for real closed-source things, under the assumption that for a full thing you'd have something doing a quick naming pass beforehand)
This is still very much on the trivial end, but it's already dealing with in-memory structures, three inlined memory allocation calls (two half-deduplicated into one by the compiler, and the compiler initializing a bunch of the objects' fields in one store), and a bunch of inlined tagged object manipulations; should definitely be possible to get some disassembly from that, but figuring out the useful abstractions that make it readable without pain would probably take aggregating over multiple functions.
(Unrelated notes on your previous results: Claude indeed guessed correctly that it's BQN! though CBQN is presumably wholesale in its training data anyway; it did miss that the function has an unused 0th arg (a "this" pointer), which'd cause problems as the function is stored & used as a generic function pointer (this'd probably be easily resolved when attempting to integrate it into a wider disassembly though); neither Claude nor ChatGPT unified the `x>>48==0xfff7` and `(x&0xffff000000000000)==0xfff7000000000000`, which do the exact same thing but clang is stupid [https://github.com/llvm/llvm-project/issues/62145] and generates different things; and of course a big question is how many such intricacies could be automatically reduced down with a full codebase's worth of context, because understandably the single-function disassemblies are way, way more verbose than the original.)
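(Tangent, but for anyone who doesn't want to take that equivalence on faith: the two checks really are the same predicate on any 64-bit value, which a throwaway script confirms; the function names below are purely for illustration.)

```python
import random

# Both forms ask "do the top 16 bits of a 64-bit value equal 0xfff7?"
def check_shift(x):
    return (x >> 48) == 0xfff7

def check_mask(x):
    return (x & 0xffff000000000000) == 0xfff7000000000000

for _ in range(1_000_000):
    x = random.getrandbits(64)  # uniform 64-bit sample
    assert check_shift(x) == check_mask(x)
```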
Should be possible. A couple of years ago I used an earlier ChatGPT model to understand and debug some ARM assembly, which I'm not personally very familiar with.
I can imagine that a process like what you describe, where a model is trained specifically on .asm / .c file pairs, would be pretty effective.
The only legal way to do that in the proprietary software world is a clean room implementation.
An AI could never do a clean room implementation of anything, since it was not trained on clean room materials alone. And it never can be, for obvious reasons. I don't think there's an easy way out here.
When Google's engineers were copying the Java API for Dalvik (and later ART), they had access to and consulted the Java source code. The infamous Oracle v. Google judgment siding with Google set precedent at the highest level, SCOTUS, that looking at the code is not an issue.
So, it doesn't matter whether an AI can or cannot do a clean room implementation. Unless it is a patent or trade secret violation, clean room implementation doesn't matter.