That’s assuming the administration will allow you to administer the exams onsite, which is increasingly not the case. Online students bring in more money.
> The claim of inevitability is crucial to technology hype cycles, from the railroad to television to AI.
Well. You know. We still have plenty of railroad, and television has had a pretty good run too. So if those are the models to compare AI to, then I have bad news about how much of a 'hype cycle' AI is going to be.
Some of the Gemini stuff is almost at airport level. I'm surprised. Everything is going so fast.
The odd thing is that with technical stuff, I'm continually rewriting the LLM's output to be clearer and less verbose, while with fiction it's almost the opposite--not literary enough.
Author here - I'm planning to create game versions of this benchmark, as well as my other multi-agent benchmarks (https://github.com/lechmazur/step_game, https://github.com/lechmazur/pgg_bench/, and a few others I'm developing). But I'm not sure if a leaderboard alone would be enough for comparing LLMs to top humans, since it would require playing so many games that it would be tedious. So I think it would be just for fun.
I was inspired by your project to start making similar multi-agent reality simulations. I’m starting with the reality game “The Traitors” because it has interesting dynamics.
If you watch the top tier social deduction players on YouTube (things like Blood on the Clocktower etc), they’d figure out weaknesses in the LLM and exploit them immediately.
I'm interested in seeing how the LLMs react to some specific defined strategies. E.g. an "honest" bot that says "I'm voting for player [random number]." and does so every round (not sure how to handle the jury step). Do they decide to keep it around for longer, or eliminate it for being impossible to reason with if it picks you?
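A minimal sketch of what such a bot might look like, assuming a hypothetical agent interface with separate announce and vote steps (the class name, method names, and `alive_players` parameter are all illustrative, not from any of the linked benchmarks):

```python
import random

class HonestBot:
    """Hypothetical 'honest' agent: announces a random vote target
    each round and then votes exactly as announced."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.target = None

    def announce(self, alive_players):
        # Pick a random living player and state the vote openly.
        self.target = self.rng.choice(alive_players)
        return f"I'm voting for player {self.target}."

    def vote(self):
        # Always follow through on the announcement.
        return self.target
```

The interesting part is that the bot is fully predictable within a round (it never lies about its vote) but unpredictable across rounds, so an LLM can't negotiate with it, only model it statistically.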
Yes, predefined strategies are very interesting to examine. I have two simple ones in another multi-agent benchmark, https://github.com/lechmazur/step_game (SilentGreedyPlayer and SilentRandomPlayer), and it's fascinating to see LLMs detect and respond to them. The only issue with including them here is that the cost of running a large set of games isn't trivial.
Another multi-agent benchmark I'm currently developing, which involves buying and selling, will also feature many predefined strategies.
Well, by definition we are simulating LLMs just fine, but per the article we are utterly failing on C. elegans, so it seems the smart money is on the latter.
It just means the complexity is harder to capture and copy.
LLMs are built via algorithm. Given enough data and a large enough neural network, the complexity of an LLM is boundless. I guess my question is: are existing LLMs more complex?
More complex and more advanced are not the same thing. Evolution produces a lot of twisty little passages that are only that way because it happened to work.
Is this quote real? I'm familiar with George Pólya's, "If you cannot solve the proposed problem, try to solve first a simpler related problem" but I cannot find any source for the Lenstra quote.