
I think that's exactly what Eliezer means by entanglement



And the guy who's already argued for airstrikes on datacenters considers that to be good news? I'd expect the idea of LLMs tending to express a global, trivially fine-tunable "be evil" preference to scare the hell out of him.


He is less concerned that people can create an evil AI if they want to, and more concerned that no one could keep an AI from being evil even if they tried.


He expects the bad guy with an AI to be stopped by a good guy with an AI?


No, he expects the AI to kill us all even if it was built by a good guy.

How much this result improves his outlook, we don't know, but he previously put our chance of extinction at over 95%: https://pauseai.info/pdoom


These guys and their black hole harvesting dreams always sound way too optimistic to me.

Humanity has a 100% chance of going extinct. Take it or leave it.


It'd be nice if it weren't in the next decade though.


No, he expects a bad AI to be unstoppable by anybody, including the unwitting guy who runs it.


works for gun control :)


I hope this is sarcasm because that is hardly a rule!


I guess the argument there would be that this news makes it sound more plausible that people could technically build LLMs which are "actually" "good"...



