I am used to postmortems posted here being a rare chance for us to take a peek behind the curtain and get a glimpse into things like architecture, monitoring systems, disaster recovery processes, "blameless culture", etc. for large software service companies.
In contrast, I feel like the greatest insight that could be gleaned from this post is that OpenAI uses GPUs.
We also know it uses the GPUs to generate numbers. But these numbers, they were the wrong ones. More technically, part of the computation didn’t work when run on some hardware.
Yeah, definitely opaque. If I had to guess, it sounds like a code optimization that introduced a numerical error, but only on some GPUs or CUDA versions. I've seen that sort of issue happen a few times in the pytorch framework, for example.
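To give a toy example of what I mean (my own illustration of the failure mode, nothing from OpenAI's writeup): the same scores computed along a lower-precision or differently-fused path drift a little, and if that drift ever flips an argmax over near-tied token scores, you get visibly wrong tokens only on the hardware that runs that kernel.

    import torch

    # Toy illustration (my guess at the failure mode, not OpenAI's code):
    # compute "logits" at two precisions and compare the picked tokens.
    torch.manual_seed(0)
    logits_hi = torch.randn(8, 50_000, dtype=torch.float64)  # pretend per-token scores
    logits_lo = logits_hi.float()                             # the "optimized" lower-precision path

    drift = (logits_hi - logits_lo.double()).abs().max().item()
    flips = (logits_hi.argmax(dim=-1) != logits_lo.argmax(dim=-1)).sum().item()
    print(f"max drift {drift:.2e}, argmax flips {flips}")
    # Usually zero flips in this tiny example, but with near-tied scores and a
    # kernel that reorders or fuses the math differently per GPU, the chosen
    # token can change only on the affected hardware.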
It sounds like something went sideways with the embedding mapping. Either some kind of quantization, different rounding, or maybe just an older embedding.
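Purely to illustrate that guess (nothing confirmed): shift or crudely quantize the token-id to embedding lookup and every vector the model works with comes out wrong.

    import torch

    # Sketch of the "bad embedding mapping" guess; assumes a plain lookup table,
    # which is obviously a simplification of whatever OpenAI actually runs.
    torch.manual_seed(0)
    vocab, dim = 1000, 16
    table = torch.randn(vocab, dim)                    # stand-in embedding table
    token_ids = torch.tensor([5, 42, 7, 99])

    good = table[token_ids]                            # intended lookup
    shifted = table[token_ids + 1]                     # stale/off-by-one mapping: every vector wrong
    crushed = table[token_ids].to(torch.int8).float()  # crude quantization: values collapse toward 0

    print((good - shifted).abs().mean().item(),
          (good - crushed).abs().mean().item())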
The point isn't the specifics; the point is that this isn't a postmortem.
A postmortem should be detailed enough for someone to understand the background, how the problem came to be, and what happened, and then walk through what has been done so that it won't happen again. That takes … well, at least a page. This is far too short to qualify.
This is more "ugh, here's a rough explanation, please go away now" territory.
OpenAI isn't the first company to abuse the term this way, but it devalues the real postmortems out there.
That’s not helping, that’s excusing OpenAI’s behavior, which is not something anyone on HN should be doing.
This is supposedly the greatest AI mankind has ever created; it goes down for a while and we have zero information on why or how. That’s simply inexcusable.
If this is such a socially impactful technology, we should be ripping it to pieces to understand exactly how it works. That's a) how we protect society from technical charlatans and b) how you spawn a whole new world of magnificent innovations (see Linus building a truly free Unix-like operating system for everyone to use).
Failing to hold them to as high a bar is another step down the path to a dystopian corporatist future…
> it goes down for a little while and we have zero information on why or how
We have more than zero information. They applied a change, it didn’t work on some set of their hardware, so they reverted it. That is not much information, but it’s not zero either.
> that’s simply inexcusable
If your contractual SLAs were violated take it up with the billing department.
> If this is such a socially impacting technical change we should be ripping it to pieces to understand exactly how it works.
And people are doing that. Not by complaining when the corp isn't sufficiently forthcoming, but by implementing their own systems. That is how you have any chance of avoiding the dystopian corporatist future you mention.
In my limited experience this screams “applied a generated mask to the wrong data”. Like they scored tokens and then applied the results to the wrong source or something. Obviously this is more an idle guess from first principles than the direct cause, though.
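Something like this toy version, purely as an illustration of the guess:

    import torch

    # Idle first-principles sketch, not the confirmed cause: derive a selection
    # from one sequence's scores, then apply it to a different buffer.
    torch.manual_seed(0)
    seq_scored = torch.randint(0, 100, (12,))   # the tokens that were actually scored
    seq_other = torch.randint(0, 100, (12,))    # unrelated data from another buffer/stream

    scores = torch.randn(12)
    keep = scores.topk(6).indices               # mask/selection derived for seq_scored

    right = seq_scored[keep]                    # intended result
    wrong = seq_other[keep]                     # same mask, wrong source tensor
    print(right.tolist(), wrong.tolist())       # both look like valid tokens; one is garbage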
How does that line up? OpenAI said they had a bug in certain GPU configurations that caused the token numbers to be wrong, which made normal output look like garbage. This post is guessing they set the frequency and presence penalties too high.
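For reference, those penalties just subtract from the logits of tokens that have already appeared (OpenAI documents the formula; the snippet below is my own bare-bones paraphrase of it). Cranking them up makes the model dodge repeats and wander into strange word choices, which is a different flavor of garbage from token numbers getting mangled on certain GPUs.

    from collections import Counter

    def apply_penalties(logits, generated_tokens, presence_penalty, frequency_penalty):
        """Bare-bones paraphrase of the documented frequency/presence penalties."""
        counts = Counter(generated_tokens)
        adjusted = dict(logits)
        for tok, n in counts.items():
            if tok in adjusted:
                adjusted[tok] -= presence_penalty + frequency_penalty * n
        return adjusted

    logits = {"the": 2.0, "cat": 1.5, "sat": 1.2}
    print(apply_penalties(logits, ["the", "the", "cat"],
                          presence_penalty=2.0, frequency_penalty=2.0))
    # "the" and "cat" get hammered, so otherwise-unlikely tokens win instead.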