https://status.anthropic.com/incidents/72f99lh1cj2c They recently resolved two b...

kiratp · 2025-09-09T17:54:46 1757440486

> Importantly, we never intentionally degrade model quality as a result of demand or other factors, and the issues mentioned above stem from unrelated bugs.

Things they could do that would not technically contradict that:

- Quantize KV cache

- Data-aware model quantization, where their own evals show "equivalent perf" but overall model quality suffers.

Simple fact is that it takes longer to deploy physical compute but somehow they are able to serve more and more inference from a slowly growing pool of hardware. Something has to give...
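For readers unfamiliar with the first bullet, here is a minimal sketch of what quantizing a KV cache entry can look like: symmetric int8 quantization of one cached vector with a single scale. It is purely illustrative. The function names and the example vector are made up, and nothing in the thread or the status page says any provider does this.

```python
def quantize_int8(values):
    """Symmetric int8 quantization of one vector: returns (integer codes, scale)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero vector.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the stored integer codes."""
    return [c * scale for c in codes]


# Hypothetical cached key vector for one attention head at one position.
key = [0.12, -1.9, 0.03, 0.77, -0.41]
codes, scale = quantize_int8(key)
restored = dequantize_int8(codes, scale)

# Worst-case rounding error per element is about half a quantization step.
max_err = max(abs(a - b) for a, b in zip(key, restored))
```

The stored codes take one byte per element instead of two or four, which is where the memory saving comes from. The cost is the rounding error in `max_err`, which is small per element but can add up over long contexts in ways a narrow eval might not catch.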

cj · 2025-09-09T18:11:22 1757441482

> Something has to give...

Is training compute interchangeable with inference compute or does training vs. inference have significantly different hardware requirements?

If training and inference hardware is pooled together, I could imagine a model where training simply fills in any unused compute at any given time (?)

kiratp · 2025-09-09T18:38:25 1757443105

Hardware can be the same but scheduling is a whole different beast.

Also, if you pull too many resources from training your next model to make inference revenue today, you’ll fall behind in the larger race.

mh- · 2025-09-09T17:34:24 1757439264

The problem is twofold:

- They're reporting that it only impacted Haiku 3.5 and Sonnet 4. I used neither model during the time period I'm concerned with.

- It took them a month to publicly acknowledge that issue, so now we lack confidence there isn't another underlying issue going undetected (or undisclosed, less charitably) that affects Opus.

trunnell · 2025-09-09T17:38:39 1757439519

> now we lack confidence there isn't another underlying issue

You can be confident there is a non-zero rate of errors and defects in any complex service that's moving as fast as the frontier model providers!

mh- · 2025-09-09T17:42:52 1757439772

Of course. Totally agree, and that's why (I think) I'm being as charitable as possible in this thread.

criemen · 2025-09-09T20:26:38 1757449598

They posted

> We are continuing to monitor for any ongoing quality issues, including reports of degradation for Claude Opus 4.1.

I take that as acknowledgment that there might be an issue with Opus 4.1 (granted, still undetected), but not an undisclosed one, and that they're actively looking for it. I wouldn't jump to "they must be hiding things" yet. They're building, deploying and scaling their service at an incredible pace; like all of us, they're bound to get some things wrong.

mh- · 2025-09-10T01:23:44 1757467424

To be clear, I'm not one of the people suggesting they're doing something nefarious. As I said elsewhere, I don't know what my expectations are of them at this point. I'd like early disclosure of known performance drops, I guess. But from a business POV, I understand why they're not going to be updating a status page to say "things are worsening but we're not exactly sure why".

I'm also a realist, though, and have built a career on building/operating large systems. There's obviously capability to dynamically shed load built into the system somewhere, there's just no other responsible way to engineer it. I'd prefer they slowed response times rather than harmed response quality, personally.
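A toy sketch of the "slow down rather than degrade" option, assuming a fixed full-quality serving capacity per time tick and a FIFO queue for the overflow. The function name, the tick model and the numbers are hypothetical and not a description of any real serving stack.

```python
from collections import deque


def simulate_queueing(arrivals, capacity_per_tick):
    """Serve at most capacity_per_tick requests per tick at full quality.

    Excess requests wait in a FIFO queue instead of being served by a
    cheaper, lower-quality path. Returns each request's wait in ticks.
    capacity_per_tick must be positive, or the queue never drains.
    """
    queue = deque()  # holds the arrival tick of each waiting request
    waits = []
    tick = 0
    while tick < len(arrivals) or queue:
        if tick < len(arrivals):
            queue.extend([tick] * arrivals[tick])
        for _ in range(min(capacity_per_tick, len(queue))):
            waits.append(tick - queue.popleft())
        tick += 1
    return waits


# A burst of 5 requests in tick 1 against a capacity of 2 per tick.
waits = simulate_queueing(arrivals=[3, 5, 1, 0], capacity_per_tick=2)
```

Here every request still gets the full-quality path. The burst shows up only as latency, with the later requests waiting up to two ticks.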

claude_ya_ · 2025-09-09T17:45:32 1757439932

Does anyone know if this also affected Claude Sonnet models running in AWS Bedrock, or if it was just when using the model via Anthropic’s API?