This is exactly the kind of issue that can lead to unintended consequences. What if, instead of spewing out seemingly nonsensical answers, the LLM spewed out very real answers that violated built-in moderation protocols? Or shared secrets or other users' chats?
What if an accidentally shipped bug stumbled upon a way to make the LLM self-aware? Or paranoid?
These possibilities seem outlandish, but we honestly don't know how the algorithms work or how to parse the data that represents what was learned during training. We've created a black box, connected it to the public internet, and allowed basically anyone to poke around with input/output tests. I can't see any rational argument justifying such an insane approach to R&D.
If they use batching during inference (which they very probably do), then a coding mistake of the sort that caused this bug absolutely could result in leakage between chats.
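As a hedged illustration only (none of these names come from any real serving stack; everything here is made up): batched serving has to map each completion back to the request it came from, and a single indexing slip in that mapping silently hands one user another user's output.

```python
# Hypothetical sketch: a batched-inference loop where an off-by-one in the
# batch-slot-to-user mapping leaks one user's completion to another user.

from dataclasses import dataclass


@dataclass
class Request:
    user_id: str
    prompt: str


def fake_model_generate(prompts):
    # Stand-in for a real batched forward pass: one output per prompt.
    return [f"response to {p!r}" for p in prompts]


def serve_batch(requests):
    prompts = [r.prompt for r in requests]
    outputs = fake_model_generate(prompts)
    results = {}
    for i, req in enumerate(requests):
        # Correct: results[req.user_id] = outputs[i]
        # Buggy off-by-one: each user receives the *next* request's completion.
        results[req.user_id] = outputs[(i + 1) % len(outputs)]
    return results


if __name__ == "__main__":
    batch = [Request("alice", "my secret question"),
             Request("bob", "an unrelated question")]
    print(serve_batch(batch))  # bob gets the answer to alice's prompt
```

Nothing in the model itself has to go wrong for this to happen; it's ordinary glue code around the model that does the damage.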
One thing that did happen is there was a bug in the website for a day that really did show you other users' chat history.
IIRC some reporting confused this with "Samsung had some employees upload internal PDFs to ChatGPT" to produce the claim that ChatGPT was leaking internal Samsung information via training, which it wasn't.
LLMs can hold secrets if those secrets were scraped into the training data.
And how do we know definitively what is done with chat logs? The LLM model is a black box for OpenAI (they don't know what was learned or why it was learned), and OpenAI is a black box for users (we don't know what data they collect or how they use it).
You can probe what was learned if you have access to the model; it'll tell you, especially if you do it before applying the safety features.
A good heuristic for whether they would train user chats into the model is whether doing so makes any sense. It doesn't; that data isn't valuable. Users could be saying anything in there, it's likely private, and it's probably not truthful information.
Presumably they do something with responses you've given a thumbs up or thumbs down, but there are ways of using that feedback without putting it directly into the training data. After all, that feedback isn't trustworthy either.
> You can probe what was learned if you have access to the model; it'll tell you, especially if you do it before applying the safety features.
Does that involve actually parsing the data itself, or effectively asking the model questions to see what was learned?
If the model's weights can be parsed and analyzed directly by humans, that is better than I realized. If it's abstracted through an interpreter (I'm sure my terminology is off here) similar to the final GPT product, then we still can't really see what was learned.
By probe, I mean observing the internal activations. There are methods that can suggest whether the model is hallucinating, and ones that can delete individual pieces of knowledge from it; a rough sketch of the basic "linear probe" idea is below.
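A minimal sketch of a linear probe, assuming you can extract hidden activations and label them with the property you care about. The activations here are random stand-ins (the dimensions, labels, and shift direction are all invented for illustration); in practice you'd collect them from real forward passes through the model.

```python
# Toy linear probe: fit a simple classifier on "hidden activations" to read
# off whether a concept is linearly encoded in the model's internal state.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

HIDDEN_DIM = 64    # hypothetical hidden-state width
N_EXAMPLES = 500

# Fake activations: class-1 examples are shifted along a fixed direction,
# mimicking a concept (e.g. "this statement is false") encoded linearly.
direction = rng.normal(size=HIDDEN_DIM)
labels = rng.integers(0, 2, size=N_EXAMPLES)
activations = rng.normal(size=(N_EXAMPLES, HIDDEN_DIM)) + np.outer(labels, direction)

X_train, X_test, y_train, y_test = train_test_split(
    activations, labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

If a cheap linear classifier can recover the property from the activations, that's evidence the model represents it internally, which is the sense in which you can "ask the weights" rather than asking the chat interface.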