
> Temperature is a measure of capriciousness.

Yes, that’s what “temperature” means, but what does a temperature of 0.7 mean?

> It’s not a big ask to look this up. But even if you don’t, making a point to show that you don’t know it seems bad.

Well, no, making a point of highlighting the points of your ignorance when discussing something is good. Especially when you are a notable expert in the broad field being discussed.



https://lukesalamone.github.io/posts/what-is-temperature/

> Well, no, making a point of highlighting the points of your ignorance when discussing something is good. Especially when you are a notable expert in the broad field being discussed.

I disagree. Stating “whatever that means” indicates dismissiveness, not a transparent lack of expertise. Also, you should know what it means if you’re an expert.

This quote implies to me that he is actually a beginner when it comes to this technology but is expecting to be treated like an expert whose experience generalizes.


Absolutely disagree. I don't think anyone, except someone with access to the source code, knows exactly what temperature 0.7 means.

Knuth is a world expert in randomized algorithms. Do you think he doesn't have a good intuition for what could be happening? But he's a stickler for detail, and temperature is an obfuscation.


I’m getting pretty tilted at the number of people who are ignoring everything I’m posting and claiming temperature is some unknowable thing because Knuth does not know what it is. Look at my link. This is not a concept specific to OpenAI; it’s a single term in the softmax selection.

There is no reason to assume that OpenAI has changed the definition of this term.
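For what it's worth, here is a minimal sketch of that standard definition (plain NumPy; the function name and the toy logits are mine, and I'm assuming OpenAI uses the usual divide-the-logits-by-T formulation):

    import numpy as np

    rng = np.random.default_rng()

    def sample_token(logits, temperature=0.7):
        # Divide the logits by T, then softmax. T < 1 sharpens the
        # distribution toward the highest-logit token; T > 1 flattens it.
        scaled = np.asarray(logits, dtype=float) / temperature
        scaled -= scaled.max()          # subtract max for numerical stability
        probs = np.exp(scaled)
        probs /= probs.sum()
        return rng.choice(len(probs), p=probs)

    # Example: four made-up logits; at temperature 0.7 the first token
    # is picked most of the time, but not always.
    print(sample_token([2.0, 1.0, 0.5, 0.1], temperature=0.7))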


They could literally have asked chatGPT and gotten a great explanation


I don't know what prompt you used, but this is what it tells me (just to be clear, I don't think it explains anything beyond higher temperature = higher randomness, range of API values 0 to 2):

> In the OpenAI GPT API, the temperature parameter controls the randomness of the model's output. A temperature value of 0.7 in the GPT API means that the model's responses will have a moderate level of randomness.

> When generating responses, a higher temperature value, such as 1.0, makes the output more random and creative. This can result in more diverse and unpredictable responses. On the other hand, a lower temperature value, such as 0.2, makes the output more focused and deterministic, with the model tending to choose more probable and conservative responses.

> By setting the temperature to 0.7, you can expect the model's responses to strike a balance between randomness and coherence, offering a mix of expected and unexpected answers.


Seems like a good answer. If you want a deeper one, ask a follow-up question. Here is the answer I got to the follow-up “can you explain it at a deeper level”:

> At a deeper level, the temperature parameter in generative AI models like GPT-3 affects the softmax function, which is used to convert the model's output into a probability distribution over the possible words or tokens.

> The softmax function takes the logits, which are the unnormalized output values of the model, and applies the exponential function to each value, normalizing them to produce probabilities. The formula for the softmax function is as follows:

> P(y_i) = exp(logit(y_i)) / sum(exp(logit(y_j)))

> Where:

> P(y_i) is the probability of the i-th token in the output sequence.

> logit(y_i) is the logit value (unnormalized score) of the i-th token.

> The sum is taken over all possible tokens.

> The temperature parameter comes into play when computing the exponential function. The temperature value, denoted as T, divides the logits before the exponential is applied. Mathematically, the modified softmax function with temperature is:

> P(y_i) = exp(logit(y_i) / T) / sum(exp(logit(y_j) / T))

> The temperature parameter T controls the steepness of the softmax function. A higher temperature value (e.g., 1.0) shrinks the differences between the scaled logits, making the probabilities more spread out and the distribution more uniform. This leads to a higher likelihood of selecting less probable tokens, resulting in more diverse and random output.

> Conversely, a lower temperature value (e.g., 0.5) amplifies those differences, making the probabilities sharper and more focused. The most probable tokens are emphasized, leading to more deterministic and conservative output.

> By adjusting the temperature, you can manipulate the balance between exploration and exploitation during text generation. Higher temperatures encourage the model to explore more possibilities, while lower temperatures bias it towards exploiting the most likely choices based on the training data.

> It's important to note that the choice of temperature is a design decision, and there is no fixed rule for determining the optimal value. It often requires experimentation and understanding the specific behavior of the model to achieve the desired output characteristics.
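To put rough numbers on that qualitative description, here is a small sketch (the four logits are made up, not real model outputs) of how one distribution changes at T = 0.2, 0.7 and 1.0, again assuming the standard formula:

    import numpy as np

    def softmax_with_temperature(logits, T):
        z = np.asarray(logits, dtype=float) / T
        z -= z.max()                    # numerical stability
        p = np.exp(z)
        return p / p.sum()

    logits = [2.0, 1.0, 0.5, 0.1]       # made-up logits for four tokens
    for T in (0.2, 0.7, 1.0):
        print(T, np.round(softmax_with_temperature(logits, T), 3))
    # T=0.2 -> ~[0.993 0.007 0.001 0.   ]  nearly deterministic
    # T=0.7 -> ~[0.703 0.168 0.082 0.047]  the "balanced" setting
    # T=1.0 -> ~[0.575 0.211 0.128 0.086]  flatter, more random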


Not saying it's bad as a qualitative answer, but it doesn't say anything quantitative about the effect of the temperature in the ChatGPT API. Temperature is a well known and well documented concept, but if you don't know what y_i is, and for all I know that's just a number coming out of a black box with billions of parameters, you don't know what temperature 0.7 is, beyond the fact that a token i whose logit(y_i) is 0.7 higher than that of another token is e times as likely to be produced. What does that tell me? Nothing.
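For the record, the "e times as likely" figure is just the ratio form of that softmax: P(y_i) / P(y_j) = exp((logit(y_i) - logit(y_j)) / T), so a 0.7 logit gap at temperature 0.7 gives exactly e. A quick check:

    import math

    T = 0.7
    delta = 0.7                  # logit(y_i) - logit(y_j), the gap assumed above
    print(math.exp(delta / T))   # 2.718281828... == e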


My dude, it’s not my fault if you don’t understand the concept of asking follow-up questions for clarification. This isn’t like a Google search. The way you retrieve knowledge is different.



