Question 1

What is the algorithm pattern for Temperature Sampling?

Accepted Answer

Logit rescaling before softmax. The temperature T controls randomness: T < 1 sharpens the distribution, so the model behaves more confidently (peaky distribution); T > 1 flattens it, so sampling is more exploratory. T = 1 leaves the standard softmax unchanged.
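
Written out as a formula, with z_i the logit of token i and p_i(T) its probability (this notation is chosen here for illustration and is not taken from the original answer):

```latex
p_i(T) = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}, \qquad T > 0
```

As T → 0 the probability mass concentrates on the largest logit, and as T → ∞ the distribution approaches uniform.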

Question 2

How do you solve Temperature Sampling step by step?

Accepted Answer

1. Divide the logits by the temperature T: z' = z / T.
2. Apply softmax to the scaled logits to get token probabilities.
3. Sample the next token from that distribution.

Effect of T:
T < 1: the distribution sharpens, so the model picks likely tokens more often.
T > 1: the distribution flattens, giving more diverse, creative outputs.
T → 0: approaches greedy decoding (always pick the highest-probability token).
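
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names `softmax_with_temperature` and `sample_token` are hypothetical, and treating T = 0 as an explicit argmax is an assumption made here to avoid dividing by zero.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; handle T = 0 as argmax")
    scaled = [z / temperature for z in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Sample a token index; temperature == 0 falls back to greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.1]  # made-up values for illustration
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(toy_logits, t)
        print(t, [round(p, 3) for p in probs])
```

On the toy logits, the probability of the top token should shrink as T grows, which is the sharpening and flattening described above.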

Question 3

What are common mistakes when solving Temperature Sampling?

Accepted Answer

Temperature does NOT change which token has the highest probability; it only changes the gaps between token probabilities. Top-p (nucleus) sampling is a separate probability-mass cutoff that is often combined with temperature; it is not a form of temperature scaling. High temperature increases diversity but can produce incoherent text.
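
A small check of the first two points, as a sketch under stated assumptions: the helper names `softmax_t` and `top_p_filter`, the toy logits, and the cutoff p = 0.9 are all hypothetical choices for illustration. The nucleus filter here keeps the smallest set of highest-probability tokens whose cumulative mass reaches p and renormalizes; it is applied after temperature scaling, which is one common ordering rather than the only one.

```python
import math


def softmax_t(logits, temperature):
    """softmax(logits / temperature), assuming temperature > 0."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, p):
    """Zero out tokens outside the nucleus and renormalize the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.1, -1.0]  # made-up values for illustration
for t in (0.5, 1.0, 2.0):
    probs = softmax_t(logits, t)
    top = max(range(len(probs)), key=lambda i: probs[i])
    # The index of the most likely token stays the same for every T > 0.
    print(t, top, [round(x, 3) for x in top_p_filter(probs, 0.9)])
```

The printed argmax index is identical across temperatures, while the number of tokens that survive the top-p cutoff can grow as T increases.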
