Temperature
Temperature controls how random or deterministic an AI model's responses are, with higher values producing more varied outputs.
Temperature is a sampling parameter that controls the randomness, or creativity, of an AI model's responses. It influences how the model selects each next token when generating text, determining whether outputs are deterministic and focused or varied and creative.
At each step of generation, a language model produces a probability for every possible next token. Temperature reshapes this distribution before a token is selected: the model's raw scores (logits) are divided by the temperature before being converted to probabilities. A temperature of 0 makes the model deterministic: it always selects the most likely token, producing consistent, predictable responses. Higher temperatures flatten the distribution, making the model more likely to select less probable tokens and producing more varied, creative outputs.
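The mechanism fits in a few lines of code. The following is a minimal NumPy sketch, not any particular library's implementation; the function name and the greedy handling of temperature 0 are illustrative assumptions.

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng=np.random.default_rng()):
    """Illustrative sketch: pick a next-token index from raw logits."""
    if temperature == 0:
        # Greedy decoding: always take the single most likely token.
        return int(np.argmax(logits))
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    probs = np.exp(scaled)          # softmax numerator
    probs /= probs.sum()            # normalize to a probability distribution
    return int(rng.choice(len(probs), p=probs))
```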
Temperature values typically range from 0 to 2, though some systems allow higher values. A temperature of 1 leaves the model's learned probability distribution unchanged. Values below 1 (such as 0.3) sharpen the distribution, making outputs more focused and consistent, which suits tasks requiring accuracy such as question answering or code generation. Values above 1 (such as 1.5) flatten it, increasing randomness and creativity, which suits creative writing, brainstorming, or generating diverse ideas.
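The effect is easy to see numerically. Applying the softmax from the sketch above to a small, made-up set of logits:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.5, -1.0])  # hypothetical scores for four tokens

for t in (0.3, 1.0, 1.5):
    scaled = logits / t
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    print(f"T={t}: {np.round(probs, 3)}")

# T=0.3: [0.959 0.034 0.006 0.   ]  -> mass concentrates on the top token
# T=1.0: [0.609 0.224 0.136 0.03 ]  -> the model's native distribution
# T=1.5: [0.496 0.255 0.182 0.067]  -> mass spreads to less likely tokens
```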
Choosing an appropriate temperature depends on the task. For factual tasks where accuracy matters, use a low temperature; for creative tasks where variety is desired, use a higher one. For most general-purpose tasks, a temperature around 0.7-0.8 strikes a good balance between coherence and creativity.
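In practice, temperature is usually set per request. As an illustration, here is how that might look with the OpenAI Python SDK; the model name and prompts are placeholders, and other providers expose a similar parameter.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Factual task: low temperature for focused, reproducible answers.
factual = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "In what year was the transistor invented?"}],
    temperature=0.2,
)

# Creative task: higher temperature for varied, exploratory output.
creative = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Brainstorm five names for a hiking app."}],
    temperature=1.3,
)
```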
Understanding temperature helps users get better results from language models by matching the parameter to their needs. It's one of several parameters (along with top-p, top-k, and others) that control model behavior during generation. Experimenting with temperature can significantly improve the quality of outputs for specific applications.
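To show how these parameters compose, here is a hedged sketch that applies top-k filtering before temperature scaling; the function and its defaults are illustrative assumptions, not a specific library's API.

```python
import numpy as np

def sample_top_k(logits, temperature=0.8, k=50, rng=np.random.default_rng()):
    """Illustrative sketch: top-k filtering combined with temperature scaling."""
    logits = np.asarray(logits, dtype=np.float64)
    # Mask out every token scoring below the k-th highest logit.
    if k < len(logits):
        cutoff = np.sort(logits)[-k]
        logits = np.where(logits >= cutoff, logits, -np.inf)
    scaled = logits / temperature   # temperature scaling (assumes temperature > 0)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()            # masked tokens end up with probability 0
    return int(rng.choice(len(probs), p=probs))
```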