Used to enable JSON mode, which guarantees the message the model generates is valid JSON.
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
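A minimal sketch using the OpenAI Python SDK (the model name and prompt contents are illustrative): it enables JSON mode via response_format, explicitly instructs the model to produce JSON in the system message, and checks finish_reason for truncation before trusting the output.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; use any JSON-mode-capable model
    messages=[
        # Instructing the model to produce JSON is required alongside JSON mode.
        {"role": "system", "content": "You are a helpful assistant. Respond in JSON."},
        {"role": "user", "content": "List three primary colors."},
    ],
    response_format={"type": "json_object"},  # enable JSON mode
    max_tokens=200,
)

choice = response.choices[0]
# finish_reason == "length" means the generation hit max_tokens or the
# context limit, so the JSON may be cut off mid-document and invalid.
if choice.finish_reason == "length":
    raise RuntimeError("Output truncated; raise max_tokens or shorten the prompt.")

print(choice.message.content)  # valid JSON when not truncated
```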
The parameters that control the chat response format.