How to guarantee output structure for ChatGPT

When building applications with ChatGPT, we often need the model's output to follow a predictable structure. Without a guaranteed structure, it's nearly impossible to parse the output reliably, which can lead to errors and inconsistencies in our applications.

Luckily, OpenAI's API provides format parameters that we can use to enforce a specific output structure.

Let's take a look at how we can leverage this feature in their various SDKs.

Python SDK

OpenAI's Python SDK directly supports Pydantic models, which makes it easy to define our schema in a way OpenAI can use.

Let's say we have an endpoint that generates headline options for blog posts.

Here's how we can guarantee the output structure using Pydantic and the OpenAI Python SDK:

from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()

# Create a Pydantic model for the expected output structure
class HeadlineGenerationPromptResponse(BaseModel):
    options: list[str]
    recommended_option: str

# Use `text_format` for the responses API
responses_response = client.responses.parse(
    ...,
    text_format=HeadlineGenerationPromptResponse
)
parsed_response = responses_response.output_parsed

# Use `response_format` for the completions API
completions_response = client.chat.completions.parse(
    ...,
    response_format=HeadlineGenerationPromptResponse
)
parsed_completion = completions_response.choices[0].message.parsed

Breaking this down:

  1. We define a Pydantic model, HeadlineGenerationPromptResponse, that describes the expected structure of the output.
  2. We call the parse method, passing our model to the response_format parameter for completions or the text_format parameter for responses.
  3. We then read the parsed output from output_parsed for responses or from choices[0].message.parsed for completions; either way, it will be an instance of our Pydantic model.

This minimizes the amount of manual parsing we need to do while giving us peace of mind that the output will match our defined structure.
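Putting it together, a complete completions call might look like this (the model name and prompt are just illustrative):

completion = client.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You generate headline options for blog posts."},
        {"role": "user", "content": "Write headlines for a post about structured outputs."},
    ],
    response_format=HeadlineGenerationPromptResponse,
)

headlines = completion.choices[0].message.parsed
print(headlines.recommended_option)  # a plain str
print(headlines.options)             # a plain list[str]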

JavaScript/TypeScript SDK

OpenAI's JavaScript SDK also supports language-native structured outputs using Zod for schema validation.

Using the same example of generating headline options, we can guarantee our output structure like this:

import { z } from 'zod';
import { zodResponseFormat, zodTextFormat } from 'openai/helpers/zod';
import { OpenAI } from 'openai';
const client = new OpenAI();

// Define the Zod schema for the expected output structure
const HeadlineGenerationPromptResponse = z.object({
  options: z.array(z.string()),
  recommended_option: z.string(),
});

// Use `text.format` for the responses API
const responsesResponse = await client.responses.parse({
    ...,
    text: {
        format: zodTextFormat(
            HeadlineGenerationPromptResponse,
            "headline_generation_prompt_response"
        ),
    }
});
const parsedResponse = responsesResponse.output_parsed;

// Use `response_format` for the completions API
const completionsResponse = await client.chat.completions.parse({
    ...,
    response_format: zodResponseFormat(
        HeadlineGenerationPromptResponse,
        "headline_generation_prompt_response"
    ),
});
const parsedCompletion = completionsResponse.choices[0]?.message?.parsed;

Breaking this down:

  1. We define a Zod schema, HeadlineGenerationPromptResponse, that describes the expected structure of the output.
  2. We call the parse method, passing our schema to the response_format parameter for completions or the text.format parameter for responses.
  3. We can then access the parsed response directly from the API response, which will be validated against our Zod schema.

One quick note: While the responses API supports output_parsed directly, the completions API requires us to access the parsed output from the choices array.

Other languages (plain JSON schema)

Since these structured-output helpers are only available in the Python and JavaScript SDKs at the time of writing, other languages will need to implement their own parsing logic.

It's entirely possible that an unofficial SDK exists for your language of choice that supports structured outputs, so be sure to check the community resources.

If you're using a language without direct SDK support for structured outputs, you can still enforce a structure with a custom JSON schema and parse the output manually.

For reference, here's what our model above would look like as a JSON schema:

{
  "type": "object",
  "properties": {
    "options": {
      "type": "array",
      "items": { "type": "string" }
    },
    "recommended_option": { "type": "string" }
  },
  "required": ["options", "recommended_option"],
  "additionalProperties": false
}

Then, in our application, we can pass the JSON schema to the OpenAI API in the request body. For example, in a cURL request, it would look like this:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    ...,
    "response_format": {
      "type": "json_schema",
      "json_schema": {...SCHEMA_HERE...}
    }
  }'

More documentation on JSON schemas can be found at json-schema.org.

Please note: in addition to setting "strict": true (as in the example above), OpenAI requires two things of your JSON schema to guarantee structured outputs:

  1. All properties must be marked as required.
  2. The additionalProperties field must be set to false.

If you do not follow these rules, OpenAI will error out when you try to use the schema.
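To make the manual flow concrete, here's a rough end-to-end sketch using only Python's standard library; the model name and prompt are placeholders, and the request and response shapes are the same in any language:

import json
import os
import urllib.request

# The same JSON schema shown above
schema = {
    "type": "object",
    "properties": {
        "options": {"type": "array", "items": {"type": "string"}},
        "recommended_option": {"type": "string"},
    },
    "required": ["options", "recommended_option"],
    "additionalProperties": False,
}

body = {
    "model": "gpt-4o-2024-08-06",
    "messages": [
        {"role": "user", "content": "Write headlines for a post about structured outputs."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "headline_generation_prompt_response",
            "strict": True,
            "schema": schema,
        },
    },
}

request = urllib.request.Request(
    "https://api.openai.com/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    },
)

with urllib.request.urlopen(request) as response:
    data = json.loads(response.read())

# The structured output arrives as a JSON string in the message content,
# so the final parsing step is ours to do.
parsed = json.loads(data["choices"][0]["message"]["content"])
print(parsed["recommended_option"])

# Optionally, validate locally with a JSON Schema library of your choice, e.g.:
# import jsonschema  # third-party: pip install jsonschema
# jsonschema.validate(instance=parsed, schema=schema)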

Limitations

Structured outputs are a powerful feature, but they do have some limitations:

  • Not all validation rules are supported: For example, OpenAI does not allow you to specify the maximum length of a string. For a full list of supported schema values, see their documentation.
  • Not all models support structured outputs: Structured outputs are only available on gpt-4o-mini, gpt-4o-2024-08-06, and later models. If you're using an older model, you'll need to either 1) switch to a newer model or 2) use the more limited JSON mode.
  • Limited support for max_tokens: The model does not shorten its answer to fit within max_tokens, so if the limit is hit mid-generation, the output is cut off in the middle of your JSON object, rendering the entire response invalid. If you require output of a specific length, you will need to handle that in your application logic; one way to detect truncation is sketched below.
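In the Python SDK, one defensive option is to inspect finish_reason before trusting the parsed output. Here's a minimal sketch, reusing completions_response from the Python example earlier:

choice = completions_response.choices[0]
if choice.finish_reason == "length":
    # max_tokens cut the generation short, so the JSON is truncated and unusable
    raise ValueError("Response truncated by max_tokens; retry with a higher limit.")
parsed = choice.message.parsed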