
Pydantic conversion logic for structured outputs is broken for models containing dictionaries #2004

Open
@dbczumar

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

There's a bug in the OpenAI Python client's logic for translating Pydantic models that contain dictionaries into structured outputs JSON schema definitions: dictionaries are always required to be empty in the resulting JSON schema. This makes dictionary outputs significantly less useful, since the LLM is never allowed to populate them.

I've filed a small PR to fix this and introduce test coverage: #2003
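For illustration only, here is a minimal sketch of the kind of change that would avoid the problem. It assumes that an object schema without declared `properties` is meant to be a free-form dictionary. The function name `close_declared_objects` is hypothetical; this is not the code from #2003 and not the library's implementation, and it only recurses into nested `properties`.

```python
from typing import Any, Dict


def close_declared_objects(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy in which only objects that declare properties are closed.

    Hypothetical helper: objects without a `properties` key are treated as
    free-form dictionaries and are left untouched.
    """
    result = dict(schema)
    if result.get("type") == "object" and "properties" in result:
        # Close the object unless the schema already states a preference.
        result.setdefault("additionalProperties", False)
        # Apply the same rule to each declared property.
        result["properties"] = {
            name: close_declared_objects(sub)
            for name, sub in result["properties"].items()
        }
    return result


# Hand-written input resembling a model with a single dictionary field.
raw = {
    "type": "object",
    "properties": {"arguments": {"type": "object", "title": "Arguments"}},
    "required": ["arguments"],
}
fixed = close_declared_objects(raw)
print(fixed["additionalProperties"])                               # False
print("additionalProperties" in fixed["properties"]["arguments"])  # False
```

With this rule the top-level model is still closed, while the dictionary field keeps no `additionalProperties` restriction and can therefore be populated.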

To Reproduce

import json
from typing import Any, Dict

import pydantic

from openai.lib._pydantic import to_strict_json_schema

class GenerateToolCallArguments(pydantic.BaseModel):
    arguments: Dict[str, Any] = pydantic.Field(description="The arguments to pass to the tool")

print(json.dumps(to_strict_json_schema(GenerateToolCallArguments), indent=4))

Observe that the output inserts "additionalProperties": false into the schema for the arguments field. Because that object declares no properties of its own, the dictionary must always be empty:

{
    "properties": {
        "arguments": {
            "description": "The arguments to pass to the tool",
            "title": "Arguments",
            "type": "object",
            # THE INSERTION OF THIS LINE IS A BUG
            "additionalProperties": false
        }
    },
    "required": [
        "arguments"
    ],
    "title": "GenerateToolCallArguments",
    "type": "object",
    "additionalProperties": false
}
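To make the consequence concrete, the sketch below checks two candidate values against the `arguments` sub-schema from the output above. `object_allows` is a hypothetical helper that implements only the `properties` / `additionalProperties` rules for objects, not a full JSON Schema validator.

```python
from typing import Any, Dict


def object_allows(schema: Dict[str, Any], value: Dict[str, Any]) -> bool:
    """Check only the `properties` / `additionalProperties` rules of an object schema."""
    declared = schema.get("properties", {})
    extras_allowed = schema.get("additionalProperties", True)
    # Any key not declared under `properties` counts as an additional property.
    extra_keys = [key for key in value if key not in declared]
    return extras_allowed is not False or not extra_keys


# The sub-schema emitted for the `arguments` field in the output above.
arguments_schema = {
    "description": "The arguments to pass to the tool",
    "title": "Arguments",
    "type": "object",
    "additionalProperties": False,
}

print(object_allows(arguments_schema, {}))                 # True
print(object_allows(arguments_schema, {"city": "Paris"}))  # False
```

Only the empty dictionary passes, because every key in a populated dictionary is an additional property that the schema forbids.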

Code snippets

No response

OS

macOS

Python version

Python v3.10.12

Library version

1.59.6

Labels

bug (Something isn't working)
