### Confirm this is an issue with the Python library and not an underlying OpenAI API

- [x] This is an issue with the Python library
### Describe the bug
There appears to be a memory leak when using the `.parse()` method on `AsyncCompletions` with Pydantic models created via `create_model`. When submitting several calls, memory usage keeps rising. I haven't found any plateau yet, which suggests the parsers built on these models are never garbage collected.
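As a rough stdlib-only check (a sketch, not part of the repro), one can count the live classes carrying a given name via `gc.get_objects()`; if the dynamically created models were collectable, this count should stay flat across iterations:

```python
import gc


def count_live_classes(name: str) -> int:
    """Count live class objects whose __name__ matches `name`,
    e.g. the "MathResponse" models created in the snippet below.
    A count that keeps growing after gc.collect() points at
    references being retained somewhere."""
    gc.collect()
    return sum(
        1
        for obj in gc.get_objects()
        if isinstance(obj, type) and obj.__name__ == name
    )
```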
### To Reproduce
- Have a function that creates a new Pydantic model with `create_model`
- Make several calls where the `response_format` param always gets a fresh model from that function
- Monitor the memory
We do have a workaround, though: passing the model through `type_to_response_format_param()` and using `.create()` instead of `.parse()`. The leaking scenario is called `leaking` and the safe one `non_leaking` in the snippets below.
Please let me know if you need more info. Thanks a lot.
### Code snippets
```python
import asyncio
import gc
from typing import List

from memory_profiler import profile
from openai import AsyncOpenAI
from openai.lib._parsing import type_to_response_format_param
from pydantic import Field, create_model

StepModel = create_model(
    "Step",
    explanation=(str, Field()),
    output=(str, Field()),
)


def create_new_model():
    """This looks useless as written. In our business case, I'm generating a
    model that is slightly different on each call, hence the use of
    create_model. This illustrates how a model that seems to always be the
    same keeps adding up in memory."""
    return create_model(
        "MathResponse",
        steps=(List[StepModel], Field()),
        final_answer=(str, Field()),
    )


@profile()
async def leaking_call(client, new_model):
    await client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "solve 8x + 31 = 2"},
        ],
        response_format=new_model,
    )


async def non_leaking_call(client, new_model):
    await client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "solve 8x + 31 = 2"},
        ],
        response_format=type_to_response_format_param(new_model),
    )


async def main():
    client = AsyncOpenAI()
    for _ in range(200):
        # You can switch to `non_leaking_call` and see that the memory is correctly emptied
        await leaking_call(client, create_new_model())
        # We wanted to thoroughly check the memory usage, hence memory profiler + gc
        gc.collect()
        print(len(gc.get_objects()))


if __name__ == "__main__":
    asyncio.run(main())
```
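For anyone who wants to double-check without `memory_profiler`, here is a minimal stdlib-only sketch using `tracemalloc` (it assumes the `leaking_call`, `create_new_model`, and imports defined above):

```python
import tracemalloc


async def tracemalloc_check():
    """Compare heap snapshots around the leaking loop to see where the
    retained memory was allocated. Stdlib alternative to memory_profiler;
    reuses leaking_call / create_new_model from the snippet above."""
    client = AsyncOpenAI()
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(50):
        await leaking_call(client, create_new_model())
    gc.collect()
    after = tracemalloc.take_snapshot()
    # Top 10 allocation sites by retained size growth
    for stat in after.compare_to(before, "lineno")[:10]:
        print(stat)
```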
### OS

macOS

### Python version

Python 3.11.9

### Library version

openai v1.64.0