feat(api): add token logprobs to chat completions #980

Merged · 1 commit · Dec 17, 2023

1 change: 1 addition & 0 deletions api.md
@@ -38,6 +38,7 @@ from openai.types.chat import (
ChatCompletionNamedToolChoice,
ChatCompletionRole,
ChatCompletionSystemMessageParam,
ChatCompletionTokenLogprob,
ChatCompletionTool,
ChatCompletionToolChoiceOption,
ChatCompletionToolMessageParam,
122 changes: 104 additions & 18 deletions src/openai/resources/chat/completions.py

Large diffs are not rendered by default.
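
The unrendered diff is where the new `logprobs` and `top_logprobs` request parameters are threaded through the chat `create` overloads. A minimal sketch of how they are used once this PR ships (the model name and prompt are illustrative, not taken from the PR):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask for per-token log probabilities on the generated message.
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative model name
    messages=[{"role": "user", "content": "Say hello."}],
    logprobs=True,
    top_logprobs=3,  # 0-5; only valid when logprobs=True
)

choice = completion.choices[0]
if choice.logprobs is not None and choice.logprobs.content is not None:
    for entry in choice.logprobs.content:
        print(entry.token, entry.logprob)
```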

66 changes: 36 additions & 30 deletions src/openai/resources/completions.py
@@ -119,14 +119,15 @@ def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.

logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.

max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -288,14 +289,15 @@ def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.

logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.

max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -450,14 +452,15 @@ def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.

logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.

max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -687,14 +690,15 @@ async def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.

logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.

max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -856,14 +860,15 @@ async def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.

logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.

max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
@@ -1018,14 +1023,15 @@ async def create(
As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
from being generated.

logprobs: Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.

max_tokens: The maximum number of [tokens](/tokenizer) to generate in the completion.
max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.
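
For the legacy completions endpoint the parameter shape is unchanged (an integer `logprobs`, maximum 5); only the docstring wording is updated above. For reference, a hedged usage sketch (model and prompt are illustrative):

```python
from openai import OpenAI

client = OpenAI()

# Legacy completions take an integer `logprobs` (maximum 5), not a boolean.
response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # illustrative legacy completions model
    prompt="The capital of France is",
    max_tokens=1,
    logprobs=5,
)

lp = response.choices[0].logprobs
if lp is not None:
    print(lp.tokens)        # the sampled tokens
    print(lp.top_logprobs)  # per position, the 5 most likely tokens and their logprobs
```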
6 changes: 4 additions & 2 deletions src/openai/resources/files.py
@@ -51,7 +51,8 @@ def create(
The size of all the
files uploaded by one organization can be up to 100 GB.

The size of individual files can be a maximum of 512 MB. See the
The size of individual files can be a maximum of 512 MB or 2 million tokens for
Assistants. See the
[Assistants Tools guide](https://platform.openai.com/docs/assistants/tools) to
learn more about the types of files supported. The Fine-tuning API only supports
`.jsonl` files.
@@ -314,7 +315,8 @@ async def create(
The size of all the
files uploaded by one organization can be up to 100 GB.

The size of individual files can be a maximum of 512 MB. See the
The size of individual files can be a maximum of 512 MB or 2 million tokens for
Assistants. See the
[Assistants Tools guide](https://platform.openai.com/docs/assistants/tools) to
learn more about the types of files supported. The Fine-tuning API only supports
`.jsonl` files.
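
The revised limits apply to the existing `files.create` call; a short sketch, assuming a local file and the Assistants use case:

```python
from openai import OpenAI

client = OpenAI()

# Individual files: up to 512 MB, or 2 million tokens for Assistants;
# all files for an organization: up to 100 GB.
uploaded = client.files.create(
    file=open("knowledge.pdf", "rb"),  # illustrative local file
    purpose="assistants",
)
print(uploaded.id, uploaded.bytes)
```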
@@ -16,4 +16,4 @@ class MessageCreationStepDetails(BaseModel):
message_creation: MessageCreation

type: Literal["message_creation"]
"""Always `message_creation``."""
"""Always `message_creation`."""
2 changes: 1 addition & 1 deletion src/openai/types/beta/threads/runs/run_step.py
@@ -66,7 +66,7 @@ class RunStep(BaseModel):
"""

object: Literal["thread.run.step"]
"""The object type, which is always `thread.run.step``."""
"""The object type, which is always `thread.run.step`."""

run_id: str
"""
3 changes: 3 additions & 0 deletions src/openai/types/chat/__init__.py
@@ -13,6 +13,9 @@
from .chat_completion_message_param import (
ChatCompletionMessageParam as ChatCompletionMessageParam,
)
from .chat_completion_token_logprob import (
ChatCompletionTokenLogprob as ChatCompletionTokenLogprob,
)
from .chat_completion_message_tool_call import (
ChatCompletionMessageToolCall as ChatCompletionMessageToolCall,
)
11 changes: 10 additions & 1 deletion src/openai/types/chat/chat_completion.py
@@ -6,8 +6,14 @@
from ..._models import BaseModel
from ..completion_usage import CompletionUsage
from .chat_completion_message import ChatCompletionMessage
from .chat_completion_token_logprob import ChatCompletionTokenLogprob

__all__ = ["ChatCompletion", "Choice"]
__all__ = ["ChatCompletion", "Choice", "ChoiceLogprobs"]


class ChoiceLogprobs(BaseModel):
content: Optional[List[ChatCompletionTokenLogprob]]
"""A list of message content tokens with log probability information."""


class Choice(BaseModel):
@@ -24,6 +30,9 @@ class Choice(BaseModel):
index: int
"""The index of the choice in the list of choices."""

logprobs: Optional[ChoiceLogprobs]
"""Log probability information for the choice."""

message: ChatCompletionMessage
"""A chat completion message generated by the model."""

10 changes: 10 additions & 0 deletions src/openai/types/chat/chat_completion_chunk.py
@@ -4,6 +4,7 @@
from typing_extensions import Literal

from ..._models import BaseModel
from .chat_completion_token_logprob import ChatCompletionTokenLogprob

__all__ = [
"ChatCompletionChunk",
@@ -12,6 +13,7 @@
"ChoiceDeltaFunctionCall",
"ChoiceDeltaToolCall",
"ChoiceDeltaToolCallFunction",
"ChoiceLogprobs",
]


@@ -70,6 +72,11 @@ class ChoiceDelta(BaseModel):
tool_calls: Optional[List[ChoiceDeltaToolCall]] = None


class ChoiceLogprobs(BaseModel):
content: Optional[List[ChatCompletionTokenLogprob]]
"""A list of message content tokens with log probability information."""


class Choice(BaseModel):
delta: ChoiceDelta
"""A chat completion delta generated by streamed model responses."""
@@ -87,6 +94,9 @@ class Choice(BaseModel):
index: int
"""The index of the choice in the list of choices."""

logprobs: Optional[ChoiceLogprobs] = None
"""Log probability information for the choice."""


class ChatCompletionChunk(BaseModel):
id: str
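
With `ChoiceLogprobs` now present on streamed choices as well, per-token log probabilities arrive chunk by chunk. A minimal streaming sketch (model and prompt are illustrative):

```python
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative model name
    messages=[{"role": "user", "content": "Count to three."}],
    logprobs=True,
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue
    choice = chunk.choices[0]
    # Each streamed choice may carry a logprobs delta alongside the content delta.
    if choice.logprobs is not None and choice.logprobs.content is not None:
        for entry in choice.logprobs.content:
            print(entry.token, entry.logprob)
```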
@@ -2,13 +2,14 @@

from __future__ import annotations

from typing import Optional
from typing_extensions import Literal, Required, TypedDict

__all__ = ["ChatCompletionFunctionMessageParam"]


class ChatCompletionFunctionMessageParam(TypedDict, total=False):
content: Required[str]
content: Required[Optional[str]]
"""The contents of the function message."""

name: Required[str]
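
Since `content` is now `Optional`, a function message may carry an explicit `None`. A small sketch of a valid message dict (the function name is illustrative):

```python
from openai.types.chat import ChatCompletionFunctionMessageParam

# `content` is still required, but may now be None,
# e.g. when the function produced no output.
message: ChatCompletionFunctionMessageParam = {
    "role": "function",
    "name": "get_weather",  # illustrative function name
    "content": None,
}
```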
47 changes: 47 additions & 0 deletions src/openai/types/chat/chat_completion_token_logprob.py
@@ -0,0 +1,47 @@
# File generated from our OpenAPI spec by Stainless.

from typing import List, Optional

from ..._models import BaseModel

__all__ = ["ChatCompletionTokenLogprob", "TopLogprob"]


class TopLogprob(BaseModel):
token: str
"""The token."""

bytes: Optional[List[int]]
"""A list of integers representing the UTF-8 bytes representation of the token.

Useful in instances where characters are represented by multiple tokens and
their byte representations must be combined to generate the correct text
representation. Can be `null` if there is no bytes representation for the token.
"""

logprob: float
"""The log probability of this token."""


class ChatCompletionTokenLogprob(BaseModel):
token: str
"""The token."""

bytes: Optional[List[int]]
"""A list of integers representing the UTF-8 bytes representation of the token.

Useful in instances where characters are represented by multiple tokens and
their byte representations must be combined to generate the correct text
representation. Can be `null` if there is no bytes representation for the token.
"""

logprob: float
"""The log probability of this token."""

top_logprobs: List[TopLogprob]
"""List of the most likely tokens and their log probability, at this token
position.

In rare cases, there may be fewer than the number of requested `top_logprobs`
returned.
"""
23 changes: 21 additions & 2 deletions src/openai/types/chat/completion_create_params.py
@@ -78,7 +78,7 @@ class CompletionCreateParamsBase(TypedDict, total=False):
particular function via `{"name": "my_function"}` forces the model to call that
function.

`none` is the default when no functions are present. `auto`` is the default if
`none` is the default when no functions are present. `auto` is the default if
functions are present.
"""

@@ -99,8 +99,18 @@ class CompletionCreateParamsBase(TypedDict, total=False):
or exclusive selection of the relevant token.
"""

logprobs: Optional[bool]
"""Whether to return log probabilities of the output tokens or not.

If true, returns the log probabilities of each output token returned in the
`content` of `message`. This option is currently not available on the
`gpt-4-vision-preview` model.
"""

max_tokens: Optional[int]
"""The maximum number of [tokens](/tokenizer) to generate in the chat completion.
"""
The maximum number of [tokens](/tokenizer) that can be generated in the chat
completion.

The total length of input tokens and generated tokens is limited by the model's
context length.
@@ -127,6 +137,8 @@ class CompletionCreateParamsBase(TypedDict, total=False):
response_format: ResponseFormat
"""An object specifying the format that the model must output.

Compatible with `gpt-4-1106-preview` and `gpt-3.5-turbo-1106`.

Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
message the model generates is valid JSON.

@@ -180,6 +192,13 @@ class CompletionCreateParamsBase(TypedDict, total=False):
functions the model may generate JSON inputs for.
"""

top_logprobs: Optional[int]
"""
An integer between 0 and 5 specifying the number of most likely tokens to return
at each token position, each with an associated log probability. `logprobs` must
be set to `true` if this parameter is used.
"""

top_p: Optional[float]
"""
An alternative to sampling with temperature, called nucleus sampling, where the
12 changes: 7 additions & 5 deletions src/openai/types/completion_create_params.py
@@ -88,16 +88,18 @@ class CompletionCreateParamsBase(TypedDict, total=False):

logprobs: Optional[int]
"""
Include the log probabilities on the `logprobs` most likely tokens, as well the
chosen tokens. For example, if `logprobs` is 5, the API will return a list of
the 5 most likely tokens. The API will always return the `logprob` of the
sampled token, so there may be up to `logprobs+1` elements in the response.
Include the log probabilities on the `logprobs` most likely output tokens, as
well the chosen tokens. For example, if `logprobs` is 5, the API will return a
list of the 5 most likely tokens. The API will always return the `logprob` of
the sampled token, so there may be up to `logprobs+1` elements in the response.

The maximum value for `logprobs` is 5.
"""

max_tokens: Optional[int]
"""The maximum number of [tokens](/tokenizer) to generate in the completion.
"""
The maximum number of [tokens](/tokenizer) that can be generated in the
completion.

The token count of your prompt plus `max_tokens` cannot exceed the model's
context length.