[Feature][CLI] Unify configuration for structured outputs via --structured-outputs-config #17420


Open · wants to merge 4 commits into main

Conversation

@aarnphm (Collaborator) commented Apr 29, 2025

This PR introduces the argument --structured-output-config as a way to unify all related structured outputs config into one CLI field.
This should simplify the general UX for specifying custom options with backends.

This means all previous arguments are considered deprecated, and --structured-output-config should be used going forward (v0.10.x).

This is the first of many steps to move everything from the guided_decoding namespace to structured_output. I plan to finish this migration by v0.10.x
for both the OpenAI frontend and all paths within the vLLM codebase.
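To make the unification concrete, here is a minimal sketch of how a single JSON-valued CLI flag could replace several individual arguments. The option and field names are illustrative, not the final vLLM implementation:

```python
# Sketch only: parse a single JSON blob for all structured-output options,
# instead of one CLI flag per field. Field names are illustrative.
import json
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument("--structured-outputs-config", type=json.loads, default={})

args = parser.parse_args([
    "--structured-outputs-config",
    '{"backend": "xgrammar", "disable_fallback": true}',
])

# All related options arrive as one dict, ready to build a config object from.
print(args.structured_outputs_config["backend"])
```
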

Signed-off-by: Aaron Pham [email protected]

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs will not trigger a full CI run by default. Instead, only the fastcheck CI will run, which starts with a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@aarnphm aarnphm requested review from hmellor and removed request for comaniac, njhill, zhuohan123, youkaichao and alexm-redhat April 29, 2025 22:58
@aarnphm aarnphm changed the title [Feature][CLI] Unify configuration for structured outputs [Feature][CLI] Unify configuration for structured outputs via --structured-output-config Apr 29, 2025
@mergify mergify bot added documentation Improvements or additions to documentation structured-output v1 tool-calling labels Apr 29, 2025
@hmellor (Member) left a comment

Not a full review but I've left some initial comments.

I'm not a huge fan of using JSON to bundle what could be separate arguments. As it is right now, each argument is listed and clearly described (which we get for free because we documented it in the dataclass anyway).

If you bundle it into JSON:

  • the documentation and --help text won't look as nice
  • you'll have to maintain separate documentation explaining all the args in the dataclass
  • it will become out of sync with the dataclass documentation

edit: After speaking to @aarnphm offline, I'm going to try adding the Config classes to the API reference directly. Then the JSON arg could reference that directly, with no duplicated or missing information.

mergify bot commented Apr 30, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@aarnphm aarnphm force-pushed the feat/decoding-args branch 2 times, most recently from e299540 to c70b8cf Compare April 30, 2025 21:16
@mergify mergify bot added the frontend label Apr 30, 2025
@aarnphm aarnphm force-pushed the feat/decoding-args branch from 958eacf to 26419c4 Compare April 30, 2025 21:51
mergify bot commented May 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

vllm/config.py Outdated
Comment on lines 3292 to 3298
# This class is purely to use for typing
class StructuredOutputOptions(TypedDict, total=False):
backend: Annotated[str, StructuredOutputBackend]
reasoning_backend: Optional[str]
disable_fallback: bool
disable_any_whitespace: bool
disable_additional_properties: bool
Member

Why mirror StructuredOutputConfig? You can type hint using StructuredOutputConfig (we do this for all the other configs)

Collaborator Author

Without this, the type hints would force users to construct the class instead of passing a plain dict.

This is mainly to support the case

structured_output_config = {"backend": " <-- auto completion here

Member

I don't think we want to be passing the configs around as dicts

@aarnphm (Collaborator Author) commented May 15, 2025

No, the internal representation will still be the class StructuredOutputConfig. As you can see, this is used with Unpack at L3344 so that the type checker won't complain about kwargs expansion:

StructuredOutputBackend(**structured_output_config)

This class is used to hint --structured-output-config from the CLI opts, and from the LLM entrypoint when users pass in structured_output_config={...}
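For readers unfamiliar with this pattern, here is a small self-contained sketch of how a TypedDict mirroring a config dataclass lets the type checker validate kwargs expansion while the internal representation remains the class. The class and field names here are illustrative, not the exact vLLM definitions:

```python
# Sketch: a TypedDict mirrors the dataclass purely for typing, so that
# users can pass a plain dict while the code normalizes to the class.
from dataclasses import dataclass
from typing import TypedDict

try:
    from typing import Unpack  # Python >= 3.11
except ImportError:
    from typing_extensions import Unpack


class StructuredOutputOptions(TypedDict, total=False):
    backend: str
    disable_fallback: bool


@dataclass
class StructuredOutputConfig:
    backend: str = "auto"
    disable_fallback: bool = False


def make_config(**kwargs: Unpack[StructuredOutputOptions]) -> StructuredOutputConfig:
    # The dict is only the user-facing surface; internally we keep the class.
    return StructuredOutputConfig(**kwargs)


cfg = make_config(backend="xgrammar")
print(cfg.backend, cfg.disable_fallback)
```

With the TypedDict in place, editors can offer completion for the dict keys (e.g. `structured_output_config = {"backend": ...}`), which is the use case described above.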

Member

The CLI can handle dataclasses as default values, and with #17599 it can even validate their schemas directly from the CLI.

when users pass in structured_output_config={...}

I don't think we want to allow users to do that, if it's type hinted with StructuredOutputConfig, they'll know what they need to import. I don't really want to create a special case for this config.

Collaborator Author

This is to mirror the option --structured-output-config "{}".

That implies it can be passed in as a dict, right? So naturally structured_output_config={} should be supported as well.

This will also apply to --speculative-config.
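To illustrate the symmetry being argued for here, the following sketch shows how a JSON string on the CLI and a dict at the Python entrypoint could both normalize to the same config class. The field names and the `from_cli` helper are hypothetical:

```python
# Sketch: one normalization path for both
#   --structured-outputs-config '{"backend": "outlines"}'  (CLI, JSON string)
#   structured_output_config={"backend": "outlines"}       (Python, dict)
# Field names are illustrative, not the exact vLLM schema.
import json
from dataclasses import dataclass, fields


@dataclass
class StructuredOutputConfig:
    backend: str = "auto"
    disable_any_whitespace: bool = False


def from_cli(value) -> StructuredOutputConfig:
    # Accept either a JSON string (CLI) or an already-parsed dict (Python API).
    data = json.loads(value) if isinstance(value, str) else dict(value)
    allowed = {f.name for f in fields(StructuredOutputConfig)}
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unknown structured output options: {unknown}")
    return StructuredOutputConfig(**data)


print(from_cli('{"backend": "outlines"}').backend)
```
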

@hmellor (Member) left a comment

This is a good PR, I just have a lot of comments about how some things don't fit in well with the design of the rest of the configs. With a few relatively simple changes it can fit right in.

  logger.debug("Building with reasoning backend %s",
-              self.decoding_config.reasoning_backend)
+              self.structured_output_config.reasoning_backend)

processor = get_local_guided_decoding_logits_processor(
Member

Should this method be renamed to structured output too?

Collaborator Author

same as above

Collaborator Author

Eh, I probably won't touch this code path; it's just that people using v0 will see a lot of deprecation warnings if we don't touch it 😄

Member

Hmm, I'd say we do want V0 users to see deprecation warnings, but there are still a few legitimate use cases for it, so we should probably avoid warnings about this if we can.

@@ -616,11 +616,11 @@ async def _process_request(
build_guided_decoding_logits_processor_async(
Member

Should this method be renamed to structured output too?

Collaborator Author

This is not user facing, and it's in v0, so it's probably not necessary to rename it here now (given that the v0 code path still calls it guided_decoding).

@@ -444,7 +445,7 @@ async def get_model_config(self) -> ModelConfig:
return self.model_config

async def get_decoding_config(self):
Member

Should we rename this method?

Collaborator Author

I don't think we use this function for anything atm.

FWIW, we can always get this from get_vllm_config().structured_outputs_config.

Member

I'm not sure why these getters exist, tbh. Do you need async getters to access members of classes used in async contexts?

@aarnphm (Collaborator Author) commented May 16, 2025

I can't find anywhere that we use this; I'll probably create a smaller PR to remove these no-ops.

It seems like we already use vllm_config everywhere else, so maybe this is a use case downstream?

@aarnphm aarnphm force-pushed the feat/decoding-args branch from f5ad6ed to 3002d08 Compare May 16, 2025 08:56
@aarnphm aarnphm changed the title [Feature][CLI] Unify configuration for structured outputs via --structured-output-config [Feature][CLI] Unify configuration for structured outputs via --structured-outputs-config May 16, 2025
@aarnphm aarnphm requested a review from hmellor May 16, 2025 08:59
@aarnphm aarnphm force-pushed the feat/decoding-args branch from 304617a to d357d9e Compare May 16, 2025 09:57
@aarnphm aarnphm force-pushed the feat/decoding-args branch from 5a74c80 to f5a8510 Compare May 16, 2025 10:35
@russellb (Member)

I personally find CLI arguments that take a JSON string pretty horrendous and don't like this trend in vLLM. It's a sign we're over-extending simple CLI arguments. We need a proper configuration file that supports structured, hierarchical data. I don't consider the current "yaml" config support to count, since it's just the same as the CLI args, including JSON blob values.

I don't have time to take on such a project, but I'd prefer to see efforts put in that direction than this direction.

@DarkLight1337 (Member)

#18208 made it simpler to specify JSON-like args.

mergify bot commented May 21, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label May 21, 2025

4 participants