[Feature][CLI] Unify configuration for structured outputs via --structured-outputs-config #17420


Open · wants to merge 4 commits into main

Conversation

@aarnphm (Collaborator) commented Apr 29, 2025

This PR introduces the argument --structured-output-config as a way to unify all related structured outputs config into one CLI field.
This should simplify the general UX for specifying custom options with backends.

This means all previous arguments are considered deprecated, and --structured-output-config should be used going forward (v0.10.x).

This is the first of many steps to move everything from the guided_decoding namespace to structured_output. I plan to finish this migration by v0.10.x
for both the OpenAI frontend and all paths within the vLLM codebase.
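To make the unification concrete, here is a minimal sketch of how a single JSON-valued CLI flag could replace several individual arguments. The option and field names are illustrative, not the final vLLM implementation:

```python
# Sketch only: parse a single JSON blob for all structured-output options,
# instead of one CLI flag per field. Field names are illustrative.
import json
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument("--structured-outputs-config", type=json.loads, default={})

args = parser.parse_args([
    "--structured-outputs-config",
    '{"backend": "xgrammar", "disable_fallback": true}',
])

# All related options arrive as one dict, ready to build a config object from.
print(args.structured_outputs_config["backend"])
```
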

Signed-off-by: Aaron Pham [email protected]

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs will not trigger a full CI run by default. Instead, only the fastcheck CI will run, which starts with a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@aarnphm aarnphm requested review from hmellor and removed request for comaniac, njhill, zhuohan123, youkaichao and alexm-redhat April 29, 2025 22:58
@aarnphm aarnphm changed the title [Feature][CLI] Unify configuration for structured outputs [Feature][CLI] Unify configuration for structured outputs via --structured-output-config Apr 29, 2025
@mergify mergify bot added documentation Improvements or additions to documentation structured-output v1 tool-calling labels Apr 29, 2025
@hmellor (Member) left a comment

Not a full review but I've left some initial comments.

I'm not a huge fan of using JSON to bundle what could be separate arguments. As it is right now, each argument is listed and clearly described (which we get for free because we documented it in the dataclass anyway).

If you bundle it into JSON:

  • the documentation and --help text won't look as nice
  • you'll have to maintain separate documentation explaining all the args in the dataclass
  • it will become out of sync with the dataclass documentation

edit: After speaking to @aarnphm offline, I'm going to try adding the Config classes to the API reference directly. Then the JSON arg could reference that directly, with no duplicated or missing information.

mergify bot commented Apr 30, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@aarnphm aarnphm force-pushed the feat/decoding-args branch 2 times, most recently from e299540 to c70b8cf Compare April 30, 2025 21:16
@mergify mergify bot added the frontend label Apr 30, 2025
@aarnphm aarnphm force-pushed the feat/decoding-args branch from 958eacf to 26419c4 Compare April 30, 2025 21:51
mergify bot commented May 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

vllm/config.py Outdated
Comment on lines 3292 to 3298
# This class is purely to use for typing
class StructuredOutputOptions(TypedDict, total=False):
backend: Annotated[str, StructuredOutputBackend]
reasoning_backend: Optional[str]
disable_fallback: bool
disable_any_whitespace: bool
disable_additional_properties: bool
Member

Why mirror StructuredOutputConfig? You can type hint using StructuredOutputConfig (we do this for all the other configs)

Collaborator Author

Without this, the type hints would force users to construct the class instead of passing a plain dict.

This is mainly to support the case

structured_output_config = {"backend": " <-- auto completion here

Member

I don't think we want to be passing the configs around as dicts

@aarnphm (Collaborator Author) commented May 15, 2025

No, the internal representation will still be the class StructuredOutputConfig. As you can see, this is used with Unpack at L3344 so that the type checker won't complain about kwargs expansion:

StructuredOutputBackend(**structured_output_config)

This class is used to hint --structured-output-config from the CLI opts, and from the LLM entrypoint when users pass in structured_output_config={...}
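For readers unfamiliar with this pattern, here is a small self-contained sketch of how a TypedDict mirroring a config dataclass lets the type checker validate kwargs expansion while the internal representation remains the class. The class and field names here are illustrative, not the exact vLLM definitions:

```python
# Sketch: a TypedDict mirrors the dataclass purely for typing, so that
# users can pass a plain dict while the code normalizes to the class.
from dataclasses import dataclass
from typing import TypedDict

try:
    from typing import Unpack  # Python >= 3.11
except ImportError:
    from typing_extensions import Unpack


class StructuredOutputOptions(TypedDict, total=False):
    backend: str
    disable_fallback: bool


@dataclass
class StructuredOutputConfig:
    backend: str = "auto"
    disable_fallback: bool = False


def make_config(**kwargs: Unpack[StructuredOutputOptions]) -> StructuredOutputConfig:
    # The dict is only the user-facing surface; internally we keep the class.
    return StructuredOutputConfig(**kwargs)


cfg = make_config(backend="xgrammar")
print(cfg.backend, cfg.disable_fallback)
```

With the TypedDict in place, editors can offer completion for the dict keys (e.g. `structured_output_config = {"backend": ...}`), which is the use case described above.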

Member

The CLI can handle dataclasses as default values, and with #17599 it can even validate their schemas directly from the CLI.

when users pass in structured_output_config={...}

I don't think we want to allow users to do that, if it's type hinted with StructuredOutputConfig, they'll know what they need to import. I don't really want to create a special case for this config.

Collaborator Author

This is to mirror the option --structured-output-config "{}".

That implies it can be passed in as a dict, right? So naturally structured_output_config={} should be supported as well.

This will also apply to --speculative-config.
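To illustrate the symmetry being argued for here, the following sketch shows how a JSON string on the CLI and a dict at the Python entrypoint could both normalize to the same config class. The field names and the `from_cli` helper are hypothetical:

```python
# Sketch: one normalization path for both
#   --structured-outputs-config '{"backend": "outlines"}'  (CLI, JSON string)
#   structured_output_config={"backend": "outlines"}       (Python, dict)
# Field names are illustrative, not the exact vLLM schema.
import json
from dataclasses import dataclass, fields


@dataclass
class StructuredOutputConfig:
    backend: str = "auto"
    disable_any_whitespace: bool = False


def from_cli(value) -> StructuredOutputConfig:
    # Accept either a JSON string (CLI) or an already-parsed dict (Python API).
    data = json.loads(value) if isinstance(value, str) else dict(value)
    allowed = {f.name for f in fields(StructuredOutputConfig)}
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unknown structured output options: {unknown}")
    return StructuredOutputConfig(**data)


print(from_cli('{"backend": "outlines"}').backend)
```
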

@hmellor (Member) left a comment

This is a good PR, I just have a lot of comments about how some things don't fit in well with the design of the rest of the configs. With a few relatively simple changes it can fit right in.

  logger.debug("Building with reasoning backend %s",
-              self.decoding_config.reasoning_backend)
+              self.structured_output_config.reasoning_backend)

processor = get_local_guided_decoding_logits_processor(
Member

Should this method be renamed to structured output too?

Collaborator Author

same as above

Collaborator Author

Eh, I probably won't touch this code path; it's just that people using v0 will see a lot of deprecation warnings if we don't touch it 😄

Member

Hmm, I'd say we do want V0 users to see deprecation warnings, but there are still a few legitimate use cases for it, so we should probably avoid warnings about this if we can.

@@ -616,11 +616,11 @@ async def _process_request(
build_guided_decoding_logits_processor_async(
Member

Should this method be renamed to structured output too?

Collaborator Author

This is not user facing, and it's in v0, so it's probably not necessary to rename it here now (given that the v0 code path still calls it guided_decoding).

@@ -444,7 +445,7 @@ async def get_model_config(self) -> ModelConfig:
return self.model_config

async def get_decoding_config(self):
Member

Should we rename this method?

Collaborator Author

I don't think we use this function for anything atm.

FWIW, we can always get this from get_vllm_config().structured_outputs_config.

Member

I'm not sure why these getters exist, tbh. Do you need async getters to access members of classes used in async contexts?

@aarnphm (Collaborator Author) commented May 16, 2025

I can't find anywhere that we use this; I'll probably create a smaller PR to remove these no-ops.

It seems like we already use vllm_config everywhere else, so maybe this is a use case downstream?

@aarnphm aarnphm force-pushed the feat/decoding-args branch from f5ad6ed to 3002d08 Compare May 16, 2025 08:56
@aarnphm aarnphm changed the title [Feature][CLI] Unify configuration for structured outputs via --structured-output-config [Feature][CLI] Unify configuration for structured outputs via --structured-outputs-config May 16, 2025
@aarnphm aarnphm requested a review from hmellor May 16, 2025 08:59
@aarnphm aarnphm force-pushed the feat/decoding-args branch from 304617a to d357d9e Compare May 16, 2025 09:57
@aarnphm aarnphm force-pushed the feat/decoding-args branch from 5a74c80 to f5a8510 Compare May 16, 2025 10:35
@russellb (Member)

I personally find CLI arguments that take a JSON string pretty horrendous and don't like this trend in vLLM. It's a sign we're over-extending simple CLI arguments. We need a proper configuration file that supports structured, hierarchical data. I don't consider the current "yaml" config support to count, since it's just the same as the CLI args, including JSON blob values.

I don't have time to take on such a project, but I'd prefer to see efforts put in that direction than this direction.

@DarkLight1337 (Member)

#18208 made it simpler to specify JSON-like args.

mergify bot commented May 21, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label May 21, 2025

4 participants