[Feature][CLI] Unify configuration for structured outputs via --structured-outputs-config
#17420
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of checks runs. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 …
--structured-output-config
Not a full review but I've left some initial comments.
I'm not a huge fan of using JSON to bundle what could be separate arguments. As it is right now, each argument is listed and clearly described (which we get for free because we documented it in the dataclass anyway).
If you bundle it into JSON:

- the documentation and `--help` text won't look as nice
- you'll have to maintain separate documentation explaining all the args in the dataclass
- it will become out of sync with the dataclass documentation
edit: After speaking to @aarnphm offline, I'm going to try adding the Config classes to the API reference directly. Then the JSON arg could reference that directly with no duplication or missing information.
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from e299540 to c70b8cf
Force-pushed from 958eacf to 26419c4
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from 8c123b1 to 3062207
vllm/config.py
Outdated
# This class is purely to use for typing
class StructuredOutputOptions(TypedDict, total=False):
    backend: Annotated[str, StructuredOutputBackend]
    reasoning_backend: Optional[str]
    disable_fallback: bool
    disable_any_whitespace: bool
    disable_additional_properties: bool
Why mirror `StructuredOutputConfig`? You can type hint using `StructuredOutputConfig` (we do this for all the other configs).
The hints would force users to create the class instead of hinting the dict.

This is mainly to support the case `structured_output_config = {"backend": "` <-- auto-completion here
I don't think we want to be passing the configs around as dicts
No, the internal representation will still be the class `StructuredOutputConfig`. As you can see, this is used in `Unpack` in L3344 so that the type checker won't complain about kwargs expansion: `StructuredOutputBackend(**structured_output_config)`.

This class is used to hint `--structured-output-config` from the CLI opts, and from the LLM entrypoint, when users pass in `structured_output_config={...}`.
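To make that concrete, here is a minimal sketch of the `TypedDict` + `Unpack` pattern being described, with illustrative field names and defaults (not vLLM's actual definitions): the dict-shaped hint gives callers auto-completion on the keys and lets the type checker accept the `**kwargs` expansion into the config class.

```python
from dataclasses import dataclass
from typing import Optional, TypedDict

from typing_extensions import Unpack


# Dict-shaped hint, used purely for typing (mirrors the config fields).
class StructuredOutputOptions(TypedDict, total=False):
    backend: str
    reasoning_backend: Optional[str]
    disable_fallback: bool


# The internal representation stays a proper config class.
@dataclass
class StructuredOutputConfig:
    backend: str = "auto"
    reasoning_backend: Optional[str] = None
    disable_fallback: bool = False


def build_config(**kwargs: Unpack[StructuredOutputOptions]) -> StructuredOutputConfig:
    # The type checker accepts this expansion because the keys and value
    # types are declared in the TypedDict above.
    return StructuredOutputConfig(**kwargs)


# Callers get completion on the keys, e.g. backend="xgrammar".
config = build_config(backend="xgrammar", disable_fallback=True)
```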
The CLI can handle dataclasses as default values, and with #17599 it can even validate their schemas directly from the CLI.
> when users pass in `structured_output_config={...}`

I don't think we want to allow users to do that. If it's type hinted with `StructuredOutputConfig`, they'll know what they need to import. I don't really want to create a special case for this config.
This is to mirror the option `--structured-output-config "{}"`. This implies it can be passed in as a dict, right? So naturally `structured_output_config={}` should be supported.

This will also apply to `--speculative-config`.
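For illustration, a hedged sketch of the two mirrored spellings being debated here; the model name is arbitrary, and whether the Python entrypoint ends up accepting a plain dict (rather than requiring a `StructuredOutputConfig` instance) is exactly the open question in this thread.

```python
from vllm import LLM

# CLI spelling (JSON string value):
#   vllm serve Qwen/Qwen2.5-7B-Instruct \
#     --structured-output-config '{"backend": "xgrammar", "disable_fallback": true}'

# Mirrored Python spelling under discussion: the same keys passed as a dict.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",
    structured_output_config={"backend": "xgrammar", "disable_fallback": True},
)
```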
This is a good PR, I just have a lot of comments about how some things don't fit in well with the design of the rest of the configs. With a few relatively simple changes it can fit right in.
logger.debug("Building with reasoning backend %s", | ||
self.decoding_config.reasoning_backend) | ||
self.structured_output_config.reasoning_backend) | ||
|
||
processor = get_local_guided_decoding_logits_processor( |
Should this method be renamed to structured output too?
same as above
eh, probably; I won't touch this code path, just that people using v0 will see a lot of deprecation warnings if we don't touch this path 😄
Hmm, I'd say we want V0 users to see deprecation errors, but there are still a few legitimate use cases for it so we should probably avoid warnings about this if we can.
@@ -616,11 +616,11 @@ async def _process_request(
    build_guided_decoding_logits_processor_async(
Should this method be renamed to structured output too?
This is not user facing, and it is in v0, so probably not needed to rename it here now (given that the code path for v0 still calls it guided_decoding)
vllm/v1/engine/async_llm.py
Outdated
@@ -444,7 +445,7 @@ async def get_model_config(self) -> ModelConfig:
    return self.model_config

async def get_decoding_config(self):
Should we rename this method?
I don't think we use this function for anything atm
fwiw we can always get this from get_vllm_config().structured_outputs_config
I'm not sure why these getters exist tbh. Do you need async getters to access members of classes used in async contexts?
I can't find anywhere that we use this; I will probably create a smaller PR to remove these no-ops.

Seems like we already use vllm_config everywhere else, so maybe this is a use case downstream?
Force-pushed from f5ad6ed to 3002d08
`--structured-output-config` → `--structured-outputs-config`
Force-pushed from 304617a to d357d9e
Signed-off-by: Aaron Pham <[email protected]>
Force-pushed from 5a74c80 to f5a8510
Signed-off-by: Aaron Pham <[email protected]>
Signed-off-by: Aaron Pham <[email protected]>
Signed-off-by: Aaron Pham <[email protected]>
I personally find CLI arguments that take a JSON string pretty horrendous and don't like this trend in vLLM. It's a sign we're over-extending simple CLI arguments. We need a proper configuration file that supports structured, hierarchical data. I don't consider the current "yaml" config support to count, since it's just the same as the CLI args, including JSON blob values. I don't have time to take on such a project, but I'd prefer to see efforts put in that direction than this direction.
#18208 made it simpler to specify JSON-like args.
This pull request has merge conflicts that must be resolved before it can be merged.
This PR introduces the arg `--structured-output-config` as a way to unify all related structured outputs config in one CLI field. This would help simplify the general UX for specifying custom options with backends.

This means all previous arguments are considered deprecated, and `--structured-output-config` should be used going forward (v0.10.x).

This is the first of many changes to move everything from the `guided_decoding` namespace to the `structured_output` namespace. I plan to finish this migration by v0.10.x for both the OpenAI frontend and all paths within the vLLM codebase.
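As one way to picture the deprecation path described above, here is a hypothetical sketch (not the PR's actual implementation; the helper name, fields, and warning text are illustrative) of funneling a legacy guided-decoding option into the unified config while warning the caller.

```python
import warnings
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StructuredOutputConfig:
    backend: str = "auto"
    disable_fallback: bool = False


def resolve_structured_output_config(
    structured_output_config: Optional[dict[str, Any]] = None,
    guided_decoding_backend: Optional[str] = None,  # legacy, deprecated option
) -> StructuredOutputConfig:
    # Start from the unified config (empty dict -> all defaults).
    config = StructuredOutputConfig(**(structured_output_config or {}))
    if guided_decoding_backend is not None:
        warnings.warn(
            "guided_decoding_backend is deprecated; pass "
            "--structured-output-config '{\"backend\": ...}' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        config.backend = guided_decoding_backend
    return config


# Legacy call site: still works, but warns and maps onto the unified config.
cfg = resolve_structured_output_config(guided_decoding_backend="xgrammar")
```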
Signed-off-by: Aaron Pham [email protected]