Closed
Hello,
I've read the docs and tried a few different ways to start speculative decoding, but they all fail.
E.g.
error: unrecognized arguments: --draft_model=prompt-lookup-decoding --draft_model_num_pred_tokens=2
or
Extra inputs are not permitted [type=extra_forbidden, input_value='prompt-lookup-decoding', input_type=str]
So what is the correct way to start an OpenAI-compatible server with speculative decoding?
Cheers.