
feat(serve): Enhance multi-node deployment and worker configuration #457

Merged
ishandhanani merged 18 commits into main from ishan/multinode-docs
Apr 1, 2025

@ishandhanani ishandhanani commented Apr 1, 2025

Overview

  • Add some docs for multinode deployments (8 decode workers on node1 and 8 prefill workers on node2)
  • Small improvement to dynamo serve to allow individual components to accept flags/configs
  • Add uvloop.install() back in. We used to have it in all of our old examples (see here)
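
For readers unfamiliar with the third bullet, here is a minimal sketch of what calling uvloop.install() before starting an asyncio program looks like. It is not the code from this PR: the helper install_fast_loop and the coroutine ping are hypothetical names, and the ImportError fallback is an assumption added so the sketch still runs where uvloop is not installed.

```python
import asyncio

try:
    import uvloop  # third-party package; treated as optional in this sketch
except ImportError:
    uvloop = None


def install_fast_loop() -> bool:
    """Hypothetical helper: switch asyncio to uvloop's event loop if available."""
    if uvloop is None:
        return False
    # Replaces the default asyncio event loop policy with uvloop's.
    uvloop.install()
    return True


async def ping() -> str:
    # Stand-in for real service work; yields to the event loop once.
    await asyncio.sleep(0)
    return "pong"


# install() has to run before the event loop is created.
using_uvloop = install_fast_loop()
result = asyncio.run(ping())
```

The call goes first because asyncio.run() creates its loop from whichever policy is active at that moment.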

Details:

  • Updated LLM README with multi-node deployment examples and instructions
  • Updated serving.py to handle worker environments and service-specific configurations

cc: #428 by @tlipoca9

copy-pr-bot bot commented Apr 1, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@tlipoca9 tlipoca9 left a comment

The change to disagg_router.yaml is unnecessary.

@ishandhanani ishandhanani enabled auto-merge (squash) April 1, 2025 18:05
@nnshah1 nnshah1 requested a review from tlipoca9 April 1, 2025 18:50
@mohammedabdulwahhab mohammedabdulwahhab left a comment

Agreed with @hutm that we need to clean up the different ways env, flags, and config are manipulated in serving.py. But that can come in a later follow-up.

@ai-dynamo ai-dynamo deleted a comment from mohammedabdulwahhab Apr 1, 2025
@ishandhanani ishandhanani merged commit 60fa74c into main Apr 1, 2025
5 checks passed
@ishandhanani ishandhanani deleted the ishan/multinode-docs branch April 1, 2025 22:34
if worker_envs:
    args.extend(["--worker-env", json.dumps(worker_envs)])
if resource_envs:
    args.extend(["--worker-env", json.dumps(resource_envs)])
@mohammedabdulwahhab, @ishandhanani: I'm still a little confused by the plural here. We are passing the full list of resource_envs to --worker-env, which is singular. Is that right?

Is it the full list for all workers that we pass here to a single process? Is this a global watcher watching all workers?
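
To make the question concrete, here is a minimal sketch of a singular flag carrying a JSON-encoded list for all workers, and of a receiver decoding it. It does not show how dynamo serve actually parses --worker-env: the functions build_args and parse_worker_envs, the argparse-based receiver, and the sample environment values are all assumptions for illustration.

```python
import argparse
import json


def build_args(worker_envs):
    """Hypothetical sender: encode the per-worker env list into one flag value."""
    args = []
    if worker_envs:
        args.extend(["--worker-env", json.dumps(worker_envs)])
    return args


def parse_worker_envs(argv):
    """Hypothetical receiver: decode the flag back into a list of dicts."""
    parser = argparse.ArgumentParser()
    # One string option; the JSON payload inside it holds every worker's env.
    parser.add_argument("--worker-env", default="[]")
    namespace = parser.parse_args(argv)
    return json.loads(namespace.worker_env)


# Illustrative values only: one env dict per worker.
envs = [{"CUDA_VISIBLE_DEVICES": "0"}, {"CUDA_VISIBLE_DEVICES": "1"}]
argv = build_args(envs)
decoded = parse_worker_envs(argv)
```

In this sketch a single process receives the whole list and would need to index into it per worker. Whether a flag given twice is merged or overridden depends on the receiving parser, so that part is left open here.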

kylehh pushed a commit to kylehh/dynamo that referenced this pull request Apr 11, 2025

8 participants