Will ./server support parallel decoding as well? #3693

Closed
@ibehnam

Expected Behavior

The parallel examples look promising. Will ./server also support an -np argument for processing requests in parallel? That way, a user could send up to np prompts at a time.
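To make the use case concrete, here is a minimal client-side sketch of what "sending np prompts at a time" could look like. It is an illustration only: the base URL, the `/completion` path, and the `prompt`, `n_predict`, and `content` fields are assumptions about how the server is set up, and `post_completion` and `send_prompts` are hypothetical helper names. Nothing here implies that ./server decodes these requests in parallel.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_completion(prompt, base_url="http://127.0.0.1:8080", n_predict=64):
    """POST one prompt to the server (URL, path, and JSON fields are assumptions)."""
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    req = urllib.request.Request(
        base_url + "/completion",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Assumes the reply is a JSON object with the generated text under "content".
        return json.loads(resp.read().decode("utf-8")).get("content", "")


def send_prompts(prompts, np=4, send=post_completion):
    """Keep up to `np` requests in flight at once; results come back in input order."""
    with ThreadPoolExecutor(max_workers=np) as pool:
        return list(pool.map(send, prompts))
```

Calling `send_prompts(["Hello", "Bonjour", "Hola", "Ciao"], np=4)` would issue the four requests concurrently from the client. Whether they are also decoded concurrently depends entirely on the server, which is the point of this request.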

Current Behavior

Currently, ./server processes requests sequentially.

Environment and Context

macOS Sonoma, M1 Pro chip
