Expose run name in dist/spmd components #1035

Closed
@clumsy

Description

The experiment name is currently exposed in dist/spmd components via TORCHX_TRACKING_EXPERIMENT_NAME, but there is no corresponding env variable for the run name.

Motivation/Background

Being able to control e.g. the MLflow client's experiment and run names through env variables, set from --name myexp/myrun, is very convenient.
TORCHX_JOB_ID is unique by definition, so it cannot serve as the run name: when we need to restart training for the same MLflow run, we would like to retain the previous run name.

Detailed Proposal

Set TORCHX_TRACKING_RUN_NAME in dist/spmd components, similar to how TORCHX_TRACKING_EXPERIMENT_NAME is set today.
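
To illustrate the consumer side of this proposal, here is a minimal sketch of how a training script might read both names. It assumes TORCHX_TRACKING_RUN_NAME is set as proposed above (it is not available today), and `tracking_names` is a hypothetical helper, not part of TorchX.

```python
import os
from typing import Mapping, Optional, Tuple

# Already exposed by dist/spmd components, per the description above.
EXPERIMENT_ENV = "TORCHX_TRACKING_EXPERIMENT_NAME"
# Proposed in this issue; assumed here, not set by TorchX at the time of writing.
RUN_ENV = "TORCHX_TRACKING_RUN_NAME"


def tracking_names(
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (experiment_name, run_name); unset or empty values become None."""
    experiment = env.get(EXPERIMENT_ENV) or None
    run = env.get(RUN_ENV) or None
    return experiment, run


if __name__ == "__main__":
    experiment, run = tracking_names()
    # A tracking client could consume these, for example by passing `run`
    # as the run name when starting or resuming an MLflow run.
    print(f"experiment={experiment!r} run={run!r}")
```

Because the run name would come from --name rather than from TORCHX_JOB_ID, a restarted job launched with the same --name myexp/myrun would see the same values and could continue the previous run.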

Alternatives

Using TORCHX_JOB_ID won't help: it is unique by definition, but we need to continue an existing run.

Additional context/links

N/A
