Closed
Description
The experiment name is currently exposed in dist/spmd components via the TORCHX_TRACKING_EXPERIMENT_NAME environment variable, but there is no corresponding environment variable for the run name.
Motivation/Background
Passing --name myexp/myrun is a convenient way to control settings such as the MLflow client's experiment name through environment variables.
TORCHX_JOB_ID is unique by definition, so it cannot stand in for a run name: when we need to restart training for the same MLflow run, we would like to retain the previous run name.
Detailed Proposal
Set TORCHX_TRACKING_RUN_NAME in dist/spmd components, similar to TORCHX_TRACKING_EXPERIMENT_NAME.
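To make the request concrete, here is a minimal sketch of the mapping I have in mind. `tracking_env` is a hypothetical helper, not an existing TorchX function. It also assumes that a name without a `/` carries only an experiment name, and I have not checked how the component builds its env today.

```python
from typing import Dict


def tracking_env(name: str) -> Dict[str, str]:
    """Hypothetical helper: derive tracking env variables from a --name value.

    Assumption: the value has the form "<experiment>/<run>". Without a "/",
    the whole value is treated as the experiment name and no run name is set.
    """
    # partition splits on the first "/" only, so the run part may itself contain "/".
    experiment, sep, run = name.partition("/")
    env = {"TORCHX_TRACKING_EXPERIMENT_NAME": experiment}
    if sep and run:
        env["TORCHX_TRACKING_RUN_NAME"] = run
    return env


if __name__ == "__main__":
    print(tracking_env("myexp/myrun"))
```

The resulting dict would be merged into the env that the component already passes to each replica.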
Alternatives
Using TORCHX_JOB_ID won't help: it is unique by definition, but we need to continue an existing run.
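For illustration, this is roughly how a training script could consume the variables if TORCHX_TRACKING_RUN_NAME existed. `resolve_tracking_names` is a hypothetical helper, and falling back to TORCHX_JOB_ID is my assumption about a sensible default for a fresh, non-resumed run.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_tracking_names(
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: pick experiment and run names from the environment."""
    experiment = env.get("TORCHX_TRACKING_EXPERIMENT_NAME")
    # Prefer the stable, user-chosen run name; the job id changes on every
    # submission, so it is only a fallback (assumption) for fresh runs.
    run = env.get("TORCHX_TRACKING_RUN_NAME") or env.get("TORCHX_JOB_ID")
    return experiment, run


if __name__ == "__main__":
    experiment_name, run_name = resolve_tracking_names()
    print(experiment_name, run_name)
```

The two names could then be handed to the MLflow client, for instance via `mlflow.set_experiment` and the `run_name` argument of `mlflow.start_run`. Looking up an existing run to resume it is left out here.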
Additional context/links
N/A