Training reproducible with PyTorch but not with PyTorch + PyTorch3D #659

Closed
@abhi1kumar

❓ How to ensure reproducibility of training with PyTorch3D

I am trying to make my training reproducible with PyTorch + PyTorch3D. When I use only PyTorch, without PyTorch3D, my entire training is reproducible: every time I execute my training script, the errors and the logs match. However, when I introduce PyTorch3D-based rendering into training, the results are no longer reproducible.
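One way to narrow down where two runs stop matching is to compare their per-step loss logs and report the first step that differs. The following is a minimal sketch of that check, not code from my training script; the helper name `first_divergence` and the loss values are made up for illustration.

```python
import math


def first_divergence(run_a, run_b, rel_tol=0.0, abs_tol=0.0):
    """Return the index of the first step where two loss logs differ, or None.

    With the default tolerances of 0.0 the comparison is exact, which is
    what bitwise reproducibility requires.
    """
    for step, (a, b) in enumerate(zip(run_a, run_b)):
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            return step
    # One log is a strict prefix of the other: they diverge where the shorter ends.
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None


# Hypothetical per-step losses from two runs started with identical seeds.
losses_run_1 = [0.9134, 0.8021, 0.7550, 0.7012]
losses_run_2 = [0.9134, 0.8021, 0.7551, 0.7013]

print(first_divergence(losses_run_1, losses_run_2))  # prints 2
```

If the first differing step coincides with the first iteration that uses the renderer, that points at the rendering path and not at data loading or initialization.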

Libraries and their versions:

  • PyTorch3D 0.4.0
  • PyTorch 1.5.1
  • Torchvision 0.6.1
  • CUDA 10.1

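Code used to seed the training

import os
import random

import numpy as np
import torch


def init_torch(rng_seed, cuda_seed):
    """
    Initializes the seeds for ALL potential randomness, including numpy, random and torch.

    Args:
        rng_seed (int): the shared random seed to use for numpy, random and torch.manual_seed
        cuda_seed (int): the random seed to use for pytorch's torch.cuda.manual_seed_all function
    """
    # Seed the NumPy and Python standard-library generators.
    np.random.seed(rng_seed)
    random.seed(rng_seed)
    # Note: this only affects child processes started later; the hash seed
    # of the already running interpreter was fixed at startup.
    os.environ['PYTHONHASHSEED'] = str(rng_seed)

    # Seed the PyTorch CPU generator, then the current GPU and all GPUs.
    torch.manual_seed(rng_seed)
    torch.cuda.manual_seed(cuda_seed)
    torch.cuda.manual_seed_all(cuda_seed)

    # Ask cuDNN for deterministic kernels and disable its autotuner.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False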
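
A note on the `PYTHONHASHSEED` line above: Python reads that variable once at interpreter startup, so assigning it inside a running script does not change string hashing in that same process. The sketch below illustrates this by starting fresh interpreters with the variable already set; the helper name `string_hash_in_child` is hypothetical, and it assumes Python 3.7+ for `capture_output`.

```python
import os
import subprocess
import sys


def string_hash_in_child(seed):
    """Start a fresh interpreter with PYTHONHASHSEED=seed and return hash('abc')."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('abc'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Two separate interpreters launched with the same hash seed agree on the hash.
print(string_hash_in_child(0) == string_hash_in_child(0))  # prints True
```

In practice this means exporting `PYTHONHASHSEED` in the shell (or the launcher) before starting the training script, if hash ordering matters for the run.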

I also checked whether I am missing something from the PyTorch 1.5.1 reproducibility documentation, but could not find anything else.

The latest PyTorch reproducibility documentation says:
"Furthermore, if you are using CUDA tensors, and your CUDA version is 10.2 or greater, you should set the environment variable CUBLAS_WORKSPACE_CONFIG according to CUDA documentation"
Since I am using CUDA 10.1, I assume this requirement does not apply to my setup.
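
For readers who are on CUDA 10.2 or later, the setting quoted above could be applied roughly as follows. This is a sketch, not something I run on my CUDA 10.1 setup; `:4096:8` and `:16:8` are the two values given in the PyTorch reproducibility notes, and the assumption here is that the variable is set before any CUDA work starts.

```python
import os

# Set the variable before the first CUDA/cuBLAS call; doing it before
# importing torch is the safest place. setdefault keeps any value that
# was already exported in the shell.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

print(os.environ["CUBLAS_WORKSPACE_CONFIG"])
```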

It would be great if you could tell me how to remove the remaining randomness when using PyTorch3D, so that the training is fully reproducible.

Labels

  • do-not-reap: Do not delete this pull request or issue due to inactivity.
  • question: Further information is requested
