
🐛 [Bug] Compilation failure for HuggingFace T5-base Model #1583

Closed
@gs-olive

Description

Bug Description

When compiling the T5-base network (https://huggingface.co/t5-base), the following error is encountered:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

To Reproduce

Steps to reproduce the behavior:

  1. Run torch_tensorrt.compile with the t5-base model as input, using fp32 precision.
  2. Choose two fixed-size inputs, each of shape [1, 128], and enable truncate_long_and_double with a 12 GB workspace.
  3. Pass in model keyword args to disable attention and hidden-state outputs.
  4. Run inference on two sample inputs using the compiled model.
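The steps above can be sketched roughly as follows. This is a minimal repro sketch, not the exact script from the report: the tokenizer vocabulary size, the use of torch.jit.trace, and the specific T5Model keyword arguments (use_cache, output_attentions, output_hidden_states, return_dict) are assumptions about how the attention and hidden-state outputs were disabled.

```python
# Minimal repro sketch; assumes torch, transformers, and torch_tensorrt
# are installed and a CUDA device is available.
import torch

INPUT_SHAPE = (1, 128)  # two fixed-size [1, 128] inputs (encoder and decoder)

def build_inputs():
    # 32100 is the t5-base vocabulary size (assumed here)
    input_ids = torch.randint(0, 32100, INPUT_SHAPE, dtype=torch.int64)
    decoder_input_ids = torch.randint(0, 32100, INPUT_SHAPE, dtype=torch.int64)
    return input_ids, decoder_input_ids

if __name__ == "__main__":
    import torch_tensorrt
    from transformers import T5Model

    # Keyword args to disable attention/hidden-state outputs, so the
    # traced graph returns plain tensors (exact kwargs are an assumption)
    model = (
        T5Model.from_pretrained(
            "t5-base",
            use_cache=False,
            output_attentions=False,
            output_hidden_states=False,
            return_dict=False,
        )
        .eval()
        .cuda()
    )

    input_ids, decoder_input_ids = build_inputs()
    traced = torch.jit.trace(
        model, (input_ids.cuda(), decoder_input_ids.cuda())
    )

    # fp32 precision, truncate_long_and_double, 12 GB workspace
    trt_model = torch_tensorrt.compile(
        traced,
        inputs=[
            torch_tensorrt.Input(shape=INPUT_SHAPE, dtype=torch.int32),
            torch_tensorrt.Input(shape=INPUT_SHAPE, dtype=torch.int32),
        ],
        enabled_precisions={torch.float32},
        truncate_long_and_double=True,
        workspace_size=12 << 30,  # 12 GB
    )

    # Inference on the compiled model triggers the reported RuntimeError
    out = trt_model(input_ids.cuda(), decoder_input_ids.cuda())
```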

Expected behavior

The model should compile successfully with Torch-TRT. Specifically, internal device mismatches should either be surfaced with a warning at compile time or resolved so that they do not cause runtime errors.

Environment

  • Torch-TensorRT Version: 1.4.0.dev0+f43be5b6
  • PyTorch Version: 1.14.0.dev20221114+cu116
  • CPU Architecture: Intel Xeon CPU
  • OS: Ubuntu 20.04
  • How you installed PyTorch: pip
  • Build command you used: python setup.py develop
  • Are you using local sources or building from archives: local
  • Python version: 3.8.13
  • CUDA version: 11.6

Additional context

The problem seems related to #1416, which was intended to address device mismatch issues of this sort. Since this case is not caught by that PR, the mismatch likely arises in a different area, for example as a result of an internal computation in a Torch block.

Labels

bug (Something isn't working)
