🐛 [Bug] Encountered CUDA error 710 when applying Torch-TensorRT to BERT

## Bug Description

I tried to use Torch-TensorRT to speed up BERT model inference, but ran into the following errors:

../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [36,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [37,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [38,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [39,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [40,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [41,0,0] Assertion `srcIndex < srcSelectDimSize` failed.

**_CUDA initialization failure with error: 710. Please check your CUDA installation:  http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Segmentation fault (core dumped)_**
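A note on reading these messages: the `srcIndex < srcSelectDimSize` assertion is raised by PyTorch's CUDA index-select kernel when an index is not smaller than the size of the dimension being indexed, which for BERT usually points at an embedding lookup. Error 710 is `cudaErrorAssert` in the CUDA runtime, so the "initialization failure" is likely a consequence of that device-side assert rather than a broken CUDA installation. Below is a minimal sketch of a range check for integer ids; the helper name `find_out_of_range` and the example values are hypothetical and not part of Torch-TensorRT or transformers.

```python
from typing import Iterable, List, Tuple


def find_out_of_range(ids: Iterable[int], table_size: int) -> List[Tuple[int, int]]:
    """Return (position, value) pairs for ids outside [0, table_size)."""
    return [
        (pos, int(v))
        for pos, v in enumerate(ids)
        if not 0 <= int(v) < table_size
    ]


# Illustrative values only (hypothetical, not taken from the failing run):
# a lookup table with 2 rows accepts only the ids 0 and 1.
example_ids = [0, 1, 1, 3, 0]
print(find_out_of_range(example_ids, table_size=2))  # [(3, 3)]
```

With the reproduction script, the same check could be run as `find_out_of_range(tokens_tensor.flatten().tolist(), model.config.vocab_size)` and `find_out_of_range(segments_tensors.flatten().tolist(), model.config.type_vocab_size)`. The dummy inputs there are all ones and zeros, so they should pass, which suggests the offending indices come from somewhere other than these tensors.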

## To Reproduce

```python
import torch
import torch_tensorrt
from transformers import BertModel

print("VERSION:", torch_tensorrt.__version__)

# Dummy inputs: a batch of 128 sequences, 20 tokens each
test_batchsz = 128
seq_len = 20
tokens_tensor = torch.ones((test_batchsz, seq_len), dtype=torch.int32, device="cuda")
segments_tensors = torch.zeros((test_batchsz, seq_len), dtype=torch.int32, device="cuda")
mask_tensors = torch.ones((test_batchsz, seq_len), dtype=torch.int32, device="cuda")

# Trace the model; positional order is (input_ids, attention_mask, token_type_ids)
model = BertModel.from_pretrained("bert-base-chinese", torchscript=True)
torch_script_module = torch.jit.trace(
    model.eval().cuda(), (tokens_tensor, mask_tensors, segments_tensors)
)

# Compile with Torch-TensorRT in FP32, one static-shape int32 input per tensor
trt_ts_module = torch_tensorrt.compile(
    torch_script_module.float(),
    inputs=[
        torch_tensorrt.Input(shape=[test_batchsz, seq_len], dtype=torch.int32),
        torch_tensorrt.Input(shape=[test_batchsz, seq_len], dtype=torch.int32),
        torch_tensorrt.Input(shape=[test_batchsz, seq_len], dtype=torch.int32),
    ],
    enabled_precisions={torch.float},
    workspace_size=2000000000,
    truncate_long_and_double=True,
)
```

## Environment

> Build information about Torch-TensorRT can be found by turning on debug messages

 - Torch-TensorRT Version (e.g. 1.0.0): 1.2.0
 - PyTorch Version (e.g. 1.0): 1.12.1
 - CPU Architecture:
 - OS (e.g., Linux): Linux
 - How you installed PyTorch (`conda`, `pip`, `libtorch`, source): pip
 - Build command you used (if compiling from source):
 - Are you using local sources or building from archives:
 - Python version: 3.8
 - CUDA version: 11.6
 - GPU models and configuration:
 - Any other relevant information:

## Additional context
