
(pad_on_left and) ids_tensor error(s) in Dynamic Quantization on BERT tutorial #1406

Open
@jjohnson-arm


Tutorial link: https://github.com/pytorch/tutorials/blob/master/intermediate_source/dynamic_quantization_bert_tutorial.rst
Version 1.8.0

The tutorial fails in section 3.2, both in the Colab version and when following the tutorial text locally:

# Evaluate the original FP32 BERT model
time_model_evaluation(model, configs, tokenizer)

Output from colab version:

/usr/local/lib/python3.7/dist-packages/transformers/data/processors/glue.py:175: FutureWarning: This processor will be removed from the library soon, preprocessing should be handled with the 🤗 Datasets library. You can have a look at this example script for pointers: https://github.com/huggingface/transformers/blob/master/examples/text-classification/run_glue.py
  warnings.warn(DEPRECATION_WARNING.format("processor"), FutureWarning)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-7f3f2ffdfbf3> in <module>()
      8 
      9 # Evaluate the original FP32 BERT model
---> 10 time_model_evaluation(model, configs, tokenizer)

2 frames
<ipython-input-11-9c9008fc3551> in load_and_cache_examples(args, task, tokenizer, evaluate)
    113                                                 pad_on_left=bool(args.model_type in ['xlnet']),                 # pad on the left for xlnet
    114                                                 pad_token=tokenizer.convert_tokens_to_ids([tokenizer.pad_token])[0],
--> 115                                                 pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0,
    116         )
    117         if args.local_rank in [-1, 0]:

TypeError: glue_convert_examples_to_features() got an unexpected keyword argument 'pad_on_left'

I can work around this by commenting out the pad arguments in section 2.3:

        features = convert_examples_to_features(examples,
                                                tokenizer,
                                                label_list=label_list,
                                                max_length=args.max_seq_length,
                                                output_mode=output_mode,
                                                # pad_on_left=bool(args.model_type in ['xlnet']),                 # pad on the left for xlnet
                                                # pad_token=tokenizer.convert_tokens_to_ids([tokenizer.pad_token])[0],
                                                # pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0,
        )
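
As an alternative to commenting the arguments out by hand, the call could be made tolerant of both signatures by dropping any keyword the installed function does not accept. This is only a minimal sketch: `call_with_supported_kwargs` is a hypothetical helper, and `convert_stub` is a stand-in for the converter, assumed here to no longer take the three pad arguments. Nothing below is taken from the tutorial or from transformers itself.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(func).parameters
    # If func itself takes **kwargs, every keyword is acceptable as-is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **supported)


# Hypothetical stand-in for a converter without the pad_* parameters.
def convert_stub(examples, tokenizer, max_length=None, task=None,
                 label_list=None, output_mode=None):
    return {"n": len(examples), "max_length": max_length}


result = call_with_supported_kwargs(
    convert_stub,
    ["example a", "example b"],  # examples
    None,                        # tokenizer (unused by the stub)
    max_length=128,
    pad_on_left=False,           # silently dropped: not in the signature
    pad_token=0,                 # silently dropped
    pad_token_segment_id=0,      # silently dropped
)
```

In the tutorial, `convert_stub` would be replaced by `convert_examples_to_features`. Dropping arguments silently is only reasonable when the dropped values match the callee's own behaviour. For a non-xlnet model the tutorial passes `pad_on_left=False` and `pad_token_segment_id=0`, but whether the installed converter pads the same way is an assumption that should be checked.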

Running through the tutorial locally, I hit a second problem in section 3.3:

input_ids = ids_tensor([8, 128], 2)

I got this error:

Traceback (most recent call last):
  File "bert-test.py", line 307, in <module>
    input_ids = ids_tensor([8, 128], 2)
NameError: name 'ids_tensor' is not defined

The Colab version diverges at this point and doesn't use the ids_tensor function.
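
Until the tutorial text defines or imports `ids_tensor`, a local stand-in can unblock section 3.3. The sketch below assumes the helper is meant to return a tensor of random token ids in the range `[0, vocab_size)` with the requested shape. That reading is inferred from the call `ids_tensor([8, 128], 2)` and is not confirmed by the tutorial. `random_ids` is a hypothetical name introduced only for illustration.

```python
import random


def random_ids(shape, vocab_size, rng=None):
    """Return a flat list of random ids in [0, vocab_size) for the given shape."""
    rng = rng or random.Random()
    total = 1
    for dim in shape:
        total *= dim
    # randint is inclusive on both ends, hence vocab_size - 1.
    return [rng.randint(0, vocab_size - 1) for _ in range(total)]


def ids_tensor(shape, vocab_size, rng=None):
    """Assumed behaviour: a LongTensor of random ids with the given shape."""
    # Imported lazily so random_ids stays usable without PyTorch installed.
    import torch

    values = random_ids(shape, vocab_size, rng)
    return torch.tensor(values, dtype=torch.long).view(shape).contiguous()
```

With this in scope, `input_ids = ids_tensor([8, 128], 2)` would yield an 8 x 128 tensor of zeros and ones. Passing a seeded `random.Random` as `rng` makes the ids reproducible.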

cc @jerryzh168 @jianyuh @z-a-f @vkuzo
