Description
Add Link
Link to the tutorial:
https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html
Describe the bug
The tutorial was substantially changed in June 2023; see commit 6c03bb3, which aimed at fixing the implementation of attention, among other things (#2468). In doing so, several other things were changed:
- adding a DataLoader which returns batches of zero-padded sequences to train the network (a padding sketch is shown below)
- the forward() function of the decoder now processes the input one word at a time, in parallel for all sentences in the batch, until MAX_LENGTH is reached
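For readers unfamiliar with zero-padding, here is a minimal sketch of what such a batch looks like; the sequences and the pad_sequence call are my own illustration, not necessarily how the tutorial actually builds its batches:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Hypothetical token-id sequences of different lengths; 0 is assumed to be the padding index.
sequences = [torch.tensor([5, 12, 9, 3]),   # 4 tokens
             torch.tensor([7, 2]),          # 2 tokens
             torch.tensor([4, 8, 6])]       # 3 tokens

# Right-pad with zeros so the batch becomes a single (batch, max_len) tensor.
batch = pad_sequence(sequences, batch_first=True, padding_value=0)
# batch is now:
# tensor([[ 5, 12,  9,  3],
#         [ 7,  2,  0,  0],
#         [ 4,  8,  6,  0]])
```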
I am not a torch expert but I think that the embedding layers in the encoder and decoder should have been modified to recognize padding (padding_idx=0 is missing). Using zero-padded sequence as input might also have other implications during learning but I am not sure. Can you confirm that the implementation is correct?
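For example, passing padding_idx when constructing the embedding layers would keep the padding token's embedding fixed at zero and exclude it from gradient updates; a minimal sketch, with a placeholder vocabulary size:

```python
import torch.nn as nn

hidden_size = 128
n_words = 4345  # placeholder vocabulary size; the tutorial computes the real value from the data

# With padding_idx=0, the embedding vector for token 0 is all zeros and is never
# updated during training, so padded positions contribute nothing through the embedding.
embedding = nn.Embedding(n_words, hidden_size, padding_idx=0)
```

If the targets are also zero-padded, the loss might need a matching ignore_index (e.g. nn.NLLLoss(ignore_index=0)), but I am not certain how much this matters in practice here.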
As a result of these changes, the text no longer describes the code well. I think it would be nice to include in the tutorial a discussion of zero-padding and of the implications of using batches for the code. I am also curious whether there is really a gain from using a batch, since most sentences are short.
Finally, I found a mention in the text of using teacher_forcing_ratio, but it is not included in the code. Either the tutorial text or the code needs to be adjusted.
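For reference, the pre-2023 version of the tutorial used teacher_forcing_ratio roughly along these lines; this is a simplified sketch, where decoder_step is a hypothetical callable, not the tutorial's actual API:

```python
import random
import torch

teacher_forcing_ratio = 0.5  # probability of feeding the ground-truth token at each step

def decode_with_teacher_forcing(decoder_step, decoder_input, decoder_hidden, target_tensor):
    """Run the decoder one token at a time, optionally forcing the ground-truth token.

    decoder_step is assumed to be a callable (input, hidden) -> (log_probs, hidden);
    this illustrates the idea only.
    """
    outputs = []
    use_teacher_forcing = random.random() < teacher_forcing_ratio
    for t in range(target_tensor.size(0)):
        log_probs, decoder_hidden = decoder_step(decoder_input, decoder_hidden)
        outputs.append(log_probs)
        if use_teacher_forcing:
            decoder_input = target_tensor[t]                    # feed the ground-truth token
        else:
            decoder_input = log_probs.argmax(dim=-1).detach()   # feed the model's own prediction
    return torch.stack(outputs), decoder_hidden
```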
If this is useful, I found another implementation of the same tutorial which seems to be a fork of a previous version (it was archived in 2021):
- It does not use batches
- It includes teacher_forcing_ratio to select the amount of teacher forcing
- It implements both the Luong et al. and Bahdanau et al. models of attention (sketched briefly below)
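For context, the two attention variants differ mainly in how the alignment score between the decoder query and the encoder outputs is computed; a minimal sketch of the score functions (my own simplification, not that fork's code):

```python
import torch
import torch.nn as nn

class BahdanauScore(nn.Module):
    """Additive attention: score(q, k) = v^T tanh(W_q q + W_k k)."""
    def __init__(self, hidden_size):
        super().__init__()
        self.Wq = nn.Linear(hidden_size, hidden_size)
        self.Wk = nn.Linear(hidden_size, hidden_size)
        self.v = nn.Linear(hidden_size, 1)

    def forward(self, query, keys):
        # query: (batch, 1, hidden), keys: (batch, seq_len, hidden) -> (batch, seq_len)
        return self.v(torch.tanh(self.Wq(query) + self.Wk(keys))).squeeze(-1)

class LuongDotScore(nn.Module):
    """Multiplicative (dot-product) attention: score(q, k) = q . k."""
    def forward(self, query, keys):
        # (batch, 1, hidden) x (batch, hidden, seq_len) -> (batch, seq_len)
        return torch.bmm(query, keys.transpose(1, 2)).squeeze(1)
```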
Describe your environment
I appreciate this tutorial as it provides a simple introduction to Seq2Seq models with a small dataset. I am actually trying to port this tutorial to R with the torch package.
cc @albanD