Also, I was slightly wrong: the first dimension will not always be 77, since apparently the tokenizer does no padding. Test notebook here: https://colab.research.google.com/drive/192PDIbc2XiI1HgJQSdN...
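
To illustrate the point, here is a rough sketch in plain Python, not taken from the notebook. It shows how an unpadded token sequence keeps its own length, while padding would force it to 77. The helper `pad_to_length` and the padding ID `0` are hypothetical and only for illustration; a real tokenizer has its own padding options and pad token.

```python
# Hypothetical sketch: pad or truncate token IDs to a fixed context length.
CONTEXT_LENGTH = 77  # the length mentioned above
PAD_ID = 0           # assumed padding ID, purely illustrative


def pad_to_length(token_ids, length=CONTEXT_LENGTH, pad_id=PAD_ID):
    """Return a copy of token_ids truncated or right-padded to exactly `length`."""
    ids = list(token_ids[:length])
    return ids + [pad_id] * (length - len(ids))


short = [101, 7, 8, 102]        # unpadded: keeps its own length
padded = pad_to_length(short)   # padded: always CONTEXT_LENGTH entries
print(len(short), len(padded))
```

Without a step like this, the dimension simply follows the number of tokens in the prompt, which would explain seeing values other than 77.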