Why is the number of input embeddings 8 greater than the vocabulary size?

#56 · opened by Ali2500

I noticed that the number of input embeddings (128264) is 8 greater than the number of output embeddings/vocabulary size (128256).


What are the extra 8 embeddings at the end for?
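
For reference, the mismatch can be checked directly from the loaded model. The snippet below is a minimal sketch using the `transformers` API; the checkpoint id is a placeholder for the model being discussed, not a specific name from the post.

```python
# Minimal sketch: compare the input embedding matrix, the output
# projection (lm_head), and the configured vocab size.
# "<model-id>" is a placeholder for the checkpoint in question.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("<model-id>")

embed_rows = model.get_input_embeddings().weight.shape[0]    # e.g. 128264
lm_head_rows = model.get_output_embeddings().weight.shape[0] # e.g. 128256
vocab_size = model.config.vocab_size

print("input embedding rows :", embed_rows)
print("output embedding rows:", lm_head_rows)
print("config.vocab_size    :", vocab_size)
```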

