Missing weights for example code

#1
by Starlento - opened

I installed cramming and am using the latest transformers.
The example code runs, but many weights cannot be loaded.
Is this normal?

Some weights of the model checkpoint at ./models/pbelcak_FastBERT-1x11-long were not used when initializing ScriptableLMForPreTraining: ['encoder.layers.7.ffn.linear_in.bias', 'encoder.layers.12.ffn.linear_out.weight', 'encoder.layers.13.ffn.linear_out.weight', 'encoder.layers.6.ffn.linear_out.weight', 'encoder.layers.3.ffn.linear_in.weight', 'encoder.layers.12.ffn.linear_in.weight', 'encoder.layers.14.ffn.linear_in.bias', 'encoder.layers.2.ffn.linear_out.weight', 'encoder.layers.9.ffn.linear_in.bias', 'encoder.layers.0.ffn.linear_in.bias', 'encoder.layers.4.ffn.linear_out.weight', 'encoder.layers.6.ffn.linear_in.weight', 'encoder.layers.4.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_in.weight', 'encoder.layers.14.ffn.linear_in.weight', 'encoder.layers.2.ffn.linear_in.bias', 'encoder.layers.5.ffn.linear_out.weight', 'encoder.layers.10.ffn.linear_in.bias', 'encoder.layers.3.ffn.linear_out.weight', 'encoder.layers.7.ffn.linear_in.weight', 'encoder.layers.8.ffn.linear_out.weight', 'encoder.layers.9.ffn.linear_out.weight', 'encoder.layers.15.ffn.linear_in.bias', 'encoder.layers.13.ffn.linear_in.weight', 'encoder.layers.0.ffn.linear_in.weight', 'encoder.layers.10.ffn.linear_out.weight', 'encoder.layers.5.ffn.linear_in.weight', 'encoder.layers.6.ffn.linear_in.bias', 'encoder.layers.4.ffn.linear_in.bias', 'encoder.layers.15.ffn.linear_out.weight', 'encoder.layers.10.ffn.linear_in.weight', 'encoder.layers.13.ffn.linear_in.bias', 'encoder.layers.5.ffn.linear_in.bias', 'encoder.layers.2.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_in.bias', 'encoder.layers.1.ffn.linear_in.bias', 'encoder.layers.12.ffn.linear_in.bias', 'encoder.layers.8.ffn.linear_in.bias', 'encoder.layers.8.ffn.linear_in.weight', 'encoder.layers.1.ffn.linear_in.weight', 'encoder.layers.1.ffn.linear_out.weight', 'encoder.layers.3.ffn.linear_in.bias', 'encoder.layers.9.ffn.linear_in.weight', 'encoder.layers.0.ffn.linear_out.weight', 'encoder.layers.14.ffn.linear_out.weight', 'encoder.layers.15.ffn.linear_in.weight', 'encoder.layers.11.ffn.linear_out.weight', 'encoder.layers.7.ffn.linear_out.weight']
- This IS expected if you are initializing ScriptableLMForPreTraining from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ScriptableLMForPreTraining from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of ScriptableLMForPreTraining were not initialized from the model checkpoint at ./models/pbelcak_FastBERT-1x11-long and are newly initialized: ['encoder.layers.14.ffn.dense_in.weight', 'encoder.layers.15.ffn.dense_out.weight', 'encoder.layers.15.ffn.dense_in.weight', 'encoder.layers.13.ffn.dense_in.weight', 'encoder.layers.11.ffn.dense_in.weight', 'encoder.layers.5.ffn.dense_out.weight', 'encoder.layers.12.ffn.dense_out.weight', 'encoder.layers.9.ffn.dense_out.weight', 'encoder.layers.1.ffn.dense_out.weight', 'encoder.layers.5.ffn.dense_in.weight', 'encoder.layers.8.ffn.dense_out.weight', 'encoder.layers.0.ffn.dense_out.weight', 'encoder.layers.8.ffn.dense_in.weight', 'encoder.layers.6.ffn.dense_in.weight', 'encoder.layers.4.ffn.dense_in.weight', 'encoder.layers.10.ffn.dense_out.weight', 'encoder.layers.4.ffn.dense_out.weight', 'encoder.layers.2.ffn.dense_out.weight', 'encoder.layers.11.ffn.dense_out.weight', 'encoder.layers.14.ffn.dense_out.weight', 'encoder.layers.0.ffn.dense_in.weight', 'encoder.layers.3.ffn.dense_out.weight', 'encoder.layers.13.ffn.dense_out.weight', 'encoder.layers.3.ffn.dense_in.weight', 'encoder.layers.1.ffn.dense_in.weight', 'encoder.layers.6.ffn.dense_out.weight', 'encoder.layers.10.ffn.dense_in.weight', 'encoder.layers.12.ffn.dense_in.weight', 'encoder.layers.2.ffn.dense_in.weight', 'encoder.layers.9.ffn.dense_in.weight', 'encoder.layers.7.ffn.dense_out.weight', 'encoder.layers.7.ffn.dense_in.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Hello,

What your warnings are saying is that the checkpoint contains weights for the FFF module (training/cramming/architectures/fff.py), but the model being built is trying to load weights for the FFNComponent module (training/cramming/crammed_bert.py). I just tried running the README example on a fresh instance and could not reproduce your warnings.
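One way to see the mismatch directly is to compare the parameter names stored in the checkpoint against what the instantiated model expects. A minimal diagnostic sketch -- it assumes the checkpoint ships its weights as pytorch_model.bin, which may differ:

from huggingface_hub import hf_hub_download
import torch

# Download only the weight file and list a few FFN-related parameter names.
path = hf_hub_download("pbelcak/FastBERT-1x11-long", "pytorch_model.bin")
state_dict = torch.load(path, map_location="cpu")
print(sorted(k for k in state_dict if ".ffn." in k)[:6])
# FFF weights are named linear_in/linear_out; if the instantiated model
# expects dense_in/dense_out (FFNComponent), all of them get discarded.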

You're most likely using cramming installed from the original cramming repository and not from the training directory of this project.
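A quick way to confirm which cramming you have is to print where Python imports it from:

import cramming

# If this points at a site-packages copy built from the original cramming
# repository rather than this project's training/ directory, reinstall.
print(cramming.__file__)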

To recap, these are the steps:

  1. pip uninstall cramming to remove the previous version of cramming installed in your environment -- or just start with a fresh environment.
  2. cd training
  3. pip install .
  4. Create minimal_example.py
  5. Paste the following into it:
import cramming  # needed so transformers can resolve the custom crammed-BERT architecture
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pbelcak/FastBERT-1x11-long")
model = AutoModelForMaskedLM.from_pretrained("pbelcak/FastBERT-1x11-long")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
  6. Run python minimal_example.py (a quick sanity check on the output follows below).
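If the correct cramming is installed, the script should run without any "weights were not used" or "newly initialized" warnings. To sanity-check the result, you can greedily decode the masked-LM predictions. A small sketch extending minimal_example.py, assuming the model's output exposes the usual logits attribute (an assumption for this architecture):

# Pick the highest-scoring vocabulary id at every position and decode it.
predicted_ids = output.logits.argmax(dim=-1)
print(tokenizer.decode(predicted_ids[0]))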

There is no training folder in your repo. Am I looking in the wrong place?

Hi batrlatom,

This is the folder in the repo:

https://github.com/pbelcak/FastBERT/tree/main/training

Yes, you are right. I did install the original cramming. Sorry that I missed it in the README, and thank you very much for the reply.

Starlento changed discussion status to closed
