Rotary Scaling Factor of 4 for 8k context (Do not merge)

#23 opened by nbroad

This is a revision that updates the "rotary_scaling_factor" to 4.0, which corresponds to a sequence length of 8192 tokens (the native 2048-token context scaled by a factor of 4).

This PR should not be merged; it is intended only for use with TEI (Text Embeddings Inference) by specifying the revision argument.

Here is how you can use this model:

model=nomic-ai/nomic-embed-text-v1.5
revision=refs/pr/23
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all -p 8080:80 -v $volume:/data --pull always \
    ghcr.io/huggingface/text-embeddings-inference:1.2 \
    --model-id $model --revision $revision
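
Once the container is running, you can request embeddings over HTTP. Below is a minimal Python sketch assuming TEI's standard /embed endpoint on localhost with the 8080 port mapping from the command above; the input text is just a placeholder.

import requests

# Local endpoint; port 8080 matches the -p 8080:80 mapping in the docker command above.
TEI_URL = "http://localhost:8080/embed"

# nomic-embed-text models expect a task prefix such as "search_document: " or
# "search_query: " (see the model card for the full list of prefixes).
payload = {"inputs": ["search_document: Text Embeddings Inference serves embedding models over HTTP."]}

response = requests.post(TEI_URL, json=payload, timeout=30)
response.raise_for_status()

embeddings = response.json()  # one embedding vector per input string
print(len(embeddings), len(embeddings[0]))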

This PR sets the scaling factor to 4 for 8k context, but the model card documentation indicates that the scaling factor is 2. For a full 8k context, which rotary_scaling_factor is recommended?

The model natively supports scaling of the sequence length past 2048 tokens. To do so, adjust the tokenizer and model loading as follows:

- tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
+ tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', model_max_length=8192)

- model = AutoModel.from_pretrained('nomic-ai/nomic-embed-text-v1', trust_remote_code=True)
+ model = AutoModel.from_pretrained('nomic-ai/nomic-embed-text-v1', trust_remote_code=True, rotary_scaling_factor=2)
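
For reference, here is a self-contained sketch of that long-context loading pattern, using the model ID and rotary_scaling_factor=2 exactly as they appear in the quoted snippet (the value at issue in this discussion); the mean pooling and normalization follow the standard usage shown on the model card.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding positions.
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', model_max_length=8192)
model = AutoModel.from_pretrained('nomic-ai/nomic-embed-text-v1', trust_remote_code=True, rotary_scaling_factor=2)
model.eval()

# Task prefix as documented on the model card; replace the text with your own long document.
sentences = ['search_document: ' + 'example long document text ' * 500]

encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    model_output = model(**encoded)

embeddings = F.normalize(mean_pooling(model_output, encoded['attention_mask']), p=2, dim=1)
print(embeddings.shape)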