# mistral-7b-sft-ultrachat-arithmo-full

This model is part of the **Mistral 7B + UltraChat + Arithmo checkpoints** collection: a set of Mistral 7B fine-tunes on UltraChat and Arithmo to boost the math capabilities of chat models. See https://x.com/_lewtun/status/1715652
This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the UltraChat and Arithmo datasets. It achieves the following results on the evaluation set:

- Loss: 0.9133
```python
# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="lewtun/mistral-7b-sft-ultrachat-arithmo-full", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
```
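The `<|system|>` / `<|user|>` / `<|assistant|>` layout visible in the sample output can be sketched with a small standalone function. This is only a simplified approximation of what `apply_chat_template` produces, written for illustration; it is not the actual Jinja template shipped with the tokenizer:

```python
# Simplified sketch of the chat format shown above - NOT the tokenizer's
# real template. Each turn is rendered as "<|role|>\n{content}</s>", and an
# empty assistant header is appended so the model continues from there.
EOS = "</s>"

def render_chat(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}{EOS}")
    if add_generation_prompt:
        # Open an assistant turn for the model to complete
        parts.append("<|assistant|>\n")
    return "\n".join(parts)

demo = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Hi!"},
]
print(render_chat(demo))
```

In practice you should always use `pipe.tokenizer.apply_chat_template`, since the template bundled with the model is the source of truth for special tokens and spacing.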
## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.8586        | 0.38  | 344  | 0.9133          |
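As a quick sanity check on the numbers above, a language-model cross-entropy loss can be read as a perplexity via `exp(loss)`. This is a standard conversion, not a figure reported on the model card:

```python
import math

# Validation loss from the training-results table above
val_loss = 0.9133
perplexity = math.exp(val_loss)
print(f"Perplexity: {perplexity:.2f}")  # ~2.49
```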