🐐 FinGEITje 7B DPO

A large open Dutch financial language model aligned through AI feedback.

This model is a fine-tuned version of snoels/FinGEITje-7B-sft on the BramVanroy/ultra_feedback_dutch dataset.

📖 Model Description

FinGEITje-7B-dpo is a large open Dutch financial language model with 7 billion parameters, based on Mistral 7B. It has been further trained with Direct Preference Optimization (DPO) on AI-generated preference data, so that its Dutch responses better match human-like preferences. This alignment step is intended to make the model's answers in financial contexts more helpful and coherent.

📊 Training

Training Data

FinGEITje-7B-dpo was fine-tuned on the BramVanroy/ultra_feedback_dutch dataset, which consists of synthetic preference data in Dutch. Each example pairs a prompt with a preferred and a less preferred response, which lets DPO train the model to favor the preferred response.
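To make the role of the preferred and less preferred responses concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It is not the training code used for this model: the function name `dpo_loss`, the example log-probabilities, and `beta=0.1` are illustrative assumptions, since the card does not report a beta value.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given summed sequence log-probs.

    beta=0.1 is an assumed value for illustration only.
    """
    # Implicit rewards: how far the policy has moved from the frozen reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        loss = math.log1p(math.exp(-margin))
    else:
        loss = -margin + math.log1p(math.exp(margin))
    return loss, chosen_reward, rejected_reward


# Hypothetical log-probabilities for one (prompt, chosen, rejected) triple.
loss, r_chosen, r_rejected = dpo_loss(-10.0, -20.0, -12.0, -15.0)
print(f"loss={loss:.4f} chosen_reward={r_chosen:.2f} rejected_reward={r_rejected:.2f}")
```

The loss shrinks as the policy raises the log-probability of the preferred response relative to the reference model, lowers that of the less preferred one, or both.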

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 16
total_train_batch_size: 64
total_eval_batch_size: 4
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
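The listed hyperparameters relate to each other in two ways that a short sketch can show: how the total train batch size follows from the per-device settings, and what a cosine schedule with a 0.1 warmup ratio looks like. The function `lr_at_step` is a hypothetical helper that mirrors the usual linear-warmup-then-cosine-decay shape, and `TOTAL_STEPS = 1000` is a placeholder, since the card does not state the total number of optimizer steps.

```python
import math

# Values taken from the hyperparameter list above.
per_device_batch = 1       # train_batch_size
num_devices = 4
grad_accum_steps = 16
total_train_batch_size = per_device_batch * num_devices * grad_accum_steps
print(total_train_batch_size)  # matches total_train_batch_size: 64

TOTAL_STEPS = 1000  # placeholder; not reported in this card


def lr_at_step(step, total_steps, peak_lr=5e-06, warmup_ratio=0.1):
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

With these settings the learning rate reaches its peak after the first 10% of steps and then decays smoothly for the rest of the single epoch.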

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen
0.1029	0.1327	100	0.1099	-1.8067	-5.3683	0.9679	3.5616	-892.3373	-579.9115	-2.4775	-2.3705
0.042	0.2654	200	0.0430	-3.5129	-10.6778	0.9828	7.1649	-1423.2883	-750.5289	-1.9744	-1.9895
0.0278	0.3981	300	0.0344	-3.7335	-13.5153	0.9828	9.7818	-1707.0360	-772.5893	-1.7454	-1.8191
0.0223	0.5308	400	0.0308	-3.6554	-13.7712	0.9858	10.1158	-1732.6289	-764.7831	-1.8020	-1.9184
0.0378	0.6635	500	0.0297	-4.0018	-16.3285	0.9851	12.3266	-1988.3542	-799.4221	-1.6924	-1.8650
0.0352	0.7962	600	0.0278	-3.8104	-15.6430	0.9836	11.8327	-1919.8119	-780.2752	-1.7437	-1.8978
0.0238	0.9289	700	0.0279	-3.8974	-15.9642	0.9828	12.0668	-1951.9310	-788.9780	-1.7371	-1.8937
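When reading this table, it helps to know that Rewards/margins is the chosen reward minus the rejected reward. Both rewards are negative, which means the tuned model assigns lower log-probability than the reference model to both responses, but it lowers the rejected response much more. The short check below recomputes the margins from the table values; the variable names are illustrative and the tolerance only allows for rounding to four decimals.

```python
# (step, rewards_chosen, rewards_rejected, rewards_margins) copied from the table.
rows = [
    (100, -1.8067, -5.3683, 3.5616),
    (200, -3.5129, -10.6778, 7.1649),
    (300, -3.7335, -13.5153, 9.7818),
    (400, -3.6554, -13.7712, 10.1158),
    (500, -4.0018, -16.3285, 12.3266),
    (600, -3.8104, -15.6430, 11.8327),
    (700, -3.8974, -15.9642, 12.0668),
]

max_abs_error = 0.0
for step, chosen, rejected, margin in rows:
    recomputed = chosen - rejected
    max_abs_error = max(max_abs_error, abs(recomputed - margin))
    print(f"step {step}: recomputed margin {recomputed:.4f}, reported {margin:.4f}")

# The reported margins agree with chosen - rejected up to rounding.
assert max_abs_error < 5e-4
```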

Framework versions

PEFT 0.11.1
Transformers 4.42.4
Pytorch 2.3.1
Datasets 2.20.0
Tokenizers 0.19.1

🛠️ How to Use

FinGEITje-7B-dpo can be used with the Hugging Face Transformers library, together with PEFT to load the adapter weights on top of the base model.

Installation

Ensure you have the necessary libraries installed:

pip install torch transformers peft accelerate

Loading the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("BramVanroy/GEITje-7B-ultra", use_fast=False)

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained("BramVanroy/GEITje-7B-ultra", device_map='auto')

# Load the FinGEITje-7B-dpo model with PEFT adapters
model = PeftModel.from_pretrained(base_model, "snoels/FinGEITje-7B-dpo", device_map='auto')

Generating Text

# Prepare the input
input_text = "Wat zijn de laatste trends in de Nederlandse banksector?"
input_ids = tokenizer.encode(input_text, return_tensors='pt').to(model.device)

# Generate a response
outputs = model.generate(input_ids, max_length=200, num_return_sequences=1)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)

🙏 Acknowledgements

We would like to thank:

Rijgersberg (GitHub) for creating GEITje, one of the first Dutch foundation models.
Bram Vanroy (GitHub) for creating GEITje-7B-ultra and providing the ultra_feedback_dutch dataset.
Contributors of the Alignment Handbook for providing valuable resources that guided the development and training process of FinGEITje-7B-dpo.

📝 Citation

Link to the paper: https://arxiv.org/abs/2410.12835

If you use FinGEITje-7B-dpo in your work, please cite:

@article{FinGEITje2024,
  title={A Dutch Financial Large Language Model},
  author={Noels, Sander and De Blaere, Jorne and De Bie, Tijl},
  journal={arXiv preprint arXiv:2410.12835},
  year={2024},
  url={https://arxiv.org/abs/2410.12835}
}

📜 License

This model is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

📧 Contact

For any inquiries or questions, please contact Sander Noels.
