---
license: llama2
datasets:
- HuggingFaceH4/deita-10k-v0-sft
language:
- en
pipeline_tag: text-generation
---

# Model Card for Model ID

## Model Details

### Model Description

This model was supervised fine-tuned (SFT) from `lmsys/vicuna-7b-v1.5` on the `HuggingFaceH4/deita-10k-v0-sft` dataset.

- **Model type:** Llama2 decoder-only
- **Language(s) (NLP):** English
- **License:** llama2
- **Finetuned from model:** lmsys/vicuna-7b-v1.5

## Training Details

### Training Data

HuggingFaceH4/deita-10k-v0-sft

### Training Procedure

SFT

Notice: `do_sample` in `generation_config.json` was set to `true` to avoid the error reported in https://github.com/huggingface/transformers/issues/29988.

#### Training Hyperparameters

- **Precision:** BFloat16
- **Chat Template:** Vicuna 1.1
- **Global Batch Size:** 128
- **Learning Rate:** 2.0e-5
- **Num Epochs:** 3
- **Max Length:** 2048
- **Packing:** True
- **Training Steps:** 1047

A sketch of how these values map onto `transformers.TrainingArguments` is given in the Training Configuration Sketch section below.

## Evaluation

The model reached a final loss of 0.8376 on the evaluation split of `HuggingFaceH4/deita-10k-v0-sft`.

### Testing Data, Factors & Metrics
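
## Training Configuration Sketch

This card does not state which trainer was used. Below is a minimal sketch of how the hyperparameters above might map onto `transformers.TrainingArguments`; the per-device batch size, gradient-accumulation steps, and GPU count are assumptions chosen only to reproduce the stated global batch size of 128.

```python
from transformers import TrainingArguments

# Values marked "card" come from the Training Hyperparameters section;
# everything else is an assumption for illustration only.
args = TrainingArguments(
    output_dir="vicuna-7b-deita-sft",  # hypothetical name
    bf16=True,                         # card: BFloat16 precision
    learning_rate=2e-5,                # card
    num_train_epochs=3,                # card
    per_device_train_batch_size=4,     # assumption
    gradient_accumulation_steps=4,     # assumption
    # Global batch = per_device * grad_accum * n_gpus = 4 * 4 * 8 = 128 (card),
    # assuming a single 8-GPU node.
)
```

Sequence packing (`Packing: True`) and the 2048-token max length are handled by the SFT trainer rather than by `TrainingArguments`; for example, `trl`'s `SFTTrainer`/`SFTConfig` expose `packing` and `max_seq_length` options.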
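
## How to Get Started with the Model

A minimal inference sketch using `transformers`. The repository ID below is a placeholder (this card does not name the published repo), the prompt follows the Vicuna 1.1 chat template stated under Training Hyperparameters, and the sampling temperature is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute the actual repository name.
REPO_ID = "your-username/vicuna-7b-deita-sft"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# Vicuna 1.1 chat template: system prompt, then "USER: ... ASSISTANT:".
system = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
prompt = f"{system} USER: What is supervised fine-tuning? ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,   # matches generation_config.json (see the notice above)
    temperature=0.7,  # assumption; not specified in this card
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```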