mhenrichsen committed
Commit
f1cd101
1 Parent(s): cb20077

Update README.md

Files changed (1)
  1. README.md +1 -92
README.md CHANGED
@@ -8,99 +8,8 @@ model-index:
  results: []
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.3.0`
- ```yaml
- base_model: mhenrichsen/danskgpt-tiny
-
- model_type: LlamaForCausalLM
- tokenizer_type: LlamaTokenizer
- is_llama_derived_model: true
-
- load_in_8bit: false
- load_in_4bit: false
- strict: false
-
- #pretraining_dataset: mhenrichsen/terra
-
- datasets:
-   - path: mhenrichsen/rag-qa-sharegpt
-     type: sharegpt
-     conversation: chatml
-   - path: mhenrichsen/creator
-     type: sharegpt
-     conversation: chatml
-   - path: mhenrichsen/puffin-sharegpt-fix
-     type: sharegpt
-     conversation: chatml
-   - path: mhenrichsen/orcaslim-sharegpt-fix
-     type: sharegpt
-     conversation: chatml
-   - path: mhenrichsen/dansk-tekst-sharegpt
-     type: sharegpt
-     conversation: chatml
-
- chat_template: chatml
-
- dataset_prepared_path:
- val_set_size: 0.001
- output_dir: ./tiny-chat
-
- sequence_len: 2048
- sample_packing: true
- pad_to_sequence_len: true
-
- wandb_project: tiny-danskgpt-chat
- wandb_entity:
- wandb_watch:
- wandb_name:
- wandb_log_model:
-
- gradient_accumulation_steps: 4
- micro_batch_size: 16
- num_epochs: 3
- optimizer: adamw_bnb_8bit
- lr_scheduler: cosine
- learning_rate: 0.00005
-
- train_on_inputs: false
- group_by_length: false
- bf16: true
- fp16: false
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- warmup_steps: 10
- evals_per_epoch: 4
- eval_table_size:
- saves_per_epoch: 2
- debug:
- deepspeed: deepspeed/zero2.json
- weight_decay: 0.1
- fsdp:
- fsdp_config:
- special_tokens:
-   eos_token: "<|im_end|>"
- tokens:
-   - "<|im_start|>"
-
- ```
-
- </details><br>
-
- # tiny-chat
+ # DanskGPT-chat-tiny
 
  This model is a fine-tuned version of [mhenrichsen/danskgpt-tiny](https://huggingface.co/mhenrichsen/danskgpt-tiny) on the None dataset.
  It achieves the following results on the evaluation set:
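
A note on the configuration removed above: the model was trained on ChatML-formatted conversations (`chat_template: chatml`, with `<|im_start|>` added as a token and `<|im_end|>` set as the EOS token), and with `micro_batch_size: 16` and `gradient_accumulation_steps: 4` the effective per-device batch size was 64. Below is a minimal sketch of chatting with the resulting model through the `transformers` chat-template API; the repo id is an assumption inferred from the new card title, not something this diff confirms.

```python
# Hedged sketch: ChatML inference, as implied by the removed axolotl config
# (chat_template: chatml, eos_token "<|im_end|>"). The repo id below is an
# ASSUMPTION based on the card title; check the actual model page.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mhenrichsen/danskgpt-tiny-chat"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Danish: "What is the capital of Denmark?"
messages = [{"role": "user", "content": "Hvad er Danmarks hovedstad?"}]

# apply_chat_template renders the ChatML wrapper around each turn:
#   <|im_start|>user\n...<|im_end|>\n<|im_start|>assistant\n
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=128)

# Strip the prompt tokens and decode only the assistant's reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Because `<|im_end|>` was configured as the EOS token, generation should stop at the end of the assistant turn without extra stop-string handling.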