adarshxs committed
Commit aa24407
1 Parent(s): c8afbcd

Update README.md

Files changed (1)
  1. README.md +1 -108
README.md CHANGED
@@ -1,110 +1,3 @@
  ---
- library_name: peft
  license: apache-2.0
- datasets:
- - mhenrichsen/alpaca_2k_test
- ---
- We fine-tune the base `Llama-2-7b-hf` model on the `mhenrichsen/alpaca_2k_test` dataset using PEFT LoRA.
- Find the adapters at: https://huggingface.co/Tensoic/Llama-2-7B-alpaca-2k-test
-
- Visit us at: https://tensoic.com
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/644bf6ef778ecbfb977e8e84/C0btqRI3eCz0kNYGQoa9k.png)
-
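The published adapter can be applied on top of the base model with `peft`. A minimal loading sketch, assuming the linked repo follows the standard PEFT adapter layout and that an Alpaca-style prompt template matches how the data was formatted (both are assumptions, not part of the original card):

```python
# Minimal sketch: attach the published LoRA adapter to the base Llama-2 model.
# Assumes gated access to meta-llama/Llama-2-7b-hf and a standard PEFT adapter repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "Tensoic/Llama-2-7B-alpaca-2k-test"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # LoRA weights on top

# Alpaca-style prompt (assumed, since the dataset uses the alpaca format).
prompt = "### Instruction:\nSummarize what LoRA fine-tuning does.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```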
- ## Training Setup:
- ```
- Number of GPUs: 8x NVIDIA V100 GPUs
- GPU Memory: 32GB each (SXM2 form factor)
- ```
- ## Training Configuration:
-
- ```yaml
- base_model: meta-llama/Llama-2-7b-hf
- base_model_config: meta-llama/Llama-2-7b-hf
- model_type: LlamaForCausalLM
- tokenizer_type: LlamaTokenizer
- is_llama_derived_model: true
-
- load_in_8bit: true
- load_in_4bit: false
- strict: false
-
- datasets:
-   - path: mhenrichsen/alpaca_2k_test
-     type: alpaca
- dataset_prepared_path: last_run_prepared
- val_set_size: 0.01
- output_dir: ./lora-out
-
- sequence_len: 4096
- sample_packing: false
- pad_to_sequence_len: true
-
- adapter: lora
- lora_model_dir:
- lora_r: 32
- lora_alpha: 16
- lora_dropout: 0.05
- lora_target_linear: true
- lora_fan_in_fan_out:
-
- wandb_project:
- wandb_entity:
- wandb_watch:
- wandb_run_id:
- wandb_log_model:
-
- gradient_accumulation_steps: 4
- micro_batch_size: 2
- num_epochs: 3
- optimizer: adamw_bnb_8bit
- lr_scheduler: cosine
- learning_rate: 0.0002
-
- train_on_inputs: false
- group_by_length: false
- bf16: false
- fp16: true
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention: true
- flash_attention: false
-
- warmup_steps: 10
- eval_steps: 20
- save_steps:
- debug:
- deepspeed:
- weight_decay: 0.0
- fsdp:
- fsdp_config:
- special_tokens:
-   bos_token: "<s>"
-   eos_token: "</s>"
-   unk_token: "<unk>"
- ```
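The `adapter`/`lora_*` entries above correspond to a PEFT `LoraConfig`. A rough Python equivalent is sketched below; the explicit `target_modules` list is an assumption spelling out what `lora_target_linear: true` resolves to for Llama-2 (all linear projection layers):

```python
# Sketch: the LoRA hyperparameters from the YAML config as a PEFT LoraConfig.
# target_modules is an assumption: `lora_target_linear: true` targets all linear
# layers, which for Llama-2 are the attention and MLP projections listed here.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,              # lora_r
    lora_alpha=16,     # lora_alpha
    lora_dropout=0.05, # lora_dropout
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```

For scale, `micro_batch_size: 2` with `gradient_accumulation_steps: 4` on the 8 V100s above works out to an effective global batch size of 8 × 2 × 4 = 64, assuming plain data-parallel training.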
- ```
- The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
-
- ```
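These fields are the arguments of `transformers.BitsAndBytesConfig`; a sketch of the same 8-bit settings expressed programmatically (a recent `transformers` + `bitsandbytes` install is assumed):

```python
# Sketch: the bitsandbytes quantization settings above as a BitsAndBytesConfig,
# used here to reload the base model in 8-bit before attaching the adapter.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
```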
-
-
- ### Framework versions
-
-
- PEFT 0.6.0.dev0
 
  ---
  license: apache-2.0
+ ---