bhenrym14 committed
Commit bf9a7aa
1 Parent(s): 444fb74

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -4,6 +4,7 @@
  This is [Jon Durbin's Airoboros 33B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-33b-gpt4-1.4) (with GPTQ Quantization) with several key modifications:
  - Context length extended to 8192 by RoPE Scaled Embeddings, but NOT via the superHOT LoRA.
  - Training sequences beyond 2048 have the target truncated to equal 2048.
+ - Used airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4

  Otherwise, I emulated the training process as closely as possible. It was trained on 1x RTX 6000 Ada for ~43 hours.

@@ -18,7 +19,8 @@ Recent advancements in extending context by RoPE scaling ([kaiokendev](https://k
  | **bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ** | **2048** | **4.32** |
  | **bhenrym14/airoboros-33b-gpt4-1.4.1-PI-8192-GPTQ** | **3072** | **4.26** |

- How does this reduction in perplexity translate into actual performance lift on downstream tasks? I'm not sure yet.
+ - How does this reduction in perplexity translate into actual performance lift on downstream tasks? I'm not sure yet.
+ - This comparison isn't perfect. I did use the 1.4.1 dataset and the quantization method is slightly different.

  ## Quantization:
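
The "RoPE Scaled Embeddings" bullet in the diff above refers to linear position interpolation (the "PI" in the model name): position ids are scaled down by the ratio of the original to the extended context length before the rotary angles are computed, so 8192 positions are compressed into the 2048-position range the base model was pretrained on. The snippet below is a minimal, self-contained sketch of that idea, not the author's training code; the 2048/8192 factor is taken from the context lengths stated in the README, and the function names are illustrative.

```python
import torch

def rope_angles(positions, head_dim, base=10000.0, scale=1.0):
    # Standard RoPE inverse frequencies; `scale` < 1 linearly interpolates
    # position ids so an extended context maps back into the pretrained range.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    t = positions.to(torch.float32) * scale            # interpolated position ids
    freqs = torch.outer(t, inv_freq)                   # (seq_len, head_dim / 2)
    emb = torch.cat((freqs, freqs), dim=-1)            # (seq_len, head_dim)
    return emb.cos(), emb.sin()

def apply_rope(x, cos, sin):
    # x: (batch, seq_len, n_heads, head_dim); rotate-half formulation of RoPE.
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    rotated = torch.cat((-x2, x1), dim=-1)
    return x * cos[None, :, None, :] + rotated * sin[None, :, None, :]

# Extend a model pretrained on 2048 positions to an 8192-token context:
# scale = 2048 / 8192 = 0.25 squeezes the new positions into the old range.
positions = torch.arange(8192)
cos, sin = rope_angles(positions, head_dim=128, scale=2048 / 8192)
q = torch.randn(1, 8192, 8, 128)                       # dummy query states
q_rotated = apply_rope(q, cos, sin)
```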
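
For context on how perplexity-vs-context-length numbers like those in the table are typically produced, here is a rough sketch of fixed-window perplexity evaluation with a Hugging Face causal LM. This is not the author's evaluation script; the non-overlapping-window chunking and the function name are assumptions made for illustration.

```python
import torch

@torch.no_grad()
def perplexity_at_context(model, tokenizer, text, ctx_len, device="cuda"):
    # Split the token stream into non-overlapping windows of `ctx_len` and
    # average the token-level negative log-likelihood across all windows.
    ids = tokenizer(text, return_tensors="pt").input_ids[0].to(device)
    total_nll, total_tokens = 0.0, 0
    for start in range(0, ids.numel(), ctx_len):
        window = ids[start : start + ctx_len].unsqueeze(0)
        if window.numel() < 2:                       # need at least one shifted target
            break
        out = model(window, labels=window)           # loss = mean NLL over shifted tokens
        n_targets = window.numel() - 1
        total_nll += out.loss.item() * n_targets
        total_tokens += n_targets
    return torch.exp(torch.tensor(total_nll / total_tokens))

# e.g. compare perplexity_at_context(model, tokenizer, long_text, ctx_len=2048)
# with    perplexity_at_context(model, tokenizer, long_text, ctx_len=3072)
```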