kromeurus committed on
Commit e777201
1 Parent(s): 821fa00

Update README.md

Files changed (1):
  1. README.md +136 -82
README.md CHANGED
---
base_model:
- Sao10K/L3-8B-Niitama-v1
- Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- nothingiisreal/L3-8B-Celeste-V1.2
library_name: transformers
tags:
- mergekit
- merge
---

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/667eea5cdebd46a5ec4dcc3d/YkA856HMNfrjBxFUOkxtP.jpeg)

1/3 of the 13B models for Horizon Himerus (will update with link later). This merge was originally supposed to be just a component of that final model, but this guy is surprisingly competent on his own.
A tad janky, but very solid for what it is.

### Quants

[OG Q8 GGUF](https://huggingface.co/kromeurus/L3-Himerus-Basis.C-13B-Q8-GGUF) by me.

Other quants are not available yet.

### Details & Recommended Settings

(Still testing; nothing here is finalized.)

Follows instructions fairly well for RP and eRP. Dramatic as fuck at times, depending on the scenario. Human dialogue and lots of it.

Rec. Settings:
```
Template: Model Default
Temperature: 1.22
Min P: 0.115
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256
```
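For reference, here is a minimal sketch of what these two samplers do to a logit distribution, assuming the usual min-p and repetition-penalty semantics (function names and the toy logits are mine, not any backend's API; real frontends like llama.cpp apply these on full vocab tensors):

```python
import math

def apply_repeat_penalty(logits, recent_token_ids, penalty=1.05):
    """Scale down logits of tokens seen in the recent window (last 256 tokens per the settings above)."""
    out = list(logits)
    for t in set(recent_token_ids):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

def min_p_filter(logits, temperature=1.22, min_p=0.115):
    """Keep token ids whose probability is at least min_p times the top token's probability."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    cutoff = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]

logits = [5.0, 4.0, 1.0, -2.0]              # toy 4-token vocabulary
logits = apply_repeat_penalty(logits, recent_token_ids=[0])
print(min_p_filter(logits))                 # → [0, 1]; tokens 2 and 3 fall below the min-p cutoff
```

A higher temperature flattens the distribution, so min-p's relative cutoff is what keeps the tail from getting sampled at Temp 1.22.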

### Models Merged & Merge Theory

The following models were included in the merge:
* [Sao10K/L3-8B-Niitama-v1](https://huggingface.co/Sao10K/L3-8B-Niitama-v1)
* [Nitral-AI/Hathor_Tahsin-L3-8B-v0.85](https://huggingface.co/Nitral-AI/Hathor_Tahsin-L3-8B-v0.85)
* [ArliAI/ArliAI-Llama-3-8B-Formax-v1.0](https://huggingface.co/ArliAI/ArliAI-Llama-3-8B-Formax-v1.0)
* [nothingiisreal/L3-8B-Celeste-V1.2](https://huggingface.co/nothingiisreal/L3-8B-Celeste-V1.2)

So you're not supposed to mix models with different trained context limits, but I did it anyway. Wanted the 'human' output of Celeste v1.2 while curbing the repetition and adding some backup from Niitama and Hathor Tahsin. Formax was included at the beginning for its instruction following.

Took a page out of [@matchaaaaa](https://huggingface.co/matchaaaaa)'s Chaifighter Latte and took out a slice of Celeste and Niitama in the center to smooth out layer disparity. I realized while testing that, using that 'splice' method, you could theoretically make a pretty big model then squish it down to streamline the layers. So, after much testing, I came up with the following merges.

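The splice's gradient weights (1 → 0 for Celeste, 0 → 1 for Niitama) amount to a per-layer linear crossfade between the two slices. A toy sketch of the weighted combination dare_linear performs, ignoring DARE's random drop-and-rescale step and using plain lists in place of real weight tensors:

```python
def blend(a, b, w_a, w_b):
    """Linearly combine two same-shaped 'tensors' with scalar weights."""
    return [w_a * x + w_b * y for x, y in zip(a, b)]

# Per-layer weight schedule from the config below: layer 0 is pure Celeste,
# the last layer is pure Niitama, with a linear ramp in between.
w_celeste = [1, 0.75, 0.625, 0.5, 0.375, 0.25, 0]
w_niitama = [0, 0.25, 0.375, 0.5, 0.625, 0.75, 1]

celeste_layer = [1.0, 1.0]   # stand-in for one layer's weights
niitama_layer = [3.0, 3.0]

# Middle of the ramp: an even 50/50 mix of the two donors.
print(blend(celeste_layer, niitama_layer, w_celeste[3], w_niitama[3]))  # → [2.0, 2.0]
```

Because the weights at each end are exactly 1/0, the splice's boundary layers match the surrounding donor slices, which is what smooths the transition.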
### Config

```yaml
models:
slices:
- sources:
  - layer_range: [14, 20]
    model: nothingiisreal/L3-8B-Celeste-V1.2
parameters:
  int8_mask: true
merge_method: passthrough
dtype: bfloat16
name: celeste14-20.sl
---
models:
slices:
- sources:
  - layer_range: [14, 20]
    model: Sao10K/L3-8B-Niitama-v1
parameters:
  int8_mask: true
merge_method: passthrough
dtype: bfloat16
name: niitama14-20.sl
---
models:
- model: celeste14-20.sl
  parameters:
    weight: [1, 0.75, 0.625, 0.5, 0.375, 0.25, 0]
- model: niitama14-20.sl
  parameters:
    weight: [0, 0.25, 0.375, 0.5, 0.625, 0.75, 1]
merge_method: dare_linear
base_model: celeste14-20.sl
dtype: bfloat16
name: celeniit14-20.sl
---
models:
slices:
- sources:
  - layer_range: [0, 4]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [1, 5]
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [4, 8]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [5, 9]
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [8, 10]
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [6, 14]
    model: nothingiisreal/L3-8B-Celeste-V1.2
- sources:
  - layer_range: [0, 6]
    model: celeniit14-20.sl
- sources:
  - layer_range: [20, 23]
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [22, 26]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [22, 28]
    model: nothingiisreal/L3-8B-Celeste-V1.2
- sources:
  - layer_range: [25, 27]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [28, 30]
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [25, 32]
    model: nothingiisreal/L3-8B-Celeste-V1.2
parameters:
  int8_mask: true
merge_method: passthrough
dtype: bfloat16
```
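As a sanity check on the final stack, the slice ranges sum to the merged depth. Assuming mergekit's half-open `[start, end)` layer ranges, the passthrough stack comes out to 56 layers, versus 32 in a stock L3 8B, which is roughly where the 13B size comes from:

```python
# Half-open layer ranges from the final passthrough config, in stack order.
# (The [0, 6] entry is the 6-layer celeniit14-20.sl splice built above.)
slices = [
    (0, 4), (1, 5), (4, 8), (5, 9), (8, 10), (6, 14), (0, 6),
    (20, 23), (22, 26), (22, 28), (25, 27), (28, 30), (25, 32),
]

total_layers = sum(end - start for start, end in slices)
print(total_layers)  # → 56, vs. 32 layers in a stock Llama 3 8B
```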