---
base_model:
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
- beomi/Llama-3-KoEn-8B-Instruct-preview
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B
- NousResearch/Meta-Llama-3-8B-Instruct
- ajibawa-2023/Code-Llama-3-8B
- defog/llama-3-sqlcoder-8b
- NousResearch/Hermes-2-Pro-Llama-3-8B
- Locutusque/llama-3-neural-chat-v2.2-8B
- asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1
library_name: transformers
tags:
- mergekit
- merge
---
# Joah-Llama-3-KoEn-8B-Coder-v2
<a href="https://ibb.co/k8hmBF4"><img src="https://i.ibb.co/J7z3tPv/Screenshot-2024-05-11-at-7-48-08-PM.png" alt="Screenshot-2024-05-11-at-7-48-08-PM" border="0"></a>
"A cool merge model with swag"
"Joah" by AsianSoul
Coming soon: a multi-language model merge based on this one. German comes first (Korean / English / German).
Where to use Joah: Medical, Korean, English, Translation, Code, Science...
<u>Stronger SQL code and other science skills compared to V1</u>
## Merge Details
The performance of this merged model doesn't seem bad, though that's just my opinion ^^
This may not be a model that satisfies you. But if we continue to overcome our shortcomings,
won't we someday find the answer we want?
Don't worry even if you don't get the results you want.
I'll find the answer for you.
Coming soon: real PoSE to extend Llama's context length to 64k, using my merge method: [reborn](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2)
I have found that most merged models out there so far do not actually have 64k in their configs. I will improve on this in the next merge with my reborn method. If that doesn't work, I guess I'll have to find another way, right?
256k is not possible. My computer is running out of memory.
If you support me, I will try it on a computer with maximum specifications. I would also like to run large-scale tests for you by building a high-capacity network with 10G speeds.
### Merge Method
This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
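For intuition, here is a minimal pure-Python sketch of the two ideas behind DARE TIES, run on flat lists of numbers instead of real tensors. It is a simplified illustration under my own assumptions, not mergekit's actual implementation. DARE randomly drops entries of each task vector (fine-tuned minus base) and rescales the survivors by `1 / density`. The TIES-style step then keeps, per parameter, only the contributions that agree with the dominant sign. The function names and toy inputs are hypothetical.

```python
import random


def dare_prune(delta, density, rng):
    """Keep each entry with probability `density`; rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def sign_elect_merge(base, deltas, weights):
    """Per parameter, elect the sign of the weighted sum, then add only the
    weighted contributions that agree with that sign (no renormalization here)."""
    merged = []
    for i, b in enumerate(base):
        contribs = [w * d[i] for d, w in zip(deltas, weights)]
        sign = 1.0 if sum(contribs) >= 0 else -1.0
        agreed = sum(c for c in contribs if c * sign > 0)
        merged.append(b + agreed)
    return merged


def dare_ties(base, finetuned, densities, weights, seed=0):
    """Toy DARE + sign-election merge over flat parameter lists."""
    rng = random.Random(seed)
    deltas = [
        dare_prune([f - b for f, b in zip(model, base)], density, rng)
        for model, density in zip(finetuned, densities)
    ]
    return sign_elect_merge(base, deltas, weights)
```

With `density` below 1.0 each run drops a different random subset unless the seed is fixed, which is why the sketch takes a `seed` argument.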
### Models Merged
The following models were included in the merge:
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1)
* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [ajibawa-2023/Code-Llama-3-8B](https://huggingface.co/ajibawa-2023/Code-Llama-3-8B)
* [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)
* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
* [Locutusque/llama-3-neural-chat-v2.2-8B](https://huggingface.co/Locutusque/llama-3-neural-chat-v2.2-8B)
* [asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1](https://huggingface.co/asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1)
### Ollama
Modelfile_Q5_K_M
```
FROM joah-llama-3-koen-8b-coder-v2-Q5_K_M.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
"""
PARAMETER num_keep 24
PARAMETER temperature 0.7
PARAMETER num_predict 3000
PARAMETER num_ctx 64000
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```
```
ollama create joah -f ./Modelfile_Q5_K_M
```
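Once the model is created, you can also query it from Python through Ollama's local REST API. This is a small sketch that assumes the Ollama server is running on its default port (11434) and that you kept the model name `joah` from the command above. The helper names `build_request` and `ask` are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_request(prompt, model="joah"):
    """Build a non-streaming generate request for the created model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="joah"):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call, only works while the Ollama server is up:
# print(ask("Write a SQL query that counts the rows in a table named users."))
```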
Modelfile_Q5_K_M is the default. I hope you will test the other files uploaded to my repo: change the `FROM` line to the file you want and run `ollama create` again.
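To try a different quantization, only the `FROM` line of the Modelfile needs to change. A tiny helper like the one below can do that. It is just a sketch; the function name `with_gguf` and the `Q4_K_M` file name in the comment are hypothetical, so check the repo for the real file names.

```python
def with_gguf(modelfile_text, gguf_name):
    """Return the Modelfile text with its FROM line pointing at `gguf_name`."""
    lines = modelfile_text.splitlines()
    swapped = [
        f"FROM {gguf_name}" if line.startswith("FROM ") else line
        for line in lines
    ]
    return "\n".join(swapped)


# Example with a hypothetical file name:
# new_text = with_gguf(old_text, "joah-llama-3-koen-8b-coder-v2-Q4_K_M.gguf")
```

Save the result under a new Modelfile name and pass it to `ollama create` as before.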
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.25
  - model: beomi/Llama-3-KoEn-8B-Instruct-preview
    parameters:
      density: 0.55
      weight: 0.15
  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55
      weight: 0.1
  - model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
    parameters:
      density: 0.55
      weight: 0.1
  - model: asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1
    parameters:
      density: 0.55
      weight: 0.2
  - model: ajibawa-2023/Code-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: defog/llama-3-sqlcoder-8b
    parameters:
      density: 0.55
      weight: 0.1
  - model: Locutusque/llama-3-neural-chat-v2.2-8B
    parameters:
      density: 0.55
      weight: 0.1
  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.05
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
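A quick arithmetic check on the config: the nine `weight` values add up to 1.10, not 1.0. The sketch below recomputes that total and prints each model's relative share, assuming the weights are rescaled to sum to 1. That rescaling is an assumption on my part, since the config above does not say how the weights are normalized.

```python
# Weights copied from the YAML configuration above.
weights = {
    "NousResearch/Meta-Llama-3-8B-Instruct": 0.25,
    "beomi/Llama-3-KoEn-8B-Instruct-preview": 0.15,
    "asiansoul/Llama-3-Open-Ko-Linear-8B": 0.1,
    "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1": 0.1,
    "asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1": 0.2,
    "ajibawa-2023/Code-Llama-3-8B": 0.05,
    "defog/llama-3-sqlcoder-8b": 0.1,
    "Locutusque/llama-3-neural-chat-v2.2-8B": 0.1,
    "NousResearch/Hermes-2-Pro-Llama-3-8B": 0.05,
}

total = sum(weights.values())

# Relative share of each model if the weights were normalized (assumption).
shares = {name: w / total for name, w in weights.items()}

print(f"total weight: {total:.2f}")
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{share:6.1%}  {name}")
```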