|
--- |
|
base_model: |
|
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1 |
|
- beomi/Llama-3-KoEn-8B-Instruct-preview |
|
- asiansoul/Llama-3-Open-Ko-Linear-8B |
|
- NousResearch/Meta-Llama-3-8B |
|
- NousResearch/Meta-Llama-3-8B-Instruct |
|
- ajibawa-2023/Code-Llama-3-8B |
|
- defog/llama-3-sqlcoder-8b |
|
- NousResearch/Hermes-2-Pro-Llama-3-8B |
|
- Locutusque/llama-3-neural-chat-v2.2-8B |
|
- asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1 |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
|
|
--- |
|
# Joah-Llama-3-KoEn-8B-Coder-v2 |
|
|
|
<a href="https://ibb.co/k8hmBF4"><img src="https://i.ibb.co/J7z3tPv/Screenshot-2024-05-11-at-7-48-08-PM.png" alt="Screenshot-2024-05-11-at-7-48-08-PM" border="0"></a> |
|
|
|
"A cool merge model with swag" |
|
|
|
"Joah" by AsianSoul |
|
|
|
A multilingual merge based on this model is coming soon, starting with German (Korean / English / German).
|
|
|
Where to use Joah: medical, Korean, English, translation, code, science, and more.
|
|
|
<u>SQL coding and other scientific capabilities are strengthened compared to v1.</u>
|
|
|
## Merge Details
|
|
|
|
|
The performance of this merge model seems reasonably good so far (just my opinion ^^).
|
|
|
This may not be a model that satisfies you. But if we continue to overcome its shortcomings,

won't we someday find the answer we want?

Don't worry even if you don't get the results you want.

I'll keep searching for the answer for you.
|
|
|
Coming soon: real PoSE to extend Llama's context length to 64k, combined with my merge method, [Reborn](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2).
|
|
|
I have found that most merged models published so far do not actually have a 64k context length in their configs. I will improve this in my next merge with Reborn. If that doesn't work, I guess I'll have to find another way, right?
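
To verify a merged model's configured context window yourself, you can inspect its config without downloading any weights. A minimal sketch using `transformers` (the repo id here is a placeholder):

```python
# Check a repo's configured context window; only config.json is fetched.
from transformers import AutoConfig

repo_id = "some-org/some-llama-3-merge"  # placeholder; substitute any Hugging Face repo id
config = AutoConfig.from_pretrained(repo_id)

# Llama-3 ships with 8192 positions; a genuine 64k model should report 65536 here.
print(config.max_position_embeddings)
```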
|
|
|
256k is not possible for now; my computer runs out of memory.
|
|
|
If you support me, I will try it on a machine with maximum specs, and I would also like to run thorough tests for you on a network with high-capacity traffic and high-speed 10G connectivity.
|
|
|
### Merge Method
|
|
|
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099)-[TIES](https://arxiv.org/abs/2306.01708) merge method, with [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
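
For intuition: DARE randomly drops a fraction of each fine-tuned model's parameter deltas (keeping roughly `density` of them) and rescales the survivors, while TIES resolves sign conflicts between models by majority vote before summing. A rough single-tensor sketch, illustrative only and not mergekit's actual implementation:

```python
import torch

def dare_ties_merge(base, finetuned, densities, weights):
    """Merge one tensor from several fine-tunes into `base` (conceptual sketch)."""
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base
        # DARE: drop each delta entry with probability (1 - density), rescale survivors.
        mask = torch.bernoulli(torch.full_like(delta, density))
        deltas.append(weight * mask * delta / density)
    stacked = torch.stack(deltas)
    # TIES: elect a per-parameter sign by majority, keep only agreeing deltas.
    elected_sign = torch.sign(stacked.sum(dim=0))
    merged_delta = (stacked * (torch.sign(stacked) == elected_sign)).sum(dim=0)
    return base + merged_delta
```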
|
|
|
### Models Merged
|
|
|
The following models were included in the merge: |
|
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1) |
|
* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview) |
|
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B) |
|
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) |
|
* [ajibawa-2023/Code-Llama-3-8B](https://huggingface.co/ajibawa-2023/Code-Llama-3-8B) |
|
* [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b) |
|
* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B) |
|
* [Locutusque/llama-3-neural-chat-v2.2-8B](https://huggingface.co/Locutusque/llama-3-neural-chat-v2.2-8B) |
|
* [asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1](https://huggingface.co/asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1) |
|
|
|
### Ollama
|
|
|
`Modelfile_Q5_K_M` (the system prompt instructs the model to answer as kindly and thoroughly as possible, always in Korean):
|
|
|
``` |
|
FROM joah-llama-3-koen-8b-coder-v2-Q5_K_M.gguf |
|
TEMPLATE """ |
|
{{- if .System }} |
|
system |
|
<s>{{ .System }}</s> |
|
{{- end }} |
|
user |
|
<s>Human: |
|
{{ .Prompt }}</s> |
|
assistant |
|
<s>Assistant: |
|
""" |
|
|
|
SYSTEM """ |
|
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
|
""" |
|
|
|
PARAMETER temperature 0.7 |
|
PARAMETER num_predict 3000 |
|
PARAMETER num_ctx 4096 |
|
PARAMETER stop "<s>" |
|
PARAMETER stop "</s>" |
|
``` |
|
|
|
``` |
|
ollama create joah -f ./Modelfile_Q5_K_M |
|
``` |
|
|
|
`Modelfile_Q5_K_M` is the default. I encourage you to try the other quantized files uploaded to this repo by changing the `FROM` line and creating the Ollama model again. Once created, you can chat with it via `ollama run joah`.
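
If you would rather use the weights directly with `transformers` than a GGUF through Ollama, here is a minimal sketch (the repo id is assumed from this card's title; adjust it to the actual upload path):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "asiansoul/Joah-Llama-3-KoEn-8B-Coder-v2"  # assumed from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant. Answer in Korean."},
    {"role": "user", "content": "Write a SQL query that returns the top 5 customers by revenue."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```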
|
|
|
### Configuration
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml |
|
models: |
|
- model: NousResearch/Meta-Llama-3-8B |
|
# Base model providing a general foundation without specific parameters |
|
|
|
- model: NousResearch/Meta-Llama-3-8B-Instruct |
|
parameters: |
|
density: 0.60 |
|
weight: 0.25 |
|
|
|
- model: beomi/Llama-3-KoEn-8B-Instruct-preview |
|
parameters: |
|
density: 0.55 |
|
weight: 0.15 |
|
|
|
- model: asiansoul/Llama-3-Open-Ko-Linear-8B |
|
parameters: |
|
density: 0.55 |
|
weight: 0.1 |
|
|
|
- model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1 |
|
parameters: |
|
density: 0.55 |
|
weight: 0.1 |
|
|
|
- model: asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1 |
|
parameters: |
|
density: 0.55 |
|
weight: 0.2 |
|
|
|
- model: ajibawa-2023/Code-Llama-3-8B |
|
parameters: |
|
density: 0.55 |
|
weight: 0.05 |
|
|
|
- model: defog/llama-3-sqlcoder-8b |
|
parameters: |
|
density: 0.55 |
|
weight: 0.1 |
|
|
|
- model: Locutusque/llama-3-neural-chat-v2.2-8B |
|
parameters: |
|
density: 0.55 |
|
weight: 0.1 |
|
|
|
- model: NousResearch/Hermes-2-Pro-Llama-3-8B |
|
parameters: |
|
density: 0.55 |
|
weight: 0.05 |
|
|
|
merge_method: dare_ties |
|
base_model: NousResearch/Meta-Llama-3-8B |
|
parameters: |
|
int8_mask: true |
|
dtype: bfloat16 |
|
|
|
|
|
``` |
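
To reproduce the merge, save the config above to a file and run it with [mergekit](https://github.com/arcee-ai/mergekit). A minimal sketch of mergekit's Python API as shown in its README (verify the names against your installed version):

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("joah_v2.yaml") as f:  # the YAML above, saved under an illustrative name
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    "./Joah-Llama-3-KoEn-8B-Coder-v2",  # output directory
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```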