Update README.md

README.md
- **abacusai/Llama-3-Smaug-8B**: Improves the model's performance in real-world, multi-turn conversations, which is crucial for applications in customer service and interactive learning environments.
- **beomi/Llama-3-Open-Ko-8B-Instruct-preview**: Focuses on improving understanding and generation of Korean, offering robust solutions for bilingual or multilingual applications targeting Korean-speaking audiences.

## 🖼️ Key Features

- **Extended Context Length**: Utilizes PoSE (Positional Skip-wise Training) to handle up to 256,000 tokens, making it ideal for analyzing large volumes of text such as books, comprehensive reports, and lengthy communications.
- **Advanced Integration of Models**: Combines strengths from various models, including NousResearch's Meta-Llama-3-8B, the instruction-following capabilities of Llama-3-Open-Ko-8B-Instruct-preview, and specialized capabilities from models like Llama-3-Smaug-8B for nuanced dialogues and Orca-1.0-8B for technical precision.

## 🎨 Models Merged

The following models were included in the merge:
- **winglian/llama-3-8b-256k-PoSE**: [Extends the context handling capability](https://huggingface.co/winglian/llama-3-8b-256k-PoSE). This model uses Positional Skip-wise Training (PoSE) to enhance the handling of extended context lengths, up to 256k tokens.
- **NousResearch/Meta-Llama-3-8B-Instruct**: [Offers advanced instruction-following capabilities](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct). It is optimized to follow complex instructions, enhancing the model's utility in task-oriented dialogues and applications that require a high level of understanding and execution of user commands.

### 🛠️ Merge Method
- **DARE TIES**: This method was employed to ensure that each component model contributes effectively to the merged model, maintaining a high level of performance across diverse applications. NousResearch/Meta-Llama-3-8B served as the base model for this integration, providing a stable and powerful framework for the other models to build upon.
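
DARE TIES merges of this kind are typically run with mergekit (an assumption; the toolkit is not named in this README). A minimal sketch, assuming the YAML from the Configuration section below is saved as `config.yaml`; the output directory name is illustrative:

```
pip install mergekit   # or install from https://github.com/arcee-ai/mergekit
mergekit-yaml config.yaml ./SmartLlama-3-Ko-8B-256k-PoSE --cuda
```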

## 💻 Ollama

To run the model locally, create it in Ollama from the quantized GGUF weights and the Modelfile below:

```
ollama create smartllama-3-Ko-8b-256k-pose -f ./Modelfile_Q5_K_M
```

[Modelfile_Q5_K_M]

```
# Q5_K_M-quantized GGUF of the merged model
FROM smartllama-3-ko-8b-256k-pose-Q5_K_M.gguf
TEMPLATE """
{{- if .System }}
system
<s>{{ .System }}</s>
{{- end }}
user
<s>Human:
{{ .Prompt }}</s>
assistant
<s>Assistant:
"""

# System prompt (Korean): "As a friendly chatbot, answer the other person's requests as kindly
# and in as much detail as possible. Regardless of length, always answer in Korean."
SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 길이에 상관없이 모든 대답은 한국어(Korean)으로 대답해줘.
"""

# Generation settings: up to 3000 new tokens per reply, full 256k-token context window.
PARAMETER temperature 0.7
PARAMETER num_predict 3000
PARAMETER num_ctx 256000
PARAMETER stop "<s>"
PARAMETER stop "</s>"
```
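
Once created, you can query the model directly; a usage sketch (the Korean prompt, asking for the capital of Korea, is just an example):

```
ollama run smartllama-3-Ko-8b-256k-pose "대한민국의 수도는 어디야?"
```

Note that `num_ctx 256000` asks Ollama for the full 256k-token window, which can require a very large amount of memory; if loading fails on smaller machines, lower `num_ctx` in the Modelfile.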

### 🎛️ Configuration

The YAML configuration for this model:

```yaml