---
license: mit
library_name: transformers
datasets:
- Severian/Internal-Knowledge-Map
pipeline_tag: text-generation
---

# New Fixed Version with extended training available now!

<img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/GO4MY_3adP2G9EHKZbZpg.webp" width="500" height="500">

This model is the second trained on the experimental 'Internal Knowledge Map' dataset. It aims to go beyond the usual scope of data processing by building comprehensive understanding and reasoning across a wide range of knowledge domains, guided by elaborate guidelines. Its reasoning is grounded in a specially selected dataset that emphasizes the interrelations between diverse disciplines, with the goal of synthesizing, integrating, and applying complex information in ways that mimic human abstract reasoning and creative thought.

At the core of this model's development is the goal of having LLMs engage in cognitive activity that is not limited to memory but extends to abstract reasoning, problem-solving, and the generation of new insights. To that end, 'Nexus-IKM-Mistral-7B' was fine-tuned until convergence on this dataset using a novel Phased Training approach. The resulting model shows a greater capacity for generating insights and solving problems in complex, multi-disciplinary settings: it is better at drawing links between different pieces of knowledge, reasoning through complex scenarios, and proposing innovative solutions that cut across domains such as science, technology, environmental studies, and the humanities.

Test this out and see if you find anything interesting or intriguing. I will keep iterating on new versions, but this one seems like a fun and useful place to start.
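
For anyone who wants to try it quickly, here is a minimal sketch of loading the model with `transformers`. The repo id `Severian/Nexus-IKM-Mistral-7B` is an assumption inferred from the model name, so check the model page for the exact id. The helper names `build_prompt` and `generate` and the `### System / ### Instruction / ### Response` prompt layout are hypothetical illustrations, not the confirmed training format.

```python
# Assumed repo id (inferred from the model name); verify it on the model page.
MODEL_ID = "Severian/Nexus-IKM-Mistral-7B"


def build_prompt(system: str, instruction: str) -> str:
    """Join a system message and an instruction into one plain-text prompt.

    The section headers used here are a hypothetical layout for illustration.
    """
    return (
        f"### System:\n{system.strip()}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        "### Response:\n"
    )


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" needs the `accelerate` package to be installed.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so the prompt is not echoed back.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_prompt(
    "You connect ideas across disciplines.",
    "How might ecology inform software architecture?",
)
print(prompt)
# Uncomment to download the weights and generate a reply:
# print(generate(prompt))
```

The generation call is left commented out because it downloads the full 7B weights; the prompt builder runs on its own.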

# Phased Training Methodology

Leveraging this dataset, we've adopted a phased training methodology that focuses sequentially on different dataset components, namely "System" and "Instruction," across separate training phases. This approach lets the model build a layered understanding, from general systemic insights to specific instructional cues, enriching its generative output with both broad contextual awareness and detailed, topic-specific knowledge.

**Phase 1: System Focus**

In the initial phase, the model concentrates on the "System" component, absorbing overarching guidelines and objectives. This phase lays the foundational understanding, enabling the model to grasp the contextual framework and systemic knowledge encapsulated in the dataset.

**Phase 2: Instruction Focus**

Building upon the systemic knowledge, the second phase shifts the model's focus to the "Instruction" component. This phase sharpens the model's ability to interpret specific prompts and generate responses that are not only informed by the broader context but also precisely tailored to the instructional cues.
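
One way to picture the two phases is as two passes over the same records, each pairing a different component with the response. The sketch below is only an illustration under stated assumptions: the field names `system`, `instruction`, and `response`, the text template, and the sample record are all hypothetical, and the actual notebook may format each phase differently.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from phase number to the dataset field it focuses on.
PHASE_FIELDS = {1: "system", 2: "instruction"}


def format_for_phase(record: Dict[str, str], phase: int) -> str:
    """Build one training text that pairs a single component with the response."""
    field = PHASE_FIELDS[phase]
    header = field.capitalize()
    return f"### {header}:\n{record[field]}\n\n### Response:\n{record['response']}"


def build_phases(records: Iterable[Dict[str, str]]) -> Dict[int, List[str]]:
    """Return the training texts for each phase, keyed in the order phases run."""
    records = list(records)
    return {
        phase: [format_for_phase(r, phase) for r in records]
        for phase in sorted(PHASE_FIELDS)
    }


# Made-up record, used only to show the shape of the output.
sample = [
    {
        "system": "Act as an interdisciplinary analyst.",
        "instruction": "Relate entropy to information theory.",
        "response": "Both describe uncertainty in a system.",
    }
]

phases = build_phases(sample)
for phase, texts in phases.items():
    print(f"--- phase {phase} ---")
    print(texts[0])
```

Training would then run once on the phase 1 texts and continue from that checkpoint on the phase 2 texts, so the second pass builds on what the first one learned.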

## GGUF Q8 Version: https://huggingface.co/Severian/Nexus-IKM-Mistral-7B-GGUF

**If you'd like to train your own version, here is the full notebook to recreate the training on Unsloth yourself (https://colab.research.google.com/drive/1828t77iO2nLRXVfB8HoI11eFu-79-Oe7?usp=sharing). You'll just have to drop the train.jsonl from the Dataset repo (https://huggingface.co/datasets/Severian/Internal-Knowledge-Map) into your Colab directory and rename it dataset.jsonl.**
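
If you would rather script the rename step than do it by hand, a small helper along these lines could copy `train.jsonl` to `dataset.jsonl` and check that every line parses as JSON on the way. The function name `prepare_dataset` is hypothetical and not part of the notebook.

```python
import json
import os


def prepare_dataset(src: str, dst: str = "dataset.jsonl") -> int:
    """Copy a JSONL file to dst, validating each non-empty line as JSON.

    Returns the number of records written.
    """
    count = 0
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line_no, line in enumerate(fin, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines instead of failing on them
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{src}:{line_no} is not valid JSON: {exc}") from exc
            fout.write(line + "\n")
            count += 1
    return count


# Only runs when train.jsonl is already in the working directory.
if os.path.exists("train.jsonl"):
    print(prepare_dataset("train.jsonl"), "records written to dataset.jsonl")
```

In Colab, the source file could first be fetched with `hf_hub_download` from `huggingface_hub`, passing `repo_type="dataset"`, or simply uploaded through the file browser.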

## Training Snapshot

```
Step Training Loss
1 3.223000
2 3.221300
3 3.215900
4 3.210600
5 3.203000
6 3.193500
7 3.184000
8 3.173400
9 3.162400
10 3.151500
11 3.140500
12 3.128800
13 3.117600
14 3.106700
15 3.095500
16 3.084700
17 3.073700
18 3.062700
19 3.052300
20 3.041800
...
201 1.273200
202 1.257600
203 1.241900
204 1.226100
205 1.210800
206 1.195500
207 1.180800
208 1.166000
209 1.151200
210 1.136900
211 1.122000
212 1.106600
213 1.091200
214 1.075200
215 1.059200
216 1.042900
217 1.026600
218 1.010300
219 0.994200
...
416 0.041700
417 0.041700
418 0.041600
419 0.041600
420 0.041600
421 0.041600
422 0.041500
423 0.041500
424 0.041500
425 0.041400
426 0.041400
427 0.041400
428 0.041400
429 0.041300
430 0.041300
431 0.041300
432 0.041200
433 0.041200
434 0.041200
435 0.041100
436 0.041200
437 0.041100
438 0.041100
439 0.041100
440 0.041000
441 0.041000
442 0.041000
443 0.040900
444 0.040900
445 0.040900
...
668 0.035200
669 0.035100
670 0.035100
671 0.035100
672 0.035100
673 0.035000
674 0.035000
675 0.035000
676 0.035000
677 0.034900
678 0.034900
679 0.034900
680 0.034800
681 0.034800
682 0.034800
683 0.034800
684 0.034800
685 0.034700
686 0.034700
687 0.034700
688 0.034700
689 0.034600
690 0.034600
691 0.034600
692 0.034600
693 0.034500
694 0.034500
695 0.034500
696 0.034400
697 0.034400
698 0.034400
699 0.034400
700 0.034300
701 0.034300
702 0.034300
703 0.034300
704 0.034200
705 0.034200
706 0.034200
707 0.034200
708 0.034100
709 0.034100
710 0.034100
711 0.034100
712 0.034000
713 0.034000
714 0.034000
715 0.034000
716 0.033900
717 0.033900
718 0.033800
719 0.033800
720 0.033800
721 0.033800
...
1209 0.006600
1210 0.006500
1211 0.006300
1212 0.006200
1213 0.006100
1214 0.006000
1215 0.005800
1216 0.005700
1217 0.005600
1218 0.005500
1219 0.005400
1220 0.005300
1221 0.005100
1222 0.004900
1223 0.004800
1224 0.004700
1225 0.004600
1226 0.004500
1227 0.004400
1228 0.004300
1229 0.004200
1230 0.004000
1231 0.003900
1232 0.003800
1233 0.003700
1234 0.003500
1235 0.003400
1236 0.003300
1237 0.003200
1238 0.003000
1239 0.003000
1240 0.002900
1241 0.002800
1242 0.002700
1243 0.002600
1244 0.002500
1245 0.002400
1246 0.002300
1247 0.002200
1248 0.002100
1249 0.002000
1250 0.001900
1251 0.001800
1252 0.001800
1253 0.001700
1254 0.001600
1255 0.001600
1256 0.001500
1257 0.001400
1258 0.001300
1259 0.001300
1260 0.001200
1261 0.001200
1262 0.001100
1263 0.001100
1264 0.001000
1265 0.001000
1266 0.000900
1267 0.000900
1268 0.000800
1269 0.000800
1270 0.000800
1271 0.000800
1272 0.000700
1273 0.000700
1274 0.000700
1275 0.000600
1276 0.000600
1277 0.000600
1278 0.000600
1279 0.000500
1280 0.000500
1281 0.000500
1282 0.000500
1283 0.000500
1284 0.000500
1285 0.000500
1286 0.000400
1287 0.000400
1288 0.000400
1289 0.000400
1290 0.000400
1291 0.000400
1292 0.000400
1293 0.000400
1294 0.000400
1295 0.000400
1296 0.000400
1297 0.000300
1298 0.000300
```
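
To inspect the snapshot programmatically, a short parser like the sketch below can turn the log into `(step, loss)` pairs and group them into runs of consecutive steps, since the snapshot skips ranges. The function names are hypothetical, and the sample string contains only a few rows copied from the log above.

```python
from typing import List, Tuple

Row = Tuple[int, float]


def parse_loss_log(text: str) -> List[Row]:
    """Extract (step, loss) pairs, skipping headers and blank lines."""
    rows = []
    for line in text.splitlines():
        # Ignore any stray column separators left around the values.
        parts = line.replace("|", " ").split()
        if len(parts) == 2 and parts[0].isdigit():
            rows.append((int(parts[0]), float(parts[1])))
    return rows


def split_segments(rows: List[Row]) -> List[List[Row]]:
    """Group rows into runs where each step follows the previous one."""
    segments: List[List[Row]] = []
    for step, loss in rows:
        if segments and step == segments[-1][-1][0] + 1:
            segments[-1].append((step, loss))
        else:
            segments.append([(step, loss)])
    return segments


# A few rows copied from the snapshot, not the full log.
sample = """Step Training Loss
1 3.223000
2 3.221300
201 1.273200
202 1.257600
1297 0.000300
1298 0.000300
"""

rows = parse_loss_log(sample)
segments = split_segments(rows)
for seg in segments:
    first, last = seg[0], seg[-1]
    print(f"steps {first[0]}-{last[0]}: loss {first[1]:.6f} -> {last[1]:.6f}")
```

Pasting the full log into `sample` would give one summary line per logged range.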