

license: llama2
language:
  - en

This is an ExLlamaV2 quantization of https://huggingface.co/TheBloke/Stheno-L2-13B-GPTQ. It uses a target bpw (bits per weight) of 8, intended for best quality on cards like a 3090 or similar. measurement.json is included for convenience when quantizing to other sizes. Calibration data: https://huggingface.co/datasets/wikitext/resolve/refs%2Fconvert%2Fparquet/wikitext-2-v1/test/0000.parquet
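As a rough sanity check on why 8 bpw suits a 24 GB card such as the 3090, the weight footprint can be estimated from the parameter count and the target bits per weight. This is a back-of-the-envelope sketch: the 13e9 parameter count is the nominal "13B" figure, `weight_size_gib` is a hypothetical helper (not part of ExLlamaV2), and the estimate ignores the KV cache, activations, and any per-file overhead.

```python
def weight_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30


# Nominal 13B parameters at the 8 bpw target used for this quant.
estimate = weight_size_gib(13e9, 8.0)
print(f"~{estimate:.1f} GiB of weights at 8 bpw")

# Same model at a smaller, hypothetical target for comparison.
print(f"~{weight_size_gib(13e9, 4.0):.1f} GiB of weights at 4 bpw")
```

The remaining VRAM has to hold the context cache, so usable context length will depend on the card and the loader settings.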

An experimental merge of several models using two methods, Ties-Merge and BlockMerge_Gradient.

I plan for this to be the base of my model, with my own [Stheno: ERP-Based LORA] merged in at some point in the future.

Stheno:
Gradient Merge of Stheno-P1 & Stheno-P2.

SISTER MODEL HERE: Stheno-Inverted-L2-13B

Quants courtesy of TheBloke!
GPTQ
GGUF
GGML

Test Checklist:
Censorship - Fairly Uncensored
Writing - Good Prose, Fairly Descriptive
NSFW - Yes
IQ Level - Pretty Smart
Formatting - Proper Formatting with Examples

Stheno-P1 [Ties-Merge]
-----elinas/chronos-13b-v2
-----jondurbin/airoboros-l2-13b-2.1
-----NousResearch/Nous-Hermes-Llama2-13b+nRuaif/Kimiko-v2 LORA

Stheno-P2 [Ties-Merge]
-----CalderaAI/13B-Legerdemain-L2+lemonilia/limarp-llama2-v2 LORA
-----ehartford/WizardLM-1.0-Uncensored-Llama2-13b
-----Henk717/spring-dragon
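For readers unfamiliar with Ties-Merge, here is a toy sketch of the general TIES idea (trim small deltas, elect a sign per parameter, average only the deltas that agree with it). It works on flat lists of floats and is not the ties-merge script used for Stheno; the function name, the `density` and `scale` parameters, and their defaults are assumptions made for illustration.

```python
def ties_merge(base, finetuned, density=0.5, scale=1.0):
    """Toy TIES merge. `base` is a list of floats, `finetuned` a list of such lists."""
    n = len(base)

    # 1. Task vectors: what each fine-tuned model changed relative to the base.
    deltas = [[model[i] - base[i] for i in range(n)] for model in finetuned]

    # 2. Trim: keep roughly the top `density` fraction of each delta by magnitude.
    #    (Ties at the threshold are all kept, so slightly more may survive.)
    k = max(1, round(density * n))
    trimmed = []
    for delta in deltas:
        threshold = sorted((abs(x) for x in delta), reverse=True)[k - 1]
        trimmed.append([x if abs(x) >= threshold else 0.0 for x in delta])

    merged = []
    for i in range(n):
        # 3. Elect sign: the direction with the larger total magnitude wins.
        total = sum(t[i] for t in trimmed)
        sign = 1.0 if total >= 0 else -1.0

        # 4. Disjoint mean: average only the deltas that agree with that sign.
        agreeing = [t[i] for t in trimmed if t[i] * sign > 0]
        mean = sum(agreeing) / len(agreeing) if agreeing else 0.0

        merged.append(base[i] + scale * mean)
    return merged


# Two made-up "models" fine-tuned from an all-zero base.
base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -0.2, 0.5, 0.0]
model_b = [-0.4, 0.3, 0.7, 0.1]
print(ties_merge(base, [model_a, model_b], density=0.5))
```

In the first position the two models disagree in sign, so only the larger change is kept instead of being averaged away; that conflict handling is the main difference from a plain weighted average.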

Most prompt formats should work, but all my tests were done in Alpaca format, and it works well.

### Instruction:
Your instruction or question here.
For roleplay purposes, I suggest the following - Write <CHAR NAME>'s next reply in a chat between <YOUR NAME> and <CHAR NAME>. Write a single reply only.

### Response:
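
The template above can be assembled programmatically. The following is a minimal sketch; `build_alpaca_prompt` and `roleplay_instruction` are hypothetical helper names, and the exact whitespace (one blank line between the two sections, a newline after `### Response:`) is an assumption based on the layout shown here.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca-style layout shown above."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def roleplay_instruction(char_name: str, user_name: str) -> str:
    """Fill in the suggested roleplay instruction with concrete names."""
    return (
        f"Write {char_name}'s next reply in a chat between "
        f"{user_name} and {char_name}. Write a single reply only."
    )


# Placeholder names, purely for illustration.
prompt = build_alpaca_prompt(roleplay_instruction("Stheno", "Alex"))
print(prompt)
```

The resulting string is what gets passed to the model as the prompt; the model's completion is the text that follows `### Response:`.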

Below is the Illustration for the Final Merge:

Once again, thanks to Chargoddard for his amazing and simple ties-merge script, and Gryphe for their great BlockMerge_Gradient script. Thanks to the original model creators too!

Art by wada_kazu / わだかず (pixiv page private?)