---
tags:
- gptq
- 4bit
- int4
- gptqmodel
- modelcloud
- llama-3.1
- 8b
- instruct
license: llama3.1
---

This model has been quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel).

- **bits**: 4
- **group_size**: 128
- **desc_act**: true
- **static_groups**: false
- **sym**: true
- **lm_head**: false
- **damp_percent**: 0.005
- **true_sequential**: true
- **model_name_or_path**: ""
- **model_file_base_name**: "model"
- **quant_method**: "gptq"
- **checkpoint_format**: "gptq"
- **meta**:
  - **quantizer**: "gptqmodel:0.9.9-dev0"

## Example:

```python
from transformers import AutoTokenizer
from gptqmodel import GPTQModel

model_name = "ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit"

prompt = [{"role": "user", "content": "I am in Shanghai, preparing to visit the natural history museum. Can you tell me the best way to"}]

tokenizer = AutoTokenizer.from_pretrained(model_name)

model = GPTQModel.from_quantized(model_name)

input_tensor = tokenizer.apply_chat_template(prompt, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids=input_tensor.to(model.device), max_new_tokens=100)
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)

print(result)
```

## lm-eval benchmark

```
|                 Tasks                 |Version|Filter|n-shot|  Metric  |   |Value |   |Stderr|
|---------------------------------------|------:|------|-----:|----------|---|-----:|---|-----:|
|arc_challenge                          |      1|none  |     0|acc       |↑  |0.4889|±  |0.0146|
|                                       |       |none  |     0|acc_norm  |↑  |0.5265|±  |0.0146|
|arc_easy                               |      1|none  |     0|acc       |↑  |0.7908|±  |0.0083|
|                                       |       |none  |     0|acc_norm  |↑  |0.7702|±  |0.0086|
|boolq                                  |      2|none  |     0|acc       |↑  |0.8404|±  |0.0064|
|hellaswag                              |      1|none  |     0|acc       |↑  |0.5748|±  |0.0049|
|                                       |       |none  |     0|acc_norm  |↑  |0.7718|±  |0.0042|
|lambada_openai                         |      1|none  |     0|acc       |↑  |0.7110|±  |0.0063|
|                                       |       |none  |     0|perplexity|↓  |4.0554|±  |0.0955|
|mmlu                                   |      1|none  |      |acc       |↑  |0.6411|±  |0.0038|
| - humanities                          |      1|none  |      |acc       |↑  |0.5896|±  |0.0068|
|  - formal_logic                       |      0|none  |     0|acc       |↑  |0.4762|±  |0.0447|
|  - high_school_european_history       |      0|none  |     0|acc       |↑  |0.7273|±  |0.0348|
|  - high_school_us_history             |      0|none  |     0|acc       |↑  |0.8088|±  |0.0276|
|  - high_school_world_history          |      0|none  |     0|acc       |↑  |0.8270|±  |0.0246|
|  - international_law                  |      0|none  |     0|acc       |↑  |0.7851|±  |0.0375|
|  - jurisprudence                      |      0|none  |     0|acc       |↑  |0.7593|±  |0.0413|
|  - logical_fallacies                  |      0|none  |     0|acc       |↑  |0.7607|±  |0.0335|
|  - moral_disputes                     |      0|none  |     0|acc       |↑  |0.7197|±  |0.0242|
|  - moral_scenarios                    |      0|none  |     0|acc       |↑  |0.4045|±  |0.0164|
|  - philosophy                         |      0|none  |     0|acc       |↑  |0.7074|±  |0.0258|
|  - prehistory                         |      0|none  |     0|acc       |↑  |0.6975|±  |0.0256|
|  - professional_law                   |      0|none  |     0|acc       |↑  |0.4817|±  |0.0128|
|  - world_religions                    |      0|none  |     0|acc       |↑  |0.7953|±  |0.0309|
| - other                               |      1|none  |      |acc       |↑  |0.7100|±  |0.0078|
|  - business_ethics                    |      0|none  |     0|acc       |↑  |0.6700|±  |0.0473|
|  - clinical_knowledge                 |      0|none  |     0|acc       |↑  |0.7660|±  |0.0261|
|  - college_medicine                   |      0|none  |     0|acc       |↑  |0.6590|±  |0.0361|
|  - global_facts                       |      0|none  |     0|acc       |↑  |0.3600|±  |0.0482|
|  - human_aging                        |      0|none  |     0|acc       |↑  |0.6547|±  |0.0319|
|  - management                         |      0|none  |     0|acc       |↑  |0.8447|±  |0.0359|
|  - marketing                          |      0|none  |     0|acc       |↑  |0.8803|±  |0.0213|
|  - medical_genetics                   |      0|none  |     0|acc       |↑  |0.7100|±  |0.0456|
|  - miscellaneous                      |      0|none  |     0|acc       |↑  |0.8161|±  |0.0139|
|  - nutrition                          |      0|none  |     0|acc       |↑  |0.7124|±  |0.0259|
|  - professional_accounting            |      0|none  |     0|acc       |↑  |0.4787|±  |0.0298|
|  - professional_medicine              |      0|none  |     0|acc       |↑  |0.7279|±  |0.0270|
|  - virology                           |      0|none  |     0|acc       |↑  |0.5181|±  |0.0389|
| - social sciences                     |      1|none  |      |acc       |↑  |0.7312|±  |0.0078|
|  - econometrics                       |      0|none  |     0|acc       |↑  |0.4035|±  |0.0462|
|  - high_school_geography              |      0|none  |     0|acc       |↑  |0.8232|±  |0.0272|
|  - high_school_government_and_politics|      0|none  |     0|acc       |↑  |0.8653|±  |0.0246|
|  - high_school_macroeconomics         |      0|none  |     0|acc       |↑  |0.6128|±  |0.0247|
|  - high_school_microeconomics         |      0|none  |     0|acc       |↑  |0.7227|±  |0.0291|
|  - high_school_psychology             |      0|none  |     0|acc       |↑  |0.8422|±  |0.0156|
|  - human_sexuality                    |      0|none  |     0|acc       |↑  |0.7634|±  |0.0373|
|  - professional_psychology            |      0|none  |     0|acc       |↑  |0.6585|±  |0.0192|
|  - public_relations                   |      0|none  |     0|acc       |↑  |0.6182|±  |0.0465|
|  - security_studies                   |      0|none  |     0|acc       |↑  |0.7306|±  |0.0284|
|  - sociology                          |      0|none  |     0|acc       |↑  |0.8358|±  |0.0262|
|  - us_foreign_policy                  |      0|none  |     0|acc       |↑  |0.8600|±  |0.0349|
| - stem                                |      1|none  |      |acc       |↑  |0.5623|±  |0.0085|
|  - abstract_algebra                   |      0|none  |     0|acc       |↑  |0.3800|±  |0.0488|
|  - anatomy                            |      0|none  |     0|acc       |↑  |0.6222|±  |0.0419|
|  - astronomy                          |      0|none  |     0|acc       |↑  |0.7039|±  |0.0372|
|  - college_biology                    |      0|none  |     0|acc       |↑  |0.7778|±  |0.0348|
|  - college_chemistry                  |      0|none  |     0|acc       |↑  |0.5400|±  |0.0501|
|  - college_computer_science           |      0|none  |     0|acc       |↑  |0.5300|±  |0.0502|
|  - college_mathematics                |      0|none  |     0|acc       |↑  |0.3200|±  |0.0469|
|  - college_physics                    |      0|none  |     0|acc       |↑  |0.4608|±  |0.0496|
|  - computer_security                  |      0|none  |     0|acc       |↑  |0.7800|±  |0.0416|
|  - conceptual_physics                 |      0|none  |     0|acc       |↑  |0.5617|±  |0.0324|
|  - electrical_engineering             |      0|none  |     0|acc       |↑  |0.6138|±  |0.0406|
|  - elementary_mathematics             |      0|none  |     0|acc       |↑  |0.4365|±  |0.0255|
|  - high_school_biology                |      0|none  |     0|acc       |↑  |0.7839|±  |0.0234|
|  - high_school_chemistry              |      0|none  |     0|acc       |↑  |0.5665|±  |0.0349|
|  - high_school_computer_science       |      0|none  |     0|acc       |↑  |0.6600|±  |0.0476|
|  - high_school_mathematics            |      0|none  |     0|acc       |↑  |0.4407|±  |0.0303|
|  - high_school_physics                |      0|none  |     0|acc       |↑  |0.4371|±  |0.0405|
|  - high_school_statistics             |      0|none  |     0|acc       |↑  |0.5602|±  |0.0339|
|  - machine_learning                   |      0|none  |     0|acc       |↑  |0.4643|±  |0.0473|
|openbookqa                             |      1|none  |     0|acc       |↑  |0.3180|±  |0.0208|
|                                       |       |none  |     0|acc_norm  |↑  |0.4140|±  |0.0220|
|piqa                                   |      1|none  |     0|acc       |↑  |0.7878|±  |0.0095|
|                                       |       |none  |     0|acc_norm  |↑  |0.7971|±  |0.0094|
|rte                                    |      1|none  |     0|acc       |↑  |0.6751|±  |0.0282|
|truthfulqa_mc1                         |      2|none  |     0|acc       |↑  |0.3403|±  |0.0166|
|winogrande                             |      1|none  |     0|acc       |↑  |0.7206|±  |0.0126|

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      1|none  |      |acc   |↑  |0.6411|±  |0.0038|
| - humanities     |      1|none  |      |acc   |↑  |0.5896|±  |0.0068|
| - other          |      1|none  |      |acc   |↑  |0.7100|±  |0.0078|
| - social sciences|      1|none  |      |acc   |↑  |0.7312|±  |0.0078|
| - stem           |      1|none  |      |acc   |↑  |0.5623|±  |0.0085|
```
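
The `Stderr` column can be turned into an approximate confidence interval when comparing these scores against another checkpoint. The following is a minimal sketch, assuming a normal approximation (value ± 1.96 × stderr for a two-sided 95% interval). The helper name `confidence_interval` and the `results` dictionary are illustrative and not part of lm-eval or GPTQModel; the numbers are copied from the table above.

```python
# Assumption: scores are approximately normally distributed around the
# reported value, so a two-sided 95% interval is value ± 1.96 * stderr.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value ± z * stderr."""
    return value - z * stderr, value + z * stderr


# (value, stderr) pairs copied from the benchmark table above.
results = {
    "arc_challenge/acc": (0.4889, 0.0146),
    "hellaswag/acc_norm": (0.7718, 0.0042),
    "mmlu/acc": (0.6411, 0.0038),
    "winogrande/acc": (0.7206, 0.0126),
}

for name, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name:20s} {value:.4f}  95% CI [{low:.4f}, {high:.4f}]")
```

Under this approximation, two scores whose intervals overlap substantially should not be read as a clear difference between models. Subtasks with few questions, such as the individual MMLU subjects, have much wider intervals than the aggregated `mmlu` row.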