Daniel Han-Chen

danielhanchen

https://unsloth.ai/

Articles

Faster fine-tuning using TRL & Unsloth

Jan 10

• 36

danielhanchen's activity

New activity in unsloth/gemma-2-27b-it-bnb-4bit about 1 month ago

Aphrodite/VLLM/SGLang all refuse to load this model

#5 opened about 1 month ago by

fullstack

New activity in unsloth/gemma-7b-bnb-4bit about 1 month ago

No module named 'triton'

#3 opened about 1 month ago by

NeelM0906

New activity in unsloth/Hermes-3-Llama-3.1-8B-bnb-4bit about 2 months ago

update base_model

#1 opened about 2 months ago by

davanstrien

New activity in unsloth/mistral-7b-instruct-v0.3 about 2 months ago

ValueError: The following `model_kwargs` are not used by the model: ['num_logits_to_keep'] (note: typos in the generate arguments will also show up in this list)

#1 opened about 2 months ago by

NeelM0906

New activity in unsloth/Phi-3-mini-4k-instruct-v0-bnb-4bit 2 months ago

Cant use the tokenizer using Unsloth Fastmodel

#2 opened 2 months ago by

aryarishit

New activity in unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit 3 months ago

RuntimeError: Unsloth: `unsloth/Meta-Llama-3.1-8B-bnb-4bit` is not a base model or a PEFT model.

#3 opened 3 months ago by

yorickdejong

New activity in unsloth/Mistral-Nemo-Base-2407 3 months ago

difference

#1 opened 3 months ago by

ehartford

New activity in google/gemma-2-9b-it 3 months ago

9B - query_pre_attn_scalar = 256 not 224

#26 opened 3 months ago by

danielhanchen

New activity in google/gemma-2-9b 3 months ago

9B - query_pre_attn_scalar = 256 not 224

#22 opened 3 months ago by

danielhanchen

New activity in unsloth/llama-3-8b 5 months ago

is this the llama-3-8b model clone?

#1 opened 6 months ago by

malhajar

New activity in unsloth/gemma-2b-bnb-4bit 5 months ago

Model seems to be not PEFT model

#1 opened 5 months ago by

neuralresearcher

New activity in unsloth/mistral-7b-v0.2-bnb-4bit 5 months ago

full disk on colab

#2 opened 5 months ago by

Dav22

New activity in unsloth/Phi-3-mini-4k-instruct-bnb-4bit 5 months ago

TGI - RuntimeError: mat1 and mat2 shapes cannot be multiplied (4145x3072 and 1x14155776)

#3 opened 5 months ago by

turjo4nis

New activity in unsloth/llama-3-8b-bnb-4bit 5 months ago

34 hour for file tunning ?

#7 opened 5 months ago by

dad1909

New activity in unsloth/llama-3-70b-Instruct-bnb-4bit 5 months ago

Update config.json

#1 opened 5 months ago by

huseink

New activity in unsloth/llama-3-8b-Instruct 5 months ago

Update config.json

#3 opened 5 months ago by

huseink

New activity in unsloth/llama-3-8b-Instruct-bnb-4bit 5 months ago

Update config.json

#2 opened 5 months ago by

huseink

New activity in unsloth/Phi-3-mini-4k-instruct-bnb-4bit 5 months ago

No package metadata was found for bitsandbytes

#1 opened 6 months ago by

halilbabacan

New activity in unsloth/llama-3-8b-Instruct-bnb-4bit 5 months ago

BitsAndBytesConfig error

#1 opened 6 months ago by

vdavidr

New activity in unsloth/llama-3-8b-bnb-4bit 5 months ago

Error: pull model manifest: file does not exist

#6 opened 5 months ago by

wesleyhk

No package metadata was found for bitsandbyte error

#5 opened 6 months ago by

halilbabacan

New activity in unsloth/Phi-3-mini-4k-instruct 5 months ago

fix: update tokenizer config to support `add_generation_prompt=True` and clarify content

#3 opened 5 months ago by

lamhieu

fix: stop generation at eos exactly like the original model

#4 opened 5 months ago by

lamhieu

inquiry about model architecture

#5 opened 5 months ago by

MahmoudMohamed

New activity in unsloth/llama-3-8b-Instruct 6 months ago

Upload generation_config.json

#2 opened 6 months ago by

Orenguteng

New activity in unsloth/llama-3-8b-bnb-4bit 6 months ago

Missing Chat Template

#1 opened 6 months ago by

dfrank

New activity in mistral-community/Mixtral-8x22B-v0.1 7 months ago

Benchmarks are here!

#4 opened 7 months ago by

0-hero

New activity in unsloth/gemma-7b-bnb-4bit 7 months ago

is this model the instruct version

#1 opened 7 months ago by

shi-zheng-qxhs

New activity in unsloth/mistral-7b-v0.2 7 months ago

Ignore

#3 opened 7 months ago by deleted

New activity in unsloth/mistral-7b 7 months ago

Mistral Base version

#2 opened 7 months ago by

chtmp223

New activity in unsloth/mistral-7b 8 months ago

Chat Templates

#1 opened 8 months ago by

mschmill

New activity in unsloth/mistral-7b-instruct-v0.2-bnb-4bit 8 months ago

How to merge lora?

#1 opened 8 months ago by

heatball

Add License Tag

#2 opened 8 months ago by

robinsmits

New activity in unsloth/llama-2-7b-chat-bnb-4bit 8 months ago

AttributeError: 'LlamaRotaryEmbedding' object has no attribute 'cos_cached'

#1 opened 8 months ago by

hannahbernstein

New activity in mistralai/Mistral-7B-v0.1 8 months ago

How to finetune this model mistralai/Mistral-7B-v0.1 and also merge the weights

#126 opened 9 months ago by

yeniceriSGK

New activity in unsloth/llama-2-13b 9 months ago

Delete model-00001-of-00006.safetensors

#1 opened 9 months ago by

danielhanchen

New activity in unsloth/mistral-7b-bnb-4bit 9 months ago

Use in pipeline

#1 opened 9 months ago by

sudhir2016

New activity in unsloth/tinyllama-bnb-4bit 9 months ago

Which TinyLlama version is this?

#1 opened 9 months ago by

gardner

New activity in unsloth/notebooks 10 months ago

Error loading the lora adaters using peft

#1 opened 10 months ago by

carlosatFroom

New activity in TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T 10 months ago

Pickling error - cannot load on transformers==4.37.0.dev0

#3 opened 10 months ago by

danielhanchen

New activity in berkeley-nest/Starling-LM-7B-alpha 10 months ago

Issues in the tokenizer

#25 opened 10 months ago by

Imran1

New activity in danielhanchen/test 10 months ago

Add unsloth tag

#1 opened 10 months ago by

osanseviero

New activity in open-llm-leaderboard/open_llm_leaderboard about 1 year ago

[FLAG] Voicelab/trurl-2-13b: training data surely includes the test data, right?

#202 opened about 1 year ago by

TNTOutburst

New activity in openlm-research/open_llama_7b over 1 year ago

Enable LlamaTokenizerFast and AutoTokenizer to load in seconds rather than 5 minutes.

#1 opened over 1 year ago by

danielhanchen

New activity in openlm-research/open_llama_3b over 1 year ago

Enable LlamaTokenizerFast and AutoTokenizer to load in seconds rather than 5 minutes.

#2 opened over 1 year ago by

danielhanchen

New activity in danielhanchen/open_llama_3b_600bt_preview over 1 year ago

Adding `safetensors` variant of this model

#1 opened over 1 year ago by

SFconvertbot