Lysandre
lysandre
275 followers · 163 following
http://lysand.re
LysandreJik
LysandreJik
AI & ML interests
chief open-source officer @ hf
Articles
Fixing Gradient Accumulation • 7 days ago • 32
License to Call: Introducing Transformers Agents 2.0 • May 13 • 114
We are hiring interns! • Nov 29, 2022 • 6
Hugging Face on PyTorch / XLA TPUs • Feb 9, 2021 • 1
Organizations
lysandre's activity
New activity in
lysandre/ctrl-clone-2
15 days ago
qwefewf
#3 opened 15 days ago by
SFconvertbot
asdasd
#2 opened 11 months ago by
SFconvertbot
New activity in
lysandre/tiny-bert-random
15 days ago
asdasd
#1 opened 15 days ago by
SFconvertbot
New activity in
THUDM/CogVideoX-5b
about 2 months ago
Update transformers version
#1 opened about 2 months ago by
lysandre
New activity in
mistralai/Mistral-Large-Instruct-2407
3 months ago
consolidated vs model safetensors - what's the difference?
15
#9 opened 3 months ago by
jukofyork
Transformers implementation
#1 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-405B
3 months ago
Update tokenizer to prepend special token
#12 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-70B
3 months ago
Update tokenizer to prepend special token
1
#11 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-405B-FP8
3 months ago
Update tokenizer to prepend special token
#12 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-8B
3 months ago
Update tokenizer to prepend special token
1
#12 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-405B-Instruct
3 months ago
Upload tokenizer
1
#9 opened 3 months ago by
ArthurZ
New activity in
meta-llama/Llama-3.1-70B-Instruct
3 months ago
Upload tokenizer
1
#12 opened 3 months ago by
ArthurZ
New activity in
meta-llama/Llama-3.1-8B-Instruct
3 months ago
Upload tokenizer
2
#29 opened 3 months ago by
ArthurZ
New activity in
meta-llama/Llama-3.1-405B-Instruct-FP8
3 months ago
Upload tokenizer
1
#9 opened 3 months ago by
ArthurZ
New activity in
meta-llama/Llama-3.1-70B-Instruct
3 months ago
configuration-changes
#1 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-405B-Instruct
3 months ago
Update original/mp16/README.md
#1 opened 3 months ago by
marcsun13
Update original/mp8/README.md
#2 opened 3 months ago by
marcsun13
New activity in
meta-llama/Llama-3.1-405B
3 months ago
Update original/mp16/README.md
#5 opened 3 months ago by
marcsun13
Update original/mp8/README.md
#4 opened 3 months ago by
marcsun13
New activity in
meta-llama/Llama-3.1-8B
3 months ago
Have saner defaults in the generation config
#4 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-70B
3 months ago
Have saner defaults in the generation config
#3 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-405B
3 months ago
Have saner defaults in the generation config
#3 opened 3 months ago by
lysandre
Have saner defaults in the generation config
#2 opened 3 months ago by
lysandre
New activity in
meta-llama/Llama-3.1-405B-FP8
3 months ago
Have saner defaults in the generation config
#5 opened 3 months ago by
lysandre
New activity in
yentinglin/Llama-3-Taiwan-8B-Instruct-128k
3 months ago
TGI model serving errors
6
#4 opened 4 months ago by
wennycooper
New activity in
shenzhi-wang/Gemma-2-27B-Chinese-Chat
4 months ago
Default to eager attention
2
#1 opened 4 months ago by
lysandre
New activity in
google/gemma-2-27b-it
4 months ago
Default to 'eager' attention implementation
3
#22 opened 4 months ago by
lysandre
New activity in
google/gemma-2-27b
4 months ago
Default attention to eager implementation
#12 opened 4 months ago by
lysandre
New activity in
google/gemma-2-27b-it
4 months ago
Default to eager implementation
#21 opened 4 months ago by
lysandre
New activity in
google/gemma-2-27b
4 months ago
Default attention to eager implementation
#11 opened 4 months ago by
lysandre
New activity in
google/gemma-2-9b-it
4 months ago
it looks it do not work as expected , see below
11
#17 opened 4 months ago by
Sakura77
New activity in
google/gemma-2-9b
4 months ago
ValueError: Transformers does not recognize this architecture.
5
#15 opened 4 months ago by
mike202303
New activity in
google/gemma-2-27b
4 months ago
The base model doesn't generate coherently
4
#9 opened 4 months ago by
migtissera
New activity in
google/gemma-2-27b-it
4 months ago
How can I get results similar to those from Google AI Studio locally?
2
#14 opened 4 months ago by
nitky
New activity in
google/gemma-2-9b-it
4 months ago
"It is strongly recommended to train Gemma2 models with the `eager` attention implementation "
2
#10 opened 4 months ago by
JaronTHU
error of ATen\native\cuda\IndexKernel.cu
6
#14 opened 4 months ago by
koromatsu
nonsense response when bsz>1
5
#16 opened 4 months ago by
jinjieni
New activity in
google/gemma-2-9b
4 months ago
Can't repro MMLU: sliding window attention implementation seems broken
3
#11 opened 4 months ago by
dzhulgakov
TypeError: arange() received an invalid combination of arguments
4
#12 opened 4 months ago by
darrenbudiman
Model repeating information and "spitting out" random characters
8
#14 opened 4 months ago by
brazilianslib
New activity in
hf.rst.imokbook-images
5 months ago
Upload agents_db5.png
1
#15 opened 5 months ago by
m-ric
New activity in
facebook/blenderbot-3B
6 months ago
Updates incorrect tokenizer configuration file
#7 opened 8 months ago by
lysandre
New activity in
microsoft/Phi-3-mini-128k-instruct
6 months ago
About Transformers version
2
#58 opened 6 months ago by
AllenChai
New activity in
distilbert/distilbert-base-multilingual-cased
6 months ago
Updates incorrect tokenizer configuration file
#5 opened 8 months ago by
lysandre
New activity in
distilbert/distilbert-base-german-cased
6 months ago
Updates incorrect tokenizer configuration file
#4 opened 8 months ago by
lysandre
New activity in
distilbert/distilbert-base-uncased-distilled-squad
6 months ago
Updates incorrect tokenizer configuration file
#8 opened 8 months ago by
lysandre
New activity in
distilbert/distilbert-base-cased-distilled-squad
6 months ago
Updates incorrect tokenizer configuration file
#10 opened 8 months ago by
lysandre
New activity in
distilbert/distilbert-base-cased
6 months ago
Updates incorrect tokenizer configuration file
#8 opened 8 months ago by
lysandre
New activity in
distilbert/distilbert-base-uncased
6 months ago
Updates incorrect tokenizer configuration file
#12 opened 8 months ago by
lysandre
New activity in
openai-community/gpt2
7 months ago
model output
2
#86 opened 7 months ago by
foxsilverfox
🚩 Report
#87 opened 7 months ago by
beerbubbles
New activity in
facebook/wav2vec2-xls-r-1b-21-to-en
7 months ago
Incorrect config file
4
#5 opened 7 months ago by
shrey-jasuja
New activity in
facebook/xlm-roberta-xl
7 months ago
Adding `safetensors` variant of this model
1
#3 opened 7 months ago by
SFconvertbot
New activity in
lysandre/bert-test
7 months ago
shhhhh
#3 opened 7 months ago by
SFconvertbot
nononon
#2 opened 7 months ago by
SFconvertbot
New activity in
openai-community/gpt2
7 months ago
OSError: gpt2 does not appear to have a file named config.json. Checkout 'https://huggingface.co/gpt2/None' for available files.
9
#59 opened over 1 year ago by
MorphzZ
New activity in
FacebookAI/roberta-large-mnli
7 months ago
How to finetune this model on RTE, MRPC and SST datasets in GLUE benchmark?
1
#9 opened 8 months ago by
zhai1010
New activity in
google/flan-t5-xxl
7 months ago
ValueError: Need either a `state_dict` or a `save_folder` containing offloaded weights.
5
#53 opened over 1 year ago by
tuannguyends
New activity in
google/gemma-7b-it
7 months ago
Difficulty importing Pipeline - AttributeError: module 'keras._tf_keras.keras' has no attribute '__internal__'
8
#71 opened 8 months ago by
mqureshi
New activity in
open-source-metrics/stars
8 months ago
Fix splits
#2 opened 8 months ago by
lhoestq