Daniel van Strien PRO

davanstrien

AI & ML interests

Machine Learning Librarian

davanstrien's activity

upvoted 2 articles about 1 hour ago

Releasing Outlines-core 0.1.0: structured generation in Rust and Python

7

ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models

5
upvoted an article about 2 hours ago

OCR Processing and Text in Image Analysis with DeepSeek Janus-1.3B

1
upvoted 3 articles about 6 hours ago

OCR Processing and Text in Image Analysis with Florence-2-base and Qwen2-VL-2B

9

🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦‍⬛

By anakin87
14

Aria: First Open Multimodal Native MoE Model

By RhymesAI
4
upvoted an article about 10 hours ago
upvoted an article 5 days ago

How to build a custom text classifier without days of human labeling

By sdiazlor
48
upvoted 3 articles 13 days ago

Improving Parquet Dedupe on Hugging Face Hub

27

Faster Assisted Generation with Dynamic Speculation

26

Scaling AI-based Data Processing with Hugging Face + Dask

22
upvoted an article 18 days ago
upvoted an article 20 days ago
upvoted an article 25 days ago
upvoted an article 26 days ago

🌟 Easy Fine-Tuning with Hugging Face SQL Console, Notebook Creator, and SFT

By asoria
12
upvoted an article 28 days ago

Data Is Better Together: A Look Back and Forward

18
upvoted an article about 1 month ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

By manu
139
upvoted an article 2 months ago

⭐ PySpark and 🤗 Hugging Face Parquet Files

By asoria
5
upvoted an article 3 months ago

The case for specialized pre-training: ultra-fast foundation models for dedicated tasks

26
upvoted 2 articles 3 months ago

Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing

17
upvoted an article 3 months ago

Experimenting with Automatic PII Detection on the Hub using Presidio

24