---
license: mit
---

# CLIP

A Contrastive Language-Image Pre-training (CLIP) model pre-trained on LAION-2B at a resolution of 224x224. The CLIP approach was introduced in the paper [Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020) and reproduced at LAION-2B scale in the follow-up paper [Reproducible scaling laws for contrastive language-image learning](https://arxiv.org/abs/2212.07143).
The weights were converted from `laion/CLIP-ViT-g-14-laion2B-s34B-b88K`, part of the [OpenCLIP LAION-2B collection](https://huggingface.co/collections/laion/openclip-laion-2b-64fcade42d20ced4e9389b30).
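
## Usage

A minimal zero-shot image classification sketch, shown here with the original OpenCLIP checkpoint loaded via the `open_clip` library (the converted weights in this repo may load differently depending on the target framework); `image.jpg` and the candidate prompts are placeholders.

```python
import torch
import open_clip
from PIL import Image

# Load the original OpenCLIP checkpoint from the Hugging Face Hub.
model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:laion/CLIP-ViT-g-14-laion2B-s34B-b88K"
)
tokenizer = open_clip.get_tokenizer("hf-hub:laion/CLIP-ViT-g-14-laion2B-s34B-b88K")
model.eval()

# Preprocess one image (placeholder path) and a few candidate labels.
image = preprocess(Image.open("image.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog", "a photo of a car"])

with torch.no_grad():
    # Embed both modalities and L2-normalize so cosine similarity
    # reduces to a dot product.
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Scaled similarities -> probabilities over the candidate labels.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # e.g. tensor([[0.98, 0.01, 0.01]])
```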