---
license: cc-by-nc-4.0
---
# CLIP
Contrastive Language-Image Pre-training (CLIP) model pre-trained on 2.5 billion image-text pairs curated from CommonCrawl at a resolution of 224x224. It was introduced in the paper [Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020) and reproduced in the follow-up paper [Demystifying CLIP Data](https://arxiv.org/abs/2309.16671).
The weights were converted from the `l14_fullcc2.5b.pt` checkpoint provided in the [original repository](https://github.com/facebookresearch/MetaCLIP).
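Below is a minimal usage sketch, assuming the converted weights are compatible with the standard `transformers` CLIP classes; the repository id used here is a placeholder and may differ from the actual one.
```python
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Placeholder repository id (assumption); replace with the actual model repo.
repo_id = "cs-giung/clip-vit-large-patch14-fullcc2.5b"

model = CLIPModel.from_pretrained(repo_id)
processor = CLIPProcessor.from_pretrained(repo_id)

# Example image and candidate captions for zero-shot classification.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of a cat", "a photo of a dog"],
    images=image,
    return_tensors="pt",
    padding=True,
)

outputs = model(**inputs)
# Image-text similarity scores converted to probabilities over the captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(probs)
```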