metadata
license: apache-2.0
language:
  - en
base_model:
  - deepseek-ai/deepseek-vl-1.3b-chat
pipeline_tag: image-to-text

Deepseek-VL-1.3b-chat-4bit


Overview

Deepseek-VL-1.3b-chat-4bit is a 4-bit quantized version of deepseek-ai/deepseek-vl-1.3b-chat, a multimodal model that combines visual and linguistic processing capabilities. Quantizing the weights to 4 bits significantly reduces the model's size while aiming to preserve the base model's performance.

Model Details

  • Model Type: Multimodal Causal Language Model
  • Base Model Size: 1.3 billion parameters
  • Quantized Size: Approximately 1.72 GB (reduced from the original size)
  • Files Included:
    • config.json: Model configuration file.
    • model.safetensors: The quantized model weights.
    • preprocessor_config.json: Configuration for the preprocessor.
    • processor_config.json: Configuration for the processor.
    • special_tokens_map.json: Mapping for special tokens used in the tokenizer.
    • tokenizer.json: Tokenizer configuration.
    • tokenizer_config.json: Additional tokenizer settings.
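
As a quick sanity check after downloading, the file list above can be compared against a local directory. This is a minimal sketch: the function name missing_model_files and the local path are hypothetical, and only the file names come from the list above.

```python
from pathlib import Path

# File names taken from the "Files Included" list above.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "preprocessor_config.json",
    "processor_config.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_model_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical local path; point this at wherever the files were saved.
    missing = missing_model_files("./Deepseek-VL-1.3b-chat-4bit")
    print("missing files:", missing if missing else "none")
```

An empty result means every listed file is present; it does not verify that the file contents are intact.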

Quantization

Quantization reduces model size, and can improve inference speed, by storing weights at lower precision. Here the model was quantized to 4 bits, so each quantized weight is represented with 4 bits instead of the typical 16 or 32. This results in:

  • Size Reduction: The model size has been reduced from several gigabytes to approximately 1.72 GB.
  • Performance: The quantized model is intended to retain a high level of accuracy while using less memory, making it suitable for deployment in environments with limited resources.
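
To make the idea concrete, here is a minimal sketch of 4-bit quantization in plain Python. It uses simple symmetric absmax rounding, which is an assumption chosen for illustration; this card does not state which quantization scheme was actually used for this model, and all function names below are hypothetical.

```python
def estimate_weight_bytes(num_params, bits_per_weight):
    """Rough storage needed for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


def quantize_4bit(weights):
    """Map floats to signed 4-bit codes in [-7, 7] using one shared scale."""
    # Fall back to 1.0 when all weights are zero, to avoid dividing by zero.
    scale = max(abs(w) for w in weights) / 7 or 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.7, -0.35, 0.1, 0.0]
    codes, scale = quantize_4bit(weights)
    restored = dequantize(codes, scale)
    print("codes:", codes)
    print("max error:", max(abs(w - r) for w, r in zip(weights, restored)))

    # Weights-only estimate for 1.3 billion parameters.
    print("16-bit bytes:", estimate_weight_bytes(1_300_000_000, 16))
    print("4-bit bytes:", estimate_weight_bytes(1_300_000_000, 4))
```

The estimate covers only weights stored at the given precision. A real checkpoint also holds scaling factors and may keep some layers at higher precision, so the file on disk can be larger than this figure.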

Installation

To use the Deepseek-VL-1.3b-chat-4bit model, follow these steps:

  1. Install the Required Libraries:
    pip install transformers huggingface-hub