Model,Backbone,UMT-FVD↓,UMTScore↑,MTScore↑,CHScore↑,GPT4o-MTScore↑
[ModelScopeT2V](https://huggingface.co/ali-vilab/text-to-video-ms-1.7b),U-Net,194.77,2.909,0.401,11.03,2.86
[ZeroScope](https://huggingface.co/cerspense/zeroscope_v2_576w),U-Net,227.02,2.35,0.4,25.13,2.09
[T2V-Zero](https://github.com/Picsart-AI-Research/Text2Video-Zero),U-Net,209.66,2.661,0.4,1.68,2.55
[LaVie](https://github.com/Vchitect/LaVie),U-Net,166.97,2.763,0.346,8.6,2.46
[AnimateDiff-V3](https://github.com/guoyww/AnimateDiff),U-Net,197.89,2.944,0.467,11.36,2.62
[VideoCrafter2](https://github.com/AILab-CVC/VideoCrafter),U-Net,178.45,2.753,0.433,8.27,2.68
[MCM-MSLION](https://yhzhai.github.io/mcm/),U-Net,202.08,2.33,0.417,14.08,3.04
[MagicTime](https://github.com/PKU-YuanGroup/MagicTime),U-Net,257.56,1.916,0.478,10.66,3.13
[Latte](https://github.com/Vchitect/Latte),DiT,192.12,2.111,0.363,13.81,2.2
[OpenSora 1.1](https://github.com/hpcaitech/Open-Sora),DiT,195.43,2.678,0.444,10.03,2.52
[OpenSora 1.2](https://github.com/hpcaitech/Open-Sora),DiT,166.92,2.781,0.375,4.69,2.56
[OpenSoraPlan v1.1](https://github.com/PKU-YuanGroup/Open-Sora-Plan),DiT,188.53,2.421,0.327,10.35,2.19