FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model • Paper • 2410.13925 • Published 5 days ago • 19
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities • Paper • 2410.14672 • Published 4 days ago • 4