These are latent diffusion transformer models trained from scratch on 100k 256x256 images.

Checkpoint 278k-full_state_dict.pth has been trained for about 500 epochs and is well into overfitting the 100k training images.

The training code and dataset come from this repo: https://github.com/apapiu/transformer_latent_diffusion

See this Reddit post for more information: https://www.reddit.com/r/MachineLearning/comments/198eiv1/p_small_latent_diffusion_transformer_from_scratch/
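A minimal sketch of loading the checkpoint, assuming it was saved with `torch.save` as a plain state dict (the filename is from this repo; the model class and its constructor arguments live in the apapiu/transformer_latent_diffusion repo and are not shown here):

```python
import torch

def load_checkpoint(path, device="cpu"):
    """Load a full state dict saved with torch.save into a plain dict.

    After loading, pass the result to your model's load_state_dict(),
    e.g. model.load_state_dict(load_checkpoint("278k-full_state_dict.pth")).
    The model class itself comes from the training repo linked above.
    """
    # map_location keeps this usable on CPU-only machines even if the
    # checkpoint was saved from a GPU run
    return torch.load(path, map_location=device)
```

This is a sketch under the assumption that the .pth file contains only tensors; if the checkpoint bundles optimizer state or other objects, the returned dict will have extra keys to filter out before calling load_state_dict.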