From Auto-Regressive to DDPM: The History of DALL-E (Part 1)
imageGPT (iGPT): Generative Pretraining from Pixels
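How does iGPT reduce the context length? It runs k-means clustering with k=512 on the RGB color space, mapping each pixel from R^{3} to R^{1}, so a pixel becomes a single token instead of three channel values.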
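A minimal sketch of this kind of palette quantization, using plain Lloyd's k-means in pure Python. The function names (`nearest`, `kmeans_palette`, `quantize`), the random toy pixels, and the small k=8 are assumptions chosen to keep the demo fast; they are not taken from the iGPT code.

```python
import random

def nearest(pixel, palette):
    """Index of the palette colour closest to `pixel` (squared Euclidean distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(pixel, palette[i])),
    )

def kmeans_palette(pixels, k, iters=10, seed=0):
    """Plain Lloyd's k-means over RGB triples; returns k centroids."""
    rng = random.Random(seed)
    palette = [tuple(map(float, p)) for p in rng.sample(pixels, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for px in pixels:
            buckets[nearest(px, palette)].append(px)
        # Move each centroid to the mean of its bucket; keep it if the bucket is empty.
        palette = [
            tuple(sum(ch) / len(b) for ch in zip(*b)) if b else palette[i]
            for i, b in enumerate(buckets)
        ]
    return palette

def quantize(pixels, palette):
    """Map each (R, G, B) pixel to a single integer token (R^3 -> R^1)."""
    return [nearest(px, palette) for px in pixels]

# Toy data: 200 random pixels. k=8 keeps the demo fast (the note above uses k=512).
rng = random.Random(1)
pixels = [(rng.randrange(256), rng.randrange(256), rng.randrange(256)) for _ in range(200)]
palette = kmeans_palette(pixels, k=8)
tokens = quantize(pixels, palette)
```

Each entry of `tokens` is one palette index per pixel, so a sequence built from it is three times shorter than one that lists the R, G and B values separately.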
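The difference between the AR and BERT losses: the auto-regressive (AR) loss predicts each pixel from the pixels that precede it, while the BERT loss masks a subset of pixels and predicts them from the unmasked ones. Otherwise the setup is quite similar to GPT.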
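The two objectives can be contrasted with a small sketch. Here `toy_model` is a hypothetical stand-in for the transformer (a smoothed token-frequency estimate), and `VOCAB`, `ar_loss`, and `bert_loss` are illustrative names; only the shape of the two losses is meant to carry over.

```python
import math

VOCAB = 8  # assumed toy vocabulary size (palette size)

def toy_model(visible):
    """Hypothetical stand-in for the network: add-one smoothed token frequencies."""
    counts = [1] * VOCAB
    for t in visible:
        counts[t] += 1
    total = sum(counts)
    return [c / total for c in counts]

def ar_loss(tokens):
    """Mean of -log p(x_i | x_<i): each token is predicted from its prefix only."""
    nll = [-math.log(toy_model(tokens[:i])[t]) for i, t in enumerate(tokens)]
    return sum(nll) / len(nll)

def bert_loss(tokens, mask):
    """Mean of -log p(x_i | unmasked tokens) over the masked positions only."""
    visible = [t for i, t in enumerate(tokens) if i not in mask]
    nll = [-math.log(toy_model(visible)[tokens[i]]) for i in sorted(mask)]
    return sum(nll) / len(nll)

tokens = [0, 1, 1, 2]
print(ar_loss(tokens))         # context grows left to right
print(bert_loss(tokens, {3}))  # context is every unmasked position
```

In the AR case every position contributes to the loss and sees only earlier tokens; in the BERT case only the masked positions contribute, and each one sees all unmasked tokens regardless of order.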
DALL-E: Zero-Shot Text-to-Image Generation