Background: `text2img` (text-to-image) models generate images from textual descriptions. Popular models include [DALL-E](https://openai.com/dall-e-2), [Midjourney](https://www.midjourney.com/home), and [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release).