Background: `text2img` can be thought of as generating visual content based on textual descriptions. Popular models include [DALL-E](https://openai.com/dall-e-2), [Midjourney](https://www.midjourney.com/home), and [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release).
![](images/sd-working.jpg)
![](images/sd-latent-space.jpg)
Go to Google Cloud (or any other cloud provider) and open the [console](https://console.cloud.google.com). Create a VM with a T4 GPU, 15GB RAM, and a 30GB disk. Connect to the VM over SSH, install git, and clone the [Automatic1111 (A1111) repository](https://github.com/AUTOMATIC1111/stable-diffusion-webui). `cd` into the cloned directory and run:
```
$ bash ./
```
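The install-and-clone steps that come before that command can be sketched roughly as below. This is a minimal sketch, not the only way to do it: it assumes a Debian or Ubuntu VM image (so `apt-get` is available), and the `setup_webui` function name and the `REPO_URL` / `REPO_DIR` variables are made up for illustration. The steps are wrapped in a function so nothing is installed or downloaded until you call it.

```shell
# Assumption: Debian/Ubuntu image with apt-get; swap in your distro's package manager otherwise.
REPO_URL="https://github.com/AUTOMATIC1111/stable-diffusion-webui"
# git clone names the directory after the last path component of the URL.
REPO_DIR="$(basename "$REPO_URL")"

# Hypothetical helper: installs git if missing, clones the repo once, then enters it.
setup_webui() {
  if ! command -v git >/dev/null 2>&1; then
    sudo apt-get update && sudo apt-get install -y git
  fi
  # Skip the clone when the directory already exists (e.g. on a second run).
  if [ ! -d "$REPO_DIR" ]; then
    git clone "$REPO_URL" || return 1
  fi
  cd "$REPO_DIR" || return 1
}
```

Paste this into the SSH session and call `setup_webui`; it leaves you inside the cloned directory, ready for the launch command above.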
## text2img