Create a VM with a GPU, at least 15 GB of RAM, and 30 GB of disk space. Connect to it over SSH, install git, and clone the [Automatic1111 (A1111) repository](https://github.com/AUTOMATIC1111/stable-diffusion-webui).
After the web UI finishes launching with the `--share` option, you should see a link like `https://xxxxxxxxxxxxxxxx.gradio.live`. Warning: do NOT share this public link. Anyone who has it can use your instance and increase your bill.
Place downloaded models in the `stable-diffusion-webui/models` directory (checkpoint files go in its `Stable-diffusion` subfolder).
## Start Generating
### Image Dimensions (Resolution)
Because different models are trained at different image resolutions, it is best to generate at the resolution the model was trained on. For `SD1.5` use 512x512 and for `SDXL1.0` use 1024x1024. You can vary one of the dimensions slightly without significant issues.
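The rule of thumb above can be sketched as a small helper. This is only an illustration: the function name `pick_dimensions`, the `aspect_ratio` parameter, and the rounding to a multiple of 8 pixels are assumptions made for this example and are not part of the A1111 web UI.

```python
# Native training resolutions as stated in the text above.
NATIVE_RESOLUTION = {"SD1.5": 512, "SDXL1.0": 1024}


def pick_dimensions(model: str, aspect_ratio: float = 1.0, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) for a model.

    One side stays at the native resolution and only the other side is
    stretched, following the advice to vary a single dimension.
    `aspect_ratio` is width / height. Rounding to `multiple` pixels is an
    assumption of this sketch.
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    base = NATIVE_RESOLUTION[model]
    if aspect_ratio >= 1.0:
        width, height = base * aspect_ratio, base   # landscape: widen
    else:
        width, height = base, base / aspect_ratio   # portrait: heighten

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


print(pick_dimensions("SD1.5"))           # (512, 512)
print(pick_dimensions("SD1.5", 1.5))      # (768, 512)
print(pick_dimensions("SDXL1.0", 1.25))   # (1280, 1024)
```

Keeping one side at the native size means at least part of the canvas matches what the model saw during training, and that is why the helper stretches a single dimension.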
`text2img` generates visual content from a textual description. Popular models include [DALL-E](https://openai.com/dall-e-2), [Midjourney](https://www.midjourney.com/home), and [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release).
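As a concrete illustration of text-to-image generation, the sketch below builds a request for the web UI's HTTP API. It assumes the web UI was started with the `--api` option and is reachable at a base URL you supply (for example `http://127.0.0.1:7860` for a local instance). The helper names `build_txt2img_payload` and `txt2img` are hypothetical, and the default of 20 steps is only an example value.

```python
import json
import urllib.request


def build_txt2img_payload(prompt: str, width: int = 512, height: int = 512, steps: int = 20) -> dict:
    """Build the JSON body for a text-to-image request."""
    return {"prompt": prompt, "width": width, "height": height, "steps": steps}


def txt2img(base_url: str, payload: dict) -> list:
    """POST the payload to the txt2img endpoint and return the images.

    Assumes the server exposes /sdapi/v1/txt2img (requires --api).
    The response holds the generated images as base64-encoded strings.
    """
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["images"]


# Building the payload needs no running server.
payload = build_txt2img_payload("a lighthouse at sunset, oil painting")
print(json.dumps(payload, indent=2))
```

With a running instance, `txt2img("http://127.0.0.1:7860", payload)` would return the encoded images, which can be decoded with `base64.b64decode` and written to disk.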
`img2img` refers to the transformation of one image into another, typically maintaining the same content but changing the style or other visual attributes.
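To make the image-to-image idea concrete, the sketch below prepares a request body for the web UI's `/sdapi/v1/img2img` endpoint, again assuming the web UI was started with `--api`. The helper name `build_img2img_payload` and the default values are assumptions for illustration. The `denoising_strength` field controls how far the result may drift from the input: lower values preserve more of the original content, higher values give the prompt more influence.

```python
import base64


def build_img2img_payload(
    image_bytes: bytes,
    prompt: str,
    denoising_strength: float = 0.5,
    width: int = 512,
    height: int = 512,
) -> dict:
    """Build the JSON body for an image-to-image request.

    The source image is sent as a base64 string inside `init_images`.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "denoising_strength": denoising_strength,
        "width": width,
        "height": height,
    }


# Placeholder bytes stand in for a real image file read with open(path, "rb").
example = build_img2img_payload(b"abc", "same scene, watercolor style", denoising_strength=0.4)
print(example["init_images"])  # ['YWJj']
```

The resulting dictionary can be serialized with `json.dumps` and sent as a POST request in the same way as any other JSON body.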