You might not have heard about Stable Diffusion. As of this writing, it’s only a few weeks old. Or perhaps you have heard of it and some of the hubbub around it. It is an AI model that can generate images from a text prompt or an input image. Why is it important, how do you use it, and why should you care?
This year we have seen several image generation AIs such as DALL-E 2, Imagen, and even Craiyon. NVIDIA’s Canvas AI lets someone paint a crude image in which different colors represent different elements, such as mountains or water, and then transforms it into a beautiful landscape. What makes Stable Diffusion special? For starters, it is open source under the CreativeML OpenRAIL-M license, which is relatively permissive. Additionally, you can run Stable Diffusion (SD) on your own computer rather than in the cloud through a website or API. The recommended setup is a 3xxx series NVIDIA GPU with at least 6GB of VRAM to get decent results. But due to its open-source nature, patches and tweaks enable it to run CPU-only, on AMD GPUs, or even on Macs.
This touches on the more important thing about SD: the community and energy around it. There are dozens of repos with different features, web UIs, and optimizations. People are training new models or fine-tuning existing ones to better generate different styles of content. There are plugins for Photoshop and Krita. Other models are being incorporated into the flow, such as image upscaling or face correction. The speed at which this has come into existence is dizzying. Right now, it’s a bit of the Wild West.