What is Stable Diffusion, how to install and use it

images generated with stable diffusion

This is a guide to learn about Stable Diffusion and how you can use this tool.

The image above was generated with Stable Diffusion from the following text (prompt):

City skyline with skycrapers, by Stanislav Sidorov, digital art, ultra realistic, ultra detailed, photorealistic, 4k, character concept, soft light, blade runner, futuristic

Stable Diffusion is a text-to-image machine learning model: a deep learning model, a form of artificial intelligence, that lets us generate images from text that we provide as input.

It's not the first model or the first tool of this kind; right now there's a lot of talk about DALL-E 2, Midjourney, and Google Imagen, but it is the most important because of what it represents. Stable Diffusion is an Open Source project, so anyone can use and modify it. In version 1.4 the entire pre-trained model comes in a single .ckpt file of about 4 GB, and this is a real revolution.

So much so that just 2 or 3 weeks after its release we already find plugins for Photoshop, GIMP, Krita, WordPress, Blender, etc. Practically every tool that works with images is implementing Stable Diffusion, to the point that even competitors like Midjourney are using it to improve their tools. But it is not only used to build tools: as users, we can install it on our PC and run it to generate images locally.

And being Open Source does not mean that it is less powerful than the others. It is a true wonder. For me, right now it is the best tool we can use to generate images for any project.

Ways to install and use Stable Diffusion

There are different ways to use it. Right now I recommend two. If your computer has the necessary power, that is, a graphics card with about 8 GB of VRAM, install it on your computer. If your hardware is not powerful enough, use a Google Colab; right now I recommend the one by Altryne, because it comes with a graphical interface and is easier to use.

I go into detail below.

Altryne's Colab

This is the option that I recommend if your computer is not powerful enough (GPU with 8 GB of VRAM) or if you want to try it with all its features without having to install anything.

I recommend it because it has a very comfortable graphical interface with many options to control the images and other model tools such as image to image and upscale.

We use the Google Colab created by Altryne, and Google Drive to save the model and the results.

It's all free. I have included a video of the whole process which, as you will see, is very simple.

Install on PC

To install it on your PC you can follow the instructions in its GitHub repository, https://github.com/CompVis/stable-diffusion, or in the version with a graphical interface, which I like much more, https://github.com/AUTOMATIC1111/stable-diffusion-webui. On Windows and Linux you can also use this installer: Stable Diffusion UI v2

As mentioned, you need a powerful GPU with a minimum of 8 GB of VRAM for it to work smoothly. You can run it on the CPU, but it is much slower and will also depend on the processor you have. So if your machine is old you will have to settle for Colab or some paid service to use Stable Diffusion.
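As a rough way to check that requirement before installing anything, here is a minimal sketch that asks the NVIDIA driver how much memory each GPU has. It assumes an NVIDIA card with the nvidia-smi tool available; the function names and the exact threshold are my own choices for illustration, not part of Stable Diffusion.

```python
import shutil
import subprocess

# Roughly the 8 GB guideline from this guide, set slightly below 8192 MiB
# to leave some tolerance in how the total is reported (an assumption).
MIN_VRAM_MIB = 8000


def parse_vram_mib(nvidia_smi_output):
    """Parse one total-memory value (in MiB) per line, skipping anything non-numeric."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_vram_mib():
    """Ask nvidia-smi for the total memory of each GPU; empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mib(result.stdout)


def recommend(vram_mib_per_gpu, minimum=MIN_VRAM_MIB):
    """Return "local" if any GPU meets the minimum, otherwise "colab"."""
    if any(v >= minimum for v in vram_mib_per_gpu):
        return "local"
    return "colab"


if __name__ == "__main__":
    print(recommend(query_vram_mib()))
```

If it prints "colab", the Colab options described in this guide are probably the more comfortable route.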

The advantage of having it on your PC is that it is much faster to use: you only have to install and configure it once, and from then on everything is much quicker.

Another reason I like it a lot is that I can integrate it into other scripts and insert the generated images directly into the workflow of my tasks, which is a very important point.
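To give an idea of what that integration can look like, here is a minimal sketch that calls a locally running AUTOMATIC1111 web UI over HTTP. It assumes the web UI was started with its --api option and is listening on the default local address; the helper names and default values are hypothetical, chosen only for this example.

```python
import base64
import json
import urllib.request

# Assumed default address of a local web UI started with --api.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, steps=30, cfg_scale=7.5, seed=-1, width=512, height=512):
    """Build the JSON body for a text-to-image request (seed -1 means random)."""
    return {
        "prompt": prompt,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "seed": seed,
        "width": width,
        "height": height,
    }


def decode_images(response_body):
    """Turn the base64 strings in the JSON response into raw image bytes."""
    data = json.loads(response_body)
    images = []
    for encoded in data.get("images", []):
        # Tolerate an optional "data:image/png;base64," prefix.
        images.append(base64.b64decode(encoded.split(",", 1)[-1]))
    return images


def txt2img(prompt, url=API_URL, **options):
    """Send the request to the local server and return a list of image bytes."""
    body = json.dumps(build_payload(prompt, **options)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return decode_images(response.read())
```

With the server running, a script can call txt2img("City skyline, digital art") and write each returned item to a .png file, or pass the bytes on to the next step of its own workflow.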

Official Diffusers Colab

It is very similar to the Colab I recommended above and runs almost the same way. You do NOT have to upload the model, but it has no graphical interface, so to change any option you have to edit the code cells and adjust them to what you need.

In addition, we cannot use the image to image option, which is very attractive.

You can access it here: https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb
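To give an idea of which options you end up editing in those code cells, here is a minimal sketch built on the diffusers library. It assumes a recent version of diffusers and torch, a CUDA GPU, and the CompVis/stable-diffusion-v1-4 weights; depending on the version, the notebook may also ask for a Hugging Face access token. The two helper functions are hypothetical names for this example, not part of the notebook.

```python
def generation_options(steps=50, guidance_scale=7.5, height=512, width=512):
    """Collect the settings most often changed between runs."""
    return {
        "num_inference_steps": steps,      # more steps: slower, often more detail
        "guidance_scale": guidance_scale,  # how closely the image follows the prompt
        "height": height,
        "width": width,
    }


def generate(prompt, seed=1024, **overrides):
    """Generate one image; needs torch, diffusers and a CUDA GPU (assumptions)."""
    # Imported here so the options helper above works without these packages.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")
    # A fixed seed makes the result reproducible for the same settings.
    generator = torch.Generator("cuda").manual_seed(seed)
    options = generation_options(**overrides)
    return pipe(prompt, generator=generator, **options).images[0]
```

Calling generate("City skyline, digital art", steps=30) returns an image object that can be saved with its save method; changing the seed or the guidance scale is what the notebook asks you to do by hand in its cells.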

There is a filter for adult images, the famous NSFW filter, but you can deactivate it by creating a cell in the notebook with this code:

# Replacement checker: returns the images unchanged and flags nothing as NSFW
def dummy_checker(images, **kwargs):
    return images, False

pipe.safety_checker = dummy_checker

You have to put it right after the cell

pipe = pipe.to("cuda")

and run it

Colab Stable Diffusion Infinity

In this Colab we can use the Infinity tool, which lets us extend images (outpainting): it creates new content from the existing image. Truly impressive.


DreamBooth with Stable Diffusion

This is the implementation of Google's DreamBooth with Stable Diffusion, which lets you obtain personalized results from a few images of a person, using the face that you give it.

An amazing way to customize images


Other Colabs

You already know how to work in Colab, so I'll leave you others that I come across so you can use the one you like best. If you want, you can even make a copy and modify it to your liking to have your own version.

From its official website

A simple way to use it, much like using DALL-E 2 at OpenAI, but the platform is a paid service. https://stability.ai/

From HuggingFace

An interesting option to test it quickly and generate a few images, just to see how it works, but there are many more options that we will want to use if we get serious about this.


Using AWS or some Cloud service

The Stable Diffusion model can also be run on hardware in the cloud; a classic service is Amazon's AWS. Right now I am testing EC2 instances to work with different algorithms. I'll let you know how it goes.

Other paid services

There are many, and more keep appearing, from integrations in stock photo sites to websites that let us connect through APIs. At the moment this one has caught my attention, although personally I am going to use the free services.

Tools for prompt engineering

Prompt engineering is the part that deals with writing the prompt, that is, the phrase we feed the model so that it generates our images. It is not a trivial matter, and you have to know it well to obtain great results.

A very useful tool for learning is Lexica, where we can see images along with the prompt, the seed, and the guidance scale used to create them.

By browsing around you will learn what kinds of elements to add to the prompt to obtain the type of result you are looking for.
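Prompts like the one at the top of this guide tend to follow a pattern: a subject, an artist, and a list of style modifiers separated by commas. As an illustration only, here is a minimal sketch of a helper that assembles a prompt that way; the function name and the split into subject, artist and modifiers are my own assumptions, not a rule of the model.

```python
def build_prompt(subject, artist=None, modifiers=()):
    """Join a subject, an optional artist and style modifiers into one prompt."""
    parts = [subject.strip()]
    if artist:
        parts.append(f"by {artist.strip()}")
    # Skip empty entries so stray commas do not end up in the prompt.
    parts.extend(m.strip() for m in modifiers if m.strip())
    return ", ".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            "City skyline with skyscrapers",
            artist="Stanislav Sidorov",
            modifiers=["digital art", "ultra detailed", "soft light", "futuristic"],
        )
    )
```

Keeping the pieces separate makes it easy to swap one modifier at a time and compare the results with the same seed.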
