Summary: how to use ``Prompt matrix'' and ``X/Y plot'' in ``Stable Diffusion web UI (AUTOMATIC1111 version)'' to see at a glance how changing prompts, spells, and parameters affects the output of the image generation AI ``Stable Diffusion''
Setting up the image generation AI Stable Diffusion used to require, in addition to a PC equipped with an NVIDIA GPU, knowledge of Python, Anaconda, and so on, which made it somewhat difficult to install in a local environment. However, since it was released to the
GitHub - AUTOMATIC1111/stable-diffusion-webui: Stable diffusion web UI
https://github.com/AUTOMATIC1111/stable-diffusion-webui
The following article summarizes how to install Stable Diffusion web UI (AUTOMATIC1111 version) in a local environment.
Image generation AI ``Stable Diffusion'' works even with 4 GB GPU & various functions such as learning your own pattern can be easily operated on Google Colabo or Windows Definitive edition ``Stable Diffusion web UI (AUTOMATIC 1111 version)'' installation method summary - GIGAZINE
You can also learn the basic usage of Stable Diffusion web UI (AUTOMATIC1111 version) by reading the following article.
Basic usage of ``Stable Diffusion web UI (AUTOMATIC 1111 version)'' that can easily use ``GFPGAN'' that can clean the face that tends to collapse with image generation AI ``Stable Diffusion''-GIGAZINE
Start Stable Diffusion web UI (AUTOMATIC1111 version). There is a pull-down called 'Script' in the lower left.
Select 'Prompt matrix' from the 'Script' pull-down.
Prompt matrix is a function that generates images for every combination of the strings (prompts) you enter, provided that the keywords, which are normally separated by ',', are separated by '|' instead. For example, you would normally generate an image with the prompt 'a busy city street in a modern city, illustration, cinematic lighting', but with Prompt matrix, entering 'a busy city street in a modern city | illustration | cinematic lighting' generates images that combine 'a busy city street in a modern city' with 'illustration' and 'cinematic lighting'.
To try it, enter 'a busy city street in a modern city | illustration | cinematic lighting' as the prompt and click 'Generate'.
Then, the following image was generated.
With the first prompt, 'a busy city street in a modern city', as the base, images are output in four patterns depending on whether 'illustration' and 'cinematic lighting' are included. The upper left image, where both 'illustration' and 'cinematic lighting' are left out, is generated from 'a busy city street in a modern city' alone. The upper right combines the base with only 'illustration', the lower left combines it with only 'cinematic lighting', and the lower right combines it with both 'illustration' and 'cinematic lighting'.
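The four patterns described above can be sketched in a few lines of Python. This is only an illustration of the on/off combination idea, not the web UI's own code: the function name prompt_matrix and the choice to join the selected parts with ', ' are assumptions made for this example.

```python
def prompt_matrix(prompt: str) -> list[str]:
    """Expand a '|'-separated prompt into every on/off combination.

    Hypothetical helper for illustration; the first element is treated
    as the base prompt and every later element as an optional add-on.
    """
    base, *options = [part.strip() for part in prompt.split("|")]
    combos = []
    for mask in range(2 ** len(options)):
        # Bit j of mask decides whether option j is appended to the base.
        chosen = [opt for j, opt in enumerate(options) if mask & (1 << j)]
        # Joining with ', ' is an assumption, mirroring a normal prompt.
        combos.append(", ".join([base] + chosen))
    return combos


for p in prompt_matrix(
    "a busy city street in a modern city | illustration | cinematic lighting"
):
    print(p)
```

Read in order, the four printed prompts correspond to the upper left, upper right, lower left, and lower right images in the grid described above.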
The following image is the result of running a prompt made of four elements, 'A very beautiful girl | full body | long golden hair | sky blue eye', through Prompt matrix. With the leading 'A very beautiful girl' as the base, there are four columns (nothing added, only 'full body', only 'long golden hair', and both 'full body' and 'long golden hair') and two rows (one without 'sky blue eye' and one with 'sky blue eye'), for a total of 8 images.
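The image counts follow from each optional element being either present or absent, so n optional elements give 2^n images. The sketch below computes that count and a grid shape. The row/column split used here is an assumption chosen only because it agrees with the two grids described in this article (2 x 2 for two options, 2 rows x 4 columns for three), and matrix_grid_shape is a hypothetical name.

```python
def matrix_grid_shape(prompt: str) -> tuple[int, int, int]:
    """Return (rows, columns, total images) for a '|'-separated prompt."""
    parts = [p.strip() for p in prompt.split("|")]
    n_options = len(parts) - 1      # everything after the base prompt
    total = 2 ** n_options          # each option is either on or off
    # Assumed split: it reproduces the grids described in this article,
    # but is not confirmed as the web UI's actual layout rule.
    rows = 2 ** (n_options // 2)
    return rows, total // rows, total


print(matrix_grid_shape(
    "A very beautiful girl | full body | long golden hair | sky blue eye"
))  # (2, 4, 8)
```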
Even if you change the Batch count and Batch size, only one image is generated for each combination, unlike the 'X/Y plot' described next.
Next, try using 'X/Y plot'. Select 'X/Y plot' from the 'Script' pull-down.
Two selection items, 'X Type' and 'Y Type', then appear.
For each, you can choose from 'Seed', 'Steps', 'CFG Scale' (a variable that brings the image closer to the prompt the higher it is set), 'Prompt S/R' (search and replace within the prompt), and 'Sampler'.
This time, enter 'The
The generated image looks like this. The columns are painter styles, and the rows are sampler types. The difference in painting style between painters is easy to see, but the difference between samplers is also quite clear. The Vermeer-style and Modigliani-style compositions are almost the same, but the brushwork differs greatly. Also, for the Picasso style, only the PLMS sampler produces a more subdued composition for some reason.
This time, I chose 'Steps' for X Type and specified '10,20,30', chose 'Seed' for Y Type and specified '123,456,789,123456', and generated with the same prompt. Changing the seed value can produce a completely different picture. Also, as the number of generation steps increases, the amount of fine detail in the picture appears to increase.
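Conceptually, an X/Y plot runs one generation for every pairing of an X value with a Y value. The following sketch enumerates those pairings for the Steps and Seed values used here. The helper name xy_cells is hypothetical, and the layout (X Type across the columns, Y Type down the rows) is an assumption about how the grid is arranged.

```python
from itertools import product


def xy_cells(x_type, x_values, y_type, y_values):
    """List the settings for each grid cell, row by row.

    Assumes Y Type selects the row and X Type selects the column.
    """
    return [{y_type: y, x_type: x} for y, x in product(y_values, x_values)]


cells = xy_cells("Steps", [10, 20, 30], "Seed", [123, 456, 789, 123456])
print(len(cells))  # 12 cells: 3 step counts x 4 seeds
for cell in cells[:3]:
    print(cell)    # the first row: seed 123 at 10, 20, and 30 steps
```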
Next, I selected 'CFG Scale' for Y Type, specified '7,8,9,10', and generated the image without changing the prompt or X Type. At 10 generation steps, you can clearly see that the generated image gets much closer to the Milkmaid as the CFG scale increases. At 30 generation steps, on the other hand, increasing the CFG scale does not change the output significantly.
Also, by changing the Batch count and Batch size in the X/Y plot, it is possible to generate multiple images per cell. The image below was generated with the prompt 'beautiful woman with braided blond hair, red eyes, wearing a camisole, sitting on her bed, highly detailed, in the style of and ilya kuvshinov and greg rutkowski and shinkai makoto, kawaii, high quality anime artstyle, intricate', with 'Steps' selected for X Type and set to '40,80', 'CFG Scale' selected for Y Type and set to '7,9', and Batch count and Batch size each set to 4.
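As a rough check on how many images such a run produces, the sketch below multiplies the number of cells by the images per cell. It assumes each cell yields Batch count x Batch size images, which is not stated in this article, and xy_image_count is a hypothetical helper.

```python
def xy_image_count(x_values, y_values, batch_count, batch_size):
    """Return (cells, images per cell, total images) for an X/Y plot run."""
    cells = len(x_values) * len(y_values)
    # Assumption: every cell produces batch_count * batch_size images.
    per_cell = batch_count * batch_size
    return cells, per_cell, cells * per_cell


# Steps '40,80' on X, CFG Scale '7,9' on Y, Batch count 4, Batch size 4.
print(xy_image_count([40, 80], [7, 9], batch_count=4, batch_size=4))
# (4, 16, 64)
```

Under that assumption, runs like this grow quickly, so it may be worth starting with small batch settings.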
To generate an image closer to your ideal with Stable Diffusion, you have to combine various keywords into a prompt and adjust detailed settings such as the number of steps and the CFG scale. However, if you make full use of 'Prompt matrix' and 'X/Y plot', you can more easily find the ideal prompts and settings that you would otherwise only discover by generating many times. If you are struggling to create an image you like, please give them a try.
Starting next time, we plan to explain, one by one, the functions of 'img2img', which automatically generates images from an input image.
・Continued
How to use ``CLIP interrogator'' that can decompose and display what kind of prompt / spell was from the image automatically generated by image generation AI ``Stable Diffusion''-GIGAZINE
in Review, Software, Web Application, Art, Posted by log1i_yk