Sep 02, 2022 17:00:00

'clip-retrieval' lets you search for candidate 'spell-like English sentences' that match the atmosphere of a reference image you want to recreate with image generation AI such as 'Stable Diffusion'

The AI 'Stable Diffusion', which generates images of your choice simply from entered text, has been attracting a great deal of attention since its public release in August 2022. GUI applications that are easy to install on a PC, and methods that let it run without problems even on low-spec PCs, are being devised one after another. However, Stable Diffusion requires you to enter English sentences, which can feel difficult for people who are not good at English. With the system 'clip-retrieval', you can easily search for 'English sentences describing an image' from a reference image and get English sentences to enter into Stable Diffusion in one shot, so I have summarized how to use it.

GitHub - rom1504/clip-retrieval: Easily compute clip embeddings and build a clip retrieval system with them
https://github.com/rom1504/clip-retrieval

Stable Diffusion is an AI that generates images according to instructions entered in English, such as 'a bear playing in the forest' or 'a boy eating ice cream'. Besides trying it on the official demo site, you can use it in a Python environment on a PC equipped with an NVIDIA GPU, or through smartphone applications such as 'AI Picasso'. On a PC equipped with an NVIDIA GPU, 'NMKD Stable Diffusion GUI' lets you generate images without any difficult setup work.

Summary of how to use 'NMKD Stable Diffusion GUI', which lets you easily install the image generation AI 'Stable Diffusion' on Windows for free, including tips for writing spells and generating images - GIGAZINE

As mentioned above, environments that make Stable Diffusion easy to use are gradually becoming available, but at the time of writing, Stable Diffusion does not support instructions in Japanese. Generating images with it therefore requires a good command of English, which is a hurdle for people who are not good at English. clip-retrieval is a system that 'displays similar images together with their descriptions when you enter an image', so you can use it to get hints about which English sentences to enter into Stable Diffusion.
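To give a rough idea of what 'displaying similar images with descriptions' involves, here is a minimal sketch of embedding-based retrieval. It is not clip-retrieval's actual implementation: the three-dimensional vectors are hand-made stand-ins for real CLIP embeddings, the index is a tiny in-memory list, and the names `cosine_similarity`, `nearest_captions` and `TOY_INDEX` are hypothetical. The idea is only that the query image and the indexed images are mapped to vectors, and the captions of the closest vectors are returned.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_captions(query_embedding, index, top_k=3):
    """Return the captions whose embeddings are most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), caption)
        for caption, embedding in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [caption for _, caption in scored[:top_k]]


# Toy index: (caption, made-up embedding). Real embeddings would come from a CLIP model.
TOY_INDEX = [
    ("happy woman in sweater sitting on floor with labrador near christmas tree",
     [0.9, 0.8, 0.1]),
    ("a bear playing in the forest", [0.1, 0.2, 0.9]),
    ("a boy eating ice cream", [0.2, 0.9, 0.3]),
]

# Made-up embedding standing in for the reference image.
query = [0.85, 0.75, 0.15]
print(nearest_captions(query, TOY_INDEX, top_k=2))
```

A real system would search millions of embeddings with an approximate nearest-neighbor index rather than the brute-force loop used here.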

clip-retrieval can also be installed and used locally, but this time I will use 'Clip front', a web application published by the developer of clip-retrieval. First, click the link below to access Clip front.

Clip front
https://rom1504.github.io/clip-retrieval/

When you access Clip front, you will see a menu on the left and an input field at the top. Since the input this time is an image, click the camera icon at the top right of the screen.

When the file selection dialog is displayed, select the image you want to input.

This time, I chose the following 'Photo of a woman and a dog playing in front of a Christmas tree'.

When you input an image, images similar to it are displayed in rows together with their descriptions. Some descriptions are written in languages other than English.
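Because some of the returned text is not in English, it can help to narrow the results down before picking a sentence for Stable Diffusion. The sketch below assumes the results have already been collected into a list of dictionaries with `caption` and `similarity` keys, which is a hypothetical shape chosen for illustration. The ASCII check is only a crude stand-in for real language detection, and apart from the one caption quoted in this article, the sample data is made up.

```python
def pick_prompt_candidates(results, limit=5):
    """Return up to `limit` ASCII-only captions, most similar first, without duplicates."""
    seen = set()
    candidates = []
    for item in sorted(results, key=lambda r: r["similarity"], reverse=True):
        if len(candidates) >= limit:
            break
        # Collapse repeated whitespace so near-identical captions compare equal.
        caption = " ".join(item["caption"].split())
        key = caption.lower()
        # Skip empty captions, captions with non-ASCII characters, and duplicates.
        if not caption or not caption.isascii() or key in seen:
            continue
        seen.add(key)
        candidates.append(caption)
    return candidates


# Made-up sample results in the assumed shape.
SAMPLE_RESULTS = [
    {"caption": "happy woman in sweater sitting on floor with labrador near christmas tree",
     "similarity": 0.92},
    {"caption": "クリスマスツリーの前で犬と遊ぶ女性", "similarity": 0.90},
    {"caption": "Happy woman in sweater sitting on floor  with labrador near christmas tree",
     "similarity": 0.88},
    {"caption": "woman and dog by the christmas tree", "similarity": 0.81},
]

for caption in pick_prompt_candidates(SAMPLE_RESULTS):
    print(caption)
```

An ASCII-only filter will still let through unaccented text in other languages and will drop English captions that contain accented characters, so the output is best treated as a shortlist to read through rather than a final answer.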

Hover your mouse over the image to see the full description.

Also, when you click the magnifying glass icon ......

The description is entered into the input field, and images matching that description are searched for. This time, the description 'happy woman in sweater sitting on floor with labrador near christmas tree' was displayed. You can copy the description shown in the input field and use it for image generation with Stable Diffusion.

Generating an image with the description above produced an image of 'a woman in a sweater sitting on the floor with her dog near the Christmas tree', as shown below. clip-retrieval is worth trying for anyone who has a reference image for what they want to generate but cannot come up with suitable English sentences.

Sep 02, 2022 17:00:00 in Review, Web Application, Posted by log1o_hf