``SnapFusion'' can generate images with the image generation AI ``Stable Diffusion'' on a smartphone at an explosive speed of under 2 seconds per image



Stable Diffusion, an image generation AI, automatically generates illustrations and photos from text input alone. However, because the generation process repeats computationally heavy processing many times, it normally requires a machine with a high-spec GPU. Researchers at Northeastern University and Snap, the developer of the photo-sharing app Snapchat, announced that they have developed ``SnapFusion'', which lets Stable Diffusion generate images on a smartphone in less than 2 seconds per image.

SnapFusion
https://snap-research.github.io/SnapFusion/

You can see how quickly SnapFusion generates images in the following video.

On-Device Demo for SnapFusion-YouTube


An iPhone running SnapFusion. The prompt 'dog head drawing' is entered to generate an image.



After the prompt is submitted, processing begins.



A picture of a dog appeared a few seconds after processing started.



Next, 'in oil painting' is added to the prompt.



Processing again finishes at explosive speed.



This time the dog was rendered in an oil-painting style.



Generating again with the same prompt produced a different picture of a dog, probably because the seed value changed.
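To illustrate why a changed seed yields a different picture, here is a minimal toy sketch in Python. It is not SnapFusion or Stable Diffusion code: the function `initial_noise` is a hypothetical stand-in for the random starting noise that a diffusion model refines into an image, and the only point is that the same seed reproduces the same starting noise while a different seed does not.

```python
import random


def initial_noise(seed, size=4):
    """Toy stand-in (hypothetical) for the random noise a diffusion model starts from."""
    rng = random.Random(seed)  # independent generator, fully determined by the seed
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(seed=1)
same_b = initial_noise(seed=1)
other = initial_noise(seed=2)

print(same_a == same_b)  # same seed: identical starting noise
print(same_a == other)   # different seed: different starting noise
```

In a real image generator, the same prompt with the same seed and settings would be expected to give the same image for the same reason, which is consistent with the article's guess about the second dog picture.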



The research team proposes a more efficient UNet, obtained by identifying redundancy in the original Stable Diffusion and reducing the amount of computation. In the video, generation uses 8 steps, but the research team claims that both image quality (FID) and relevance to the prompt (CLIP Score) were better than for images generated with 50 steps in version 1.5 of Stable Diffusion.

Below is a graph comparing generation with SnapFusion at the 8-step setting (red) against generation with Stable Diffusion version 1.5 at the 50-step setting (blue). FID, a measure of image quality where lower is better, is on the vertical axis, and you can see that SnapFusion also has a higher relevance to prompts (CLIP Score). With Stable Diffusion version 1.5, it took 1 minute and 24 seconds to generate one image, but with SnapFusion it took only 1.84 seconds.
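As a rough back-of-the-envelope check of the latency figures quoted above, the following Python sketch computes the overall speedup and splits it into the part explained by fewer steps and the part explained by faster steps. The per-step figures are an assumption of this sketch: they divide total latency evenly across steps and ignore fixed costs such as text encoding and image decoding, so they are only approximate.

```python
# Figures quoted in the article
sd15_seconds = 1 * 60 + 24   # Stable Diffusion v1.5: 1 minute 24 seconds per image
sd15_steps = 50
snap_seconds = 1.84          # SnapFusion: 1.84 seconds per image
snap_steps = 8

# Overall speedup per image
speedup = sd15_seconds / snap_seconds

# Assumption: latency is spread evenly over the steps (fixed costs ignored)
sd15_per_step = sd15_seconds / sd15_steps
snap_per_step = snap_seconds / snap_steps

step_reduction = sd15_steps / snap_steps          # gain from running fewer steps
per_step_gain = sd15_per_step / snap_per_step     # gain from each step being cheaper

print(f"overall speedup: {speedup:.1f}x")
print(f"fewer steps:     {step_reduction:.2f}x")
print(f"faster steps:    {per_step_gain:.2f}x")
```

Under these assumptions the two factors multiply to the overall speedup, which suggests that the reduced step count and the more efficient UNet each contribute a comparable share of the improvement.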



``Our role is to democratize content creation by bringing users a powerful diffusion model that generates images from text,'' said the research team.

in Software,   Video, Posted by log1i_yk