AI “RealFill” that can generate the best shot from multiple failed photos is overwhelmingly more accurate than Stable Diffusion’s Outpainting

When taking photos, you often can't get the best shot: for example, Photo A cuts off the person's face, while Photo B shows the face but is too dark because of backlighting. With "RealFill," an AI developed by a research team from Google and Cornell University, it is possible to create the best shot afterward from multiple failed photos.


Here is an example of RealFill's processing. On the left of the image below are the reference images used for processing, and on the right is the best shot generated from them. The reference images include full-body shots and photos that show the background, and the best shot is generated by combining elements from each.

In the example below, the reference images include a narrow photo that is not backlit and a photo that covers a wide area but is backlit. From these, RealFill generates a photo that covers a wide area and is not backlit.

Below, from left to right, are the reference images, the real ground-truth photo, the image generated by RealFill, the image generated by Paint by Example, and the image generated by Stable Diffusion. In each generated image, the areas that are not whitened out show the parts that were generated. Looking at the examples, you can see that RealFill depicts the positional relationships of the subject fairly accurately.

At the time of writing, RealFill still has problems, such as generating three-dimensionally incorrect images (for example, a hand that is shorter than it actually is) and generating incorrect text.

in Software, Posted by log1o_hf