Google develops a "virtual photographer" that uses deep learning to automatically generate professional-level photos



Machine learning is used in many fields, and it is particularly effective for tasks with a clear right or wrong answer, such as whether an object in an image is recognized correctly or whether a sentence is translated properly from one language to another. By contrast, it has been considered difficult to apply to subjective concepts for which an objective evaluation is hard to obtain, such as whether a photograph is beautiful. Nevertheless, Google has succeeded in building an experimental deep-learning system for creating artistic content.

Research Blog: Using Deep Learning to Create Professional-Level Photographs
https://research.googleblog.com/2017/07/using-deep-learning-to-create.html

Google's system for creating artistic content can automatically generate photos at the level of a professional photographer. Its core is built on the idea of imitating a professional photographer's workflow: it searches landscape panoramas from Google Street View for the best composition, then applies various processing steps to produce an aesthetically pleasing image. Google calls this system a "virtual photographer".

Google's virtual photographer created its images from about 40,000 panoramas of areas such as the Alps, Banff and Jasper National Parks, and Yellowstone National Park. When professional photographers reviewed the images it created, the system succeeded in earning the assessment that they were "close to professional quality".

Aesthetics can apparently be modeled from a data set, but simply training on the data in the usual way to make photos better may damage part of what makes an image aesthetically pleasing. The content of training therefore has to be managed so that the system correctly learns the various aspects of aesthetics. To that end, Google reportedly used collections of professional-quality photos to break aesthetics down automatically into multiple aspects, and trained on each aspect individually with pictures suited to it. As a result, the virtual photographer judges what makes a picture optimal (that is, beautiful) aspect by aspect, for example composition, saturation, and HDR level.
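The aspect-by-aspect idea can be illustrated with a minimal sketch. This is not Google's implementation: the function `tune_per_aspect`, the candidate values, and the toy scoring functions are all hypothetical, and the real system uses learned models rather than hand-written scorers. The sketch only shows the structure of tuning one parameter at a time, each against its own scorer.

```python
from typing import Callable, Dict, Sequence

Params = Dict[str, float]


def tune_per_aspect(
    params: Params,
    candidates: Dict[str, Sequence[float]],
    scorers: Dict[str, Callable[[Params], float]],
) -> Params:
    """Tune each aspect in turn, keeping the value its own scorer likes best.

    All names here are illustrative assumptions; the article does not
    describe the actual search procedure.
    """
    tuned = dict(params)
    for aspect, values in candidates.items():
        score = scorers[aspect]
        # Try every candidate for this aspect while holding the others fixed.
        tuned[aspect] = max(values, key=lambda v: score({**tuned, aspect: v}))
    return tuned


# Toy stand-ins for learned per-aspect models (assumptions, not real data).
toy_scorers = {
    "saturation": lambda p: -(p["saturation"] - 1.2) ** 2,
    "hdr": lambda p: -(p["hdr"] - 0.5) ** 2,
}
toy_candidates = {
    "saturation": [0.8, 1.0, 1.2, 1.4],
    "hdr": [0.0, 0.5, 1.0],
}
best = tune_per_aspect({"saturation": 1.0, "hdr": 0.0}, toy_candidates, toy_scorers)
```

Handling one aspect at a time keeps each search small, at the cost of ignoring interactions between aspects.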

The following picture shows the virtual photographer's process step by step: (a) a landscape panorama from Google Street View found by the system, (b) the image from (a) cropped to a suitable composition, (c) the image from (b) with saturation and HDR strength adjusted, and (d) the image from (c) with a mask applied. In other words, the virtual photographer processes an image one aspect at a time, using the multiple aspects of aesthetics that the system has learned.
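The stages (a) to (d) can be sketched as a small pipeline. This is a simplified illustration in pure Python under several assumptions: images are nested lists of RGB tuples with values in [0, 1], the crop box and saturation factor are given rather than learned, the HDR step is omitted, and the "mask" is approximated as a per-pixel brightness multiplier. None of the function names come from Google's system.

```python
import colorsys


def crop(image, top, left, height, width):
    """Step (b): cut a rectangular composition out of the source image."""
    return [row[left:left + width] for row in image[top:top + height]]


def adjust_saturation(image, factor):
    """Step (c), saturation only: scale each pixel's saturation in HSV space."""
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            new_row.append(colorsys.hsv_to_rgb(h, min(1.0, s * factor), v))
        out.append(new_row)
    return out


def apply_brightness_mask(image, mask):
    """Step (d), as an assumed stand-in: multiply each pixel by a mask weight."""
    return [
        [tuple(min(1.0, c * m) for c in px) for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def virtual_photographer_sketch(panorama, crop_box, saturation, mask):
    """Chain the stages in the order the article describes."""
    framed = crop(panorama, *crop_box)
    toned = adjust_saturation(framed, saturation)
    return apply_brightness_mask(toned, mask)


# Usage with a tiny 4x4 gray "panorama" (made-up input for illustration).
panorama = [[(0.5, 0.5, 0.5)] * 4 for _ in range(4)]
result = virtual_photographer_sketch(
    panorama, crop_box=(1, 1, 2, 2), saturation=1.5, mask=[[1.0, 0.5], [0.5, 1.0]]
)
```

Keeping each stage as a separate function mirrors the article's point that every aspect is handled on its own.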


The following are examples of the photographs the virtual photographer can actually generate. In each pair, the top image was generated by the virtual photographer and the bottom image is the original Google Street View photo.

Jasper National Park, Canada


Interlaken, Switzerland


Parco delle Orobie Bergamasche, Italy


Jasper National Park, Canada (another example)


To judge how good the virtual photographer's algorithm is, Google designed an experiment resembling a "Turing test". In the experiment, professional photographers were shown photos created by the virtual photographer together with other photos and asked to score each one. The scale has four levels: "1" is a snapshot taken on auto with no consideration for composition or lighting, "2" is a good photo with nothing especially notable about it, "3" is semi-pro level, and "4" is pro level.

The following graph shows the scores that professional photographers gave to images generated by the virtual photographer, grouped by the score the system predicted for each image. The predicted score range for each line is 1.5 to 2.5 for blue, 2.5 to 2.7 for green, 2.7 to 2.9 for red, and 2.9 to 3.1 for purple. For the images with the highest predicted scores, that is, the most successful images, more than 40% of the professional photographers rated them as semi-pro or pro level.
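The grouping behind such a graph can be sketched as follows. The bin edges and the four-level scale come from the text above; everything else is an assumption for illustration, including the function names, the choice to treat each range as including its lower bound and excluding its upper bound (the article gives shared endpoints without saying which side they belong to), and the `(predicted_score, human_rating)` input format.

```python
# Predicted-score ranges per line color, as listed in the article.
BINS = [
    ("blue", 1.5, 2.5),
    ("green", 2.5, 2.7),
    ("red", 2.7, 2.9),
    ("purple", 2.9, 3.1),
]
SEMI_PRO = 3  # ratings of 3 (semi-pro) or 4 (pro) count as a hit


def bin_for(predicted):
    """Return the bin name for a predicted score, or None if out of range."""
    for name, low, high in BINS:
        if low <= predicted < high:  # half-open ranges are an assumption
            return name
    return None


def semi_pro_share(samples):
    """Fraction of ratings at semi-pro level or above, per predicted-score bin.

    `samples` is an iterable of (predicted_score, human_rating) pairs.
    """
    totals, hits = {}, {}
    for predicted, rating in samples:
        name = bin_for(predicted)
        if name is None:
            continue  # ignore images outside the plotted ranges
        totals[name] = totals.get(name, 0) + 1
        if rating >= SEMI_PRO:
            hits[name] = hits.get(name, 0) + 1
    return {name: hits.get(name, 0) / totals[name] for name in totals}
```

Given real rating data, the value for the "purple" bin would correspond to the share of semi-pro or pro ratings mentioned in the article.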


The virtual photographer was developed as an experimental project, but the Google researchers comment, "Someday this technique may be useful for taking better pictures in the real world."

in Software, Posted by logu_ii