'DALL·E', an AI that automatically generates illustrations and photos from phrases such as 'Pikachu riding a motorcycle' and 'avocado chair'



OpenAI, an organization that researches artificial intelligence (AI), has announced 'DALL·E', an AI that combines natural language processing with image generation. DALL·E builds on the parameters of 'GPT-3', an AI that generates text so accurately that it is hard to distinguish from human writing, and can create illustrations and photographs from sentences.

DALL·E: Creating Images from Text
https://openai.com/blog/dall-e/

OpenAI debuts DALL-E for generating images from text | VentureBeat
https://venturebeat.com/2021/01/05/openai-debuts-dall-e-for-generating-images-from-text/

The official OpenAI blog publishes pairs of 'text entered into DALL·E' and 'images output by DALL·E'. One example is a set of illustrations generated from the elements 'wearing a tutu', 'walking a dog', and 'radish'. A distinguishing feature of DALL·E is that it can combine multiple unrelated text elements like these and automatically create images of things that do not exist in the world.



Entering 'wearing pajamas', 'riding a motorcycle', and 'Pikachu' generates illustrations that combine all three elements.



The output can be a photo as well as an illustration. For example, entering text that asks for a 'chair' shaped like an 'avocado' produces photo-like images. They look like photographs, but they were generated by DALL·E, and no such chair exists.



Entering 'OpenAI', 'put up a sign', and 'photo of a store' produces images of storefronts with signs. Of course, no such 'OpenAI' signboards actually exist; the AI rendered the neon signs and signboards entirely from the text. Some of the images failed to render the lettering correctly, producing strings such as 'Open II' and 'OPEAAI'. According to OpenAI, the longer the requested character string, the lower the success rate of image generation.



DALL·E also draws on geographic and temporal knowledge. For example, entering the text 'Photo of San Francisco's Golden Gate Bridge' automatically generates images of the Golden Gate Bridge together with its background scenery.



Likewise, entering 'food' from 'Japan' outputs many Japanese-style photos of dishes such as sushi, sashimi, and simmered dishes, although the presentation looks fairly sloppy.



Furthermore, if you supply a picture of a cat along with the sentence 'Display a sketch of the cat in the input image below the original image', sketches are generated automatically. The results vary widely: some are drawn quite faithfully, some show a completely different cat, and some give up and simply write the word 'same'.



OpenAI says it recognizes that work on generative models, including DALL·E, can have a significant and widespread impact on society. Going forward, it plans to analyze how models like DALL·E relate to societal issues such as the economic impact on specific work processes and professions, potential bias in model output, and the longer-term ethical challenges implied by this technology.

The name 'DALL·E' combines the name of the painter Salvador Dalí with 'WALL-E', the feature-length animated film whose main character is a robot.

in Software, Posted by log1i_yk