Google has released 'Imagen 3', a high-quality image generation model that can also render text, so I tried it out
Google DeepMind has released 'Imagen 3,' a new model that generates images from text, and has published a research paper on it. Imagen 3 can be tried out with Google's image generation tool 'ImageFX'.
Imagen 3 - Google DeepMind
https://deepmind.google/technologies/imagen-3/
[2408.07009] Imagen 3
https://arxiv.org/abs/2408.07009
Many examples of Imagen 3 output are available on the official Google DeepMind website. The image below is the output for the prompt 'Shot in the style of DSLR camera with the polarizing filter. A photo of two hot air balloons floating over the unique rock formations in Cappadocia, Turkey. The colors and patterns on these balloons contrast beautifully against the earthy tones of the landscape below. This shot captures the sense of adventure that comes with enjoying such an experience.'
In addition to photorealistic landscapes, Imagen 3 can generate a variety of image styles, including textured oil paintings and clay animation scenes. Its prompt comprehension has also improved, which reduces the need for prompt engineering, and it can accurately render complex textures such as the fine wrinkles on a person's skin or the knit of a stuffed animal.
Text rendering has also been greatly improved. For example, the image below shows the output for the prompt 'Word "light" made from various colorful feathers, black background'. For some reason the word is rendered in uppercase as 'LIGHT', but it is spelled correctly.
You can try out Imagen 3 yourself with Google's image generation tool 'ImageFX'.
Click 'Next'.
Scroll to the bottom of the Privacy Policy and click 'Next.'
Click 'Agree and continue'.
The ImageFX screen looks like this. To check whether Japanese prompts work, enter a Japanese prompt meaning 'A tin robot writing the word "GIGAZINE"' in the left column and click 'Create'.
During generation, an effect appeared that made me think, 'The AI is recognizing this text!'
Images were generated, but they were completely unrelated to the prompt. Japanese prompts do not seem to work.
When I entered the same prompt in English, four images matching what I intended were generated.
Each time you click 'Create', four new images are generated. If the results violate the content policy, the message 'Content not found' is displayed instead. Next, click the image in the upper right.
The image is enlarged. There are buttons on the bottom right that say 'Copy Image' and 'Download Image', and on the bottom left that say 'Edit Image'. Click 'Edit Image'.
The edit function lets you regenerate only part of an image with a new prompt. Paint over the part you want to change with the brush, enter the change you want, and click 'Generate edits'. This time, I entered 'replace robot with wooden one'.
The result looks like this. Four images were generated with only the robot part changed.
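How ImageFX implements this edit internally is not described here, but the idea of "changing only the painted part" can be illustrated with a minimal mask-compositing sketch. The function name `composite_masked_edit` and the tiny grayscale grids are hypothetical, chosen only for illustration; this is not Google's implementation. Pixels where the brush mask is 1 come from the regenerated image, and everything else is kept from the original.

```python
# Conceptual sketch only: ImageFX's actual editing pipeline is not public.
# All names here (composite_masked_edit, original, regenerated, mask) are hypothetical.

def composite_masked_edit(original, regenerated, mask):
    """Keep original pixels where mask is 0; take regenerated pixels where mask is 1."""
    if not (len(original) == len(regenerated) == len(mask)):
        raise ValueError("all inputs must have the same number of rows")
    result = []
    for orig_row, new_row, mask_row in zip(original, regenerated, mask):
        if not (len(orig_row) == len(new_row) == len(mask_row)):
            raise ValueError("all rows must have the same width")
        # Choose per pixel: regenerated value inside the brushed area, original outside.
        result.append([new if m else old
                       for old, new, m in zip(orig_row, new_row, mask_row)])
    return result


# Tiny 3x3 grayscale example: the brush covered only the centre pixel.
original = [[10, 10, 10],
            [10, 50, 10],
            [10, 10, 10]]
regenerated = [[99, 99, 99],
               [99, 200, 99],
               [99, 99, 99]]
mask = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]

edited = composite_masked_edit(original, regenerated, mask)
print(edited)  # only the centre pixel comes from the regenerated image
```

In this toy example the surrounding pixels stay untouched, which matches the observed behavior that only the robot area changed between the four edited images.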
The content filter in Imagen 3 appears to be quite strict: if the prompt includes the word 'girl', there is a high probability that it will be blocked.
'Boy' was also blocked by the content filter.
'Woman' was not allowed either.
However, prompts containing 'man' generated images reliably.
Prompts for real people are also likely to be rejected.
Results for characters were mixed: some could be generated and others could not.
In addition, images generated by Imagen 3 are embedded with the digital watermark 'SynthID' so that they can be identified as AI-generated. Google says Imagen 3 will become available across its services, such as the Gemini app, Workspace, and ads, in the future.
in Review, Software, Web Service, Web Application, Posted by log1d_ts