Google launches Whisk, an image generation AI app that can remix images to create new ones
On December 16, 2024, Google announced 'Whisk,' a new image generation AI that creates new images from images supplied as input.
Whisk
Introducing Whisk: Prompt Less, Play More | Google Labs - YouTube
Whisk: Visualize and remix ideas using images and AI
https://blog.google/technology/google-labs/whisk/
Conventional image generation AI requires input of text prompts such as '90s anime style ○○' or 'cyberpunk style ○○.' However, the newly announced Whisk is unique in that it does not require input of text prompts.
To generate an image with Whisk, first upload an image to the 'Subject' section to serve as the main subject. As with conventional image generation AI, you can also enter a text prompt.
Next, in the 'Scene' section, choose the setting for the generated image. You can upload an image of your own or pick one of the randomly presented options.
Additionally, in the 'Style' section, you can upload an image that determines the style of the generated image, such as '90s cartoon style.' Like 'Subject,' 'Style' also accepts a text prompt.
Finally, enter text details about the image to be generated.
This generates a new image that remixes the images you uploaded.
If you don't like the generated image, click the 'REFINE' button.
Enter 'suit with pink stripes' in the text field that appears.
A revised image is then generated based on the text entered.
According to Google, Whisk uses Gemini, Google's generative AI, to automatically write detailed captions for the uploaded images, and then feeds those captions into Imagen 3, Google's image generation model.
Whisk is a really fun and compelling new way to prompt our Imagen 3 model. You upload images that have subject, scene or style of interest, and then it composes these together to generate a new image based on these elements. Less wordsmithing, and fun results! https://t.co/0obgoTy7Zw
— Jason Baldridge (@jasonbaldridge) December 16, 2024
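The caption-then-generate flow described above can be sketched roughly as follows. This is only an illustration, not Google's implementation: the `WhiskInputs` container, the `compose_prompt` function, and the `fake_caption` stand-in are all hypothetical names, and no real Gemini or Imagen 3 API is called. The sketch assumes each uploaded image is first turned into a text caption and that the captions are then joined into a single prompt for the image model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WhiskInputs:
    """Hypothetical container for the three image slots plus optional detail text."""
    subject_image: str
    scene_image: str
    style_image: str
    details: str = ""


def compose_prompt(inputs: WhiskInputs, caption: Callable[[str], str]) -> str:
    """Caption each image, then join the captions into one text prompt."""
    parts = [
        f"Subject: {caption(inputs.subject_image)}",
        f"Scene: {caption(inputs.scene_image)}",
        f"Style: {caption(inputs.style_image)}",
    ]
    # The free-text details are optional, so only add them when present.
    if inputs.details:
        parts.append(f"Details: {inputs.details}")
    return "\n".join(parts)


def fake_caption(path: str) -> str:
    """Stand-in for a captioning model (assumption); it only echoes the file name."""
    return f"a description of {path}"


prompt = compose_prompt(
    WhiskInputs("donut.png", "park.png", "pin_badge.png", details="suit with pink stripes"),
    fake_caption,
)
print(prompt)
```

Because only the captions reach the image model in this sketch, details of the original images that the caption does not mention are lost, which is one plausible reason the output can differ from the input.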
Below is an example of actual image generation using Whisk.
Below is an image generated by combining an image of a donut and an image of a pin badge.
Google said, 'Because Whisk only extracts a few key characteristics from an image, it may generate images that differ from what you expect; for example, the subject's height, weight, hairstyle, or skin color may be different. That's why Whisk lets you review and edit prompts at any time.' In addition, Google vice president Josh Woodward said, 'Whisk was built based on conversations with filmmakers, creative advertisers, and fashion designers.'
It's time to Whisk! What makes it unique? No more long, detailed text prompts. Whisk lets you prompt with images and easily blend them together. It's so fast and fun. Just drag in your images and start creating.
https://t.co/44FzQVDxzr — Josh Woodward (@joshtwoodward) December 16, 2024
We created Whisk based on conversations with filmmakers working on…
At the time of writing, Whisk was only available to users in the United States and could not be used in Japan.
in Software, Web Application, Video, Posted by log1r_ut