A technology has been introduced that lets image generation AI adjust its output according to the font size, color, style, and footnotes of the prompt text.



The AI function that generates images from input text (prompts) is called 'Text to Image.' Researchers at the University of Maryland, College Park have developed a technology called ``Expressive Text-to-Image Generation'' that accepts the prompt as rich text instead of plain text and reflects its font size, color, style, and footnotes in the generated image.

Expressive Text-to-Image Generation with Rich Text

https://rich-text-to-image.github.io/

[2304.06720] Expressive Text-to-Image Generation with Rich Text
https://arxiv.org/abs/2304.06720

GitHub - songweige/rich-text-to-image: Rich-Text-to-Image Generation
https://github.com/SongweiGe/rich-text-to-image

If you look at the demonstration video that was released, you can get a quick idea of what kind of technology it is.

Workflow demo of technology to adjust images generated by image generation AI with font size, color, style, and footnotes - YouTube


The prompt is ``A rustic cabin sits on the edge of a giant lake. Wildflowers dot the meadow around the cabin and lake.''



Then, an image like this was generated.



Change the font color of the word 'cabin' to orange.



Then, the cabin in the image turned orange.



Next, change the style of the word 'Wildflowers' to 'Watercolor'.



The wildflowers have changed to look like a watercolor painting.



Finally, add a footnote to the word 'lake': 'crystal-clear, blueish lake, glistening in the sunlight.'



The depiction of the lake changed to match the footnote.



This makes it possible to make targeted adjustments such as ``I want to change just the hair color to my preference.''
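The workflow above can be pictured as a prompt split into spans, each carrying optional formatting. The following is only a minimal sketch of that idea, not the researchers' implementation: the `Span` class and the `plain_prompt` and `region_prompts` helpers are hypothetical names made up for illustration, and the way attributes are folded into a per-word description is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """One run of prompt text with optional rich-text attributes (hypothetical)."""
    text: str
    color: Optional[str] = None      # font color, e.g. "orange"
    style: Optional[str] = None      # font style, e.g. "watercolor"
    footnote: Optional[str] = None   # supplementary description


def plain_prompt(spans: List[Span]) -> str:
    """Join the spans back into the plain-text prompt."""
    return "".join(s.text for s in spans)


def region_prompts(spans: List[Span]) -> Dict[str, str]:
    """Build one description per formatted word (assumed mapping)."""
    regions: Dict[str, str] = {}
    for s in spans:
        if not (s.color or s.style or s.footnote):
            continue  # unformatted text adds no extra instruction
        word = s.text.strip()
        description = s.footnote if s.footnote else word
        if s.color:
            description = f"{s.color} {description}"
        if s.style:
            description = f"{description}, in the style of {s.style}"
        regions[word] = description
    return regions


spans = [
    Span("A rustic "),
    Span("cabin", color="orange"),
    Span(" sits on the edge of a giant "),
    Span("lake", footnote="crystal-clear, blueish lake, glistening in the sunlight"),
    Span(". "),
    Span("Wildflowers", style="watercolor"),
    Span(" dot the meadow around the cabin and lake."),
]

print(plain_prompt(spans))
for word, description in region_prompts(spans).items():
    print(f"{word}: {description}")
```

Keeping the plain prompt separate from the per-word descriptions mirrors the demo: the overall scene stays the same, and only the words with formatting receive extra instructions.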



Comparisons with other methods have also been published. In this example, the color of the church is specified. Other methods also work well with color names such as 'Pink' and 'Olive Yellow,' but a distinguishing feature of this method is that it can also handle an RGB specification such as '(211,22,52).'
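To see why an RGB specification is more precise than a color name, the sketch below converts the triple from the example to a hex code and looks up the closest entry in a small palette. The helper names and the three-color palette are assumptions chosen for illustration and are not taken from the paper; the palette values are the standard CSS definitions of those color names.

```python
from typing import Dict, Tuple

RGB = Tuple[int, int, int]

# Small illustrative palette using standard CSS color values.
PALETTE: Dict[str, RGB] = {
    "pink": (255, 192, 203),
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
}


def rgb_to_hex(rgb: RGB) -> str:
    """Format an (R, G, B) triple as a lowercase hex color code."""
    r, g, b = rgb
    return f"#{r:02x}{g:02x}{b:02x}"


def nearest_name(rgb: RGB, palette: Dict[str, RGB] = PALETTE) -> str:
    """Return the palette name with the smallest squared RGB distance."""
    def distance(name: str) -> int:
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))
    return min(palette, key=distance)


target = (211, 22, 52)
print(rgb_to_hex(target))    # "#d31634"
print(nearest_name(target))  # "crimson"
```

The nearest name here, 'crimson,' is close to the requested color but not identical, which is the gap that an exact RGB value closes.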



If you change the color of the pants, it will look like this.



You can also increase the amount of an ingredient in a generated pizza image by increasing the font size of the corresponding words, such as ``mushrooms'' and ``pepperonis,'' in the pizza generation prompt.
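One way to think about font size is as a per-word emphasis weight. The sketch below is an assumption made for illustration: the article does not describe the actual mapping the researchers use, so the linear ratio, the `word_weights` helper, the default size of 12, and the example pizza prompt are all hypothetical.

```python
from typing import Dict, List, Tuple


def word_weights(
    words: List[str],
    font_sizes: Dict[str, float],
    default_size: float = 12.0,
) -> List[Tuple[str, float]]:
    """Pair each word with a weight proportional to its font size.

    Words without an explicit size use the default size, giving weight 1.0.
    The linear ratio is an assumed mapping, not the published method.
    """
    return [
        (word, font_sizes.get(word, default_size) / default_size)
        for word in words
    ]


# Hypothetical prompt and font sizes, for illustration only.
words = "a pizza with mushrooms and pepperonis".split()
sizes = {"mushrooms": 24.0, "pepperonis": 18.0}

for word, weight in word_weights(words, sizes):
    print(f"{word}: {weight}")
```

With these example sizes, ``mushrooms'' gets twice the default weight and ``pepperonis'' one and a half times, while unformatted words stay at 1.0.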



Posted by logc_nt