October 6, 2023 07:50 Web Apps

New technique lets image-generation AI adjust its output based on the font size, color, style, and footnotes of the prompt

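The ability of an AI to generate an image from input text (a prompt) is called "Text to Image." Researchers at the University of Maryland, College Park, and their colleagues have developed "Expressive Text-to-Image Generation," a technique that takes the prompt as rich text rather than plain text and reflects its font size, color, style, and footnotes in the generated image.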

Expressive Text-to-Image Generation with Rich Text
https://rich-text-to-image.github.io/

[2304.06720] Expressive Text-to-Image Generation with Rich Text
https://arxiv.org/abs/2304.06720

GitHub - songweige/rich-text-to-image: Rich-Text-to-Image Generation
https://github.com/SongweiGe/rich-text-to-image
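
To make the idea of a rich-text prompt concrete, here is a minimal Python sketch of how text with formatting attached could be represented. It is not taken from the researchers' code: the names `RichSpan`, `to_plain_prompt`, and `formatted_spans` are hypothetical, and the color and footnote values are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RichSpan:
    """One run of prompt text plus optional formatting (hypothetical structure)."""
    text: str
    font_size: Optional[int] = None    # size of the text in this run
    color: Optional[str] = None        # font color, e.g. "#8b4513"
    font_style: Optional[str] = None   # name of the font used for this run
    footnote: Optional[str] = None     # extra text attached to this run


def to_plain_prompt(spans):
    """Join the span texts into the plain-text prompt with formatting dropped."""
    return "".join(span.text for span in spans)


def formatted_spans(spans):
    """Return (start, end, span) for every span that carries any formatting."""
    result = []
    offset = 0
    for span in spans:
        end = offset + len(span.text)
        attributes = (span.font_size, span.color, span.font_style, span.footnote)
        if any(value is not None for value in attributes):
            result.append((offset, end, span))
        offset = end
    return result


# Illustrative formatting choices, not the ones used in the demo video.
prompt = [
    RichSpan("A rustic "),
    RichSpan("cabin", color="#8b4513"),
    RichSpan(" sits on the edge of a giant "),
    RichSpan("lake", footnote="The lake is calm at sunset."),
    RichSpan("."),
]

plain = to_plain_prompt(prompt)
for start, end, span in formatted_spans(prompt):
    print(plain[start:end], span.color, span.footnote)
```

A plain-text model would only ever see the output of `to_plain_prompt`. The point of a rich-text approach is that the per-span attributes collected by `formatted_spans` carry additional instructions about specific words.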

The published demonstration video makes it immediately clear what kind of technique this is.

Workflow demo of a technique that adjusts AI-generated images using font size, color, style, and footnotes - YouTube

The text "A rustic cabin sits on the edge of a giant lake. Wildflowers dot the meadow around the cabin and lake." is given as the prompt.