Microsoft develops 'Visual ChatGPT', which adds image generation to ChatGPT so that images can be created in a chat format
The interactive AI '
Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
(PDF file) https://arxiv.org/pdf/2303.04671.pdf
GitHub - microsoft/visual-chatgpt: Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Microsoft Research Introduces Visual ChatGPT That Incorporates Different Visual Foundation Models Enabling Users To Interact With ChatGPT - MarkTechPost
https://www.marktechpost.com/2023/03/10/microsoft-research-introduces-visual-chatgpt-that-incorporates-different-visual-foundation-models-enabling-users-to-interact-with-chatgpt/
Image generation AI such as Stable Diffusion lets you generate the image you want by entering text and reference images as prompts. However, to make full use of image generation AI, you have to set elements such as the 'model data', 'resolution', and 'number of sampling steps' appropriately, and carry out cumbersome work such as constructing complex prompts.
To address this, the research team of Wu et al. has developed an interactive AI called 'Visual ChatGPT', based on the conventional ChatGPT. Visual ChatGPT generates images through a conversation in which the user enters text and prompts.
Wu et al.'s research team added VFM such as Stable Diffusion and
The architecture overview of Visual ChatGPT is shown in the image below. In Q2, the user refers to the sofa image supplied in Q1 and asks, 'Replace the sofa in the image with a desk and make it more watercolor-like.' On receiving a query from the user, Visual ChatGPT builds a prompt that contains a description of the system, the VFM tools it can choose from, and the interaction history, and submits that prompt to ChatGPT.
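The prompt assembly described above can be sketched roughly as follows. This is a minimal illustration and not the code from the microsoft/visual-chatgpt repository: the `Tool` and `PromptBuilder` classes, the tool names, the section labels, and the file path are all hypothetical, and the step that actually sends the prompt to ChatGPT is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    """Hypothetical description of one visual foundation model (VFM) tool."""
    name: str
    description: str


@dataclass
class PromptBuilder:
    """Assembles system description, tool list, and history into one prompt."""
    system_description: str
    tools: list
    history: list = field(default_factory=list)

    def record(self, speaker: str, text: str) -> None:
        # Keep every turn so later queries can refer to earlier images.
        self.history.append((speaker, text))

    def build(self, query: str) -> str:
        tool_lines = [f"- {t.name}: {t.description}" for t in self.tools]
        history_lines = [f"{speaker}: {text}" for speaker, text in self.history]
        sections = [
            self.system_description,
            "Tools:",
            *tool_lines,
            "History:",
            *history_lines,
            f"Human: {query}",
        ]
        return "\n".join(sections)


# Hypothetical tool names and image path, chosen only for illustration.
builder = PromptBuilder(
    system_description="You can call visual tools to answer image requests.",
    tools=[
        Tool("text_to_image", "Generate an image from a text description."),
        Tool("replace_object", "Replace one object in an image with another."),
    ],
)
builder.record("Human", "Here is a picture of a sofa: image/sofa.png")
builder.record("AI", "Received.")

prompt = builder.build(
    "Replace the sofa in the image with a desk and make it more watercolor-like."
)
print(prompt)
```

Keeping the history inside the prompt is what would let a follow-up such as Q2 refer to 'the image' from Q1 without the user supplying it again.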
The demo of Visual ChatGPT looks like this. When you type 'Could you generate a cat for me?' into Visual ChatGPT, Visual ChatGPT will instantly generate a cat image.
Furthermore, if you enter 'could you replace the cat to a dog and then remove a book?', an image reflecting those edits is generated.
Also, if you ask 'That's cool! Could you generate the
Next, if you enter 'Please generate a yellow dog based on the edge-detected dog image', the yellow dog image will be generated as requested.
Tools like Visual ChatGPT are expected to lower the barrier to generating images from text and to make various AI tools interoperable.
Wu et al.'s research team notes as a concern that VFM failures and unstable prompts may lead to unsatisfactory generation results. The team says that a self-correction module is needed to check that results match the user's intent, and that because introducing such a module may increase generation time, it will continue to investigate.
The source code of Visual ChatGPT developed by Wu et al. is published on GitHub. In addition,
GitHub - microsoft/visual-chatgpt: Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
https://github.com/microsoft/visual-chatgpt
in Software, Posted by log1r_ut