Introducing the multimodal image generation AI 'OmniGen,' a single model capable of 'pose-specified generation,' 'object replacement in an image,' 'subject-specified generation,' and more
Stable Diffusion, a widely used image generation AI, can handle a variety of tasks through extensions, for example by adding ControlNet for pose extraction and pose-specified generation, or IP-Adapter for generating similar images. 'OmniGen' is a multimodal image generation AI developed to perform such tasks with a single model, without extensions like ControlNet. On its own, it can handle tasks such as 'image generation,' 'pose extraction,' 'pose-specified generation,' 'object replacement in an image,' and 'subject-specified generation.'
[2409.11340] OmniGen: Unified Image Generation
https://arxiv.org/abs/2409.11340
GitHub - VectorSpaceLab/OmniGen: OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
https://github.com/VectorSpaceLab/OmniGen
Below is a diagram showing the functions of OmniGen. Like other image generation AIs, it can generate images from text, and it can also 'change the drink in the image,' 'extract the pose of the person in the image,' 'generate an image by specifying a pose,' and 'generate an image of two people by specifying one subject from each of two images.'
Here's an example: An image of a woman sitting on a chair is input, and an image of the same woman waving in a crowd is generated.
It is also possible to extract the 'man in the red shirt' from an image containing three men and generate a different image of him. Notably, the subject can be specified with a description such as 'man in the red shirt' rather than by position, such as 'man on the left.'
You can also pick one subject from each of several separate images and combine them in a new image. The example below uses the subjective description 'older woman,' but it still works correctly.
OmniGen can handle more than just images with people in them. In the following example, it executes the command 'Put the flowers in image 1 into the lightest vase in image 2 and place it on a metal table inside a factory.'
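As a rough illustration of how such a multi-image instruction might be written in code: to the best of our understanding, the OmniGen repository's README refers to input images inside the prompt with placeholders of the form `<img><|image_1|></img>`. The sketch below assumes that format, so check it against the repository before relying on it. The helper name `fill_image_placeholders` and the `{image N}` shorthand are hypothetical and invented for this example; they are not part of OmniGen.

```python
import re


def fill_image_placeholders(instruction: str, num_images: int) -> str:
    """Replace the hypothetical '{image N}' shorthand with placeholders.

    The '<img><|image_N|></img>' format is an assumption based on the
    OmniGen README and should be verified against the repository.
    """

    def replace(match: re.Match) -> str:
        index = int(match.group(1))
        # Reject references to images that were never supplied.
        if not 1 <= index <= num_images:
            raise ValueError(
                f"instruction refers to image {index}, "
                f"but only {num_images} image(s) were given"
            )
        return f"<img><|image_{index}|></img>"

    return re.sub(r"\{image (\d+)\}", replace, instruction)


# The instruction quoted above, written with the hypothetical shorthand.
prompt = fill_image_placeholders(
    "Put the flowers in {image 1} into the lightest vase in {image 2} "
    "and place it on a metal table inside a factory.",
    num_images=2,
)
print(prompt)
```

The helper only builds the prompt text; the images themselves would be passed to the model separately, in the same order as the numbers used in the placeholders.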
You can try OmniGen for yourself at the following link:
OmniGen - a Hugging Face Space by Shitao
https://huggingface.co/spaces/Shitao/OmniGen
In addition, the code required to run OmniGen is available at the following link.
GitHub - VectorSpaceLab/OmniGen: OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
https://github.com/VectorSpaceLab/OmniGen?tab=readme-ov-file
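For readers who want a sense of what running the code looks like, here is a minimal sketch. It assumes the `OmniGen` Python package from the repository is installed and exposes `OmniGenPipeline.from_pretrained`, that the weights are published as `Shitao/OmniGen-v1`, and that the pipeline accepts the keyword arguments shown. These names and the default values are recalled from the README and may have changed, so treat them as assumptions and confirm them there. The helpers `build_call_kwargs` and `generate` are hypothetical wrappers written for this example.

```python
def build_call_kwargs(prompt, input_images=None, size=1024, seed=0):
    """Collect keyword arguments for one pipeline call.

    Parameter names and values are assumptions based on the README.
    """
    kwargs = {
        "prompt": prompt,
        "height": size,
        "width": size,
        "guidance_scale": 2.5,
        "seed": seed,
    }
    if input_images:
        # Image-conditioned tasks also pass the image paths and a
        # separate guidance scale for the images.
        kwargs["input_images"] = list(input_images)
        kwargs["img_guidance_scale"] = 1.6
    return kwargs


def generate(prompt, input_images=None, out_path="omnigen_output.png"):
    """Run OmniGen once and save the first image (downloads the model)."""
    # Imported lazily so the rest of this sketch works without the package.
    from OmniGen import OmniGenPipeline

    pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")
    images = pipe(**build_call_kwargs(prompt, input_images))
    images[0].save(out_path)
    return out_path


# Inspect the arguments for a plain text-to-image request.
text_only = build_call_kwargs("A cat sitting on a windowsill at sunset.")
print(text_only)
```

Calling `generate(...)` with a prompt, and optionally a list of image paths, would then perform the actual generation. It is left uncalled here because it downloads the model and, in practice, needs a GPU.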
in Software, Posted by log1o_hf