Jul 07, 2024 09:00:00

'Era3D' creates high-resolution 360-degree images from a single image

Era3D, which generates a 360-degree multi-view rendering of an object from a single input image, is designed to address the problems of conventional multi-view methods, such as inaccuracy, inefficiency, and low resolution. The creator of Era3D provides a detailed explanation on the project webpage, and you can quickly and easily try multi-view generation in your browser.

Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

https://penghtyx.github.io/Era3D/

Era3D MV Demo - a Hugging Face Space by pengHTYX
https://huggingface.co/spaces/pengHTYX/Era3D_MV_demo

Technology for generating views from multiple angles using only a small amount of source material has advanced greatly. However, according to pengHTYX, the creator of Era3D, conventional methods tend to distort images that deviate even slightly from the assumed camera type. In addition, the multi-view attention used in conventional methods has a computational cost that rises steeply as image resolution increases, making the training costs for generating high-resolution images enormous.
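To get a feel for why resolution is so costly, here is a rough count of attention pairs. It is a simplified sketch, not Era3D's or any specific method's actual cost model: it assumes every token in every view attends to every token in all views, and the view count of 6 and the feature-map sizes are illustrative values only.

```python
def dense_attention_pairs(num_views: int, side: int) -> int:
    """Count query-key pairs when every token attends to every token in all views.

    `side` is the side length of a square feature map (an assumed, simplified setup).
    """
    tokens = num_views * side * side
    return tokens * tokens


# Doubling the side length multiplies the number of pairs by 16.
for side in (32, 64):
    print(side, dense_attention_pairs(6, side))
```

Under these assumptions the count grows with the fourth power of the side length, which is why dense multi-view attention becomes expensive so quickly at high resolution.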

Era3D first applies a camera prediction module that estimates the focal length and elevation angle of the input image, so that it can generate images without shape distortion. It then uses a simple and efficient layer called 'row-wise attention' to combine information from multiple viewpoints. As a result, Era3D is said to reduce computational complexity by a factor of 12 compared to the conventional state-of-the-art method. The image below shows the Era3D pipeline: it estimates the viewpoint and camera parameters first and then runs generation, which enables high-quality, high-speed output. The image on the right is the final generated image from a different viewpoint. Although the car's appearance is slightly distorted, the shape seen from the opposite side is rendered well.
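The idea of row-wise attention can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the Era3D implementation: it assumes the views are aligned so that corresponding content falls on the same image row, it uses the raw features as queries, keys, and values with no learned projections, and the function names `softmax` and `row_wise_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def row_wise_attention(features):
    """Attend only among tokens that share a row index, across all views.

    features[v][r][c] is a feature vector (list of floats) for view v,
    row r, column c. The result has the same shape.
    """
    num_views = len(features)
    height = len(features[0])
    width = len(features[0][0])
    dim = len(features[0][0][0])
    scale = 1.0 / math.sqrt(dim)

    output = [[[None] * width for _ in range(height)] for _ in range(num_views)]
    for r in range(height):
        # Gather the tokens of row r from every view: num_views * width tokens.
        row_tokens = [features[v][r][c] for v in range(num_views) for c in range(width)]
        for v in range(num_views):
            for c in range(width):
                query = features[v][r][c]
                scores = [scale * sum(q * k for q, k in zip(query, key)) for key in row_tokens]
                weights = softmax(scores)
                # Weighted average of the row's tokens, one feature dimension at a time.
                output[v][r][c] = [
                    sum(w * token[d] for w, token in zip(weights, row_tokens))
                    for d in range(dim)
                ]
    return output
```

In this sketch each token compares itself with the tokens in one row per view rather than with every token in every view, so the number of comparisons per token drops from views × height × width to views × width. This simplified count is not meant to reproduce the factor of 12 reported for Era3D.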

In addition, various 360-degree images generated by Era3D are posted as samples. In the images below, the leftmost one is the input image, the middle one is the 360-degree view, and the right one is the image converted into a 3D asset colored with a rainbow gradient.

It is also possible to generate 3D assets by entering text, such as 'a bulldog wearing a black pirate hat,' 'a pig carrying a backpack,' or 'a beautiful brown-haired cyborg.'

You can try Era3D generation yourself on the demo page. Several sample images are available, so start by selecting the image of a camera.

Once you have confirmed that the image has been loaded, click 'Generate Normals and Colors' to begin generation.

The page displayed 'Processing 22.5 seconds,' and generation finished quickly, in roughly 20 seconds. This sample image has no background to begin with, but if an image does contain a background, it is removed automatically before loading.

The results of generating a multi-view are shown below. In the 'Multiview Images' output, the protruding part of the lens looks slightly unnatural and the back is painted white, but you can see that a fairly high-quality result was created quickly.

You can also try it with any image by clicking 'click and upload,' so I tried loading a character image from the manga 'The Part-Time Life of a Hero and a Corporate Slave.'

Select an image and click 'Open'. In addition to JPEG and PNG, a wide range of image formats is supported, including XBM, TIFF, GIF, and SVG.

Verify that the image has been loaded and click 'Generate Normals and Colors'.

The result of generating the multi-view is below. The character's facial expression is blurred, and for some reason he has a tail the same color as his hair, but Era3D did generate a three-dimensional multi-view from a flat illustration.

More information about Era3D is available on GitHub.

GitHub - pengHTYX/Era3D
https://github.com/pengHTYX/Era3D


Jul 07, 2024 09:00:00 in Software, Web Service, Posted by log1e_dh