I tried using 'Shap-E', an AI that creates 3D models from text and images, on Google Colaboratory
OpenAI, which develops the chat AI ChatGPT and the speech recognition AI Whisper, announced the 3D model generation AI 'Shap-E' in May 2023. Shap-E is released as open source and can be used by anyone, so I tried it out on Google Colaboratory.
shap-e/sample_text_to_3d.ipynb at main openai/shap-e GitHub
The following article details what you can do with Shap-E.
OpenAI announces open source AI 'Shap-E' that generates 3D models from text and images - GIGAZINE
First, open Google Drive and click the '+' icon at the right edge of the screen.
Enter 'Colaboratory' in the search field and click the Colaboratory app that appears.
Click 'Install'.
You will be asked for permission, so click 'Continue'.
Select an account to install Colaboratory.
The installation is now complete. Click 'Finish'.
Click 'New' on the left side of the Google Drive screen.
'Google Colaboratory' has now been added under 'Other', so click it.
When Colaboratory opens, first change the settings to use a GPU. Click 'Change runtime type' in the 'Runtime' menu.
Set 'Hardware accelerator' to 'GPU' and click 'Save'.
Enter the Python code here. First, clone the Shap-E repository with the following code.
[code]!git clone https://github.com/openai/shap-e[/code]
In Colaboratory, you run code by typing it into a cell and clicking the play button at the left of that cell.
When execution finishes, the log is displayed below the code.
To enter more code, add a new code cell with the '+ Code' button above.
Next, move into the cloned 'shap-e' directory and install the necessary libraries with the following code.
[code]%cd shap-e
!pip install -e .[/code]
Load the necessary functions from the library with the code below.
[code]import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget[/code]
Use the code below to select the GPU if one is available, falling back to the CPU otherwise.
[code]device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')[/code]
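Before loading the models, it can help to confirm which device the notebook will actually use, since generation is likely to be much slower without a GPU runtime. The following is a minimal sketch, not part of the Shap-E sample notebook: `describe_device` is a hypothetical helper name, and the `find_spec` guard is there only so the cell also runs where PyTorch is not installed.

```python
import importlib.util


def describe_device():
    """Return a short description of the device PyTorch would use."""
    # Guard (an assumption for portability): do not crash if PyTorch is missing.
    if importlib.util.find_spec('torch') is None:
        return 'torch is not installed'
    import torch

    if torch.cuda.is_available():
        # Index 0 is the first GPU visible to this runtime.
        return 'cuda: ' + torch.cuda.get_device_name(0)
    return 'cpu (no GPU runtime detected)'


print(describe_device())
```

If this prints the CPU message in Colaboratory, the 'Change runtime type' setting described earlier has probably not taken effect.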
Load the AI model that will be used to generate the 3D model.
[code]xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion')) [/code]
Loading was completed in about 2 minutes.
Then generate the 3D model with the code below. 'batch_size' is the number of 3D models to generate, and 'guidance_scale' controls how closely the output follows the prompt. 'prompt' specifies what kind of 3D model to generate. Since I wanted to output a shark this time, I entered 'a shark'.
[code]batch_size = 1
guidance_scale = 15.0
prompt = 'a shark'
latents = sample_latents(
    batch_size=batch_size,
    model=model,
    diffusion=diffusion,
    guidance_scale=guidance_scale,
    model_kwargs=dict(texts=[prompt] * batch_size),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)[/code]
With these settings, 3D model generation completed in 23 seconds.
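The `texts=[prompt] * batch_size` argument above passes one text per generated model. As a sketch of how that pattern might be extended to several prompts in one run, the hypothetical helper `build_text_batch` below (not part of Shap-E) builds a matching `batch_size` and `model_kwargs`. It assumes `sample_latents` expects exactly one text per batch element, which the sample code suggests but which I have not verified for mixed prompts.

```python
def build_text_batch(prompts, copies_per_prompt=1):
    """Return (batch_size, model_kwargs) with one text entry per model to generate."""
    if copies_per_prompt < 1:
        raise ValueError('copies_per_prompt must be at least 1')
    # Repeat each prompt so that variations of the same prompt sit next to each other.
    texts = [p for p in prompts for _ in range(copies_per_prompt)]
    return len(texts), dict(texts=texts)


batch_size, model_kwargs = build_text_batch(['a shark', 'a chair'], copies_per_prompt=2)
print(batch_size)    # 4
print(model_kwargs)  # {'texts': ['a shark', 'a shark', 'a chair', 'a chair']}
```

The returned values could then be passed as `batch_size=batch_size` and `model_kwargs=model_kwargs` in the `sample_latents` call. Larger batches need more GPU memory and time.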
Enter the code below to display the generated 3D model as a rotating GIF.
[code]render_mode = 'nerf'  # you can change this to 'stf'
size = 64  # this is the size of the renders; higher values take longer to render.
cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    display(gif_widget(images))[/code]
A shark like this was generated.
Use the code below to save the generated 3D model.
[code]from shap_e.util.notebooks import decode_latent_mesh
for i, latent in enumerate(latents):
    t = decode_latent_mesh(xm, latent).tri_mesh()
    with open(f'example_mesh_{i}.ply', 'wb') as f:
        t.write_ply(f)
    with open(f'example_mesh_{i}.obj', 'w') as f:
        t.write_obj(f)[/code]
When you run the code, an obj file and a ply file named 'example_mesh_0' appear in the file pane.
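Before downloading, it may be worth checking that the exported mesh is not empty. The sketch below counts vertex and face records in an OBJ file using only the standard library. `summarize_obj` is a hypothetical helper, not part of Shap-E, and the tiny triangle file exists only so the example runs on its own. It assumes the plain-text OBJ convention in which vertex lines start with `v` and face lines start with `f`.

```python
import os
import tempfile


def summarize_obj(path):
    """Count vertex ('v') and face ('f') records in a Wavefront OBJ file."""
    vertices = faces = 0
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            if parts[0] == 'v':
                vertices += 1
            elif parts[0] == 'f':
                faces += 1
    return vertices, faces


# Self-contained demo: a single triangle written to a temporary file.
demo = 'v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n'
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'triangle.obj')
    with open(demo_path, 'w') as f:
        f.write(demo)
    print(summarize_obj(demo_path))  # (3, 1)
```

In the notebook, `summarize_obj('example_mesh_0.obj')` would report the counts for the shark generated above.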
Right-click each file and click 'Download'.
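When several models have been generated, downloading the files one at a time gets tedious. As an alternative sketch, the files could be bundled into a single archive first. `zip_meshes` and the archive name `meshes.zip` are hypothetical names chosen for illustration; only the standard library is used.

```python
import glob
import zipfile


def zip_meshes(pattern='example_mesh_*', archive='meshes.zip'):
    """Bundle every file matching pattern into one zip and return the paths added."""
    paths = sorted(glob.glob(pattern))
    with zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            zf.write(path)
    return paths


# Lists the files that were added (an empty list if nothing matched).
print(zip_meshes())
```

The resulting 'meshes.zip' then appears in the file pane and can be downloaded the same way. In Colaboratory, `files.download('meshes.zip')` from the `google.colab` module should also start the download from code.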
After that, just import the downloaded file into your 3D modeling software. This time I followed the procedure for creating a 3D model from text, but the Shap-E repository also includes an example of creating a 3D model from an image, so please check it out if you are interested.