Mar 26, 2025 11:15:00

ChatGPT adds the high-precision image generation feature '4o Image Generation,' which draws on GPT-4o's knowledge and can also generate images of celebrities

OpenAI has announced that it has built a high-precision image generation feature, '4o Image Generation,' into GPT-4o. 4o Image Generation is available in ChatGPT and Sora. It draws on GPT-4o's knowledge when generating images and lets users refine the results through conversation.

Introducing 4o Image Generation | OpenAI

https://openai.com/index/introducing-4o-image-generation/

4o image generation has arrived.

It's beginning to roll out today in ChatGPT and Sora to all Plus, Pro, Team, and Free users. pic.twitter.com/pFXDzKhh2t
— OpenAI (@OpenAI) March 25, 2025

Below is an example of an image generated with 4o Image Generation. The prompt was: 'A horizontal photo of a glass whiteboard taken with a smartphone. Taken in a room with a view of the Bay Bridge. A woman wearing a T-shirt with a large OpenAI logo is writing something. Natural handwriting, a bit messy. The photographer is reflected in the photo.' The result is a high-quality image that reflects every one of these complex instructions.

Furthermore, when I gave the additional instruction 'take a selfie with her and him high-fiving,' the system correctly recognized that 'her' referred to the woman writing on the whiteboard and 'him' to the photographer, and generated an image exactly as instructed.

OpenAI lists the following features of 4o Image Generation: 'It can render text accurately,' 'It can revise images through conversation,' 'It can follow multiple instructions,' 'It can rework an image while preserving its context,' and 'It can generate images using GPT-4o's knowledge.' Specific examples of each follow.

◆ Text can be drawn accurately
The image below was generated by instructing the system to draw 'road signs indicating permitted vehicle classifications and the need for parking permits' and 'two witches,' while also specifying the composition from back to front: 'road, cars, and buildings → road signs → witches.' The system followed the composition instructions and rendered the text on the road signs flawlessly.

◆ You can edit images interactively
The image below was generated with the instruction, 'Can you make a cute sticker of a raccoon eating a strawberry? Surround it with a white border and make the background transparent.'

When I then told it to 'try a minimalist style with a grey raccoon,' the color changed while the overall composition stayed the same.

I further instructed it to 'make a bite mark on the strawberry and stain the area around the mouth red,' and here is the result. Although the raccoon's color became darker, the resulting image largely followed the instructions.

◆ It can follow multiple instructions
Existing image generation AI struggles to generate images containing many subjects or objects. By contrast, 4o Image Generation can draw a large number of subjects and objects at once. For example, you can instruct it to draw 16 different items, such as 'blue stars,' 'red triangles,' and 'orange cats wearing black baseball caps,' in a 4 by 4 grid.

The generated image is below. All 16 items were drawn as instructed.
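
For readers who want to try a similar multi-object prompt, the grid instruction described above can also be assembled programmatically. The following is only a minimal Python sketch under stated assumptions, not OpenAI's method: the function name `build_grid_prompt`, the wording of the prompt, and the 'placeholder item' entries are all hypothetical. Only the three quoted items come from the example above.

```python
def build_grid_prompt(items, rows=4, cols=4):
    """Return a text prompt that lists `items` row by row for a rows x cols grid.

    Hypothetical helper for illustration; the prompt wording is an assumption.
    """
    if len(items) != rows * cols:
        raise ValueError(f"expected {rows * cols} items, got {len(items)}")
    lines = [f"Draw {rows * cols} objects arranged in a {rows} by {cols} grid."]
    for r in range(rows):
        # Slice out the items that belong to this row of the grid.
        row_items = items[r * cols:(r + 1) * cols]
        lines.append(f"Row {r + 1}: " + ", ".join(row_items))
    return "\n".join(lines)


# The first three items come from the article; the rest are placeholders.
items = [
    "a blue star",
    "a red triangle",
    "an orange cat wearing a black baseball cap",
] + [f"placeholder item {n}" for n in range(4, 17)]

prompt = build_grid_prompt(items)
print(prompt)
```

The resulting text can be pasted into ChatGPT as an ordinary prompt. Listing the items row by row simply makes it easier to check the generated grid against the request.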

◆ Images can be reworked while preserving their context
Here, an image was provided as input along with the instruction 'make this image look like it was taken with a single-lens reflex camera.'

The generated result looks like this. The subject in the original image is preserved while the style is changed.

◆ Images can be generated using GPT-4o's knowledge
4o Image Generation can use GPT-4o's knowledge to produce concrete output from vague instructions. Below is an image created with the instruction 'Create an infographic that explains in detail Newton's prism experiment.' Although the user did not explain how a prism works, GPT-4o generated a detailed explanatory image based on its own knowledge.

You can also give follow-up instructions, such as 'Generate a first-person image of me drawing this image in a notebook at a table in a cafe in Washington Square Park.'

4o Image Generation can be used with all ChatGPT plans, including the free version, and is also available in the video generation AI Sora. However, when I tried it on an editorial department PC at the time of writing, the image was still generated by the existing image generation AI 'DALL-E,' and a notice appeared below the generated image saying 'DALL-E is an old image generation model from OpenAI. A new model will be available soon.'

In addition, OpenAI does not block 4o Image Generation from generating images that depict public figures, and it allows the feature to be used to create political speech and satirical images. A system card summarizing the design and risks of 4o Image Generation can be found at the following link.

Native_Image_Generation_System_Card.pdf
(PDF file) https://cdn.openai.com/11998be9-5319-4302-bfbf-1167e093f1fb/Native_Image_Generation_System_Card.pdf

Mar 26, 2025 11:15:00 in AI, Software, Web Service, Web Application, Posted by log1o_hf