Jan 02, 2026 20:40:00

Introducing 'Qwen-Image-2512', an image generation AI that creates realistic images that don't look AI-generated, is free to use, and accepts instructions in Japanese

Qwen, Alibaba's AI research team, released the image generation AI model 'Qwen-Image-2512' on December 31, 2025. A major feature of Qwen-Image-2512 is its ability to generate realistic images that do not look AI-generated. The model is already available for download and local execution, so I tried it out.

Qwen-Image-2512: Finer Details, Greater Realism

https://qwen.ai/blog?id=qwen-image-2512

Below are images generated from the same prompt using the previous Qwen-Image (left) and Qwen-Image-2512 (right). The previous Qwen-Image output looked as if a beauty filter had been applied, which gave it an AI-like appearance, whereas Qwen-Image-2512 renders the face and lighting more realistically.

Qwen-Image-2512 also offers improved detail.

The quality of images containing text has also improved.

The model data for Qwen-Image-2512 is available on Hugging Face and can be downloaded and run locally. The ComfyUI development team has also published a generation workflow that uses Qwen-Image-2512, so I decided to try generating an image using the workflow.

Get started (Local ComfyUI):

1. Use the Qwen Image workflow in Template Library or open this Qwen Image 2512 workflow: https://t.co/603dqeUdYo
2. Download the models: https://t.co/ZCGllKmr3I
3. Edit your prompt and run!
— ComfyUI (@ComfyUI) January 1, 2026

This is what the workflow looks like when opened in ComfyUI.

The Qwen-Image series can also handle Japanese prompts to some extent, so I tried generating an image by entering the following in Japanese: 'A snapshot taken with a camera. Japanese grandparents are eating cookies in a slightly cluttered room. It is a sunny afternoon, and a field can be seen outside the window.'

The generated result is below. It responded perfectly to instructions in Japanese.

I also tried generating various other images. The prompts and generated results are shown below. It can respond well to detailed instructions and generate high-quality images, but it has difficulty generating images containing Japanese text.

A Japanese woman in her 20s is making a phone call in a phone booth. It is raining outside. Inside the booth, there is a sticker with the word 'GIGAZINE' written on it.

A snapshot taken with a camera. A maid drinking a bottle of cola in an alley at night. The maid is a Japanese woman in her 20s with blue hair in pigtails. She is sitting next to an air conditioner unit. The photo is taken from above, with the maid looking into the camera. Magazines are scattered around the maid, and the magazines have 'GIGAZINE' written on the covers.

A snapshot taken with a camera. A Japanese man in his 20s is holding up a placard that reads, 'We look forward to your continued support in 2026.' The photo was taken in front of the station, on a clear day in the evening.

The workflow provided by ComfyUI is set to generate images over 50 steps, and on a PC equipped with a GeForce RTX 5070 Ti, it took 200 to 205 seconds to generate one image.
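As a rough illustration of what those figures imply, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate: it assumes the whole 200 to 205 seconds is spread evenly over the 50 steps, ignoring fixed costs such as model loading and image decoding, and the function names are made up for this example.

```python
# Back-of-the-envelope estimate from the measured figures in the article.
# Assumption: generation time is divided evenly across sampling steps
# (fixed overhead such as model loading is ignored).

STEPS = 50                      # step count set in the ComfyUI workflow
SECONDS_PER_IMAGE = (200, 205)  # measured range on a GeForce RTX 5070 Ti


def seconds_per_step(total_seconds: float, steps: int) -> float:
    """Average time spent on one sampling step."""
    return total_seconds / steps


def images_per_hour(total_seconds: float) -> float:
    """How many images could be generated back to back in one hour."""
    return 3600 / total_seconds


fastest, slowest = SECONDS_PER_IMAGE
print(f"per step: {seconds_per_step(fastest, STEPS):.1f} "
      f"to {seconds_per_step(slowest, STEPS):.1f} seconds")
print(f"per hour: {images_per_hour(slowest):.1f} "
      f"to {images_per_hour(fastest):.1f} images")
```

Under that assumption, each step takes roughly 4 seconds, which works out to about 17 to 18 images per hour.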

Jan 02, 2026 20:40:00 in AI, Software, Review, Posted by log1o_hf