Oct 23, 2024 10:55:00

Three models of the image generation AI 'Stable Diffusion 3.5' series are openly released, featuring high customizability, fidelity to prompts, and high quality

Stability AI, the developer of the image generation AI Stable Diffusion, announced the open release of 'Stable Diffusion 3.5' on Tuesday, October 22, 2024. Stable Diffusion 3.5 includes multiple models, all of which are available free of charge for commercial and non-commercial use under the Stability AI Community License.

Introducing Stable Diffusion 3.5 — Stability AI Japan
https://ja.stability.ai/blog/introducing-stable-diffusion-3-5

Stability AI said, 'In June (2024), we released Stable Diffusion 3 Medium, the first open release of the Stable Diffusion 3 series. However, this release did not fully meet our standards or the expectations of the community. Based on valuable feedback from the community, we therefore took the time to develop a version that would advance our mission of transforming visual media, rather than shipping a quick fix.'

Stability AI claims that Stable Diffusion 3.5 offers advantages in 'customizability,' 'efficient performance,' 'diverse output,' and 'various styles.' According to the company, this makes it one of the most customizable and easy-to-use image models on the market, while maintaining top-level performance in prompt adherence and image quality.

In particular, the prioritization of customizability means that output results from the same prompt can vary greatly depending on the seed, and non-specific prompts can increase output uncertainty and cause visual variability. Stability AI says this is intentional.

Stable Diffusion 3.5 is available in three models: Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, and Stable Diffusion 3.5 Medium.

'Stable Diffusion 3.5 Large' has 8 billion parameters and is the most powerful model in the Stable Diffusion 3 series. It offers excellent image quality and prompt adherence, and is suited to professional use cases at 1-megapixel resolution. The Stable Diffusion 3.5 Large model is available in the following Hugging Face repository.

stabilityai/stable-diffusion-3.5-large · Hugging Face
https://huggingface.co/stabilityai/stable-diffusion-3.5-large

'Stable Diffusion 3.5 Large Turbo' is a distilled version of Stable Diffusion 3.5 Large. It can generate high-quality images in just four steps, making it much faster than Stable Diffusion 3.5 Large. The Stable Diffusion 3.5 Large Turbo model is available in the following Hugging Face repository.

stabilityai/stable-diffusion-3.5-large-turbo · Hugging Face
https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo
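
As a rough illustration of the four-step generation described above, the following sketch uses Hugging Face's diffusers library. It assumes that diffusers and PyTorch are installed, that access to the model repository has been granted on Hugging Face, and that a CUDA GPU with enough memory is available. Only the repository IDs and the four-step figure come from this article. The guidance values and the 28-step setting for the Large model are assumptions, and sampling_settings and generate are hypothetical helper names.

```python
# Hypothetical helpers. Only the repository IDs and the four-step figure
# come from the article; every other value is an assumption.
REPOS = {
    "large": "stabilityai/stable-diffusion-3.5-large",
    "large-turbo": "stabilityai/stable-diffusion-3.5-large-turbo",
}


def sampling_settings(variant):
    """Return sampler keyword arguments for a model variant."""
    if variant == "large-turbo":
        # Distilled model: four steps (from the article), guidance off (assumption).
        return {"num_inference_steps": 4, "guidance_scale": 0.0}
    if variant == "large":
        # Assumed settings for the non-distilled model.
        return {"num_inference_steps": 28, "guidance_scale": 3.5}
    raise ValueError(f"unknown variant: {variant}")


def generate(prompt, variant="large-turbo", seed=0):
    """Generate one image. Heavy imports are deferred so that the
    helpers above can be used on a machine without a GPU."""
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        REPOS[variant], torch_dtype=torch.bfloat16
    )
    pipe = pipe.to("cuda")
    # A fixed seed makes a run repeatable; a different seed gives a different image.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipe(prompt, generator=generator, **sampling_settings(variant))
    return result.images[0]
```

Calling generate("a red fox in the snow", seed=1).save("fox.png") would write one image to disk. Repeating the call with another seed is a simple way to observe the seed-dependent variability that Stability AI describes as intentional.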

'Stable Diffusion 3.5 Medium' has 2.6 billion parameters and uses the MMDiT-X architecture, an improved version of the Multimodal Diffusion Transformer (MMDiT) architecture that is the basis of the Stable Diffusion 3 series. In addition, the training method has been improved to balance ease of customization with image quality. Stable Diffusion 3.5 Medium is designed to be 'ready to use out of the box' on consumer hardware, and can generate images with resolutions from 0.25 to 2 megapixels.
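
To make the 0.25 to 2 megapixel range concrete, the sketch below converts a megapixel budget and an aspect ratio into a width and height. It assumes that 1 megapixel means 1,000,000 pixels and that each side should be a multiple of 64 pixels. Both are illustrative assumptions rather than requirements stated by Stability AI, and dimensions_for is a hypothetical helper name.

```python
import math


def dimensions_for(megapixels, aspect_w=1, aspect_h=1, multiple=64):
    """Return (width, height) near the requested pixel count.

    Each side is rounded to the nearest multiple of `multiple`
    (an assumed constraint, not one taken from the article).
    """
    target_pixels = megapixels * 1_000_000
    ratio = aspect_w / aspect_h
    # Solve width * height = target_pixels with width = ratio * height.
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(dimensions_for(0.25))      # (512, 512)
print(dimensions_for(1))         # (1024, 1024)
print(dimensions_for(2))         # (1408, 1408)
print(dimensions_for(1, 16, 9))  # (1344, 768)
```

Under these assumptions, the lower end of the range corresponds to roughly 512 x 512 pixels and the upper end to roughly 1408 x 1408 pixels for square images.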

Stability AI published a bar graph comparing Elo rating scores for prompt adherence and aesthetic quality between the Stable Diffusion 3.5 series and other image generation AI models. According to the results, Stable Diffusion 3.5 Large scores higher on prompt adherence than Flux (12 billion parameters), an image generation AI created by the original developers of Stable Diffusion, and its aesthetic quality exceeds that of previous models in the Stable Diffusion series.
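
For readers unfamiliar with Elo scores, the sketch below shows how a rating gap translates into an expected preference rate. It assumes the standard Elo formula with a scale of 400. The article does not say how Stability AI computed its scores, and the ratings used here are hypothetical values chosen only for illustration.

```python
def expected_win_rate(rating_a, rating_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


# Hypothetical ratings, not figures from Stability AI's graph.
print(expected_win_rate(1000, 1000))            # 0.5
print(round(expected_win_rate(1050, 1000), 3))  # 0.571
```

Under this model, equal ratings mean each model is preferred half the time, and a 50-point lead corresponds to being preferred in roughly 57% of head-to-head comparisons.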

The Stable Diffusion 3.5 series models will be published on Hugging Face and will be available on multiple AI platforms, including the Stability AI API. Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo had already been released at the time of writing, while Stable Diffusion 3.5 Medium is scheduled for release slightly later, on Tuesday, October 29, 2024.

Oct 23, 2024 10:55:00 in AI, Software, Posted by log1i_yk