Aug 02, 2024 17:00:00

Stable Diffusion's original developers launch AI company Black Forest Labs and release their own image generation AI model, Flux

AI researchers involved in the development of image generation AI such as 'Stable Diffusion' have launched a new AI development company, 'Black Forest Labs'. Black Forest Labs has also announced 'Flux', an open source image generation AI model with 12 billion parameters.

Announcing Flux by Black Forest Labs: The Next Leap in Text-to-Image Models
https://blog.fal.ai/flux-the-largest-open-sourced-text2img-model-now-available-on-fal/

Announcing Black Forest Labs - Black Forest Labs
https://blackforestlabs.ai/announcing-black-forest-labs/

Black Forest Labs has been launched with the goal of developing generative deep learning models for media content such as images and videos. It was founded by AI researchers and engineers with a strong track record of developing foundational generative AI models in academic, industrial, and open source settings. According to fal, a generative media platform for developers, Black Forest Labs was launched by 'the original development team of Stable Diffusion.'

AI researchers and engineers involved in the launch of Black Forest Labs include Andreas Blattmann, Andy Holmes, Axel Sauer, Dominik Lorenz, Dustin Podell, Frederic Boesel, Harry Saini, Jonas Müller, Kyle Lacey, Patrick Esser, Robin Rombach, Sumith Kulal, Tim Dockhorn, Yam Levi, and Zion English.

AI models developed by Black Forest Labs' AI researchers include VQGAN, Latent Diffusion, the Stable Diffusion image and video generation models (Stable Diffusion XL, Stable Video Diffusion, Rectified Flow Transformers), and Adversarial Diffusion Distillation.

Black Forest Labs' fundamental belief is that 'broadly accessible AI models not only foster innovation and collaboration within the research community and academia, but also increase the transparency that is essential for trust and widespread adoption.' Black Forest Labs aims to develop the highest quality technology and make it available to the widest possible audience.

Black Forest Labs has raised $31 million in funding, with major backers including venture capital firm Andreessen Horowitz and angel investors Brendan Iribe, Michael Ovitz, Garry Tan, Timo Aila, and Vladlen Koltun.

Black Forest Labs has announced 'Flux', an open source text-to-image model. Flux has 12 billion (12B) parameters, and Black Forest Labs emphasizes that it is an open source AI model with state-of-the-art (SOTA) performance at the time of writing.
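To give a sense of what a 12 billion parameter model means in practice, the following is a rough back-of-the-envelope sketch of the memory needed just to hold the weights at different numeric precisions. The helper name and the list of precisions are illustrative assumptions, not something Black Forest Labs has published, and the figures exclude text encoders, the image decoder, and activations.

```python
# Rough estimate of weight storage for a 12B-parameter model.
# Assumption: every parameter is stored at the same precision.
PARAMS = 12_000_000_000  # 12 billion parameters, as stated for Flux

# Bytes used per parameter for some common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_gigabytes(params: int, dtype: str) -> float:
    """Return the size of the weights in gigabytes (10**9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_gigabytes(PARAMS, dtype):.0f} GB")
```

Under these assumptions the weights alone take roughly 48 GB in float32, 24 GB in bfloat16, and 12 GB in int8, which is why half-precision or quantized loading is commonly used for models of this size.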

Below is an example of an image created with Flux.

The prompt used to create this artwork was, 'Extreme close-up of a single tiger eye, direct frontal view. Detailed iris and pupil. Sharp focus on eye texture and color. Natural lighting to capture authentic eye shine and depth. The word 'FLUX' is painted over it in big, white brush strokes with visible texture.'

The prompt used to create this was 'Portrait of a bearded man with dark hair wearing red sunglasses and a light gray Patagonia fleece jacket. He has a serious expression and is looking directly at the camera. The background shows a blurred outdoor scene with rocky terrain and a vibrant pink and purple sunset sky. The lighting gives the image a warm, golden-hour glow. The overall mood is rugged yet stylish, with a touch of adventure.'

The prompt used to create this was 'Close-up of LEGO chef minifigure cooking for homeless. Focus on LEGO hands using utensils, showing culinary skill. Warm kitchen lighting, late morning atmosphere. Canon EOS R5, 50mm f/1.4 lens. Capture intricate cooking techniques. Background hints at charitable setting. Inspired by Paul Bocuse and Massimo Bottura's styles. Freeze-frame moment of food preparation. Convey compassion and altruism through scene details.'

The prompt used to create this image was, 'A giant potato in sunglasses and a Hawaiian shirt lounges on a beach towel surrounded by colorful beach balls and flip-flops. Nearby, anthropomorphic fruits play beach volleyball. In the background, a lighthouse sand sculpture stands next to an ice cream truck with a giant cone, serving treats to cheerful beachgoers. The scene captures a fun, playful summer vibe with the sound of waves crashing nearby.'

Flux is available in three variations, all of which will be available around fall 2024. There is also a demo page where you can try out each Flux model. To use the fal demo, you will need to sign in with your GitHub account, but on the Replicate demo page you can try out each model without an account.

・FLUX.1 [dev]
The base model of Flux, open sourced under a non-commercial license.

black-forest-labs/FLUX.1-dev · Hugging Face
https://huggingface.co/black-forest-labs/FLUX.1-dev

FLUX.1 [dev] | AI Playground | fal.ai
https://fal.ai/models/fal-ai/flux/dev

black-forest-labs/flux-dev – Run with an API on Replicate
https://replicate.com/black-forest-labs/flux-dev

・FLUX.1 [schnell]
A distilled version of the base model that runs up to 10 times faster. Licensed under Apache 2.0.

FLUX.1 [schnell] | AI Playground | fal.ai
https://fal.ai/models/fal-ai/flux/schnell

black-forest-labs/flux-schnell – Run with an API on Replicate
https://replicate.com/black-forest-labs/flux-schnell

・FLUX.1 [pro]
Closed source version available only via API.

FLUX.1 [pro] | AI Playground | fal.ai
https://fal.ai/models/fal-ai/flux-pro

black-forest-labs/flux-pro – Run with an API on Replicate
https://replicate.com/black-forest-labs/flux-pro
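Besides the hosted demos, the openly licensed FLUX.1 [schnell] weights can in principle be run locally. The following is a minimal sketch, assuming the Hugging Face diffusers library with its FluxPipeline class, PyTorch, and the repository id black-forest-labs/FLUX.1-schnell, none of which are described in this article. The helper names and the specific settings (4 inference steps, guidance disabled) are illustrative choices for a few-step model rather than official recommendations.

```python
def schnell_call_kwargs(prompt: str) -> dict:
    """Build keyword arguments for one pipeline call.

    Assumed settings for a few-step distilled model: a handful of
    inference steps and no classifier-free guidance.
    """
    return {
        "prompt": prompt,
        "guidance_scale": 0.0,
        "num_inference_steps": 4,
        "max_sequence_length": 256,
    }


def generate(prompt: str, out_path: str = "flux-schnell.png") -> str:
    """Generate one image and save it to out_path (hypothetical helper)."""
    # Heavy dependencies are imported lazily so the helper above can be
    # used without torch or diffusers installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell",  # assumed repository id
        torch_dtype=torch.bfloat16,
    )
    # Move submodules to the GPU only while they are needed, to reduce VRAM use.
    pipe.enable_model_cpu_offload()

    image = pipe(**schnell_call_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

Calling generate("Close-up view of a tin toy in a cyberpunk world") would download the weights on first use, so expect a large download and a GPU with substantial memory.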

The image below was created with FLUX.1 [pro] with the prompt, 'Close-up view of a tin toy. The background is a cyberpunk world. The word 'GIGAZINE' is written in an electrified font on the forefront of the image.' Image generation AI is generally not good at rendering text with the correct spelling, but the word 'GIGAZINE' was drawn as instructed.

The graph below compares the ELO scores of each Flux model with other image generation AI models.
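For readers unfamiliar with ELO scores, the following is a minimal sketch of the standard Elo rating update applied to a single head-to-head preference vote between two models. The article does not say how Black Forest Labs computed its scores, so the K-factor of 32 and the starting ratings are assumptions chosen purely for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new ratings after one comparison.

    score_a is 1.0 if A's image was preferred, 0.0 if B's was,
    and 0.5 for a tie. k (assumed here) controls the step size.
    """
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


# Two models start level; model A wins one vote.
print(update(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```

A higher rating therefore means that a model's outputs were preferred more often in pairwise comparisons, with wins against highly rated opponents counting for more.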

Regarding Flux's performance, Black Forest Labs explained, 'FLUX.1 [pro] and [dev] outperform popular image generation AI models such as Midjourney v6.0, DALL·E 3 (HD), and SD3-Ultra in visual quality, prompt following, size/aspect ratio variability, typography, and output diversity. FLUX.1 [schnell] is the most advanced few-step model to date, outperforming not only competitors in its class, but also powerful non-distilled models such as Midjourney v6.0 and DALL·E 3 (HD). Our models are specifically fine-tuned to preserve the entire output diversity from pre-training, and they show a significant improvement over the existing state of the art.'

Below is a graph comparing each Flux model and a third-party image generation AI model based on five elements: 'Visual Quality,' 'Prompt Following,' 'Size / Aspect Variability,' 'Typography,' and 'Output Diversity.'

The main features of Flux are:

Enhanced Image Quality: Higher resolution produces stunning visuals.
Advanced Human Anatomy and Photorealism: Generates highly realistic, anatomically accurate images.
Improved Prompt Adherence: Generates more accurate and relevant images based on your input.
Outstanding Speed: The speed and efficiency of FLUX.1 [schnell] make it ideal for demanding applications.

Black Forest Labs has also announced plans to build a video generation AI model based on Flux that can generate videos from text.

Aug 02, 2024 17:00:00 in AI, Software, Posted by logu_ii