Black Forest Labs, a company founded by the researchers behind Stable Diffusion, has released a new model: Flux AI. The text-to-image model has 12 billion parameters, and the company presents it as a new benchmark for open-source image generation. Black Forest Labs says Flux matches the visual quality of competitors like Midjourney and outperforms other models on the market, whether open or proprietary.
Flux AI comes in three variants aimed at different users. Flux Dev, for enthusiasts and developers, is released under a non-commercial license and is open to community-driven improvement. Flux Schnell is a streamlined version that generates images up to ten times faster and is released under the permissive Apache 2.0 license. Flux Pro, intended for professional and commercial work, is available only through an API.
How to try Flux AI?
Flux Dev and Flux Schnell can be downloaded from Hugging Face, and both are supported in ComfyUI for running locally. The launch, announced last Thursday, underscores Black Forest Labs’ ambition to push generative AI media technology forward.
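For readers who would rather use a script than ComfyUI, here is a minimal sketch of loading the Hugging Face weights with the diffusers library. It does not come from the Black Forest Labs announcement. The repository IDs, the step counts and guidance values, and the helper name generate are assumptions based on commonly published usage, and Flux Dev may require accepting its license on Hugging Face before the download succeeds.

```python
# Assumed settings per variant (not stated in the article): repository IDs
# on Hugging Face plus typical step counts and guidance values.
VARIANTS = {
    "schnell": {"repo": "black-forest-labs/FLUX.1-schnell", "steps": 4, "guidance": 0.0},
    "dev": {"repo": "black-forest-labs/FLUX.1-dev", "steps": 50, "guidance": 3.5},
}


def generate(prompt, variant="schnell", out_path="flux.png"):
    """Generate one image and save it to out_path (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without
    # torch or diffusers installed.
    import torch
    from diffusers import FluxPipeline

    cfg = VARIANTS[variant]
    # The first call downloads the full weights, which are very large.
    pipe = FluxPipeline.from_pretrained(cfg["repo"], torch_dtype=torch.bfloat16)
    # Moves submodels to the GPU only while they are in use, which lowers
    # peak VRAM; requires the accelerate package.
    pipe.enable_model_cpu_offload()

    result = pipe(
        prompt,
        num_inference_steps=cfg["steps"],
        guidance_scale=cfg["guidance"],
    )
    result.images[0].save(out_path)
    return out_path
```

Calling generate("A serene beach at sunset") would write flux.png to the current directory, provided the machine has enough memory for the model.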
“Our innovations include creating VQGAN and Latent Diffusion, Stability AI’s Stable Diffusion models for image and video generation (Stable Diffusion XL, Stable Video Diffusion, Rectified Flow Transformers), and Adversarial Diffusion Distillation for ultra-fast, real-time image synthesis,” the team stated.
Black Forest Labs announced Flux after closing a $31 million seed round led by Andreessen Horowitz, with participation from investors including Brendan Iribe, Michael Ovitz, and Garry Tan. That funding supported the development of the model.
According to Black Forest Labs’ own benchmark data, Flux AI outperforms established models such as Midjourney v6.1, DALL-E 3, and SD3 Ultra on several criteria: visual quality, prompt adherence, flexibility in size and aspect ratio, typography, and output diversity. By that data, the Pro and Dev variants lead the field, while Schnell places between Midjourney v5 and Ideogram.
Despite these advances, there is a caveat for users with less powerful hardware. The open-source models are large, at approximately 23 GB, and need almost 24 GB of VRAM to run well, at least until a lighter, quantized version becomes available. That puts them out of reach of GPUs with only 6 to 8 GB of VRAM.
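The size figure can be sanity-checked with simple arithmetic. The sketch below assumes the 12 billion parameters are stored at 16 bits each, which the article does not state, and it counts only the weights, not the text encoders, the image decoder, or working memory. The 8-bit and 4-bit rows are hypothetical quantization levels, shown only to illustrate why a quantized release would matter for smaller GPUs.

```python
NUM_PARAMS = 12e9  # parameter count reported for Flux


def weight_size_gb(num_params, bits_per_param):
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def weight_size_gib(num_params, bits_per_param):
    """Same size in binary gibibytes (1 GiB = 2**30 bytes)."""
    return num_params * bits_per_param / 8 / 2**30


if __name__ == "__main__":
    # 16-bit is the assumed storage format; 8 and 4 are hypothetical.
    for bits in (16, 8, 4):
        gb = weight_size_gb(NUM_PARAMS, bits)
        gib = weight_size_gib(NUM_PARAMS, bits)
        print(f"{bits:>2}-bit weights: {gb:.1f} GB ({gib:.1f} GiB)")
```

Under the 16-bit assumption the weights alone come to 24 GB, or roughly 22.4 GiB, which is in line with the reported download size and VRAM requirement.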
To address this, Black Forest Labs has partnered with Fal AI, creators of the AuraFlow model, to offer cloud-based image generation, so users without high-end hardware can still try Flux. Free trials are available on Replicate.com. Once the daily free quota is used up, $1 buys either 33 images from Flux Pro or 333 from Flux Schnell.
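Those rates work out to about 3 cents per Flux Pro image and about 0.3 cents per Flux Schnell image. The short sketch below does the arithmetic for larger batches, using only the images-per-dollar figures quoted above. The function names are illustrative, and actual pricing may change.

```python
# Images per US dollar, as quoted in this article.
IMAGES_PER_DOLLAR = {"Flux Pro": 33, "Flux Schnell": 333}


def cost_per_image(model):
    """Approximate price of one image in dollars."""
    return 1.0 / IMAGES_PER_DOLLAR[model]


def batch_cost(model, n_images):
    """Approximate price of n_images in dollars, ignoring any free quota."""
    return n_images / IMAGES_PER_DOLLAR[model]


if __name__ == "__main__":
    for model in IMAGES_PER_DOLLAR:
        print(f"{model}: ${cost_per_image(model):.4f} per image, "
              f"${batch_cost(model, 100):.2f} per 100 images")
```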
Flux vs. Midjourney
We’ve put Flux and Midjourney to the test to see how they stack up against each other.
Check out the side-by-side comparisons and see the results for yourself:
A serene beach at sunset, with waves gently lapping at the shore, a lone palm tree swaying in the breeze, and a sailboat silhouetted against the vibrant orange and pink sky
Flux AI:
Midjourney:
A cozy cabin in the woods during winter, smoke curling from the chimney, snow-covered trees surrounding it, and a warm, inviting light glowing from the windows
Flux AI:
Midjourney:
A steampunk cityscape with intricate machinery, airships floating above, and people dressed in Victorian-era attire with mechanical enhancements
Flux AI:
Midjourney:
A close-up portrait of an elderly woman with deep wrinkles and wise eyes, wearing a weathered hat and a flannel shirt, standing in front of an old wooden barn
Flux AI:
Midjourney:
Our initial comparisons suggest that Midjourney generally produces superior visuals, but we used the cloud-based version of Flux for these tests. A fairer assessment would require downloading Flux and running it locally on a high-powered GPU. That setup might show more of what Flux can do, and the results could differ significantly from our preliminary findings.
For those interested in deeper insights, we encourage exploring the array of community-generated visuals too:
A new open-source image generation model popped out of nowhere and it's actually insanely good??
FLUX.1 by @bfl_ml pic.twitter.com/K89GHoh3PQ
— Pietro Schirano (@skirano) August 1, 2024
Featured image credit: Kerem Gülen/Flux