Stability AI launches SDXL 0.9: a leap forward in AI image generation

On June 22nd, Stability AI, the company behind the now-famous Stable Diffusion, which competes with Midjourney and OpenAI’s DALL·E 2, announced its new SDXL 0.9 model.

Stability AI announces SDXL 0.9, the most advanced development of the Stable Diffusion text-to-image model suite. Following the successful release of Stable Diffusion XL beta in April, SDXL 0.9 delivers a massive improvement in image detail and composition over its predecessor.

The model can be accessed today through ClipDrop, and the API will be available soon. Research weights are already available and an open release is expected by mid-July as we move towards version 1.0.

🔎 How to use Stable Diffusion SDXL 0.9: Visiting the ClipDrop website is enough to get started, but without an account you can generate only a few images. To generate more, register and verify your account with a phone number, which gives you 100 credits. ClipDrop offers several AI image-editing services, each costing a different number of credits. Once your credits run out, you can buy more to keep using the services. Try Stable Diffusion SDXL 0.9 at ClipDrop.

SDXL 0.9 can run on a modern consumer GPU, yet it represents a leap forward in creative use cases for generative AI imaging. Its ability to generate hyper-realistic creations for film, television, music and instructional videos, and to offer breakthroughs for design and industrial use, puts SDXL at the forefront of real-world applications for AI imaging.

Comparison ⚔️

Some examples of prompts tested on both SDXL beta and the new version 0.9 show how far this model has come in just two months. The images generated by the new version (0.9) carry the ClipDrop watermark.

Prompt used: “a robot driving a tesla in tokyo photographed by a professional nikon camera”.

[Image comparison: SDXL beta and SDXL 0.9]

Prompt used: “motoko kusanagi walking through the streets of berlin”.

[Image comparison: SDXL beta and SDXL 0.9]

Prompt used: “Glowing jellyfish floating through a foggy forest at twilight”.

[Image comparison: SDXL beta and SDXL 0.9]

The SDXL series also offers a range of functionalities that go beyond basic text prompting. These include image-to-image prompting (inputting an image to obtain variations of that image), inpainting (reconstructing missing parts of an image) and outpainting (constructing a seamless extension of an existing image).

Technical improvements 😮

The key driver of this advance in composition for SDXL 0.9 is its significant increase in parameter count (the total number of weights and biases in the model's neural network) over the beta version.

SDXL 0.9 has one of the largest parameter counts of any open-source image model, with a base model of 3.5B parameters and an ensemble model pipeline of 6.6B parameters (the final output is created by running two models and aggregating the results). The second model in the pipeline adds finer details to the output generated by the first stage.

For comparison, the beta version works with 3.1B parameters and uses only one model.
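
To put these parameter counts in perspective, a rough back-of-envelope sketch can estimate how much memory the weights alone would occupy. The figures below assume 2 bytes per parameter (16-bit floats) or 4 bytes (32-bit floats). These precision choices and the helper name `weight_memory_gb` are assumptions for illustration, not details from the announcement, and the estimate ignores activations and any other runtime overhead.

```python
# Back-of-envelope estimate of weight storage, not a measurement.
# Parameter counts come from the announcement; byte sizes are assumptions.

def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the size of the weights alone in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


MODELS = {
    "SDXL beta": 3.1e9,
    "SDXL 0.9 base": 3.5e9,
    "SDXL 0.9 base + refiner pipeline": 6.6e9,
}

if __name__ == "__main__":
    for name, params in MODELS.items():
        fp16 = weight_memory_gb(params, 2)  # assumed 16-bit weights
        fp32 = weight_memory_gb(params, 4)  # assumed 32-bit weights
        print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{fp32:.1f} GB at 32-bit")
```

Under these assumptions, the weights of the 3.5B-parameter base model come to roughly 7 GB at 16-bit precision. Actual memory use depends on the runtime and its settings.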

SDXL 0.9 uses two CLIP models, including one of the largest OpenCLIP models trained to date (OpenCLIP ViT-G/14), which increases the processing power of 0.9 and its ability to create realistic images with greater depth at a higher resolution of 1024×1024.

The SDXL team will shortly publish a research blog post that goes into more detail on the specifications and testing of this model.

System requirements 🖥️

Despite its powerful output and advanced model architecture, SDXL 0.9 can run on a modern consumer GPU. It requires Windows 10 or 11, or Linux, with 16GB of RAM and an Nvidia GeForce RTX 20-series graphics card (or a higher-level equivalent) with at least 8GB of VRAM. Linux users can also use a compatible AMD card with 16GB of VRAM.
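
The stated requirements can be written down as a small check. This is only a sketch: the `Machine` class and the `meets_stated_requirements` function are hypothetical names invented for illustration, the values must be supplied by hand rather than detected, and the GPU generation (RTX 20 or newer) is not verified.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hand-entered description of a computer (hypothetical helper type)."""
    os_name: str     # e.g. "Windows 10", "Windows 11", "Linux"
    ram_gb: int      # system memory in GB
    gpu_vendor: str  # "Nvidia" or "AMD"
    vram_gb: int     # GPU memory in GB


def meets_stated_requirements(m: Machine) -> bool:
    """Check a machine against the requirements listed in the article."""
    os_name = m.os_name.lower()
    vendor = m.gpu_vendor.lower()
    on_windows = os_name in ("windows 10", "windows 11")
    on_linux = os_name == "linux"

    # Windows 10/11 or Linux, with at least 16GB of RAM.
    if not (on_windows or on_linux) or m.ram_gb < 16:
        return False
    if vendor == "nvidia":
        # Simplification: the RTX 20-or-newer generation is not checked here.
        return m.vram_gb >= 8
    if vendor == "amd":
        # AMD cards are mentioned for Linux only, with 16GB of VRAM.
        return on_linux and m.vram_gb >= 16
    return False


if __name__ == "__main__":
    laptop = Machine("Windows 11", ram_gb=16, gpu_vendor="Nvidia", vram_gb=8)
    print(meets_stated_requirements(laptop))
```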

Beta release statistics 🧪

Since the launch of the SDXL beta on April 13, we’ve had a great response from our Discord community, which has now grown to almost 7,000 users. These users have generated over 700,000 images, averaging over 20,000 per day. Over 54,000 images have been entered into the Discord community showdowns, with 3,521 SDXL images nominated as winners.

Availability 🌐

SDXL 0.9 is now available on Stability AI’s ClipDrop platform. Stability AI API and DreamStudio customers, as well as users of other leading imaging tools such as NightCafe, will be able to access the model this Monday, June 26.

SDXL 0.9 will be provided for research purposes only for a limited period to gather feedback and fully refine the model prior to its general open release. The code to run it will be publicly available on GitHub.

Researchers who wish to access these models can request them through the following links: SDXL-0.9-Base model and SDXL-0.9-Refiner. Please log in to your Hugging Face account with your academic email to request access. Please note that SDXL 0.9 is currently intended for research purposes only.

What’s next? 🌱

SDXL 0.9 will be followed by the full open release of SDXL 1.0 scheduled for mid-July (date to be confirmed).

License ©️

SDXL 0.9 is released under a non-commercial, research-only license and is subject to its terms of use.