Stability AI Releases Stable Diffusion XL 1.0

AI startup Stability AI continues to improve its generative models in the face of growing competition and ethical concerns.

The company today announced the release of Stable Diffusion XL 1.0, a text-to-image model that it calls its “most advanced” to date. Available as open source on GitHub, through an API, and in the consumer apps ClipDrop and DreamStudio, Stable Diffusion XL 1.0 delivers “brighter” and “accurate” colors, as well as better contrast, shadows, and lighting than its predecessor, Stability claims.

In an interview with TechCrunch, Joe Penna, head of applied machine learning at Stability AI, noted that Stable Diffusion XL 1.0, which contains 3.5 billion parameters, can produce full 1-megapixel images “in seconds.” “Parameters” are the parts of a model learned from training data; they essentially define the model’s skill at a given task, in this case generating images.

The previous-generation model, Stable Diffusion XL 0.9, could also generate higher-resolution images, but required more processing power.

“Stable Diffusion XL 1.0 is a customizable model, ready to be fine-tuned for concepts and styles,” Penna said. “It’s also easier to use and capable of building complex designs with basic natural-language prompts.”

Stable Diffusion XL 1.0 is also better at rendering text. Many of the best text-to-image models struggle to generate images with legible logos, let alone calligraphy or fonts, Penna says, but Stable Diffusion XL 1.0 can generate “advanced” text and keep it legible.

Stable Diffusion XL 1.0 supports inpainting (restoring missing parts of an image), outpainting (extending existing images), and “image-to-image” prompting, in which the user supplies an image along with text prompts to create more detailed variations of that image. In addition, the model understands complex, multi-part instructions given in short prompts, whereas previous Stable Diffusion models required longer text prompts.

“We hope that with the release of this much more powerful open source model, not only will image resolution be quadrupled, but improvements will also occur that will greatly benefit all users,” Penna added.

However, as with previous versions of Stable Diffusion, this model raises difficult moral questions.

The open source version of Stable Diffusion XL 1.0 could, in theory, be used by bad actors to create toxic or harmful content, such as nonconsensual deepfakes. This is partly due to the data that was used to train it: millions of images from the internet.

Countless tutorials demonstrate how to use Stability AI’s own tools, including DreamStudio, the open source front end for Stable Diffusion, to create deepfakes. Countless others show how to fine-tune the base Stable Diffusion models to create pornography.

Penna does not deny that abuse is possible, and admits that the model also contains certain errors. However, he added that Stability AI has taken “additional steps” to reduce harmful content by filtering the model’s training data for “unsafe” images, issuing new warnings related to problematic prompts, and blocking as many individual problematic terms as possible in the tool.

Coinciding with the release of Stable Diffusion XL 1.0, Stability AI is releasing a beta version of its API’s fine-tuning feature, which will let users “specialize” generation on specific people, products, and more with as few as five images. The company is also bringing Stable Diffusion XL 1.0 to Bedrock, Amazon’s cloud platform for hosting generative AI models, expanding its previously announced collaboration with AWS.

The new partnerships and expanded capabilities come amid a lull in the commercial endeavors of Stability, which faces stiff competition from OpenAI, Midjourney and others. In April, Semafor reported that Stability AI, which has raised more than $100 million in venture capital to date, was suffering from a cash crunch, prompting the closing of a $25 million convertible note in June and a search for executives to help increase sales.

“The latest SDXL model represents the next step in Stability AI’s innovative legacy and ability to bring the most advanced open models to market for the AI community,” Stability AI CEO Emad Mostaque said in a press release. “The introduction of version 1.0 on the Amazon Bedrock platform demonstrates our strong commitment to working with AWS to provide the best solutions for developers and our customers.”