Stability AI has been a key player in the artificial intelligence (AI) image generator space thanks to its open-source Stable Diffusion models, which set the bar for quality, customization, and speed. Now, the company is adding to its family of models with its most advanced text-to-image generator yet.

On Wednesday, Stability AI launched Stable Diffusion 3 Medium, which the company claims is its "most sophisticated" image generation model. The two-billion-parameter model boasts several upgrades from its predecessors, resulting in higher-quality generations.

For example, the new model handles tasks that image generators typically struggle with, including generating photorealistic images (even of hands and faces) and rendering accurate text without artifacts or spelling errors. It can also adhere to complex prompts and understand spatial relationships, as seen in the image below.

According to the company, Stable Diffusion 3 Medium's relatively small size makes it a good candidate for running on both consumer hardware and enterprise-tier GPUs. Stability AI added that the model is also well suited to customization due to its ability to gather "nuanced details from small datasets."
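
To see why the parameter count matters for where the model can run, here is a rough back-of-the-envelope sketch in Python. It takes only the two-billion-parameter figure from Stability AI's announcement. The bytes-per-parameter values are standard widths for common numeric formats and are an assumption here, not something the company has specified. The function name `weight_footprint_gb` is hypothetical, and the result covers the raw weights of those two billion parameters only, so real memory use during image generation will be higher.

```python
# Rough estimate of how much memory a model's weights occupy.
# Only the parameter count comes from the article; everything else is assumed.

PARAMS = 2_000_000_000  # two billion parameters, per the announcement

# Storage width of one parameter in common numeric formats (assumption).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_footprint_gb(params: int, bytes_per_param: int) -> float:
    """Return the size of the raw weights in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, width in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_footprint_gb(PARAMS, width):.1f} GB")
```

At half precision, that works out to roughly 4 GB for the weights alone, which is consistent with the company's pitch that the model can fit on a single consumer graphics card. Any additional components loaded alongside it would add to that total.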

Stable Diffusion 3 Medium's weights remain openly available to all users under a free non-commercial license via Hugging Face. Those interested in using the model commercially are encouraged to contact Stability AI for licensing information.

Stable Diffusion 3 Medium is also available through Stability AI's API, through the company's chatbot, Stable Assistant, and on Discord via Stable Artisan.