Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images from any text input. It gives people the freedom to produce striking imagery and to create art within seconds.

High-Resolution Image Synthesis with Latent Diffusion Models

Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512×512 images from a subset of the LAION-5B database. Similar to Google’s Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See the model card for details.
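
As a rough check on the "relatively lightweight" claim, the parameter counts quoted above can be turned into a weight-memory estimate. The sketch below assumes 4 bytes per parameter for float32 and 2 bytes for float16, and the function name is purely illustrative. It counts the UNet and text encoder weights only; the autoencoder, activations, and framework overhead are left out, which is one reason the practical VRAM requirement is higher than these figures.

```python
# Parameter counts quoted in the text above.
UNET_PARAMS = 860_000_000
TEXT_ENCODER_PARAMS = 123_000_000


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


total_params = UNET_PARAMS + TEXT_ENCODER_PARAMS

# Assumption: float32 uses 4 bytes per parameter, float16 uses 2.
fp32_gib = weight_memory_gib(total_params, 4)
fp16_gib = weight_memory_gib(total_params, 2)

print(f"float32 weights: {fp32_gib:.2f} GiB")
print(f"float16 weights: {fp16_gib:.2f} GiB")
```

Halving the precision halves the weight footprint, which is why half-precision inference is a common way to fit the model on smaller GPUs.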

Text-to-Image with Stable Diffusion

Stable Diffusion is a latent diffusion model conditioned on the (non-pooled) text embeddings of a CLIP ViT-L/14 text encoder. We provide a reference script for sampling, but there also exists a diffusers integration, where we expect to see more active community development.
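
For readers who want to try the diffusers route, a minimal sketch might look like the following. It is not the reference sampling script. The model ID, the default step count, and the guidance scale are assumptions chosen for illustration, and `build_generation_kwargs` and `generate_image` are hypothetical helper names. Running `generate_image` requires `torch`, `diffusers`, a CUDA-capable GPU, and access to the model weights.

```python
def build_generation_kwargs(prompt, steps=50, guidance_scale=7.5, size=512):
    """Collect sampling arguments; the default values are illustrative assumptions."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
        "height": size,
        "width": size,
    }


def generate_image(prompt, model_id="CompVis/stable-diffusion-v1-4"):
    """Generate one image with the diffusers pipeline (model_id is an assumption)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision reduces memory use on the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    return pipe(**build_generation_kwargs(prompt)).images[0]


# Example usage (downloads the weights on first run):
# image = generate_image("a photograph of an astronaut riding a horse")
# image.save("astronaut.png")
```

The 512-pixel default mirrors the 512×512 training resolution mentioned earlier; other sizes may work but are outside what this sketch assumes.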

Easy to use

An easy-to-use interface for creating images using the recently released Stable Diffusion image generation model.

High quality images

It can create high-quality images of anything you can imagine in seconds: just type in a text prompt and hit Generate.

GPU enabled and fast generation

Perfect for running a quick sentence through the model and getting results back rapidly.


  • Max supply: 100,000,000
  • Network: Solana
  • Symbol: STD
  • Contract: CofGfLCNVLL4Q8ycU8MZjvwWzrVAhCWyaJJpyUgT8GQ2

Frequently asked questions

What was the Stable Diffusion model trained on?

The underlying dataset for Stable Diffusion was the 2b English language label subset of LAION 5b, a general crawl of the internet created by the German charity LAION.

What kinds of GPUs will be able to run Stable Diffusion, and at what settings?

Most NVIDIA and AMD GPUs with 6GB of VRAM or more.

What are Diffusion Models?

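Generative models are a class of machine learning models that can generate new data based on training data. Diffusion models are generative models that learn to reverse a gradual noising process: starting from random noise, they remove noise step by step until an image emerges.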
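
To make the idea concrete, diffusion models are typically trained by corrupting data with noise and learning to undo that corruption. The sketch below shows only the noising half, using a common closed-form formulation with a linear noise schedule. The schedule values, the function names, and the toy four-element "image" are assumptions for illustration and are not taken from Stable Diffusion's actual configuration. Stable Diffusion applies this kind of process to compressed latent representations rather than raw pixels.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly (values are illustrative)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def signal_fraction(betas, t):
    """Cumulative product of (1 - beta) up to and including step t."""
    prod = 1.0
    for beta in betas[: t + 1]:
        prod *= 1.0 - beta
    return prod


def add_noise(x0, betas, t, rng):
    """Sample a noised copy of x0 at step t in closed form."""
    a_bar = signal_fraction(betas, t)
    return [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for x in x0
    ]


betas = linear_betas(1000)
rng = random.Random(0)  # fixed seed so runs are repeatable
x0 = [1.0, -1.0, 0.5, 0.0]  # toy stand-in for image data

for t in (0, 499, 999):
    print(t, round(signal_fraction(betas, t), 4), add_noise(x0, betas, t, rng))
```

As the step index grows, the surviving signal fraction shrinks toward zero and the sample approaches pure Gaussian noise. A trained model learns to run this in reverse, which is what turns random noise into an image.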

What is the copyright for using Stable Diffusion generated images?

The area of AI-generated images and copyright is complex and will vary from jurisdiction to jurisdiction.

Can artists opt in or opt out of having their work included in the training data?

There was no opt-in or opt-out for the LAION 5b model data. It is intended to be a general representation of the language-image connection of the Internet.

Can we expect more features?

Absolutely. We are working on that.

