Video: generate synthetic data with Stable Diffusion to augment computer vision datasets

Building image datasets is hard work. Instead of scraping, cleaning and labeling images, why not generate them directly with a Stable Diffusion model?

In this video, I show you how to generate new images with a Stable Diffusion model and the diffusers library to augment an image classification dataset. Then, I add the new images to the original dataset and push the augmented dataset to the Hugging Face Hub. Finally, I fine-tune an existing model on the augmented dataset.
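To give a rough idea of the generation step, here is a minimal sketch, not the exact code from the video (that lives in the repository linked below). It assumes the `diffusers` and `torch` packages are installed and uses the Stable Diffusion v1.5 checkpoint listed in the links. The class name `example_dish`, the prompt template, the style strings, and the output folder are all hypothetical placeholders chosen for illustration.

```python
from pathlib import Path


def build_prompts(class_name, styles):
    """Return one text prompt per style for a given class label.

    The prompt template is an illustrative assumption, not the one used in the video.
    """
    label = class_name.replace("_", " ")
    return [f"a photo of {label}, {style}" for style in styles]


def generate_images(prompts, out_dir, images_per_prompt=4,
                    model_id="runwayml/stable-diffusion-v1-5"):
    """Generate images for each prompt and save them as PNG files in out_dir."""
    # Imported lazily so that build_prompts() can be used without torch/diffusers.
    import torch
    from diffusers import StableDiffusionPipeline

    # Use half precision on GPU to save memory; fall back to float32 on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for i, prompt in enumerate(prompts):
        # The pipeline returns an object whose .images attribute is a list of PIL images.
        images = pipe(prompt, num_images_per_prompt=images_per_prompt).images
        for j, image in enumerate(images):
            path = out_dir / f"{i:03d}_{j:02d}.png"
            image.save(path)
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical class name and styles; replace with the class you want to add.
    prompts = build_prompts(
        "example_dish",
        ["close-up, on a plate", "restaurant table, natural light"],
    )
    generate_images(prompts, "generated/example_dish")
```

Varying the prompt styles is one simple way to get some diversity in the generated images. It is still worth reviewing them by hand before adding them to the dataset, since generated images can be off-topic or low quality.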

Code: https://gitlab.com/juliensimon/huggingface-demos/-/tree/main/food102
Food101 dataset: https://huggingface.co/datasets/food101
Original model: https://huggingface.co/juliensimon/autotrain-food101-1471154053
How the original model was created with AutoTrain: https://youtu.be/uFxtl7QuUvo
Stable Diffusion model: https://huggingface.co/runwayml/stable-diffusion-v1-5
Stable Diffusion Space: https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5
Diffusers library: https://github.com/huggingface/diffusers
Food102 dataset: https://huggingface.co/datasets/juliensimon/food102
New model: https://huggingface.co/juliensimon/swin-food102