In this video, I show you how to accelerate Transformer training with Optimum Habana, an open-source library by Hugging Face that leverages the Habana Labs Gaudi chip.
First, I walk you through the setup of an Amazon EC2 DL1 instance (dl1.24xlarge), which comes with 8 Gaudi accelerators.
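If you'd rather script the instance launch than click through the console, here is a minimal boto3 sketch (not the exact steps from the video; the AMI ID and key pair name are placeholders to replace with your own):

```python
import boto3

# DL1 instances are available in us-east-1 and us-west-2
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",  # placeholder: a Habana (SynapseAI) Deep Learning AMI for your region
    InstanceType="dl1.24xlarge",      # the DL1 size, with 8 Gaudi accelerators
    KeyName="my-key-pair",            # placeholder: your SSH key pair
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```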
Then, I run a natural language processing job where I adapt existing Transformers training code for Optimum Habana, accelerating the fine-tuning of a DistilBERT model that classifies the star rating of Amazon product reviews. I train on 1 Gaudi chip, then on all 8 to demonstrate near-linear scaling.
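For reference, the adaptation mostly boils down to swapping the Trainer classes for their Gaudi counterparts. Here is a minimal sketch of that pattern; the dataset, its column names, and the hyperparameters are assumptions for illustration, not necessarily what I use in the video:

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

# Assumed dataset: amazon_reviews_multi, with "review_body" and "stars" (1-5) columns
dataset = load_dataset("amazon_reviews_multi", "en", split="train")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def preprocess(batch):
    # Fixed-length padding keeps tensor shapes static, avoiding graph recompilation in lazy mode
    tokens = tokenizer(batch["review_body"], truncation=True, padding="max_length", max_length=128)
    tokens["labels"] = [stars - 1 for stars in batch["stars"]]  # stars 1-5 -> labels 0-4
    return tokens

dataset = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=5)

# The Gaudi-specific part: GaudiTrainingArguments / GaudiTrainer instead of the vanilla classes
training_args = GaudiTrainingArguments(
    output_dir="./distilbert-reviews",
    use_habana=True,      # run on HPU
    use_lazy_mode=True,   # Gaudi lazy-mode graph execution
    gaudi_config_name="Habana/distilbert-base-uncased",  # Gaudi config hosted on the Hugging Face Hub
    per_device_train_batch_size=32,
    num_train_epochs=1,
)
trainer = GaudiTrainer(model=model, args=training_args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```

For the 8-Gaudi run, the optimum-habana repository ships a gaudi_spawn.py launcher that distributes the same training script across all devices, invoked along the lines of python gaudi_spawn.py --world_size 8 --use_mpi train.py.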
Finally, switching to computer vision, I use a built-in script from the Optimum Habana repository to accelerate image classification training jobs on the Food101 dataset, first with a Vision Transformer (ViT) model and then with a Swin Transformer model.
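In the video I rely on the repository's example script, but the same GaudiTrainer pattern works for vision models too. A rough sketch (the Gaudi config name and hyperparameters are assumptions):

```python
import torch
from datasets import load_dataset
from transformers import AutoImageProcessor, AutoModelForImageClassification
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

dataset = load_dataset("food101", split="train[:5000]")  # small slice to keep the sketch quick
num_labels = dataset.features["label"].num_classes       # 101 food categories

# Swap in "microsoft/swin-base-patch4-window7-224" for the Swin run
checkpoint = "google/vit-base-patch16-224-in21k"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(
    checkpoint,
    num_labels=num_labels,
    ignore_mismatched_sizes=True,  # needed when the checkpoint ships a differently-sized head (e.g. Swin)
)

def transform(batch):
    batch["pixel_values"] = processor(
        [img.convert("RGB") for img in batch["image"]], return_tensors="pt"
    )["pixel_values"]
    return batch

dataset = dataset.with_transform(transform)  # preprocess images on the fly

def collate_fn(examples):
    return {
        "pixel_values": torch.stack([e["pixel_values"] for e in examples]),
        "labels": torch.tensor([e["label"] for e in examples]),
    }

training_args = GaudiTrainingArguments(
    output_dir="./vit-food101",
    use_habana=True,
    use_lazy_mode=True,
    gaudi_config_name="Habana/vit",  # assumption: Gaudi config published by Habana on the Hub
    per_device_train_batch_size=64,
    num_train_epochs=1,
    remove_unused_columns=False,     # keep the raw "image" column for the on-the-fly transform
)
trainer = GaudiTrainer(model=model, args=training_args, train_dataset=dataset, data_collator=collate_fn)
trainer.train()
```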