In this video, I show you how to use Amazon SageMaker to train a Transformer model with AWS Trainium and compile it for AWS Inferentia.
Starting from a BERT model and the Yelp review dataset, I first train a multi-class classification model on an ml.trn1.2xlarge instance. I also show you how to reuse the Neuron SDK model cache from one training job to the next, in order to save time and money on repeated jobs. Then, I compile the trained model for Inferentia with a SageMaker Processing batch job, making it easy to automate such tasks.
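
To make the cache reuse idea concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, that the first job's Neuron cache has been saved to S3 and is passed to the next job as an extra input channel. The function name, the `neuron_cache` channel name, the bucket paths, and the dictionary layout are all hypothetical and are not taken from the video or from the SageMaker SDK; only the `ml.trn1.2xlarge` instance type comes from the description above.

```python
from typing import Optional


def build_training_job_config(
    job_name: str,
    train_s3_uri: str,
    cache_s3_uri: Optional[str] = None,
) -> dict:
    """Assemble settings for a Trainium training job as a plain dict.

    Hypothetical helper: the keys below are illustrative and do not
    mirror any real SageMaker API.
    """
    config = {
        "job_name": job_name,
        "instance_type": "ml.trn1.2xlarge",  # instance type named in the description
        "instance_count": 1,
        "inputs": {"train": train_s3_uri},
    }
    # Assumption: a previous job's compilation cache is available in S3.
    # Passing it as an extra channel lets the next job skip recompilation.
    if cache_s3_uri is not None:
        config["inputs"]["neuron_cache"] = cache_s3_uri
    return config


# First job: no cache yet, so the model graphs are compiled from scratch.
first = build_training_job_config("bert-yelp-1", "s3://my-bucket/yelp/train")

# Second job: point at the cache location produced by the first job.
second = build_training_job_config(
    "bert-yelp-2",
    "s3://my-bucket/yelp/train",
    cache_s3_uri="s3://my-bucket/neuron-cache/bert-yelp-1",
)
```

In a real setup, a dictionary like this would be translated into the arguments of whichever SageMaker estimator you use, and the training script would need to read the cache from the corresponding channel directory.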