Why did my Google Colab session crash while running the Llama model?

riddhi - Aug 12 -

I am trying to use the meta-llama/Llama-2-7b-hf model and run it locally on my premises but the session crashed during the process.

I am trying to use the meta-llama/Llama-2-7b-hf model and run it locally on my premises. To do this, I am using Google Colab and have obtained an access key from Hugging Face. I am utilizing their transformers library for the necessary tasks. Initially, I used the T4 GPU runtime stack on Google Colab, which provided 12.7 GB of system RAM, 15.0 GB of GPU RAM, and 78.2 GB of disk space. Despite these resources, my session crashed, and I encountered the following error:

Subsequently, I switched to the TPU V2 runtime stack, which offers 334.6 GB of system RAM and 225.3 GB of disk space, but the issue persisted.

Here is my code:

!pip install transformers
!pip install --upgrade transformers

from huggingface_hub import login
login(token='Access Token From Hugging Face')

import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
from torch.utils.data import Dataset

# Load pre-trained Meta-Llama-3.1-8B model
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)