Deep Learning Workflow in PyTorch

Super Kai (Kazuya Ito) - Jun 8


  1. Prepare the dataset (true data).
  2. Prepare a model.
  3. Train the model.
  4. Test the model.
  5. Save the model.

1. Prepare the dataset (true data).

(1) Get the dataset (true data) such as images, video, sound, text, etc.

(2) Divide the dataset (true data) into one part for training (train data) and one part for testing (test data). *Basically, train data is 80% and test data is 20% of the dataset.
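For example, a minimal sketch of an 80/20 split with torch.utils.data.random_split(), assuming a made-up tensor dataset:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical dataset: 100 samples with 3 input features and 1 target value each.
x = torch.randn(100, 3)
y = torch.randn(100, 1)
dataset = TensorDataset(x, y)

# 80% of the dataset for training, 20% for testing.
train_data, test_data = random_split(dataset, [80, 20])
print(len(train_data), len(test_data))  # 80 20
```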

2. Prepare a model.

(1) Select suitable layers for the dataset. *There are layers in PyTorch such as Linear(), Conv2d(), MaxPool2d(), etc. according to the doc.

(2) Select activation functions if necessary. *There are activation functions in PyTorch such as ReLU(), Sigmoid(), Softmax(), etc. according to the doc.
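For example, a minimal sketch of a model, assuming the hypothetical 3-feature dataset from the sketch above, with Linear() layers and a ReLU() activation function:

```python
from torch import nn

# Hypothetical model for the dataset above: 3 input features -> 1 output value,
# built with Linear() layers and a ReLU() activation function.
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_features=3, out_features=8),
            nn.ReLU(),
            nn.Linear(in_features=8, out_features=1),
        )

    def forward(self, x):
        return self.layers(x)

model = MyModel()
```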

3. Train the model.

(1) Select a suitable loss function and optimizer for the dataset (true data), as shown in the sketch after the memos below.
*Memos:

  • A loss function is the function which gets the mean (average) of the sum of the losses (differences) between a model's predictions and the true values (train or test data), to optimize the model during training or to evaluate how good the model is during testing.
  • An optimizer is the Gradient Descent algorithm which updates (adjusts) a model's parameters (weight and bias) to minimize the mean (average) of the sum of the losses (differences) between the model's predictions and the true values (train data) during training. *Gradient Descent (GD) is the algorithm which follows the negative gradient (slope) of a function step by step to find a (local) minimum of the function.
  • A loss function is also called a cost function or an error function.
  • There are loss functions in PyTorch such as L1Loss(), MSELoss(), CrossEntropyLoss(), etc. according to the doc.
  • There are optimizers in PyTorch such as SGD(), Adam(), Adadelta(), etc. according to the doc.
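For example, a minimal sketch of these choices, assuming the hypothetical regression dataset and the `model` from the sketches above:

```python
from torch import nn, optim

# Hypothetical choices for the dataset above: MSELoss() as the loss function
# and SGD() as the optimizer over the model's parameters (weight and bias).
loss_fn = nn.MSELoss()
optimizer = optim.SGD(params=model.parameters(), lr=0.01)
```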

(2) Calculate the model's predictions from the train data, working from the input layer to the output layer. *This calculation is called Forward Propagation or Forward Pass. *One training (epoch) starts.

(3) Calculate the mean (average) of the sum of the losses (differences) between the model's predictions and the true values (train data) using the loss function.

(4) Zero out the gradients of all tensors every training (epoch) so the calculation stays correct. *Gradients are accumulated (added) into buffers each time backward() is called rather than overwritten, so they must be reset before the next backward pass.

(5) Calculate the gradients from the average loss (difference) calculated in (3), working from the output layer to the input layer. *This calculation is called Backpropagation or Backward Pass.

(6) Update the model's parameters (weight and bias) by gradient descent with the optimizer, using the gradients calculated in (5), to minimize the mean (average) of the sum of the losses (differences) between the model's predictions and the true values (train data). *One training (epoch) ends.

*Memos:

  • The tasks from (2) to (6) are one training (epoch).
  • Basically, the training (epoch) is repeated with a for loop to minimize the mean (average) of the sum of the losses (differences) between the model's predictions and the true values (train data), as shown in the sketch below.
  • Basically, the model is tested (4. Test the model) after (6) every training (epoch) or once every n trainings (epochs).
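For example, a minimal sketch of the training loop for (2)-(6), assuming `model`, `loss_fn` and `optimizer` from the sketches above, and made-up `x_train` and `y_train` tensors as the train data:

```python
import torch

# Hypothetical train data tensors (80 samples, matching the split above).
x_train = torch.randn(80, 3)
y_train = torch.randn(80, 1)

epochs = 100  # hypothetical number of trainings (epochs)

for epoch in range(epochs):
    model.train()

    # (2) Forward Propagation (Forward Pass): calculate the predictions from the train data.
    pred = model(x_train)

    # (3) Calculate the mean of the losses between the predictions and the true values.
    loss = loss_fn(pred, y_train)

    # (4) Zero out the gradients accumulated in the previous training (epoch).
    optimizer.zero_grad()

    # (5) Backpropagation (Backward Pass): calculate the gradients from the loss.
    loss.backward()

    # (6) Update the model's parameters (weight and bias) with the optimizer.
    optimizer.step()
```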

4. Test the model.

(1) Calculate the model's predictions from the test data.

(2) Calculate the mean (average) of the sum of the losses (differences) between the model's predictions and the true values (test data) with the loss function.

(3) Show the mean (average) of the sum of the losses (differences) for the true values (train and test data) as text or as a graph.
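For example, a minimal sketch of the test step, assuming `model`, `loss_fn` and the final train `loss` from the sketches above, and made-up `x_test` and `y_test` tensors as the test data:

```python
import torch

# Hypothetical test data tensors (20 samples, matching the split above).
x_test = torch.randn(20, 3)
y_test = torch.randn(20, 1)

model.eval()

# Calculate the predictions and the loss on the test data without tracking gradients.
with torch.inference_mode():
    test_pred = model(x_test)
    test_loss = loss_fn(test_pred, y_test)

# Show the train and test losses as text.
print(f"Train loss: {loss.item():.4f} | Test loss: {test_loss.item():.4f}")
```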

5. Save the model.

Finally, save the model if it has the quality you want.
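For example, a minimal sketch of saving and reloading, assuming `MyModel` and `model` from the sketches above and a hypothetical file name:

```python
import torch

# Save only the learned parameters (the state_dict), which is the commonly recommended way.
torch.save(model.state_dict(), "my_model.pth")

# Reload the parameters into a new instance of the same model class.
loaded_model = MyModel()
loaded_model.load_state_dict(torch.load("my_model.pth"))
loaded_model.eval()
```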
