Neural Networks: Backpropagation Explained

Shagun Mistry - Sep 18 - - Dev Community

Backpropagation is what makes neural networks tick. It's the secret sauce that turns raw data into intelligent predictions, powering everything from self-driving cars to virtual assistants.

The Essence of Learning

At its core, backpropagation is elegantly simple:

  1. See: The network observes data.
  2. Guess: It makes a prediction.
  3. Learn: It refines its understanding based on its mistakes.

Sound familiar? It's how we all learn, transformed into mathematical poetry.

The Dance of Numbers

Let's break it down:

  1. Forward Pass: Data flows through the network, layer by layer.
  2. Error Calculation: The network compares its guess to reality.
  3. Backward Pass: The error ripples back, fine-tuning connections.

It's a beautiful choreography of numbers, each step precisely calculated to bring the network closer to perfection.

A Human Analogy

Imagine you're learning to play darts for the first time. Here's how your learning process mirrors backpropagation in neural networks:

  1. The Forward Pass: Taking Your Shot
    You look at the dartboard (input), estimate the force and angle needed (weights), and throw the dart (output). This is like the neural network making a prediction based on its current understanding.

  2. Calculating the Error: Measuring the Miss
    You see how far your dart landed from the bullseye. The distance and direction of your miss represent the error in the network's prediction.

  3. Backpropagation: Analyzing and Adjusting
    Now comes the crucial part. You don't just see that you missed; you analyze why:

    • Was your throw too hard or too soft? (Adjusting the "weight" of your throw)
    • Did you aim too high or too low? (Adjusting the "weight" of your aim)
    • How did your wrist motion affect the throw? (Fine-tuning earlier "layers" of the action)

This is like the error signal propagating back through the neural network, adjusting weights at each layer.

  1. Learning Rate: Pacing Your Adjustments
    You don't completely change your throwing motion based on one try. You make small, incremental adjustments. This is akin to the learning rate in neural networks, preventing overreaction to single errors.

  2. Iterations: Practice Makes Perfect
    You keep throwing darts, each time getting feedback and making tiny adjustments. With enough practice, your aim improves significantly. This iterative process mirrors the many epochs a neural network goes through during training.

  3. Generalization: Adapting to New Targets
    Eventually, you become good enough to hit not just the bullseye, but any part of the dartboard you aim for. This is similar to how a well-trained neural network can generalize its learning to new, unseen data.

In this analogy, your brain acts like the neural network, constantly processing feedback, making adjustments, and improving performance. The key insight is that improvement comes not just from practice, but from mindful analysis of each attempt and systematic adjustment based on errors.

Just as you fine-tune your dart-throwing technique through this feedback loop, backpropagation allows neural networks to continuously refine their "understanding," leading to increasingly accurate predictions and decisions.

Bringing It to Life

Here's a glimpse into the heart of backpropagation, distilled into Python:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        d_weights2 = np.dot(self.layer1.T, 2 * (self.y - self.output) * sigmoid_derivative(self.output))
        d_weights1 = np.dot(self.input.T, np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1))

        self.weights1 += d_weights1
        self.weights2 += d_weights2

# Example usage
X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
y = np.array([[0], [1], [1], [0]])
nn = NeuralNetwork(X, y)

for _ in range(1500):
    nn.feedforward()
    nn.backprop()

print(nn.output)
Enter fullscreen mode Exit fullscreen mode

Decoding the Neural Dance: A Code Walkthrough

Let's peel back the layers of our Python implementation, revealing the elegant simplicity of backpropagation in action.

The Neuron's Activation

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)
Enter fullscreen mode Exit fullscreen mode

These functions are the heartbeat of our network. The sigmoid function squashes any input into a value between 0 and 1, mimicking a neuron's firing. Its derivative, crucial for learning, tells us how to adjust our network's "thinking."

The Network's Architecture

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(y.shape)
Enter fullscreen mode Exit fullscreen mode

Here, we're setting up our network's structure. We initialize random weights, creating a canvas for our network to paint its understanding upon. The input and y are our training data and expected outputs – the network's textbook and exam answers.

The Forward Pass: Thinking in Action

def feedforward(self):
    self.layer1 = sigmoid(np.dot(self.input, self.weights1))
    self.output = sigmoid(np.dot(self.layer1, self.weights2))
Enter fullscreen mode Exit fullscreen mode

This is our network in action. It takes the input, processes it through layers of neurons (represented by matrix multiplications and sigmoid activations), and produces an output. It's like the network is forming a thought, layer by layer.

The Backward Pass: Learning from Mistakes

def backprop(self):
    d_weights2 = np.dot(self.layer1.T, 2 * (self.y - self.output) * sigmoid_derivative(self.output))
    d_weights1 = np.dot(self.input.T, np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1))

    self.weights1 += d_weights1
    self.weights2 += d_weights2
Enter fullscreen mode Exit fullscreen mode

This is where the magic happens. The network compares its output to the expected result, calculates the error, and propagates it backward. It's like the network is reflecting on its mistakes, adjusting its understanding bit by bit. The mathematical operations here embody the chain rule of calculus, allowing the network to assign credit (or blame) to each of its neurons.

Putting It All Together

X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
y = np.array([[0], [1], [1], [0]])
nn = NeuralNetwork(X, y)

for _ in range(1500):
    nn.feedforward()
    nn.backprop()

print(nn.output)
Enter fullscreen mode Exit fullscreen mode

Here, we're training our network on a simple dataset. Each iteration is a cycle of thought and reflection, gradually sculpting the network's understanding. After 1500 cycles, we print the output – a testament to the network's learned wisdom.

The Future, Unfolding

Backpropagation isn't just a technical curiosity—it's the engine driving breakthroughs in:

  • Computer vision that rivals human perception
  • Language models that craft prose and poetry
  • AI assistants that understand and respond with near-human intuition

This article was written with information from Deep Learning and is based on this paper:
Learning representations by back-propagating errors


Share this article if you found it helpful!
If you're interested in learning more about AI and machine learning, check out my Newsletter for weekly insights and tips! 🤖📈

. . . . . . . . . . . . . . . . . . . .
Terabox Video Player