Neural Network — Step-by-Step Visualization

medium · AI · DL · Neural Network · Activation Function

Step through the forward pass of a 3-layer neural network — watch input values flow through weighted connections and activation functions layer by layer.

Algorithm Pattern

Layered Linear Transformations + Non-linearities

Key Idea

A neural network alternates linear transformations (z = Wx + b) with non-linear activation functions (a = ReLU(z)) to learn complex mappings.
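
As a rough illustration of one such layer, the sketch below computes z = Wx + b and then a = ReLU(z) in plain Python. The helper names (`linear`, `relu`) and the numeric values are assumptions chosen for this example, not part of the visualization.

```python
def linear(W, x, b):
    """Return z = Wx + b, where W is a list of rows, x the input, b the bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(z):
    """Apply ReLU(z) = max(0, z) element-wise."""
    return [max(0.0, zi) for zi in z]

# Hypothetical values, chosen only so the arithmetic is easy to follow.
W = [[1.0, -2.0],
     [0.5, 0.5]]
x = [3.0, 2.0]
b = [0.0, -1.0]

z = linear(W, x, b)  # pre-activation: [-1.0, 1.5]
a = relu(z)          # activation:     [0.0, 1.5]
print(z, a)
```

The first unit has a negative pre-activation, so ReLU clips it to 0; the second passes through unchanged.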

Step-by-Step Approach

  1. Multiply the input by the weight matrix W1 and add bias b1 to get pre-activation z1.
  2. Apply ReLU (max(0, z)) element-wise to get activations a1.
  3. Multiply a1 by W2, add b2 to get z2 (output pre-activations).
  4. Apply softmax to z2 to get class probabilities a2.
  5. The index of the highest probability in a2 is the predicted class.
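
The five steps above can be sketched end to end as follows. This is a minimal illustration, assuming a network with 2 inputs, 3 hidden units, and 2 output classes; the weights, biases, and function names are hypothetical and are not taken from the visualization. The softmax subtracts the maximum score before exponentiating, a common trick to avoid overflow that does not change the result.

```python
import math

def linear(W, x, b):
    """Return z = Wx + b for a weight matrix W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(z):
    """Element-wise max(0, z)."""
    return [max(0.0, zi) for zi in z]

def softmax(z):
    """Turn scores into probabilities that sum to 1."""
    m = max(z)                              # shift for numerical stability
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, W1, b1, W2, b2):
    z1 = linear(W1, x, b1)                  # step 1: hidden pre-activations
    a1 = relu(z1)                           # step 2: hidden activations
    z2 = linear(W2, a1, b2)                 # step 3: output pre-activations
    a2 = softmax(z2)                        # step 4: class probabilities
    predicted = max(range(len(a2)), key=lambda i: a2[i])  # step 5: argmax
    return z1, a1, z2, a2, predicted

# Hypothetical parameters (assumed for illustration only).
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
b1 = [0.0, 0.0, -1.0]
W2 = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
b2 = [0.0, 0.0]
x = [2.0, 1.0]

z1, a1, z2, a2, predicted = forward(x, W1, b1, W2, b2)
print("z1 =", z1)
print("a1 =", a1)
print("z2 =", z2)
print("a2 =", a2)
print("predicted class =", predicted)
```

With these values the third hidden unit has z = -1.0 and is clipped to 0 by ReLU, and the second output score is the larger one, so class 1 is predicted.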

Common Gotchas

  • ReLU outputs exactly 0 whenever z ≤ 0, so those neurons pass nothing forward for that input. This clipping is the intended non-linearity, not a bug.
  • Softmax normalizes all outputs to sum to 1, making them interpretable as probabilities.
  • Without activation functions, stacking layers is equivalent to a single linear layer, because composing linear transformations yields another linear transformation.
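
The last point can be checked numerically. The sketch below, using hypothetical weights and helper names assumed for this example, stacks two layers with no activation and compares the result with a single layer built from W = W2·W1 and b = W2·b1 + b2. It then inserts ReLU between the layers to show that the output changes.

```python
def matvec(W, x):
    """Matrix-vector product, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Matrix-matrix product of A (m x k) and B (k x n)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def add(u, v):
    """Element-wise vector sum."""
    return [ui + vi for ui, vi in zip(u, v)]

# Hypothetical parameters (assumed for illustration only).
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
b1 = [0.0, 0.0, -1.0]
W2 = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
b2 = [0.0, 0.0]
x = [2.0, 1.0]

# Two stacked layers with no activation in between.
stacked = add(matvec(W2, add(matvec(W1, x), b1)), b2)

# One equivalent layer: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)
single = add(matvec(W, x), b)

# The same two layers with ReLU applied to the hidden values.
hidden = [max(0.0, z) for z in add(matvec(W1, x), b1)]
with_relu = add(matvec(W2, hidden), b2)

print("stacked  :", stacked)
print("single   :", single)
print("with ReLU:", with_relu)
```

The first two results agree, while the ReLU version differs because the negative hidden value is clipped before it reaches the second layer.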

Related Problems