Training a neural network is a cyclical process of learning from data.
Let's break it down step by step.
1. Input Data
The model takes in examples:
Input: "What is the capital of France?"
Label: "Paris"
The data is tokenized and turned into numbers.
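A toy version of this step might look like the following sketch. The vocabulary and the `tokenize` helper are made up purely for illustration; real tokenizers (e.g. BPE-based ones) are far more sophisticated.

```python
# Hypothetical toy vocabulary mapping words to integer IDs (not a real tokenizer).
vocab = {"what": 0, "is": 1, "the": 2, "capital": 3, "of": 4, "france": 5, "?": 6}

def tokenize(text):
    """Lowercase, split off the question mark, and map each token to its ID."""
    words = text.lower().replace("?", " ?").split()
    return [vocab[w] for w in words]

print(tokenize("What is the capital of France?"))  # prints [0, 1, 2, 3, 4, 5, 6]
```

The model never sees raw text, only these integer IDs (which are then mapped to embedding vectors).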
2. Forward Pass
The inputs go forward through the network, layer by layer, producing an output (prediction).
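For a tiny network with one hidden unit, the forward pass can be sketched like this (all weights are scalars chosen arbitrarily for illustration):

```python
import math

def forward(x, w1, b1, w2, b2):
    """Forward pass through a one-hidden-unit network:
    hidden layer with a sigmoid activation, then a linear output layer."""
    h = 1 / (1 + math.exp(-(w1 * x + b1)))  # hidden layer: sigmoid(w1*x + b1)
    return w2 * h + b2                      # output layer: linear

# Push one input forward through the (made-up) network to get a prediction.
y_hat = forward(x=1.0, w1=0.5, b1=0.0, w2=2.0, b2=0.1)
```

Each layer transforms its input and passes the result to the next, exactly as described above, just with matrices instead of scalars in a real network.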
3. Compute Loss
Compare the output to the true label using a loss function.
Example:
L = (ŷ - y)^2
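In code, the squared-error loss above is a one-liner (the example numbers are arbitrary):

```python
def squared_error(y_hat, y):
    """Squared-error loss L = (y_hat - y)^2 for a single prediction."""
    return (y_hat - y) ** 2

loss = squared_error(3.0, 1.0)  # → 4.0
```

The further the prediction ŷ is from the label y, the larger the loss, so minimizing it pushes predictions toward the labels.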
4. Backward Pass
Use backpropagation to calculate how much each weight contributed to the error.
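For a linear model with the squared-error loss, these contributions (gradients) can be written out by hand with the chain rule. This is a simplified sketch; in a deep network, backpropagation applies the same chain rule layer by layer automatically.

```python
def gradients(x, y, w, b):
    """Gradients of L = (w*x + b - y)^2 with respect to w and b,
    derived via the chain rule: dL/dw = 2*(y_hat - y)*x, dL/db = 2*(y_hat - y)."""
    y_hat = w * x + b
    error = y_hat - y
    return 2 * error * x, 2 * error

dw, db = gradients(x=2.0, y=1.0, w=1.0, b=0.0)  # → (4.0, 2.0)
```

Each gradient says how much the loss would change if that weight were nudged, which is exactly what the update step needs.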
5. Update Weights
Apply gradient descent to slightly adjust each weight in the direction that reduces loss.
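The update rule itself is simple: subtract the gradient, scaled by a small learning rate. A minimal sketch (the parameter values and learning rate are illustrative):

```python
def sgd_step(params, grads, lr=0.1):
    """Vanilla gradient descent: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

new_params = sgd_step([1.0, 0.0], [4.0, 2.0], lr=0.1)  # → approximately [0.6, -0.2]
```

The learning rate keeps each step small, so the weights move gradually rather than jumping past the minimum.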
6. Repeat
This cycle is repeated for:
- Many batches
- Over many epochs
The model improves over time as loss decreases.
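All six steps can be combined into one training loop. This toy example fits a linear model y = w*x + b; the data, learning rate, and epoch count are all made up for illustration.

```python
def train(data, w=0.0, b=0.0, lr=0.01, epochs=500):
    """Minimal training loop for a linear model with squared-error loss."""
    for _ in range(epochs):          # repeat over many epochs
        for x, y in data:            # one example ("batch" of size 1) at a time
            y_hat = w * x + b        # forward pass
            error = y_hat - y        # loss is error**2; its gradient uses error
            w -= lr * 2 * error * x  # backward pass + weight update, fused
            b -= lr * 2 * error
    return w, b

# Fit y = 2x + 1 from three points; loss shrinks as w, b approach 2 and 1.
w, b = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

Running this, the learned w and b end up very close to the true values 2 and 1, which is the loop-until-loss-decreases behavior described above in miniature.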
Summary Table

| Step           | Description                        |
|----------------|------------------------------------|
| Input          | Tokenized data                     |
| Forward Pass   | Compute prediction                 |
| Loss           | Compare prediction to label        |
| Backward Pass  | Compute gradients                  |
| Update Weights | Use gradients to adjust parameters |
Self-Check
- What are the major phases of training?
- What does the model learn from?
- Why is training repeated in cycles?