Training a neural network is a cyclical process of learning from data.
Let's break it down step by step.
1. Input Data
The model takes in examples:
Input: "What is the capital of France?"
Label: "Paris"
The data is tokenized and turned into numbers.
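A toy version of this step might look like the following sketch. The vocabulary and the `tokenize` helper are made up purely for illustration; real tokenizers (e.g. BPE-based ones) are far more sophisticated.

```python
# Hypothetical toy vocabulary mapping words to integer IDs (not a real tokenizer).
vocab = {"what": 0, "is": 1, "the": 2, "capital": 3, "of": 4, "france": 5, "?": 6}

def tokenize(text):
    """Lowercase, split off the question mark, and map each token to its ID."""
    words = text.lower().replace("?", " ?").split()
    return [vocab[w] for w in words]

print(tokenize("What is the capital of France?"))  # prints [0, 1, 2, 3, 4, 5, 6]
```

The model never sees raw text, only these integer IDs (which are then mapped to embedding vectors).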
2. Forward Pass
The inputs go forward through the network, layer by layer, producing an output (prediction).
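For a tiny network with one hidden unit, the forward pass can be sketched like this (all weights are scalars chosen arbitrarily for illustration):

```python
import math

def forward(x, w1, b1, w2, b2):
    """Forward pass through a one-hidden-unit network:
    hidden layer with a sigmoid activation, then a linear output layer."""
    h = 1 / (1 + math.exp(-(w1 * x + b1)))  # hidden layer: sigmoid(w1*x + b1)
    return w2 * h + b2                      # output layer: linear

# Push one input forward through the (made-up) network to get a prediction.
y_hat = forward(x=1.0, w1=0.5, b1=0.0, w2=2.0, b2=0.1)
```

Each layer transforms its input and passes the result to the next, exactly as described above, just with matrices instead of scalars in a real network.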
3. Compute Loss
Compare the output to the true label using a loss function.
Example:
L = (ŷ - y)^2
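In code, the squared-error loss above is a one-liner (the example numbers are arbitrary):

```python
def squared_error(y_hat, y):
    """Squared-error loss L = (y_hat - y)^2 for a single prediction."""
    return (y_hat - y) ** 2

loss = squared_error(3.0, 1.0)  # → 4.0
```

The further the prediction ŷ is from the label y, the larger the loss, so minimizing it pushes predictions toward the labels.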
4. Backward Pass
Use backpropagation to calculate how much each weight contributed to the error.
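For a linear model with the squared-error loss, these contributions (gradients) can be written out by hand with the chain rule. This is a simplified sketch; in a deep network, backpropagation applies the same chain rule layer by layer automatically.

```python
def gradients(x, y, w, b):
    """Gradients of L = (w*x + b - y)^2 with respect to w and b,
    derived via the chain rule: dL/dw = 2*(y_hat - y)*x, dL/db = 2*(y_hat - y)."""
    y_hat = w * x + b
    error = y_hat - y
    return 2 * error * x, 2 * error

dw, db = gradients(x=2.0, y=1.0, w=1.0, b=0.0)  # → (4.0, 2.0)
```

Each gradient says how much the loss would change if that weight were nudged, which is exactly what the update step needs.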
5. Update Weights
Apply gradient descent to slightly adjust each weight in the direction that reduces loss.
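The update rule itself is simple: subtract the gradient, scaled by a small learning rate. A minimal sketch (the parameter values and learning rate are illustrative):

```python
def sgd_step(params, grads, lr=0.1):
    """Vanilla gradient descent: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

new_params = sgd_step([1.0, 0.0], [4.0, 2.0], lr=0.1)  # → approximately [0.6, -0.2]
```

The learning rate keeps each step small, so the weights move gradually rather than jumping past the minimum.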
6. Repeat
This cycle is repeated for:
- Many batches
- Over many epochs
The model improves over time as loss decreases.
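All six steps can be combined into one training loop. This toy example fits a linear model y = w*x + b; the data, learning rate, and epoch count are all made up for illustration.

```python
def train(data, w=0.0, b=0.0, lr=0.01, epochs=500):
    """Minimal training loop for a linear model with squared-error loss."""
    for _ in range(epochs):          # repeat over many epochs
        for x, y in data:            # one example ("batch" of size 1) at a time
            y_hat = w * x + b        # forward pass
            error = y_hat - y        # loss is error**2; its gradient uses error
            w -= lr * 2 * error * x  # backward pass + weight update, fused
            b -= lr * 2 * error
    return w, b

# Fit y = 2x + 1 from three points; loss shrinks as w, b approach 2 and 1.
w, b = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

Running this, the learned w and b end up very close to the true values 2 and 1, which is the loop-until-loss-decreases behavior described above in miniature.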
Summary Table

| Step           | Description                        |
|----------------|------------------------------------|
| Input          | Tokenized data                     |
| Forward Pass   | Compute prediction                 |
| Loss           | Compare prediction to label        |
| Backward Pass  | Compute gradients                  |
| Update Weights | Use gradients to adjust parameters |
Self-Check
- What are the major phases of training?
- What does the model learn from?
- Why is training repeated in cycles?