Epochs, Batches and Learning Rate
Training a model means adjusting its weights gradually, not all at once.
This happens over multiple epochs, on mini-batches of data, with a chosen learning rate.
⏱️ Epoch
An epoch is one full pass through the entire training dataset.
- We train for multiple epochs so the model keeps improving.
- One epoch doesn't mean the model has learned enough.
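A quick worked example (the dataset size and batch size below are made up for illustration): the number of weight-update steps in one epoch is simply the number of samples divided by the batch size.

```python
import math

num_samples = 50_000   # illustrative training-set size
batch_size = 64        # illustrative mini-batch size

# One epoch = one full pass over all samples,
# which corresponds to this many weight-update steps:
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 782
```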
📦 Batch
Instead of feeding all data at once, we split it into mini-batches.
Why?
- Each batch fits in memory
- Adds randomness → helps generalization
Batch Size affects:
- Speed
- Stability
- Accuracy
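Below is a minimal sketch of mini-batching done by hand with NumPy. The dataset is random toy data, and the shapes, batch size, and variable names are illustrative assumptions, not a prescribed recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 10))      # toy features: 1,000 samples, 10 inputs
y = rng.integers(0, 2, size=1_000)    # toy binary labels

batch_size = 32
indices = rng.permutation(len(X))     # shuffle once per epoch -> adds randomness

for start in range(0, len(X), batch_size):
    batch_idx = indices[start:start + batch_size]
    X_batch, y_batch = X[batch_idx], y[batch_idx]
    # ... forward pass, loss, backward pass, and weight update on this mini-batch
```

In practice a framework's data loader does the shuffling and slicing for you, but the idea is the same.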
🎛️ Learning Rate
The learning rate (Ξ±) determines how big the steps are during gradient descent.
- Too high → steps overshoot the minimum and training can diverge
- Too low → training is very slow or gets stuck
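To see the effect of α concretely, here is a tiny gradient-descent sketch on the one-dimensional function f(w) = (w − 3)². The function and the learning-rate values are chosen purely for illustration.

```python
# Minimize f(w) = (w - 3)^2 with plain gradient descent.
# Its gradient is f'(w) = 2 * (w - 3).
def grad(w):
    return 2 * (w - 3)

w = 0.0
alpha = 0.1                   # learning rate α
for step in range(50):
    w = w - alpha * grad(w)   # update rule: w <- w - α * f'(w)

print(w)  # close to the minimum at 3.0
```

With alpha = 0.1 the distance to the minimum shrinks by a factor of 0.8 each step. Set alpha to 1.5 and it doubles in magnitude instead: the updates overshoot the minimum and diverge.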
🔁 Training Loop
The full training loop typically looks like:
- Shuffle the dataset
- Loop through mini-batches:
  - Forward pass
  - Compute loss
  - Backward pass
  - Update weights
- Repeat for multiple epochs
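Putting these steps together, here is a compact training loop in PyTorch. The text does not name a framework, so this is one reasonable choice; the model, toy data, batch size, learning rate, and epoch count are all placeholder assumptions.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data: 1,000 samples, 10 features, binary labels (all illustrative)
X = torch.randn(1_000, 10)
y = torch.randint(0, 2, (1_000,)).float()

loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)  # reshuffles every epoch

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # lr is the learning rate α

for epoch in range(5):                      # repeat for multiple epochs
    for X_batch, y_batch in loader:         # loop through mini-batches
        logits = model(X_batch).squeeze(1)  # forward pass
        loss = loss_fn(logits, y_batch)     # compute loss
        optimizer.zero_grad()
        loss.backward()                     # backward pass
        optimizer.step()                    # update weights
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
```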
⚖️ Trade-Offs
Concept | High Value | Low Value |
---|---|---|
Batch Size | Faster per epoch, smoother gradients, may generalize worse | Slower, noisier updates, often generalizes better |
Epochs | Better fit, but risk of overfitting | Underfitting risk |
Learning Rate | Fast convergence, but risk of overshooting | Slower learning, may get stuck |
Self-Check
- What is an epoch?
- Why use batches instead of the full dataset?
- What happens if the learning rate is too high?