Training a model means adjusting its weights gradually, not all at once.
This happens over multiple epochs, on mini-batches of data, with a chosen learning rate.
⏱️ Epoch
An epoch is one full pass through the entire training dataset.
- We train for multiple epochs so the model sees the data repeatedly and keeps improving.
- A single epoch is usually not enough for the model to learn the data well.
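For example (the numbers are only illustrative): if the training set contains 50,000 images, the model sees each image once per epoch, so training for 10 epochs means each image is seen 10 times.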
📦 Batch
Instead of feeding all the data at once, we split it into mini-batches (see the sketch at the end of this section).
Why?
- Fits in memory
- Adds randomness → helps generalization
Batch Size affects:
- Speed
- Stability
- Accuracy
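As a minimal sketch of mini-batching (assuming NumPy and a toy in-memory dataset; names like `iterate_minibatches` are chosen only for illustration):

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (inputs, targets) mini-batches covering one epoch."""
    indices = rng.permutation(len(X))              # shuffling adds randomness
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]  # at most batch_size rows
        yield X[batch], y[batch]

# Toy data: 1,000 samples, 10 features each
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = rng.integers(0, 2, size=1000)

for xb, yb in iterate_minibatches(X, y, batch_size=32, rng=rng):
    pass  # each xb is small enough to fit in memory; ~32 mini-batches per epoch
```

One full pass over this generator is exactly one epoch: every sample appears in exactly one mini-batch.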
🎚️ Learning Rate
The learning rate (α) determines how big each step is during gradient descent (see the sketch below).
- Too high → overshoots minima; the loss can oscillate or diverge
- Too low → training is slow or gets stuck
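To see why, here is a minimal sketch of the update rule w ← w − α · dL/dw on the toy loss L(w) = w² (the α values and step count are arbitrary examples):

```python
def gradient_descent(lr, steps=20, w=5.0):
    """Minimize L(w) = w**2 with plain gradient descent."""
    for _ in range(steps):
        grad = 2 * w       # dL/dw for L(w) = w**2
        w -= lr * grad     # update: w ← w − α · ∇L
    return w

print(gradient_descent(lr=0.1))    # ≈ 0.06: steadily approaches the minimum at w = 0
print(gradient_descent(lr=1.1))    # ≈ 191: too high, every step overshoots and diverges
print(gradient_descent(lr=1e-4))   # ≈ 4.98: too low, barely moves in 20 steps
```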
🔁 Training Loop
The full training loop typically looks like this (a code sketch follows the list):
- Shuffle dataset
- Loop through mini-batches
- Forward pass
- Compute loss
- Backward pass
- Update weights
- Repeat for multiple epochs
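Putting the steps together, here is a hedged sketch in PyTorch (the toy data, the linear model, and the hyperparameter values are placeholder choices, not recommendations):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: 1,000 samples, 10 features, 2 classes (shapes are illustrative)
X = torch.randn(1000, 10)
y = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)  # shuffle + mini-batches

model = nn.Linear(10, 2)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # learning rate α

for epoch in range(5):                 # repeat for multiple epochs
    for xb, yb in loader:              # loop through mini-batches
        logits = model(xb)             # forward pass
        loss = loss_fn(logits, yb)     # compute loss
        optimizer.zero_grad()
        loss.backward()                # backward pass (gradients)
        optimizer.step()               # update weights
```

Each outer iteration is one epoch; `shuffle=True` reshuffles the data at the start of every epoch.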
⚖️ Trade-Offs
| Concept       | High Value                          | Low Value                                     |
|---------------|-------------------------------------|-----------------------------------------------|
| Batch Size    | Faster epochs, smoother updates     | Noisier updates, often better generalization  |
| Epochs        | More learning, risk of overfitting  | Risk of underfitting                          |
| Learning Rate | Fast convergence, but may overshoot | Slow convergence, may get stuck               |
✅ Self-Check
- What is an epoch?
- Why use batches instead of the full dataset?
- What happens if the learning rate is too high?