Overfitting and Generalization

Level 201 · Intermediate
10 mins

Overfitting occurs when a model performs well on training data but poorly on unseen data.

It means the model has memorized the training examples rather than generalized from them.


🔍 What is Generalization?

A good model:

  • Learns patterns, not noise
  • Performs well on new, real-world data

This ability is called generalization.


📉 Example

Suppose we train a model on 100 examples.

  • It gets 98% accuracy on training data
  • But only 70% accuracy on test data

This gap suggests overfitting.
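The same gap can be reproduced in a few lines. The sketch below (an illustrative NumPy example, not from the course) fits an overly flexible degree-15 polynomial to a small noisy dataset; training error is near zero while test error on fresh samples is noticeably higher.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small noisy dataset: y = sin(x) + noise
x_train = rng.uniform(0, 3, 20)
y_train = np.sin(x_train) + rng.normal(0, 0.3, 20)
x_test = rng.uniform(0, 3, 200)
y_test = np.sin(x_test) + rng.normal(0, 0.3, 200)

# A degree-15 polynomial has enough capacity to chase the noise
coeffs = np.polyfit(x_train, y_train, deg=15)

def mse(x, y):
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

train_mse = mse(x_train, y_train)
test_mse = mse(x_test, y_test)
print(f"train MSE: {train_mse:.3f}, test MSE: {test_mse:.3f}")
```

The large train/test gap is the numerical signature of overfitting: the polynomial threads through the noisy training points instead of tracking the underlying sine curve.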


📊 Visual Intuition

  • Underfitting: Too simple, poor on both train/test
  • Good fit: Balanced performance
  • Overfitting: Too complex, great on train, bad on test

🚨 Signs of Overfitting

  • High training accuracy, low test accuracy
  • Large gap between training and validation loss
  • Model performance degrades on real inputs
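The second sign, a growing train/validation loss gap, can be checked automatically. Here is a minimal sketch; the function name and the gap threshold are illustrative assumptions, not part of any standard library.

```python
def overfitting_gap(train_losses, val_losses, threshold=0.5):
    """Flag epochs where validation loss exceeds training loss by more
    than `threshold` (an assumed cutoff) -- a common overfitting sign."""
    return [epoch
            for epoch, (tr, va) in enumerate(zip(train_losses, val_losses))
            if va - tr > threshold]

# Training loss keeps falling while validation loss turns back up
train = [1.0, 0.6, 0.3, 0.15, 0.05]
val   = [1.1, 0.8, 0.7, 0.80, 0.95]
print(overfitting_gap(train, val))  # → [3, 4]
```

In practice the threshold depends on the loss scale and task, so it is usually tuned per project rather than fixed.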

🛡️ How to Prevent Overfitting

  • More Data — Helps the model see more variation
  • Regularization — Penalize large weights (e.g. L2)
  • Dropout — Randomly disable neurons during training
  • Early Stopping — Stop training when validation loss worsens
  • Simpler Models — Avoid overly complex models

Summary

  • Overfitting = memorizing, not learning
  • Generalization = ability to perform well on new data
  • Prevent with regularization, more data, and validation

Self-Check

  • How do you know a model is overfitting?
  • What is the difference between underfitting and overfitting?
  • How does dropout help reduce overfitting?
