Overfitting vs Underfitting #

Overfitting and underfitting are the two most common hurdles every machine learning developer faces. Think of your model as a student: it can memorize the answers (overfitting) or not study enough (underfitting). Neither is good. You want the sweet spot in between.

The Simple Explanation #

| Problem | What Happens | Analogy |
| --- | --- | --- |
| Underfitting | Model is too weak. Cannot learn the pattern. | Student who studies 5 minutes for a big exam. Fails. |
| Overfitting | Model is too complex. Learns noise, not signal. | Student who memorizes answers but does not understand. Fails on new questions. |
| Good Fit | Model learns the real pattern. Ignores noise. | Student who understands concepts. Scores well anywhere. |
How to Diagnose (Look at the Gap) #

| Training Score | Validation Score | Gap | Diagnosis |
| --- | --- | --- | --- |
| Low | Low | Small | ❌ Underfitting |
| Very High | Low | Large | ❌ Overfitting |
| High | High | Small | ✅ Good Fit |

Simple Rule:

  • Both bad = Underfitting
  • Training great, validation bad = Overfitting
  • Both good = Perfect
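
The simple rule above can be written as a tiny helper. Note the thresholds (0.70 for "low", 0.10 for a "large" gap) are illustrative assumptions, not standards; pick values that make sense for your problem.

```python
def diagnose(train_score, val_score, gap=0.10, low=0.70):
    """Classify model fit from training and validation scores.

    The `gap` and `low` thresholds are illustrative, not universal.
    """
    if train_score < low and val_score < low:
        return "underfitting"   # both bad
    if train_score - val_score > gap:
        return "overfitting"    # training great, validation bad
    return "good fit"           # both good, small gap

print(diagnose(0.99, 0.55))  # -> overfitting
print(diagnose(0.60, 0.58))  # -> underfitting
print(diagnose(0.95, 0.93))  # -> good fit
```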
Overfitting vs Underfitting 2

The Cures #

For Underfitting (Model is too weak):

| Cure | How |
| --- | --- |
| Use a stronger model | Linear → Random Forest → Neural Network |
| Add more features | Polynomial features, more columns |
| Reduce regularization | Lower the alpha or C value |
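
Two of these cures, sketched with scikit-learn on made-up quadratic data: a heavily regularized straight line underfits, while adding polynomial features and lowering `alpha` recovers the pattern. The dataset and hyperparameters here are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic quadratic data: a plain line cannot capture y = x^2
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

# Underfits: linear model with heavy regularization
weak = Ridge(alpha=10.0).fit(X, y)

# Cures: add polynomial features + lower alpha
stronger = make_pipeline(
    PolynomialFeatures(degree=2),
    Ridge(alpha=0.1),
).fit(X, y)

print("weak R^2:    ", round(weak.score(X, y), 2))      # near 0
print("stronger R^2:", round(stronger.score(X, y), 2))  # near 1
```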

For Overfitting (Model is too complex):

| Cure | How |
| --- | --- |
| Get more data | Collect more samples |
| Simplify the model | Reduce layers, depth, neurons |
| Add regularization | L1 (Lasso), L2 (Ridge), Dropout |
| Early stopping | Stop training when validation stops improving |
| Reduce features | Remove useless columns |
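
Here is a sketch of the "simplify the model" cure: an unconstrained decision tree memorizes the training set (perfect training score, large gap), while capping `max_depth` trades a little training accuracy for a smaller gap. The synthetic data and the depth of 3 are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=20,
                           n_informative=5, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Unconstrained tree: grows until it memorizes every training point
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Cure: limit depth so the tree must learn the broad pattern
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

for name, model in [("deep", deep), ("shallow", shallow)]:
    print(name,
          "train:", round(model.score(X_tr, y_tr), 2),
          "val:", round(model.score(X_val, y_val), 2))
```

Run this and compare the train/validation gaps yourself: the deep tree's training score is 1.0 by construction, which is the telltale memorization sign from the diagnosis table.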
One Simple Example #

Problem: Predict house prices.

| Model | Training Error | Validation Error | Issue |
| --- | --- | --- | --- |
| Linear Regression | $80,000 | $82,000 | Underfitting (too simple) |
| Deep Neural Network | $5,000 | $75,000 | Overfitting (memorized) |
| Random Forest (tuned) | $35,000 | $38,000 | ✅ Good fit |
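
The same pattern can be reproduced on synthetic data. The "house" data below is made up purely for illustration (real prices behave very differently), but it shows a line underfitting a nonlinear price curve while a random forest fits it well.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Fake house data: price grows nonlinearly with size, so a line underfits
rng = np.random.default_rng(42)
size = rng.uniform(50, 300, size=(600, 1))                    # square metres
price = 1000 * size[:, 0] + 5 * size[:, 0] ** 2 \
        + rng.normal(0, 5000, 600)                            # noise

X_tr, X_val, y_tr, y_val = train_test_split(size, price, random_state=42)

results = {}
for name, model in [("linear", LinearRegression()),
                    ("forest", RandomForestRegressor(max_depth=6,
                                                     random_state=42))]:
    model.fit(X_tr, y_tr)
    results[name] = mean_absolute_error(y_val, model.predict(X_val))
    print(name, "validation MAE:", round(results[name]))
```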

The Goldilocks Rule #

  • Too simple model = Underfit ❌
  • Too complex model = Overfit ❌
  • Just right model = Good Fit ✅

You want the “just right” model.

Quick Quiz #

Q1: Training accuracy = 99%, Validation accuracy = 55%. Problem?

A1: Overfitting.

Q2: Training accuracy = 60%, Validation accuracy = 58%. Problem?

A2: Underfitting.

Q3: How to fix overfitting?

A3: More data, simplify model, add regularization, early stopping.

Key Takeaways (3 Lines) #

  1. Underfitting = Model too weak. Cure: Make it stronger.
  2. Overfitting = Model too complex. Cure: Simplify or add more data.
  3. Compare training and validation scores. Big gap = overfitting. Both low = underfitting.
