How Models Learn via Gradient Descent
Gradient descent is the core optimization algorithm used to train most machine learning models.
The model learns as the algorithm adjusts its parameters to minimize the loss function, step by step.
🧗 Intuition
Imagine a hiker trying to descend a mountain (loss function) in the fog:
- The height = error (loss)
- The direction = gradient (slope)
- Each step = model update
The hiker wants to reach the bottom — minimum error.
🧮 How It Works
At each step:
- Calculate the loss.
- Compute the gradient (slope of the loss curve).
- Take a step in the opposite direction of the gradient.
- Update the model's parameters (see the sketch below).
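A minimal sketch of this loop in Python, assuming a toy one-parameter loss L(θ) = (θ − 3)² whose gradient we can write by hand (the loss, starting point, and learning rate are all illustrative choices, not from the text):

```python
# Toy loss: L(theta) = (theta - 3)^2, minimized at theta = 3.
def loss(theta):
    return (theta - 3) ** 2

def gradient(theta):
    # dL/dtheta = 2 * (theta - 3)
    return 2 * (theta - 3)

theta = 0.0  # initial parameter (arbitrary starting point)
eta = 0.1    # learning rate

for step in range(25):
    g = gradient(theta)      # compute the gradient (slope)
    theta = theta - eta * g  # step in the opposite direction

print(round(theta, 4))       # ~2.99, very close to the minimum at 3
```

Each pass through the loop is one "step down the mountain": the gradient tells us which way is uphill, and we move the other way.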
🔧 Formula (Simplified)
Let θ be the model’s parameter.
Update rule:
θ = θ - η * ∇L(θ)
Where:
- η is the learning rate
- ∇L(θ) is the gradient of the loss function with respect to θ
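To make the rule concrete, here is one hand-checked update for the same toy loss used above; the values θ = 5 and η = 0.1 are illustrative, not from the text:

```python
theta = 5.0
eta = 0.1
grad = 2 * (theta - 3)      # ∇L(θ) = 2(θ - 3) = 4 for L(θ) = (θ - 3)²
theta = theta - eta * grad  # 5.0 - 0.1 * 4 = 4.6
print(theta)                # 4.6, one step closer to the minimum at 3
```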
📉 Example
- Prediction too high? Decrease the weight.
- Prediction too low? Increase the weight.
Over time, the model “nudges” itself to better performance.
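A hedged sketch of that nudge for a one-weight model y = w·x with squared-error loss. Note the input x is assumed positive here, so the sign of the gradient matches the sign of the prediction error:

```python
def grad_w(w, x, target):
    pred = w * x
    # Squared-error loss L = (pred - target)^2
    # dL/dw = 2 * (pred - target) * x
    return 2 * (pred - target) * x

x, target = 1.0, 2.0
print(grad_w(3.0, x, target))  # prediction too high -> positive gradient -> update decreases w
print(grad_w(1.0, x, target))  # prediction too low  -> negative gradient -> update increases w
```

Because the update subtracts the gradient, a too-high prediction pushes the weight down and a too-low prediction pushes it up, exactly the nudging described above.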
⚠️ Learning Rate Matters
- Too small → Slow learning
- Too big → Might overshoot the minimum, or even diverge
- Choose carefully!
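A small sketch comparing step sizes on the same toy loss L(θ) = (θ − 3)²; the specific rates 0.01, 0.1, and 1.1 are just illustrative:

```python
def run(eta, steps=20, theta=10.0):
    for _ in range(steps):
        theta = theta - eta * 2 * (theta - 3)  # gradient of (theta - 3)^2
    return theta

print(run(0.01))  # too small: still far from 3 after 20 steps (slow learning)
print(run(0.1))   # reasonable: lands close to 3
print(run(1.1))   # too big: every step overshoots and the value blows up
```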
Summary
| Concept       | Meaning                              |
|---------------|--------------------------------------|
| Gradient      | Slope of the loss curve              |
| Descent       | Move in the direction of lower error |
| Learning Rate | Size of the step                     |
| Goal          | Minimize the loss                    |
Self-Check
- What does gradient descent try to minimize?
- Why is the learning rate important?
- How do gradients help the model learn?