Polynomial Regression — When Lines Aren't Enough
A scroll-driven visual deep dive into polynomial regression. See why straight lines fail, how curves capture nonlinear patterns, and when you're overfitting vs underfitting.
Beyond Straight Lines
What happens when reality isn’t linear?
House prices don’t grow linearly with size. Drug dosage effects aren’t straight lines. Temperature and ice cream sales? Definitely curved. Time to bend your regression line.
When Linear Regression Fails
Linear regression fits y = mx + b — a straight line. But many real-world relationships are:
- Quadratic — goes up then down (or vice versa)
- Exponential-ish — accelerates over time
- Saturating — grows fast then plateaus
- Cyclic — oscillates periodically
A straight line through curved data gives terrible predictions. The fix? Add powers of x as features.
Why does linear regression fail on curved data?
💡 What shape can y = mx + b produce? Only one...
Linear regression assumes y = mx + b — a constant slope. If the true relationship curves (goes up then down, accelerates, or saturates), a straight line will systematically miss the pattern, producing high bias. Polynomial regression fixes this by adding x², x³, etc. as features.
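A quick way to see this failure mode is to fit a straight line and a parabola to the same curved data and compare training error. The sketch below uses a made-up noisy parabola (an assumption for illustration, not a dataset from this article):

```python
import numpy as np

# Toy curved data: a noisy downward parabola (hypothetical example)
rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 50)
y = -x**2 + 2 * x + 1 + rng.normal(0, 0.5, x.size)

for deg in (1, 2):
    w = np.polyfit(x, y, deg)                        # least-squares fit of that degree
    mse = np.mean((np.polyval(w, x) - y) ** 2)
    print(f"degree {deg}: training MSE = {mse:.2f}")
# The degree-1 line misses the curvature entirely (high bias);
# the degree-2 fit tracks it closely.
```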
From Lines to Curves
Building the polynomial model
Degree 1: y = w₁x + w₀
A straight line with one slope and one intercept. Can only model constant-rate relationships — no curves, peaks, or valleys.
Degree 2: y = w₂x² + w₁x + w₀
A parabola that can model one peak or valley. Adding x² lets the model capture acceleration or deceleration in the data.
Degree 3: y = w₃x³ + w₂x² + w₁x + w₀
An S-curve with up to two turning points. Captures more complex patterns like initial growth, plateau, then decline.
Degree d: y = w_d·xᵈ + ... + w₁x + w₀
The general polynomial with d+1 coefficients. Higher degree means more flexibility, but also more risk of overfitting to noise rather than learning the true signal.
Fitting still uses least squares
Feature matrix X = [1, x, x², ..., xᵈ]; loss = Σ(yᵢ - ŷᵢ)²; solution w* = (XᵀX)⁻¹Xᵀy.
A polynomial regression model uses y = w₃x³ + w₂x² + w₁x + w₀. Is this model linear or nonlinear?
💡 Look at how the w's appear in the equation — are any of them squared or multiplied together?
This is a crucial distinction. The model is nonlinear IN x (it produces curves). But it's LINEAR IN THE PARAMETERS (w₀, w₁, w₂, w₃). Each parameter appears only once, multiplied by a fixed transformation of x. This means we can still use the normal equation (XᵀX)⁻¹Xᵀy — we just build X with columns [1, x, x², x³].
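Here is a minimal NumPy sketch of that idea, on toy cubic data assumed for illustration: build the feature columns [1, x, x², x³] by hand and solve the same normal equation used for plain linear regression.

```python
import numpy as np

# Toy cubic data (hypothetical): y ≈ 0.5x³ - x plus noise
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 40)
y = 0.5 * x**3 - x + rng.normal(0, 0.3, x.size)

# Linear in the parameters: each column is a fixed transformation of x
X = np.column_stack([np.ones_like(x), x, x**2, x**3])   # [1, x, x², x³]
w = np.linalg.solve(X.T @ X, X.T @ y)                   # w* = (XᵀX)⁻¹Xᵀy
print(w)  # roughly [0, -1, 0, 0.5]: one weight per feature column
```

In practice np.linalg.lstsq (or scikit-learn with PolynomialFeatures) is preferred over forming XᵀX explicitly, since these Vandermonde-style matrices become badly conditioned at higher degrees.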
Degree 2 vs 5 vs 20 — What Happens?
How to pick the right degree
The validation error curve typically looks like a U-shape:
- Low degree → high bias → high training AND validation error
- Right degree → low bias, low variance → low validation error
- High degree → low training error but HIGH validation error → overfitting
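A rough sketch of that degree sweep, using a synthetic sine dataset and a simple holdout split (both are assumptions for illustration, not this article's setup):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3, 3, 30))
y = np.sin(x) + rng.normal(0, 0.2, x.size)
x_tr, y_tr = x[::2], y[::2]      # 15 training points
x_va, y_va = x[1::2], y[1::2]    # 15 validation points

for deg in (1, 3, 5, 10):
    w = np.polyfit(x_tr, y_tr, deg)
    tr = np.mean((np.polyval(w, x_tr) - y_tr) ** 2)
    va = np.mean((np.polyval(w, x_va) - y_va) ** 2)
    print(f"degree {deg:2d}: train MSE {tr:.3f} | val MSE {va:.3f}")
# Typical pattern: degree 1 is bad on both sets, a moderate degree wins on
# validation, and high degrees keep shrinking train error while val error rises.
```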
You fit a degree-15 polynomial to 10 data points. Training MSE is nearly zero. What's likely happening?
💡 If you have 16 knobs to adjust and only 10 constraints...
With 16 parameters (degree 15 + intercept) and only 10 data points, the model has MORE parameters than data. It can pass through every single point exactly (zero training error) but the wild oscillations between points mean terrible predictions on new data. This is classic overfitting. Rule of thumb: keep the number of parameters well below the number of data points.
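That situation is easy to reproduce in a direct least-squares sketch with 16 polynomial features and only 10 points (synthetic sine data, assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 10)                      # only 10 data points
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.size)

X = np.vander(x, 16, increasing=True)          # degree 15 -> 16 feature columns
w, *_ = np.linalg.lstsq(X, y, rcond=None)      # more parameters than points
print(np.mean((X @ w - y) ** 2))               # training MSE is essentially 0
# The fit threads every point, but between them it oscillates wildly,
# so predictions at new x values are unreliable.
```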
The Overfitting Problem
What overfitting looks like mathematically
The coefficient explosion
Degree 2: y = 0.5x² - 2x + 1
Degree 15: y = 847x¹⁵ - 12340x¹⁴ + ...
Solution: penalize large coefficients.
You're choosing between a degree-2 and a degree-8 polynomial. Both have similar validation error. Which should you prefer?
💡 Think about what happens when the data distribution shifts slightly...
When two models perform equally well, always prefer the simpler one. The degree-2 model uses only 3 parameters vs 9 for degree-8. It's more interpretable, more robust to distribution shifts, and less likely to break on edge cases. This principle is called Occam's razor (or the principle of parsimony) and it's fundamental to all of ML.
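To connect this back to the coefficient explosion above, here's a rough check (synthetic quadratic data, assumed for illustration) of how the largest fitted weight tends to grow with degree:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 20)
y = 0.5 * x**2 - 2 * x + 1 + rng.normal(0, 0.05, x.size)

for deg in (2, 8, 15):
    X = np.vander(x, deg + 1, increasing=True)   # columns [1, x, ..., x^deg]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(f"degree {deg:2d}: max |weight| = {np.abs(w).max():.1f}")
# Low degrees keep small, stable weights; high degrees typically need large,
# nearly cancelling coefficients to chase the noise, which is exactly what
# Ridge and Lasso penalize (see the Ridge & Lasso link below).
```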
Polynomial Regression in the Real World
You have 3 input features and want to use degree-4 polynomial regression. How many terms will the model need (approximately)?
💡 It's not a simple multiplication — you need to include interaction terms like x₁²x₂...
With d features and degree p, the number of polynomial terms is C(d+p, p) = (d+p)!/(d!·p!). Here that's C(3+4, 4) = 7!/(3!·4!) = 35, including all cross-terms like x₁²x₂x₃. With 10 features and degree 4, it's C(14, 4) = 1001 terms! This combinatorial explosion is why polynomial regression is rarely used beyond 2-3 features — and why methods like Random Forests and neural nets dominate in high dimensions.
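That count is easy to verify (assuming Python's math module and, optionally, scikit-learn for a cross-check):

```python
from math import comb

# C(d + p, p): all terms of total degree ≤ p over d features, including the bias
print(comb(3 + 4, 4))     # 35   (3 features, degree 4)
print(comb(10 + 4, 4))    # 1001 (10 features, degree 4)

# Cross-check by actually expanding the features with scikit-learn
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
X = np.zeros((1, 3))                                            # one sample, 3 features
print(PolynomialFeatures(degree=4).fit_transform(X).shape[1])   # 35
```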
🎓 What You Now Know
✓ Linear regression fails on curved data — It can only model constant-rate relationships.
✓ Polynomial regression adds x², x³, … as features — It’s still linear in the weights, so the normal equation works.
✓ Higher degree ≠ better — Too high a degree leads to overfitting (memorizing noise).
✓ Use cross-validation to choose degree — Plot train vs validation error and pick the sweet spot.
✓ Polynomial regression doesn’t scale to many features — The number of terms explodes combinatorially.
Polynomial regression is the bridge between simple linear models and complex nonlinear ones. It teaches you overfitting, model selection, and the bias-variance tradeoff — the three most important concepts in all of ML. 🚀
↗ Keep Learning
Linear Regression — The Foundation of Machine Learning
A scroll-driven visual deep dive into linear regression. From data points to loss functions to gradient descent — understand the building block behind all of ML.
Ridge & Lasso — Taming Overfitting with Regularization
A scroll-driven visual deep dive into Ridge and Lasso regression. Learn why models overfit, how penalizing large weights fixes it, and why Lasso kills features.
Bias-Variance Tradeoff — The Most Important Concept in ML
A scroll-driven visual deep dive into the bias-variance tradeoff. Learn why every model makes errors, how underfitting and overfitting emerge, and how to balance them.