Logistic Regression — The Classifier That's Not Really Regression
A scroll-driven visual deep dive into logistic regression. Learn how a regression model becomes a classifier, why the sigmoid is the key, and how log-loss trains it.
Classification Fundamentals
How do you predict yes or no?
Will this email be spam? Will this patient have diabetes? Will this user click? Regression gives numbers. Classification gives answers.
From Numbers to Probabilities
Logistic regression in three steps
z = w₁x₁ + w₂x₂ + ... + b
σ(z) = 1 / (1 + e⁻ᶻ)
ŷ = 1 if σ(z) ≥ 0.5, else 0
What's the output range of the sigmoid function σ(z)?
💡 What happens to 1/(1+e⁻ᶻ) as z → +∞? The denominator becomes 1+0 = 1...
The sigmoid σ(z) = 1/(1+e⁻ᶻ) asymptotically approaches 0 as z→-∞ and approaches 1 as z→+∞, but never actually reaches either. Its output is always strictly between 0 and 1 — open interval (0,1). This makes it perfect for modeling probabilities.
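To make these three steps concrete, here is a minimal NumPy sketch; the weights, bias, and feature values are made up purely for illustration:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real-valued score into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a 2-feature model
w = np.array([0.8, -1.2])
b = 0.5

def predict_proba(x):
    z = np.dot(w, x) + b      # step 1: linear score
    return sigmoid(z)         # step 2: squash to a probability

def predict(x, threshold=0.5):
    return int(predict_proba(x) >= threshold)  # step 3: threshold the probability

x = np.array([2.0, 1.0])
print(predict_proba(x))  # probability of class 1 (≈ 0.71 here)
print(predict(x))        # the hard yes/no answer (1)
```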
The Decision Boundary
Logistic regression's decision boundary is always what shape?
💡 Set σ(z) = 0.5. What does z equal? What shape does w·x + b = 0 define?
The decision boundary is where σ(z) = 0.5, which means z = 0, which means w₁x₁ + w₂x₂ + ... + b = 0. That's the equation of a hyperplane (a line in 2D, a plane in 3D). Logistic regression can ONLY learn linear boundaries. If the true boundary is curved, you need polynomial features, SVMs with kernels, or neural networks.
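To see the linearity directly, this small NumPy sketch (with made-up 2D weights and bias) solves w₁x₁ + w₂x₂ + b = 0 for x₂ and confirms that every point on the resulting line gets probability exactly 0.5:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 2D model: boundary is w1*x1 + w2*x2 + b = 0
w1, w2, b = 1.5, -2.0, 0.4

# Solving for x2 gives a straight line: x2 = -(w1*x1 + b) / w2
x1 = np.linspace(-3, 3, 7)
x2 = -(w1 * x1 + b) / w2

# Every point on that line scores z = 0, i.e. sigmoid(z) = 0.5
z = w1 * x1 + w2 * x2 + b
print(np.allclose(z, 0.0))           # True
print(np.allclose(sigmoid(z), 0.5))  # True
```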
How We Train It: Log-Loss
Binary cross-entropy (log-loss)
−y · log(ŷ)
If the true class is 1, the loss is −log(ŷ). A confident correct prediction (ŷ ≈ 0.99) costs nearly zero, but a confident wrong prediction (ŷ ≈ 0.01) costs ~4.6, and the penalty grows without bound as ŷ → 0.
−(1−y) · log(1−ŷ)
If the true class is 0, the loss is −log(1−ŷ). The model is penalized for assigning high probability to the wrong class; the more confident the mistake, the steeper the penalty.
Together, the per-sample loss is −[y · log(ŷ) + (1−y) · log(1−ŷ)]; for any given sample only one of the two terms is nonzero.
A logistic regression model predicts 0.99 probability for a sample that is actually class 0. What's the log-loss for this sample?
💡 Plug in: y=0, ŷ=0.99. Loss = -log(1-0.99) = -log(0.01) = ?
When y=0, loss = -log(1-ŷ) = -log(1-0.99) = -log(0.01) ≈ 4.6. The model is 99% confident about the WRONG class. Log-loss punishes confident mistakes severely: the penalty grows without bound as the predicted probability approaches the wrong extreme, while MSE would charge only (0.99)² ≈ 0.98 for the same mistake. That pressure forces the model to be well-calibrated rather than overconfident.
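A quick NumPy sanity check of these numbers; the predicted probabilities below are just the illustrative values from above:

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-15):
    # Log-loss for a single sample; clip to avoid log(0)
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

print(binary_cross_entropy(1, 0.99))  # confident and correct: ~0.01
print(binary_cross_entropy(0, 0.99))  # confident and wrong:   ~4.6
print(binary_cross_entropy(0, 0.50))  # unsure:                ~0.69
```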
Beyond Binary: Multi-class Classification
Softmax for multi-class
z₁, z₂, ..., zₖ = raw scores for K classes
P(class k) = eᶻᵏ / Σⱼ eᶻʲ
Prediction = argmax(P(class 1), ..., P(class K))
In logistic regression with 5 classes, how many weight vectors does the model learn?
💡 Look at the softmax formula. Where do z₁, z₂, ..., z₅ come from?
Each class k has its own weight vector wₖ and bias bₖ. For input x, the raw score for class k is zₖ = wₖᵀx + bₖ. These K scores are fed through softmax to produce K probabilities. With 5 classes and d features, the model has 5(d+1) parameters total.
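A minimal NumPy sketch of this setup, using hypothetical sizes K = 5 and d = 3 so the 5(d+1) parameter count is easy to verify:

```python
import numpy as np

K, d = 5, 3                       # 5 classes, 3 features (illustrative sizes)
rng = np.random.default_rng(0)
W = rng.normal(size=(K, d))       # one weight vector per class
b = rng.normal(size=K)            # one bias per class
print(W.size + b.size)            # 5 * (3 + 1) = 20 parameters

def softmax(z):
    z = z - z.max()               # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

x = np.array([1.0, -0.5, 2.0])    # a single input with d = 3 features
z = W @ x + b                     # K raw scores
p = softmax(z)                    # K probabilities
print(p.sum())                    # 1.0
print(p.argmax())                 # index of the predicted class
```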
When to Use Logistic Regression
🎓 What You Now Know
✓ Logistic regression = linear model + sigmoid — Squash z ∈ (-∞,∞) into p ∈ (0,1).
✓ Decision boundary is always linear — A hyperplane where σ(z) = 0.5.
✓ Train with log-loss, not MSE — Cross-entropy is convex and punishes confident mistakes.
✓ Multi-class uses softmax — K scores → K probabilities that sum to 1.
✓ Always start with logistic regression — It’s fast, interpretable, and sets a strong baseline (see the sketch after this list).
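As a concrete starting point, here is a minimal baseline sketch with scikit-learn; the synthetic dataset and the settings are placeholders, not taken from any real problem:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for a real dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

probs = clf.predict_proba(X_test)[:, 1]  # predicted probability of class 1
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
print("log-loss:", log_loss(y_test, probs))
print("weights:", clf.coef_.shape)       # one weight per feature
```

The fitted coef_ and intercept_ are exactly the w and b from earlier, so the baseline is not just fast to train but also directly interpretable.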
Logistic regression is the workhorse of classification. It’s used everywhere — spam detection, medical diagnosis, click-through prediction, credit scoring. And the sigmoid + cross-entropy combination is the same output layer used in neural networks. You just learned a neural net’s final layer. 🚀
↗ Keep Learning
Linear Regression — The Foundation of Machine Learning
A scroll-driven visual deep dive into linear regression. From data points to loss functions to gradient descent — understand the building block behind all of ML.
Accuracy, Precision, Recall & F1 — Choosing the Right Metric
A scroll-driven visual deep dive into classification metrics. Learn why accuracy misleads, what precision and recall actually measure, and when to use F1, F2, or something else entirely.
ROC Curves & AUC — Measuring Classifier Performance Visually
A scroll-driven visual deep dive into ROC curves and AUC. Learn TPR vs FPR, why AUC is threshold-independent, and when to use ROC vs PR curves.
Naive Bayes — Why 'Stupid' Assumptions Work Brilliantly
A scroll-driven visual deep dive into Naive Bayes. Learn Bayes' theorem, why the 'naive' independence assumption is wrong but works anyway, and why it dominates spam filtering.