Named Entity Recognition — Teaching Machines to Find Names in Text
A scroll-driven visual deep dive into Named Entity Recognition (NER). From rule-based to CRF to transformer-based approaches — learn how search engines and email services extract people, places, companies, and dates from unstructured text.
📍
Your inbox knows
who, what, where, and when.
“Meeting with Dr. Sarah Chen at Google HQ in Mountain View on March 15th at 2pm.” Gmail extracts every entity — person, organization, location, date — and auto-creates a calendar event. That’s Named Entity Recognition: finding structured facts in unstructured text.
↓ Scroll to learn how machines find names in a sea of words
What Are Named Entities?
Three Eras of NER
A rule-based NER system uses capitalization to find names: 'Capitalized words after periods are likely names.' It processes: 'I love Paris. Great city.' What happens?
💡 What happens to every word that starts a new sentence?
This is why rule-based NER is brittle: 'Great' is capitalized because it starts a sentence after a period — not because it's a named entity. The rule 'capitalized word = entity' produces massive false positives at sentence boundaries. CRF and neural models learn to distinguish sentence-initial capitalization from entity capitalization by also considering: (1) the word's frequency, (2) whether it appears capitalized mid-sentence elsewhere, (3) surrounding context. This single example shows why ML-based NER replaced rules.
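A minimal sketch of that capitalization rule (purely illustrative Python, not a real system) makes the failure concrete:

```python
import re

def rule_based_ner(text):
    """Toy rule from above: treat every capitalized word as an entity candidate."""
    # Flags any word starting with an uppercase letter, including words that are
    # capitalized only because they begin a new sentence after a period.
    return re.findall(r"\b[A-Z][a-z]+\b", text)

print(rule_based_ner("I love Paris. Great city."))
# ['Paris', 'Great']  <- 'Great' is a false positive: it is capitalized only
#                        because it follows a sentence boundary.
```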
The BIO Tagging Scheme
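As a concrete illustration, here is the Gmail sentence from the intro hand-tagged with BIO labels (one reasonable annotation; a sketch, not output from any model):

```python
# One BIO label per token: B- starts an entity, I- continues it, O is outside.
tokens = ["Dr.", "Sarah", "Chen", "at", "Google", "HQ", "in",
          "Mountain", "View", "on", "March", "15th"]
labels = ["O", "B-PER", "I-PER", "O", "B-ORG", "O", "O",
          "B-LOC", "I-LOC", "O", "B-DATE", "I-DATE"]

for token, label in zip(tokens, labels):
    print(f"{token:10s} {label}")
```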
CRF: Why Label Dependencies Matter
CRF: jointly optimizing the full label sequence
Independent model: P(tag₁) × P(tag₂) × ... × P(tagₙ)
CRF model: P(tag₁, tag₂, ..., tagₙ | x₁, x₂, ..., xₙ)
Score(y|x) = Σᵢ [emission(yᵢ, xᵢ) + transition(yᵢ₋₁, yᵢ)]
transition(B-PER → I-PER) = HIGH
transition(I-PER → I-LOC) = IMPOSSIBLE

A simple NER model (without CRF) labels 'New York City' as: New=B-LOC, York=B-LOC, City=O. What went wrong, and how does CRF fix it?
💡 What's the difference between B-LOC → B-LOC and B-LOC → I-LOC?
Without a CRF layer, each token is labeled independently. The model sees 'New' (capitalized, location-like) → B-LOC. Then 'York' (capitalized, location-like) → B-LOC again, starting a NEW entity instead of continuing. 'City' (common word) → O. This fragments 'New York City' into two entities: 'New' and 'York.' A CRF adds transition scores: P(I-LOC | B-LOC) >> P(B-LOC | B-LOC), so the optimal sequence becomes B-LOC, I-LOC, I-LOC — correctly recognizing the full 3-word entity.
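A tiny self-contained sketch (hand-set toy scores, not a trained model) shows how transition scores pull Viterbi decoding toward B-LOC, I-LOC, I-LOC for 'New York City', while per-token decoding fragments it:

```python
# Toy CRF decoding for "New York City". Emission and transition scores are
# hand-set for illustration only.
TAGS = ["O", "B-LOC", "I-LOC"]

emissions = {                         # how much each word "looks like" each tag
    "New":  {"O": 0.1, "B-LOC": 2.0, "I-LOC": 0.5},
    "York": {"O": 0.1, "B-LOC": 2.0, "I-LOC": 1.5},
    "City": {"O": 1.0, "B-LOC": 0.2, "I-LOC": 0.8},
}
transitions = {                       # score for tag_prev -> tag_curr
    ("<s>", "O"): 0.0, ("<s>", "B-LOC"): 0.0, ("<s>", "I-LOC"): -1e4,
    ("O", "O"): 0.5,   ("O", "B-LOC"): 0.5,   ("O", "I-LOC"): -1e4,  # I- can't follow O
    ("B-LOC", "O"): 0.0, ("B-LOC", "B-LOC"): -1.0, ("B-LOC", "I-LOC"): 2.0,
    ("I-LOC", "O"): 0.0, ("I-LOC", "B-LOC"): -1.0, ("I-LOC", "I-LOC"): 1.5,
}

def independent_decode(words):
    """No CRF: pick the best emission for each word on its own."""
    return [max(TAGS, key=lambda t: emissions[w][t]) for w in words]

def viterbi_decode(words):
    """CRF-style decoding: best whole sequence under emission + transition scores."""
    score = {t: transitions[("<s>", t)] + emissions[words[0]][t] for t in TAGS}
    backpointers = []
    for w in words[1:]:
        new_score, pointers = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: score[p] + transitions[(p, t)])
            new_score[t] = score[prev] + transitions[(prev, t)] + emissions[w][t]
            pointers[t] = prev
        score, backpointers = new_score, backpointers + [pointers]
    best = max(TAGS, key=score.get)
    path = [best]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))

words = ["New", "York", "City"]
print(independent_decode(words))  # ['B-LOC', 'B-LOC', 'O']     -> fragmented
print(viterbi_decode(words))      # ['B-LOC', 'I-LOC', 'I-LOC'] -> one 3-word entity
```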
Modern NER: BERT + CRF
The sentence is: 'I ate an apple while reading about Apple on my Apple Watch.' How should a BERT-based NER model label each occurrence of 'apple'?
💡 Does BERT produce the same vector for 'apple' in 'apple pie' and 'Apple Inc.'?
This is exactly what contextual embeddings solve. In a non-contextual model (Word2Vec), 'apple' always has the SAME vector regardless of context. But BERT produces DIFFERENT embeddings for each occurrence: 'ate an apple' context → fruit → O tag. 'reading about Apple' context → company → B-ORG. 'Apple Watch' context → brand + product → B-ORG I-ORG. This context-dependent representation is why BERT revolutionized NER.
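As a rough sketch of what this looks like in practice, a pretrained BERT-based NER model can be run through the Hugging Face transformers pipeline. The checkpoint named below is just one publicly available example, and the exact labels and scores depend on that model:

```python
# Sketch: running a pretrained BERT-based NER model via Hugging Face transformers.
# Assumes `pip install transformers torch`; dslim/bert-base-NER is one public
# BERT checkpoint fine-tuned on CoNLL-2003, used here purely as an example.
from transformers import pipeline

ner = pipeline(
    "ner",
    model="dslim/bert-base-NER",
    aggregation_strategy="simple",   # merge B-/I- word pieces into whole entities
)

text = "I ate an apple while reading about Apple on my Apple Watch."
for entity in ner(text):
    print(entity["word"], entity["entity_group"], round(entity["score"], 2))

# Expected shape of the output (exact labels/scores vary by model):
#   the capitalized 'Apple' mentions come back as entities (e.g. ORG/MISC),
#   while the lowercase 'apple' (the fruit) is not tagged at all.
```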
Real-World NER Applications
Your NER system tags 'Apple' as ORG and 'Cupertino' as LOC correctly. But for 'Jordan played basketball,' it tags Jordan as LOC instead of PER. What's the core problem and fix?
💡 The same word can refer to different entity types depending on context — what resolves this?
Entity ambiguity is one of NER's hardest problems. 'Washington' can be a person, state, city, or university. 'Amazon' can be company, river, or streaming service. NER labels span + type, but entity linking (disambiguation) resolves WHICH entity. Modern systems use a pipeline: NER extracts mentions → entity linking matches each to a knowledge base using context. Wikipedia-based entity linking achieves ~90% accuracy by encoding context and candidate descriptions with cross-encoders.
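A deliberately tiny sketch of the linking step (toy candidates and bag-of-words overlap instead of a real knowledge base and cross-encoder) shows the shape of the pipeline: NER hands over a mention plus its sentence context, and the linker ranks candidate entities by how well their descriptions match that context:

```python
# Toy entity linking: rank knowledge-base candidates for a mention by how much
# their description overlaps with the mention's context. Real systems use a
# knowledge base such as Wikipedia and a trained cross-encoder instead.
CANDIDATES = {
    "Jordan": [
        ("Michael Jordan (PER)", "american basketball player chicago bulls nba"),
        ("Jordan (LOC)", "country middle east amman river kingdom"),
    ],
}

def link(mention, context):
    context_words = set(context.lower().split())
    def overlap(candidate):
        _, description = candidate
        return len(context_words & set(description.split()))
    return max(CANDIDATES[mention], key=overlap)[0]

print(link("Jordan", "Jordan played basketball for the Bulls in the NBA"))
# -> 'Michael Jordan (PER)'
```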
🎓 What You Now Know
✓ NER is sequence labeling, not classification — every token gets a label (B-PER, I-PER, O, etc.), making it fundamentally harder than document-level tasks.
✓ BIO tagging handles multi-word entities — B marks where an entity starts, I marks continuation, O marks non-entities.
✓ CRFs model label dependencies — preventing invalid sequences like I-PER following B-LOC, and helping multi-word entities stay together.
✓ BERT revolutionized NER — contextual embeddings disambiguate “apple” (fruit) vs “Apple” (company) based on surrounding words, eliminating the need for hand-crafted features.
✓ NER powers real products — Gmail auto-creating calendar events, Google’s Knowledge Graph, financial contract analysis, and clinical text mining all rely on NER.
Named Entity Recognition is how machines extract structure from chaos. Every email auto-categorized, every knowledge panel shown, every medical record digitized — NER is working quietly behind the scenes. 📍
↗ Keep Learning
Text Preprocessing — Turning Messy Words into Clean Features
A scroll-driven visual deep dive into text preprocessing. Learn tokenization, stemming, lemmatization, stopword removal, and normalization — the essential first step of every NLP pipeline.
Text Classification — Teaching Machines to Sort Your Inbox
A scroll-driven visual deep dive into text classification. From spam filters to Gmail's categories — learn how ML models read text, extract features, and assign labels at scale.
Word Embeddings — When Words Learned to Be Vectors
A scroll-driven visual deep dive into word embeddings. Learn how Word2Vec, GloVe, and FastText turn words into dense vectors where meaning becomes geometry — and why 'king - man + woman = queen' actually works.
Query Understanding — What Did the User Actually Mean?
A scroll-driven visual deep dive into query understanding. From spell correction to query expansion to intent classification — learn how search engines interpret ambiguous, misspelled, and complex queries.