Showing 53 articles
Confusion Matrix Deep Dive — What Your Model Gets Wrong and Why
A scroll-driven deep dive into the confusion matrix. Master TP, TN, FP, FN, and learn to derive every classification metric from a single 2×2 table.
Language Models & N-grams — Predicting the Next Word Since 1948
A scroll-driven visual deep dive into statistical language models. From Shannon's information theory to N-grams to the bridge to transformers — learn how autocomplete, spell check, and speech recognition all predict what comes next.
ROC Curves & AUC — Measuring Classifier Performance Visually
A scroll-driven visual deep dive into ROC curves and AUC. Learn TPR vs FPR, why AUC is threshold-independent, and when to use ROC vs PR curves.
Topic Modeling — Discovering Hidden Themes in Millions of Documents
A scroll-driven visual deep dive into topic modeling. From LDA to NMF to neural topic models — learn how search engines and email services automatically categorize, cluster, and label unstructured text at scale.
Accuracy, Precision, Recall & F1 — Choosing the Right Metric
A scroll-driven visual deep dive into classification metrics. Learn why accuracy misleads, what precision and recall actually measure, and when to use F1, F2, or something else entirely.
Named Entity Recognition — Teaching Machines to Find Names in Text
A scroll-driven visual deep dive into Named Entity Recognition (NER). From rule-based to CRF to transformer-based approaches — learn how search engines and email services extract people, places, companies, and dates from unstructured text.
Feature Engineering — The Art That Makes or Breaks Your Model
A scroll-driven visual deep dive into feature engineering. Learn transformations, encoding, interaction features, handling missing data, and why feature engineering matters more than model choice.
Text Similarity — From Jaccard to Neural Matching
A scroll-driven visual deep dive into text similarity. Learn how search engines detect duplicates, match queries to documents, and measure how 'close' two texts really are — from set overlap to cosine similarity to learned embeddings.
Cross-Validation & Hyperparameter Tuning — How to Actually Evaluate Models
A scroll-driven visual deep dive into cross-validation and hyperparameter tuning. Learn K-fold CV, stratified splitting, grid search, random search, and Bayesian optimization.
Query Understanding — What Did the User Actually Mean?
A scroll-driven visual deep dive into query understanding. From spell correction to query expansion to intent classification — learn how search engines interpret ambiguous, misspelled, and complex queries.
Bias-Variance Tradeoff — The Most Important Concept in ML
A scroll-driven visual deep dive into the bias-variance tradeoff. Learn why every model makes errors, how underfitting and overfitting emerge, and how to balance them.
Information Retrieval — How Search Engines Find Your Needle in a Billion Haystacks
A scroll-driven visual deep dive into information retrieval. From inverted indices to BM25 to learning-to-rank — learn how Google, Bing, and enterprise search find the most relevant documents in milliseconds.
Search Reranking — The Two-Stage Pipeline That Powers Production Search
A visual deep dive into the retrieve + rerank pipeline. How BM25, dense retrieval, and learned sparse retrieval feed into Reciprocal Rank Fusion, then cross-encoder reranking — with a full latency budget breakdown.
Bagging vs Boosting — The Two Philosophies of Ensemble Learning
A scroll-driven visual deep dive comparing bagging and boosting. Learn when to average independent models vs sequentially correct errors, and why ensembles dominate ML.
Sentiment Analysis — Reading Between the Lines at Scale
A scroll-driven visual deep dive into sentiment analysis. Learn how machines detect opinion, sarcasm, and emotion in text — from star ratings to brand monitoring to Gmail's tone detection.
Cross-Encoders vs Bi-Encoders — Why Accuracy Costs 1000× More Compute
A visual deep dive into the architectural difference between bi-encoders and cross-encoders. Why cross-attention produces higher-quality relevance scores, and why cross-encoders are practical only for reranking a short candidate list, not for first-stage retrieval.
DBSCAN — Finding Clusters of Any Shape
A scroll-driven visual deep dive into DBSCAN. Learn density-based clustering, core/border/noise points, choosing epsilon and minPts, and when DBSCAN beats K-Means.
Spam Detection — The Original ML Success Story
A scroll-driven visual deep dive into spam detection. From Bayesian filters to modern adversarial ML — learn how email services block 15 billion spam messages daily and why spammers keep finding ways around it.
BM25 — The 30-Year-Old Algorithm That Still Wins at Search
A visual deep dive into BM25 scoring. Understand every term in the formula (IDF, TF saturation via k₁, length normalization via b) and why BM25 can still outperform neural retrievers on specialized vocabularies.
PCA — Compressing Reality Without Losing the Plot
A scroll-driven visual deep dive into Principal Component Analysis. Learn eigenvectors, variance maximization, dimensionality reduction, and when PCA transforms your data — and when it doesn't.
Text Classification — Teaching Machines to Sort Your Inbox
A scroll-driven visual deep dive into text classification. From spam filters to Gmail's categories — learn how ML models read text, extract features, and assign labels at scale.
HNSW — Hierarchical Navigable Small World Graphs for Vector Search
A visual deep dive into HNSW, the dominant graph-based index for vector search. Understand the multi-layer graph structure, the M and ef parameters, complexity analysis, and how it compares to IVF, LSH, and brute force.
K-Means Clustering — Grouping Data Without Labels
A scroll-driven visual deep dive into K-Means clustering. Learn the iterative algorithm, choosing K with the elbow method, limitations, and when to use alternatives.
Word Embeddings — When Words Learned to Be Vectors
A scroll-driven visual deep dive into word embeddings. Learn how Word2Vec, GloVe, and FastText turn words into dense vectors where meaning becomes geometry — and why 'king - man + woman = queen' actually works.
IVF Index — Partitioning Vector Space with K-Means for Fast Search
A visual deep dive into the Inverted File Index (IVF). How k-means clustering partitions billion-scale vector collections into searchable regions, the nprobe recall-speed knob, and IVF complexity analysis.
Gradient Boosting & XGBoost — The Kaggle King
A scroll-driven visual deep dive into gradient boosting. Learn how weak learners combine sequentially, how XGBoost optimizes the process, and why it dominates tabular ML competitions.
Bag of Words & TF-IDF — How Search Engines Ranked Before AI
A scroll-driven visual deep dive into Bag of Words and TF-IDF. Learn how documents become vectors, why term frequency alone fails, and how IDF rescues relevance — the backbone of search before neural models.
Approximate Nearest Neighbor Search — Trading 1% Accuracy for 1000× Speed
A visual deep dive into ANN search. Why brute-force nearest neighbor fails at scale, how approximate methods can reach around 99% recall with roughly logarithmic query time, and the fundamental accuracy-speed tradeoff behind every vector search system.
Hash Tables — O(1) Access That Powers Everything
A scroll-driven visual deep dive into hash tables. From hash functions to collision resolution (chaining vs open addressing) to load factors — understand the data structure behind caches, databases, and deduplication.
Linked Lists — Pointers, Flexibility, and the Tradeoffs vs Arrays
A scroll-driven visual deep dive into linked lists. Singly, doubly, and circular variants. Understand pointer manipulation, O(1) insertion/deletion at a known node, and real-world uses in LRU caches, OS schedulers, and undo systems.
Sorting Algorithms — From Bubble Sort to Radix Sort
A scroll-driven visual deep dive into sorting algorithms. Compare bubble, insertion, merge, quick, heap, counting, and radix sort — with complexity analysis, stability, and real-world applications in databases, graphics, and event systems.
String Manipulation — Patterns, Hashing, and Sliding Windows
A scroll-driven visual deep dive into string algorithms. From brute-force pattern matching to KMP, Rabin-Karp hashing, and sliding window techniques — with real-world applications in search engines, DNA analysis, and parsers.
Trees — Hierarchies, Traversals, and Balanced Search
A scroll-driven visual deep dive into tree data structures. Binary trees, BSTs, AVL trees, heaps, and B-trees — with real-world applications in databases, file systems, priority queues, and decision engines.
Tries — Prefix Trees for Autocomplete, Spell Check, and IP Routing
A scroll-driven visual deep dive into trie data structures. From basic prefix trees to compressed tries and radix trees — with real-world applications in autocomplete, spell checking, IP routing tables, and T9 keyboard input.
Random Forests — Why 1000 Bad Models Beat 1 Good One
A scroll-driven visual deep dive into Random Forests. Learn bagging, feature randomness, out-of-bag error, and why ensembles are among the most reliable ML techniques.
Text Preprocessing — Turning Messy Words into Clean Features
A scroll-driven visual deep dive into text preprocessing. Learn tokenization, stemming, lemmatization, stopword removal, and normalization — the essential first step of every NLP pipeline.
Decision Trees — How Machines Learn to Ask Questions
A scroll-driven visual deep dive into decision trees. Learn how trees split data, what Gini impurity and information gain mean, and why trees overfit like crazy.
MSE, MAE, R-Squared and Beyond — Regression Metrics That Actually Matter
A scroll-driven deep dive into regression metrics. Understand MSE, RMSE, MAE, MAPE, R-squared, and Adjusted R-squared — when to use each, their gotchas, and how to report results properly.
Support Vector Machines — Finding the Perfect Boundary
A scroll-driven visual deep dive into SVMs. Learn about maximum margin, the kernel trick, support vectors, and why SVMs dominated ML before deep learning.
Naive Bayes — Why 'Stupid' Assumptions Work Brilliantly
A scroll-driven visual deep dive into Naive Bayes. Learn Bayes' theorem, why the 'naive' independence assumption is wrong but works anyway, and why it long dominated spam filtering.
K-Nearest Neighbors — The Algorithm with No Training Step
A scroll-driven visual deep dive into KNN. Learn how the laziest algorithm in ML works, why distance metrics matter, and how the curse of dimensionality kills it.
Logistic Regression — The Classifier That's Not Really Regression
A scroll-driven visual deep dive into logistic regression. Learn how a regression model becomes a classifier, why the sigmoid is the key, and how log-loss trains it.
Ridge & Lasso — Taming Overfitting with Regularization
A scroll-driven visual deep dive into Ridge and Lasso regression. Learn why models overfit, how penalizing large weights fixes it, and why Lasso kills features.
Polynomial Regression — When Lines Aren't Enough
A scroll-driven visual deep dive into polynomial regression. See why straight lines fail, how curves capture nonlinear patterns, and when you're overfitting vs underfitting.
Flash Attention — Making Transformers Actually Fast
A scroll-driven visual deep dive into Flash Attention. Learn why standard attention is bottlenecked by memory traffic, how GPU memory works, and how tiling removes that bottleneck, with quizzes to test your understanding.
Speculative Decoding — Making LLMs Think Ahead
A scroll-driven visual deep dive into speculative decoding. Learn why LLM inference is slow, how a small 'draft' model can speed up a big model by 2-3x, and why the output distribution stays mathematically identical to the big model's.
Hello World — Why I'm Building This Site
A quick intro to who I am, why I decided to build this portfolio, and what you can expect to find here.
Transformers — The Architecture That Changed AI
A scroll-driven visual deep dive into the Transformer architecture. From RNNs to self-attention to GPT — understand the engine behind every modern AI model.
Linear Regression — The Foundation of Machine Learning
A scroll-driven visual deep dive into linear regression. From data points to loss functions to gradient descent — understand the building block behind all of ML.
RLHF — How AI Learns to Follow Human Instructions
A visual deep dive into Reinforcement Learning from Human Feedback. From pretraining to reward models to PPO — understand how ChatGPT went from autocomplete to assistant.
Caching — The Art of Remembering What's Expensive to Compute
A visual deep dive into caching. From CPU caches to CDNs — understand cache strategies, eviction policies, and the hardest problem in computer science: cache invalidation.
Database Sharding — Scaling Beyond One Machine
A visual deep dive into database sharding. From single-server bottlenecks to consistent hashing — understand how companies scale their databases to billions of rows.
Vector Databases — Search by Meaning, Not Keywords
A visual deep dive into vector databases. From embeddings to ANN search to HNSW — understand how AI-powered search finds what you actually mean, not just what you typed.