Intelligence
A database is a promise about durability and consistency; ML is a bet that patterns in old data predict new data.
On this page
The working table of contents.
- What ML actually is — instead of writing rules, you show examples and let the machine find the pattern. Training = finding the pattern. Inference = using it on new data.
- The three learning modes — supervised (labeled examples: "this image is a cat"), unsupervised (no labels: "find clusters in this data"), reinforcement (trial and error: "maximize this score").
- Neural networks — layers of simple math functions that, stacked deep enough, can approximate surprisingly complex patterns. The key mechanism: backpropagation (adjust weights based on how wrong you were).
- The architectures that matter now — CNN (spatial patterns: images), RNN/LSTM (sequential patterns: time-series), Transformer (attention: the architecture behind LLMs). Why attention changed everything (process all positions at once, not one by one).
- LLMs — what they are (giant transformers trained on internet text to predict the next token), what they can and can't do, the training stack (pre-training → fine-tuning → RLHF → deployment), prompt engineering as the new interface.
- RAG — when the model doesn't know something, give it relevant documents at query time. Embeddings, vector search, retrieval, generation.
- Agents — LLMs that can use tools, make plans, and take actions. The loop: think → act → observe → think again.
- The honest limits — hallucination, evaluation difficulty, cost, data quality as the real bottleneck, the gap between demo and production.
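The "adjust weights based on how wrong you were" mechanism from the neural-network bullet can be sketched in a few lines. This is a toy, not backpropagation through a deep network: one weight, squared-error loss, plain gradient descent, with invented data and learning rate.

```python
# Minimal sketch: one weight, loss = (prediction - target)^2,
# weight nudged against the gradient of the loss.

def train(samples, lr=0.1, steps=100):
    w = 0.0  # start from a guess
    for _ in range(steps):
        for x, y in samples:
            pred = w * x               # forward pass: make a prediction
            grad = 2 * (pred - y) * x  # backward pass: d(loss)/dw
            w -= lr * grad             # update: step against the gradient
    return w

# The data follows y = 2x, so training should drive w toward 2.
w = train([(1, 2), (2, 4), (3, 6)])
```

A real network does the same thing per weight, with the chain rule carrying the error backward through the layers.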
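The "process all positions at once" claim about attention is concrete enough to show. A hedged sketch with invented two-dimensional vectors: a query scores every key simultaneously via dot products, the scores become weights through a softmax, and the output is a weighted sum of the values.

```python
import math

def attention(query, keys, values):
    # Score every position at once: one dot product per key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Softmax turns scores into weights that sum to 1 (max-subtracted for stability).
    exps = [math.exp(s - max(scores)) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # Output: weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# The query matches the first key, so the output leans toward the first value.
out = attention([1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
```

Contrast with an RNN, which would have to consume the positions one at a time; here nothing in the loop depends on a previous position, which is what makes the computation parallelizable.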
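The RAG retrieval step can also be sketched. Everything here is a stand-in: word counts play the role of learned embeddings, and the documents are invented; a real system would use a trained embedding model and a vector index, then hand the best document to the LLM along with the query.

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": bag-of-words counts (a real system uses a learned model).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, docs):
    # Return the document most similar to the query.
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = ["cats are small furry animals",
        "databases store rows durably"]
best = retrieve("how do databases store data", docs)
```

The retrieved text is then pasted into the prompt, so the model generates from documents it was never trained on.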
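The agent loop (think → act → observe → think again) reduces to a small control structure. In this sketch the "think" step is a hypothetical keyword stub standing in for an LLM call, and the single tool and task are invented for illustration.

```python
def calculator(expr):
    return str(eval(expr))  # toy tool: evaluate an arithmetic expression

TOOLS = {"calculator": calculator}

def agent(task, max_steps=3):
    observations = []
    for _ in range(max_steps):
        # think: decide what to do next (stub logic standing in for the LLM)
        if "+" in task and not observations:
            action, arg = "calculator", task
        else:
            # nothing left to do: answer with the last observation, if any
            return observations[-1] if observations else None
        # act: run the chosen tool
        result = TOOLS[action](arg)
        # observe: the result feeds the next thinking step
        observations.append(result)

answer = agent("2 + 3")
```

The structure is the point: the loop, the tool registry, and the observation history survive unchanged when the stub is replaced by a model deciding which tool to call.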
Going deeper
Branches that earn their own article.
- Classical ML algorithms (linear/logistic regression, decision trees, random forests, SVMs, k-means, PCA).
- Deep learning math (loss functions, optimizers — SGD, Adam, learning rate schedules).
- Vision architectures (ResNet, EfficientNet, Vision Transformer).
- RNN/LSTM/GRU deep dives.
- Transformer internals (multi-head attention, positional encoding, layer norm).
- Diffusion models (Stable Diffusion, DALL-E).
- Training infrastructure (distributed training, mixed precision, DeepSpeed, FSDP).
- Fine-tuning methods (LoRA, QLoRA, prefix tuning, RLHF, DPO).
- Inference optimization (quantization, distillation, speculative decoding, KV cache).
- Evaluation frameworks and benchmarks.
- MLOps (model registries, feature stores, experiment tracking, monitoring).
- Responsible AI (bias, fairness, interpretability, red-teaming).
- Multimodal models (vision-language, audio-language).