Intelligence

A database is a promise about durability and consistency; ML is a bet that patterns in old data predict new data.

On this page

The working table of contents.

  1. What ML actually is — instead of writing rules, you show examples and let the machine find the pattern. Training = finding the pattern. Inference = using it on new data; the line-fit sketch after this list shows both.
  2. The three learning modes — supervised (labeled examples: "this image is a cat"), unsupervised (no labels: "find clusters in this data"), reinforcement (trial and error: "maximize this score"); the k-means sketch below shows the unsupervised case.
  3. Neural networks — layers of simple math functions that, stacked deep enough, can approximate surprisingly complex patterns. The key mechanism: backpropagation (adjust weights based on how wrong you were; XOR sketch below).
  4. The architectures that matter now — CNN (spatial patterns: images), RNN/LSTM (sequential patterns: time-series), Transformer (attention: the architecture behind LLMs). Why attention changed everything (process all positions at once, not one by one; attention sketch below).
  5. LLMs — what they are (giant transformers trained on internet text to predict the next token), what they can and can't do, the training stack (pre-training → fine-tuning → RLHF → deployment), prompt engineering as the new interface (bigram sketch below).
  6. RAG (retrieval-augmented generation) — when the model doesn't know something, give it relevant documents at query time. The pipeline: embeddings, vector search, retrieval, generation (retrieval sketch below).
  7. Agents — LLMs that can use tools, make plans, and take actions. The loop: think → act → observe → think again (agent-loop sketch below).
  8. The honest limits — hallucination, evaluation difficulty, cost, data quality as the real bottleneck, the gap between demo and production.
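
Below, one toy sketch per point above where code clarifies the mechanism. Everything in them (data, sizes, names) is invented for illustration; nothing is a specific library's API. First, point 1's training/inference split, with a straight-line fit standing in for the model and a made-up hidden rule y = 3x + 1:

```python
# Training = finding the pattern; inference = applying it to unseen input.
import numpy as np

# "Show examples": inputs labeled by a hidden rule (y = 3x + 1) plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3 * x + 1 + rng.normal(0, 0.5, size=50)

# Training: a least-squares fit recovers the pattern from the examples.
w, b = np.polyfit(x, y, deg=1)

# Inference: use the learned pattern on data the model never saw.
x_new = 42.0
print(f"learned y = {w:.2f}x + {b:.2f}; prediction for x=42: {w * x_new + b:.1f}")
```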
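Point 2's unsupervised mode, as one-dimensional k-means: no labels, just "find clusters". The points and k = 2 are arbitrary:

```python
# Unsupervised learning: group unlabeled points by proximity (k-means, k=2).
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]    # no labels attached
centers = [points[0], points[-1]]           # naive initialization

for _ in range(10):                         # alternate: assign, then update
    clusters = [[], []]
    for p in points:
        nearest = min((0, 1), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    centers = [sum(c) / len(c) for c in clusters]

print([round(c, 2) for c in centers])       # two centers emerge: ~[1.0, 8.07]
```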
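Point 3's backpropagation, from scratch on the classic XOR task; the layer width, learning rate, and step count are arbitrary demo choices:

```python
# A two-layer network trained by backpropagation (chain rule + gradient descent).
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # XOR inputs
y = np.array([[0.], [1.], [1.], [0.]])                   # XOR targets

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # input -> hidden
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

lr = 0.5
for _ in range(10_000):
    # Forward pass: simple functions, stacked.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: push the error back through each layer (chain rule)
    # to find how much each weight contributed to being wrong.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Adjust every weight opposite its gradient.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.ravel().round(2))   # should approach [0, 1, 1, 0]
```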
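Point 4's attention, reduced to scaled dot-product self-attention in NumPy; real transformers add learned Q/K/V projections, multiple heads, and positional information:

```python
# Scaled dot-product attention: every position attends to every other at once.
import numpy as np

def attention(Q, K, V):
    # Score every query against every key in one matrix product.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of all value vectors.
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))    # 5 token positions, 8-dim embeddings
out = attention(x, x, x)       # self-attention: Q, K, V from the same sequence
print(out.shape)               # (5, 8): all positions updated in one shot
```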
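Point 5's "predict the next token", stripped to a bigram count model; the corpus is invented, and an LLM replaces the table of counts with a transformer trained on trillions of tokens:

```python
# Next-token prediction in its crudest form: count what follows what, then sample.
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat the cat ate the fish".split()

# "Pre-training": tally which token follows which.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

# "Generation": repeatedly sample a next token given the last one.
random.seed(0)
tokens = ["the"]
for _ in range(6):
    counts = following[tokens[-1]]
    if not counts:                # dead end: token never seen with a successor
        break
    tokens.append(random.choices(list(counts), weights=list(counts.values()))[0])
print(" ".join(tokens))
```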
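Point 6's pipeline in miniature; the word-overlap "embedding", the documents, and the prompt template are stand-ins for a learned embedding model, a vector database, and a real prompt:

```python
# RAG: embed documents, retrieve the best match for the query, splice it
# into the context the generator sees.
from collections import Counter
import math

docs = [
    "the eiffel tower is 330 metres tall",
    "postgres uses multiversion concurrency control",
    "transformers process all positions in parallel",
]

def embed(text):     # toy embedding: bag-of-words counts
    return Counter(text.split())

def cosine(a, b):    # similarity between two sparse vectors
    dot = sum(a[w] * b[w] for w in a)
    return dot / (math.sqrt(sum(v * v for v in a.values())) *
                  math.sqrt(sum(v * v for v in b.values())))

index = [(d, embed(d)) for d in docs]          # the "vector index"

query = "how tall is the eiffel tower"
best = max(index, key=lambda item: cosine(embed(query), item[1]))[0]

# Generation: the retrieved text rides along in the model's prompt.
print(f"Context: {best}\n\nQuestion: {query}\nAnswer:")
```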
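Point 7's loop with the LLM swapped for a scripted stub so the control flow is visible; the tool, the "thoughts", and the task are all invented:

```python
# The agent loop: think -> act -> observe -> think again, until an answer.
def calculator(expression: str) -> str:
    # The one tool this agent can call (eval is fine for a demo,
    # never for untrusted input).
    return str(eval(expression))

TOOLS = {"calculator": calculator}

# Stand-in for the model: a canned plan. A real agent asks the LLM for the
# next step each iteration, feeding back every observation.
script = [
    ("think", "I need 17 * 23 before I can answer."),
    ("act", ("calculator", "17 * 23")),
    ("think", "391. Now I can respond."),
    ("answer", "17 * 23 = 391"),
]

for kind, payload in script:
    if kind == "think":
        print(f"THINK   {payload}")
    elif kind == "act":
        tool, arg = payload
        observation = TOOLS[tool](arg)   # act, then observe the result
        print(f"ACT     {tool}({arg!r}) -> OBSERVE {observation}")
    else:
        print(f"ANSWER  {payload}")
```
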
Going deeper

Branches that earn their own article.

  • Classical ML algorithms (linear/logistic regression, decision trees, random forests, SVMs, k-means, PCA).
  • Deep learning math (loss functions, optimizers — SGD, Adam, learning rate schedules).
  • CNN architectures and their successors (ResNet, EfficientNet, Vision Transformer).
  • RNN/LSTM/GRU deep dives.
  • Transformer internals (multi-head attention, positional encoding, layer norm).
  • Diffusion models (Stable Diffusion, DALL-E).
  • Training infrastructure (distributed training, mixed precision, DeepSpeed, FSDP).
  • Fine-tuning methods (LoRA, QLoRA, prefix tuning, RLHF, DPO).
  • Inference optimization (quantization, distillation, speculative decoding, KV cache).
  • Evaluation frameworks and benchmarks.
  • MLOps (model registries, feature stores, experiment tracking, monitoring).
  • Responsible AI (bias, fairness, interpretability, red-teaming).
  • Multimodal models (vision-language, audio-language).