
Security & Cryptography


TL;DR

Every system that carries anything valuable — money, personal data, private messages, the integrity of a software update — has adversaries. An attacker wants to read data they should not see, alter data they should not change, impersonate someone they are not, or prevent legitimate use. Security is the practice of building systems where those attacks either cannot succeed or are too expensive to be worth attempting. Cryptography is the mathematical toolkit that makes many of those bounds rigorous — it turns "the attacker will probably fail" into "the attacker succeeds with probability less than 1 in 2¹²⁸."

Security sits on top of a small number of agreements. You need to know what you are protecting (assets), from whom (adversaries), and what property you are claiming (confidentiality — only authorised parties can read; integrity — the data is not tampered with; availability — legitimate users can still use the system; authenticity — the sender is who they claim to be; non-repudiation — the sender cannot later deny sending). These claims are the language of a threat model, and every cryptographic primitive is a specific bound under a specific adversary assumption. Choose the wrong assumption and the math is meaningless.

This handbook walks the stack from the discipline of thinking about attackers (threat modelling) through the primitives that implement the guarantees (symmetric encryption, hashes, MACs, asymmetric cryptography, key exchange), to the protocols that compose them (TLS, WebAuthn, OAuth, OIDC), to the infrastructure that keeps keys safe (PKI, HSMs, KMS). Each station explains what the primitive is, what problem it solves, and — critically — what it does not promise. The most common source of real-world breaches is not broken cryptography; it is well-chosen primitives used under the wrong threat model.

You will be able to

  • Explain, in plain language, what confidentiality, integrity, authenticity, and non-repudiation each mean, and why they are different properties.
  • Write a threat model that names the assets, the adversaries, and the properties you claim.
  • Pick the right symmetric, asymmetric, hash, MAC, KDF, and signature primitive for a given problem — and say what each one promises and what it does not.
  • Read a TLS 1.3 handshake transcript and explain where identity is established, where session keys come from, and where forward secrecy is preserved.
  • Point at the station where any given CVE category lives (weak randomness, reused nonce, bad certificate validation, missing KDF, leaked key).

The Map

[Diagram: the station map]

Read the graph twice. First bottom-up: every protocol is a composition of primitives, every primitive is a bound under an assumption. Second top-down: every bound leaks when an implementation choice weakens the assumption. The attack path that worked yesterday is always a gap between the math and the code.

Station 1 — Threat modelling

Before you pick a cryptographic primitive, a firewall rule, or an authentication scheme, you have to know what you are defending, from whom, and what "defence succeeded" looks like. Threat modelling is the discipline of writing that down in enough detail that the security choices you later make are checkable against something. Without a threat model, security decisions reduce to superstition — "we should use AES" without being able to say what AES is supposed to prevent.

A useful threat model names four things. The assets — the data or capability you care about (user passwords, session cookies, private keys, the ability to deploy code). The adversaries — who might attack, with what resources (a random internet user, a state actor, an insider with cloud-console access). The properties — what you claim to guarantee (confidentiality, integrity, availability, authenticity, non-repudiation). And the assumptions — what you are trusting without verification (that the OS is uncompromised, that the hardware random number generator works, that a specific CA is trustworthy). Break any assumption and the guarantees evaporate.

Frameworks like STRIDE (Spoofing / Tampering / Repudiation / Information disclosure / Denial of service / Elevation of privilege) give you a checklist to walk through. They do not replace thinking; they help you avoid missing an entire category.

You cannot defend what you have not named. A threat model is a short document that states the assets (what has value), the adversaries (who might attack, with what capabilities and motivations), the trust boundaries (where data crosses from one trust domain to another), and the security properties you claim (confidentiality, integrity, availability, authenticity, non-repudiation, privacy). The moment an engineer cannot answer "what are we defending, against whom?" the cryptography below becomes decoration.

  STRIDE — a checklist of what can go wrong at each trust boundary
  (Microsoft — Kohnfelder & Garg, 1999; standard vocabulary):

    S  Spoofing             → violates authenticity
    T  Tampering            → violates integrity
    R  Repudiation          → violates non-repudiation
    I  Information disclosure → violates confidentiality
    D  Denial of service    → violates availability
    E  Elevation of privilege → violates authorization

  Applied: for every trust-boundary crossing (user → API, API → DB,
  service → service), enumerate which letters apply and which
  control addresses each one.
  • Properties are not symmetric. Confidentiality is protected by encryption. Integrity by MACs or authenticated encryption. Authenticity by signatures or MACs with identity-bound keys. Availability by capacity, redundancy, rate limits. Non-repudiation specifically by asymmetric signatures (MACs cannot prove which party with the shared key signed).
  • Adversary capabilities follow a small taxonomy — passive (sniffs), active (modifies), insider (has some legitimate access), supply chain (controls a dependency you trust), physical (has the device). Different primitives defend different rows; no primitive defends all of them.
  • Threat budgets: the standard computational bound for a 128-bit security level is "≤2^128 operations" — about a billion chips running for the age of the universe. 256-bit security is the quantum-tolerant headline. Bits of security are a comparative number for primitives of different shapes (key size for symmetric, group size for asymmetric, digest size for hashing).
  • Assumptions decay. MD5 and SHA-1 were designed with "we don't know how to find collisions" as the assumption; the assumption broke. Every primitive that relies on a discrete-log or integer-factoring hardness assumption will break against a sufficiently large quantum computer — which is why the post-quantum KEMs matter now, not in 2040.

The model you want: security is a property of a system under a stated threat model; without the threat model, "secure" is a marketing adjective. Pin the model down in prose before you pick primitives.
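The STRIDE walk described above — enumerate the letters at each trust-boundary crossing, name a control for each — can be made checkable in a few lines. A hypothetical sketch (the crossings and controls are illustrative, not a recommendation):

```python
# Hypothetical STRIDE walk over trust-boundary crossings. Every letter
# with no named control is an open question in the threat model.
STRIDE = {
    "S": "spoofing", "T": "tampering", "R": "repudiation",
    "I": "information disclosure", "D": "denial of service",
    "E": "elevation of privilege",
}

# Per crossing: which STRIDE letters are addressed, and by what control.
crossings = {
    ("browser", "api"): {"S": "OIDC login", "T": "TLS", "I": "TLS",
                         "D": "rate limiting", "E": "authz checks"},
    ("api", "db"):      {"S": "mTLS", "T": "TLS",
                         "I": "TLS + column encryption", "R": "audit log"},
}

def uncovered(crossing):
    # Letters of STRIDE with no control named at this crossing.
    return sorted(set(STRIDE) - set(crossings[crossing]))

print(uncovered(("api", "db")))      # the gaps still to be addressed
```

The point is not the data structure; it is that "which letters are covered, by what" becomes a question with a mechanical answer instead of a vibe.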

CAUTION

Threat models drift silently. A service that started as internal (trust boundary: the VPC) acquires a public endpoint on a Friday and nobody updates the model. Review threat models on schema changes, on new integrations, and on any change that moves data across a boundary — not once at launch.

The wider picture. Threat modelling is a practice, not a one-off document:

  • Frameworks — STRIDE (Microsoft, per-component), DREAD (risk scoring — debated), PASTA (attack-simulation), LINDDUN (privacy-focused), ATT&CK (MITRE, real attacker behaviours), kill chains.
  • Saltzer & Schroeder principles (1975) — least privilege, economy of mechanism, complete mediation, fail-safe defaults, separation of privilege, open design, least common mechanism, psychological acceptability. Still the foundation of secure design.
  • Data-flow diagrams (DFDs) and trust boundaries — where the threats actually get drawn.
  • Abuse cases and misuse cases — the pessimistic counterpart to use cases.
  • Red-team / blue-team / purple-team — dedicated attackers, defenders, and their coordination.
  • Security design reviews and "pre-mortems" — applied threat modelling in the dev cycle.
  • Regulatory threat models — GDPR DPIAs, HIPAA risk analyses, SOC 2 controls. Formal cousins of the same practice.
  • Supply-chain threat modelling — SLSA, SBOMs, dependency confusion, typosquatting. Cross-link to the Engineering Craft handbook.

Where this shows up next. Every later station is a concrete answer to a threat-model question: Station 2 authenticates, Station 3 confidentialises, Stations 4–5 authenticate and sign, Station 6 puts it all on the wire, Stations 7–8 keep the keys safe.

Go deeper: Shostack, Threat Modeling: Designing for Security — the definitive friendly treatment of STRIDE and trust-boundary diagrams; Saltzer & Schroeder, "The Protection of Information in Computer Systems" (Proc. IEEE 1975); NIST SP 800-154 "Guide to Data-Centric System Threat Modeling"; Ross Anderson, Security Engineering (3rd ed) chapter 2 — the single best security-in-general book.

Station 2 — Identity and authentication: OIDC, OAuth 2, WebAuthn

Before a system can decide what a caller is allowed to do (authorization), it must know who the caller is (authentication), and that distinction is the single biggest source of confusion in security design. Authentication answers "who are you and how did you prove it?" Authorization answers "given that you are who you say you are, what may you do?" Two different questions, two different mechanisms.

Authentication on the modern web is built on three standards that compose. OAuth 2.0 is a delegation protocol — how one application can act on behalf of a user without learning the user's password. It does not itself authenticate the user; it issues access tokens to third-party applications. OpenID Connect (OIDC) is OAuth 2.0 plus an ID token that attests who the user is. WebAuthn / FIDO2 replaces passwords entirely with public-key cryptography: the user's device holds a private key, the server stores the public key, and authentication is a signature challenge — there is nothing phishable to steal. These three make up the modern single-sign-on (SSO) stack most people log in through every day without knowing.

Before anything is encrypted, the question of who is on the other end of the line needs an answer. Authentication says who the other party is. Authorization says what they may do. Modern systems use three overlapping standards: OAuth 2.0 for delegated authorization (machine to machine, user-consented third-party access), OpenID Connect layered on top for user authentication, and WebAuthn / FIDO2 for phishing-resistant user authentication with public-key cryptography.

[Diagram: the OIDC authorization-code flow]
  • OIDC returns an id_token (a signed JWT with sub, iss, aud, exp) that proves who the user is, plus an access_token that the relying party (RP) uses against the resource API. The authorization-code flow with PKCE (RFC 7636) is the current default for every client that isn't a confidential server; the implicit flow is deprecated.
  • WebAuthn / passkeys (W3C 2019, FIDO2) replace passwords with per-site public-key pairs. The user's authenticator (phone, YubiKey, Touch ID) generates a keypair per site, signs a fresh challenge per login; the server stores only the public key. Phishing-proof because the signed challenge binds the origin (rp.id) — a fake site cannot reuse a real signature. Now shipping in ~all major platforms.
  • JWT (JSON Web Token, RFC 7519) is a compact signed (JWS) or encrypted (JWE) token with a header, claims, and signature — base64url(header).base64url(claims).base64url(signature). Symmetric (HS256) or asymmetric (RS256, ES256, EdDSA) signatures. Always verify iss, aud, exp, and alg — tokens with alg: none were a 2015 disaster when libraries accepted them.
  • Session management is orthogonal to authentication. After the id_token is verified, the RP typically issues a session cookie with Secure, HttpOnly, SameSite=Lax and a server-side session store. Access tokens in cookies invite CSRF; access tokens in localStorage invite XSS — neither is unambiguously safer, and both need their own mitigation stack.

The model you want: authentication is a cryptographic binding between an identity claim and a freshly-signed challenge; authorization is a policy evaluation against that identity's claims. If either is reduced to "the server trusts this string because the client sent it," you've rebuilt password-at-rest badly.

WARNING

"We'll just validate JWTs in the gateway" without checking the aud claim lets one service's token work against another service sharing the same issuer. OIDC's aud exists precisely to prevent cross-audience replay. Read RFC 8725 ("JWT BCP") before shipping a bearer-token architecture.

Go deeper: RFCs 6749 (OAuth 2.0), 6750 (Bearer tokens), 7519 (JWT), 7636 (PKCE), 8725 (JWT BCP), 9068 (JWT profile for access tokens); the OpenID Connect Core 1.0 spec; WebAuthn Level 3 (W3C); Okta/Auth0's OAuth playground for a concrete token-by-token walkthrough.

Station 3 — Symmetric crypto: AES-GCM and ChaCha20-Poly1305

The oldest and most common cryptographic task is keeping a message secret between two parties who already share a key. Symmetric encryption is the name for that: the same key encrypts and decrypts. Given a key and a plaintext, the encryption function produces a ciphertext that looks indistinguishable from random bytes without the key; given the key and the ciphertext, the decryption function returns the original. Symmetric crypto is fast (gigabytes per second on one CPU core) but has one problem: how do the two parties get the shared key in the first place? That problem is what Station 5 (asymmetric crypto) exists to solve.

There are two building blocks. A block cipher (AES is the canonical one, NIST FIPS 197, 2001) encrypts one fixed-size block (128 bits for AES) at a time. A stream cipher (ChaCha20) produces a keystream of pseudo-random bytes that are XOR'd with the plaintext. On their own, neither is safe to use — you need a mode of operation that turns one block at a time into a secure way to encrypt a whole message, ideally an AEAD (Authenticated Encryption with Associated Data) mode that provides both confidentiality and integrity in one operation.

The modern correct choices are AES-GCM (AES in Galois/Counter Mode) on hardware with AES-NI, and ChaCha20-Poly1305 on software-only platforms (mobile, embedded). Both are AEADs. Both take a key, a nonce (a number used once), a plaintext, and optional associated data, and return a ciphertext + authentication tag. Misuse is almost always in the nonce: reuse one with the same key and GCM's security collapses entirely.

A symmetric cipher encrypts with the same key that decrypts. The two modern ones you should ever pick are AES-GCM and ChaCha20-Poly1305 — both are AEAD (Authenticated Encryption with Associated Data) constructions, meaning a single call produces a ciphertext plus a tag that proves neither was modified. Non-authenticated encryption (AES-CBC, CTR by itself) is a footgun; use AEAD and never roll your own.

  AES (Rijndael, Daemen & Rijmen, 2001 as FIPS 197):
    block size       128 bits
    key sizes        128 / 192 / 256 bits
    rounds           10 / 12 / 14 (SubBytes, ShiftRows, MixColumns, AddRoundKey)
    hardware         AES-NI on x86 since 2010, dedicated on ARMv8 and RISC-V crypto ext

  GCM (Galois/Counter Mode, McGrew & Viega, SP 800-38D):
    nonce            96 bits (typical); MUST be unique per (key, message)
    tag              128 bits (truncation allowed down to 64, rarely safe)
    mechanism        CTR for confidentiality + GHASH (GF(2^128)) for authentication

  ChaCha20-Poly1305 (Bernstein, RFC 8439):
    stream cipher    20-round ARX, no lookup tables → constant-time by construction
    256-bit key, 96-bit nonce
    Poly1305 MAC     polynomial evaluation mod 2^130 − 5, ~3 GB/s on modern CPUs
    ship on          ARM mobile, where AES-NI isn't guaranteed and side channels bite
  • Nonce reuse catastrophe: GCM and ChaCha20-Poly1305 both assume unique nonces per key. Reuse the same (key, nonce) pair on two messages and an attacker can recover plaintext XOR and forge tags — a classical, cataclysmic break. AES-GCM-SIV (RFC 8452) is nonce-misuse-resistant; use it when nonce uniqueness cannot be guaranteed (distributed systems without a reliable counter).
  • Constant-time implementation matters. Classic AES lookup-table implementations leak key bits through cache timing (Bernstein 2005; the AES-NI instruction set exists largely to fix this). Any crypto code that branches on a secret or indexes a table by a secret is a potential timing side channel.
  • Block cipher modes not to use as primitives in 2026: ECB (reveals patterns — the famous Tux-penguin image), CBC without a MAC (malleable, padding-oracle attacks), CTR without a MAC (malleable). If the library forces you to build MAC-then-encrypt vs encrypt-then-MAC, use encrypt-then-MAC and compare MACs in constant time (hmac.compare_digest, subtle.ConstantTimeCompare).
  • Authenticated-data (AD) slots in AEAD are for context you do not want to encrypt but do want bound to the ciphertext — a user ID, a message index, a record type. An attacker cannot rebind a ciphertext to a different AD without the tag failing.
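The nonce-reuse catastrophe is pure algebra, and you can watch it happen. The sketch below uses a toy keystream built from SHA-256 in counter mode — emphatically not a real cipher, just enough to show why any stream construction (GCM's CTR core, ChaCha20) fails identically under a repeated (key, nonce):

```python
import hashlib

def toy_keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy stream "cipher": SHA-256 in counter mode. Illustration only —
    # never a substitute for AES-GCM or ChaCha20-Poly1305.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"k" * 32, b"n" * 12
p1 = b"attack at dawn!!"
p2 = b"retreat at nine!"
c1 = xor(p1, toy_keystream(key, nonce, len(p1)))
c2 = xor(p2, toy_keystream(key, nonce, len(p2)))   # same (key, nonce): fatal

# The keystream cancels out: c1 XOR c2 == p1 XOR p2 — no key required.
leak = xor(c1, c2)
assert leak == xor(p1, p2)
# Knowing (or guessing) one plaintext now reveals the other outright.
assert xor(leak, p1) == p2
```

This is why AES-GCM-SIV exists: when you cannot guarantee nonce uniqueness, pick the mode whose failure under reuse is graceful rather than total.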

The model you want: the unit of symmetric encryption is an AEAD call with a key, a nonce, a plaintext, and (optional) associated data; the ciphertext plus tag is a single bundle that decrypts only if the key, nonce, and AD match. Everything downstream is key management (Station 8) and protocol plumbing (Station 6).

WARNING

"256-bit AES is more secure than 128-bit AES for the same work" is mostly a marketing reflex. A 128-bit key is ~2^128 work to brute-force — already unreachable. 256 matters against a future quantum attacker, where Grover's algorithm halves the exponent (2^128 quantum work); it does not matter against a classical one. Pick 128 for speed or 256 for post-quantum headroom, not because "bigger is better."

Go deeper: NIST FIPS 197 (AES); NIST SP 800-38D (GCM); RFC 8439 (ChaCha20-Poly1305); RFC 8452 (AES-GCM-SIV); Rogaway, "Evaluation of Some Blockcipher Modes of Operation" (2011); Bernstein's "Cache-timing attacks on AES" (2005); Ferguson, Schneier & Kohno, Cryptography Engineering chapters 4 and 8.

Station 4 — Hashes, MACs, KDFs, and password hashing

Three cryptographic primitives that are often confused. A cryptographic hash (SHA-256, SHA-3, BLAKE3) takes an arbitrary input and produces a fixed-size fingerprint such that it is computationally infeasible to find two inputs with the same output (collision resistance) or to invert the function (preimage resistance). A hash is unkeyed — anyone can compute it over any input. A MAC (Message Authentication Code) is keyed — HMAC-SHA256, Poly1305 — and produces a tag that proves both that the message came from someone who knows the key and that it has not been altered. A KDF (Key Derivation Function) — HKDF, Argon2, PBKDF2, scrypt — turns a low-entropy secret (a password, a Diffie-Hellman shared value) into one or more cryptographically strong keys.

Using the wrong one is a common and catastrophic mistake. A plain hash of a password is not password hashing — hashes are designed to be fast, and an attacker with the hashed password can brute-force ten billion candidates per second on a modern GPU. Password hashing needs a deliberately slow KDF like Argon2id that takes memory and time parameters so attackers' GPU advantages are blunted. Similarly, using a plain hash where integrity matters (a MAC is needed) lets an attacker append-and-recompute; you need a keyed construction.

A cryptographic hash maps arbitrary bytes to a fixed-size digest with three properties: preimage resistance, second-preimage resistance, collision resistance. A MAC is a keyed hash that proves authenticity between parties sharing a key. A KDF derives multiple keys from one secret. A password hash deliberately slows itself down to make brute-force expensive. Each has a different job; none substitutes for another.

  The four jobs and their standard tools:

   integrity of arbitrary data
       SHA-256, SHA-512      (Merkle-Damgård, FIPS 180-4)
       SHA3-256, SHAKE128    (sponge, FIPS 202)
       BLAKE3                (tree, parallel, ~3 GB/s/core)

   authenticity with a shared key
       HMAC-SHA-256          (RFC 2104; works with any hash)
       KMAC                  (native to SHA-3 sponge)
       Poly1305              (with a one-time key, inside AEAD)

   derive keys from a shared secret
       HKDF                  (RFC 5869; extract-then-expand)
       X9.63 KDF             (ANSI X9.63; ECIES / SEC 1 legacy)

   turn a low-entropy password into a key
       Argon2id              (PHC winner, RFC 9106) — memory-hard
       scrypt                (RFC 7914) — memory-hard, older
       bcrypt                (1999, still OK for new code)
       PBKDF2                (RFC 8018) — only if the spec demands it
  • Collisions sink hashes as identifiers. MD5 (1991) has been collidable since 2004 (Wang et al.) and is unsafe for anything but non-adversarial checksums. SHA-1 collisions landed in 2017 (SHAttered, ~2^63 work) — Git is migrating to SHA-256 for exactly this reason. SHA-256 needs ~2^128 work and remains secure.
  • HMAC (Bellare, Canetti & Krawczyk, 1996) is a specific construction: HMAC(k, m) = H((k ⊕ opad) || H((k ⊕ ipad) || m)). It gives a MAC from any Merkle-Damgård hash with a provable bound; it is immune to length-extension (a real attack against naive H(k || m)); and it is the MAC behind JWT HS256, TLS, IPsec, and S3 signatures.
  • HKDF (Krawczyk 2010, RFC 5869) extracts entropy from a possibly-biased secret into a pseudo-random key (PRK) with a salt, then expands the PRK into any number of named keys via HMAC with per-label inputs. TLS 1.3's whole key schedule is an HKDF cascade from the ECDHE output.
  • Password hashing needs memory-hardness, not just slowness. A pure CPU hash (PBKDF2) is cheap for GPU brute force. Argon2id with parameters m = 64 MiB, t = 3, p = 1 (or tune per environment) forces each guess to allocate megabytes of RAM — which GPUs hate. Store (algorithm, params, salt, digest) together so parameters can be migrated later.
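HKDF's extract-then-expand shape is small enough to write out in full (a sketch for understanding; in practice use your library's HKDF, which this matches byte-for-byte for SHA-256):

```python
import hashlib, hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # Extract: concentrate possibly-biased input keying material into a
    # fixed-size pseudo-random key (PRK). An absent salt defaults to zeros.
    return hmac.new(salt if salt else b"\x00" * 32, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # Expand: fan the PRK out into `length` bytes, bound to a context label
    # so different labels yield independent keys.
    okm, t, counter = b"", b"", 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([counter]), hashlib.sha256).digest()
        okm += t
        counter += 1
    return okm[:length]

# One shared secret in, several independent named keys out:
prk = hkdf_extract(b"handshake-salt", b"ecdhe-shared-secret-bytes")
enc_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"mac", 32)
```

The per-label info argument is the whole trick: enc_key and mac_key share no usable relationship even though both came from one PRK.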

The model you want: hashes are fingerprints, MACs are signatures with shared keys, KDFs are key fan-outs, password hashers are time taxes. Using a plain SHA-256 of key || message as a MAC, or a plain SHA-256 of a password as a password hash, is a classical mistake with a classical fix in each case.

CAUTION

Constant-time comparison matters. A naive if (hmac == expected) returns early on first mismatched byte, leaking a timing side channel that lets an attacker guess a byte at a time. Use hmac.compare_digest (Python), crypto.timingSafeEqual (Node), subtle.ConstantTimeCompare (Go), or the equivalent.
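The same constant-time rule applies when verifying stored password hashes. A minimal sketch of the store-(algorithm, params, salt, digest) pattern using the standard library's scrypt (parameters illustrative — tune per environment, and prefer Argon2id via a maintained library where available):

```python
import hashlib, hmac, secrets

def hash_password(password: str) -> dict:
    # Fresh random salt per password; memory-hard KDF; record the
    # parameters alongside the digest so they can be migrated later.
    salt = secrets.token_bytes(16)
    params = {"n": 2**14, "r": 8, "p": 1}          # ~16 MiB per guess
    digest = hashlib.scrypt(password.encode(), salt=salt,
                            dklen=32, **params)
    return {"alg": "scrypt", **params, "salt": salt, "digest": digest}

def verify_password(password: str, rec: dict) -> bool:
    # Re-derive with the *stored* parameters, then compare in constant time.
    candidate = hashlib.scrypt(password.encode(), salt=rec["salt"],
                               n=rec["n"], r=rec["r"], p=rec["p"], dklen=32)
    return hmac.compare_digest(candidate, rec["digest"])
```

Verification reads the parameters from the record, not from the code — that is what lets you raise the cost factor for new hashes without breaking old ones.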

Go deeper: FIPS 180-4 (SHA-2); FIPS 202 (SHA-3); RFC 2104 (HMAC); RFC 5869 (HKDF); RFC 9106 (Argon2); RFC 7914 (scrypt); Bellare, Canetti & Krawczyk, "Keying Hash Functions for Message Authentication" (CRYPTO 1996); the BLAKE3 paper and repo; NIST SP 800-132 (PBKDF parameter guidance).

Station 5 — Asymmetric crypto and KEMs

Symmetric crypto (Station 3) is fast but requires both parties to already share a secret key. On the internet, two machines that have never talked before need to establish a shared key without ever sending it over the wire in plaintext. Asymmetric cryptography — also called public-key cryptography — is the mathematical breakthrough (Diffie & Hellman 1976, Rivest–Shamir–Adleman 1978) that makes this possible. Each party has a key pair: a public key that can be freely shared, and a private key that is kept secret. Operations done with one key can only be undone with the other.

This gives two superpowers. Encryption to a public key: anyone can encrypt a message using Alice's public key, and only Alice (with the private key) can decrypt it. Signatures with a private key: Alice can sign a message with her private key, and anyone with her public key can verify the signature — proving the message came from Alice without revealing her private key. A key encapsulation mechanism (KEM) is the modern framing: instead of encrypting the message directly with slow asymmetric math, encrypt a random symmetric key to the recipient's public key, then encrypt the actual message with that symmetric key. TLS, SSH, Signal, age, and WebAuthn all do some version of this.

The landscape changed recently. The elliptic-curve primitives (Curve25519 / Ed25519, secp256r1) are faster and smaller than RSA for the same security level. But all of the current standards are vulnerable to a future quantum computer running Shor's algorithm. NIST standardised the first post-quantum primitives in 2024 — ML-KEM (Kyber) for key exchange and ML-DSA (Dilithium) for signatures — and the transition to hybrid (classical + post-quantum) is under way across the major protocols.

Asymmetric (public-key) cryptography solves the problem symmetric cannot: two parties who have never shared a secret can still agree on one, and messages can be signed so anyone can verify but only the holder of the private key can sign. The 1976 Diffie-Hellman paper and 1978 RSA paper are the two moments this stopped being impossible.

  Two families of public-key primitives in production:

  1. Key-exchange / encryption:
       RSA (1977)             based on integer factoring; 2048/3072/4096-bit keys
       ECDH / ECDHE           elliptic-curve Diffie-Hellman; 256–384-bit curves
       ML-KEM (Kyber)         lattice-based KEM; NIST FIPS 203, 2024  (post-quantum)

  2. Signatures:
       RSASSA-PSS / RSASSA-PKCS1-v1_5   3072+ bit for 128-bit security
       ECDSA / Ed25519        ~256-bit curves for 128-bit security
       ML-DSA (Dilithium)     lattice-based; NIST FIPS 204, 2024  (post-quantum)
       SLH-DSA (SPHINCS+)     hash-based; FIPS 205 (post-quantum; big signatures)
  • Ed25519 (Bernstein, Duif, Lange, Schwabe, Yang, 2011) is the modern default signature: 32-byte public keys, 64-byte signatures, deterministic (no RNG needed at signing time), fast (~100 µs on a laptop), constant-time by construction. Used in SSH, TLS, Signal, OpenBSD, and much of crypto engineering since 2016.
  • ECDHE — Elliptic-Curve Diffie-Hellman, Ephemeral — is the key-exchange that gives TLS 1.3 forward secrecy: each session generates a fresh keypair, so compromise of the server's long-term private key does not retroactively decrypt past sessions. Static DH is effectively dead for this reason.
  • Post-quantum readiness: a sufficiently large quantum computer running Shor's algorithm breaks RSA, ECDH, and ECDSA in polynomial time. In 2024 NIST standardised ML-KEM (Kyber) (FIPS 203), ML-DSA (Dilithium) (FIPS 204), and SLH-DSA (SPHINCS+) (FIPS 205). TLS 1.3 has already deployed hybrid key-exchange (X25519 + ML-KEM-768 via X25519MLKEM768) — Google, Cloudflare, AWS all ship it in 2024–25. Long-term-confidentiality data should be protected with PQ primitives now, before the harvest-now-decrypt-later attacker acts on the data.
  • KEM vs DH: a Key-Encapsulation Mechanism is the post-quantum-friendly shape of "public-key encrypt a random key." The recipient publishes a public key; the sender encapsulates — picks a random value, encrypts it with the public key, gets back (ciphertext, shared_secret). The recipient decapsulates the ciphertext to recover the same shared secret. TLS 1.3's hybrid key-exchange design models ECDHE as a KEM, which is why Kyber slots in natively.

The model you want: public-key crypto is the bootstrap — it converts "no shared secret" into "shared secret" so the faster symmetric primitives can do the bulk work. Long-term keys sign; ephemeral keys exchange; identity and confidentiality ride on different keys deliberately.
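The bootstrap is easiest to see in the raw Diffie-Hellman algebra. A toy sketch over a deliberately undersized prime field — illustration only; real systems use X25519 or ML-KEM, never hand-rolled groups:

```python
import hashlib, secrets

# Toy Diffie-Hellman. The group below is far too small and unvalidated
# for real use; it exists only to show the shape of the exchange.
p = 2**127 - 1        # a Mersenne prime, standing in for a proper DH group
g = 3

a = secrets.randbelow(p - 2) + 1     # Alice's ephemeral private exponent
b = secrets.randbelow(p - 2) + 1     # Bob's
A = pow(g, a, p)                     # public values — safe to send in clear
B = pow(g, b, p)

alice_secret = pow(B, a, p)          # g^(ab) mod p
bob_secret   = pow(A, b, p)          # the same value, derived independently
assert alice_secret == bob_secret

# Never use the raw group element as a key — run it through a KDF first.
session_key = hashlib.sha256(alice_secret.to_bytes(16, "big")).digest()
```

Nothing secret ever crossed the wire: an eavesdropper sees g, p, A, and B, and recovering a or b from those is the discrete-log problem. Discard a and b after the session and you have forward secrecy.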

WARNING

RSA with PKCS #1 v1.5 padding has been a recurring disaster: Bleichenbacher's 1998 padding-oracle attack on RSA key exchange resurfaced as ROBOT in 2017, and lax v1.5 signature verification enabled the 2006 e = 3 forgeries. For RSA signatures, use RSASSA-PSS; for RSA encryption, use OAEP — or better, stop using RSA encryption and use an ECIES-style ECDH + AEAD construction.

Go deeper: Diffie & Hellman, "New Directions in Cryptography" (IEEE Trans. Info. Theory 1976); Rivest, Shamir & Adleman, "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" (CACM 1978); Koblitz (1987) and Miller (1986) on elliptic-curve cryptography; Bernstein et al., "High-speed high-security signatures" (Ed25519, 2011); NIST FIPS 203 (ML-KEM), 204 (ML-DSA), 205 (SLH-DSA); the Kyber and Dilithium specification documents.

Station 6 — TLS 1.3: the handshake in full

Every secure web connection — https://..., every API call behind a certificate, every email client talking to a server — rides on TLS (Transport Layer Security). TLS is the protocol that takes raw TCP and layers on authentication (you are really talking to example.com), confidentiality (nobody can read the bytes), and integrity (nobody can alter them in flight). TLS 1.3 (RFC 8446, 2018) is the current version and a major simplification of its predecessors — a one-round-trip handshake, only AEAD cipher suites, forward secrecy by default, and every deprecated primitive removed.

The handshake is worth understanding end to end, because it composes every primitive from Stations 3–5: Diffie–Hellman key exchange (asymmetric, Station 5) to establish a shared secret without ever sending it; an HKDF (Station 4) to derive session keys from that secret; an AEAD cipher (Station 3) for the actual record protection; and a digital signature (Station 5) over a transcript hash so the server proves it is who its certificate says it is. Understanding TLS means understanding how each of those primitives slots into a specific step, and why removing any of them collapses a specific property.

Forward secrecy — the property that even if a private key is leaked later, past sessions remain safe — is the reason TLS 1.3 uses ephemeral Diffie–Hellman on every handshake. The session keys depend on a random value that is destroyed after the session ends; no amount of future key compromise can recover them.

TLS 1.3 (RFC 8446, 2018) is the composition of the previous stations into a working secure channel. It authenticates the server with a signature, establishes an ephemeral shared secret with ECDHE, derives session keys with HKDF, and encrypts all traffic with AEAD. Compared to TLS 1.2 it dropped RSA key-exchange, dropped static DH, dropped CBC modes, dropped RC4 — everything risky got mandatory or gone.

  TLS 1.3 1-RTT handshake (simplified, happy path):

  Client                                              Server
    │ ClientHello                                        │
    │  ─ random 32 B                                     │
    │  ─ supported ciphersuites (TLS_AES_128_GCM_SHA256, │
    │    TLS_CHACHA20_POLY1305_SHA256, …)                │
    │  ─ supported named groups (X25519, secp256r1,      │
    │    X25519MLKEM768 for hybrid PQ)                   │
    │  ─ key_share: C's ECDH(E) public for preferred     │
    │    group (optimistic)                              │
    │  ─ signature_algorithms (rsa_pss_rsae_sha256,      │
    │    ecdsa_secp256r1_sha256, ed25519, …)             │
    ├─────────────────────────────────────────────────▶  │
    │                                         ServerHello│
    │  ◀───────────────────────────────────────────────  │
    │   selected ciphersuite, group, key_share(S)        │
    │  ◀── {EncryptedExtensions}                         │
    │  ◀── {CertificateRequest}?    (mTLS optional)      │
    │  ◀── {Certificate}       (leaf + chain)            │
    │  ◀── {CertificateVerify} (signature over the       │
    │                           transcript with server's │
    │                           private key)             │
    │  ◀── {Finished}          (HMAC over transcript)    │
    │ (derive handshake keys from ECDHE + HKDF)          │
    │ {Finished}  ─────────────────────────────────────▶ │
    │ (derive application keys from HKDF of transcript)  │
    │ {HTTP/2 data …}                                    │
    ├─── AEAD-protected application data ──────────────▶ │
  • Forward secrecy comes from ECDHE: the session key is a function of two ephemeral private keys, discarded at the end of the session. Recording ciphertexts and compromising the server's long-term RSA/ECDSA key later does not decrypt past traffic.
  • Server authentication is the CertificateVerify signature over the transcript — the server proves possession of the private key matching the certificate by signing everything both sides have seen so far. That transcript binding is what defeats MITM: a downgrade or a swap would change what the server signed.
  • Session resumption with PSK (Pre-Shared Key) enables 0-RTT for repeat connections: the client sends early data alongside ClientHello, encrypted under a key derived from a previous session. The tradeoff: an attacker can capture and replay that early data, so it must carry only idempotent operations — which is why TLS 1.3 deployments restrict 0-RTT to safe request types such as GET.
  • Cipher suites in TLS 1.3 are short: TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256, plus two AES-CCM variants for constrained devices. Signatures and key-exchange groups are negotiated separately — a cleaner, smaller negotiation matrix than TLS 1.2's hundreds of distinct suites.

The model you want: TLS 1.3 is HKDF over the transcript, threading ECDHE output through handshake keys, then through application keys, then eventually through exporter keys. Every key has a named, specific purpose; none is reused across purposes.
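That key-schedule model is easy to see in code. Below is a minimal HKDF (RFC 5869) in stdlib Python: extract concentrates input keying material into a pseudorandom key, expand stretches it into purpose-labelled output keys. The labels and inputs are illustrative stand-ins — TLS 1.3's real derivations use the HkdfLabel encoding from RFC 8446 §7.1 — but the structural point holds: one secret, distinct labels, independent keys.

```python
import hashlib, hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Concentrate input keying material (e.g. the ECDHE output) into a PRK."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the PRK into `length` bytes bound to a purpose label `info`."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Illustrative only: stand-in secret and labels, not TLS 1.3's exact inputs.
ecdhe_secret = b"\x42" * 32          # pretend ECDHE shared secret
transcript = hashlib.sha256(b"ClientHello || ServerHello || ...").digest()
prk = hkdf_extract(transcript, ecdhe_secret)
client_hs_key = hkdf_expand(prk, b"c hs traffic", 32)
server_hs_key = hkdf_expand(prk, b"s hs traffic", 32)
assert client_hs_key != server_hs_key   # distinct purposes, distinct keys
```

Changing any byte of the transcript changes the PRK, and so every key downstream — that is the transcript binding in miniature.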

TIP

To see TLS in the raw, run openssl s_client -connect example.com:443 -tls1_3 -msg and the byte-by-byte handshake scrolls past. Wireshark with logged session keys (set the SSLKEYLOGFILE environment variable) decrypts the whole thing. Thirty minutes watching your own handshakes is worth a month reading about them.

Go deeper: RFC 8446 (TLS 1.3 spec — the whole thing is readable in an afternoon); Krawczyk, "The OPTLS protocol and TLS 1.3" (2016); Paterson & van der Merwe, "Reactive and Proactive Standardisation of TLS" (2016); Adam Langley's blog on ECH, 0-RTT, and hybrid PQ; the IETF TLS working group archives for the draft history.

Station 7 — PKI, certificate transparency, and revocation

TLS proves the server has the private key corresponding to a public key in a certificate. But what makes us trust the certificate itself? "Here is my public key, I am google.com, trust me" is not good enough — anyone could claim that. The answer is Public Key Infrastructure (PKI): trusted Certificate Authorities (CAs) sign certificates attesting that a public key belongs to a specific name, and every browser and OS ships with a list of root CAs it already trusts. When you visit https://google.com, your browser verifies a chain of signatures from Google's certificate back to a root it already knows.

This system is powerful and fragile. If any of those root CAs is compromised or coerced, the attacker can issue certificates for any domain. History is full of examples (DigiNotar 2011, Comodo 2011, TURKTRUST 2013). The defence is Certificate Transparency (CT): every trusted CA must publish every certificate it issues into append-only public logs. Browsers require proof of CT log inclusion before accepting a certificate, so mis-issuance becomes detectable by anyone monitoring the logs for their own domain.

Revocation — "this certificate was compromised, don't trust it anymore" — is the unfinished engineering problem. Classic approaches (CRL files, OCSP responses) either don't scale or leak browsing history. Modern answers (OCSP Must-Staple, CRLite, short-lived certificates from automated issuers like Let's Encrypt) are partial wins; nobody has built a perfect solution yet.

TLS's server authentication only works if the client can decide "yes, this is the real example.com." That binding between a domain name and a public key is the job of PKI — Public Key Infrastructure. In the Web PKI, a hierarchy of Certificate Authorities (CAs) issues X.509 certificates; browsers and OSes ship a trust store of root CAs; every leaf certificate's validity is checked by walking a chain of signatures back to a trusted root.

  • Chain-of-trust verification is mechanical but strict: the leaf's signature is verified with the intermediate's public key; the intermediate's signature with the root's public key; the root must be in the local trust store; every certificate must be within its validity period; hostname must match Subject Alternative Name (the Common Name is ignored on modern clients); any revocation signal (OCSP, CRL, or CRLite) must not fail open silently.
  • Certificate Transparency (RFC 9162, 2021) is the audit layer. Every leaf certificate must be submitted to multiple append-only CT logs before issuance; the log returns a Signed Certificate Timestamp (SCT) that the certificate embeds (or TLS extensions carry). Browsers refuse certificates without SCTs. Google, Cloudflare, DigiCert, Let's Encrypt, and others run the logs; sites like crt.sh let anyone search them — a mis-issued cert for your domain becomes detectable in hours, not years.
  • ACME (Automatic Certificate Management Environment, RFC 8555) is how Let's Encrypt and ZeroSSL issue free 90-day certificates. The protocol proves domain control via HTTP or DNS challenges and ships millions of certs a day. If a certificate lives longer than an engineer's onboarding, renewal should be automated, not a calendar alert.
  • Revocation was long the weak link. OCSP stapling embeds a signed "still valid" response in the TLS handshake so browsers don't have to ask the CA on every connection; Firefox's CRLite and Chrome's CRLSets ship compressed revocation data to the browser directly. The lesson is that revocation is often best-effort; short certificate lifetimes (the CA/Browser Forum has voted to shrink maximum validity from 398 days today to 47 days by 2029) are the structural answer.
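The "append-only" property of CT logs is enforced with Merkle trees: the log's signed tree head commits to every certificate ever logged, and an inclusion proof is just the sibling hashes on the path from a leaf to the root. A toy sketch in stdlib Python — power-of-two tree only, with RFC 6962-style 0x00/0x01 domain separation; real logs also handle non-full trees and consistency proofs between tree heads:

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + entry).digest()   # 0x00 prefix: leaf

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()  # 0x01: interior

def build_tree(entries: list[bytes]) -> list[list[bytes]]:
    """All levels of a full Merkle tree (entry count a power of two)."""
    levels = [[leaf_hash(e) for e in entries]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([node_hash(prev[i], prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels

def inclusion_proof(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Sibling hashes from leaf to root — what the log hands an auditor."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index >>= 1
    return path

def verify_inclusion(entry: bytes, index: int,
                     path: list[bytes], root: bytes) -> bool:
    h = leaf_hash(entry)
    for sibling in path:
        h = node_hash(sibling, h) if index & 1 else node_hash(h, sibling)
        index >>= 1
    return h == root

certs = [b"cert-A", b"cert-B", b"cert-C", b"cert-D"]
levels = build_tree(certs)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
assert verify_inclusion(b"cert-C", 2, proof, root)      # honest entry verifies
assert not verify_inclusion(b"cert-X", 2, proof, root)  # a swap is caught
```

The proof is logarithmic in log size — a log holding billions of certificates needs only a few dozen hashes to prove any one of them is included.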

The model you want: a certificate is a signed statement — "this public key belongs to this name, valid until this date" — with SCTs proving the statement is publicly logged. Trust is not monolithic; it is a walk from a leaf to a root, past a CT proof, under a hostname check, within a date window.

CAUTION

Pinning a certificate or a CA in a mobile app was standard 2013–2018 advice and is now a foot-gun — the app outlives the cert, rotation breaks clients, and the only fix is an emergency release. The HPKP (Public-Key-Pins) and Expect-CT headers are both deprecated; Certificate Transparency monitoring plus short-lived certs covers most of what pinning bought. If you must pin, pin to a long-lived intermediate set with a defined rotation story.

Go deeper: RFC 5280 (X.509 PKI); RFC 6066 (TLS extensions incl. SNI); RFC 8555 (ACME); RFC 9162 (Certificate Transparency); CA/Browser Forum Baseline Requirements; Adam Langley's "Revocation is broken" and the CRLite paper; the crt.sh web interface as a daily tool.

Station 8 — Secrets and key management: HSMs, KMS, envelope encryption

All of Stations 1–7 assume that private keys actually stay private. That assumption is where most real-world systems fail: a developer commits a key to Git, an environment variable leaks to logs, a breach of the application layer exfiltrates credentials, a backup tape is lost. Key management is the unglamorous discipline of keeping the inputs to cryptography safe. Get this wrong and the strongest primitives in the world are worthless.

Three engineering patterns dominate. HSMs (Hardware Security Modules) are tamper-resistant hardware devices that hold keys and perform cryptographic operations on demand — the key never leaves the HSM in plaintext. Banks, certificate authorities, and any regulated industry with serious threat models use HSMs certified to FIPS 140-3 levels. KMS (Key Management Service) is the cloud abstraction: AWS KMS, GCP Cloud KMS, Azure Key Vault, HashiCorp Vault all expose an HSM-backed API where your code asks "encrypt this" or "sign this" without ever seeing the key material. Envelope encryption is the scaling trick: the KMS encrypts a small data encryption key (DEK) under a long-lived master key; your application uses the DEK to encrypt actual data; the encrypted DEK travels with the data. This way a single master key can protect petabytes of information without ever doing petabyte-scale encryption itself.

The principles are ancient and still correct: minimise the number of humans who ever see a private key, rotate keys on a schedule, limit blast radius by using separate keys for separate purposes, and assume that every mechanism will eventually be compromised — so design for rotation, revocation, and crypto-agility from day one.

A crypto system's strength rarely lies in the algorithm and almost always in the operational handling of keys. A key is not a value in a config file; it is an object with a lifecycle (generation, distribution, storage, rotation, destruction), a set of authorized users and purposes, and an audit record of every use. Key management is the discipline that enforces that lifecycle at scale.

  Envelope encryption — the default pattern in cloud KMS:

   plaintext  ──┐                         ciphertext + wrapped DEK
                │                                   ▲
                ▼                                   │
      ┌───────────────┐        ┌───────────────────────────┐
      │  DEK (random) │ ─────▶ │ AES-GCM(DEK, plaintext)    │
      └───────────────┘        └───────────────────────────┘
             │                                   ▲
             │ wrap                              │
             ▼                                   │
      ┌───────────────┐        ┌───────────────────────────┐
      │  KEK (in KMS) │ ─────▶ │ AES-GCM(KEK, DEK) = wDEK   │
      └───────────────┘        └───────────────────────────┘

   To read: KMS.Decrypt(wDEK) → DEK; AES-GCM-decrypt(DEK, ciphertext, tag).
   DEK is unique per object; KEK stays in the HSM; throughput scales.
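The diagram reads naturally as code. Below is a sketch of the pattern in stdlib Python — the "AEAD" is a toy HMAC-based encrypt-then-MAC stand-in for AES-GCM (the standard library has no AES), and KEK is a local variable standing in for a key that, in reality, never leaves the KMS/HSM:

```python
import hashlib, hmac, os

def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy AEAD standing in for AES-GCM: HMAC-SHA256 keystream XOR for
    confidentiality, then an HMAC tag (encrypt-then-MAC). Demo only."""
    nonce = os.urandom(12)
    stream, ctr = b"", 0
    while len(stream) < len(plaintext):
        stream += hmac.new(key, nonce + ctr.to_bytes(4, "big"),
                           hashlib.sha256).digest()
        ctr += 1
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def toy_open(key: bytes, sealed: bytes) -> bytes:
    nonce, ct, tag = sealed[:12], sealed[12:-32], sealed[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct,
                                             hashlib.sha256).digest()):
        raise ValueError("tag mismatch: wrong key or tampered data")
    stream, ctr = b"", 0
    while len(stream) < len(ct):
        stream += hmac.new(key, nonce + ctr.to_bytes(4, "big"),
                           hashlib.sha256).digest()
        ctr += 1
    return bytes(c ^ s for c, s in zip(ct, stream))

KEK = os.urandom(32)                 # in reality this never leaves the KMS

def envelope_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    dek = os.urandom(32)             # fresh DEK per object
    return toy_seal(dek, plaintext), toy_seal(KEK, dek)  # (ciphertext, wDEK)

def envelope_decrypt(ciphertext: bytes, wrapped_dek: bytes) -> bytes:
    dek = toy_open(KEK, wrapped_dek) # the only "KMS.Decrypt" round-trip
    return toy_open(dek, ciphertext)

ct, wdek = envelope_encrypt(b"customer record #42")
assert envelope_decrypt(ct, wdek) == b"customer record #42"
```

The structural win is visible in the call graph: the KEK touches only 32-byte DEKs, never the data, so the expensive, audited operation stays tiny while bulk encryption runs wherever the data lives.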
  • HSMs (Hardware Security Modules) — Thales Luna, YubiHSM, Marvell LiquidSecurity, AWS CloudHSM — generate keys inside tamper-resistant silicon and expose them only through authenticated APIs. Private keys leave the HSM only encrypted (if at all). FIPS 140-3 (NIST, 2019) grades them at Security Levels 1–4, where Level 3/4 hardware is required for financial and government workloads.
  • Cloud KMS (AWS KMS, GCP KMS, Azure Key Vault) is HSM-backed key management as an API. You call KMS.Encrypt(plaintext) and get ciphertext; the key never leaves the HSM boundary. Envelope encryption generates a per-object Data Encryption Key (DEK), encrypts the plaintext with the DEK, then wraps the DEK with the KMS-held Key Encryption Key (KEK) — so the HSM only does small KEK operations while bulk AEAD runs on application CPUs.
  • Key rotation is scheduled and versioned, not reactive. Each key has generations; old generations keep decrypting old ciphertext; new ciphertext uses the current generation. A compromise triggers key rotation + re-encryption of affected data, not deletion.
  • Secrets managers — HashiCorp Vault, AWS Secrets Manager, 1Password — manage the operational surface (API tokens, DB passwords, TLS private keys). Secrets are versioned, access is auditable, leases can be short (dynamically-generated DB credentials valid for 15 minutes). Never store long-lived secrets in environment variables on a shared host if a secrets manager is an option.
  • Access policy should be least-privilege and attribute-based: a service may decrypt customer records in region us-east-1 but not eu-west-1; a CI job may read deployment secrets but not production keys; an on-call human can break-glass into production but every use is logged and alerts a second human. The policies are code (IAM JSON, Vault HCL, Rego) and review-able.
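The rotation bullet above, as a sketch. A hypothetical versioned keyring — the class and method names are invented for illustration, not any vendor's API — in which new writes always use the current generation while old generations remain readable:

```python
import os

class Keyring:
    """Hypothetical versioned keyring: new ciphertext uses the current
    generation; old generations keep decrypting old ciphertext until it
    is re-encrypted. Rotation adds a generation, it never deletes one."""
    def __init__(self) -> None:
        self.generations: dict[int, bytes] = {}
        self.current = 0

    def rotate(self) -> int:
        """Scheduled rotation: mint a new generation, keep the old ones."""
        self.current += 1
        self.generations[self.current] = os.urandom(32)
        return self.current

    def key_for_encrypt(self) -> tuple[int, bytes]:
        return self.current, self.generations[self.current]

    def key_for_decrypt(self, version: int) -> bytes:
        return self.generations[version]  # version travels with the ciphertext

ring = Keyring()
ring.rotate()
v1, k1 = ring.key_for_encrypt()        # data written now is tagged with v1
v2 = ring.rotate()                     # scheduled rotation
assert ring.key_for_encrypt()[0] == v2 # new writes use the new generation
assert ring.key_for_decrypt(v1) == k1  # old ciphertext still decrypts
```

The version tag stored alongside each ciphertext is what makes rotation non-disruptive: decryption looks up the right generation, and re-encryption under the current one can happen lazily.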

The model you want: a key is a lifecycle, not a value. Every operation on a key — generate, wrap, unwrap, sign, verify, rotate, destroy — should be authenticated and logged, and the blast radius of a single key's compromise should be bounded by its scope.

WARNING

git history is forever. A secret checked in, even briefly, is presumed leaked — rotate it, don't just git rm. GitHub Secret Scanning and similar services publish alerts within minutes; attackers run their own scanners too. Pre-commit hooks (gitleaks, trufflehog) that block obvious patterns are a cheap control. See also the Engineering Craft handbook on pre-commit discipline.

Go deeper: NIST SP 800-57 (key management recommendations); NIST FIPS 140-3 (HSM security requirements); AWS KMS and Vault documentation, read together to see the same patterns in two vendors; Ross Anderson, Security Engineering (3rd ed) chapters 18 and 25 on crypto and key management at bank scale.

How the stations connect

Security is a stack — threat model sets the goals, identity establishes who, primitives give you tools with stated bounds, protocols compose the primitives into working channels, PKI makes identities globally verifiable, and key management is the lifecycle that keeps all of it true over time.


The Foundations handbook covers hashes from the representation angle; the Systems & Architecture handbook names the trust boundaries you model here; the Operating Systems handbook is where capability and seccomp defenses live at the process level.

Standards & Specs

Test yourself

A service stores bearer access tokens in a cookie set with HttpOnly and Secure but without SameSite. A login endpoint accepts tokens from an Authorization header or from the cookie. What class of attack is live, and what is the one-line fix?

Cross-site Request Forgery (CSRF). Without SameSite=Lax or SameSite=Strict, a third-party site can cause the user's browser to send the cookie along with a forged request, and the server accepts the cookie as authentication. The one-line fix is SameSite=Lax on the cookie (or Strict for maximum protection). A deeper fix: accept either a bearer header or a cookie, not both — if the cookie is only good for the first-party UI, require an anti-CSRF token for any cookie-auth state change. See Station 2.

An internal API uses AES-GCM with a 96-bit counter nonce that resets when a service restarts. What goes wrong, and what is the minimally-invasive fix?

Nonce reuse. After a restart, the counter starts again from zero, colliding with nonces used before the restart under the same key. GCM's security collapses catastrophically on nonce reuse — the XOR of two ciphertexts reveals the XOR of the plaintexts, and the GHASH authentication key can be recovered from two messages sealed under the same nonce (Joux's "forbidden attack"), enabling arbitrary forgeries. Fix: switch to AES-GCM-SIV (RFC 8452), which is nonce-misuse-resistant; or use random 96-bit nonces (birthday bound near 2^48, with NIST advising at most 2^32 messages per key); or persist the counter across restarts. Whatever you choose, never reuse a (key, nonce) pair. See Station 3.
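The ciphertext-XOR leak takes three lines to demonstrate. A toy CTR-mode keystream (HMAC-SHA256 standing in for the block cipher — GCM's encryption layer is AES in counter mode, so the failure mode is identical) shows two messages sealed under one (key, nonce) pair cancelling the keystream:

```python
import hashlib, hmac

def ctr_keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy CTR-mode keystream; deterministic in (key, nonce, counter)."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hmac.new(key, nonce + ctr.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        ctr += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"k" * 32
nonce = b"\x00" * 12                 # the counter reset to zero after restart
p1 = b"PAY ALICE 100"
p2 = b"PAY MALLORY 1"                # same length, same (key, nonce)
c1 = xor(p1, ctr_keystream(key, nonce, len(p1)))
c2 = xor(p2, ctr_keystream(key, nonce, len(p2)))

# The attacker never touches the key, yet the keystream cancels:
assert xor(c1, c2) == xor(p1, p2)    # plaintext XOR leaks directly
```

With known structure in either plaintext (HTTP headers, JSON field names), the XOR of plaintexts is usually enough to recover both messages outright — and that is before the GHASH key recovery makes forgery possible too.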

A browser visits https://example.com. The server's certificate validates against the trust store, the chain is good, the hostname matches, and the cert is not expired. Name two additional checks modern browsers perform — and explain what each one defends against.

(1) Certificate Transparency SCTs. The cert must carry Signed Certificate Timestamps from multiple CT logs (or the TLS handshake must deliver them via an extension). This defends against a mis-issued or covertly-issued certificate — even a compromised CA cannot issue one that isn't also publicly logged, making detection possible in hours. (2) Revocation, via OCSP stapling or a locally-shipped CRL set (CRLite / CRLSets). This defends against an issued-but-compromised certificate. Together with short lifetimes (90 days in common practice today, with maximums shrinking toward 47 days by 2029), they make the gap between compromise and cleanup short enough to live with. See Station 7.