Security & Cryptography

Security is not a feature added at the end; a system is either secure at every layer or not at all. The substrate is small: a handful of cryptographic primitives, a way to bind keys to identities, and a discipline for thinking about who attacks and how. Everything else — TLS, OAuth, certificates, the OWASP Top 10 — composes those pieces.

[Figure: the security stack. Foundations (threat model, STRIDE, risk budget) decide what to defend and from whom; three primitives (symmetric, asymmetric, hash) are the building blocks; protocols (TLS 1.3, PKI/X.509, OAuth/OIDC, zero trust) compose them into defenses on the wire. The failure surfaces where attacks actually land: the OWASP Top 10, the supply chain (SLSA, SBOM), and humans (credentials, phishing).]
Foundations come first — without a threat model the rest is busywork. Primitives are few. Protocols are where most engineering effort goes, and where most bugs live.

Threat modeling

A defense without a target is wasted effort. Before any control, four questions: what are you protecting (data, accounts, availability), from whom (script kiddies, criminal groups, insiders, nation-states), what do they want, and what will you spend to stop them. Those answers form a threat model. Skip the step and you over-spend on adversaries who never appear, or under-spend on the ones who do. A useful model is concrete: "we protect card data from external attackers who want to monetize it, with up to 5% of engineering capacity" — not "we take security seriously."

STRIDE is six prompts from Microsoft: at every component, ask whether each category applies and record what you find. The categories don't tell you what to do; they tell you what to look for.

[Figure: STRIDE, six categories of harm to brainstorm against. Spoofing (claim to be someone you're not: session-token replay, credential stuffing), Tampering (change data in transit or at rest: modified DB record, man-in-the-middle), Repudiation (deny having done something: missing audit log, no signed receipt), Information disclosure (read what shouldn't be read: leaked secrets, plaintext backups, side channels), Denial of service (flood, algorithmic DoS, resource exhaustion), Elevation of privilege (broken access control, container escape). Walk every trust boundary and ask whether each category applies; defenses map back to authentication, integrity, non-repudiation, confidentiality, availability, and authorization.]
STRIDE is one of several frameworks (PASTA, LINDDUN, OCTAVE). Pick one and use it consistently rather than freelancing every review.

A trust boundary is any interface where data crosses between zones of different trust — user-to-server, server-to-database, container-to-host. Most security bugs live at trust boundaries: that is where assumptions on one side meet hostility on the other.

Pitfall — modeling once, then forgetting. A threat model is invalidated the moment the architecture changes, and architecture changes constantly. Tie the model to design review so every new feature gets a fresh pass. A two-year-old threat model is fiction.

The three crypto primitives

Cryptography is built from three operations. TLS, SSH, signed updates, password hashes, blockchains — every higher-level scheme composes symmetric encryption, asymmetric crypto, and hashing. Knowing what each does and does not do is most of knowing cryptography.

[Figure: the three crypto primitives. Symmetric: one shared key, fast; the same key encrypts and decrypts (AES, ChaCha20). Asymmetric: key pair, slow, identity; sign with the private key, verify with the public (RSA, ECDSA, Ed25519, X25519). Hash: no key, a one-way fingerprint; you cannot recover m from H(m) (SHA-256, SHA-3, BLAKE3; SHA-256 emits 32 bytes).]
Three operations, three trade-offs. Symmetric is fast but needs a pre-shared key. Asymmetric solves key exchange and proves identity but runs orders of magnitude slower. Hashing has no key and is one-way in practice — you cannot feasibly recover the input from the digest.

Symmetric encryption

Two parties want to exchange messages no third party can read, and they already share a secret. Symmetric encryption is the operation: one key encrypts, the same key decrypts. Anyone with the key can do both; anyone without sees random-looking bytes. AES is the standard block cipher; ChaCha20 dominates where AES hardware acceleration is absent. Symmetric crypto is fast — multi-GB/s per core.

The hard problem is how the two parties got the shared key without an eavesdropper learning it. That is what asymmetric crypto exists to solve.

Asymmetric (public-key) crypto

On the open internet, strangers have shared nothing. Asymmetric crypto gives each party a key pair — a public key anyone can see, and a private key only they hold. Three operations follow:

  • Encryption to a recipient: anyone can encrypt with the public key; only the private-key holder can decrypt.
  • Signing: the private-key holder produces a signature over a message; anyone with the public key can verify it came from that holder and was not modified. (Modern schemes like Ed25519 and RSA-PSS are not literally "encrypting with the private key" — they use distinct algorithms — but the asymmetry of who can produce versus verify is the same.)
  • Key exchange: two parties combine their key pairs to derive a shared secret without transmitting it.

The cost is speed: asymmetric operations are orders of magnitude slower than symmetric — roughly a thousand-fold for ECC, more for RSA. Real protocols use them only to agree on a session key, then switch to symmetric for bulk traffic.

Standard algorithms: RSA (large keys, slow); ECDSA (elliptic-curve, smaller keys, faster); Ed25519 for signatures and X25519 for key exchange (the modern default, used by TLS 1.3, SSH, Signal, WireGuard). A new family — ML-KEM and ML-DSA, standardized in 2024 — resists future quantum computers.

Worked example: how a signature actually verifies

Alice wants to send Bob a message Bob can prove came from her. Alice has a keypair (pkA, skA) — public key and secret key. Bob already has a copy of pkA (from a certificate, a known-good directory, or some prior exchange).

Signing (Alice):

  1. Compute h = SHA-256(message). For message = "transfer $100 to Bob", this is some fixed 32-byte digest, say h = 4f8c...a91d.
  2. Compute sig = Sign(skA, h). The sign function consumes Alice's secret key and the digest and emits a fixed-size signature — 64 bytes for Ed25519.
  3. Send (message, sig) to Bob over any channel — wire, USB stick, carrier pigeon.

Verifying (Bob):

  1. Re-compute h' = SHA-256(received_message) from the message bytes Bob actually received.
  2. Run Verify(pkA, h', sig). The verify function consumes Alice's public key, the recomputed digest, and the signature, and returns a single boolean.
  3. Accept the message if and only if the result is true.

Two ways the verification fails. If anyone changed a byte of the message in transit, h' is unrelated to the original h, and Verify returns false. If anyone signed without skA — say an attacker tried to forge sig' over their own message — Verify returns false because the signature does not match the public key it is checked against. The property is exact: only the holder of skA can produce signatures that pkA accepts, and the signature is bound to the specific message bytes.

This is not "encrypt with the private key, decrypt with the public key" — Ed25519 and RSA-PSS use sign/verify operations that are not literally inverse encryption. The directional asymmetry (only skA can sign, anyone with pkA can check) is what matters.
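Textbook RSA happens to be the one scheme where sign and verify really are modular inverses, which makes it a convenient toy even though deployed schemes work differently. A sketch with two small primes, unpadded and utterly insecure, purely to show that signing needs the private exponent while verification needs only the public one:

```python
import hashlib

# Toy RSA keypair. The primes are tiny (real keys are 2048+ bits) and
# there is no padding: illustration of the sign/verify asymmetry only.
p, q = 104723, 104729
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 65537                  # public exponent
d = pow(e, -1, phi)        # private exponent: the secret

def digest(message: bytes) -> int:
    # Hash first, then treat the digest as an integer below n.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    return pow(digest(message), d, n)          # requires the private d

def verify(message: bytes, sig: int) -> bool:
    return pow(sig, e, n) == digest(message)   # anyone with (e, n) can check

msg = b"transfer $100 to Bob"
sig = sign(msg)
assert verify(msg, sig)                             # untampered: accepted
assert not verify(b"transfer $9999 to Eve", sig)    # changed message: rejected
assert not verify(msg, sig + 1)                     # forged signature: rejected
```

Changing a single message byte changes the digest, so the old signature no longer verifies; producing a fresh valid signature requires d.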

Shared-key agreement works through Diffie–Hellman. Each side picks a private number, derives a public value, and sends it. Each combines its own private number with the other's public value and lands on the same shared secret. An eavesdropper sees both public values but cannot derive the secret because reversing the math — the discrete-log problem — is computationally hard.
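The arithmetic fits in a few lines. A toy sketch over a small prime field (real deployments use 2048-bit-plus groups or elliptic curves like X25519; the tiny numbers here only make the algebra visible):

```python
import secrets

# Toy finite-field Diffie-Hellman. Far too small for real security.
p = 2**61 - 1   # a Mersenne prime
g = 3

a = secrets.randbelow(p - 2) + 2    # Alice's private exponent, never sent
b = secrets.randbelow(p - 2) + 2    # Bob's private exponent, never sent

A = pow(g, a, p)    # Alice transmits this in the clear
B = pow(g, b, p)    # Bob transmits this in the clear

# Each side combines its own secret with the other's public value.
alice_secret = pow(B, a, p)   # (g^b)^a = g^(ab) mod p
bob_secret   = pow(A, b, p)   # (g^a)^b = g^(ab) mod p
assert alice_secret == bob_secret

# Eve saw g, p, A, B. Recovering a from A = g^a mod p is the discrete-log
# problem, infeasible at real key sizes.
```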

[Figure: Diffie-Hellman key exchange. Alice and Bob keep private exponents a and b, exchange A = g^a mod p and B = g^b mod p over the public channel, and each computes the same shared secret B^a = A^b = g^(ab) mod p. Eve sees g^a and g^b but not g^(ab): recovering a from g^a is the discrete-log problem, which takes exponential time at sufficient key sizes.]
Plain Diffie-Hellman gets you a shared secret but does not tell you who is on the other end of the wire. An active attacker can run DH separately with each side and sit in the middle. TLS authenticates the exchange with a signed certificate, which is what the next section is about.

Hashing

Sometimes the goal is not to hide data but to fingerprint it: detect tampering, look up content by identity, prove possession without revealing the value. Hashing maps any input to a fixed-size output (SHA-256 produces 32 bytes). Three properties make a function a cryptographic hash:

  • Pre-image resistance — given a digest, you cannot feasibly find an input that produces it.
  • Second pre-image resistance — given an input, you cannot feasibly find a different input with the same digest.
  • Collision resistance — you cannot feasibly find any two inputs with the same digest. (Collisions must exist mathematically — the input space is larger than the output — but a good hash makes finding one cost more than the universe has cycles.)

Modern choices: SHA-256, SHA-3, BLAKE3 (fastest in software). MD5 has been broken since 2004 and SHA-1 since 2017; both still appear in legacy systems and should never appear in new ones.

Hashes appear everywhere: file integrity, content-addressed storage (Git, IPFS), message authentication codes (HMAC), and password storage with a deliberately slow hash.
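All of this is one stdlib call away. A quick sketch of the fixed-size output, the avalanche effect, and a keyed tag via HMAC:

```python
import hashlib, hmac

# Avalanche: one changed input character yields an unrelated digest.
d1 = hashlib.sha256(b"transfer $100 to Bob").digest()
d2 = hashlib.sha256(b"transfer $100 to Bot").digest()
assert len(d1) == 32      # SHA-256 always emits 32 bytes, any input size
assert d1 != d2           # tiny change, completely different fingerprint

# HMAC: a keyed hash, so only key holders can produce a valid tag.
tag = hmac.new(b"shared-key", b"message", hashlib.sha256).digest()
check = hmac.new(b"shared-key", b"message", hashlib.sha256).digest()
assert hmac.compare_digest(tag, check)   # constant-time comparison
```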

A Merkle tree combines leaf hashes pairwise upward into a single root. Proving any leaf is in the tree takes only log₂(N) sibling hashes — twenty for a million-leaf tree. Git commits, Certificate Transparency logs, blockchains, and BitTorrent all use this structure.

[Figure: Merkle tree. Leaf hashes H1 = H(d1) through H4 = H(d4) combine pairwise, H12 = H(H1 || H2) and H34 = H(H3 || H4), into a single root = H(H12 || H34). Inclusion proof for d3: send H4 and H12; the verifier recomputes H34, then the root, and compares to the known root.]
Tampering with any leaf cascades up: every ancestor hash changes, the root changes, the proof fails. The cost of proving membership is logarithmic in the leaf count.
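The structure is a short exercise. A minimal sketch for a power-of-two leaf count, where proving d3 takes exactly two sibling hashes:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:   # combine pairwise until one node remains
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves: list, index: int) -> list:
    """Sibling hashes from the leaf up to (not including) the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        sibling = index ^ 1                      # neighbor at this level
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

data = [b"d1", b"d2", b"d3", b"d4"]
root = merkle_root(data)
proof = inclusion_proof(data, 2)        # prove d3
assert len(proof) == 2                  # log2(4) sibling hashes
assert verify_inclusion(b"d3", proof, root)
assert not verify_inclusion(b"tampered", proof, root)
```

The proof for d3 carries H4 and H12; the verifier recomputes H34 and then the root, exactly the logarithmic path described above.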

A hash chain is the linear case: each block contains the hash of the previous one. Change any past block and every later hash stops matching. Git history, append-only audit logs, and every blockchain are hash chains. The chain by itself only detects tampering — anyone can rewrite the whole tail and the new chain still verifies. Useful chains add a witness: a signature, a trusted timestamp, distributed consensus, or proof-of-work to make rewriting expensive.
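A hash chain fits in a dozen lines. A sketch of an append-only log that detects rewriting (but, per the above, cannot prevent it without a witness):

```python
import hashlib

GENESIS = b"\x00" * 32

def append(chain: list, data: bytes):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "data": data,
                  "hash": hashlib.sha256(prev + data).digest()})

def valid(chain: list) -> bool:
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:   # link must point at the previous hash
            return False
        if hashlib.sha256(prev + entry["data"]).digest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
for event in [b"alice logged in", b"alice deleted file", b"alice logged out"]:
    append(log, event)
assert valid(log)

log[1]["data"] = b"nothing happened"   # rewrite history...
assert not valid(log)                  # ...and the chain stops verifying
```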

Pitfall — implementing primitives yourself. A correctly specified algorithm becomes broken code when the implementation leaks the key through cache timing, branch prediction, or power draw. Use a vetted library — libsodium, BoringSSL, the platform crypto module — and call it correctly.

Authenticated encryption (AEAD)

Encryption alone hides plaintext but does not detect tampering. An attacker who flips bits in the ciphertext gets the receiver to decrypt garbage; in some modes they can flip specific bits in the plaintext predictably. Authenticated encryption with associated data (AEAD) bundles encryption with a per-message authentication tag, so any modification is rejected at decrypt time.

Standard constructions: AES-GCM (the TLS 1.3 default) and ChaCha20-Poly1305. Both take a key, a nonce, the plaintext, and optional associated data that gets authenticated but not encrypted. They output a ciphertext plus a short tag; decryption fails fast if any of the ciphertext, tag, or associated data has been touched.

[Figure: AEAD. Inputs: key, nonce, plaintext m, and optional AAD (unencrypted but authenticated). Output: a ciphertext the same length as m plus a 16-byte auth tag, produced in one pass (AES-GCM, ChaCha20-Poly1305). Decrypt fails fast: any flipped bit in ciphertext, AAD, or tag breaks tag verification and the receiver discards the message.]
A (key, nonce) pair must never be used for two encryptions. Use a counter, a random value with collision odds you understand, or AES-GCM-SIV if you cannot guarantee nonce uniqueness.

Pitfall — nonce reuse. Reusing a (key, nonce) pair with AES-GCM leaks the XOR of the two plaintexts and lets an attacker forge any tag for that key. WEP died this way; so did several console jailbreaks. AES-GCM-SIV exists precisely to survive that operator mistake.
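Both the encrypt-then-authenticate structure and the nonce-reuse leak can be seen in a toy construction built from stdlib hashes. This is NOT a real cipher: a SHA-256 counter keystream stands in for AES-CTR, with an HMAC tag over nonce, AAD, and ciphertext; real code uses AES-GCM or ChaCha20-Poly1305 from a vetted library.

```python
import hashlib, hmac

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Deterministic keystream from (key, nonce): the source of the
    # nonce-reuse failure demonstrated below.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key, nonce, plaintext, aad=b""):
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + aad + ct, hashlib.sha256).digest()[:16]
    return ct, tag

def open_(key, nonce, ct, tag, aad=b""):
    expect = hmac.new(key, nonce + aad + ct, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(tag, expect):
        return None                     # reject before touching the plaintext
    return bytes(c ^ k for c, k in zip(ct, keystream(key, nonce, len(ct))))

key, nonce = b"k" * 32, b"n" * 12
ct, tag = seal(key, nonce, b"attack at dawn", aad=b"header")
assert open_(key, nonce, ct, tag, aad=b"header") == b"attack at dawn"

flipped = bytes([ct[0] ^ 1]) + ct[1:]                     # one flipped bit
assert open_(key, nonce, flipped, tag, aad=b"header") is None
assert open_(key, nonce, ct, tag, aad=b"evil") is None    # AAD is bound too

# Nonce reuse: two messages under one (key, nonce) share a keystream, so
# XORing the ciphertexts cancels it and leaks the XOR of the plaintexts.
ct2, _ = seal(key, nonce, b"attack at dusk")
p1, p2 = b"attack at dawn", b"attack at dusk"
assert bytes(x ^ y for x, y in zip(ct, ct2)) == bytes(x ^ y for x, y in zip(p1, p2))
```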

PKI and certificates

Key exchange gets you a shared secret with someone — but how do you know that someone is bank.com and not an attacker on the same Wi-Fi? You need a binding from public key to identity that strangers can verify. That binding is a certificate: a document stating "this public key belongs to bank.com," signed by a third party both ends already trust.

Trust bootstraps from a trust store: root Certificate Authority public keys baked into your operating system or browser — about a hundred in the public web today. Anything a root signs, you trust transitively.

[Figure: a certificate chain from leaf to trust anchor. The leaf certificate (CN = bank.com, plus the server's public key) is signed by an intermediate CA (e.g. Let's Encrypt R3, online issuing key), which is signed by a root CA (e.g. ISRG Root X1, kept offline, present in the OS/browser trust store), with issuance recorded in a CT log as an audit trail. Validation: leaf signature, intermediate signature, root in trust store, plus expiry, hostname, and revocation checks.]
The leaf certificate is presented during the TLS handshake along with intermediates; the root must already be on the client. Roots stay offline and rarely change; intermediates rotate frequently.

Certificates follow X.509: subject, subject public key, issuer, validity dates, serial, signature algorithm, and extensions (Subject Alternative Names, key usage, revocation pointers). When a client connects, the server presents its leaf and intermediates; the client checks each signature up to a root, verifies the hostname against a SAN, checks dates, and consults revocation. Any failure aborts.

That construction has one hole: a misbehaving CA could quietly issue bank.com to an attacker, and only the real bank.com operators would notice — eventually. Certificate Transparency closes the gap by requiring every public CA to publish every certificate to append-only public logs. Domain owners monitor the logs and catch unauthorized issuance. Modern browsers refuse certificates not in trusted CT logs.

Pitfall — the model is as weak as its weakest CA. Any root can sign for any domain. DigiNotar was breached in 2011 and used to surveil Iranian Gmail users. Symantec was distrusted by Chrome in 2017 after years of mis-issuance. Defenses: CT monitoring, CAA DNS records (example.com. CAA 0 issue "letsencrypt.org" declares which CA may issue), and application-layer pinning where appropriate.

TLS 1.3 — putting it together

TLS composes every primitive so far. Client and server run an authenticated Diffie-Hellman exchange (key exchange with a signed certificate), derive symmetric session keys via HKDF, then switch to AEAD for the data. TLS 1.3 does this in one round trip — down from two in TLS 1.2 — and removes every legacy weakness: no static-RSA key transport (forward secrecy is mandatory), no CBC, no SHA-1, no compression.

[Figure: the TLS 1.3 1-RTT handshake. ClientHello carries an X25519 key_share plus supported ciphers, ALPN, SNI, and a random nonce. ServerHello returns the chosen cipher, the server's X25519 key_share, the certificate chain, and a signature over the handshake transcript made with the certificate's private key. Both sides derive the shared secret from ECDHE, expand it via HKDF into traffic keys for AES-GCM or ChaCha20-Poly1305, and the first encrypted payload byte arrives in one round trip. Primitives used: X25519 (exchange), Ed25519/RSA (certificate signature), AES-GCM (bulk), HKDF (derivation).]
Forward secrecy comes from the ephemeral key exchange: the long-term certificate key is used only to sign the handshake, never to derive session keys. Even if that key leaks later, past sessions stay private because their ephemeral material was thrown away.
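The key-derivation step is HKDF (RFC 5869): extract a fixed-size pseudorandom key from the ECDHE output, then expand it into as many labeled keys as needed. A stdlib sketch (TLS 1.3's actual key schedule wraps this same construction in HKDF-Expand-Label with fixed label strings):

```python
import hashlib, hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # Concentrate the (possibly structured) input keying material
    # into one uniform pseudorandom key.
    return hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # Stretch the PRK into independent output keys, bound to a label.
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

shared_secret = b"\x01" * 32          # stand-in for the ECDHE output
prk = hkdf_extract(b"handshake salt", shared_secret)
client_key = hkdf_expand(prk, b"client traffic", 32)
server_key = hkdf_expand(prk, b"server traffic", 32)

assert len(client_key) == 32
assert client_key != server_key       # different labels, independent keys
assert client_key == hkdf_expand(prk, b"client traffic", 32)  # deterministic
```

Different labels yield independent keys from one secret, which is how a single handshake produces separate client and server traffic keys.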

Identity

Identity has three sub-questions, often confused. Authentication: who is this user? Authorization: what may they do? Federation: how does one system trust another's answers? Putting authorization where authentication belongs (or vice versa) is a common architecture error.

Authentication — who are you?

A user proves identity with one or more factors: something you know (a password), something you have (a phone, a security key), something you are (a fingerprint, a face). Multi-factor requires at least two categories. SMS counts as "something you have" only loosely — SIM-swap hijacks it routinely, which is why NIST no longer recommends it.

Passwords need careful storage. A leaked database is only valuable to an attacker who can recover the plaintexts, so storage must make recovery prohibitively expensive after a leak. Plain SHA-256 fails: a GPU runs about ten billion per second. The fix is a deliberately slow password hashing function: argon2id and scrypt are memory-hard (each guess must allocate megabytes of RAM, which throttles GPUs); bcrypt is iteration-hard but not memory-hard, still acceptable at a high cost factor for legacy systems. Tuned correctly, modern hardware computes only a handful per second per core — turning a one-second login into centuries of brute force per password.

[Figure: password hashing, same input ("correct horse battery staple"), three very different cost profiles. SHA-256: a fast hash, about 1 µs each, about 10 billion/sec on a GPU; an offline crack of a leaked DB takes hours. bcrypt at a high cost factor: about 250 ms per hash, roughly 4/sec per core; centuries per password. argon2id: about 500 ms plus lots of RAM, GPU-resistant; centuries per password and RAM-bound. Always salt per-user; raise the cost factor over time as hardware improves.]
Memory-hardness is the difference that defeats GPUs. A GPU runs thousands of SHA-256 lanes in parallel from tiny registers; argon2id forces each attempt to allocate tens of megabytes, and GPU memory bandwidth becomes the bottleneck.
Worked example: registration, login, and why the DB leak is hard to crack

Two flows use the same primitive in opposite directions.

Registration (user sets the password):

  1. Server generates a fresh random salt — 16 bytes, unique per user. Say salt = 0x9a3f...c1.
  2. Server computes stored = argon2id(password, salt, params), where params sets memory (e.g. 64 MB), iterations (e.g. 3), and parallelism. Running once takes around 500 ms on the server.
  3. Server stores (salt, params, stored) in the user's DB row. The plaintext password is discarded immediately.

Login (user submits the password):

  1. Server looks up the row by username; reads salt and params.
  2. Server computes candidate = argon2id(submitted_password, salt, params) — same function, same salt, same params.
  3. Server compares candidate to stored in constant time. Equal means accept; unequal means reject. The plaintext is again discarded.

Why the offline crack is hard. Suppose an attacker steals the entire user table — every (salt, params, stored) triple. To recover one user's password they must:

  1. Guess a candidate password (say "hunter2").
  2. Compute argon2id("hunter2", that_user's_salt, that_user's_params). This costs around 500 ms and around 64 MB of RAM per guess.
  3. Compare to stored. Wrong guess: try again.

The per-user salt forces step 2 to be redone from scratch for every user — there is no shared "rainbow table" that covers everyone. The memory cost forces step 2 onto a CPU or a heavily over-provisioned GPU — a normal GPU runs out of memory bandwidth long before it runs out of cores. A modern attacker machine might manage a few hundred guesses per second per user. At that rate, a random 10-character password takes longer than the universe has existed; even a weak password from a 100-million-word dictionary takes weeks per user.

Plain SHA-256 with no salt gives the attacker around 10 billion guesses per second shared across all users. The same dictionary attack finishes in seconds. That is the entire reason for the slow, salted, memory-hard construction.
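The two flows above can be sketched with stdlib hashlib.scrypt, which is memory-hard like argon2id (argon2id itself needs a third-party package). Parameters here are illustrative, tuned low enough to run quickly in a demo:

```python
import hashlib, hmac, secrets

# Illustrative cost parameters; production values should be tuned so one
# hash takes a few hundred milliseconds on your servers.
PARAMS = dict(n=2**14, r=8, p=1, dklen=32)

def register(db: dict, username: str, password: str):
    salt = secrets.token_bytes(16)                  # fresh per-user salt
    stored = hashlib.scrypt(password.encode(), salt=salt, **PARAMS)
    db[username] = (salt, stored)                   # plaintext discarded here

def login(db: dict, username: str, password: str) -> bool:
    salt, stored = db[username]                     # same salt, same params
    candidate = hashlib.scrypt(password.encode(), salt=salt, **PARAMS)
    return hmac.compare_digest(candidate, stored)   # constant-time compare

db = {}
register(db, "alice", "correct horse battery staple")
assert login(db, "alice", "correct horse battery staple")
assert not login(db, "alice", "hunter2")
```

An attacker with the stolen `db` must redo the full scrypt cost per guess, per user, exactly as the cracking arithmetic above describes.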

A password by itself is one factor. A second factor blocks an attacker who has phished or leaked the first. The common second factor is a six-digit code from an authenticator app — TOTP (Time-based One-Time Password). App and server share a secret; both compute HMAC-SHA-1 over (secret, current_30s_window) and truncate to six digits, no round-trip needed. TOTP defeats most credential leaks but not phishing: a relay site collects the code with the password and replays both within the thirty-second window. WebAuthn closes that gap.
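The whole TOTP construction is a few lines of stdlib code, checked here against the published RFC 6238 test vector:

```python
import hashlib, hmac, struct, time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # HOTP (RFC 4226): HMAC-SHA-1 over the big-endian counter,
    # then "dynamic truncation" down to a short decimal code.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    # TOTP (RFC 6238): HOTP with the counter = current 30-second window.
    return hotp(secret, int(at) // step, digits)

secret = b"12345678901234567890"        # the RFC test secret
# RFC 6238 vector: at t=59 (counter 1), the 8-digit SHA-1 code is 94287082.
assert totp(secret, 59, digits=8) == "94287082"
assert totp(secret, 59) == "287082"     # the usual six-digit form

# Server-side check: recompute for the current window (real servers also
# accept ±1 window for clock skew) and compare in constant time.
now = time.time()
assert hmac.compare_digest(totp(secret, now), totp(secret, now))
```

App and server share only `secret`; no code ever crosses the network at enrollment time, which is why TOTP survives a database leak of password hashes but not a live phishing relay.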

Passkeys (the WebAuthn standard) replace passwords with asymmetric crypto bound to the site's origin. At registration, the platform authenticator (Secure Enclave, TPM, or hardware key like a YubiKey) generates a fresh key pair per site and holds the private key. At login, the server sends a random challenge; the device signs the challenge together with the calling origin; the signature is bound to that origin. A phishing relay collects a signature for the wrong origin, which the real site rejects. The construction is phishing-resistant by design, not by user vigilance.

Authorization — what can you do?

Once authenticated, the application decides what the user may do. Three models:

  • RBAC (role-based) assigns users to roles (admin, editor, viewer) with permissions attached to roles. Coarse-grained, easy to audit.
  • ABAC (attribute-based) bases decisions on attributes of the user, resource, action, and context — "read allowed if user.department == document.department." More flexible, harder to audit.
  • Capabilities make possession of a token the permission itself: signed URLs and OAuth bearer tokens carry the right to do what they say with no central check.

A common pattern across all three is policy-as-code: rules in a dedicated language (Rego in OPA, Cedar in AWS), evaluated on every request, reviewed and updated outside the application.

Federation — other systems vouching for users

"Continue with Google" delegates authentication: Google identifies you, then asserts your identity to the third-party site. One identity provider vouches for users to many relying parties. The dominant protocols are OAuth 2.0, OIDC, and SAML, with GNAP as a proposed successor.

OAuth 2.0 answers a different question: not "who is the user," but "may this third-party app act on the user's behalf at this service, without getting the user's password?" The standard flow is authorization code with PKCE: user logs in at the authorization server, approves the requested permissions, is redirected back with a short code; the app exchanges the code (plus a PKCE verifier secret) for a short-lived access token used on API calls.

[Figure: the OAuth 2.0 authorization code flow with PKCE, between user, client app, authorization server, and resource server. The client redirects to /authorize with code_challenge=S256(verifier); the user logs in (password or passkey) and approves the consent screen; the redirect returns ?code=xyz; the client POSTs the code plus its code_verifier to /token and receives an access_token (and optionally a refresh_token); API calls carry "Authorization: Bearer ...". PKCE binds the code to the client that started the flow; without it, a stolen code can be redeemed by anyone.]
The user's password never leaves the authorization server. The client gets a short-lived access token (typically minutes to hours) and an optional refresh token to renew it without re-prompting.

OIDC (OpenID Connect) is a thin layer on top of OAuth 2.0 that adds the identity question OAuth doesn't answer. Alongside the access token, the authorization server returns an id_token: a signed JWT with claims about the user. An access_token is a capability ("you may call this API"); an id_token is a statement ("here is who they are, signed by me"). Using an access token to identify the user is a common bug.

SAML is the older XML-based federation protocol, still dominant in enterprise SSO. GNAP (Grant Negotiation and Authorization Protocol) is the IETF's proposed OAuth 2.0 successor — single negotiation, client-bound keys, intent-based authorization. Early adoption, long-term direction.

A JSON Web Token (JWT) is the format used throughout: three base64url segments joined by dots — a header naming the algorithm, a payload of claims, a signature over both. The signed form (JWS) is by far the most common — the claims are only base64url-encoded, readable by anyone who holds the token, so do not put secrets in them. An encrypted form (JWE) exists for cases that need confidentiality, but most "JWT" deployments mean the signed kind.

[Figure: JWT structure, three base64url segments separated by dots: header (JSON: alg, typ, kid), payload (JSON claims: iss, sub, aud, iat, exp, scope), and a signature computed as sign(base64url(header) + "." + base64url(payload), private_key). Footguns: accepting alg=none; verifying HS256 with the RSA public key; skipping aud/iss/exp checks; letting the token's own header pick the algorithm. Pin the algorithm at the verifier instead.]
The signature covers header and payload only. Pin the expected algorithm and the expected issuer at the verifier; never trust the token's own header to drive validation choices.

Pitfall — JWT validation footguns. Accepting alg: none (no signature, treated as valid). Accepting alg: HS256 when the verifier holds an RSA public key — the attacker uses that public key as the HMAC secret to forge tokens. Skipping aud, iss, exp, or nbf. Pin the algorithm and expected issuer server-side; validate every claim.
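A minimal HS256 issue/verify pair, stdlib only, with the algorithm, issuer, and audience pinned at the verifier. Illustrative: real services should use a maintained JWT library, configured with these same pins.

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def b64url_decode(data: bytes) -> bytes:
    return base64.urlsafe_b64decode(data + b"=" * (-len(data) % 4))

def issue(key: bytes, claims: dict) -> bytes:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    sig = hmac.new(key, header + b"." + payload, hashlib.sha256).digest()
    return header + b"." + payload + b"." + b64url(sig)

def verify(key: bytes, token: bytes, issuer: str, audience: str):
    header_b64, payload_b64, sig_b64 = token.split(b".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":        # pinned: the token does not choose
        return None
    expect = hmac.new(key, header_b64 + b"." + payload_b64,
                      hashlib.sha256).digest()
    if not hmac.compare_digest(b64url_decode(sig_b64), expect):
        return None
    claims = json.loads(b64url_decode(payload_b64))
    if claims.get("iss") != issuer or claims.get("aud") != audience:
        return None
    if claims.get("exp", 0) <= time.time():  # expired tokens are invalid
        return None
    return claims

key = b"server-side-secret"
token = issue(key, {"iss": "auth.example", "aud": "api.example",
                    "sub": "user-1234", "exp": time.time() + 3600})
assert verify(key, token, "auth.example", "api.example")["sub"] == "user-1234"

# alg:none forgery: an unsigned token, rejected by the pinned-algorithm check.
forged = (b64url(b'{"alg":"none"}') + b"." +
          b64url(b'{"iss":"auth.example","aud":"api.example","sub":"admin"}')
          + b".")
assert verify(key, forged, "auth.example", "api.example") is None
```

The decisive line is the algorithm pin: validation behavior is fixed server-side, never driven by attacker-controlled header bytes.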

Zero trust

The historical model was the castle-and-moat: a firewall separated a trusted internal network from the untrusted internet. On the corporate VPN, trusted; off it, not. That worked when "the company" was a building of desktops on a LAN.

It does not work now. Cloud workloads, SaaS, BYOD laptops, remote workers, contractors, and supply-chain code execution all live outside any meaningful perimeter, and one phished credential gets an attacker past it anyway. Lateral movement after initial compromise — pivoting from one breached host to many — was the dominant phase of nearly every major 2010s breach.

Zero trust replaces "you are inside, therefore trusted" with per-request verification: every connection — user, service, or machine — must be authenticated, authorized, and encrypted, regardless of network location. Decisions consider identity, device posture, and context. The substrate is strong identity, mutual TLS or signed tokens on every hop, and fine-grained policy at every service.

[Figure: castle-and-moat vs zero trust. Castle-and-moat: the perimeter of the "corporate network" is the trust boundary, services trust each other implicitly, one breached host enables lateral movement, and trust means "you are inside" (typically via VPN). Zero trust: no perimeter, identity is the boundary, mTLS on every hop, every request authenticated, authorized, and encrypted under device-posture, context, and least-privilege policy; trust means "you proved it, again, just now." Google's BeyondCorp was the canonical implementation; SASE/SSE products package the controls.]
Zero trust isn't a product. Concrete pieces: identity-aware proxies (Google IAP, Cloudflare Access, Tailscale), workload identity (SPIFFE/SPIRE, AWS IAM Roles for Service Accounts), a service mesh (Istio, Linkerd) for mTLS, and a policy decision point (OPA, Cedar) at every service.

Pitfall — "we have a VPN, that's zero trust." A VPN that authenticates once and grants flat network access is the opposite of zero trust. The point is per-request verification with fine-grained policy, not "got past the door, free run inside."

Web security — the OWASP Top 10

Most web applications get attacked the same way every year. The OWASP Top 10 distills industry vulnerability data into ten failure categories that account for most real breaches (2021 edition; 2025 in flight):

  • A01 Broken access control — user reaches data they shouldn't (id-fiddling, missing checks).
  • A02 Cryptographic failures — plaintext at rest, weak ciphers, no TLS, leaked secrets.
  • A03 Injection — SQL, shell, LDAP, template.
  • A04 Insecure design — missed in the threat model.
  • A05 Security misconfiguration — default creds, debug on, exposed services.
  • A06 Vulnerable components — old dependencies with known CVEs.
  • A07 Auth failures — bad MFA, session fixation, credential stuffing.
  • A08 Integrity failures — unsigned updates, compromised CI.
  • A09 Logging/monitoring failures — no audit trail when you need one.
  • A10 SSRF — server fetches attacker-controlled URLs.

Treat it as a map of failure modes for design review, not a checklist that closes the file. Four categories — injection, XSS (folded into A03), CSRF (under A01), SSRF — account for most implementation-level work. Each has the same shape: untrusted input crosses a trust boundary and reaches code not built for hostile data. Each has a structural fix.

Injection — data parsed as code

Injection happens when user input flows into a string parsed as a structured language: SQL, shell, HTML, LDAP, XPath. The parser cannot distinguish data the user supplied from code the developer wrote, so the user runs code. SQL injection is the textbook case: a query built by string concatenation breaks open the moment the user includes a quote and a SQL keyword.

The fix is structural separation. A parameterized query keeps the SQL as a constant string with placeholders; user input goes into the placeholder slot. The parser fixes the query plan before user bytes arrive, so those bytes can never become SQL keywords. No clever escaping rule beats this — use the placeholder.

[Figure: SQL injection, string concatenation vs parameterized query. Concatenating n = "x' OR '1'='1" into "SELECT * FROM users WHERE name='" + n + "'" yields SELECT * FROM users WHERE name='x' OR '1'='1'; the parser sees code, not data, and returns every row. With "SELECT * FROM users WHERE name = ?" the plan is fixed at prepare time and the same input binds as a literal that matches no rows: data stays data. The same shape applies to shell (argv arrays, not strings), HTML (auto-escaping templates), LDAP, and XPath.]
Parameterization isn't a performance optimization — it's a structural separation of code and data. The query plan is fixed before the attacker's bytes arrive.
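The two query shapes, run against an in-memory SQLite table:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT)")
db.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

n = "x' OR '1'='1"   # attacker-supplied input

# Vulnerable: the input is parsed as SQL, the WHERE clause always matches.
concat = db.execute(f"SELECT name FROM users WHERE name='{n}'").fetchall()
assert sorted(concat) == [("alice",), ("bob",)]   # every row leaks

# Safe: the plan is fixed at prepare time; the input binds as one literal.
param = db.execute("SELECT name FROM users WHERE name = ?", (n,)).fetchall()
assert param == []                                # data stays data
```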

Cross-site scripting (XSS) — injection in the browser

Cross-site scripting is injection where the structured language is HTML and the parser is the browser. Three variants differ only in where the payload was stored between attacker and victim:

  • Reflected: attacker sends a link with the payload as a query parameter; server echoes it unescaped; victim clicks and the script runs in their session.
  • Stored: payload goes through a normal form, server saves it, every later visitor runs it — wormable.
  • DOM-based: payload never touches the server; client-side JS reads from an untrusted sink (URL fragment, document.referrer) and writes via innerHTML.

Server-side input filtering catches the first two but misses DOM XSS. The fix that catches all three is output encoding by context: a template engine that knows whether each interpolation lands in an HTML body, attribute, URL, or JS string and escapes accordingly. The browser-side defense is Content Security Policy — a response header that whitelists which scripts may run, so even if attacker bytes reach the page, the browser refuses them.
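For the HTML-body context the transform is what stdlib html.escape does; template engines such as Jinja2 with autoescape apply the equivalent transform per interpolation context:

```python
import html

payload = "<script>steal(document.cookie)</script>"   # attacker input

# Escaped before interpolation: the browser renders text, not code.
page = f"<p>Hello, {html.escape(payload)}</p>"
assert "<script>" not in page
assert "&lt;script&gt;" in page
```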

Cross-site request forgery (CSRF) — riding the user's authority

The browser attaches bank.com cookies to any request targeting bank.com, regardless of which site initiated it. CSRF exploits that: a user logged in to bank.com visits attacker.com, which auto-submits a hidden form to bank.com/transfer. The browser attaches the session cookie; bank.com sees an authenticated request and processes the transfer. The attacker never reads any data — same-origin policy hides the response — but the side effect is done.

[Diagram: CSRF — the browser auto-attaches bank.com cookies to a request from attacker.com. (1) User logs in to bank.com; session cookie set. (2) User visits attacker.com in another tab, whose page holds a hidden <form action="bank.com/transfer" method="POST"> auto-submitted by script. (3) The browser sends the POST with the bank.com session cookie attached automatically; the cookie is valid and the transfer succeeds. Defenses: SameSite cookies · CSRF token · Origin/Referer check · re-auth on sensitive ops. SameSite=Lax, the modern browser default, neutralizes most cross-site POSTs.]
The two structural defenses are at different layers. SameSite cookies cut the attack off at the browser by withholding the cookie on cross-site requests. CSRF tokens enforce it at the server by demanding a value only same-origin code could have read.

Worked example: the exact HTTP request the victim's browser sends

The victim logs in to bank.com in tab 1. The bank's Set-Cookie response includes a session cookie:

Set-Cookie: session=abc123; Domain=bank.com; HttpOnly; Secure

In tab 2, the same victim opens attacker.com. The attacker's page contains:

<form action="https://bank.com/transfer" method="POST" id="f">
  <input name="to" value="attacker_account">
  <input name="amount" value="10000">
</form>
<script>document.getElementById('f').submit();</script>

When that script runs, the browser builds and sends this request:

POST /transfer HTTP/1.1
Host: bank.com
Origin: https://attacker.com
Referer: https://attacker.com/
Content-Type: application/x-www-form-urlencoded
Cookie: session=abc123

to=attacker_account&amount=10000

The key line is Cookie: session=abc123. The browser attaches every cookie whose Domain matches the target of the request — bank.com — regardless of who initiated the request. The fact that the page running the script is on attacker.com does not strip the cookie. From bank.com's view this is a fully authenticated POST: same cookie it issued, same shape it expects.

Why each defense works:

  • SameSite=Lax (now the browser default) tells the browser: attach this cookie only on same-site requests and on top-level cross-site navigations that use safe methods (GET). The cross-site POST above gets no Cookie header, and bank.com sees an anonymous request.
  • CSRF token: bank.com embeds a random token in its own pages (e.g. in a hidden form field) and requires it on every state-changing request. The attacker's page on attacker.com cannot read bank.com's pages (same-origin policy), so it cannot fill in the token. The request arrives without it and is rejected.
  • Origin/Referer check: server inspects the Origin: https://attacker.com header and refuses if it does not match bank.com.
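A minimal sketch of the CSRF-token defense — all names here are hypothetical, and a real framework ties the token to the session middleware for you:

```python
import hmac
import secrets

def issue_csrf_token(session_store: dict, session_id: str) -> str:
    """Generate a random token, remember it server-side, and return it
    for embedding in the server's own pages as a hidden form field."""
    token = secrets.token_urlsafe(32)
    session_store[session_id] = token
    return token

def check_csrf_token(session_store: dict, session_id: str, submitted: str) -> bool:
    """Reject any state-changing request whose token is missing or wrong.
    compare_digest avoids leaking the token through timing differences."""
    expected = session_store.get(session_id)
    return expected is not None and hmac.compare_digest(expected, submitted)

store = {}
token = issue_csrf_token(store, "abc123")
assert check_csrf_token(store, "abc123", token)        # same-origin form: accepted
assert not check_csrf_token(store, "abc123", "forged") # attacker's guess: rejected
```

The attacker's page can submit the form but cannot read the token out of bank.com's pages, so every forged request arrives without it.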

The attacker never sees the response body — same-origin policy prevents attacker.com's JavaScript from reading anything bank.com returns. But the side effect — money moved — has already happened by the time the response comes back. That asymmetry is what makes CSRF dangerous: you do not need to read to do harm.

Server-side request forgery (SSRF) — pivoting through your own server

SSRF inverts CSRF. The attacker controls a URL the server fetches — a URL preview, a webhook target, an avatar URL, a PDF renderer — and points it at infrastructure only the server can reach. Cloud metadata services (169.254.169.254) hand out IAM credentials to anything that asks from inside the VPC; internal databases sit on private IPs with no authentication. Capital One's 2019 breach was an SSRF into the AWS metadata service that leaked credentials for a bucket holding a hundred million records.

[Diagram: SSRF — an attacker on the public internet pivots through a server-side fetcher (URL preview, webhook, avatar fetcher, PDF renderer) into the internal network: 169.254.169.254 (cloud metadata, IAM creds), 10.0.0.5:5432 (internal Postgres, no auth on the private net), file:///etc/passwd (local-file fetch). The server fetches whatever URL it's told. Defenses: deny private IP ranges · DNS pinning · egress proxy · IMDSv2. Capital One 2019: SSRF → cloud metadata → IAM creds → 100M records.]
The structural fix is to not let application code resolve arbitrary URLs. Route outbound fetches through a tightly scoped egress proxy that resolves DNS once, blocks private IP ranges, and enforces an allow-list. On AWS, IMDSv2 closes the metadata variant by requiring a session token.
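A sketch of the pre-fetch check such a proxy performs — resolve once, then refuse private, loopback, and link-local ranges. This is deliberately incomplete (a full defense also pins the resolved IP for the actual connection, or DNS rebinding re-resolves to a private address after the check):

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    """Return False for URLs that resolve into internal address space."""
    host = urlparse(url).hostname
    if host is None:            # e.g. file:// URLs have no host at all
        return False
    try:
        addrs = {info[4][0] for info in socket.getaddrinfo(host, None)}
    except socket.gaierror:
        return False
    for addr in addrs:
        ip = ipaddress.ip_address(addr)
        # Blocks 10.0.0.0/8, 127.0.0.0/8, 169.254.0.0/16, etc.
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True

assert not is_safe_url("http://169.254.169.254/latest/meta-data/")
assert not is_safe_url("http://127.0.0.1:8080/")
assert not is_safe_url("file:///etc/passwd")
```

Running the check inside a dedicated egress proxy, rather than in every fetch call site, is what makes the fix structural.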

A handful of recurring browser defenses round out the picture: Content Security Policy to block injected scripts, SameSite cookies for CSRF, HSTS to lock the browser to TLS, Subresource Integrity to pin third-party scripts to a hash, frame-ancestors for clickjacking. Each is a response header with known semantics. Ship them as defaults.
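One way to ship them as defaults is a single header map merged into every response — the values below are reasonable starting points, not prescriptions:

```python
# Baseline security headers, applied to every response by middleware.
SECURITY_HEADERS = {
    # Whitelist script sources; frame-ancestors 'none' also blocks framing.
    "Content-Security-Policy": "default-src 'self'; frame-ancestors 'none'",
    # Lock the browser to TLS for a year, subdomains included.
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "X-Content-Type-Options": "nosniff",
}

def apply_defaults(response_headers: dict) -> dict:
    """Merge defaults in without clobbering route-specific overrides."""
    return {**SECURITY_HEADERS, **response_headers}

merged = apply_defaults({"Content-Type": "text/html"})
```

A route that genuinely needs a looser CSP overrides one key; everything else inherits the safe baseline.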

Pitfall — security through obscurity. Hiding /admin at /admin-9f3b2, returning 200 instead of 403 to confuse scanners, or doing client-side "encryption" before posting passwords does not slow a real attacker down. Build defense in depth with primitives that hold up under examination, then publish the design and let it be reviewed.

Standards

Cryptographic algorithms (NIST FIPS)

NIST Special Publications

TLS, X.509, and key derivation

  • RFC 8446 — TLS 1.3. Replaces RFCs 5246 (TLS 1.2), 4346 (TLS 1.1), 2246 (TLS 1.0).
  • RFC 5246 — TLS 1.2 (still common; obsoleted by 8446 but widely deployed).
  • RFC 5869 — HKDF (HMAC-based key derivation function); used inside TLS 1.3 and Signal.
  • RFC 5280 — X.509 Public Key Infrastructure Certificate and CRL Profile. The certificate format.
  • RFC 6962 / RFC 9162 — Certificate Transparency v1 and v2 (append-only logs).
  • RFC 8879 — TLS Certificate Compression. Smaller handshakes.

Authorization and identity protocols

  • RFC 6749 — OAuth 2.0 Authorization Framework. Core spec.
  • RFC 8252 — OAuth 2.0 for Native Apps. PKCE plus best-current-practice guidance for mobile/desktop.
  • RFC 7636 — PKCE (Proof Key for Code Exchange).
  • RFC 9700 — OAuth 2.0 Security Best Current Practice (2025). The modern guidance — read this before implementing anything.
  • RFC 9635 — GNAP (Grant Negotiation and Authorization Protocol). The intended OAuth 2.0 successor.
  • OpenID Connect Core 1.0 — Identity layer on OAuth.
  • SAML 2.0 — OASIS Security Assertion Markup Language. Enterprise SSO.
  • RFCs 7515–7519 — JWS, JWE, JWK, JWA, JWT. The JOSE family for signed/encrypted JSON.

Authentication

  • W3C Web Authentication, Level 3 — WebAuthn. The passkey API used by browsers.
  • FIDO Alliance CTAP2 — Client to Authenticator Protocol. The wire format between browser and security key.
  • RFC 6238 — TOTP: Time-Based One-Time Password Algorithm.
  • RFC 4226 — HOTP: An HMAC-Based One-Time Password Algorithm.
  • Argon2 — RFC 9106. The password-hashing winner of the 2015 PHC competition.

Web security frameworks

  • OWASP Top 10 — Current edition: 2021.
  • OWASP ASVS — Application Security Verification Standard. Concrete requirements at L1/L2/L3 assurance levels.
  • OWASP MASVS — Mobile counterpart of ASVS.
  • CWE — Common Weakness Enumeration (MITRE). Taxonomy of software weaknesses; CVE entries cite CWE IDs.
  • MITRE ATT&CK — Adversary tactics, techniques, and procedures (TTPs); used for detection-engineering and red-team simulation.
  • Web platform headers — Content Security Policy (W3C CSP3), HSTS (RFC 6797), Subresource Integrity (W3C SRI), Referrer Policy, Permissions Policy.

Supply chain

  • SLSA — Supply-chain Levels for Software Artifacts. A maturity model for build provenance.
  • Sigstore — keyless code signing built on cosign, Rekor (transparency log), Fulcio (short-lived cert authority bound to OIDC identity).
  • SBOM formats — SPDX (ISO/IEC 5962); CycloneDX (OWASP). Machine-readable inventories of components.
  • in-toto — Attestations for build steps; underpins SLSA provenance.
  • NIST SSDF (SP 800-218) — Secure Software Development Framework.

Compliance frameworks

  • PCI DSS v4.0 (2022; mandatory since 2024) — Payment Card Industry Data Security Standard. Required for any system handling cardholder data.
  • SOC 2 (AICPA) — Trust services criteria. The most common attestation B2B SaaS customers ask for.
  • ISO/IEC 27001 / 27002 — Information security management system + control catalog.
  • HIPAA Security Rule — US healthcare data; 45 CFR §§ 164.302–318.
  • GDPR — Regulation (EU) 2016/679. Personal data; informs threat models even when not legally binding.

Forward references

  • Logging, audit, and observability for security events — covered in Act IXc (Operations).
  • OAuth client-credentials and machine-to-machine API authentication — composed with the API patterns in Act Vc (APIs).
  • Symmetric cipher internals (AES rounds, ChaCha20 quarter-round), elliptic-curve math, and the TLS handshake byte-by-byte are link-out chapters from this section, not in this overview.

Going deeper

Branches that earn their own article.

  • Symmetric cipher internals (AES rounds, ChaCha20).
  • Asymmetric crypto math (RSA, elliptic curves, Diffie-Hellman).
  • TLS 1.3 handshake byte-by-byte.
  • OAuth 2.0 and OIDC flows in detail.
  • GNAP (RFC 9635) deep dive.
  • Supply-chain security (SBOMs, Sigstore, SLSA).
  • Secrets management (Vault, SOPS).
  • Penetration testing methodology.
  • Security compliance frameworks (SOC 2, ISO 27001).
  • Incident response for security breaches.