Symmetric · 03

Padding and Initialization Vectors

Block ciphers want their inputs in neatly sized blocks and they want every encryption to look different from every other. Padding solves the first. Initialization vectors solve the second. Both are easy to get wrong.

01

Two Problems, Two Tools

AES insists on 16-byte blocks. Real-world messages do not arrive in tidy 16-byte multiples. Worse, even when they do, encrypting the same plaintext with the same key always produces the same ciphertext, which leaks information.

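This page covers the two mechanical components that close those gaps: padding, which fills out the final block, and initialization vectors, which make every encryption look different.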

Both are simple in concept. Both have famous attacks tied to them when implementations cut corners.

02

Why Padding Is Needed

A block cipher running in modes such as ECB or CBC cannot encrypt anything except a complete block. If the plaintext is 1,000 bytes long, the cipher will see 62 full 16-byte blocks plus 8 leftover bytes. Those 8 bytes are not a complete block. The cipher refuses to operate on them.
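The arithmetic behind that example is a single division with remainder. A minimal sketch (the function name `split_into_blocks` is only illustrative):

```python
BLOCK_SIZE = 16  # AES block size in bytes


def split_into_blocks(length: int, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (number of full blocks, leftover bytes) for a message length."""
    return divmod(length, block_size)


full_blocks, leftover = split_into_blocks(1000)
print(full_blocks, leftover)  # 62 8
```

Any nonzero leftover means the final block is incomplete and must be filled before the cipher can process it.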

Padding adds extra bytes to the end of the plaintext to fill the final block. Decryption must then be able to identify and strip the padding afterward. The trick is making the padding unambiguous so it can be reliably removed.

03

PKCS#7 Padding

The standard padding scheme used with AES is PKCS#7. The rule is simple:

Rule

If N bytes need to be added to complete the last block, pad with N bytes, each containing the value N.

So if 5 bytes of padding are needed, those 5 bytes will each contain the value 0x05. If 1 byte of padding is needed, it will be 0x01. If 13 bytes are needed, all 13 will be 0x0D (decimal 13). The receiver looks at the last byte of the decrypted message, learns how many padding bytes were added, and strips them.

[Figure: PKCS#7 padding scenarios with 16-byte blocks. Case A: the 11-byte message "HELLO WORLD" is followed by 5 padding bytes, each 0x05. Case B: a 15-byte message is followed by 1 padding byte, 0x01. Case C: a 16-byte message fills its block exactly and is still followed by a full extra block of 16 padding bytes, each 0x10.]
Figure 3.1 PKCS#7 padding. The value of each padding byte equals the number of padding bytes. A whole extra block of padding is added when the plaintext is already a perfect multiple of the block size, so that the receiver can always strip padding unambiguously.

Notice case C. If the plaintext happens to be exactly a multiple of 16 bytes, an entire extra block of 0x10 bytes is appended. This looks wasteful but it is essential. Without it, the receiver could not tell whether the last byte of the message was data or padding. Because padding is always added, the receiver can always strip the last N bytes, where N is whatever value the last byte contains.
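The rule and its full-block corner case fit in a few lines. The following is a minimal sketch, not a library API; the names `pkcs7_pad` and `pkcs7_unpad` are chosen here for illustration:

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the result is a multiple of block_size."""
    # n is always in 1..block_size, never 0, so a full extra block is
    # added when the data already ends on a block boundary (case C).
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS#7 padding, rejecting anything malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    n = padded[-1]  # the last byte says how many bytes to strip
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


print(pkcs7_pad(b"HELLO WORLD").hex())  # ends in 0505050505
print(len(pkcs7_pad(b"A" * 16)))        # 32
```

Note that `pkcs7_unpad` raises a distinguishable error on bad padding. That is fine for a local demonstration, but a network service must not let a remote party observe that difference.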

04

When Padding Goes Wrong

PKCS#7 is straightforward, but a careless implementation can leak information through whether or not the padding is valid. A famous class of attacks called padding oracle attacks exploits systems that respond differently to ciphertexts with valid padding versus invalid padding. By submitting forged ciphertext repeatedly and watching for the difference, an attacker can decrypt arbitrary blocks one byte at a time, without ever knowing the key.

The most famous example is the POODLE attack on SSL 3.0, published in 2014. POODLE let an attacker decrypt real-world HTTPS traffic by forcing connections to fall back to SSL 3.0 and then exploiting that protocol's CBC padding, whose contents the receiver could not verify. The mitigation is to either move to authenticated encryption (covered later in this track) or to be extremely careful that the system never reveals padding validity through timing, error messages, or response codes.
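One small piece of that care is checking padding without bailing out at the first bad byte. The sketch below always scans the whole final block and returns a single boolean. The function name `pkcs7_valid` is hypothetical, and Python offers no real constant-time guarantees, so treat this as an illustration of the idea rather than a hardened implementation:

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_valid(padded: bytes, block_size: int = BLOCK_SIZE) -> bool:
    """Check PKCS#7 padding while touching every byte of the last block."""
    if len(padded) == 0 or len(padded) % block_size != 0:
        return False
    last = padded[-block_size:]
    n = last[-1]
    # Accumulate failures in one flag instead of returning early.
    bad = int(n == 0) | int(n > block_size)
    for i in range(block_size):
        # Byte i from the end (0 = last byte) is padding when i < n.
        is_pad = int(i < n)
        bad |= is_pad & int(last[-1 - i] != n)
    return bad == 0


print(pkcs7_valid(b"A" * 11 + b"\x05" * 5))  # True
print(pkcs7_valid(b"A" * 12 + b"\x05" * 4))  # False
```

The caller should map a `False` result to the same generic failure it uses for every other decryption error, so that the response itself does not act as the oracle.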

Caution

Padding is not a security feature. It is a structural necessity. The moment your system can be probed for padding validity, an attacker may be able to recover plaintext without the key. Modern systems sidestep the problem entirely by using authenticated encryption modes (AES-GCM, ChaCha20-Poly1305), where no padding is needed and tampering is detected before any plaintext is released.

05

Why Identical Blocks Are a Problem

The simplest way to encrypt a multi-block message is to encrypt each block independently with the same key. This is called ECB (Electronic Codebook) mode. It is also the wrong way to do it.

The problem: if two plaintext blocks are identical, they will produce two identical ciphertext blocks. Patterns in the plaintext (repeated headers, repeated fields, runs of zero-byte filler) survive the encryption and become visible in the ciphertext.

[Figure: The ECB problem. Four plaintext blocks ("BALANCE: 1000.00", "BALANCE: 1000.00", "FEE: 0050.00", "BALANCE: 1000.00") are each encrypted with AES under the same key K. The three identical plaintext blocks all produce the same ciphertext block ("7a4f...e3d2"); the one different block produces "9b21...88af". Same plaintext plus same key gives the same ciphertext every time, so an attacker who sees the ciphertext can count how often each block repeats and recover structure.]
Figure 3.2 ECB mode leaks plaintext patterns directly into the ciphertext. This is why ECB is never used for anything beyond toy examples.

This is not theoretical. The canonical demonstration encrypts a bitmap of Tux, the Linux penguin mascot, with ECB. The result is a recolored but completely recognizable Tux, because every region of uniform color encrypts to the same repeating ciphertext blocks. The outline of the penguin passes straight through the encryption.
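The pattern leak is easy to reproduce without a real cipher, because the only property involved is determinism. In this sketch, `toy_block_encrypt` is a stand-in built from HMAC-SHA256. It is an assumption made to keep the example dependency-free: it is not AES and cannot even be decrypted. The third block is padded with spaces to 16 bytes purely for the demonstration:

```python
import hashlib
import hmac

BLOCK_SIZE = 16


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES: a deterministic keyed function of one 16-byte block.

    NOT a real cipher. It only mimics the property that matters here:
    same key + same block -> same output.
    """
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK_SIZE]


def ecb_encrypt(key: bytes, plaintext: bytes) -> list[bytes]:
    """Encrypt each 16-byte block independently, as ECB does."""
    assert len(plaintext) % BLOCK_SIZE == 0
    return [
        toy_block_encrypt(key, plaintext[i:i + BLOCK_SIZE])
        for i in range(0, len(plaintext), BLOCK_SIZE)
    ]


key = b"k" * 16
blocks = [
    b"BALANCE: 1000.00",
    b"BALANCE: 1000.00",
    b"FEE: 0050.00    ",
    b"BALANCE: 1000.00",
]
ciphertext = ecb_encrypt(key, b"".join(blocks))

# Blocks 0, 1 and 3 are identical plaintext, so they collide in the output.
print(ciphertext[0] == ciphertext[1] == ciphertext[3])  # True
print(ciphertext[0] == ciphertext[2])                   # False
```

An observer who sees only `ciphertext` learns that three of the four blocks carry the same content, without knowing the key.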

The fix is to make every block's encryption depend on something unique, so that even identical plaintext blocks produce different ciphertext blocks. That something is the initialization vector.

06

What an Initialization Vector Is

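An initialization vector (IV) is a random or unique value that is mixed into the start of an encryption. In CBC it is the same size as a block (16 bytes for AES) and is combined with the first plaintext block; some other modes use a different size, such as the 12-byte value recommended for GCM. Different IVs produce different ciphertexts even from identical plaintexts under identical keys.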

The IV itself is not secret. It does not need to be hidden. It is typically prepended to the ciphertext or transmitted alongside it so the receiver can use it during decryption. What the IV must be is unique per encryption with the same key. Reusing an IV destroys the security properties for most modes.

Mode    IV Requirement
CBC     Must be unpredictable and unique. Generated with a cryptographically secure random source.
CTR     Must be unique. Often a per-message nonce combined with a block counter that restarts for each message. Predictability is acceptable.
GCM     Must be unique. 96 bits (12 bytes) is recommended. Reuse with the same key is catastrophic for GCM specifically.

07

CBC Mode · How the IV Gets Used

CBC (Cipher Block Chaining) is the classical block cipher mode that demonstrates IV usage cleanly. Each plaintext block is XORed with the previous ciphertext block before being encrypted. The very first block has no previous ciphertext, so it is XORed with the IV instead.

CBC mode encryption with IV The IV is XORed with the first plaintext block before encryption. Each subsequent block is XORed with the previous ciphertext before being encrypted. CBC Mode: Each Block Chained to the Last P1 P2 P3 IV AES Encrypt (K) AES Encrypt (K) AES Encrypt (K) C1 C2 C3 feedback feedback First block uses IV as the "previous ciphertext". Every block after chains from the real previous block. A different IV at the start cascades into different ciphertext for the whole message.
Figure 3.3 CBC mode. The IV seeds the chain. Each block's ciphertext influences the next block's encryption, so identical plaintexts under the same key produce different ciphertexts when the IVs differ.

The decryption process runs in reverse. The first ciphertext block is decrypted and then XORed with the IV to recover the first plaintext. The second ciphertext block is decrypted and then XORed with the first ciphertext block to recover the second plaintext. And so on. The IV travels with the ciphertext, in the clear, usually as the first 16 bytes.
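The chaining in both directions can be written compactly. The notation here is an assumption layered on the figure's labels: $E_K$ and $D_K$ stand for AES block encryption and decryption under key $K$, $P_i$ and $C_i$ are the plaintext and ciphertext blocks, and $\oplus$ is bytewise XOR.

```latex
\begin{aligned}
C_0 &= \mathit{IV} \\
C_i &= E_K\left(P_i \oplus C_{i-1}\right), \quad i = 1, 2, \dots \\
P_i &= D_K\left(C_i\right) \oplus C_{i-1}
\end{aligned}
```

Encryption of block $i$ needs the finished $C_{i-1}$, so it is inherently sequential. Decryption of block $i$ needs only $C_i$ and $C_{i-1}$, both of which the receiver already holds.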

08

Three Common IV Mistakes

Reusing the IV

The single most damaging IV mistake. If two messages are encrypted in CBC under the same key with the same IV, any blocks they share at the start produce identical ciphertext, exposing whether, and for how long, the two messages begin the same way. In CTR or GCM mode, reuse is catastrophic: both messages are XORed with the same keystream, so XORing the two ciphertexts cancels the keystream and yields the XOR of the two plaintexts.
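The counter-mode failure can be shown in a few lines. This sketch assumes a toy keystream built from HMAC-SHA256 in place of AES-CTR (`toy_keystream` is a hypothetical stand-in, not a real cipher); the XOR relationship it demonstrates holds for any stream-style mode when the key and nonce repeat:

```python
import hashlib
import hmac


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for an AES-CTR keystream: PRF(key, nonce || counter) blocks."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Counter-mode style encryption: plaintext XOR keystream."""
    return xor_bytes(plaintext, toy_keystream(key, nonce, len(plaintext)))


key = b"k" * 16
nonce = b"\x00" * 12  # reused on purpose to show the failure

p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = ctr_encrypt(key, nonce, p1)
c2 = ctr_encrypt(key, nonce, p2)

# The keystream cancels: c1 XOR c2 == p1 XOR p2, with no key involved.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(p1, p2))  # True

# An attacker who knows or guesses p1 recovers p2 outright.
print(xor_bytes(leak, p1))  # b'retreat at noon!'
```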

Using an all-zeros IV

Some implementations use a constant zero IV "for simplicity." This is functionally identical to IV reuse: every message under the key uses the same starting point, so identical plaintexts produce identical ciphertexts, defeating the entire purpose of the IV.

Using a predictable IV in CBC

CBC requires an IV that an attacker cannot predict. If the IV is predictable (for example, a sequential counter visible to an attacker), the mode becomes vulnerable to chosen-plaintext attacks such as BEAST, which broke real-world TLS 1.0 traffic in 2011. CTR and GCM modes tolerate predictable IVs because the IV is fed through the block cipher to produce a keystream rather than being XORed directly with the plaintext, but CBC does not.

Practical Guidance

For CBC, generate the IV from a cryptographically secure random number generator. For CTR and GCM, a counter or random value is fine as long as it never repeats under the same key. Never hard-code an IV. Never default to zeros. Always transmit the IV with the ciphertext so the receiver can decrypt.
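Both styles of IV are short to produce. In the sketch below, `random_iv` uses Python's standard `secrets` module, while `NonceCounter` is a hypothetical helper shown only to illustrate the counter approach. It keeps its state in memory, so a real system would also have to persist the counter (or rotate the key) across restarts:

```python
import secrets


def random_iv(size: int = 16) -> bytes:
    """Fresh unpredictable IV for CBC, drawn from the OS CSPRNG."""
    return secrets.token_bytes(size)


class NonceCounter:
    """Hypothetical helper: 12-byte nonces for CTR/GCM that never repeat.

    One instance must be tied to exactly one key. The state lives only in
    memory here, which is an assumption made to keep the sketch short.
    """

    def __init__(self) -> None:
        self._next = 0

    def next_nonce(self) -> bytes:
        if self._next >= 2 ** 96:
            raise OverflowError("nonce space exhausted; rotate the key")
        nonce = self._next.to_bytes(12, "big")
        self._next += 1
        return nonce


iv = random_iv()
print(len(iv))  # 16

counter = NonceCounter()
print(counter.next_nonce().hex())  # 000000000000000000000000
print(counter.next_nonce().hex())  # 000000000000000000000001
```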

09

Putting It Together

To encrypt an arbitrary-length message with AES in CBC mode:

  1. Generate a fresh random 16-byte IV.
  2. Apply PKCS#7 padding to the plaintext so it is an exact multiple of 16 bytes.
  3. Encrypt the padded plaintext under the key, chaining through CBC starting with the IV.
  4. Transmit the IV concatenated with the ciphertext.

To decrypt:

  1. Split off the first 16 bytes as the IV.
  2. Decrypt the remaining ciphertext under the key, chaining through CBC starting with the IV.
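  3. Read the last byte of the recovered plaintext; that is the padding length N. Check that N is between 1 and 16 and that the last N bytes all equal N, then strip them. Report any failure with the same generic error as every other decryption failure.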
  4. Return the remaining plaintext.
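The two procedures above can be sketched end to end. To stay dependency-free, this example substitutes a toy 4-round Feistel network for AES. That substitution is an assumption made purely for illustration: `toy_encrypt_block` and `toy_decrypt_block` are hypothetical and not secure, and real code would call AES from a vetted library instead.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Feistel round function: keyed hash of one 8-byte half."""
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy invertible block function standing in for AES. NOT secure."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: undo the rounds in reverse order."""
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    iv = secrets.token_bytes(BLOCK)                   # 1. fresh random IV
    pad = BLOCK - len(plaintext) % BLOCK              # 2. PKCS#7 padding
    padded = plaintext + bytes([pad]) * pad
    prev, out = iv, [iv]                              # 4. IV goes first
    for i in range(0, len(padded), BLOCK):            # 3. CBC chaining
        prev = toy_encrypt_block(key, _xor(padded[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def decrypt(key: bytes, message: bytes) -> bytes:
    if len(message) < 2 * BLOCK or len(message) % BLOCK:
        raise ValueError("decryption failed")
    iv, ciphertext = message[:BLOCK], message[BLOCK:]  # 1. split off the IV
    prev, padded = iv, b""
    for i in range(0, len(ciphertext), BLOCK):         # 2. CBC un-chaining
        block = ciphertext[i:i + BLOCK]
        padded += _xor(toy_decrypt_block(key, block), prev)
        prev = block
    pad = padded[-1]                                   # 3. read and check padding
    if not 1 <= pad <= BLOCK or padded[-pad:] != bytes([pad]) * pad:
        raise ValueError("decryption failed")          # same error as above
    return padded[:-pad]                               # 4. return the plaintext


key = secrets.token_bytes(16)
message = encrypt(key, b"HELLO WORLD")
print(decrypt(key, message))  # b'HELLO WORLD'
print(len(message))           # 32: 16-byte IV + one padded block
```

Because the IV is random, encrypting the same plaintext twice under the same key yields two different messages, yet both decrypt to the same result. Nothing in this sketch detects tampering with the ciphertext.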

This is the historical default for symmetric encryption, and it still works correctly when implemented carefully. Modern systems prefer authenticated encryption modes like AES-GCM, which fold encryption and tamper-detection into a single operation and eliminate both padding oracle attacks and the need to manage padding at all. That story is on the AEAD page.