Interactive Lab

Format String Attacks

When user input becomes a printf format string, the function reads — and writes — whatever the attacker requests.

Vulnerable: user input == format string
void log_request(const char *user_input) {
  printf(user_input);          // <-- user controls the format
}

log_request(argv[1]);          // any %x, %s, %n in argv[1] is honored
Safe: user input is data, not format
void log_request(const char *user_input) {
  printf("%s", user_input); // user input is an argument
}

log_request(argv[1]);          // %x, %s, %n are printed literally
Format Specifier Cheat Sheet
Specifier | What it does | Attacker use
%x | Print next stack word as hex | Leak stack values (canaries, addresses)
%p | Print next stack word as pointer | Defeat ASLR by leaking addresses
%s | Dereference next stack word as a C string | Read arbitrary memory at the leaked address
%n | Write the count of bytes printed so far to the next stack word's target | Arbitrary write: full code execution
%N$x | Positional: read the Nth stack word | Skip past noise to a precise offset

Why printf is a primitive

The C printf family is variadic — it does not know how many arguments were actually passed. It trusts the format string to tell it. For every %x it sees, it reads one word from where the next argument should be. On most ABIs, that means walking up the stack (or argument registers), reading whatever happens to be there as if the caller had passed it.
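The trust relationship described above can be sketched with a toy formatter. This is a minimal illustration, not the real printf: the name mini_format and its three supported specifiers (%d, %s, %%) are assumptions made for the example. What matters is the loop, where every specifier triggers one va_arg() fetch and nothing checks that the caller supplied that many arguments.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy formatter (not the real printf): supports only %d, %s, %%.
   Returns how many arguments the format string made it fetch. */
int mini_format(char *out, size_t cap, const char *fmt, ...) {
    va_list ap;
    size_t used = 0;
    int fetched = 0;

    assert(out != NULL && fmt != NULL);
    if (cap == 0) return -1;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; p++) {
        char piece[32];
        const char *src = piece;

        if (*p != '%') {                 /* ordinary character: copy through */
            piece[0] = *p; piece[1] = '\0';
        } else if (p[1] == 'd') {        /* %d: fetch one int, no questions asked */
            snprintf(piece, sizeof piece, "%d", va_arg(ap, int));
            fetched++; p++;
        } else if (p[1] == 's') {        /* %s: fetch one pointer and follow it */
            src = va_arg(ap, const char *);
            if (src == NULL) src = "(null)";
            fetched++; p++;
        } else if (p[1] == '%') {        /* %%: literal percent, no fetch */
            piece[0] = '%'; piece[1] = '\0'; p++;
        } else {                         /* unsupported or trailing '%': print it */
            piece[0] = '%'; piece[1] = '\0';
        }

        size_t len = strlen(src);
        if (len > cap - 1 - used) len = cap - 1 - used;   /* truncate to fit */
        memcpy(out + used, src, len);
        used += len;
    }
    va_end(ap);
    out[used] = '\0';
    return fetched;
}
```

Calling mini_format(buf, sizeof buf, "id=%d user=%s", 7, "alice") fetches two arguments because the format names two. If the format string came from an attacker and named more, the fetches would still happen, reading whatever sits in the next argument slots.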

If the format string is attacker-controlled, the attacker chooses how many stack words to walk and how to interpret each one. %x %x %x %x dumps four words of stack. %s treats the next word as a pointer and dereferences it — an arbitrary read of any address whose value happens to sit at the right offset. Combined with positional specifiers like %7$s, the attacker can pick the offset deliberately.
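To make "the attacker chooses how many stack words to walk" concrete, the sketch below estimates how many argument slots a string would request if it were used as a format. The helper name slots_requested is hypothetical, and the parser is deliberately simplified (it ignores '*' widths and precisions, which also consume arguments), so treat it as an illustration rather than a validator.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: estimate how many argument slots printf would read
   if `s` were used as a format string. Simplified: '*' is not handled. */
size_t slots_requested(const char *s) {
    size_t sequential = 0;    /* specifiers that take "the next" argument */
    size_t highest_pos = 0;   /* largest N seen in a positional %N$... */

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s != '%') continue;
        s++;
        if (*s == '\0') break;           /* lone trailing '%' */
        if (*s == '%') continue;         /* "%%" is a literal percent sign */

        size_t number = 0;
        const char *q = s;
        while (isdigit((unsigned char)*q)) {
            number = number * 10 + (size_t)(*q - '0');
            q++;
        }
        if (q != s && *q == '$') {       /* positional form, e.g. %7$s */
            if (number > highest_pos) highest_pos = number;
            s = q;                       /* resume scanning after the '$' */
        } else {
            sequential++;                /* plain form, e.g. %x or %08x */
        }
    }
    return sequential > highest_pos ? sequential : highest_pos;
}
```

Under these assumptions, "%x %x %x %x" requests four slots and "%7$s" reaches slot seven with a single specifier, which is why positional specifiers make precise offsets cheap for an attacker.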

The truly catastrophic specifier is %n: instead of reading, it writes the number of bytes printed so far to the address sitting at the next argument slot. Pair %n with field-width padding (%1000c%n prints one character padded to 1000 columns, then writes the integer 1000) and an attacker can construct an arbitrary value and write it to any address they can place on the stack. That is the recipe for overwriting return addresses, GOT entries, or function pointers and, from there, executing shellcode.
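The padding arithmetic can be checked with the legitimate form of %n, where the format is a string literal and the destination is a local int. This is a sketch under the assumption of a C library that implements %n (glibc does; some C libraries restrict or disable it). The function name count_after_padding is made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical demo: return the value %n stores after padding to `width`
   columns. Literal format, our own destination: benign use of %n. */
int count_after_padding(int width) {
    char scratch[1100];
    int stored = -1;

    assert(width >= 1 && width <= 1000);
    /* "%*c" takes its width from an argument and prints (width - 1) spaces
       followed by 'A'; "%n" then records the running character count. */
    snprintf(scratch, sizeof scratch, "%*c%n", width, 'A', &stored);
    return stored;
}
```

The stored value equals the requested width, so the width field acts as the knob for the number written. In an attack, the difference is only that the format string and the destination address are both chosen by the attacker.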

The fix is one rule: never pass untrusted data as a format string. Always write printf("%s", user_input), never printf(user_input). Modern toolchains help: GCC and Clang can flag calls like printf(user_input) with -Wformat-security, and glibc's _FORTIFY_SOURCE=2 aborts at runtime when a format string containing %n lives in writable memory. Most contemporary languages avoid the memory-corruption form of the bug by keeping the format fixed in source code and passing data separately (Python f-strings, Rust's format! macro, Java's String.format with a literal first argument).
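One way to apply the rule in a codebase with its own logging helpers is sketched below. The wrapper names log_fmt and log_request_to are hypothetical. The format attribute is a GCC/Clang extension that lets the compiler check callers of a printf-style wrapper, and the macro falls back to nothing on other compilers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* GCC/Clang extension: mark a parameter as a printf-style format so that
   -Wformat and -Wformat-security checks apply to callers of the wrapper. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, args_idx) \
    __attribute__((format(printf, fmt_idx, args_idx)))
#else
#define PRINTF_LIKE(fmt_idx, args_idx)
#endif

/* Hypothetical logging wrapper: the format must come from the programmer. */
int log_fmt(char *out, size_t cap, const char *fmt, ...) PRINTF_LIKE(3, 4);

int log_fmt(char *out, size_t cap, const char *fmt, ...) {
    va_list ap;
    assert(out != NULL && fmt != NULL);
    va_start(ap, fmt);
    int n = vsnprintf(out, cap, fmt, ap);
    va_end(ap);
    return n;
}

/* Untrusted text only ever travels as an argument to a fixed "%s". */
int log_request_to(char *out, size_t cap, const char *user_input) {
    return log_fmt(out, cap, "request: %s", user_input);
}
```

With this shape, an input such as "%x %s %n" is copied to the log verbatim. Building with -Wformat -Wformat-security (or -Wformat=2 on GCC) should also flag any caller that passes a non-literal string as the fmt argument without further arguments.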

In the wild

CVE-2000-0573
wu-ftpd SITE EXEC
The format-string bug that made the class famous. A user-supplied SITE EXEC argument flowed into a vulnerable printf in wu-ftpd, giving remote root on countless FTP servers.
CVE-2012-0809
sudo
A format-string flaw in sudo's debug output let local users escalate privileges. A reminder that even security-critical tooling has shipped this bug.
CVE-2023-25193
HarfBuzz
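A denial-of-service flaw in the HarfBuzz text-shaping library used by browsers and document renderers: a malicious font with many consecutive marks triggers quadratic-time processing. The public CVE record describes an algorithmic-complexity issue rather than a format-string bug, so it is not an instance of CWE-134.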
CWE-134
The whole class
"Use of Externally-Controlled Format String." The class still turns up in new advisories more than two decades after the first public exploits, despite long-standing compiler warnings, because legacy code and copy-pasted log calls keep reintroducing it.