07.05 · OWASP A04:2021 (Insecure Design)

File Upload

Profile picture, resume PDF, expense receipt. Every app accepts uploads. Every app validates them poorly. The bug that turns a user's avatar into a webshell is the same one that's existed since the cgi-bin era.

The naive endpoint

A junior developer writes the upload handler. They've seen tutorials. They check the extension, write the file to disk, return a URL. Done.

    // PHP: the kind that ships and then ends careers
    $file = $_FILES["avatar"];
    $ext = pathinfo($file["name"], PATHINFO_EXTENSION);
    if (in_array($ext, ["jpg", "png", "gif"])) {
        move_uploaded_file($file["tmp_name"], "/var/www/uploads/" . $file["name"]);
        echo "/uploads/" . $file["name"];
    }

Five things wrong with this code. Each one is an exploit.

Five ways attackers win

1. Extension lying

The attacker uploads shell.php and the extension check rejects it, so they upload shell.php.jpg instead. The check passes because the segment after the final dot is jpg. But Apache with an AddHandler directive for .php, or a misconfigured nginx, may still execute shell.php.jpg as PHP because the handler matches .php anywhere in the filename. This double-extension trick has been a recurring source of CVEs for over a decade.
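The gap between "last extension" and "any extension" can be sketched in a few lines of Python. Both function names and the allow list are hypothetical, chosen only to mirror the PHP handler above; this is an illustration under those assumptions, not a complete validator.

```python
ALLOWED = {"jpg", "png", "gif"}  # same allow list as the PHP example


def naive_check(filename: str) -> bool:
    """Mirror the PHP handler: inspect only the text after the final dot."""
    ext = filename.rsplit(".", 1)[-1] if "." in filename else ""
    return ext in ALLOWED


def strict_check(filename: str) -> bool:
    """Require every dot-separated segment after the first to be allowed."""
    segments = filename.lower().split(".")[1:]
    return bool(segments) and all(seg in ALLOWED for seg in segments)


print(naive_check("shell.php.jpg"))   # True: the double extension slips through
print(strict_check("shell.php.jpg"))  # False: the "php" segment is rejected
```

The strict variant is deliberately conservative and also rejects harmless names such as my.vacation.jpg. Generating the stored filename on the server removes the need for this check altogether.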

2. Content-Type spoofing

A handler that checks $_FILES["avatar"]["type"] is trusting the MIME type declared by the client. The attacker uploads shell.php with Content-Type: image/jpeg, the server believes the header, and happily writes the file.
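To see why the declared type carries no weight, consider how little it takes to produce one. The sketch below builds a single multipart/form-data part by hand, the way any HTTP client could; build_part and declared_type_of are hypothetical helper names, and the parsing is simplified for illustration.

```python
def build_part(filename: str, declared_type: str, body: bytes) -> bytes:
    """Build one multipart/form-data part; every field is chosen by the client."""
    headers = (
        f'Content-Disposition: form-data; name="avatar"; filename="{filename}"\r\n'
        f"Content-Type: {declared_type}\r\n\r\n"
    )
    return headers.encode("ascii") + body


def declared_type_of(part: bytes) -> str:
    """Return the Content-Type header of a part, exactly as the client wrote it."""
    head, _, _ = part.partition(b"\r\n\r\n")
    for line in head.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-type":
            return value.strip().decode("ascii")
    return ""


part = build_part("shell.php", "image/jpeg", b'<?php system($_GET["c"]); ?>')
print(declared_type_of(part))  # image/jpeg, although the body is PHP
```

The header is a plain string in the request, so a check against it only filters out clients that are honest.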

3. Magic-byte bypass

You upgraded the check. Now you read the first few bytes (the magic number) and confirm it's a JPEG (FF D8 FF, usually followed by E0 for JFIF) or a PNG (89 50 4E 47). The attacker prepends a real JPEG header to a PHP file. The magic bytes match and the file saves. The server still executes it, because nothing actually parsed the JPEG content.

    # attacker's polyglot file: valid JPEG header + PHP code
    $ printf '\xFF\xD8\xFF\xE0polyglot' > shell.jpg
    $ echo '<?php system($_GET["c"]); ?>' >> shell.jpg
    # `file shell.jpg` reports: JPEG image data. uploads cleanly.
    # curl 'https://target.com/uploads/shell.jpg?c=id'   (works if Apache executes by extension only)
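The same polyglot can be run against a prefix-only check in Python. looks_like_image is a hypothetical helper standing in for the upgraded validation described above; it is a sketch of the flawed approach, not a recommendation.

```python
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def looks_like_image(data: bytes) -> bool:
    """Prefix-only check: compares leading bytes and parses nothing."""
    return data.startswith(JPEG_MAGIC) or data.startswith(PNG_MAGIC)


# Same bytes the shell commands above produce: JPEG prefix, filler, then PHP.
polyglot = b"\xff\xd8\xff\xe0polyglot" + b'<?php system($_GET["c"]); ?>\n'

print(looks_like_image(polyglot))  # True: the check passes
print(b"<?php" in polyglot)        # True: the payload is still inside
```

A prefix match says nothing about the rest of the file, which is why decoding and re-encoding the image is the stronger check.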

4. Path traversal in the filename

The user-supplied filename is ../../../../etc/cron.d/evil. The upload handler doesn't normalize the path. Now you have a cron job. Or you overwrite index.php. Or you write into .ssh/authorized_keys of a service user. Filenames must be sanitized to just a basename.
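A minimal Python sketch of basename sanitization plus a containment check. resolve_inside is a hypothetical helper and /srv/uploads is an assumed location; Path.is_relative_to needs Python 3.9 or later.

```python
from pathlib import Path, PureWindowsPath


def resolve_inside(upload_dir: Path, user_filename: str) -> Path:
    """Reduce a client-supplied name to a basename and confirm it stays in upload_dir."""
    # PureWindowsPath splits on both "/" and "\", so either style of traversal is stripped.
    name = PureWindowsPath(user_filename).name
    if name in ("", ".", ".."):
        raise ValueError("unusable filename")
    base = upload_dir.resolve()
    target = (base / name).resolve()
    # Belt and braces: reject anything that still resolves outside the directory.
    if not target.is_relative_to(base):
        raise ValueError("path escapes upload directory")
    return target


target = resolve_inside(Path("/srv/uploads"), "../../../../etc/cron.d/evil")
print(target.name)  # evil
```

Even with this in place, a server-generated name is safer than a cleaned user name, because it also avoids collisions and overwrites of existing files.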

5. Stored on the same server as code

The structural mistake. Uploads live under the web root, in a directory the web server treats like any other. The attacker uploads a webshell with any of the four tricks above and just navigates to it.

The single best defense: upload directories should not be web-served. Period. Files come back through an application endpoint that re-renders them. Most modern stacks default to object storage (S3, Azure Blob, GCS) precisely because it bypasses this entire vulnerability class.
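When files come back through an application endpoint, the response headers are under your control. The sketch below shows one way to choose them; download_headers and the inline allow list are assumptions for illustration, and a real endpoint would also check authorization before sending anything.

```python
# Hypothetical allow list: types considered safe to render in the browser.
SAFE_INLINE_TYPES = {"image/jpeg", "image/png", "image/gif"}


def download_headers(stored_type: str, display_name: str) -> dict[str, str]:
    """Pick response headers from the type recorded at upload time, not from the client."""
    inline = stored_type in SAFE_INLINE_TYPES
    content_type = stored_type if inline else "application/octet-stream"
    # Keep only conservative ASCII characters so quotes or CRLF cannot break the header.
    clean = "".join(
        c for c in display_name if c.isascii() and (c.isalnum() or c in "._-")
    ) or "download"
    disposition = "inline" if inline else "attachment"
    return {
        "Content-Type": content_type,
        "Content-Disposition": f'{disposition}; filename="{clean}"',
        "X-Content-Type-Options": "nosniff",  # stop browsers from guessing a type
    }


print(download_headers("text/html", "report.html")["Content-Disposition"])
# attachment; filename="report.html"
```

Anything outside the allow list is forced to download as an opaque attachment, so an uploaded HTML or SVG file never renders in your origin.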

The right way

Defense in depth: pick all that apply

    import uuid
    from PIL import Image  # Pillow

    # 1. Generate the filename. Never trust the user's.
    random_name = uuid.uuid4().hex + ".jpg"
    storage_path = UPLOAD_DIR / random_name   # UPLOAD_DIR is a pathlib.Path outside web root

    # 2. Verify content by parsing it, not by reading bytes you trust.
    try:
        img = Image.open(uploaded_file)
        img.verify()                      # raises if not a real image
        img = Image.open(uploaded_file)   # re-open after verify
        img = img.convert("RGB")          # JPEG cannot store alpha or palette modes
        img.thumbnail((1024, 1024))       # downscale in place
        img.save(storage_path, "JPEG", quality=85)  # re-encode: strips embedded payloads
    except Exception:
        return "Bad file", 400

    # 3. Store outside web root. Serve through an app endpoint that checks auth.
    # 4. Set Content-Type and Content-Disposition on responses; never trust upload mime.
    # 5. Limit size (mod_security, nginx client_max_body_size, framework setting).
    # 6. Antivirus scan if the file is shared with other users.

The decisive step is number 3: store outside the web root. Even if all the validation above fails, a file the web server cannot execute as code can't compromise the host through this path.
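The size limit in step 5 is usually enforced by the web server or framework, but an application-level guard is cheap to add. The sketch below assumes a buffered, file-like upload stream; read_limited and the 5 MiB cap are hypothetical.

```python
from typing import BinaryIO

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumed cap; tune for your application


def read_limited(stream: BinaryIO, limit: int = MAX_UPLOAD_BYTES) -> bytes:
    """Read at most `limit` bytes and fail if the stream holds more."""
    # Ask for one byte over the limit: getting it proves the upload is too large
    # without buffering the whole body first.
    data = stream.read(limit + 1)
    if len(data) > limit:
        raise ValueError("upload too large")
    return data
```

Reading one byte past the limit keeps memory use bounded even when the client omits or falsifies the Content-Length header.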

Special filetypes

| Type | Why it's dangerous | How to handle |
|---|---|---|
| SVG | SVG is XML and can contain JavaScript. Upload SVG → user views avatar → XSS in their session. | Re-render to PNG/JPEG server-side. Never serve user SVG directly. |
| HTML / SVG / XML | Same-origin policy: a file served from your domain runs in your origin and can read your cookies. | Serve from a separate domain (cdn-user-content.example.com) or with Content-Disposition: attachment. |
| ZIP / TAR | Zip slip: filenames in the archive contain ../, so the extractor writes outside the intended directory. | Sanitize each archive entry's name before writing. Reject entries with traversal. |
| PDF | Parsers have RCE bugs (for example CVE-2018-9958 in Foxit Reader, among others). PDFs can also embed JavaScript. | Render server-side to image if you can. Sandbox parsing. |
| DOCX / XLSX | Office files are zip + XML, so they carry the full XXE attack surface, macros aside. | Treat as untrusted XML input. Sandbox parsing. Reject macros. |
| Images with EXIF | EXIF metadata can carry payloads (PHP, shell) that survive naive validation, and a crafted image can exploit a vulnerable image processor (ImageTragick, CVE-2016-3714). | Strip EXIF on upload. Keep ImageMagick patched. Sandbox image processing. |
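The ZIP row can be illustrated with Python's standard zipfile module. safe_extract is a hypothetical helper that validates every entry before writing anything; it is a sketch that ignores symlink entries and size limits, both of which a production extractor should also handle.

```python
import shutil
import zipfile
from pathlib import Path


def safe_extract(archive_path: Path, dest_dir: Path) -> list[Path]:
    """Extract a zip archive, refusing any entry that resolves outside dest_dir."""
    dest = dest_dir.resolve()
    written: list[Path] = []
    with zipfile.ZipFile(archive_path) as zf:
        entries = zf.infolist()
        # First pass: validate all names so a bad archive writes nothing at all.
        for info in entries:
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):  # Python 3.9+
                raise ValueError(f"blocked traversal entry: {info.filename!r}")
        # Second pass: write each file under the destination directory.
        for info in entries:
            target = (dest / info.filename).resolve()
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            written.append(target)
    return written
```

Validating before writing means an archive with a single ../ entry is rejected as a whole, rather than being left half-extracted on disk.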

Real incidents

  • ImageTragick (CVE-2016-3714) — a flaw in ImageMagick let attackers achieve RCE just by uploading a crafted image. Affected millions of sites that did exactly the right thing (process images server-side) but with a vulnerable processor.
  • Drupalgeddon-era webshells (2014–2018) — one of the most common post-exploitation steps after a Drupal RCE was dropping a webshell into /sites/default/files/, which was reachable from the web.
  • Microsoft Exchange ProxyShell (2021, CVE-2021-34473 chain) — after exploit, attackers wrote .aspx webshells into the Exchange FrontEnd directory, which IIS executed. The webshells gave persistent admin access.
  • Equifax (2017) — not strictly file upload, but the post-exploitation involved writing JSP webshells.

The takeaway

File upload is one of the oldest attack categories on the web. The defense pattern hasn't changed much: store user files outside the web root, generate filenames yourself, re-render content to strip payloads, and verify by parsing rather than by trust. Use object storage for new systems — it makes the structural error impossible by design.

The next time you write an upload handler, the question isn't "what file types do I allow?" The question is "if this file was malicious, what's the worst it could do?" — and your design should make the answer "nothing."

References

Formatted in APA 7.

  1. OWASP. (2024). File upload cheat sheet. OWASP Cheat Sheet Series. https://cheatsheetseries.owasp.org/cheatsheets/File_Upload_Cheat_Sheet.html
  2. MITRE. (2024). CWE-434: Unrestricted upload of file with dangerous type. Common Weakness Enumeration. https://cwe.mitre.org/data/definitions/434.html
  3. Snyk. (2018). Zip slip vulnerability. https://snyk.io/research/zip-slip-vulnerability
  4. ImageMagick. (2016). ImageMagick security policy — CVE-2016-3714 ("ImageTragick"). https://imagetragick.com/