Why PDFs Are a Security Nightmare
I’ve spent a decent amount of time looking at malicious Office documents, especially macro-enabled ones. In school, that was one of the more common examples of document-based malware: open the document, enable macros, and suddenly obfuscated VBA code is hijacking your system. That model is easy enough to explain because the suspicious part is right there in the name: the document has macros, macros can execute code. Obviously, that can go badly.
PDF attack surfaces are generally far less blatant and blend in with typical business workflows. Most people do not think of PDFs as interactive software. They think of them as digital paper. Invoices, resumes, contracts, bank statements, shipping labels, tax forms, court records, school forms, medical paperwork, vendor quotes, tickets, manuals. PDFs are the thing you send when you want someone else to see the same document you saw. That is why they are useful, but it is also why they are such a good attack vector.
A malicious Word document has to answer the question: why does this document need macros? A malicious PDF often does not have to answer anything. It arrives looking like an ordinary business document, gets previewed by the browser or email client, and fits naturally into the workflow.
PDFs are not dangerous because of one weird feature, or because they can have embedded JavaScript or entire embedded files. They are dangerous because they became a mandatory part of online business while accumulating more and more application-like behavior over time. The file extension implies they’re simply a formatted document, but they are so much more. They started as a way to preserve document format and structure but ended up as a giant compatibility machine for rendering, forms, links, annotations, signatures, embedded files, scripts, permissions, compression, media, and whatever else decades of business workflows needed them to become. When viewed from this perspective, the security nightmare is a familiar story: technical debt, design drift, growing implementation complexity, and an attack surface that grows along with all of it. The ubiquity of the file format just makes it more profitable for malicious actors.
The Original Problem Was Practical and Limited in Scope
The original PDF problem was boring in the best possible way: documents did not travel well. In the early 1990s, sending someone a file and expecting it to look the same on their computer was not realistic. Fonts might not exist on the other machine. Layouts could break. Printers behaved differently. Applications were not portable. Operating systems had their own expectations. The idea of a portable document format made sense. A user wants to preserve the visual representation of the document and make it possible to view and print the same thing across different systems. Turning a file into an artifact that could be replicated, like a digital printing press. That was quickly becoming a necessity when the idea was put to paper and is even more applicable today in the age of cloud computing and remote work.
John Warnock’s original Camelot paper describes the problem pretty clearly: there was no universal way to communicate and view printed information across different applications and systems. That goal makes sense. But once the answer is “ship a self-contained representation of the document and have a reader reconstruct it,” you already have enormous hidden complexity in the form of a parser and renderer. This is also where the security problem starts, because a PDF was never just plain text. Even a boring PDF is a structured file that a reader has to parse and interpret. It can contain text, fonts, images, vector graphics, compressed streams, metadata, page objects, cross references, and layout instructions. That complexity only grew as adoption increased.
Feature Creep Became an Attack Surface
At some point, PDF stopped being just a way to preserve a document and became a container for business workflows. Looking at the first openly published spec, the standardized PDF 1.7 specification from 2008, published as ISO 32000-1, is not a tiny document-format spec. It covers a large amount of behavior: objects, streams, encryption, actions, annotations, interactive forms, digital signatures, embedded files, optional content, multimedia, and more. PDF technology also has JavaScript support through ECMAScript for PDF. Adobe’s own Acrobat JavaScript API Reference documents a large API surface for interacting with PDF documents. Need media, layers, external links, launch actions, 3D content, browser integration, email previews, enterprise document workflows, and support for every strange file emitted by every half-broken PDF generator since the 90s? Congratulations, now you have the modern PDF ecosystem. It sounds a lot like the history of email, technology gains widespread adoption, and then becomes a swiss army knife of features and technical debt.
Individually, a lot of these features make sense. Businesses need forms. Legal and finance workflows need signatures. Publishers need reliable rendering. Users need links. Governments and corporations need archival formats. The problem is not that every feature is absurd in isolation. The problem is what happens when all of those features land inside one format that everyone is expected to open.
A modern PDF reader may need to:
- parse a complicated object graph
- decompress multiple object streams
- resolve external and internal references
- render fonts and vector graphics
- process annotations and forms
- validate digital signatures
- enforce permissions
- display external links
- decide what to do with embedded files and external content
- decide whether JavaScript or actions should run
- integrate with browser, email, and operating system workflows
That is a lot of behavior for a file extension that most users consider as simply a print format. Every parser, renderer, compatibility feature, and interactive behavior becomes a place where assumptions can fail. Sometimes that means an actual exploit against the reader. Sometimes it means abusing a feature the reader supports. Sometimes it means using the PDF as a convincing front-end for a phishing flow. The last one is important: a malicious PDF does not need to drop malware or execute shellcode to be successful. It can just look like a normal invoice and send the user to a fake login page.
The Reader Is the Other Half
All of that complexity lives in the file, but the file is only half the story. “PDF support” means so many different things in practice. A PDF might be opened in Adobe Acrobat, Preview on macOS, Chrome, Edge, Firefox, a mobile reader, an email preview pane, an enterprise document system, a PDF library inside a backend service, or some embedded viewer inside another application. The reader decides what the document actually does. Different readers support different features, parse malformed files differently, sandbox differently, expose different UI, handle embedded files differently, and make different decisions around scripting and external links. So “it is just a PDF” is not a useful security statement. The better question is:
A PDF opened by what software, on what system, with what configuration, and by what user?
That is where the trust boundary gets fuzzy. A file format this complex is not just data sitting on disk. It is input to a large interpreter that has to make a lot of decisions. Some of those decisions are security relevant, and many of them are invisible to the user.
This is also why PDFs are a problem beyond desktop malware. Server-side PDF tooling has its own attack surface. If an application accepts uploaded PDFs, generates thumbnails, extracts text, reads metadata, converts documents, or stitches files together, then some PDF parser or renderer is processing attacker-controlled input.
That does not mean every PDF workflow is doomed. It just means PDFs deserve more respect than they usually get.
Why PDFs Work So Well in Phishing
The technical attack surface matters, but the social side is probably more important. PDFs belong in email, it is their natural habitat. People expect invoices as PDFs. They expect contracts as PDFs. They expect school forms, bank notices, government letters, resumes, quotes, receipts, benefits documents, and shipping labels as PDFs. An executable attachment looks suspicious. A random shell script looks suspicious. A macro-enabled Office document now has a reputation due to awareness. A PDF looks like work and most crucially they are everywhere in business workflows, enabling a wide range of social engineering tactics.
Attackers do not need to convince someone to run a program when they can convince them to open a file posing as a document they already open all the time. The PDF can be the payload, but it can also just be the wrapper. It can contain a link to a fake Microsoft login. It can show a “secure document” message. It can use a QR code to move the interaction onto a phone, where the user has less context and probably worse security tooling. It can claim the invoice is protected, expired, encrypted, pending signature, or only available through a portal. No memory corruption required. No sandbox escape required. No clever exploit chain required. Just a normal-looking document in a normal-looking workflow asking the user to take the next normal-looking step.
That is what makes the format so useful in phishing. The PDF gives the attacker a trusted visual surface. It lets them wrap a malicious interaction in something that already belongs in the inbox. MITRE’s ATT&CK knowledge base describes both phishing through attachments and phishing through links as common ways adversaries attempt to gain access, and PDFs fit both patterns well because they can be the attachment and the thing that points the user somewhere else. Check Point’s 2025 write-up on the weaponization of PDFs makes the same point from a detection perspective: PDFs remain useful because they are common in everyday business communication and provide enough room for deceptive behavior. That is not some exotic edge case. That is the point.
PDFs work because they do not need to look weird. They are most effective when they look boring.
Why We Cannot Just Block Them
The annoying part is that the obvious answer does not work. You cannot just block PDFs everywhere.
The entire world needs them: businesses, governments, schools, banks, attorney offices, hospitals, accounting firms, etc. They’re generated for internal corporate workflows and shared with customers, partners, and employees. PDF is infrastructure now. Not great infrastructure, arguably, but infrastructure. PDFs are risky enough to deserve suspicion, but common enough that blocking them outright breaks real work. So instead of treating them as obviously dangerous, organizations tend to treat them as normal documents with some scanning layered on top. Sometimes that is the only realistic option, but it also means the format keeps getting the benefit of the doubt. And attackers understand that perfectly.
This is the same reason so many security problems persist in boring business systems. The risky thing is also the useful thing. Nobody wants to break invoicing, contracts, onboarding, vendor paperwork, school paperwork, tax forms, or internal reporting just because PDFs are a mess. So the mess stays. It gets scanners, warnings, sandboxing, allow lists, user training, and some email filtering bolted around it, but the core workflow remains.
That is not necessarily a criticism. It is just reality. You defend PDFs in layers because removing them is not a real option for most organizations.
The Practical Takeaway
The point is not that every PDF is dangerous, but the mental classification of the file type can lead to irresponsible actions. PDFs should not be mentally categorized as harmless digital paper. They are complex, feature-rich containers interpreted by complex software and embedded into high-trust workflows. That makes them useful. That makes malicious PDFs dangerous.
A better posture is:
- Treat unexpected PDFs as active content containers, not plain documents.
- Be suspicious of PDFs that push you to a login page.
- Be suspicious of “secure document,” “protected invoice,” and “view message” flows.
- Prefer going directly to known portals instead of clicking links inside attachments.
- Keep PDF readers and browsers patched.
- Disable unnecessary scripting where possible.
- Open suspicious files in safer viewers or isolated environments.
- Remember that phishing PDFs often do not need exploits to work.
CISA, NSA, FBI, and MS-ISAC’s joint phishing guidance is broader than PDFs, but the defensive posture maps well here: reduce the chance of successful execution, harden email and browser paths, report suspicious messages, and avoid treating document workflows as inherently safe.
The Cost of Success
PDFs became a security nightmare because they succeeded. They solved a real portability problem, became mandatory business infrastructure, and then kept absorbing features until opening a “document” meant trusting a surprisingly large amount of software behavior. That is the uncomfortable part. The PDF still looks like paper. The security model does not.
References
- John Warnock, The Camelot Project
- Adobe / ISO, ISO 32000-1:2008, Document management: Portable document format, Part 1: PDF 1.7
- PDF Association, ISO 21757, ECMAScript for PDF
- Adobe, Acrobat JavaScript API Reference
- MITRE ATT&CK, Phishing: Spearphishing Attachment
- MITRE ATT&CK, Phishing: Spearphishing Link
- Check Point Research, The Weaponization of PDFs
- CISA, NSA, FBI, and MS-ISAC, Phishing Guidance: Stopping the Attack Cycle at Phase One