1 Trillion Dollar Refund – How To Spoof PDF Signatures

The Portable Document Format (PDF) is the de-facto standard for document exchange worldwide. To guarantee the authenticity and integrity of documents, digital signatures are used. Several public and private services, ranging from governments and public enterprises to banks and payment services, rely on the security of PDF signatures. In this paper, we present the first comprehensive security evaluation of digital signatures in PDFs. We introduce three novel attack classes which bypass the cryptographic protection of digitally signed PDF files, allowing an attacker to spoof the content of a signed PDF. We analyzed 22 different PDF viewers and found 21 of them to be vulnerable, including prominent and widely used applications such as Adobe Reader DC and Foxit. We additionally evaluated eight online validation services and found six to be vulnerable. A possible explanation for these results is the absence of a standard algorithm to verify PDF signatures: each client verifies signatures differently, and attacks can be tailored to these differences. We therefore propose the standardization of a secure verification algorithm, which we describe in this paper. All findings have been responsibly disclosed, and we supported the affected vendors in fixing the issues. As a result, three generic CVEs, one for each attack class, were issued [50–52]. Our research on PDF signatures and more information is also available online at https://www.pdf-insecurity.org/.


INTRODUCTION
Introduced in 1993 by Adobe Systems, the Portable Document Format (PDF) was designed as a solution for the consistent presentation of documents, independent of the operating system and hardware. Today, the PDF format has become the standard for electronic documents in our daily workflow. The total number of PDF files in the world is hard to guess, but according to Adobe Systems' Vice President of Engineering, Phil Ydens, there were about 1.6 billion PDF files on the web in 2015 [3], whereby 80% were created in the same year. This leads him to estimate that about 2.5 trillion PDF files were created since 2015. Whether this is correct or not, PDF files are heavily used in everyone's life: for exchanging information, for creating and archiving invoices and contracts, for submitting scientific papers, or for collaborating and reviewing texts.
PDF Digital Signatures. The PDF specification has supported digital signatures since 1999 in order to guarantee that the document was created or approved by a specific person and that it was not altered afterward. PDF digital signatures are based on asymmetric cryptography, whereby the signer possesses a public and private key pair. The signer uses the private key to create the digital signature. Any document modification afterward invalidates the signature and leads to an error message shown by the corresponding PDF viewer or validation service. PDF digital signatures must not be confused with electronic signatures, which are the electronic equivalent of handwritten signatures and are typically created by adding an image of the signer's handwritten signature to the document. Electronic signatures do not provide any cryptographic protection, so spoofing attacks are trivial and not considered further.
In 2000, President Bill Clinton enacted a federal law facilitating the use of electronic and digital signatures in interstate and foreign commerce by ensuring the validity and legal effect of contracts. He even approved the eSign Act by digitally signing it [35]. Since 2014, organizations delivering public digital services in an EU member state are required to support digitally signed documents, which are even admissible as evidence in legal proceedings [48]. In Austria, every governmental authority digitally signs any document [17, §19]. In addition, any new law is legally valid after its announcement within a digitally signed PDF. Several countries like Brazil, Canada, the Russian Federation, and Japan also use and accept digitally signed documents [53]. Outside governmental services, digitally signed PDFs are used by the private sector to sign invoices and contracts, e.g., invoices by Amazon, Decathlon, and Sixt; many more contracts are concluded privately between companies. Even in the academic world, PDF signatures are used to sign scientific papers (e.g., ESORICS proceedings) as evidence of the paper's submission state. According to Adobe Sign, the company processed 8 billion electronic and digital signatures in 2017 alone [1]. We thus raise the question: Is it possible to spoof a digitally signed PDF document in a way such that the spoofed document is indistinguishable from a valid one?
Figure 1: Validly signed PDF document by Amazon with spoofed content. Adobe Acrobat DC claims that the 'document has not been modified since the signature was applied'.
Novel Attacks on PDF Signatures. In this paper, we show how to spoof a digitally signed PDF document. The only requirement of our attacks is access to a signed PDF (e.g., an Amazon invoice). Given such a PDF, our attacks allow an attacker to change the PDF's content arbitrarily without invalidating its signature (see Figure 1). Plausible attack scenarios which may abuse the vulnerabilities could include, for example, the manipulation of the billing date on the digitally signed receipt to extend the warranty period of a product or changing the contract's information to attain more resources than agreed upon.
We systematically analyze the verification process of PDF signatures in different desktop applications as well as in server implementations, and we introduce three novel attack classes (see Figure 2). Each of them gives a blueprint for an attacker to modify a validly signed PDF file in such a way that, for the targeted viewer, the displayed content is altered without being detected by the viewer's signature verification code: all elements in the GUI related to signature verification are identical to those of the original, unaltered document.
On a technical level, each attack class abuses a different step in the signature validation logic.
(1) The Universal Signature Forgery (USF) manipulates meta information in the signature in such a way that the targeted viewer application opens the PDF file, finds the signature, but is unable to find all necessary data for its validation. Instead of treating the missing information as an error, it shows that the contained signature is valid.
(2) The Incremental Saving Attack (ISA) abuses a legitimate feature of the PDF specification, which allows updating a PDF file by appending the changes. The feature is used, for example, to store PDF annotations or to add new pages while editing the file. The main idea of the ISA is to use the same technique to change elements included in the signed PDF file, such as texts or whole pages, to whatever the attacker desires. The PDF specification does not forbid this, but the signature validation should indicate that the document has been altered after signing. We introduce four variants of ISA that mask the modification without raising any warning that the document was manipulated.
(3) The Signature Wrapping Attack (SWA) targets the signature validation logic by relocating the originally signed content to a different position within the document and inserting new content at the allocated position. We introduce three different variants of SWA which we used to bypass the signature validation.
Large-Scale Evaluation. We provide the first large-scale evaluation covering 22 different PDF viewers installed on Windows, Linux, or MacOS. We systematically analyzed the security of the signature validation on each of them and found signature bypasses in 21 of the 22 viewers, including Adobe Reader DC and Foxit. Additionally, we analyzed eight online validation services supporting signature verification of signed PDF files. We found six of them to be vulnerable to at least one of the attacks, including DocuSign, one of the worldwide leading cloud services providing electronic signatures and ranked #4 on the Forbes Cloud 100 [15]. We attribute these results to two factors: (1) There is almost no related work regarding the security of digitally signed PDF files, even though integrity protection has been part of the PDF specification since 1999. (2) The PDF specification does not provide an implementation guideline or a best-practices document regarding signature validation. Thus, developers implement a security-critical component without having a thorough understanding of the actual risks.
Contributions. The contributions of this paper are:
• We developed three novel attack classes on PDF signatures. Each class targets a different step in the signature validation process and enables an attacker to bypass a PDF's integrity protection completely, as shown in section 4.
• We provide the first in-depth security analysis of PDF applications. The results are alarming: out of 22 popular desktop viewers, we could bypass the signature validation in 21 cases, as seen in subsection 5.1.
• We additionally analyzed eight online validation services used within the European Union and worldwide for validating signed documents. We could bypass the signature validation in six cases, as shown in subsection 5.2.
• Based on our experiences, we developed a secure signature validation algorithm and communicated it to the application vendors during the responsible disclosure process, as seen in section 6.
• By providing the first in-depth analysis of PDF digital signatures, we pave the road for future research. We reveal new insights and show novel research aspects regarding PDF security, as shown in section 8.
Responsible Disclosure. In cooperation with the BSI-CERT, we contacted all vendors, provided proof-of-concept exploits, and helped them to fix the issues. As a result, three generic CVEs, one for each attack class and covering all affected vendors, were issued [50][51][52].

PDF BASICS
This section deals with the foundations of the Portable Document Format (PDF). We give an overview of the file structure and explain how the PDF standard for signatures is implemented.

Portable Document Format (PDF)
A PDF consists of 4 parts: header, body, xref table, and a trailer, as depicted in Figure 3.
Header. The header is the first line within a PDF and defines the interpreter version to be used. The provided example uses version PDF 1.7.
Body. The body specifies the content of the PDF and contains text blocks, fonts, images, and metadata regarding the file itself. The main building blocks within the body are objects, which have the following structure: Each object starts with an object number followed by a generation number. The generation number should be incremented if additional changes are made to the object. In the example depicted in Figure 3, the body contains four objects: Catalog, Pages, Page, and stream. The Catalog object is the root object of the PDF file. It defines the document structure and can additionally declare access permissions. The Catalog refers to a Pages object which defines the number of the pages and a reference to each Page object (e.g., text columns). The Page object contains information on how to build a single page. In the given example, it only contains a single string object "Hello World!".
Figure 3: A simplified example of a PDF file's internal structure. We depict the object names after the obj string for clarification.

Xref table. The Xref table contains information about all PDF objects. An Xref table can contain one or more sections.
• Each Xref table section starts with a line consisting of two integer entries a b (e.g., "0 5" as shown in Figure 3), which indicates that the following b = 5 lines of the Xref table describe the objects with IDs (also known as object numbers) a, ..., a + b − 1, i.e., 0, ..., 4.
• Each object entry in the Xref table has three fields x y z, where x defines the byte offset of the object from the beginning of the document, y defines its generation number, and z ∈ {'n', 'f'} describes whether the object is in use ('n') or not ('f', for "free"). For example, the line "0000000060 00000 n" is the third line after "0 5" and thus describes the in-use object with object number 2 and generation number 0 at byte offset 60 (see "2 0 obj" in Figure 3).
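The entry layout described above can be illustrated with a minimal Python sketch. The helper name parse_xref_section and the sample offsets are hypothetical; only the section header "0 5" and the entry for object 2 are taken from the text, and real Xref tables need more careful handling (multiple sections, fixed-width entries, line endings).

```python
from typing import List, NamedTuple


class XrefEntry(NamedTuple):
    object_number: int
    offset: int
    generation: int
    in_use: bool


def parse_xref_section(lines: List[str]) -> List[XrefEntry]:
    """Parse one Xref section: a header line 'a b' followed by b entry lines 'x y z'."""
    first, count = (int(value) for value in lines[0].split())
    entries = []
    for index, line in enumerate(lines[1:1 + count]):
        offset, generation, flag = line.split()
        if flag not in ("n", "f"):
            raise ValueError(f"unexpected entry type: {flag!r}")
        # The i-th entry after the header describes object number a + i.
        entries.append(XrefEntry(first + index, int(offset), int(generation), flag == "n"))
    return entries


# Hypothetical section: all offsets except the one of object 2 are made up.
section = [
    "0 5",
    "0000000000 65535 f",
    "0000000010 00000 n",
    "0000000060 00000 n",
    "0000000120 00000 n",
    "0000000200 00000 n",
]
entries = parse_xref_section(section)
print(entries[2])
```

The third entry after the header is reported as object number 2 with generation number 0 at byte offset 60, matching the example in the text.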
Trailer. After a PDF file is read into memory, it is processed from the end to the beginning. Thus, the Trailer is the first processed content of a PDF file. It contains references to the Catalog (1 0 R) and the Xref table.
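As a rough illustration of reading a file from its end, the following Python sketch locates the last trailer and extracts the two references it holds. It assumes the standard PDF keywords trailer, startxref, and /Root; the function name read_trailer and the sample bytes (including the offset 300) are hypothetical, and the sketch ignores cross-reference streams and malformed input.

```python
import re
from typing import Tuple


def read_trailer(pdf: bytes) -> Tuple[int, str]:
    """Return (Xref table offset, Catalog reference) from the last trailer in the file."""
    start = pdf.rfind(b"startxref")
    if start == -1:
        raise ValueError("no startxref keyword found")
    # The integer after 'startxref' is the byte offset of the Xref table.
    xref_offset = int(pdf[start + len(b"startxref"):].split()[0])
    trailer_pos = pdf.rfind(b"trailer", 0, start)
    if trailer_pos == -1:
        raise ValueError("no trailer keyword found")
    match = re.search(rb"/Root\s+(\d+\s+\d+\s+R)", pdf[trailer_pos:start])
    if match is None:
        raise ValueError("trailer has no /Root entry")
    return xref_offset, match.group(1).decode("ascii")


# Hypothetical, heavily shortened file content.
sample = (
    b"%PDF-1.7\n"
    b"(objects and Xref table omitted)\n"
    b"trailer\n<< /Size 5 /Root 1 0 R >>\n"
    b"startxref\n300\n%%EOF\n"
)
xref_offset, catalog_ref = read_trailer(sample)
print(xref_offset, catalog_ref)
```

Searching backwards with rfind mirrors the end-to-beginning processing: if a file contains several trailers, the last one is found first.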

Creating a PDF Signature
This section explains how a digitally signed PDF file is built.
Incremental Saving. PDF Signatures rely on a feature of PDF called incremental saving (also known as incremental updates), allowing the modification of a PDF file without changing the previous content.
Figure 4 illustrates incremental saving.
Structure of a Signed PDF. The creation of a digital signature on a PDF file relies on incremental saving by extending the original document with objects containing the signature information.
In Figure 5, an example of a signed PDF file is shown. The original document is the same document as depicted in Figure 3. By signing the document, an incremental saving is applied and the following content is added: a new Catalog, a Signature object, a new Xref table referencing the new object(s), and a new Trailer. The new Catalog extends the old one by adding a new parameter Perms, which defines the restrictions for changes within the document. The Perms parameter references the Signature object.
The Signature object (5 0 obj) contains information regarding the applied cryptographic algorithms for hashing and signing the document. It additionally includes a Contents parameter containing a hex-encoded PKCS#7 blob, which holds the certificates as well as the signature value created with the private key that corresponds to the public key stored in the certificate. The ByteRange parameter defines which bytes of the PDF file are used as the hash input for the signature calculation and defines two integer tuples:
• (a, b): Beginning at byte offset a, the following b bytes are used as the first input for the hash calculation. Typically, a = 0 is used to indicate that the beginning of the file is used, while a+b is the byte offset where the PKCS#7 blob begins.
• (c, d): Typically, byte offset c is the end of the PKCS#7 blob, while c+d points to the last byte of the PDF file. The d bytes beginning at offset c are used as the second input to the hash calculation.
According to the specification, it is recommended to sign the whole file except for the PKCS#7 blob (located in the range between a+b and c) [21].
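The role of the four ByteRange integers can be sketched in a few lines of Python. This is an illustrative toy, not a signature verifier: the file content is made up, the helper names signed_bytes and unprotected_gap are hypothetical, and SHA-256 is assumed only for the sake of the example.

```python
import hashlib
from typing import Tuple

ByteRange = Tuple[int, int, int, int]


def signed_bytes(pdf: bytes, byte_range: ByteRange) -> bytes:
    """Concatenate the two hashed parts [a, a+b) and [c, c+d)."""
    a, b, c, d = byte_range
    return pdf[a:a + b] + pdf[c:c + d]


def unprotected_gap(byte_range: ByteRange) -> Tuple[int, int]:
    """Return the offsets [a+b, c) that are excluded from the hash input."""
    a, b, c, _d = byte_range
    return a + b, c


# Toy file: everything except the hex-encoded blob is meant to be hashed.
head = b"%PDF-1.7\n(objects omitted)\n/Contents "
blob = b"<00112233>"
tail = b"\n(rest of the signing update)\n%%EOF\n"
pdf = head + blob + tail
byte_range = (0, len(head), len(head) + len(blob), len(tail))

digest = hashlib.sha256(signed_bytes(pdf, byte_range)).hexdigest()
gap_start, gap_end = unprotected_gap(byte_range)
print(digest)
print(pdf[gap_start:gap_end])
```

In this layout the gap holds exactly the blob and c+d equals the file length, which corresponds to the recommendation to sign the whole file except for the PKCS#7 blob.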

Verifying a signed PDF File
If a signed PDF file is opened with a desktop application that supports signatures, the application immediately starts to verify it by: (1) extracting the signature from the PDF and applying the cryptographic operations to verify its correctness and (2) verifying whether the signing key is trusted, e.g., via an X.509 certificate. One thing that all applications had in common is that by default, they do not trust the operating system's keystore. Similar to web browsers such as Firefox, they distribute their own keystore and keep the list of trusted certificates up to date. Additionally, every viewer allows the utilization of a different keystore containing trusted certificates. This feature is interesting for companies using their own Certificate Authority (CA) and disallowing the usage of any other CA. As a result, similar to key pinning, the viewer can be configured to trust only specific certificates.

ATTACKER MODEL
In this section, we describe the attacker model including the attacker's capabilities and the winning conditions.
Victim. A victim can be either a human who opens the file using a certain PDF desktop application or a website offering an online validation service.
Figure 6: PDF signature validation with two UI-Layers. (a) A screenshot of Adobe Acrobat DC after opening a signed PDF document. A signature validation bar (UI-Layer 1) is automatically shown. A signature panel (UI-Layer 2) can be opened by pressing the corresponding button. The panel provides more details, e.g., the error message or email address of the signer. (b) There are three validation states: (1.) A green icon indicates a valid and trusted signature. (2.) If the icon appears in yellow, the key used to sign the PDF is untrusted, e.g., because a self-generated certificate is used. (3.) The red icon indicates an invalid signature, e.g., if the PDF file is modified.
Attacker Capabilities. It is assumed that the attacker is in possession of a signed PDF file. The attacker does not possess the private key that was used to sign it. Also, we assume that the victim only trusts specific certificates (e.g., via the trust store) and the attacker does not possess a single private key that is trusted by the victim. Thus, malicious PDF files which are digitally signed by the attacker with a self-generated or untrusted certificate will not be verified successfully by the viewer. Apart from this restriction, the attacker can arbitrarily modify the PDF file, for example, by changing the displayed content.
The attacker finally sends the modified PDF file to a victim, where the file is then processed.
Winning Conditions. For the successful execution of an attack, we define two conditions:
Cond. 1) When opening the PDF file, the target application, i.e., the viewer or online service, shows a UI displaying that the file is validly signed, and this UI is identical to the one shown for the original, unmodified signed PDF file.
Cond. 2) The viewer application displays content which is different from the original file.
For viewer applications, both winning conditions must be met. For the online validation services, only the first condition must be fulfilled because online services do not show the content of a PDF file. Instead, they generate a report containing the results of the verification, see Figure 11. Therein, the services show whether the PDF file is validly signed.
Desktop viewer applications differ substantially in displaying the results of the signature verification. To classify whether an attack is successful and to determine whether the victim could detect the attack, we defined two different UI-Layers:
• UI-Layer 1 represents the UI information regarding the signature validation which is immediately displayed to the user after opening the PDF file. It is shown without any user interaction. Examples for Adobe Acrobat DC UI-Layer 1 are presented in the top part of the purple box in Figure 6.
• UI-Layer 2 provides extended information regarding the signature validation. It can be accessed by clicking on the respective menu option. Examples for Adobe Acrobat DC UI-Layer 2 are displayed in the bottom-left part of the green box in Figure 6. If the information presented on the UI-Layer 2 states that the signature is invalid or the document has been modified after the application of the signature, the attack can still be classified as successful for UI-Layer 1.
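One way to read the winning conditions together with the two UI-Layers is as a small decision procedure. The following Python sketch is only our illustration of that reading; the Observation type, its field names, and the result labels are hypothetical and not part of the evaluation tooling.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    layer1_shows_valid: bool  # UI-Layer 1 looks as it does for the original, validly signed file
    layer2_shows_valid: bool  # UI-Layer 2 reports no error and no modification
    content_differs: bool     # displayed content differs from the original (Cond. 2)


def classify_viewer_attack(obs: Observation) -> str:
    """Classify one test against a desktop viewer; both conditions must hold."""
    if not (obs.layer1_shows_valid and obs.content_differs):
        return "not successful"
    if obs.layer2_shows_valid:
        return "successful on UI-Layer 1 and UI-Layer 2"
    return "successful on UI-Layer 1 only"


def classify_service_attack(report_says_valid: bool) -> str:
    """Online services display no content, so only Cond. 1 applies."""
    return "successful" if report_says_valid else "not successful"


print(classify_viewer_attack(Observation(True, False, True)))
print(classify_service_attack(True))
```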
In Figure 6, an example of a successful signature validation on UI-Layer 1 and UI-Layer 2 is presented. After opening the PDF file, the information Signed and all signatures are valid is displayed. Further information is revealed by clicking on the Signature Panel and can be seen in the green box of UI-Layer 2.
Self-Signed PDFs. We do not consider self-signed PDFs a legitimate attack and neither use nor rely on them, because a self-signed PDF can clearly be distinguished from a PDF signed with a trusted certificate; cf. the green and yellow icons in Figure 6.

HOW TO BREAK PDF SIGNATURES
In this section, we present three novel attack classes on PDF signatures: Universal Signature Forgery (USF), Incremental Saving Attack (ISA), and Signature Wrapping Attack (SWA). All attack classes bypass the PDF's signature integrity protection, allowing the modification of the content arbitrarily without the victim noticing. The attacker's goal is to place malicious content into the protected PDF file, such that the previously defined winning conditions for viewer applications and online validation services are satisfied.
During the security analysis, we designed many broken PDF files for each attack class. These files deliberately violate the PDF specification in order to bypass the signature verification process.
We also learned that nearly every PDF viewer has a high level of error-tolerance so that these PDF files could be successfully opened even if required parameters are missing. We can only assume that this is due to the individual interpretation of the PDF specification by each vendor.

Universal Signature Forgery (USF)
The main idea of Universal Signature Forgery (USF) is to disable the signature verification while the viewer application still shows a successful validation on the UI layer. This attack class was inspired by existing attacks applied to other message formats like XML [42] and JSON [33]. Such attacks either remove all signatures or use insecure algorithms like none in JSON signatures. For PDFs, we considered two possible approaches: either to remove information within the signature, which makes the validation impossible, or to remove references to the signature to avoid the validation. Removing references did not lead to any successful attack. Thus, we concentrated on manipulations within the signature. In this case, the attacker manipulates the signature object in the PDF file, trying to create an invalid entry within this object. Although the signature object is provided, the validation logic is not able to apply the correct cryptographic operations. As a result, a viewer shows some signature information even though the verification is skipped. In the end, we defined 24 different attack vectors, eight of which are depicted in Figure 7. In the given example, the attack vectors target two values: a) the entry Contents, which contains the key material as well as the signature value, and b) the entry ByteRange, which defines the signed content in the file. We manipulate these entries because doing so removes either the signature value or the information stating which content is signed. In Variant 1, as depicted in Figure 7, either Contents or ByteRange is removed from the signature object. Another possibility is defined in Variant 2 by removing only the content of the entries. In Variants 3 and 4, invalid values were specified and tested. Such values are, for instance, null, a zero byte (0x00), and invalid ByteRange values like negative or overlapping byte ranges.
Providing such tests is common for penetration testers since many implementations behave abnormally when processing these special byte sequences.
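The manipulations described for USF can be sketched as simple text substitutions on a signature dictionary. The Python code below is a toy under loud assumptions: the dictionary string, the regular expressions, and the variant labels are hypothetical, it does not reproduce the 24 attack vectors or the exact files of Figure 7, and a real test file would additionally need consistent offsets.

```python
import re
from typing import Dict

# Hypothetical, heavily simplified signature dictionary.
SIGNATURE = "<< /Type /Sig /Contents <3082abcd> /ByteRange [0 100 200 300] >>"

# Patterns matching the two targeted entries in this simplified form.
ENTRY_PATTERNS = {
    "Contents": r"/Contents\s*<[0-9A-Fa-f]*>",
    "ByteRange": r"/ByteRange\s*\[[^\]]*\]",
}


def usf_variants(signature: str, entry: str) -> Dict[str, str]:
    """Return manipulated copies of the dictionary for one targeted entry."""
    replacements = {
        "entry removed": "",
        "value removed": f"/{entry}",
        "value null": f"/{entry} null",
        "value zero byte": f"/{entry} \x00",
    }
    if entry == "ByteRange":
        replacements["negative range"] = "/ByteRange [0 -100 200 300]"
        replacements["overlapping ranges"] = "/ByteRange [0 250 200 300]"
    pattern = ENTRY_PATTERNS[entry]
    return {
        name: re.sub(pattern, lambda _match, text=text: text, signature, count=1)
        for name, text in replacements.items()
    }


for name, manipulated in usf_variants(SIGNATURE, "ByteRange").items():
    print(f"{name}: {manipulated!r}")
```

Each generated dictionary still looks like a signature object but lacks usable data for one entry, which is the situation a robust validator should treat as an error rather than as a valid signature.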

Incremental Saving Attack (ISA)
This class of attack relies on the incremental saving feature. The idea of the attack is to make an incremental saving on the document by redefining the document's structure and content using the Body Updates part. The digital signature within the PDF file protects precisely the part of the file defined in the ByteRange. Since the incremental saving appends the Body Updates to the end of the file, it is not part of the defined ByteRange and thus not part of the signature's integrity protection. To summarize, the signature remains valid, although the Body Updates changed the displayed content.
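The core observation can be reproduced with a toy file: bytes appended after the signed ranges do not influence the hash input. The Python sketch below assumes SHA-256 and a made-up file layout purely for illustration; the helper name signature_digest is hypothetical.

```python
import hashlib
from typing import Tuple


def signature_digest(pdf: bytes, byte_range: Tuple[int, int, int, int]) -> str:
    """Hash only the bytes selected by ByteRange [a b c d]."""
    a, b, c, d = byte_range
    return hashlib.sha256(pdf[a:a + b] + pdf[c:c + d]).hexdigest()


# Toy signed file; the ByteRange covers everything except the blob.
head = b"%PDF-1.7\n(Hello World!)\n/Contents "
blob = b"<00112233>"
tail = b"\n(Xref table and Trailer omitted)\n%%EOF\n"
signed_pdf = head + blob + tail
byte_range = (0, len(head), len(head) + len(blob), len(tail))

# Incremental saving by the attacker: Body Updates appended to the end of the file.
body_update = b"(Hello Attacker!)\n(Xref table and Trailer omitted)\n%%EOF\n"
manipulated_pdf = signed_pdf + body_update

digest_unchanged = (
    signature_digest(signed_pdf, byte_range) == signature_digest(manipulated_pdf, byte_range)
)
bytes_after_signed_range = len(manipulated_pdf) - (byte_range[2] + byte_range[3])
print(digest_unchanged, bytes_after_signed_range)
```

The digest comparison succeeds although the file grew, so a validator has to compare c+d with the actual file length (or track later updates in some other way) to notice the appended content.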

Figure 8: Bypassing the signature protection by using incremental saving. In (1), the main idea of the attack is depicted, while (2)-(4) are variants that obfuscate the manipulations and prevent a viewer from displaying warnings.
During our research, we developed four variants of ISA. These variants exist because some vendors recognized that incremental saving is dangerous with respect to PDF signatures and implemented countermeasures to detect changes made after the document was signed. As part of our black-box analysis, we were able to determine these countermeasures and find generic bypasses that worked for multiple viewers, which we describe below.
Variant 1: ISA with Xref table and Trailer. For Variant 1 of the ISA class, as depicted in Figure 8, only two of the evaluated signature validators were susceptible to the attack. This is not very surprising since this type of modification is exactly what a legitimate PDF application would do when editing or updating a PDF file. A digital signature in PDF is designed to protect against this behavior; the signature validator recognizes that the document was updated after signing it and shows a corresponding warning. To bypass this detection, we found two possibilities. (1) We included an empty Xref table. This can be interpreted as a sign that no objects are changed by the last incremental saving. Nevertheless, the included updates are processed and displayed by the viewer. (2) We used an Xref table that contains entries for all manipulated objects. We additionally added one entry which has an incorrect reference (i.e., byte offset) pointing to the transform parameters dictionary, which is part of the signature object. The result of these manipulations is that the viewer application does not detect the last incremental saving. No warning is shown that the document has been modified after signing it, but the PDF viewer displays the new objects.
Variant 4: ISA with a copied signature and without an Xref table and Trailer. The previous manipulation technique was improved by copying the Signature object within the last incremental saving. This improvement was necessary because some validators require any incremental saving to contain a signature object if the original document was signed. Otherwise, they showed a warning that the document was modified after the signing.
By copying the original Signature object into the latest incremental saving, this requirement is fulfilled. The copied Signature object, however, covers the old document instead of the updated part. To summarize, a vulnerable validator does not verify whether each incremental saving is signed, but only if it contains a signature object. Such verification logic is susceptible to ISA.
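The difference between the two checks can be made concrete with a small model. In the Python sketch below, the IncrementalSaving type, its fields, and the offsets 1000 and 1400 are hypothetical; the model only records where a saving ends and up to which offset a contained Signature object's ByteRange reaches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncrementalSaving:
    end_offset: int                        # offset one past the last byte of this saving
    signature_covers_up_to: Optional[int]  # c + d of a contained Signature object, or None


def contains_signature_object(saving: IncrementalSaving) -> bool:
    """The flawed check: is there any Signature object in the saving?"""
    return saving.signature_covers_up_to is not None


def is_actually_signed(saving: IncrementalSaving) -> bool:
    """A stricter check: does the contained signature reach the end of this saving?"""
    return saving.signature_covers_up_to == saving.end_offset


signing_update = IncrementalSaving(end_offset=1000, signature_covers_up_to=1000)
# The attacker's saving carries a copy of the original Signature object,
# whose ByteRange still ends at offset 1000.
attacker_update = IncrementalSaving(end_offset=1400, signature_covers_up_to=1000)

print(contains_signature_object(attacker_update), is_actually_signed(attacker_update))
```

The copied Signature object satisfies the first check but not the second, because it covers only the old document.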

Signature Wrapping Attack (SWA)
The Signature Wrapping Attack (SWA) introduces a novel technique to bypass signature protection without using incremental saving. During our research, we observed that the part of the document containing the signature value is excluded from the signature computation and thus is not integrity protected. The ByteRange defines the exact size of this unprotected space. Consequently, we focused on manipulations of the ByteRange entries that increase the size of the unprotected space and allow the injection of malicious content.
The main idea is to move the signed part of the PDF to the end of the document while reusing the xref pointer within the signed Trailer to an attacker manipulated Xref table. To avoid any processing of the relocated part, it can be optionally wrapped by using a stream object or a dictionary. We distinguish two variants of SWA.
Variant 1: Relocating the second hashed part. Each ByteRange entry of the Signature object defines two hashed parts of the document. The first variant of the attack relocates only the second hashed part. In Figure 9, two documents are depicted. On the left side, a validly signed PDF file is depicted. The first hashed part begins at byte offset a and ends at offset a+b; the second hashed part ranges from offset c to c+d. On the right side, a manipulated PDF file is generated by using SWA as follows:
Step 1 (optional): The attacker deletes the padded zero bytes within the Contents parameter to increase the available space for injecting manipulated objects.
Step 2: The attacker defines a new /ByteRange [a b c* d] by manipulating the c value, which now points to the second hashed part at its new position in the document.
Step 6: The attacker moves the signed content defined by c and d to byte offset c*. Optionally, the moved content can be encapsulated within a stream object. Note that the manipulated PDF file does not end with %%EOF after the endstream. This was necessary because some validators show a warning that the file was manipulated after signing if an %%EOF follows the end of the signed document (byte offset of EOF > c+d). To bypass this check, the PDF file is not correctly closed. However, it will still be processed by any viewer.
Variant 2: In Step 6, the attacker copies the first hashed part (byte offsets a to a+b) concatenated with the second hashed part (byte offsets c to c+d) to byte offset c*. The algorithm is based on our evaluation result that all tested viewer applications verified whether the first entry of the /ByteRange equals zero. This makes it impossible to move the first hashed part to an arbitrary position, because doing so would require a > 0 and leads to a warning. For this reason, we used the trick of concatenating both hashed parts into a single unit. By this means, the value of a could remain zero.
Surprisingly, no viewer verified whether b > 0, but even in such a case, we can apply SWA. A slightly different Variant 2* can be created by using the fact that every PDF file starts with %PDF- followed by the specified interpreter version, e.g., 1.7. Therefore, a first hashed part with a = 0 and b = 5 can always be used. A comparison of all SWA variants is depicted in Figure 10.
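To illustrate why relocating the second hashed part keeps the hash input unchanged, and which sanity checks on ByteRange would expose the enlarged gap, consider the following Python sketch. The checks are our own assumptions for illustration, not a complete or authoritative validation algorithm; the file layout is a toy that passes the ByteRange as a parameter and ignores where the /ByteRange entry itself is stored.

```python
import re
from typing import List, Tuple

HEX_STRING = re.compile(rb"<[0-9A-Fa-f]*>")


def byte_range_findings(pdf: bytes, byte_range: Tuple[int, int, int, int]) -> List[str]:
    """Collect reasons why a ByteRange [a b c d] looks suspicious for this file."""
    a, b, c, d = byte_range
    findings = []
    if a != 0:
        findings.append("first hashed part does not start at offset 0")
    if b <= 0:
        findings.append("first hashed part is empty")
    if c + d != len(pdf):
        findings.append("second hashed part does not end at the end of the file")
    if not HEX_STRING.fullmatch(pdf[a + b:c]):
        findings.append("unprotected gap holds more than a single hex string")
    return findings


head = b"%PDF-1.7\n1 0 obj (Hello World!) endobj\n/Contents "
blob = b"<00112233>"
tail = b"\n(rest of the signing update)\n%%EOF\n"
original = head + blob + tail
original_range = (0, len(head), len(head) + len(blob), len(tail))

# Wrapping: the second hashed part is moved back and content is injected
# into the enlarged gap; c is replaced by the new offset c*.
injected = b"\n1 0 obj (Hello Attacker!) endobj\n"
wrapped = head + blob + injected + tail
wrapped_range = (0, len(head), len(head) + len(blob) + len(injected), len(tail))

print(byte_range_findings(original, original_range))
print(byte_range_findings(wrapped, wrapped_range))
```

Both files yield the same hash input (head followed by tail), yet only the wrapped file is flagged, because its gap contains more than the hex-encoded blob.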

EVALUATION
In this section, we present the results of our evaluation. We applied various manipulations based on the three presented attack classes to a validly signed test document. Afterward, we conducted black-box security tests to evaluate whether native applications or online validation services in the scope of this paper can be successfully attacked using our attack classes.

Applications
In the first phase of our evaluation, we searched for desktop applications validating digitally signed PDF files. We analyzed the security of their signature validation process against our three attack classes.
The 22 applications listed in Table 1 fulfill these requirements. We evaluated the latest versions of the applications on all supported platforms (Windows, MacOS, and Linux).
Results. During our evaluation, we identified vulnerabilities in 21 of the 22 evaluated applications. These vulnerabilities allow us to bypass the document integrity protection provided by the signature completely and to manipulate the displayed content of signed PDF files. There was only one application which could not successfully be attacked: the last Linux version of Adobe Reader (9.5.5) which was released in 2013. All other applications could be successfully attacked using at least one attack vector. The SWA class turned out to be the most successful. It led to successful attacks on 17 applications, while ISA could be used to successfully attack half of the evaluated applications and USF succeeded for four applications.
In the following section, we present interesting results as an example for each attack class. The complete results are depicted in Table 1.
Universal Signature Forgery. USF attacks were successful against four applications. Notably, two of these applications are Adobe Acrobat Reader DC and Adobe Reader XI. Surprisingly, these two applications are not vulnerable to any attack we evaluated except the USF attack. To bypass the protection of the applied digital signature in Adobe Acrobat Reader DC and Adobe Reader XI, an attacker only needs to remove the /ByteRange entry of the signature object, which specifies the part of the document protected by the signature, or replace its value with null. Afterwards, the attacker can arbitrarily change the displayed content of the document. Nevertheless, both applications showed a blue banner stating that the document is "Signed and all signatures are valid". The applications also informed the user in the signature panel that the document "has not been modified since the signature was applied" although the manipulated content was displayed.
Incremental Saving Attack. By using the ISA class, it was possible to attack 11 of the 22 evaluated applications successfully. For example, PDF Studio Viewer 2018 and Perfect PDF 10 Premium inform the user that the document has been changed after the application of the signature when a regular incremental saving is applied to a signed document. However, it is sufficient to delete the Xref table and trailer of the incremental saving and add the keyword startxref as a comment at the end of the file to create a successful attack for these applications. When the manipulated document is opened, the applications display the exchanged content but still inform the user that the applied signature is valid and the document has not been changed since it was applied.
We found two even easier bypasses of the document integrity protection for LibreOffice. Both bypasses are based on Variant 1 of the ISA class, whose structure is very similar to regular incremental saving. The manipulated files both contain body updates, a new Xref  . Some of these test files are called "Revision 1" and some are called "Revision 2" in the signature panel. The application's behavior when the "View Signed Version" option in the signature panel is selected differs for these two revisions. For all files called Revision 1, the option opens a new tab showing the original file's content ("Hello World!" for our test files). However, for all files called Revision 2, the opened tab also displays the manipulated content ("Hello Attacker!"), and the signature panel now states that "after adding the signature, the document has not been modified". This implies to the user that the opened document has been altered after the signature was applied; nevertheless, the content displayed in the new tab after clicking on "View Signed Version" is the original file content. These attacks are not classified as successful because the attacker model specifies that both UI-layers must not state that the document was modified after the application of the signature when the manipulated document is opened.

Online Validation Services
In the second phase of our evaluation, we focused on online validation services. These services are used to verify the integrity and validity of signed PDF documents. Thus, validating the signature of PDF documents can be automated and outsourced to these services. One of the most prominent vendors of validation services is DocuSign. Aside from its online validation service, DocuSign also offers a cloud PDF viewer and a signing application used by most companies of the Fortune 500 list. Prominent examples include Dell, eBay, VISA, Microsoft, Nike, and the USENIX Association [4,13].
Test Setup. We evaluated each online validation service as follows. First, we uploaded a validly signed PDF file (document_signed.pdf) to the service by using the available upload functionality. The service then generates a report containing details regarding the signature validity status. No other output was provided in any case; in particular, the content of the PDF file is not displayed.
We then modified the signed PDF file using different variants of all three attack classes successively. If one of these attack vectors results in a report that is indistinguishable from the report of document_signed.pdf, we classify the attack as successful. An example of a successful attack is presented in Figure 11.
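The success criterion above can be expressed as a comparison of two reports. The following is a minimal sketch, assuming each report has been parsed into a dictionary; the function name, the dictionary representation, and the set of ignored fields (such as the uploaded file name) are assumptions made for illustration and are not described in the evaluation.

```python
def attack_successful(baseline_report: dict, attack_report: dict,
                      ignored_fields=("file_name", "timestamp")) -> bool:
    """An attack counts as successful if the report for the manipulated file
    cannot be told apart from the report for the validly signed file."""
    def strip(report: dict) -> dict:
        # Assumption: these fields differ between uploads for benign reasons.
        return {k: v for k, v in report.items() if k not in ignored_fields}

    return strip(baseline_report) == strip(attack_report)
```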
Results. We analyzed eight free and publicly available validation services against all three attack classes. The signature validation could be bypassed on six services (cf. Table 2).
To summarize, two of the analyzed services [9,38] were vulnerable to SWA and five services [9,10,12,14,20] could be bypassed using the ISA class. This is contrary to the results from the evaluation of viewer applications, where we could find more applications vulnerable to SWA.
One interesting challenge during the evaluation was to find a clear indication in the report whether a signature is valid. For example, the DSS Demonstration WebApp [14] prints out two fields containing the verification report: Indication and Signature scope, see Figure 11. The Indication field summarizes the results of the digital signature validation. In our case, the result is: TOTAL PASSED. With respect to USF and SWA, we received a warning or an error message if the attacks were detected. Regarding ISA, the Signature scope field contains information indicating whether the complete document is signed or not. If the ISA attack is detected, the validation service should print out that the scope is partial and only parts of the document are signed. According to our evaluation, version 5.2 of the DSS Demonstration WebApp is susceptible to ISA since it returns Full PDF as the Signature scope even if the document was modified via incremental saving in Variant 2. Along with all EU validation services, we analyzed DocuSign, one of the worldwide leading cloud services. It was the only service vulnerable to both ISA and SWA.
Figure 11: Validation report created by the Digital Signing Service for a manipulated but signed PDF file [14].

HOW TO FIX PDF SIGNATURES
In this section, we propose concrete countermeasures to fix the previously introduced attacks. We therefore carefully studied the main reasons for the attacks on PDF signatures and were able to identify two root causes: (1) The specification does not provide a concrete procedure on how to validate signatures. There is no description of pitfalls or of any security considerations. Thus, developers must implement the validation on their own without best-common-practice information. (2) The error tolerance of the PDF viewer is abused to create non-valid documents that bypass the validation yet are correctly displayed to the user.
The Verification Algorithm. When considering a proper countermeasure, we defined an algorithm which addresses USF, ISA, and SWA but does not negatively affect the error tolerance of the PDF viewers (cf. Listing 2). It describes a concrete approach on how to compute the values necessary for the verification and how to detect manipulations after the PDF file was signed. The specified algorithm must be applied for each signature within the PDF document. As an input, it requires the PDF file as a byte stream and the signature object.
In Line 4, we first extract the ByteRange from the signature object. To prevent USF, we ensure that ByteRange is not null or empty in Line 7.
Lines 9-22 then validate the values a, b, c, and d of the ByteRange. First, Line 10 ensures that it contains exactly four values to minimize an attacker's attack surface. Line 11 additionally ensures that each ByteRange value is an integer. Lines 14 to 20 ensure that ByteRange satisfies the following condition: 0=a < b < c < (c+d), which is equivalent to a=0 and b > 0 and c > b and d > 0. Enforcing this condition ensures that the signature always covers the beginning of the file (a = 0), prevents signed blocks of length zero (b > 0, d > 0), and ensures that both signed blocks are non-overlapping (c > b).
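The structural condition on ByteRange can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reproduction of Listing 2: the function name `byte_range_is_well_formed` is hypothetical, and the ByteRange is assumed to be already parsed into a Python list.

```python
def byte_range_is_well_formed(byte_range) -> bool:
    """Check the structural condition 0 = a < b < c < c + d."""
    # Exactly four values are required.
    if not isinstance(byte_range, (list, tuple)) or len(byte_range) != 4:
        return False
    # Every value must be an integer; bool is a subclass of int in Python,
    # so it is excluded explicitly.
    if not all(isinstance(v, int) and not isinstance(v, bool) for v in byte_range):
        return False
    a, b, c, d = byte_range
    # a == 0: the signature covers the beginning of the file.
    # b > 0 and d > 0: neither signed block is empty.
    # c > b: the two signed blocks do not overlap.
    return a == 0 and b > 0 and c > b and d > 0
```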
Finally, we verify that ByteRange covers the entire file (Line 22) in order to detect ISA. Lines 24-29 parse the Contents parameter of the signature object, which is a PKCS#7 blob.
The critical aspect is that we interpret everything that is not covered by the ByteRange as the Contents parameter of the PDF signature. Theoretically, the check in Line 27 should never fail, because we previously verified (a+b)=b and b < c. Thus it holds that pkcs7Blob.length > 0. Nevertheless, we leave this line here due to its importance for preventing SWA. Line 29 additionally ensures that only hex characters can be in the unprotected part of the PDF file, preventing further unwanted modifications of the file.
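The coverage check and the treatment of the unsigned gap can be illustrated as follows. This is a minimal sketch, assuming that the ByteRange has already passed the structural checks and that the gap holds the Contents value as a hex string including its angle-bracket delimiters; the function name `extract_contents_hex` and the error messages are hypothetical.

```python
import string

# Byte values of the characters 0-9, a-f, and A-F.
_HEX = set(string.hexdigits.encode("ascii"))


def extract_contents_hex(pdf: bytes, byte_range) -> bytes:
    """Return the hex digits found in the unsigned gap, or raise ValueError."""
    a, b, c, d = byte_range
    # Anything appended after the second signed block indicates ISA.
    if c + d != len(pdf):
        raise ValueError("ByteRange does not cover the entire file")
    # Everything not covered by ByteRange is treated as the Contents value.
    gap = pdf[a + b:c]
    # Assumption: the gap looks like b"<3082...>"; strip the delimiters.
    if len(gap) < 2 or gap[:1] != b"<" or gap[-1:] != b">":
        raise ValueError("Contents is not a hex string")
    hex_digits = gap[1:-1]
    if len(hex_digits) == 0 or any(ch not in _HEX for ch in hex_digits):
        raise ValueError("unsigned gap contains non-hex bytes")
    return hex_digits
```

Rejecting every non-hex byte in the gap leaves no room for additional objects between the two signed blocks, which is the space that signature wrapping relies on.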
Lines 31-32 parse the PKCS#7 blob and extract the information to be used for the signature verification. Lines 34-38 determine the bytes of the input PDF that are signed.
Finally, Line 41 calls the PKCS#7 verification function and returns the validity status of the signature.
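The final two steps, determining the signed bytes and handing them to the PKCS#7 verification, can be sketched as follows. The names `signed_bytes` and `verify_signature` are hypothetical, and `pkcs7_verify` is an assumed caller-supplied callable taking the blob and the signed data; no specific cryptographic library is implied.

```python
def signed_bytes(pdf: bytes, byte_range) -> bytes:
    """Concatenate the two signed blocks [a, a+b) and [c, c+d)."""
    a, b, c, d = byte_range
    return pdf[a:a + b] + pdf[c:c + d]


def verify_signature(pdf: bytes, byte_range, pkcs7_blob: bytes, pkcs7_verify) -> bool:
    """Return the validity status reported by the supplied PKCS#7 verifier.

    pkcs7_verify is a hypothetical callable (blob, data) -> bool that must
    be provided by the caller, for example from a cryptographic library.
    """
    data = signed_bytes(pdf, byte_range)
    return bool(pkcs7_verify(pkcs7_blob, data))
```

Keeping the cryptographic verification behind a callable separates the byte-level checks, which are specific to PDF, from the PKCS#7 processing, which an existing library can provide.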
Drawback. Adopting the algorithm in Listing 2 requires a change in the PDF specification, which defines ByteRange as an optional parameter [21, Section 8.7]. If ByteRange is absent, the signature value is computed only over the signature dictionary, leaving the entire document unprotected. Such a feature allows an even more powerful attack since the attacker can create validly signed documents by only injecting the signed signature dictionary without a /ByteRange. Currently, none of the evaluated viewers supports this feature.
Additionally, the algorithm leads to one usability issue if multiple signatures are provided. Although these signatures are valid, only the one covering the entire document will be displayed as valid. This problem can be addressed by providing additional information to the user that some of the signatures are valid but cover only a specific revision and not the entire document. Adobe uses a similar approach for the signature validation. All Adobe viewers show information about the document revision protected by a signature and allow the user to open only this revision. Thus, a user can easily verify which information is signed and which is not.
Responsible Disclosure. After discovering the vulnerabilities, we created a security report containing a description of the attacks, a list of the affected implementations, a proof-of-concept exploit for each successful attack vector, and the pseudo-code preventing the attacks [34]. On the 8th of November, we sent the report to the BSI-CERT team, who distributed it to all affected vendors and governmental organizations dealing with PDF [34]. During the responsible disclosure process, we supported BSI-CERT and the vendors in fixing the issues. The complete information relating to our research on PDF signatures was published on February 25, 2019 at https://www.pdf-insecurity.org/. To support all vendors, we also published all available exploits. Some vendors have already integrated these files into their test environments.

RELATED WORK
At the beginning of our research phase, we gathered and studied the existing related work to PDF and file format security. This work can be separated into the following four categories.
PDF Malware and PDF Masking. In 2010, Raynal et al. provided a comprehensive study on malicious PDFs abusing legitimate features in PDFs leading to Denial-of-Service (DoS), Server-Side-Request-Forgery (SSRF), and information leakage [37]. Additionally, the authors considered potential security issues regarding the signature verification by criticizing the design of the certificate trust establishment. In 2012, Hamon et al. published a study revealing weaknesses in PDFs leading to malicious URI invocation [49]. In 2013 and 2014, multiple vulnerabilities in Adobe Reader were reported abusing the support of insecure PDF features, JavaScript, and XML [22,40]. Inführ [23] published a summary of the supported languages, file formats, and features in PDFs leading to these security issues. In 2018, Franken et al. evaluated the security of third-party cookie policies [16]. Part of the evaluation revealed weaknesses in two PDF readers, which could be forced to call arbitrary URIs. In the same year, multiple vulnerabilities in Adobe Reader and different Microsoft products were discovered leading to URI invocation and NTLM credentials leakage [24,39].
Besides PDF malware, research has been provided on content masking. In 2014, Albertini discovered new attack classes by combining a PDF and a JPEG into a single polyglot file [2]. In 2017, Markwood et al. introduced a novel attack related to content masking by using font encoding [31].
PDF Malware Detection. As a result of the attacks discovered in recent years, different security tools were implemented to detect maliciously crafted documents [8,26,28,30,41,43]. Such tools rely on the detection of known attack patterns and structural analysis of PDFs.
In 2016, Carmony et al. built a JavaScript reference extractor for detecting parsing confusion attacks [6]. In 2017, Tong et al. introduced a concept for a robust PDF malware detection based on machine learning algorithms [46]. In the same year, Tong et al. published a framework based on these algorithms and capable of detecting PDF malware [47]. Maiorca et al. provided an overview of the current PDF malware techniques and analyzed the existing security tools by comparing them [29]. This paper mentions the Incremental Saving (IS) feature for the first time in conjunction with attacks, but up until our research, the feature has not been combined with attacks on PDF signatures.
PDF Signatures. While studying the related work, we discovered a gap in existing security analysis. We were able to find only a few articles directly related to the security of PDF signatures.
In 2008 and 2012, Grigg et al. described the risks associated with electronic signatures [18,19] based on the missing cryptographic signature allowing an attacker to forge any signature.
In 2012, Popescu et al. presented a proof-of-concept bypass for a specific digital signature [36]. The attack is based on a polymorphic file containing two different files: a PDF and a TIFF. The risk exists if a victim signs the document unaware of the hidden content inside the file. In 2015, Lax et al. documented potential security topics related to digitally signed documents [27]. The authors concentrated on issues related to the signature generation process like malware, signed documents containing dynamic content like macros or JavaScript, and polymorphic documents similar to [36]. In 2017, Stevens et al. discovered an attack against SHA-1 [45] breaking the collision resistance. For the proof-of-concept, the authors created two different PDF files containing the same digest value. As a result, an attacker could create a PDF file with new content without invalidating the digital signature. In his master's thesis, Stefan provided an in-depth analysis of PDF signatures [44]. The author also implemented a library verifying PDF signatures. However, the security considerations addressed only known attacks related to PDFs and none of our discovered attack classes.
Signature Bypasses in different Data Structures. In 2002, Kain et al. addressed possible threats related to digitally signed documents like MS Word, MS Excel, or PDFs resulting from PKI issues, dynamic content loaded from a website, and code execution by supported programming languages within documents [25]. In the paper, the authors briefly describe the possibility to create an unsigned PDF document which is visually identical to the signed one, but they do not deliver any proof-of-concept exploit and do not evaluate if and how this can be achieved. In 2005, Buccafurri et al. described a file format attack where the attacker forces two different views of the same signed document which contains an image as BMP and HTML code [5]. Depending on the file extension, the content of the image or the HTML code is processed. PDF files are mentioned as a possible target for such an attack, but no concrete ideas are described.
The general concept of SWA, the relocation of the hashed part of a document, has been applied to XML-based messages before. In 2005, McIntosh and Austel described an XML rewriting attack on SOAP web services [32], which was adapted to SAML-based Single Sign-On in 2012 [42]. However, the adaptation to PDF is much more complicated because the hashed part of the file is located using a byte range instead of an object identification number, and it has not been found in any previous work.
Attacks that exclude a document's signature have been applied to SAML [42] and JSON [33]. In contrast to our USF attack, these vulnerabilities simply remove the signature of the document in order to bypass the validation logic. This would work identically for PDFs, but a victim expects to open a signed file, and he will become suspicious if no signature information is shown once he opens the document. Thus, USF is a more advanced variant of signature exclusion adapted to PDF.
In this paper, we provide the first step into the security analysis of PDF signatures. We discovered further potential targets for attacks opening new research directions and challenges.
PKCS-based Attacks. The signature value is either a DER-encoded PKCS#1 binary data object or a DER-encoded PKCS#7 binary data object. Considering the complexity of both formats, the question arises if the verification of the PKCS object is correctly implemented. The goal of PKCS-based attacks is the creation of an always valid object. The impact of such an attack would be equal to the impact of USF, whereby any modification of the signed document is possible.
Additionally, the PKCS object contains the certificates used during the verification. If untrusted certificates are used, security warnings are displayed to the user. Thus, an attacker is not able to create a validly signed and trusted document. Future research should concentrate on the certificate validation by targeting this step and forcing the validation to accept an untrusted certificate.
Transformation Method Attacks. The PDF specification defines three different transformation methods applied on the document before signing it: DocMDP, UR, and FieldMDP. The transformation methods define which objects are included and excluded in the computation of the digital signatures. In this paper, we focused on the DocMDP transformation, where MDP is short for modification detection and prevention. It permits changes by filling in forms, instantiating page templates, and signing. Any other modification invalidates the signature.
DocMDP allows further adjustments regarding permitted and forbidden changes depending on different parameters. Future research should investigate if such restrictions are correctly applied and if they can be bypassed. Additionally, the transformation methods UR, which protects the defined usage rights, and FieldMDP, which detects changes in contained form fields, should also be analyzed. Since these transformation methods process the data which should be signed differently than DocMDP, an in-depth security analysis could discover further vulnerabilities.
PDF Advanced Electronic Signatures. Motivated by the idea of eGovernment, the European Union published the PDF Advanced Electronic Signatures (PAdES) specification, which extends the PDF signature specification. Given the significance of sensitive documents exchanged within governmental services, it is essential to analyze the current specification and the existing implementations.
In our evaluation, we discovered vulnerabilities in online validation services by adapting our attack vectors on PAdES documents. Since our attacks abuse features in the PDF specification, it is not surprising that PAdES signatures are also affected. It is essential that future research analyzes the PAdES specification carefully and evaluates the security of the specification itself. In this paper, we did not provide such an analysis.
Content Masking. Markwood et al. introduced techniques to bypass topic matching algorithms, plagiarism detection, and document indexing [31] by creating malicious fonts and constructing new word and character maps to mask the malicious content. In the context of signed PDFs, content masking attacks abuse dissimilarities between the signed and displayed content. For example, by defining new fonts and thus changing the presentation of some characters, the IBAN in an invoice document can be changed.
Another attack idea is to abuse the error tolerance of the viewer. During our tests, we detected presentation differences of the same document when using different viewers. The error tolerance can be abused by an attacker who validly signs one document, for example, a contract, and distributes it to multiple parties. If these parties use different viewers, they may accept different contracts.
Verification UI Forgery. Similar to content masking attacks, an attacker can try to create a UI forging the view of a signed document. The PDF specification supports multiple interactive forms like button fields, rich text strings, and form actions. Such features facilitate the creation of a UI imitating a signature panel where the results of the signature validation are usually displayed. As a result, an attacker could create a malicious document which appears trustworthy after opening. These kinds of attacks have already been described in the web context by Zalewski [54]. Researchers should concentrate on features defined in Section 12 of the PDF specification [21].

CONCLUSION
The PDF specification is a very complex standard. Unfortunately, when it comes to cryptography and, as in our case, to digital signatures, it lacks concrete implementation guidelines and documents describing the best current practices. Our investigation reveals that almost all desktop applications fail to validate PDF signatures correctly. We identified three main reasons for this:
(1) The specification itself does not enforce a strict policy, e.g., it does not require a signature to cover the whole document. This could be abused by SWA, which relocates the signed content to a different position.
(2) PDF applications are error tolerant and process the content of a PDF file even if it is not standard compliant. We heavily abused this behavior with ISA and created non-standard compliant documents that make a viewer application believe that the document has not been updated although an attacker has manipulated it.
(3) Even if the above aspects are correctly handled, as in the case of Adobe, there can be mere programming mistakes that break the whole cryptography. In the case of USF, the unexpected absence of mandatory information leads to a valid signature.
Our evaluation of PDF viewer applications and online validation services has alarming results. In 95% of all analyzed viewer applications, at least one of the problems identified above occurs and allows an attacker to stealthily manipulate the contents of a signed PDF file. Analogous results could be found for online validation services in 75% of the tested cases. We responsibly disclosed our findings via the BSI-CERT to all vendors and proposed a validation algorithm to prevent our attacks.
Concerning the digitalization of offices and eGovernment, we see a strong need for the improvement of the given specification and best practices. PDF security related to cryptographic features has been overlooked for too long. We therefore pointed out new research directions in the field of PDF security in order to address this issue.