Report Open Access
van der Knijff, Johan
The aim of this report is to provide an overview of the features of the Portable Document Format (PDF) that are important from a long-term preservation and accessibility point of view. It starts with an overview of the file structure of a PDF document, the object types that are the building blocks of the format, and the logical document structure into which these objects are organised. This is followed by a discussion of some general features of the different PDF versions, such as file identifiers, and of how specific PDF versions relate to versions of Adobe’s Acrobat Reader. The remaining chapters each focus on a specific theme (e.g. ‘fonts’, ‘password-protection’). For each theme, the report explains its relevance to digital preservation and accessibility and discusses the important features associated with it. The risks associated with each theme and the implementation history of all discussed features are summarised in separate tables, which are intended to give a quick overview of the most important risks and to help in understanding the ‘history of PDF’. The final section of each chapter explains how ‘risky’ features can be identified at the level of objects and object attributes in a file.