Identification of preservation risks in PDF with Apache Preflight - a first impression
van der Knijff, Johan
This report explores the feasibility of using the Apache Preflight PDF/A validator to detect 'risky' features in 'regular' (i.e. non-PDF/A) PDF documents.
The specific objectives of this work were:
To get a first impression of the Apache Preflight (part of PDFBox) PDF/A-1b validator.
To investigate whether Apache Preflight can detect features that are unwanted from a preservation point of view, such as password protection, encryption and non-embedded fonts, in PDF files that are not necessarily of the PDF/A sub-type.
To provide a comparison with the Preflight module of Adobe Acrobat 9.5.
To decide whether doing more work on Apache Preflight (more elaborate testing, possible involvement in its development) is worthwhile.