Report Open Access

Identification of preservation risks in PDF with Apache Preflight - a first impression

van der Knijff, Johan

This report explores the feasibility of using the Apache Preflight PDF/A validator to detect 'risky' features in 'regular' (i.e. non-PDF/A) PDF documents.

The specific objectives of this work were:

  • To get a first impression of the Apache Preflight (part of PDFBox) PDF/A-1b validator.
  • To investigate whether Apache Preflight can detect features that are unwanted from a preservation point of view, such as password protection, encryption and non-embedded fonts, in PDF files that are not necessarily of the PDF/A sub-type.
  • To provide a comparison with the Preflight module of Adobe Acrobat 9.5.
  • To decide whether further work on Apache Preflight (more elaborate testing, possible involvement in its development) is worthwhile.
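
To make the 'risky' features named above more concrete, the following is a minimal Python sketch of how two of them show up in the raw bytes of a PDF. It is not Apache Preflight and does not reproduce its checks. It is a crude byte-level heuristic that assumes an `/Encrypt` key signals encryption or password protection, and that a font descriptor without a `/FontFile`, `/FontFile2` or `/FontFile3` entry signals a non-embedded font. The function name `scan_pdf_bytes` and the result keys are hypothetical. Objects stored in compressed object streams are invisible to this scan, so its counts are indicative at best.

```python
import re

# Byte patterns for the PDF dictionary keys of interest.
# Assumption: matching raw bytes is good enough for a first look;
# a real validator parses the object structure instead.
ENCRYPT_KEY = re.compile(rb"/Encrypt\b")
FONT_DESCRIPTOR = re.compile(rb"/Type\s*/FontDescriptor")
FONT_FILE = re.compile(rb"/FontFile[23]?\b")


def scan_pdf_bytes(data: bytes) -> dict:
    """Return crude indicators of preservation risks found in raw PDF bytes.

    Hypothetical helper for illustration only; not part of Apache Preflight.
    """
    return {
        # A PDF file starts with the "%PDF-" header.
        "is_pdf": data.startswith(b"%PDF-"),
        # An /Encrypt entry in the trailer points to an encryption dictionary.
        "encrypted": ENCRYPT_KEY.search(data) is not None,
        # Number of font descriptors visible in uncompressed objects.
        "font_descriptors": len(FONT_DESCRIPTOR.findall(data)),
        # Number of references to embedded font program streams.
        "embedded_font_streams": len(FONT_FILE.findall(data)),
    }
```

Applied to the contents of a file read in binary mode, a result with more font descriptors than embedded font streams suggests that some fonts may not be embedded. Such a result is a prompt for closer inspection with a proper validator, not a verdict.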
