Report Open Access

Identification of preservation risks in PDF with Apache Preflight - a first impression

van der Knijff, Johan

This report explores the feasibility of using the Apache Preflight PDF/A validator to detect 'risky' features in 'regular' (i.e. non-PDF/A) PDF documents.

The specific objectives of this work were:

  •     To get a first impression of the Apache Preflight (part of PDFBox) PDF/A-1b validator.
  •     To investigate whether Apache Preflight can detect features that are unwanted from a preservation point of view, such as password protection, encryption and non-embedded fonts, in PDF files that are not necessarily of the PDF/A sub-type.
  •     To provide a comparison with the Preflight module of Adobe Acrobat 9.5.
  •     To decide whether further work on Apache Preflight (more elaborate testing, possible involvement in its development) is worthwhile.

Files (573.1 kB)
Name: pdfProfilingJvdK19122012.pdf
md5:8032fef68a2377bc25f7ff6fa955f4c6
Size: 573.1 kB