Prototyping an Organizationally Adaptive Repository for the National Parks Service
Description
We present a prototype digital asset management system that uses linked data as the unifying layer over existing file organizing schemes. The system is built around Trellis LDP, Cassandra, and Kafka, making it a distributed platform that can expand to perform computational treatments at scale. The prototype extends our previous work on Digital Repositories at Scale that Invite Computation (DRASTIC) and was developed over three years by the Advanced Information Collaboratory at the University of Maryland through a research agreement with the United States National Parks Service. It was designed to house the output of established vendor digitization workflows for the National Archives of Black Women's History. These workflows already deliver thousands of paged objects that follow strict file naming conventions and are accompanied by file fixity manifests and spreadsheets of item-level metadata. Our digital preservation survey and analysis, performed in Jupyter notebooks, found diverse file formats but strict naming and metadata conventions. Upon upload of a large vendor package, the prototype verifies these existing file conventions to provide quality control, then extracts paged object structures and descriptive metadata. By extracting information into linked data graphs, we leave existing files unchanged while adding surrogates and named entity tags, which provides repository services without disrupting workflows.
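The quality-control step described above can be illustrated with a minimal Python sketch. It is not the prototype's code: the file naming pattern (`<object id>_<three-digit page>.<ext>`), the MD5 manifest layout, and the names `PAGE_NAME`, `md5_hex`, and `check_package` are all assumptions made for illustration. The sketch checks each file against a naming convention and a fixity manifest, then groups pages into paged objects.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical naming convention: <object id>_<3-digit page>.<ext>
PAGE_NAME = re.compile(
    r"^(?P<obj>[A-Za-z0-9-]+)_(?P<page>\d{3})\.(?P<ext>tif|jpg|pdf)$"
)


def md5_hex(data):
    """Return the hex MD5 digest of a bytes payload."""
    return hashlib.md5(data).hexdigest()


def check_package(files, manifest):
    """Verify a vendor package and group its pages by object.

    files:    mapping of file name -> file content (bytes)
    manifest: mapping of file name -> expected MD5 hex digest (assumed layout)
    Returns (objects, problems): objects maps object id -> page numbers,
    problems is a list of human-readable quality-control findings.
    """
    problems = []
    objects = defaultdict(list)
    for name, data in sorted(files.items()):
        match = PAGE_NAME.match(name)
        if match is None:
            problems.append(f"bad name: {name}")
            continue
        expected = manifest.get(name)
        if expected is None:
            problems.append(f"not in manifest: {name}")
        elif expected != md5_hex(data):
            problems.append(f"fixity mismatch: {name}")
        # The page is still recorded so the object structure stays visible.
        objects[match["obj"]].append(int(match["page"]))
    # Files listed in the manifest but absent from the package.
    for name in sorted(set(manifest) - set(files)):
        problems.append(f"missing file: {name}")
    return dict(objects), problems
```

In a design like the one described, the resulting object-to-page mapping would be written out as linked data rather than by renaming or moving files, so the delivered package stays byte-for-byte intact.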
Files
Name | Size | Checksum |
---|---|---|
4_Gregory_Jansen_Prototyping_an_Organizationally.pdf | 706.2 kB | md5:bc4a7775e61e0482e80261691e8cbd20 |