Published June 13, 2022 | Version v1
Conference paper Open

Open-Source Email Curation Software Designed for Reusability

  • 1. University of North Carolina at Chapel Hill

Description

Email is more than half a century old and fills a vital role in activities across all sectors of society. However, the professional curation of email is still relatively immature. Many proprietary tools that extract and query email content operate as black boxes and cannot be easily evaluated or integrated with other digital curation software. Recently, there has been progress in the development of open-source software for email curation, but many institutions struggle to integrate these tools into digital curation workflows that usually involve other applications and systems.  

We report on a project called Review, Appraisal and Triage of Mail (RATOM) that developed software for interactive review, selection and appraisal of email collections held in PST, OST, and mbox formats. These tools allow users to create, validate, and query reports and metadata generated from email collections. We have designed the software for reusability from the ground up. The software is open source and distributed as several independent modules that can be incorporated into existing and emerging workflows. Output is structured to be easily queried in support of new and emerging tasks and access scenarios. We also describe a reusable tool to simplify management of machine learning models for identifying named entities in email.
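
As an illustration of what "structured, queryable output" from an email collection can look like, the following is a minimal sketch that uses only Python's standard-library `mailbox` module to read an mbox file and emit one metadata record per message. It is not RATOM's implementation and does not use RATOM's modules; the function names `extract_metadata` and `filter_by_sender` and the record fields are assumptions made for this example, and PST/OST files (which require other tooling) are not covered.

```python
import mailbox
from email.utils import parsedate_to_datetime


def _text(value):
    """Convert a header value to str, keeping None when the header is absent."""
    return None if value is None else str(value)


def extract_metadata(mbox_path):
    """Return one plain-dict metadata record per message in an mbox file.

    Hypothetical helper for illustration only; field names are assumptions.
    """
    records = []
    box = mailbox.mbox(mbox_path)
    try:
        for key, msg in box.items():
            raw_date = _text(msg.get("Date"))
            try:
                iso_date = parsedate_to_datetime(raw_date).isoformat() if raw_date else None
            except (TypeError, ValueError):
                # Malformed Date headers are recorded as missing rather than guessed.
                iso_date = None
            # Parts that carry a filename are treated as attachments.
            attachments = [p.get_filename() for p in msg.walk() if p.get_filename()]
            records.append({
                "key": key,
                "from": _text(msg.get("From")),
                "to": _text(msg.get("To")),
                "subject": _text(msg.get("Subject")),
                "date": iso_date,
                "attachments": attachments,
            })
    finally:
        box.close()
    return records


def filter_by_sender(records, needle):
    """Return the records whose From header contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [r for r in records if r["from"] and needle in r["from"].lower()]
```

Because each record is a plain dictionary, the result can be serialized to JSON or loaded into a database and queried by downstream tools, which is the kind of reuse the description above emphasizes.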

Files

IDCC 2022: Open-Source Email Curation Software Designed for Reusability.pdf