Published September 12, 2024 | Version v1
Poster Open

Preserving digital news by partnering with newspapers and their platforms

  • 1. Portico

Description

After researching the digital news preservation landscape in the USA, Portico identified a potential gap. While libraries and archives have strategies for preserving print newspapers, hyper-local digital newspapers are less likely to be preserved. This gap comes at a time of rapid loss of hyper-local print news and increasing dependence on digital-only news.

One mechanism for preserving digital news is web archiving, but because stories turn over rapidly on news websites, it can be difficult to crawl a site frequently enough to capture every article. Some, but not all, newspapers provide RSS feeds, and it can be difficult to detect corrected articles in these feeds. Other newspapers use a subscription model and cannot be harvested without a special arrangement. Some newspapers are aggregated into larger databases, but these often don’t include the smallest digital-only platforms, and they are for-profit subscription services that may not be preserved by a third party.
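To illustrate why corrected articles are hard to spot in a feed, the following is a minimal sketch, not Portico's workflow. It assumes an RSS 2.0 feed whose items carry a `guid` (or at least a `link`), and it compares two snapshots of the feed by hashing each item's title and description. The function names and the sample feeds are hypothetical. Because a feed usually carries only a summary, a correction made in the body of an article would not be detected this way.

```python
import hashlib
import xml.etree.ElementTree as ET


def item_fingerprints(rss_xml: str) -> dict[str, str]:
    """Map each feed item's identifier to a hash of its title and description."""
    root = ET.fromstring(rss_xml)
    fingerprints = {}
    for item in root.iter("item"):
        # Prefer <guid>; fall back to <link> when a feed omits it.
        key = item.findtext("guid") or item.findtext("link")
        if not key:
            continue
        text = (item.findtext("title") or "") + "\n" + (item.findtext("description") or "")
        fingerprints[key.strip()] = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return fingerprints


def changed_items(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return identifiers present in both snapshots whose content hash differs."""
    return sorted(k for k in current if k in previous and previous[k] != current[k])


# Hypothetical snapshots of the same feed taken at two different times.
FEED_V1 = """<rss version="2.0"><channel>
<item><guid>a1</guid><title>Council vote</title><description>Vote passes 5-2.</description></item>
<item><guid>a2</guid><title>Road closure</title><description>Main St closed.</description></item>
</channel></rss>"""

FEED_V2 = """<rss version="2.0"><channel>
<item><guid>a1</guid><title>Council vote</title><description>Vote passes 4-3 (corrected).</description></item>
<item><guid>a2</guid><title>Road closure</title><description>Main St closed.</description></item>
<item><guid>a3</guid><title>School board</title><description>New budget.</description></item>
</channel></rss>"""

changed = changed_items(item_fingerprints(FEED_V1), item_fingerprints(FEED_V2))
print(changed)
```

Here only item `a1` is reported as changed; `a3` is new rather than corrected, so it is not listed.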

Portico is a community-supported dark archive for scholarly material that forms agreements and works with publishers to preserve their content. Based on this research, Portico initiated a pilot to determine whether digital news articles could be managed in a similar way to journal articles. Portico partnered with a single newspaper and worked with its content management system provider to retrieve an XML export of every article. The XML and supporting files (photos, etc.) were successfully ingested into the archive and proved similar to journal articles. To confirm that the process was repeatable, Portico worked with another newspaper on the same platform and reused the workflow with few changes.
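As a rough illustration of what checking one exported article before ingest might involve, the sketch below parses an article XML file, lists the supporting files it references, and records a checksum for each file that is present. The schema (an `article` root with an `id` attribute and `media` elements with an `href` attribute) and the function name `check_package` are assumptions made for this example; the source does not describe the export format or Portico's ingest tooling.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def check_package(article_xml: Path) -> dict:
    """Summarise one exported article and the supporting files it references."""
    root = ET.parse(article_xml).getroot()
    # Hypothetical schema: <article id="..."><media href="photo.jpg"/></article>
    report = {"id": root.get("id"), "files": {}, "missing": []}
    for media in root.iter("media"):
        href = media.get("href")
        if not href:
            continue
        # Supporting files are assumed to sit next to the XML file.
        path = article_xml.parent / href
        if path.is_file():
            report["files"][href] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            report["missing"].append(href)
    return report
```

Calling `check_package(Path("export/article.xml"))` on a hypothetical export directory would return the article identifier, a checksum for each supporting file found, and a list of any referenced files that are missing.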

Portico is repeating this experiment with two more newspapers on different platforms. If content from each platform can be archived, Portico will seek to expand the work and develop a business model to support a broader effort in digital news preservation. An early step will be to reach out to the approximately 3,000 newspapers on the platforms that have already been configured.

For the poster, the author will share details of the process used for this project and seek feedback from the community about the value of this approach for preserving digital news.

Files

20240916_Hanson_iPRES_poster.png

Files (5.4 MB)

md5:a011b0067b4379dbba5f97aaf19bcdf9 (902.1 kB)
md5:179a6017b566ffb38c50f83b2a29487f (4.5 MB)

Additional details

Dates

Created
2024-08-30

References

  • McCain, Edward, Neil Mara, Kara Van Malssen, Dorothy Carner, Bernard Reilly, Kerri Willette, Sandy Schiefer, Joe Askins and Sarah Buchanan. Endangered But Not Too Late: The State of Digital News Preservation. Columbia, MO: University of Missouri, 2021. https://doi.org/10.32469/10355/80931