D5.4 Final report on Data Infrastructure update and extension
Contributors
- 1. CNR-ISTI
- 2. University of South Wales
- 3. Polo Universitario Città di Prato, Università degli Studi di Firenze
Description
The deliverable reports work undertaken in Work Package 5 under Tasks 5.1, 5.2, 5.3, 5.4, 5.5 and 5.7 during the final 12 months of the project, providing an update to D5.3 and assessing what has been accomplished during the lifetime of the project. The final DMP produced within Task 5.6 is reported in D5.5. Related activity has been reported in D4.3 and D4.4.
ARIADNEplus has delivered a transformational upgrade and extension of the original research data infrastructure created within the preceding ARIADNE project (2013-17). It has extended ARIADNE in several dimensions:
1. Wider geographical coverage with new partners.
2. Wider disciplinary coverage with a greater emphasis on the sub-domains of palaeoanthropology, bioarchaeology, environmental archaeology, dating methods, and on the archaeology of standing structures.
3. The time span covered.
4. The depth of database integration, with a greater degree of item-level integration.
5. Greater integration of texts.
6. Broader audiences.
7. Greater range of services.
This deliverable describes the final update procedures being followed by partners and outlines the steps in the aggregation pipeline. The pipeline was initially presented in D5.2 but is rehearsed here, with updates, for the sake of completeness. There are two options for aggregation: the standard approach, which uses a suite of tools for the semi-automated aggregation of large datasets, and a basic approach for the manual upload of small numbers of records using the FastCat tool. The majority of partners have used the standard approach, but a small number used FastCat to upload a few records, and the tool has also proved invaluable for adding bespoke Collection records for harvested resources.
Partners following the standard approach must:
1. Describe their data according to the AO-Cat using the 3M tool, usually with one mapping per partner.
2. Map subject terms to the Getty AAT using the Vocabulary Matching Tool.
3. Define any period terms used so that they are uploaded to PeriodO. Where temporal data needs cleaning to achieve consistent use of date ranges and periods, partners use an additional tool, Time Spans, to normalise date ranges. They must also ensure that spatial data is compliant with WGS 84.
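The temporal and spatial checks in step 3 can be illustrated with a minimal Python sketch. It is not the Time Spans tool or any part of the ARIADNEplus pipeline: the function names and the convention of signed years (negative for BCE) are assumptions made for illustration only.

```python
from typing import Tuple


def is_plausible_wgs84(lat: float, lon: float) -> bool:
    """Return True if the pair lies within WGS 84 latitude/longitude bounds.

    Hypothetical helper: a bounds check only shows the values are plausible
    decimal degrees, it cannot prove which datum they were recorded in.
    """
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def normalise_date_range(start: int, end: int) -> Tuple[int, int]:
    """Return (earliest, latest) as signed years.

    Assumed convention: negative years are BCE, so -800 is earlier than -500.
    """
    return (start, end) if start <= end else (end, start)


# A record whose date range was entered in reverse order is put back in order.
print(normalise_date_range(-500, -800))
print(is_plausible_wgs84(43.88, 11.10))
```

A bounds check of this kind would catch swapped or projected coordinates (for example, metre-based eastings and northings), but data recorded in another geographic datum would still need a proper reprojection.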
Partners using FastCat instead manually enter their data records in a spreadsheet-like tool, where the column headings already correspond to AO-Cat core mandatory fields, so that there can be a single mapping covering multiple partners.
Data aggregated by both routes is then transformed into the ARIADNE triplestore, and is also used to create the indices used to power OpenSearch in the ARIADNE portal. Data is initially loaded into a “staging” portal for checking, before it is published in the “public” portal.
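The idea that one aggregated record feeds both the triplestore and the search index can be sketched as follows. This is an illustrative assumption about the general shape of such a transformation, not the ARIADNEplus implementation: the `example.org` namespaces, the field names and both function names are hypothetical and do not reproduce the AO-Cat model.

```python
from typing import Any, Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespaces, used only for this sketch.
RESOURCE_BASE = "https://example.org/resource/"
VOCAB_BASE = "https://example.org/vocab/"


def record_to_triples(record: Dict[str, Any]) -> List[Triple]:
    """Flatten a record into (subject, predicate, object) triples."""
    subject = RESOURCE_BASE + record["id"]
    triples: List[Triple] = []
    for field, value in record.items():
        if field == "id":
            continue
        # Multi-valued fields yield one triple per value.
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((subject, VOCAB_BASE + field, str(item)))
    return triples


def record_to_index_doc(record: Dict[str, Any]) -> Dict[str, Any]:
    """Build a flat search-index document keyed by the record identifier."""
    doc = {key: value for key, value in record.items() if key != "id"}
    doc["_id"] = record["id"]
    return doc


record = {"id": "r1", "title": "Site A", "subject": ["settlement", "burial"]}
for triple in record_to_triples(record):
    print(triple)
print(record_to_index_doc(record))
```

Deriving both outputs from the same record is one way to keep the knowledge base and the portal index consistent with each other when a dataset is updated.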
Aggregation proceeds according to an agreed priority list and is managed via regular meetings of the aggregation task force, which comprises representatives of UoY-ADS, PIN, CNR, USW and FORTH, with SND often in attendance to deal with any issues that require changes to the portal interface. Progress is monitored with a software tool, Activity Dash (implemented by FORTH in WP14), which makes it easier to track progress across a large number of partners. Use is also made of the ARIADNE D4Science and Redmine help desks, the shared Google document containing notes from the aggregation task force meetings, and email.
Since the interim deliverable D5.3 at M36, we have aggregated an additional 1.5 million data resources covering the majority of ARIADNE subject types: archaeological sites and monuments, fieldwork events, fieldwork reports, fieldwork archives, inscriptions, dates, artefacts, rock art, building surveys, maritime, scientific analyses, burials and coins. The workflows for aggregating new datasets and for adding updates to existing datasets are now tried and tested. At M48 we have over 182 million triples in the public knowledge base (387 million including the inferred triples), and over 3.3 million resources listed in the public portal. Our focus for the last months of the current project has been to complete the aggregation of datasets from the remaining data provider partners and associate partners, and also to make provision for the sustainability of the infrastructure by developing a sustainable business model for continued updates from existing partners, as well as for the addition of new datasets from organisations wanting to join (see D6.5).
Since M36 we have also continued the development of application profiles for data types which extend the subject range of the ARIADNE infrastructure and take us into item-level aggregation, as reported in Deliverables D4.2 and D4.4.
Finally, we have continued to support the ARIADNE portal, migrating the indices from ElasticSearch to the open source application OpenSearch, and adding several enhancements to the portal interface.
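For readers unfamiliar with how a portal search is expressed against such an index, the sketch below builds a request body in the JSON query DSL that OpenSearch inherited from ElasticSearch. It is an assumption-laden illustration, not the portal's actual query: the field names `title`, `description`, `earliest_year` and `latest_year` and the function name are hypothetical.

```python
import json
from typing import Any, Dict


def build_search_body(text: str, year_from: int, year_to: int, size: int = 10) -> Dict[str, Any]:
    """Build a full-text query restricted to records overlapping a year range.

    A record overlaps the requested range when it starts no later than
    year_to and ends no earlier than year_from.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "description"]}}
                ],
                "filter": [
                    {"range": {"earliest_year": {"lte": year_to}}},
                    {"range": {"latest_year": {"gte": year_from}}},
                ],
            }
        },
    }


# The body is plain JSON and would be sent to an index's _search endpoint.
print(json.dumps(build_search_body("rock art", -3000, -1000), indent=2))
```

Because the body is ordinary JSON rather than client-specific code, a query of this shape is the kind of thing that can usually be carried over between the two search engines with little change.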
Files
Name | Size |
---|---|
D5.4 Final Report on Data Infrastructure update & extension.pdf (md5:c14d9ec679f4cf3c81f7b87d80a35721) | 4.0 MB |
Additional details
Related works
- Is referenced by
- Other: https://ariadne-infrastructure.eu/resources/ariadneplus-deliverables/ (URL)