Published December 6, 2016 | Version v1
Presentation Open

Metadata: Foundation, Philosophers' and Rosetta Stones

Authors/Creators

  • 1. Keith G Jeffery Consultants

Description

Research is now digital in execution and reporting/preservation. The end-to-end process of research – from idea to proposal to funded project to outputs – is supported by ICT (Information and Communication Technologies).

The various entities of the research domain require digital description to facilitate discovery, contextualization (for relevance and quality but including rights, costs, security, and privacy restrictions) and action. Metadata is the magic key for this: having started with datasets, metadata now describes software, workflows, persons, organisations, equipment, computing resources and more.

Metadata must have formal syntax and declared (multilingual) semantics. Most existing metadata ‘standards’ do not meet these criteria, but many can be interconverted to a subset of a canonical form that does, which permits interoperation. CERIF (Common European Research Information Format: a European Union Recommendation to Member States) is a widely used canonical data model that meets these objectives. RDA (Research Data Alliance) is evolving a list of metadata elements to be recommended to support the operations described above; the set accords well with CERIF and, like CERIF, is a superset of other metadata ‘standards’.

The philosopher’s stone was reputed to turn base substances to valuable ones. The Rosetta stone permitted multilinguality. Metadata has these properties.

Keith Jeffery is an independent consultant and past Director of IT at STFC Rutherford Appleton Laboratory (http://www.stfc.ac.uk/about-us/where-we-work/rutherford-appleton-laboratory/) with 360,000 users, 1100 servers and 140 staff. Keith holds 3 honorary visiting professorships, is a Fellow of the Geological Society of London and the British Computer Society, is a Chartered Engineer and Chartered IT Professional and an Honorary Fellow of the Irish Computer Society. Keith is past President of ERCIM and past President of euroCRIS, and serves on international expert groups, conference boards and assessment panels. He has advised government on security and green computing. He chaired the EC Expert Groups on GRIDs and on CLOUD Computing. He serves as co-chair of several working and interest groups of the Research Data Alliance (Metadata - https://rd-alliance.org/groups/metadata-ig.html, Metadata Standards Directory - https://rd-alliance.org/groups/metadata-standards-directory-working-group.html, Data in Context - https://rd-alliance.org/groups/data-context-ig.html, Metadata Standards Catalog - https://rd-alliance.org/groups/metadata-standards-catalog-working-group.html).

Long abstract
Research is now characterized by digital recording, storage/curation with provenance, analysis,
modeling, mining, visualization and reporting. Moreover, increasingly it is characterized by cooperation
and sharing, by re-use (for validation and for multidisciplinary research) and by new methods of digital
intercommunication among researchers from videoconferencing to blogs and wikis to liquid publications.
The end-to-end process of research – from idea to proposal to funded project to outputs – is supported
by ICT (Information and Communication Technologies).
However, for this ICT support to operate effectively and efficiently, the various entities of the research
domain require digital description to facilitate discovery, contextualization (for relevance and quality but
including rights, costs, security, and privacy restrictions) and action. Initially describing datasets,
metadata is now also used to describe software components, services (including workflows), persons,
organizational units, projects, funding, facilities, equipment, computing resources including
instrumentation, research outputs (publications, patents, products) and more. This permits portals to
catalog assets and provide access for download and use but increasingly it allows VREs (Virtual Research
Environments) to assist a researcher in constructing workflows over distributed and heterogeneous data
and software to achieve the research objective. Metadata also allows research managers to assess
research and produce research strategies.
To achieve all this, metadata must have formal syntax and declared (multilingual) semantics. Most
existing metadata ‘standards’ do not meet these criteria, but many can be interconverted to a subset of
a canonical form that does. This interconversion is critical for utilization of research assets from
heterogeneous sources and across research domains and illustrative examples underline the point. Many
research organizations and projects have utilized CERIF (Common European Research Information
Format: a European Union Recommendation to Member States) as the canonical data model to meet
these objectives. RDA (Research Data Alliance) is evolving a list of metadata elements to be
recommended to support the operations described above; the set accords well with CERIF and, like
CERIF, is a superset of other metadata ‘standards’.
The philosopher’s stone was reputed to turn base substances to valuable ones. The Rosetta stone
permitted multilinguality. Metadata with formal syntax and declared semantics makes research assets
valuable and available multilingually.
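
The interconversion through a canonical form described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two source formats ("flat" and "nested"), their field names, and the `CanonicalRecord` class are invented and do not reproduce CERIF, the RDA element list, or any real metadata standard. The sketch only shows the general idea that each format needs one mapping into the canonical form, and that a poorer format can be served from a subset of it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CanonicalRecord:
    """Hypothetical canonical form: one title per language code, plus creator names."""
    titles: dict[str, str] = field(default_factory=dict)
    creators: list[str] = field(default_factory=list)


def from_flat(record: dict) -> CanonicalRecord:
    """Map an invented flat format (one title, one language) into the canonical form."""
    lang = record.get("language", "und")  # "und" marks an undetermined language
    names = [n.strip() for n in record.get("creator", "").split(";") if n.strip()]
    return CanonicalRecord(titles={lang: record["title"]}, creators=names)


def from_nested(record: dict) -> CanonicalRecord:
    """Map an invented nested format (language-tagged titles) into the canonical form."""
    return CanonicalRecord(
        titles={t["lang"]: t["value"] for t in record["titles"]},
        creators=[c["name"] for c in record["creators"]],
    )


def to_flat(canonical: CanonicalRecord, lang: str = "en") -> dict:
    """Project the canonical form onto the flat format.

    The flat format holds a single title, so only a subset of the canonical
    record survives: the requested language, or the first available one.
    """
    chosen = lang if lang in canonical.titles else next(iter(canonical.titles))
    return {
        "title": canonical.titles[chosen],
        "language": chosen,
        "creator": "; ".join(canonical.creators),
    }


if __name__ == "__main__":
    nested = {
        "titles": [
            {"lang": "en", "value": "Ice core data"},
            {"lang": "de", "value": "Eiskerndaten"},
        ],
        "creators": [{"name": "A. Researcher"}],
    }
    canonical = from_nested(nested)
    print(to_flat(canonical, lang="de"))
```

With a canonical form in the middle, adding a further format requires one new pair of mappings rather than a converter to every existing format, and the language-keyed titles show where declared multilingual semantics would live in such a record.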
