Published January 11, 2013 | Version v1
Report Open

STFC Centre for Environmental Data Archival (CEDA) Annual Report 2012 (April 2011-March 2012)

Description

The mission of the Centre for Environmental Data Archival (CEDA) is to deliver long-term curation of scientifically important environmental data while facilitating the use of that data by the environmental science community. CEDA was formed to host two of the Natural Environment Research Council (NERC) designated data centres, the British Atmospheric Data Centre and the NERC Earth Observation Data Centre, as well as the UK arm of the IPCC Data Distribution Centre. In 2011, the UK Solar System Data Centre joined CEDA. Here we present the fourth annual report, covering joint activities from April 2011 to March 2012 (previously the constituent centres reported independently).
The report is in two sections: the first provides a summary of activities and some statistics, and the second is a selection of short reports on specific activities beginning, under way, or completed. The second section is intended to provide a taster of the range of activities that CEDA undertakes, rather than a complete report, since CEDA staff are involved in a huge range of scientific and informatics projects, not all of which are appropriate for reporting here.
CEDA continues to engage in informatics projects to help improve the provision of: (1) suitable tools to document and manage both high-volume and highly heterogeneous data; (2) tooling and services to enable the community to exploit CEDA data holdings; and (3) fundamental standards. Work on standards is intended both to improve the likelihood that others can build standards-compliant software that we can deploy, and to support interdisciplinary science.
As in the previous year, the 2011/2012 year was dominated by two major challenges: dealing with CMIP5 (e.g. see page 36) and establishing new services under the banner of the International Space Innovation Centre (discussed in the articles on CEMS on pages 27 and 28).
However, while those were high-profile external activities, issues of scale became dominant internally. The funding report on page 14 summarises some of the issues: of the order of 10^8 files, o(10^8), using o(petabytes) of disk, on o(300) different computers, split into o(600) datasets on o(100) disk partitions, without a consistent metadata standard or file format across the archive. Despite a decade of effort on metadata systems, and what had been a very efficient computing environment, CEDA was beginning to creak at the seams, with disk failures, insufficient documentation, and complex network issues becoming more and more prevalent. Ongoing growth using the same technical environment would have been a problem.
Fortunately, in late 2011, CEDA received significant capital investment, culminating in the delivery in March 2012 of a new computing system, JASMIN/CEMS, consisting of storage and compute funded by both NERC and UKSA and delivered by CEDA in what was then the e-Science department in STFC (now part of the Scientific Computing Department). JASMIN is discussed on page 26 and CEMS on pages 27 and 28. JASMIN/CEMS are not just about supporting the traditional archival services of CEDA, though: they are additionally intended to support high-performance analysis of high-volume data by the wider NERC scientific community. The physical delivery of these systems is of course just part of the story; in next year's annual report we will discuss the difficulty of migrating data to the new environment, and some of the new services that their advent has engendered.
While we expect the physical system issues to be resolved with the new hardware, issues of documentation still exist, both in terms of the content and how it is organised. CEDA continues to invest, with both core and project funding, in new metadata developments, aiming to address both issues.
Work on data publication and citation is intended both to improve the integrity of the scholarly record and to provide incentives for the production of good documentation, while work on metadata standards aims to ensure that the information is organised in a form fit both for automating our environment and for scientific use. Many of the one-page reports discuss projects in this arena.

Notes

Previously curated at: http://cedadocs.ceda.ac.uk/939/
The publish date on this item was its original published date.
This item was previously associated with content (as an official url) at: http://www.ceda.ac.uk.
This work was funded by: Science and Technology Facilities Council; and, Natural Environment Research Council.
Main files in this record: 2011-12_CEDA_Annual_Report_V08.pdf
Item originally deposited with Centre for Environmental Data Analysis (CEDA) document repository by Dr Graham Parton.
Transferred to CEDA document repository community on Zenodo on 2022-11-24.

Files

2011-12_CEDA_Annual_Report_V08.pdf (2.8 MB)
md5:9d0468c9649384ecfa5432930efb72af

Additional details

Related works

Is supplemented by
https://www.ceda.ac.uk/ (URL)