Video Data Management System archives and provides online access to NOAA deep-sea corals digital video and image data

Since late 2002, the National Oceanic and Atmospheric Administration's (NOAA) Office of Ocean Exploration and Research (OER) has been collaborating with the National Oceanographic Data Center (NODC) and its three divisions, the NOAA Central Library (NCL), the Marine Data Stewardship Division (MDSD), and the National Coastal Data Development Center (NCDDC), to address the requirements for archiving, preserving, managing, and providing online access to digital videos and still images from OER oceanographic expeditions. The Video Data Management System (VDMS), which was developed to facilitate managing online information and access to video and images obtained during NOAA-sponsored oceanographic expeditions, now enables access to hundreds of digital video clips, highlight movies, still images and related documents and products from OER expeditions. Using discovery tools available via NOAALINC, the NCL online catalog (http://www.lib.noaa.gov/), NOAA scientists and other researchers can discover and download online video and still images of deep-sea coral ecosystem areas. Upon special arrangement, the NCL can provide equipment and an appropriate environment for users to view, copy and/or download requested off-line scientific video data or view original expedition tapes from the NOAA Library Archives. Additional online information includes related cruise reports, educational lesson plans, original video and image annotation documents, digital maps and Web sites, and links to other oceanographic observation data. During the development of the VDMS project plan, which is a part of a larger comprehensive OER Data Management Project, the VDMS team defined and established several 'best practices' to support OER video data management requirements. Metadata guidelines for digital video (DV12) and digital still images (DI12) help scientists and data managers in the field to create complete descriptive metadata about their image data.
Scientists, librarians and archivists then use this information to create MARC21, FGDC, or Dublin Core metadata records. The VDMS team also developed a workflow for managing digital video by defining the process for moving video data from ship to library to archive, including steps for creating archival backup copies and web-accessible video clips and highlights. The VDMS presently manages off-line access to more than 1500 MiniDV and 500 DVCAM tapes, over 1500 DVDs, and online access to more than 300 digital video clips and highlights collected during NOAA ocean exploration cruises. Over 80% of all digital video and image data are from OER expeditions to various deep-sea coral areas. A growing collection of digital data, including in situ physical and chemical ocean observations, is archived at the NCL and NODC. In situ data are accessible through the search and retrieval functions of the NODC Ocean Archive System (OAS) at http://www.nodc.noaa.gov/Archive/Search/.


I. INTRODUCTION
Many types of data and information products are collected and created during a typical oceanographic cruise, including planning documents, cruise summary reports, laboratory specimen lists, video and still images, and navigation and other observational data. As required by NOAA Administrative Orders NAO 15-217 and NAO 205-17, the NOAA Central Library (NCL) and National Oceanographic Data Center (NODC) are receiving an increasing number of video data collections from diverse NOAA components, including the Office of Ocean Exploration and Research (OER), the National Marine Sanctuaries Program (NMSP), and the Coral Reef Conservation Program (CRCP).
In late 2002, the NCL and NODC began collaborating with OER data managers to develop and implement an end-to-end data management plan for data and information collected during OER-sponsored cruises. An Integrated Product Team (IPT) was formed to develop a comprehensive plan, with several working groups each focusing on one component of the overall plan. One working group was tasked with developing a Video Data Management System (VDMS) for acquiring, cataloging, archiving, maintaining and providing access to digital video data. The IPT also recognized the added benefits of developing VDMS requirements, documentation and processes that could serve as a model for all NOAA scientific video data. This paper describes the processes used to assure that digital video data from the underwater explorations are archived for the long term and managed consistently and effectively with minimal staff resource requirements.
The VDMS objectives are:
1. To provide timely online access to OER video and still image data.
2. To educate the public about NOAA oceanographic expeditions and underwater explorations through related lesson plans.
3. To archive and preserve unique video and related data for the long term.
4. To foster collaboration between NOAA librarians, data managers, and scientists from different line and program offices.
5. To use or extend existing library tools, guidelines, and metadata standards to support new media formats, such as digital video, digital image, and digital text documents.
6. To enhance data access and metadata sharing between the NCL's NOAALINC, NODC's Ocean Archive System (OAS) and NCDDC's MERMAid catalogs, CoRIS and OER Digital Atlas databases.

II. VIDEO AND IMAGE DATA MANAGEMENT
The primary media currently used for capturing video images are MiniDV and DVCAM digital tapes or DVD-R discs. Each of these media types uses a different encoding (file format structure). Video processing software is required to convert the native uncompressed video formats to current industry and archival standard formats (i.e., uncompressed DV or AVI) to facilitate online access and long-term management. At present, the VDMS process uses DV and/or AVI formats as the archival video encoding and MP4 and MOV as the online access encoding. Video files intended for online access are encoded with MPEG-4 and/or H.264 compression codecs, which are supported by widely used video players such as QuickTime™, Windows Media Player™, and RealMedia™.
At present, the NCL manages a growing collection of multiple video media. This collection includes more than 1500 MiniDV tapes, 500 DVCAM tapes, approximately 1000 VHS tapes and more than 1500 DVDs. More than 80% of the collection consists of videos from deep-coral ecosystems in the Gulf of Alaska, the Gulf of Mexico, along the North American Atlantic and Pacific coasts, and other locations. These original media contain the entire sequence of video footage captured with standard-resolution video equipment during dozens of cruises. In some cases, the video was captured using high-definition (HD) video equipment. These videos provide a relatively complete record of each event, submersible dive, or related activity. Original video media are currently stored in a climate-controlled room and are being migrated to new mass-storage media for ongoing long-term archival preservation and improved access.
In addition to original media, the NCL archives and provides online access to clips and highlights created from the original, full-length raw video. Clips typically contain very short (10-60 seconds) excerpts of interesting or unusual features. Highlights are usually a series of short video segments (2-15 minutes) selected by the principal investigator (PI) and/or data manager as a representative sample of images collected during the cruise. Figure 1 shows a QuickTime™ frame from the highlights video from the Life on the Edge 2005 expedition to a deep-coral bank in the South Atlantic Bight. The video highlight on MiniDV tape was "captured" using Apple™ Final Cut Pro™ (FCP) video editing software. The captured file was then exported to DV and AVI archival formats using FCP Compressor™ and exported to MOV and MP4 formats for online access. By testing different codec and file format combinations, the VDMS team developed a Standard Operating Procedure (SOP) for in-house digital video conversion. The VDMS SOP uses a 'best practices' approach, based on the limited guidelines for managing digital video processing and conversion provided by national authorities, such as the Library of Congress and the National Archives and Records Administration.
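The step from an archival master to online access copies can be sketched in a few lines. The example below is only an illustration and is not the VDMS SOP: it swaps in the open-source ffmpeg command-line tool for FCP Compressor™, and the profile table, function name and file names are hypothetical. It builds, but does not run, one command per access format (MP4 and MOV, both with H.264 video).

```python
from pathlib import PurePosixPath

# Assumption: ffmpeg stands in for the FCP Compressor export described above.
# H.264 video plus AAC audio, with the index moved to the front of the file
# so that playback can start before the download finishes.
H264_ARGS = ["-c:v", "libx264", "-pix_fmt", "yuv420p",
             "-c:a", "aac", "-movflags", "+faststart"]

# Hypothetical access profiles: one entry per online container format.
ACCESS_PROFILES = {".mp4": H264_ARGS, ".mov": H264_ARGS}


def access_commands(master, out_dir):
    """Return one ffmpeg argument list per online access format.

    `master` is the archival DV or AVI file; nothing is executed here.
    """
    master = PurePosixPath(master)
    out_dir = PurePosixPath(out_dir)
    commands = []
    for ext, codec_args in ACCESS_PROFILES.items():
        target = out_dir / (master.stem + ext)
        commands.append(["ffmpeg", "-i", str(master), *codec_args, str(target)])
    return commands
```

Each returned list could be passed to `subprocess.run` on a workstation where ffmpeg is installed. Keeping the command construction separate from execution makes the chosen codec settings easy to record alongside the technical metadata for each file.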

III. ONLINE ACCESS TO DEEP-SEA CORAL VIDEO AND STILLS
The NOAA Central Library provides online access to clips and highlights via links in standard MARC21 records in the library online catalog, NOAALINC (http://www.lib.noaa.gov/). A search in NOAALINC for "deep-sea corals AND digital video" returns a list of all catalog records that include links to the digital videos on deep-sea corals, as well as links to other related media and documents. Figure 2 shows an example of an NCL metadata record in the MARC21 standard for the Expedition to the Deep Slope 2007, which investigated deep-sea coral areas of the Gulf of Mexico. Currently, there are 28 digital video collections in NOAALINC that match the criteria listed above. These links provide off-line access to over 1500 original video tapes. The links also provide online access to more than 300 clips and highlights, as well as over 6000 still images. An accurate inventory of original still images, usually copied from separate CD-ROMs or DVDs, is more difficult to obtain and organize. For example, a single collection of still images captured during one OER cruise includes more than 90,000 images. However, many selected still images of deep-sea corals and coral reef animals (see Figure 3) are available through the online NOAA Photo Library, a digital image collection of over 38,000 photos. The collection is arranged

IV. DOCUMENTATION AND METADATA MANAGEMENT AND EXCHANGE
Many types of metadata are required to ensure that digital video and still images can be interpreted and used by software that may be available on future computers and operating systems. Technical metadata about the encoding format, codecs used, playback rates, etc., are critical for software to be able to correctly interpret and display the contents of the digital files. Descriptive metadata are necessary to provide context for the content being displayed and to answer questions about when, where, and why the images were obtained and who obtained them. It is often easiest to acquire these types of descriptive and technical metadata from the PI or data manager soon after each cruise.
Video shot during submersible operations is often the primary data collection activity for the dive and is intrinsically a form of geospatial data. As a result, the use of video as a source of quantifiable geospatial data (e.g., percentage of seafloor area covered by sponges and echinoderms, identification of common and unique species at a specific location) makes the content management and metadata requirements different from those for video that is primarily a record of a historic event.
The VDMS working group developed a set of tools and documents to assure standardized access to data and information collected during NOAA OER-sponsored cruises. These include:
• A VDMS requirements document that defines technical standards for the system, describes archival storage formats and conditions, and specifies online retrieval requirements.
• Metadata standards requirements, which include using the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata and MARC21, the international library standard. Other systems, such as CoRIS, may access VDMS information in NOAALINC using the Z39.50 search protocol.
• A crosswalk and converter in the MERMAid system (NCDDC online catalog) that enables sharing common metadata in both FGDC and MARC21 metadata records, as well as converting FGDC metadata to MARCXML format. Collection-level (parent) MARC21 metadata records from the NCL can be linked to dive-level (child) FGDC metadata records created by the NCDDC using the crosswalk and conversion techniques developed for the VDMS. These metadata records provide the descriptive information for both NOAALINC and MERMAid metadata discovery tools (Figure 4).
• Documentation guidelines for each collection, tape, clip or highlights video, or still image. These guidelines are referred to as DV12 (Digital Video 12 descriptive elements) for video and DI12 (Digital Image 12 descriptive elements) for still images. DV12 and DI12 templates were developed to help field personnel document the contents of videos they created. Information from the field personnel (in DV12 or DI12 form) is used by NCL staff to create MARC21 collection-level records.
• A Video Lab, created by the VDMS team to provide a venue for transferring image data from original media to mass-storage media. It is equipped with video processing hardware and software that enable VDMS project staff to develop and document workflow processes, perform more robust file conversion and video management, and create files suitable for online access and long-term retention. Final Cut Pro 5™ software is installed on two Apple™ G5 video editing workstations to allow in-house video processing by encoding the raw video data into online-accessible formats.
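The crosswalk idea in the list above can be illustrated with a small sketch that maps a handful of FGDC elements to MARCXML data fields. This is not the MERMAid converter: the mapping table is an illustrative subset chosen for this example (title to 245 $a, abstract to 520 $a, online linkage to 856 $u, bounding coordinates to 034 $d-$g), the function name is hypothetical, and indicators are left blank and the leader is omitted for brevity, so the output is not a complete MARC21 record.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Illustrative subset: (FGDC element path, MARC tag, subfield code).
CROSSWALK = [
    ("idinfo/citation/citeinfo/title", "245", "a"),
    ("idinfo/descript/abstract", "520", "a"),
    ("idinfo/citation/citeinfo/onlink", "856", "u"),
]

# FGDC bounding coordinates -> MARC 034 subfields (west, east, north, south).
BOUNDING = [("westbc", "d"), ("eastbc", "e"), ("northbc", "f"), ("southbc", "g")]


def fgdc_to_marcxml(fgdc_xml):
    """Convert an FGDC metadata document (string) to a minimal MARCXML record."""
    src = ET.fromstring(fgdc_xml)
    record = ET.Element(f"{{{MARC_NS}}}record")

    def add_field(tag, subfields):
        # Simplification: both indicators are left blank for every field.
        field = ET.SubElement(record, f"{{{MARC_NS}}}datafield",
                              {"tag": tag, "ind1": " ", "ind2": " "})
        for code, value in subfields:
            sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
            sub.text = value

    # One data field per non-empty FGDC element named in the crosswalk.
    for path, tag, code in CROSSWALK:
        for node in src.findall(path):
            if node.text and node.text.strip():
                add_field(tag, [(code, node.text.strip())])

    # The bounding box becomes a single 034 field with up to four subfields.
    box = src.find("idinfo/spdom/bounding")
    if box is not None:
        coords = [(code, box.findtext(name, "").strip()) for name, code in BOUNDING]
        coords = [(code, value) for code, value in coords if value]
        if coords:
            add_field("034", coords)
    return record
```

Calling `ET.tostring(fgdc_to_marcxml(text), encoding="unicode")` serializes the result. A table-driven mapping of this kind keeps the crosswalk itself readable as documentation, which is useful when librarians and data managers need to review it together.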
Additional metadata are needed to assure that observations can be referenced to a specific point in the world ocean. Geographic information is often collected automatically from shipboard systems, using Global Positioning System (GPS) or other navigation technologies. The geospatial relationship of video data to positional information is maintained by using time-stamp information available from both the video and navigation sources. In addition to time-stamp annotations for individual video tapes, data managers and PIs often provide descriptive metadata for a video and/or image collection using the DV12 and/or DI12 templates. They may also provide copies of cruise reports and/or other data reports created concurrently with or subsequent to the video collection. Descriptive information about video content and technical details about file formats, encoding algorithms, and processing equipment are needed to ensure that these videos are accessible and meaningful for the long term. Related navigation data, technical details and digital reports are archived with other observation data and become accessible through the NODC Ocean Archive System (OAS, http://www.nodc.noaa.gov/Archive/Search/).
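As a minimal sketch of the time-stamp linkage described above, the following function estimates the position for a video time stamp by linear interpolation between the two nearest navigation fixes. The function name and data layout are assumptions made for illustration. The sketch presumes that the video and navigation clocks are synchronized and that the fixes are sorted by time, and it does not handle tracks that cross the 180° meridian.

```python
from bisect import bisect_right
from datetime import datetime


def position_at(nav_fixes, when):
    """Estimate (lat, lon) at time `when` from time-sorted navigation fixes.

    nav_fixes: list of (datetime, latitude, longitude) tuples.
    Returns None when `when` falls outside the navigation record.
    """
    times = [t for t, _, _ in nav_fixes]
    if not times or when < times[0] or when > times[-1]:
        return None
    i = bisect_right(times, when)
    if i == len(times):
        # `when` coincides with the last fix.
        return nav_fixes[-1][1], nav_fixes[-1][2]
    t0, lat0, lon0 = nav_fixes[i - 1]
    t1, lat1, lon1 = nav_fixes[i]
    # Fraction of the way from the earlier fix to the later one.
    frac = (when - t0) / (t1 - t0)
    return lat0 + frac * (lat1 - lat0), lon0 + frac * (lon1 - lon0)
```

For a frame stamped halfway between two fixes, the function returns the midpoint of the two positions. In practice the interval between fixes and any known clock offset between the camera and the navigation system would need to be documented with the collection.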

V. VIDEO DATA ARCHIVING AND MULTIPLE ACCESS
When video tapes and related data are received at the NCL, a metadata librarian organizes and enters the received products into online inventories and appropriate expedition folders. Figure 5 illustrates the flow of video and other data from data providers to the archive centers [3]. NCL notifies NODC that a new collection of video (and other materials) has been acquired. An NODC data content manager creates an accession entry in the NODC Accession Tracking Data Base (ATDB) [4] for the collection of tapes and related materials.
NODC provides long term archival storage, management and stewardship of digital oceanographic observation data and metadata. The NODC accession number assigned during the data ingest process is a tracking number for the collection. A copy of clips and highlights files from specific OER cruises is added to the associated NODC accession, with a link to the file established in the NCL MARC21 record for the collection (Figure 2). Non-video observation data collected during the cruise may be included in the same NODC accession or managed in a separate accession, with a reference to the video accession.
Archival digital files maintained at NODC are stored primarily on RAID media, with offsite backups on tape media [4]. At present, original video media provided by cruise PIs or data managers are stored and maintained by the NCL. Other hard-copy information, including binders of paper forms created during a cruise, is also stored and maintained by NODC, the NCL, or NCDDC. As resources for digitizing these paper media become available, digital surrogates of the paper will be maintained in the NODC digital archives with related ocean observation data. Most ocean data archived at NODC can be discovered and downloaded using the NODC Ocean Archive System (OAS, online at http://www.nodc.noaa.gov/Archive/Search/) [4].
The OER Digital Atlas was developed by the OER/NCDDC team using Google™ Maps applications for researchers looking for deep-coral video and raw data. Each color-coded dot on the map represents an OER signature cruise from 2001 to the present, and users can filter the dots by year and exploration theme. Expedition information, including scientific observations, dive activities, and ship navigational data, can be located and retrieved. Figure 6 illustrates the OER Digital Atlas interface.
Additional links provide access to data sets archived at the National Geophysical Data Center (NGDC), NODC or NCL, to Expedition Education Modules (EEMs) developed specifically for some NOAA signature expeditions, and to geographic information system (GIS) applications with detailed cruise and related data layers.
The VDMS project establishes a solid example of procedures to assure that NOAA's scientific video data in both physical and online formats are archived and preserved for future generations. The VDMS project working group continues to collaborate closely with NOAA OER project scientists, oceanographers, and IT specialists to develop data management requirements and strategies. This project provides an ongoing opportunity to improve the quality and completeness of metadata and information used in the NOAALINC catalog and the NODC Ocean Archive System and to provide online access to NOAA ocean exploration video and related data to a global customer base.
The successes of the VDMS project demonstrate that much has been done, but there is more work to do. As additional resources become available, plans include providing online access to broader subsets of available digital video holdings, hosting an informal seminar series, and examining how other groups (e.g., educators, other scientists) use the digital video data. The long-term VDMS Project plans include:
• Increasing access to multi-platform video images through the NOAA Libraries Online Catalog (NOAALINC) and through the WorldCat catalog. WorldCat is the world's largest and richest database of bibliographic information, linking approximately 108 million bibliographic records from the catalogs of over 54,000 libraries in 109 countries.
• Developing a web-based portal from which diverse OER ocean data, including video, still image, and audio files, will be accessible via text-driven searches or via map-driven searches using a digital atlas.
• Using the NODC Archive Management System as the central digital file management repository for video, still images, ocean observations, and related documentation.
• Expanding the scope of relevant video and image data to include similar data and information from other NOAA Line Offices and Program Offices.