Scaling the pipe: NASA EOS Terra data systems at 10

Standard products from the five sensors on NASA's Earth Observing System (EOS) Terra satellite are being used world-wide for earth science research and applications. This paper describes the evolution of the Terra data systems over the last decade, during which the distributed systems that produce, archive and distribute high quality Terra data products were scaled by two orders of magnitude.


INTRODUCTION
In the late 1990s, when the initial version of NASA's Earth Observing System (EOS) distributed Terra data systems was being developed, a research data system that produced such a large number and volume of research products was unprecedented. The Terra satellite carries five instruments: Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), Clouds and Earth's Radiant Energy System (CERES), Multi-angle Imaging SpectroRadiometer (MISR), MODerate Resolution Imaging Spectroradiometer (MODIS) and Measurement Of Pollution In The Troposphere (MOPITT). Over 70 calibration and geophysical science algorithms with complex interdependencies had to be integrated and tested before the first on-orbit Terra data from these instruments were available. Two of the instruments involved the additional complexity of collaborating with international partners: Japan for ASTER and Canada for MOPITT.
At the time of the Terra launch in December 1999, the Terra data system was the most complex component of the distributed EOS Data and Information System (EOSDIS) and involved a distributed set of production, archive and distribution facilities. Once the Terra instrument data became available in early 2000, the data systems helped the instrument science teams to continuously improve the science algorithms. Multiple reprocessing campaigns have continuously improved the algorithms, giving the community stable, high quality, validated earth science products. Over the last decade, the data systems needed to be scaled to support these activities. For example, the data archive system went from primarily a tape-based archive to an on-line multi-petabyte disk archive that greatly improved user data access and sped up reprocessing activities.
Several factors were key to the success of the initial Terra data systems and the evolution of the systems over the last decade. The most important was strong leadership from the Terra project, EOSDIS, and the instruments' science team leaders. This leadership helped coordinate the large NASA science teams and keep the data system development focused on the science goals and objectives. Second was an evolvable and scalable set of data systems that were developed through close interaction with science teams. These data systems evolved over time and grew by two orders of magnitude in terms of processing and storage to allow for the daily (forward) processing, science testing and reprocessing rates needed to meet the science teams' and the community's expectations. Finally, active applications and outreach activities facilitated a rich and varied set of Terra products that is widely used by the global community for near real-time and regional research and applications. The feedback from this community was also invaluable in the continual improvement of the standard algorithms as well as the data system capabilities.
Part of this data system evolution occurred before the launch of Terra. In an effort to make the data system more distributed and reduce its components to more manageable "chunks", the generation of standard products was moved, in most cases, from the EOSDIS Core System (ECS) to Science Investigator-led Processing Systems (SIPSs) developed and operated by the respective instrument teams. The approach to the development of ECS was also modified to result in more frequent releases of its Science Data Processing Segment based on priorities expressed by the science community. These steps led to the successful completion of all subsystems needed to support Landsat-7 (launched in April 1999) and Terra. Given the experience in getting ready for Landsat-7 and Terra, especially the multiple end-to-end tests (dubbed Mission Operations and Science System or MOSS tests), the overall readiness of the data systems for the Aqua, ICESat and Aura missions was better, and so the initial production data flows went much more smoothly [1].
In the next sections we discuss the growth of the three main components of the EOS data systems: production, archive and distribution. This is followed by a discussion of the recent evolution in the data systems.

PRODUCTION
Two areas of processing need to be addressed in the EOS mission science data system: forward processing and reprocessing. In this discussion, we use MODIS as an example of production system growth since MODIS produces the majority of the Terra products [2].
Forward production needs to keep up with the data stream from the instrument. Typically, Terra science products are produced within a few days of acquisition. To maintain this production rate, the forward processing system needs to be able to process a data-day worth of instrument data in a calendar day (we refer to this rate as 1X). To accomplish this, and to be able to catch up when issues occur because of anomalies in the instrument, data stream or production hardware, a system capable of a production rate close to 2X is needed. At launch, because of limited resources and the late-1990s technologies, the initial processing rate for the Terra mission was very close to 1X and so the catch-up capability was limited.
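The relationship between production rate and catch-up capability can be illustrated with a short calculation. The following is a minimal sketch, not part of the Terra data systems: it assumes new data keep arriving at one data-day per calendar day, so only the production rate in excess of that baseline works down a backlog. The function name `catch_up_days` and the 5 data-day backlog are hypothetical and chosen only for illustration.

```python
def catch_up_days(backlog_data_days: float, production_rate: float) -> float:
    """Calendar days needed to clear a backlog while new data keep arriving.

    production_rate is a multiple of the baseline rate, where 1.0 means one
    data-day processed per calendar day. Assumption: new data arrive at
    exactly the baseline rate, so only (production_rate - 1) reduces the
    backlog.
    """
    if backlog_data_days < 0:
        raise ValueError("backlog must be non-negative")
    if backlog_data_days == 0:
        return 0.0  # nothing to catch up on
    surplus = production_rate - 1.0
    if surplus <= 0:
        return float("inf")  # at or below the baseline the backlog never shrinks
    return backlog_data_days / surplus


if __name__ == "__main__":
    # Hypothetical backlog of 5 data-days after an outage.
    for rate in (1.0, 1.1, 2.0):
        days = catch_up_days(5, rate)
        print(f"rate {rate:.1f}: {days:.1f} calendar days to clear the backlog")
```

Under these assumptions, a system running at exactly the baseline rate never recovers from an outage, while a system at twice the baseline clears a backlog in as many calendar days as the backlog is long.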
In addition to the standard forward processing, a near real-time (NRT) processing capability may be needed for instruments to support operational and application users. These systems may also be used for education and outreach. The NRT systems need to be designed to provide products specific to the application users and, because of the low latency requirements of these users, may not produce the best science quality products. For Terra, one of the key systems in this area is the MODIS Rapid Response system, which has played a key role in the development of many applications and has been widely recognized for its contributions to the Terra mission's public outreach. Because of the low-latency and NRT requirements, a production rate similar to that of the forward production system is needed, though typically for a more limited set of products.
Another key to scaling to the higher reprocessing rates has been to move the Level 0 data (raw instrument data) from tape to disk. The Terra data systems have also benefited from Moore's law [3]. Buying the latest technology in the period leading up to the reprocessing production phase minimizes the overall cost of procuring the needed hardware. As a result, over the 10 years since launch, as the system capacity grew by an order of magnitude, the overall yearly hardware procurement has remained constant while keeping up to date with the most recent computer technologies.

ARCHIVE
At launch, Terra archive data was held in robotic tape archives. In early 2002, the total EOSDIS archive first exceeded 1 petabyte in size. To improve the distribution of data to users, we started migrating some of the data of high interest to the community onto data pools, which were disk caches on the order of a few tens of terabytes, considered "large" at the time [4]. Today the EOSDIS archive is close to [...]
One way to illustrate the growth in the EOS data system over time is to look at the amount of data that was distributed to the public in the early mission and more recently (Table 1). [...]
• Movement of most of the data into on-line archives (with tape back-up) to provide improved access and online services upon request,
• Improvements in operational capabilities of the EOS Clearing House (ECHO) and the Warehouse Inventory Search Tool (WIST) as a search and order client [8].
As indicated above, many of the lessons learned from Terra were applied to facilitate getting ready for the follow-on EOS missions. It is also expected that NASA's upcoming Earth Science Decadal Survey missions will benefit from the Terra and EOSDIS experience.

CONCLUSION AND THE FUTURE
There has been a significant increase in the capacity and capabilities of the data system supporting the Terra mission in the ten years of its operation. Production, archiving and distribution have all improved significantly, the most notable trend being the increase in distribution to the external user community. Despite reductions in budget, this has been achieved through improvements in technology as well as proactive re-architecting and evolution of the data system. In fact, the cost of operating EOSDIS has been reduced by about 30% since 2009 as a result of the recent evolution activities. Most of the data are on-line and hence are more easily accessible to users. Near real-time capabilities are being provided to support applications requiring the data within a few hours of acquisition. The Terra data systems need to remain agile to serve the community as technology and users' expectations change. The goal is to make Terra data more useful for the scientific and broader user community and to make scientific collaboration easier. Some of the emergent technologies on the horizon that need to be considered are data fusion, cloud computing and access from mobile devices. The data processing teams are evaluating the possible use of cloud computing because of the potential savings in terms of cost and schedule. Some issues that need to be addressed before adopting this technology are data stewardship, I/O bandwidth, scientific reproducibility and governance, and the cost of computing and storage in the cloud.
As the Terra spacecraft and its instruments age, an important consideration is planning to ensure that all the ancillary data that need to be preserved along with the data products are captured from the currently distributed sources (e.g., instrument teams) and placed at the appropriate EOSDIS Data Centers.

ACKNOWLEDGMENT
The authors would like to acknowledge the assistance provided by Lalit Wanchoo (Adnet-Systems, Inc.) in developing the metrics in Table 1 and Figures 1 and 2. This work was performed by the authors as a part of their duties as employees of NASA. Any opinions expressed are those of the authors and do not necessarily reflect the official position of NASA.