Server-side OPeNDAP Analysis - A General Approach Utilizing Legacy Applications through TDS
- 1. Weathertop Consulting, LLC
This slide deck was presented as part of the OPeNDAP Developer's Workshop, Winter 2007 that was held from February 21-23, 2007 at the NCAR Foothills Laboratory in Boulder, CO.
Within the OPeNDAP community the GrADS Data Server (GDS) was a pioneer [Wielgosz, 2003] in introducing server-side analysis capabilities. GDS server-side calculations have proven enormously popular because they permit data-intensive calculations (e.g. climatological averaging) to be performed on fast data resources local to the server, greatly reducing the volume of traffic across the Internet. Within the Live Access Server (LAS) project we adopted the GDS code framework and URL syntax in the development of the Ferret Data Server (FDS) [Rogers, 2004]. Through this server LAS was able to guarantee consistent, geo-referenced and COARDS-standardized OPeNDAP access to the gridded data served by LAS.
In recent work we have ported the FDS functionality to the flexible framework that is available through the Unidata THREDDS Data Server. (We refer to the new server as F-TDS.) We developed a Ferret I/O Service Provider (IOSP) for the Java netCDF library. Ferret is a legacy, command-line analysis and graphics package that reads netCDF (COARDS and CF-1.0), ASCII and various binary file formats. Through Ferret directives ("commands") one can define new virtual variables which represent the result of some analysis operation applied to one or more of the data variables. By registering the Ferret IOSP with the THREDDS Data Server (TDS) a Ferret script which reads netCDF data and defines virtual variables then becomes an OPeNDAP data set. All of the variables (the real and virtual variables) defined by the script are visible to OPeNDAP clients – seamlessly through the netCDF API.
Much of the power of the netCDF API is that it allows applications to 'see' the full coordinate domain of a dataset while delaying the reading of the data until the client specifies the precise subset of interest. A distinctive feature of Ferret is that it allows this approach (so-called 'delayed mode') to be applied to virtual variables: it performs the analysis calculations on demand, and only on the subset of the data requested.
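The delayed-mode idea can be illustrated with a minimal Python sketch. This is not Ferret or the Ferret IOSP; the `VirtualVariable` class, the `sst` grid and the reference value of 12.0 are all hypothetical, chosen only to show a derived variable that advertises its full shape up front but evaluates just the cells a client asks for.

```python
class VirtualVariable:
    """Hypothetical derived variable: shape is known up front, values are lazy."""

    def __init__(self, name, shape, compute):
        self.name = name
        self.shape = shape            # full 2-D coordinate domain, visible to clients
        self._compute = compute       # function (i, j) -> value, called on demand
        self.cells_evaluated = 0      # counter to show how much work was done

    def __getitem__(self, key):
        # Expect a pair of slices, e.g. var[1:3, 2:4].
        rows, cols = key
        row_idx = range(*rows.indices(self.shape[0]))
        col_idx = range(*cols.indices(self.shape[1]))
        out = []
        for i in row_idx:
            row = []
            for j in col_idx:
                row.append(self._compute(i, j))
                self.cells_evaluated += 1
            out.append(row)
        return out


# Made-up 4x6 grid standing in for a "real" variable read from a file.
sst = [[10.0 + i + 0.5 * j for j in range(6)] for i in range(4)]

# Virtual variable: anomaly relative to an assumed reference value of 12.0.
sst_anom = VirtualVariable("sst_anom", (4, 6), lambda i, j: sst[i][j] - 12.0)

# The client can inspect the full domain without triggering any computation.
full_shape = sst_anom.shape

# Only the requested 2x2 subset is actually computed.
subset = sst_anom[1:3, 2:4]
```

In this sketch the analysis runs on 4 of the 24 grid cells, which is the saving that delayed mode offers when clients request small subsets of large virtual variables.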
The Ferret IOSP and THREDDS Data Server combination can also be used to enable on-the-fly server-side analysis. Data access information for multiple data sources and the Ferret commands that define virtual variables can be embedded into the OPeNDAP URL for this server. This information is parsed on the server; a new script is dropped into the scan area being monitored by TDS, and the new data set with its virtual variables is immediately available via the TDS.
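The round trip described above (embed sources and commands in a URL, parse them on the server, write out a script) can be sketched in Python as follows. The URL layout used here, including the `_expr_` marker and the brace-delimited sections, is an assumption made for illustration and should not be read as the actual F-TDS syntax; the server and data URLs are placeholders, and the generated script lines are only meant to resemble Ferret-style commands.

```python
from urllib.parse import quote, unquote

# Hypothetical marker separating the server path from the embedded request.
MARKER = "_expr_"


def build_analysis_url(server_base, data_urls, commands):
    """Embed percent-encoded data sources and commands in a single URL."""
    sources = ",".join(quote(u, safe="") for u in data_urls)
    script = ";".join(quote(c, safe="") for c in commands)
    return f"{server_base}/{MARKER}{{{sources}}}{{{script}}}"


def parse_analysis_url(url):
    """Recover the data sources and commands from a URL built above."""
    _, found, tail = url.partition(MARKER)
    if not found or not (tail.startswith("{") and tail.endswith("}")):
        raise ValueError("URL does not contain an embedded analysis request")
    sources_part, _, script_part = tail[1:-1].partition("}{")
    data_urls = [unquote(s) for s in sources_part.split(",") if s]
    commands = [unquote(c) for c in script_part.split(";") if c]
    return data_urls, commands


def render_script(data_urls, commands):
    """Produce the text of a script a server could drop into its scan area."""
    lines = [f'USE "{u}"' for u in data_urls] + list(commands)
    return "\n".join(lines) + "\n"


# Placeholder server, data source and command, for illustration only.
url = build_analysis_url(
    "http://example.org/thredds/dodsC/ferret",
    ["http://example.org/dods/sst.nc"],
    ["LET sst_anom = sst - sst[L=@AVE]"],
)
urls, cmds = parse_analysis_url(url)
script_text = render_script(urls, cmds)
```

Percent-encoding every component with `safe=""` keeps the separators (`,`, `;`, `{`, `}`) unambiguous, so the server-side parser can split the request without a full grammar.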
The techniques that we have employed are readily applicable for integrating broad classes of legacy applications (Ferret, GrADS, NCL, CDAT, Matlab, etc.) as OPeNDAP server-side computation engines. Other efforts such as SWAMP [Wang, 2007] demonstrate the community's interest in OPeNDAP server-side capabilities. We propose developing a common syntax for communicating the underlying mathematics of the server-side functions. A common syntax would simplify the work of client writers and users who want to take advantage of server-side analysis available from GDS, the Ferret IOSP/TDS and other frameworks and servers. We welcome suggestions, comments and volunteers to help define a server-side analysis framework.
Joe Wielgosz and Brian Doty, The GrADS-DODS Server: An Open-Source Tool for Distributed Data Access and Analysis, 19th Conference on IIPS
Richard Rogers, Steve Hankin and Ansley Manke, The Ferret DODS Server, 20th Conference on IIPS
Daniel L. Wang, C. S. Zender and S. F. Jenks, DAP-enabled Server-side Data Reduction and Analysis, 23rd Conference on IIPS
[Almost all of the presentations listed here are available in the OPeNDAP Community Zenodo library.]
Wednesday February 21, 2007
Session 1
0900-1000 OPeNDAP Current Status
- Introduction by Peter Fox, President of OPeNDAP
1000-1015 Break
Session 2 (moderator: P. Fox)
1015-1200 The OPeNDAP 4 Data Server (aka Hyrax)
- James Gallagher and Nathan Potter - Customizing and Extending Hyrax
- Patrick West - Harnessing the power of the Server 4 Back-End Server modular framework
- Jose H. Garcia - Numerical Grid Computations with the OPeNDAP Back End Server (BES)
- Patrick West - OPeNDAP Server 4 Back-End Server Authentication and Authorization
Lunch 1215-1330
Session 3 (moderator: P. Cornillon)
1330-1530 Security
- John Relph and Kenneth S. Casey - Software Development and Security at NOAA
- John Caron - The THREDDS Data Server and OPeNDAP security
- Tim Pugh - BMRC's OPeNDAP data service, how far can it reach?
- Discussion
1530-1545 Break
Session 4 (moderator: J. Gallagher)
1545-1730 Semantics
- Benno Blumenthal - Using an RDF framework to carry metadata for datasets
- Rob Raskin - SWEET - an upper-level ontology for Earth and Space Sciences
- (withdrawn) Luis Bermudez - MMI and CF
- Peter Fox - OPeNDAP and semantics
- Discussion
No-host dinner, Cafe Gondolier
Thursday February 22, 2007
Session 5 (moderator: D. Holloway)
0900-1030 APIs and Clients
- Roberto De Almeida - Data Access Protocol meets Python
- Peter Cornillon - The Matlab OPeNDAP GUI Suite
1030-1100 Break
- Poster: Denis Nadeau - Enhancing Access to NASA Satellite Data OGC Web Services using OPeNDAP.
Session 5 continued
1100-1215 APIs and Clients (continued)
- John Chamberlain - OPeNDAP: User Versus Programmatic Access
- Greg Janee - Data Discovery in a Distributed Environment and Darren Hardy's AGU presentation.
Lunch 1215-1315
Session 6 (moderator: P. Cornillon)
1315-1445 Data Providers
- Roy Mendelssohn - OPeNDAP at ERD with suggestions for future development
- Jim Potemra - IPRC data transport, LAS and EPIC server
- Christopher Lynnes - OPeNDAP Developments at the Goddard Earth Sciences DISC
- Thomas LOUBRIEU - OPeNDAP in European oceanography data management
Session 7 (moderator: N. Potter)
1445-1715 Server-Side Operations with 1530-1600 Break
- Wenli Yang - The ROSES ACCESS OPeNDAP/OGC Gateway Project
- Roland Schweitzer - Server-side OPeNDAP Analysis - A General Approach Utilizing Legacy Applications through TDS
- Daniel L. Wang - Server-side Data Reduction and Analysis with Script Workflow Analysis for MultiProcessing (SWAMP)
- Thomas LOUBRIEU - Dap4cor: A Dapper-like server for CORIOLIS in-situ data centre
Dinner on your own
Friday February 23, 2007
Session 9 (moderator: P. Fox)
0900-0930 Server-side functions (continued)
- James Gallagher - Server-side Functions for Geo-spatial Selection
0930-1200 Strategies and Directions
- Discussion (with nominal time allocations) focusing on...
- community forums for deciding on standards processes within (and beyond) the OPeNDAP community, who is involved, timeframe, prototypes, etc. (1hr)
- security (20min)
- semantics (profiles, vocabularies) (20min)
- server-side (aggregation) (20min)
- response types (e.g. get_capabilities) (20min)
- metrics (10min)
- others: gateways, Server4, TDS and other general direction issues
Hyrax Tutorial (FL2/1001): 1300-1630
No-host dinner, location TBA
2007_02_22_11_Schweitzer_Server-side OPeNDAP Analysis_RHS-OPeNDAP_Dev_Con.pdf
(1.8 MB)