MODIStsp: A tool for automatic preprocessing of MODIS time series
2016-05-01
Introduction
MODIStsp is a novel “R” package allowing to automatize the creation of time series of rasters derived from MODIS Land Products data. It allows to perform several preprocessing steps on MODIS data available within a given time period.
Development of MODIStsp started from modifications of the ModisDownload “R” script by Thomas Hengl (2010), and successive adaptations by Babak Naimi (2014). The basic functionalities for download and preprocessing of MODIS datasets provided by these scripts were gradually incremented with the aim of:
- Developing a standalone application allowing to perform several preprocessing steps (e.g., download, mosaicking, reprojection and resize) on all available MODIS land products by exploiting a powerful and user-friendly GUI front-end;
- Allowing the creation of time series of both MODIS original layers and additional Quality Indicators (e.g., data acquisition quality, cloud/snow presence, algorithm used for data production, etc. ) extracted from the aggregated bit-field QA layers
- Allowing the automatic calculation and creation of time series of several additional Spectral Indexes starting form MODIS surface reflectance products
All processing parameters can be easily set with a user-friendly GUI, although non-interactive execution exploiting a previously created Options File is possible. Stand-alone execution outside an “R” environment is also possible, allowing to use scheduled execution of MODIStsp to automatically update time series related to a MODIS product and extent whenever a new image is available.
Required MODIS HDF files are automatically downloaded from NASA servers and resized, reprojected, resampled and processed according to user’s choices. For each desired output layer, outputs are saved as single-band rasters corresponding to each acquisition date available for the selected MODIS product within the specified time period. “R” RasterStack objects with temporal information as well as Virtual raster files (GDAL vrt and ENVI META files) facilitating access to the entire time series can be also created.
Installation
IMPORTANT: MODIStsp requires R v \(\ge\) 3.2.1 and GDAL (Geospatial Data Abstraction Library) v \(\ge\) 1.11.1 To be installed in your system. Brief instructions for installing R and GDAL can be found HERE.
On Windows
Install and load the gWidgetsRGtk2
package:
install.packages("gWidgetsRGtk2")
library(gWidgetsRGtk2)
Upon loading the package, an error window will probably appear. Don’t worry! This is just signaling that libatk-1.0-0.dll is missing from your system. This is due to the fact that library “GTK+” is not yet installed on your system and needs to be installed. To do so, press “OK”. A new window dialog window will appear, asking if you want to install “GTK+”. Select “Install GTK+” and then “OK”. Windows will download and install the GTK+ library. When it finishes, the RSession will be restarted and you should be ready to go!
Install MODIStsp package from GitHub. (You’ll need to have the “devtools” package installed and loaded)
install.packages("devtools")
library(devtools)
install_github("lbusett/MODIStsp")
On Linux systems
- Install the following required dependencies:
- Cairo \(\ge\) 1.0.0, ATK \(\ge\) 1.10.0, Pango \(\ge\) 1.10.0, GTK+ \(\ge\) 2.8.0, GLib \(\ge\) 2.8.0 (required by package
RGtk2
)
- Curl (required by package
curl
)
- GDAL \(\ge\) 1.6.3, PROJ.4 \(\ge\) 4.4.9 (required by package
rgdal
)
On Debian and Ubuntu-based systems, to install packages open a terminal and type
sudo apt-get install r-cran-cairodevice r-cran-rgtk2 libcurl4-openssl-dev libgdal-dev libproj-dev
From R install the libraries gWidgetsRGtk2
and devtools
:
install.packages(c("devtools","gWidgetsRGtk2"))
Install MODIStsp package from GitHub (you’ll need to have the “devtools” package loaded):
library(devtools)
install_github("lbusett/MODIStsp")
Dependencies
MODIStsp exploits functionalities of several other “R” packages. In particular, the following packages are imported:
bitops
(\(\ge\) 1.9.6), data.table
(\(\ge\) 1.9.6), gdalUtils
(\(\ge\) 2.0.1.7), gWidgetsRGtk2
(\(\ge\) 0.0-54), hash
(\(\ge\) 2.2.6), plyr
(\(\ge\) 1.8.3), raster
(\(\ge\) 2.5-2), RCurl
(\(\ge\) 1.95-4.8), rgdal
(\(\ge\) 1.1-8), rgeos
(\(\ge\) 0.3-8), xts
(\(\ge\) 1.0-10), sp
(\(\ge\) 1.2-2), stringr
(\(\ge\) 1.0.0), XML
(\(\ge\) 3.98-1.1),
while the following are suggested:
knitr
, rmarkdown
, png
, grid
.
Additionaly, MODIStsp requires availability of the GDAL - Geospatial Data Abstraction Library v. >1.11.1, with support for HDF4 raster format.
Selection of processing parameters through the GUI
When running MODIStsp in Interactive Mode, a user-friendly GUI allows selection of all processing options required for the creation of the desired MODIS time series. The main available processing options are described in detail in the following.
MODIS Product, Satellites and Layers
Allows to select the MODIS product of interest from a drop-down menu. The user can also select which MODIS sensors should be considered for download and creation of the time series (“Terra”, “Aqua” or both). After selecting the product, pushing the “Select Layers” button opens the “Select Processing Layers” GUI panel (Figure 2), from which the user must select which MODIS original and/or derived QI and SI layers should be processed:
- The left-hand frame allows to select which original MODIS layers should be processed
- The central frame allows to select which Quality Indicators should be extracted from the original MODIS Quality Assurance layers.
- For MODIS products containing surface reflectance data, the right-hand frame allows to select which additional Spectral Indexes should be computed.
Some of the most commonly used Spectral Indexes are available for computation by default (Table II).
Table II: List of default Spectral Indexes available in MODIStsp
NDVI |
Normalized Difference Vegetation Index (Rouse et al. 1973) |
EVI |
Enhanced Vegetation Index (A. Huete et al. 2002) |
SR |
Simple Ratio(Tucker 1979) |
NDFI |
Normalized Difference Flood Index (Boschetti et al. 2014) |
NDII7 (NDWI) |
Normalized Difference Infrared Index - Band 7 (Hunt and Rock 1989) |
SAVI |
Soil Adjusted Vegetation Index (A.R Huete 1988) |
NDSI |
Normalized Difference Snow Index (Hall et al. 2002) |
NDII6 |
Normalized Difference Infrared Index - band 6 (Hunt and Rock 1989) |
GNDVI |
Green Normalized Difference Vegetation Index (Gitelson and Merzlyak 1998) |
RGRI |
Red Green Ration index (Gamon and Surfus 1999) |
GRVI |
Green-red ratio vegetation index (Tucker 1979) |
Users can however specify other SIs to be computed without modifying MODIStsp source code by clicking on the “Add Custom Index” button, which allow to provide info related to the new desired SI using a simple GUI (Figure 3) interface.
Provided information (e.g., correct bandnames, computable formula, etc…) is automatically checked, and the new index added in the list of available ones for all products allowing its computation at the next MODIStsp execution (Figure 4).
Processing Period
Allows to specify the starting and ending dates (dd/mm/yyyy) to be considered for the creation of the time series.
Spatial Extent
Allows to define the area of interest for the processing. Two main options are possible: .
Full Tiles Extent: the user must specify which MODIS tiles he would like to process using the “Start” and “End” horizontal and vertical sliders in the “Required MODIS Tiles” frame. During processing, data from the different tiles is mosaiced, and a single file covering the total area is produced for each acquisition date (Note: pressing the “show map” button, a representation of the MODIS tiles grid is shown to facilitate the selection).
Resized: the user can specify the spatial extent of the desired outputs either;
Manually inserting the coordinates of the Upper Left and Lower Right corners of the area of interest in the “Bounding Box” frame. Coordinates of the corners must be provided in the coordinate system of the selected output projection.
pressing the “Load Extent from a Spatial File” and selecting a raster or vector spatial file. In this case, the bounding box of the selected file is retrieved, converted in the selected output projection, and shown in the “Bounding Box” frame.Required input MODIS tiles are also automatically retrieved from the output extent, and the tiles selection sliders modified accordingly.
Reprojection and Resize
Allows to specify the options to be used for reprojecting and resizing the MODIS images. In particular:
The “Output Projection” menu allows to either select one of the pre-defined output projections or specify a user-defined one by selecting “User Defined” and then inserting a valid “Proj4” string in the pop-up window. Validity of the Proj4 string is automatically checked, and error messages issued if the check fails.
The “Output Resolution”, “Pixel Size” and “Reprojection Method” menus allow to specify whether output images should inherit their spatial resolution from the original MODIS files, or be resampled to a user-defined resolution. In the latter case, output spatial resolution must be specified in the measure units of the selected output projection. At the moment, the resampling method can instead be chosen among “Nearest Neighbour” and “Mode” (Useful for downsampling purposes). Other resampling methods (e.g., bilinear, cubic) are not currently supported since i) they cannot be used for resampling of categorical variables such as the QA and QI layers, and ii) using them on continuous variable (e.g., reflectance, VI values) without performing an a-priori data cleanind would risk to contaminate the values of high-quality observations with those of low-quality ones.
Processing Options
Allows first of all to specify the format desired for the output images. Two of the most commonly formats used in remote sensing applications are available at the moment: ENVI binary and GeoTiff. If GeoTiff is selected, the type of file compression can be also specified among “None”, “PACKBITS”, “LZW” and “DEFLATE”.
The user can then specify if virtual multitemporal files should be created. These virtual files allow access to the entire time series of images as a single file without the need of creating large multitemporal raster images. Available virtual files formats are ENVI metafiles and GDAL “vrt” files. Additionally, the user can select if he desires to save the time series also as “R” rasterStack objects (with temporal information added through the “setZ” method of the raster package). This may be useful in order to easily access the preprocessed MODIS data within “R” scripts.
Finally, users can select if the NoData values of MODIS layers should be kept at their original values, or changed to those specified within the “MODIStsp_Products_Opts” XML file. By selecting “Yes” in the “Change Original NODATA values” checkbox, NoData of outputs are set to the largest integer value possible for the data type of the processed layer (e.g., for 8-bit unsigned integer layers, NoData is set always to 255, for 16-bit signed integer layers to 32767, and for 16-bit unsigned integer layers to 65535). Information about the new NoData values is stored both in the output rasters, and in the XML files associated with them.
Main Output Folder for Time Series Storage
Allows to specify the main folder where the pre-processed time series data will be saved stored. The “Reprocess Existing Data” checkbox allows to specify if images already available should be reprocessed if a new run of MODIStsp is launched with the same output folder. If set to “No”, MODIStsp skips dates for which output files following the MODIStsp naming conventions are already present in the output folder. This allows to incrementally extend MODIS time series without reprocessing already available dates.
Output Folder for Original HDF Storage
Allows to specify the folder where downloaded original MODIS HDF files are stored. The “delete original HDF files” checkbox allows to specify if the downloaded images should be deleted from the file system at the end of the processing. To avoid accidental file deletion, this is always set to “No” by default, and a warning is issued before execution whenever the selection is changed to “Yes”.
Processing
Upon pressing the “Start” button, the main processing routine is launched (MODIStsp_process). This routine performs the following main tasks:
Retrieve the processing options from the GUI (or the saved RData file in case of non-interactive execution
Connect to the lpdaac http MODIS distribution archive, and retrieve the list of HDF images of the selected MODIS product available for the time period selected by the user, for each tile required to cover the selected study area.
For each identified date of acquisition;
Download all required hdf images.
For each original hdf layer i) selected by the user, or ii) required to compute a selected QI or SI layer, extract the data from the original MODIS images and resize and reproject it to the selected output projection, extent and resolution. If more than one tile is needed to cover the output extent, all required tiles are automatically mosaicked before resizing (gdalbuildvrt functionalities are used to avoid creating large temporary raster files). All the main spatial processing tasks are performed using standard GDAL routines, exploiting the wrappers for “R” provided in the “gdalUtils” package (Greenberg and Mattiuzzi 2014). Results are saved as raster GeoTiff or ENVI files using MODIStsp naming conventions (See Section 5).
Starting from files created at point b), compute required QI and/or SI layers and save the results as GeoTiff or ENVI files. Quality Indicators are computed from original QA layers using bitwise operators available in the “bitOps” package (Dutky and Maechler 2013), using a generalization of the “modis.qc.R” script by Yann Chemin (Chemin 2008). Computation of SI layers exploits on-the-fly parsing of the indexes’ formulae to identify the required input raster files and perform the computation.
Delete raster files created at point b) used to compute the QI or SI layers but that correspond to original HDF layers not originally selected for processing.
- When all dates have been processed, create the virtual time series files if required.
Accessing and analyzing the processed time series from R
Preprocessed MODIS data can be retrieved within R scripts either by accessing the single-date raster files, or by loading the saved RasterStack objects. This second option allows accessing the complete data stack and analyzing it using the functionalities for raster/raster time series analysis, extraction and plotting provided for example by the raster
or rasterVis
packages. MODIStsp provides however an efficient function MODIStsp_extract for extracting time series data at specific locations. The function takes as input a RasterStack object with temporal information created by MODIStsp, the starting and ending dates for the extraction and a standard _Sp*_ object (or an ESRI shapefile name) specifying the locations (points, lines or polygons) of interest, and provides as output a _ xts object containing time series for those locations. If the input is of class SpatialPoints, the output object contains one column for each point specified, and one row for each date. If it is of class SpatialPolygons (or SpatialLines), it contains one column for each polygon (or each line), with values obtained applying the function specified as the “FUN” argument (e.g., mean, standard deviation, etc.) on pixels belonging to the polygon (or touched by the line), and one row for each date.
As an example the following code:
#Set the input paths to raster and shape file
infile = 'in_path/MOD13Q1_MYD13Q1_NDVI_49_2000_353_2015_RData.RData'
shpname = 'path_to_file/rois.shp'
#Set the start/end dates for extraction
startdate = as.Date("2010-01-01")
enddate = as.Date("2014-12-31")
#Load the RasterStack
inrts = get(load(infile))
# Compute average and St.dev
dataavg = MODIStsp_extract(inrts, shpname, startdate, enddate, FUN = 'mean', na.rm = T)
datasd = MODIStsp_extract (inrts, shpname, startdate, enddate, FUN = 'sd', na.rm = T)
# Plot average time series for the polygons
plot.xts(dataavg)
loads a RasterStack object containing 8-days 250 m resolution time series for the 2000-2015 period and extracts time series of average and standard deviation values over the different polygons of a user’s selected shapefile on the 2010-2014 period.
Standalone execution and scheduled processing
MODIStsp can be executed as a standalone application using the MODIStsp.bat (for Windows) or MODIStsp.sh (for Linux) batch execution scripts available in the “MODIStsp/ExtData/launcher” subfolder of the package installation. Double-clicking the files or launching them from a shell without parameters launches MODIStsp in interactive mode.
Non-interactive mode is triggered by adding the “-g” argument to the call, and specifying the path to a valid Options File as “-s” argument
- In Linux:
yourpath_to_MODIStsp_sh/MODIStsp.sh -g -s "/yourpath/youroptions.RData"
(launch yourpath_to_MODIStsp_sh/MODIStsp.sh -h
for details).
- In Windows:
yourpath_to_MODIStsp_bat\MODIStsp.bat -g -s "X:/yourpath/youroptions.RData"
(launch yourpath_to_MODIStsp_bat\MODIStsp.bat -h
for details).
Standalone non-interactive execution easily allows to automatically update the time series of a selected product over a given study area whenever a new MODIS image is available. To do that, the user must simply:
Open the MODIStsp GUI, define the parameters of the processing specifying a date in the future as the “Ending Date” and save the processing options. Then quit the program
Schedule non-interactive execution of MODIStsp.bat (or MODIStsp.sh) as windows scheduled task (or linux “cron” job) according to a specified time schedule, specifying the path of a previously saved Options file as additional argument:
- In Linux: edit your crontab by opening a terminal and typing
bash crontab -e
Then add an entry for the MODIStsp.bsh For example, if you want to run the tool every day at 23.00, add the following row:
bash 0 23 * * * /bin/bash /yourpath_to_MODIStsp_sh/MODIStsp.sh -g -s "/yourpath/youroptions.RData"
- In Windows: create a Task following these instructions; add the path of the MODIStsp.bat launcher as Action (point 6), and specify
-g -s "X:/yourpath/youroptions.RData"
as argument.
Adding links to desktop/Start menu for standalone execution
Links to the MODIStsp.bat or MODIStsp.sh standalone launchers can be created automatically launching from R the function MODIStsp_install_launcher()
In Linux: this creates a desktop entry (accessible from the menu in the sections “Science” and “Geography”), and a symbolic link in a known path (default: /usr/bin/MODIStsp). If the path of the symbolic link is included in the user PATH variable, the standalone execution can be done simply calling MODIStsp -g -s "/yourpath/youroptions.RData"
.
In Windows:: this creates a link in the Start Menu and optionally a desktop shortcut.
See ?install_MODIStsp_launcher
for details and path customisations.
Installing R and GDAL
Installing R
Windows
Download and install the latest version of R which can be found here.
Linux
Please refer to the documentation which can be found here, opening the directory relative to the user Linux distribution. The documentation provides instruction to add CRAN repositories and to install the latest R version. With Ubuntu 15.10 Wily (and newer) this step is not mandatory (altough recommended), since packaged version of R is \(\ge\) 3.2.1 (although not the latest); in this case, user can insall R by simply typing in a terminal
sudo apt-get install r-base
Installing GDAL \(\ge\) 1.11.1
Windows
The easiest way to install GDAL on Windows is from the OSGeo4W Website
- Open the OSGeo4W Website
- In the Quick Start for OSGeo4W Users section, select the download of 32bit or 64bit of OSGeo4W network installer
- Run the installer
- Easiest Option:
- Select Express Desktop Install, then proceed with the installation. This will install GDAL and also other useful Spatial Processing softwares like QGIS and GRASS GIS
- Advanded Option:
- Select Advanced Install, then click on “Next” a few times until you reach the “Select Packages” screen.
- Click on “Commandline_Utilities_”, and on the list look for “_gdal: The GDAL/OGR library…" entry
- Click on “Skip”: the word “skip” will be replaced by the current GDAL version number
- Click on “Next” a few times to install GDAL
Debian and Ubuntu-based systems
Ensure that your repositories contain a version of gdal-bin
\(\ge\) 1.11.1. In particular, official repositories of Ubuntu 15.04 Vivid (or older) and Debian Jessie (or older) provide older versions of GDAL, so it is necessary to add UbuntuGIS-unstable repository before installing. To do this, follow instructions here). With Ubuntu 15.10 Wily (and newer) this step is not mandatory, altough recommended in order to have updated version of GDAL installed.
To install GDAL a terminal and type
sudo apt-get install gdal-bin
ArchLinux
GDAL is maintained updated to the latest version as binary package within the community repository; although that, the support for HDF4 format is not included. To bypass this problem, ArchLinux users can install gdal-hdf4
package from AUR (see here or here for the package installation from AUR). This package is updated manually after each release of gdal
on the community repository, so a temporal shift between a new gdal
release and the update of gdal-hdf4
could happen. If you want to manually add the support for HDF4 in case gdal-hdf4
is out-of-date, you can do it following these instructions.
Other Linux systems
Install the packaged binary of GDAL included in your specific distribution; if the version is older than 1.11.1, or if the support for HDF4 format is not included, you can manually install the HDF4 library and compile the source code by adding the parameter --with-hdf4
to the configure
instruction).
References
Boschetti, M., F. Nutini, G. Manfron, P. A. Brivio, and A. Nelson. 2014. “Comparative analysis of normalised difference spectral indices derived from MODIS for detecting surface water in flooded rice cropping systems.” PLoS ONE 9 (2). doi:10.1371/journal.pone.0088741.
Chemin, Y. 2008. modis.qc.R “R” script.
Gamon, J.A., and J.S. Surfus. 1999. “Assessing leaf pigment content and activity with a reflectometer.” New Phytologis 143 (1): 105–17.
Gitelson, A.A., and M.N. Merzlyak. 1998. “Remote sensing of chlorophyll concentration in higher plant leaves.” Advances in Space Research 22 (5): 689–92. doi:10.1016/S0273-1177(97)01133-2.
Hall, Dorothy K, George A Riggs, Vincent V Salomonson, Nicolo E DiGirolamo, and Klaus J Bayr. 2002. “MODIS snow-cover products.” Remote Sensing of Environment 83 (1-2): 181–94. doi:10.1016/S0034-4257(02)00095-0.
Hengl, T. 2010. “Download and resampling of MODIS images.” http://es.
Huete, A., K. Didan, T. Miura, E.P. Rodriguez, X. Gao, and L.G. Ferreira. 2002. “Overview of the radiometric and biophysical performance of the MODIS vegetation indices.” Remote Sensing of Environment 83 (1-2): 195–213. doi:10.1016/S0034-4257(02)00096-2.
Huete, A.R. 1988. “A soil-adjusted vegetation index (SAVI).” Remote Sensing of Environment 25 (3): 295–309. doi:10.1016/0034-4257(88)90106-X.
Hunt, J.R., and B. Rock. 1989. “Detection of changes in leaf water content using Near- and Middle-Infrared reflectances.” Remote Sensing of Environment 30 (1): 43–54. doi:10.1016/0034-4257(89)90046-1.
Rouse, J.W.J., R.H. Haas, J.A. Schell, and D.W. Deering. 1973. “Monitoring vegetation systems in the Great Plains with ERTS. Third ERTS Symposium, NASA SP-351. U.S. Gov. Printing office.” Edited by S.C. Freden, E.P. Mercanti, and M.A. Becker. NASA.
Tucker, C.J. 1979. “Red and photographic infrared linear combinations for monitoring vegetation.” Remote Sensing of Environment 8 (2): 127–50. doi:10.1016/0034-4257(79)90013-0.