


Reads in ERA Interim files and outputs a struct containing the requested
variables for a given spatial domain and time period.
ERA = read_ERA_wind(Mobj, startDate, endDate, datadir, varlist)
DESCRIPTION:
Find all the ERA Interim NetCDF files in datadir and aggregate them
into a single MATLAB struct containing the variables specified in
varlist. In addition to the specified variables, time, longitude and
latitude will be extracted.
The ERA data consist of two files types, gafs*.nc and ggas*.nc. The
former contain:
- SSTK: sea surface temperature
- SP: surface pressure
- MSL: mean sea level pressure
- U10: u wind (10m)
- V10: v wind (10m)
- T2: 2m temperature
- SKT: skin temperature
the latter:
- UVB: downward UV radiation at the surface
- E: evaporation
- FISR: top of atmosphere incident solar radiation
- LSP: stratiform precipitation (large scale precipitation)
- CP: convective preciptation
- SSHF: surface sensible heat flux
- SLHF: surface latent heat flux
- SSRD: surface solar radiation downwards
- SSR: surface solar radiation
- TSR: top solar radiation
- TTR: top thermal radiation
- TP: total precipitation
- TSRC: top net solar radiation, clear sky
- TTRC: top upward thermal radiation, clear sky
- SSRC: surface net solar radiation, clear sky
- STRC: surface net thermal radiation, clear sky
Not all of these are necessarily useful with FVCOM, but they're the
ones that stood out as potentially useful.
INPUT:
Mobj - Mesh object containing the following fields:
* nativeCoords - 'spherical' or 'cartesian'
* lon, lat or x, y - node coordinates (depending on nativeCoords)
startDate - model start date ([YYYY, MM, DD, hh, mm, ss])
endDate - model end date ([YYYY, MM, DD, hh, mm, ss]). Start and end
dates must fall within the same year.
datadir - path to the directory which contains the ERA NetCDF files
varlist - list of the particular variables to extract from the NetCDF
files.
OUTPUT:
era - struct containing the variables specified in varlist.
Author(s)
Pierre Cazenave (Plymouth Marine Laboratory)
Revision history:
2012-10-19 First version based loosely on read_NCEP_wind.m in the
fvcom-toolbox.
2013-07-08 Modified to automatically find the necessary files and
extract the data to cover the time period in question.
==========================================================================

0001 function era = read_ERA_wind(Mobj, startDate, endDate, datadir, varlist) 0002 % Reads in ERA Interim files and outputs a struct containing the requested 0003 % variables for a given spatial domain and time period. 0004 % 0005 % ERA = read_ERA_wind(Mobj, startDate, endDate, datadir, varlist) 0006 % 0007 % DESCRIPTION: 0008 % Find all the ERA Interim NetCDF files in datadir and aggregate them 0009 % into a single MATLAB struct containing the variables specified in 0010 % varlist. In addition to the specified variables, time, longitude and 0011 % latitude will be extracted. 0012 % 0013 % The ERA data consist of two files types, gafs*.nc and ggas*.nc. The 0014 % former contain: 0015 % - SSTK: sea surface temperature 0016 % - SP: surface pressure 0017 % - MSL: mean sea level pressure 0018 % - U10: u wind (10m) 0019 % - V10: v wind (10m) 0020 % - T2: 2m temperature 0021 % - SKT: skin temperature 0022 % the latter: 0023 % - UVB: downward UV radiation at the surface 0024 % - E: evaporation 0025 % - FISR: top of atmosphere incident solar radiation 0026 % - LSP: stratiform precipitation (large scale precipitation) 0027 % - CP: convective preciptation 0028 % - SSHF: surface sensible heat flux 0029 % - SLHF: surface latent heat flux 0030 % - SSRD: surface solar radiation downwards 0031 % - SSR: surface solar radiation 0032 % - TSR: top solar radiation 0033 % - TTR: top thermal radiation 0034 % - TP: total precipitation 0035 % - TSRC: top net solar radiation, clear sky 0036 % - TTRC: top upward thermal radiation, clear sky 0037 % - SSRC: surface net solar radiation, clear sky 0038 % - STRC: surface net thermal radiation, clear sky 0039 % 0040 % Not all of these are necessarily useful with FVCOM, but they're the 0041 % ones that stood out as potentially useful. 0042 % 0043 % INPUT: 0044 % Mobj - Mesh object containing the following fields: 0045 % * nativeCoords - 'spherical' or 'cartesian' 0046 % * lon, lat or x, y - node coordinates (depending on nativeCoords) 0047 % startDate - model start date ([YYYY, MM, DD, hh, mm, ss]) 0048 % endDate - model end date ([YYYY, MM, DD, hh, mm, ss]). Start and end 0049 % dates must fall within the same year. 0050 % datadir - path to the directory which contains the ERA NetCDF files 0051 % varlist - list of the particular variables to extract from the NetCDF 0052 % files. 0053 % 0054 % OUTPUT: 0055 % era - struct containing the variables specified in varlist. 0056 % 0057 % Author(s) 0058 % Pierre Cazenave (Plymouth Marine Laboratory) 0059 % 0060 % Revision history: 0061 % 2012-10-19 First version based loosely on read_NCEP_wind.m in the 0062 % fvcom-toolbox. 0063 % 2013-07-08 Modified to automatically find the necessary files and 0064 % extract the data to cover the time period in question. 0065 % 0066 %========================================================================== 0067 0068 % datadir = '/data/modellers/to_archive/momm-ERA40-interim/'; 0069 % datadir = '/users/modellers/pica/Data/ECMWF/2006'; 0070 % varlist = {'U10', 'V10', 'SKT', 'E', 'TP', 'SSRC', 'STRC'}; 0071 0072 %warning off 0073 0074 if nargin ~= 5 0075 error('Incorrect number of input arguments') 0076 end 0077 0078 subname = 'read_ERA_wind'; 0079 0080 global ftbverbose 0081 if ftbverbose 0082 fprintf('\nbegin : %s \n', subname) 0083 c = 0; % set a counter for the file load loop 0084 end 0085 0086 if exist(datadir, 'dir') ~= 7 0087 error('file: %s does not exist', datadir) 0088 end 0089 0090 %-------------------------------------------------------------------------- 0091 % Open ERA Interim data and check for time range 0092 %-------------------------------------------------------------------------- 0093 0094 [start_year, start_month, start_day] = deal(startDate(1), startDate(2), startDate(3)); 0095 [end_year, end_month, end_day] = deal(endDate(1), endDate(2), endDate(3)); 0096 0097 if start_year ~= end_year 0098 error('Cannot (yet?) process more than a single year at a time') 0099 end 0100 0101 % Get a list of files to use. This is not pretty, but seems to work OK. 0102 files = cell(0); 0103 months = start_month:end_month; 0104 for mm = 1:length(months) 0105 tfiles = dir(fullfile(datadir, num2str(start_year), sprintf('%02d', months(mm)), '*.nc')); 0106 0107 % Add the files to the overall list. 0108 for f = 1:length(tfiles) 0109 if strcmpi(tfiles(f).name, '.') || strcmpi(tfiles(f).name, '..') 0110 continue 0111 else 0112 files{length(files) + 1} = fullfile(datadir, num2str(start_year), sprintf('%02d', months(mm)), tfiles(f).name); 0113 end 0114 end 0115 end 0116 0117 nf = length(files); 0118 0119 % Start with a clean slate 0120 tggas = []; 0121 tgafs = []; 0122 gg = 0; 0123 ga = 0; 0124 clear era 0125 0126 for f = 1:nf 0127 0128 % Get the filename only for prefix comparison 0129 [~, ff, ext] = fileparts(files{f}); 0130 0131 if ftbverbose 0132 c = c + 1; 0133 fprintf('File %s (%i of %i)... ', [ff, ext], c, nf) 0134 end 0135 0136 % File name is ????YYYYMMDDOOFF or ????YYYYMMDDHHFF (with no apparent 0137 % rhyme or reason...). OO is the time origin (midday/midnight) and FF 0138 % is the number of hours since the time origin (3, 6, 9 or 12). Why the 0139 % files couldn't consistently use 24 hour times... 0140 ymd = greg2mjulian(str2double(ff(5:8)), ... 0141 str2double(ff(9:10)), ... 0142 str2double(ff(11:12)), ... 0143 str2double(ff(13:14)) + str2double(ff(15:16)), ... 0144 0, 0); % don't have minutes and seconds in the file name 0145 0146 if strcmpi(ff(1:4), 'ggas') 0147 gg = gg + 1; 0148 tggas(gg) = ymd; 0149 elseif strcmpi(ff(1:4), 'gafs') 0150 ga = ga + 1; 0151 tgafs(ga) = ymd; 0152 else 0153 warning('Unrecognised ERA Interim file prefix (%s)', ff(1:4)) 0154 end 0155 0156 % Now load in the variables in varlist for all the relevant files into two 0157 % structs, ggas and gafs. We're assuming that the files are listed 0158 % increasing with time (i.e. ????YYYYMMDDhhmm.nc). 0159 nc = netcdf.open(files{f}, 'NOWRITE'); 0160 0161 % Only do the spatial data on the first file (the data are global). 0162 if f == 1 0163 lat_varid = netcdf.inqVarID(nc, 'latitude'); 0164 lon_varid = netcdf.inqVarID(nc, 'longitude'); 0165 eralatvector = netcdf.getVar(nc, lat_varid, 'double'); 0166 eralonvector = netcdf.getVar(nc, lon_varid, 'double'); 0167 dx = unique(diff(eralonvector)); % should be all the same value 0168 dy = mean(abs(diff(eralatvector))); % varies, hence mean 0169 0170 % To save on memory, cull the data which doesn't come from the 0171 % domain we've specified (with a 2 element buffer). 0172 if min(Mobj.lon) < 0 && max(Mobj.lon) < 0 0173 % Easy, just shift by 360. 0174 idx_lon = find(eralonvector > (min(Mobj.lon) - (2 * dx)) + 360 ... 0175 & eralonvector < (max(Mobj.lon) + (2 * dx)) + 360); 0176 elseif min(Mobj.lon) < 0 && max(Mobj.lon) > 0 0177 % This is the tricky one. We'll do two passes to extract 0178 % the western chunk first (min(Mobj.lon) + 360 to 360), 0179 % then the eastern chunk (0 - max(Mobj.lon)) 0180 idx_lon{1} = find(eralonvector >= (min(Mobj.lon) - (2 * dx)) + 360); 0181 idx_lon{2} = find(eralonvector < (max(Mobj.lon) + (2 * dx))); 0182 else 0183 % Dead easy, we're in the eastern hemisphere, so nothing 0184 % too strenuous here. 0185 idx_lon = find(eralonvector > (min(Mobj.lon) - (2 * dx)) ... 0186 & eralonvector < (max(Mobj.lon) + (2 * dx))); 0187 end 0188 % Latitudes are easy because there's only one system. 0189 idx_lat = find(eralatvector > (min(Mobj.lat) - (2 * dy)) & eralatvector < (max(Mobj.lat) + (2 * dy))); 0190 0191 % Make a grid of the domain 0192 [era.lon, era.lat] = meshgrid(eralonvector(idx_lon), eralatvector(idx_lat)); 0193 0194 end 0195 0196 for v = 1:length(varlist) 0197 % Use a try catch to pass on the files which don't contain the 0198 % variable we're currently looping over. 0199 try 0200 % Get the data 0201 varid_era = netcdf.inqVarID(nc, varlist{v}); 0202 data = netcdf.getVar(nc, varid_era, 'single'); 0203 catch 0204 % Skip this variable and try the next one 0205 continue 0206 end 0207 0208 try 0209 % Try to apply the scale factor and offset (only if using the 0210 % data from the website - the PML data are already converted). 0211 sf = netcdf.getAtt(nc, varid_era, 'scale_factor', 'double'); 0212 ao = netcdf.getAtt(nc, varid_era, 'add_offset', 'double'); 0213 if isfield(era, varlist{v}) == 0 0214 era.(varlist{v}).data = permute(double(ao + (data(idx_lon, idx_lat) .* sf)), [2, 1, 3]); 0215 else 0216 era.(varlist{v}).data = cat(3, era.(varlist{v}).data, permute(double(ao + (data(idx_lon, idx_lat) .* sf)), [2, 1, 3])); 0217 end 0218 catch 0219 % Otherwise just dump the data as is. 0220 if isfield(era, varlist{v}) == 0 0221 era.(varlist{v}).data = permute(double(data(idx_lon, idx_lat)), [2, 1, 3]); 0222 else 0223 era.(varlist{v}).data = cat(3, era.(varlist{v}).data, permute(double(data(idx_lon, idx_lat)), [2, 1, 3])); 0224 end 0225 end 0226 0227 % Put the lon/lat into the field too. 0228 era.(varlist{v}).lat = eralatvector(idx_lat); 0229 era.(varlist{v}).lon = eralonvector(idx_lon); 0230 0231 end 0232 0233 if ftbverbose 0234 fprintf('done.\n') 0235 end 0236 end 0237 0238 % Dump the time variables to the struct 0239 era.time_gafs = tgafs; 0240 era.time_ggas = tggas; 0241 0242 netcdf.close(nc) 0243 0244 if ftbverbose 0245 fprintf('end : %s \n', subname) 0246 end