read_MetUM_forcing

PURPOSE

Read Met Office Unified Model (TM) (hereafter MetUM) netCDF files and extract the variables in varlist.

SYNOPSIS

function MetUM = read_MetUM_forcing(files, varlist)

DESCRIPTION

 Read Met Office Unified Model (TM) (hereafter MetUM) netCDF files and
 extract the variables in varlist.

 MetUM = read_MetUM_forcing(files, varlist)

 DESCRIPTION:
   Given a cell array of file names, extract the variables in varlist into
   a struct. Field names are the variable names given, with any hyphens
   removed (e.g. 'x-wind' becomes MetUM.xwind).
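
The field naming can be sketched without any netCDF files. The snippet below is a minimal illustration that applies the same `regexprep` call used in the source to a hypothetical three-entry `varlist`; the empty `.data` arrays are placeholders, not real output.

```matlab
% Hypothetical variable names (placeholders for illustration only).
varlist = {'x-wind', 'y-wind', 'solar'};

% Hyphens are not legal in MATLAB field names, so strip them.
safenames = regexprep(varlist, '-', '');

% Build an empty struct with one field per variable.
MetUM = struct();
for i = 1:numel(safenames)
    MetUM.(safenames{i}).data = [];
end

disp(fieldnames(MetUM))
```

A request for 'x-wind' is therefore read back as `MetUM.xwind`.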

 INPUT:
   files - cell array of file names (full paths to the files)
   varlist [optional] - list of variable names to extract. If omitted, all
   variables are extracted.

 OUTPUT:
   MetUM - MATLAB struct with one field per extracted variable. Each field
   holds the values in .data; data variables also carry their time stamps
   in .time.

 EXAMPLE USAGE:
   varlist = {'x', 'y', 't_1', 'sh', 'x-wind', 'y-wind', 'rh', 'lh', ...
       'solar', 'longwave'};
   files = {'/tmp/sn_2011010100_s00.nc', '/tmp/sn_201101016_s00.nc'};
   MetUM = read_MetUM_forcing(files, varlist);

 NOTE:
   The last 4 times are dropped from each file because the Met Office
   Unified Model is a forecast model with four hours of forecast in these
   PP files.
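
The trimming described in the note can be sketched in isolation. This is a minimal illustration on synthetic time stamps, not the function's own code path: the ten-sample hourly record is an assumption made for the example, and `round` stands in for the `roundn` call used in the source so that the sketch relies only on core MATLAB functions.

```matlab
% Synthetic record: ten hourly time stamps (assumed for illustration only).
tnum = datenum(2011, 1, 1, 0:9, 0, 0)';
temptime = datestr(tnum, 'yyyy-mm-dd HH:MM:SS');

% Mean sampling interval in minutes.
interval = mean(round(diff(datenum(temptime, 'yyyy-mm-dd HH:MM:SS')) * 24 * 60));

% Number of samples that cover six hours of data.
if abs(60 - interval) < abs(30 - interval)
    nh = 6;     % hourly
elseif abs(30 - interval) < abs(60 - interval)
    nh = 12;    % half-hourly
else
    error('Unsupported time sampling interval.')
end

% Never ask for more samples than the record holds.
nh = min(nh, size(temptime, 1));

kept = temptime(1:nh, :);
dropped = size(temptime, 1) - nh;
```

With ten hourly samples this keeps the first six and leaves out the last four, which matches the note.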

 Author(s):
   Pierre Cazenave (Plymouth Marine Laboratory)

 Revision history:
   2013-08-29 First version.
   2013-09-02 Amend the way the 3 and 4D variables are appended to one
   another. The assumption now is time is the last dimension and arrays
   are appended with time.
   2013-09-06 Trim the last 4 time samples from all variables (these are
   the forecast results which we don't want/need for forcing the model). I
   suppose at some point, given the patchy temporal coverage of the data
   (i.e. the Met Office FTP server doesn't have all files in a usable
   state), it might be better to use the forecast data to partially fill
   in gaps from missing files. However, given this forcing is hourly and I
   was previously using four times daily forcing, I'm not that fussed.
   2013-09-12 Add support for extracting the surface pressure level from
   the 4D temperature variable (temp_2).
   2013-10-23 Fix the way time is handled. Previously a time variable had
   to be specified in varlist. Now, each data variable's time is returned
   as an array within the MetUM.(variable) struct, giving
   MetUM.(variable).time and MetUM.(variable).data. This means if each
   data variable uses a different time sampling, that can be accounted for
   later (by interpolating to a common time series with interp3, for
   example). Currently the code extracts the first 6 hours' worth of data.
   The assumption there is that the Met Office do 4 runs per day, so 6
   hours of data from each run gives you a day's worth.

==========================================================================
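
Because each variable carries its own `.time`, variables sampled at different rates can be brought onto one time axis afterwards. The sketch below is one possible approach under stated assumptions: the `solar` and `rh` fields are filled with synthetic data rather than read from files, and `interp1` along the time dimension is used here instead of the `interp3` mentioned in the revision history.

```matlab
% Synthetic stand-in for the struct returned by read_MetUM_forcing.
nx = 4; ny = 3;
t_hourly = datenum(2011, 1, 1, 0:5, 0, 0)';          % 6 hourly samples
t_half = datenum(2011, 1, 1, 0, 0:30:330, 0)';       % 12 half-hourly samples
MetUM.solar.time = datestr(t_hourly, 'yyyy-mm-dd HH:MM:SS');
MetUM.solar.data = rand(nx, ny, numel(t_hourly));
MetUM.rh.time = datestr(t_half, 'yyyy-mm-dd HH:MM:SS');
MetUM.rh.data = rand(nx, ny, numel(t_half));

% Use the hourly axis as the common time series.
tcommon = datenum(MetUM.solar.time, 'yyyy-mm-dd HH:MM:SS');
trh = datenum(MetUM.rh.time, 'yyyy-mm-dd HH:MM:SS');

% interp1 interpolates along the first dimension, so move time to the
% front, interpolate, then move it back to the end.
rh = permute(MetUM.rh.data, [3, 1, 2]);
rh_common = interp1(trh, rh, tcommon, 'linear');
rh_common = permute(rh_common, [2, 3, 1]);
```

After this step `rh_common` has the same size as `MetUM.solar.data`, so the two arrays can be handled sample by sample.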

SUBFUNCTIONS

function fixedtime = fix_time(nc, varid)

SOURCE CODE

function MetUM = read_MetUM_forcing(files, varlist)
% Read Met Office Unified Model (TM) (hereafter MetUM) netCDF files and
% extract the variables in varlist.
%
% MetUM = read_MetUM_forcing(files, varlist)
%
% DESCRIPTION:
%   Given a cell array of file names, extract the variables in varlist into
%   a struct. Field names are the variable names given, with any hyphens
%   removed (e.g. 'x-wind' becomes MetUM.xwind).
%
% INPUT:
%   files - cell array of file names (full paths to the files)
%   varlist [optional] - list of variable names to extract. If omitted, all
%   variables are extracted.
%
% OUTPUT:
%   MetUM - MATLAB struct with one field per extracted variable. Each field
%   holds the values in .data; data variables also carry their time stamps
%   in .time.
%
% EXAMPLE USAGE:
%   varlist = {'x', 'y', 't_1', 'sh', 'x-wind', 'y-wind', 'rh', 'lh', ...
%       'solar', 'longwave'};
%   files = {'/tmp/sn_2011010100_s00.nc', '/tmp/sn_201101016_s00.nc'};
%   MetUM = read_MetUM_forcing(files, varlist);
%
% NOTE:
%   The last 4 times are dropped from each file because the Met Office
%   Unified Model is a forecast model with four hours of forecast in these
%   PP files.
%
% Author(s):
%   Pierre Cazenave (Plymouth Marine Laboratory)
%
% Revision history:
%   2013-08-29 First version.
%   2013-09-02 Amend the way the 3 and 4D variables are appended to one
%   another. The assumption now is time is the last dimension and arrays
%   are appended with time.
%   2013-09-06 Trim the last 4 time samples from all variables (these are
%   the forecast results which we don't want/need for forcing the model). I
%   suppose at some point, given the patchy temporal coverage of the data
%   (i.e. the Met Office FTP server doesn't have all files in a usable
%   state), it might be better to use the forecast data to partially fill
%   in gaps from missing files. However, given this forcing is hourly and I
%   was previously using four times daily forcing, I'm not that fussed.
%   2013-09-12 Add support for extracting the surface pressure level from
%   the 4D temperature variable (temp_2).
%   2013-10-23 Fix the way time is handled. Previously a time variable had
%   to be specified in varlist. Now, each data variable's time is returned
%   as an array within the MetUM.(variable) struct, giving
%   MetUM.(variable).time and MetUM.(variable).data. This means if each
%   data variable uses a different time sampling, that can be accounted for
%   later (by interpolating to a common time series with interp3, for
%   example). Currently the code extracts the first 6 hours' worth of data.
%   The assumption there is that the Met Office do 4 runs per day, so 6
%   hours of data from each run gives you a day's worth.
%
%==========================================================================

subname = 'read_MetUM_forcing';

global ftbverbose
if ftbverbose
    fprintf('\nbegin : %s \n', subname)
end

assert(iscell(files), 'List of files provided must be a cell array.')

% Find the index of the pressure level closest to the surface for the 4D
% temperature data, using the first file as a reference.
nc = netcdf.open(files{1}, 'NOWRITE');
[~, numvars, ~, ~] = netcdf.inq(nc);
levelidx = [];
for f = 1:numvars
    [varname, ~, ~, ~] = netcdf.inqVar(nc, f - 1);
    if strcmp(varname, 'p') % p = pressure levels in the temp_2 variable.
        varid = netcdf.inqVarID(nc, varname);
        tmpdata = netcdf.getVar(nc, varid, 'double');
        % Find the index for the level closest to the surface. The
        % documentation at:
        %   http://badc.nerc.ac.uk/data/um/Met_Office_NAE_Output.pdf
        % suggests 980hPa is the surface.
        [~, levelidx] = min(abs(tmpdata - 980));
    end
end
% Release the file handle now that the level index is known.
netcdf.close(nc);

% If that failed, use a best guess of the 5th index (based on my checking a
% bunch of random files where the 1000mbar value falls in the p index).
if isempty(levelidx)
    levelidx = 5;
end

MetUM = struct();

for f = 1:length(files)

    % Set the number of time steps to extract to default to 6. It should be
    % checked for each file anyway (assuming there's a time variable being
    % requested).
    nh = 6;

    if ftbverbose
        % Don't display the full path if it's really long.
        if length(files{f}) > 80
            [~, fname, fext] = fileparts(files{f});
            dispname = [fname, fext];
        else
            dispname = files{f};
        end
        fprintf('File %i of %i (%s)... ', f, length(files), dispname)
    end

    nc = netcdf.open(files{f}, 'NOWRITE');

    % Query the netCDF file to find the variable names. If the name matches
    % one in the list we've been given (or if we haven't been given any
    % particular variables), save it in the output struct.
    [~, numvars, ~, ~] = netcdf.inq(nc);

    for ii = 1:numvars
        % Find the name of the current variable
        [varname, ~, ~, ~] = netcdf.inqVar(nc, ii - 1);

        % Test nargin first so varlist is never evaluated when it has been
        % omitted.
        if nargin == 1 || ismember(varname, varlist)
            varid = netcdf.inqVarID(nc, varname);

            % Some variables contain illegal (in MATLAB) characters. Remove
            % them here.
            safename = regexprep(varname, '-', '');

            % Append the data on the assumption the last dimension is time.
            % Don't append data with only 2 dimensions as it's probably
            % longitude or latitude data. The time variables ('t', 't_1',
            % etc.) are skipped: each data variable's time stamps are
            % extracted with fix_time instead.
            tmpdata = squeeze(netcdf.getVar(nc, varid, 'double'));
            nn = ndims(tmpdata);

            if isfield(MetUM, safename)
                switch varname
                    case {'x', 'y', 'x_1', 'y_1', 'longitude', 'latitude', 'lsm'}
                        continue
                    case {'t', 't_1', 't_2', 't_3', 't_4', 't_5', 't_6', 't_7', 't_8'}
                        % Ignore time variables.
                        continue
                    otherwise
                        try
                            % Extract the time for this variable.
                            temptime = fix_time(nc, varid);

                            % Find how many indices to extract to get at
                            % least 6 hours of data.
                            interval = mean(roundn(diff(datenum(temptime)) * 24 * 60, 0));
                            if abs(60 - interval) < abs(30 - interval)
                                % Hourly
                                nh = 6;
                            elseif abs(30 - interval) < abs(60 - interval)
                                % Half-hourly
                                nh = 12;
                            else
                                error('Unsupported time sampling interval (support hourly and half-hourly sampling).')
                            end
                            % Check we don't try and get more data than we
                            % have.
                            if nh > size(temptime, 1)
                                nh = size(temptime, 1);
                            end

                            MetUM.(safename).time = [MetUM.(safename).time; temptime(1:nh, :)];

                            % Append along last dimension.
                            if nn == 3
                                MetUM.(safename).data = cat(nn, MetUM.(safename).data, tmpdata(:, :, 1:nh));
                            else
                                % We're flattening from 4D to 3D here, so
                                % nn - 1.
                                MetUM.(safename).data = cat(nn - 1, MetUM.(safename).data, squeeze(tmpdata(:, :, levelidx, 1:nh)));
                            end
                        catch err
                            fprintf('\n')
                            warning('Couldn''t append %s to the existing field from file %s.', safename, files{f})
                            warning('%s\n', err.message)
                        end

                end
            else % first time around
                switch varname
                    case {'x', 'y', 'x_1', 'y_1', 'longitude', 'latitude', 'lsm'}
                        MetUM.(safename).data = tmpdata;
                    case {'t', 't_1', 't_2', 't_3', 't_4', 't_5', 't_6', 't_7', 't_8'}
                        % Ignore time variables.
                        continue
                    otherwise
                        % This is data.

                        % Extract the time for this variable.
                        temptime = fix_time(nc, varid);

                        % Find how many indices to extract to get at least
                        % 6 hours of data.
                        interval = mean(roundn(diff(datenum(temptime)) * 24 * 60, 0));
                        if abs(60 - interval) < abs(30 - interval)
                            % Hourly
                            nh = 6;
                        elseif abs(30 - interval) < abs(60 - interval)
                            % Half-hourly
                            nh = 12;
                        else
                            error('Unsupported time sampling interval (support hourly and half-hourly sampling).')
                        end
                        % Check we don't try and get more data than we
                        % have.
                        if nh > size(temptime, 1)
                            nh = size(temptime, 1);
                        end

                        MetUM.(safename).time = temptime(1:nh, :);

                        if nn == 3
                            MetUM.(safename).data = tmpdata(:, :, 1:nh);
                        else
                            % Assume temperature at pressure levels.
                            % Extract the pressure level closest to the
                            % surface (see levelidx above).
                            MetUM.(safename).data = squeeze(tmpdata(:, :, levelidx, 1:nh));
                        end
                end
            end
        end
    end

    % Release the file handle before moving on to the next file.
    netcdf.close(nc);

    if ftbverbose
        fprintf('done.\n')
    end

end

% Squeeze out singleton dimensions in each variable's data array.
fields = fieldnames(MetUM);
for i = 1:length(fields)
    MetUM.(fields{i}).data = squeeze(MetUM.(fields{i}).data);
end

if ftbverbose
    fprintf('end   : %s \n', subname)
end

function fixedtime = fix_time(nc, varid)
% Little helper function to get the time data for the current variable.
%
% INPUT:
%   nc : netCDF file handle
%   varid : current variable ID
%
% OUTPUT:
%   fixedtime : date strings (yyyy-mm-dd HH:MM:SS, Gregorian date) for the
%   time stamps of the current variable

% Extract the time array for this variable's time dimension.
[numdims, ~, ~, ~] = netcdf.inq(nc);
dimnames = cell(numdims, 1);
for jj = 1:numdims
    [dimname, ~] = netcdf.inqDim(nc, jj - 1);
    dimnames{jj} = dimname;
end

% Find the dimensions of this variable.
[~, ~, dimids, ~] = netcdf.inqVar(nc, varid);
% We presume the time variable starts with a t.
ttidx = strncmpi(dimnames(dimids + 1), 't', length('t'));
ttvarid = netcdf.inqVarID(nc, dimnames{dimids(ttidx) + 1});
% There are issues around precision here, so
% convert tt to minutes and then back to fractions
% of a day.
tt = netcdf.getVar(nc, ttvarid, 'double');
tt = roundn(tt * 24 * 60, -1) / 24 / 60;

[~, ~, ~, tVarAtts] = netcdf.inqVar(nc, ttvarid);

% Look up the time origin attribute of the time variable.
for j = 1:tVarAtts
    timeatt = netcdf.inqAttName(nc, ttvarid, j - 1);
    if strcmpi(timeatt, 'time_origin')
        timeval = netcdf.getAtt(nc, ttvarid, timeatt);
    end
end
mt = datenum(timeval, 'dd-mmm-yyyy:HH:MM:SS');

fixedtime = datestr(mt + tt, 'yyyy-mm-dd HH:MM:SS');
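
The near-surface level selection and the 4D to 3D flattening in the listing above can be tried out on synthetic arrays. This is a minimal sketch under stated assumptions: the pressure levels in `p` and the array sizes are hypothetical values chosen for illustration, not taken from a real MetUM file.

```matlab
% Hypothetical pressure levels in hPa (not read from a real file).
p = [1000; 950; 925; 850; 700; 500];

% Index of the level closest to 980 hPa, as in the source.
[~, levelidx] = min(abs(p - 980));

% Synthetic 4D array ordered as x, y, pressure level, time.
tmpdata = rand(4, 3, numel(p), 10);
nh = 6;     % number of time samples to keep

% Keep one pressure level and the first nh times, dropping the singleton
% level dimension so the result is x by y by time.
surface = squeeze(tmpdata(:, :, levelidx, 1:nh));
```

For these levels the 1000 hPa entry is the closest to 980 hPa, so `levelidx` is 1 and `surface` has size 4 by 3 by 6.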

Generated on Wed 20-Feb-2019 16:06:01 by m2html © 2005