JAGURS-D (JAGURS-Dispersive) update history

--------------------------------------------------------------------------------
2013.08.30 V0100

Original version is released.
This version is based on flow-speed/dispersive version named "JAGURS_with_DISP_V0120".

--------------------------------------------------------------------------------
2013.09.26 V0110

- Make-time MPI interface error (MPI_Waitall) on K-computer has been resolved.

- For NetCDF output, simulation start date can be specified with
  run-time parameter file as 'start_date="2013-09-26 00:00:00"'.
  Any string within 64 characters can be specified to 'start_date' and
  the default value is '2000-01-01 00:00:00'.
  This value is put int the 'units' attribute of 'time' record.
  Like, 'seconds since [start_date]'.

--------------------------------------------------------------------------------
2013.10.01 V0111

Default value of max_step and conv_val has been changed.

   max_step: 1000 -> 9999
   conv_val: 1.0d-6 -> 1.0d-8

--------------------------------------------------------------------------------
2013.10.17 V0120

- Flood conditions have been modified.

- MPI_DOUBLE is not defined in MPI Fortran interface.
  This should be MPI_DOUBLE_PRECISION. So, real.h has been modified.

--------------------------------------------------------------------------------
2013.10.22 V0130

Convergence step is written into "conv_step.[domain name]".
(i.e. "conv_step.whole")

--------------------------------------------------------------------------------
2013.11.06 V0140

- Debug for flood conditions.

- Redundant calc. of ddx and ddy in hxynl_rwg@mod_hxy.f90 is deleted.
  This doesn't affect the results.

--------------------------------------------------------------------------------
2013.11.06 V0150

Max. elapsed time can be specified on run-time parameter file as
'max_time="24:00:00"', 'max_time="30:00"', or 'max_time="3600"'.
If max_time is not specified, max. elapsed time is not specified.
Then elapsed time reaches specified max_time, execution is truncated.
Such a function is necessary to close NetCDF snapshot files correctly.

--------------------------------------------------------------------------------
2013.11.18 V0151

- Debug for "#ifndef NCDIO" & "#ifdef CONV_CHECK" case.

- Restart file output for truncated run is supported.

--------------------------------------------------------------------------------
2013.11.26 V0160

Code simplification and basic optimization are applied into dispersive part.

--------------------------------------------------------------------------------
2013.11.27 V0161

More optimization is applied into dispersive part.

--------------------------------------------------------------------------------
2013.12.25 V0170

- More optimizations for K-computer are applied to fxy_rwg_disp.
  They become effective if macro variable __FUJITSU is specified.

- Debug for ABC with many processes.

- If "with_disp=2" on run-time parameter file, dispersive becomes invalid
  only on root domain.

- Support negative max height.

--------------------------------------------------------------------------------
2014.01.16 V0171

DEBUG for dispersive case.
Boundary elements of flux fx/fy referred on convergence iteration should be
ones on this time step.
So, call of interp2fine is added right before convergence iteration.

--------------------------------------------------------------------------------
2014.01.17 V0172

A2A3D & SINGLE_A2A have been supported.

--------------------------------------------------------------------------------
2014.01.21 V0173

Minor debugs and support NEC SX.

--------------------------------------------------------------------------------
2014.01.23 V0174

Usual finalization is performed if "Maximum wave height exceeds 1 million" occurs.

--------------------------------------------------------------------------------
2014.02.17 V0175

Output on tgs files on K with double precision is modified.
Because "E" is NOT displayed if exponential part is triple digits.

--------------------------------------------------------------------------------
2014.02.27 V0176

Default value of "itmap_end" is changed from 9999999 to 99999999.

--------------------------------------------------------------------------------
2014.05.28 V0177

Multi-domains can be specified with "plotgrd".
For example, if "plotgrd=2,5" is specified, binary output files only
for domain 2 and 3 are output.

--------------------------------------------------------------------------------
2014.06.25 V0178

Added functions for JAGURS-MRI are installed.

- Initial displacement is given with Gaussian distribution if
  "init_disp_gaussian=1" is specified in "tsun.par".
  Parameters for Gaussian distribution are specified by the file named "gaussian".
  Its format is the following.

   &gaussian
   lon_o=141.300201d0 ! Center lon.[Degrees]
   lat_o=38.337348d0  ! Center lat.[Degrees]
   L=80.0d0           ! Width[km]
   /

- Multi-members on a job is supported.
  Specify "MULTI=ON" in Makefile.* and make it. Run jagurs like...

   $ mpirun -np 400 ./jagurs 1-100 tsun.par

  In this case, 100 members run on a job with 400 processes
  (4 processes for each member).
  Input files should be in "input.[member ID (6 digits)]" and output files are put into
  "member.[member ID (6 digits)]".

--------------------------------------------------------------------------------
2014.08.05 V0200

Difference scheme for advective term is changed into 3rd-order upwind from original
1st-order upwind.

--------------------------------------------------------------------------------
2014.09.04 V0201

Station (tgs) output is written into single file per process
when SINGLE_TGS=ON is specified in Makefile.
This file can be split into files for each station by "splittgs.sh".

--------------------------------------------------------------------------------
2014.10.28 V0202

Staggered adjustment for velocity output is applied.

--------------------------------------------------------------------------------
2014.11.11 V0203

How to apply difference on dispersive term is changed.
This provide additional stability and checker-board instability should be mitigated.

--------------------------------------------------------------------------------
2014.11.28 V0300

Cartesian axis becomes available with "CARTESIAN=ON" on Makefile.
In addition, difference scheme for advective term can be switched on make time. 
If "UPWIND3=ON" is specified on Makefile, 3rd-order upwind is utilized.

NOTE: Multi-domains (nesting) execution is NOT supported on Cartesian mode yet!

--------------------------------------------------------------------------------
2014.12.19 V0310

Multi-domains (nesting) execution has been supported on Cartesian mode.

--------------------------------------------------------------------------------
2014.12.31 V0311

DEBUG for abnormal velocity (> 1,000 m/s) around coastline is applied.

--------------------------------------------------------------------------------
2015.01.07 V0312

Snapshot output of flow speed has been available if "speedgrd=1" is specified
in "tsun.par".

--------------------------------------------------------------------------------
2015.02.10 V0320

NOTE: Currently NEC SX is not supported. We need to replace FFT routines.

- Elastic loading and seawater density stratification are supported.

   Allgeyer, S., and P. Cummins (2014),
   Numerical tsunami simulation including elastic loading and seawater density stratification,
   Geophys. Res. Lett., 41, 2368-2375, doi:10.1002/2014GL059348

Additional parameters in "tsun.par" are as follows.

   with_elastic_loading=1 ! Specify 1 to activate elastic loading.
   m_radius=2.0d3         ! Radius in which the loading effect will be estimated.
   m_pyfile='PREM_Ggz.nc' ! NetCDF file to specify Green function.

   with_density=1         ! Specify 1 to activate seawater density stratification.
   m_rho=1025.5d0         ! The value of rho.
   m_K=2.2d9              ! The value of K.

- Debug for multi-case execution on NEC SX is applied.

- Debug for the case lon_west or lon_east is grater than 180.0d0 is applied.

[ToDo]

- Sophisticate MPI implementation of elastic loading.

- Support NEC SX (Replace FFT routines).

- Performance optimizations.

--------------------------------------------------------------------------------
2015.02.23 V0321

DEBUG: The name "MPI_GROUP" is utilized in Cray MPI?
So, "MPI_GROUP" is renamed into "MPI_XYZ_GROUP".

--------------------------------------------------------------------------------
2015.03.03 V0322

Coriolis force is included in linear case if "coriolis=1" is specified in "tsun.par"

--------------------------------------------------------------------------------
2015.04.15 V0323

- DEBUG for the position of stations on Cartesian axis.

- Initial dispersive with sin-wave becomes available on polar coordinates.

- DEBUG for MULTI & elastic loading/density/sinwave (arbitrary wave) case.

- Apply performance optimization for NEC SX to mod_fxy[_disp].f90.

- Repeated allocate/deallocate of MPI buffers for edges/nesting have been resolved.

- Elastic loading is supported on NEC SX.

--------------------------------------------------------------------------------
2015.05.07 V0324

- Friction term is changed into semi-implicit scheme.

- Flax value is limited by maximum Froude number.
  This value can be specified with the value "froude_lim" in "tsun.par"
  (default: froude_lim=2.0d0).

--------------------------------------------------------------------------------
2015.05.08 V0325

FFT for elastic loading is changed from complex to faster real.

--------------------------------------------------------------------------------
2015.05.11 V0326

Performance optimization, mainly acceleration of OpenMP parallelization,
has been applied to data copies attached to FFT for elastic loading.

--------------------------------------------------------------------------------
2015.05.14 V0327

DEBUG for wall-clock timer routine "wallclock" in mod_timer.f90 and
mod_truncation.f90 on NEC SX.
Utilized routine should be of wall-clock/real time. Not of CPU/user time.

--------------------------------------------------------------------------------
2015.07.16 V0328

NOTE: PROJ.4 library (URL: https://trac.osgeo.org/proj/) is necessary.

- The following functions about initial displacement become available.

   - Initial displacement is given by fault calculation if "init_disp_fault=1"
     is specified in "tsun.par".
     Fault parameters are given by namelist file "fault".
     Calculated displacement is put to "[domain].initl_disp.grd(.[MPI rank])".

   - Horizontal displacement effect is applied to initial displacement with
     fault calculation if "hzdisp_effect=1" is specified in "tsun.par".
     Calculated displacement is put to "[domain].initl_disp.grd(.[MPI rank])".

   - Kajiura filter is applied to initial displacement if "apply_filter=1"
     is specified in "tsun.par".

- Debug for initial dispersive with sin-wave is applied.

--------------------------------------------------------------------------------
2015.07.17 V0329

Only p2c interpolation is performed (in other words, c2p copy is NOT performed)
if "nest_1way=1" is specified in "tsun.par".

--------------------------------------------------------------------------------
2015.08.03 V0330

- The name of the parameter "apply_filter" is changed into "apply_kj_filter".

- If there exist the grid points in which distance to the source is zero,
  the source is displaced +1.0d-6 [m] to x direction.
  This prevents zero-division in subroutine SRECTG (okada.sub.f).

- Coordinate conversions about the grid points in which distance to the source is
  > 30.0d0 [degree] are skipped.
  This prevents invalid operation on broad area calc.

--------------------------------------------------------------------------------
2015.09.28 V0331

Initial displacement is calculated/read only for root domain and ones for
child domains are given by interpolation if "init_disp_interpolation=1" is
specified in "tsun.par".
In default, 3rd order spline interpolation is utilized.
If "use_linear=1" is specified in "tsun.par", linear interpolation is utilized.

--------------------------------------------------------------------------------
2015.10.20 V0332

Output of initial displacements is improved. Now, actual displacements should be included.
Use ncdmerge.V0180 or later.

--------------------------------------------------------------------------------
2015.11.09 V0333

- MPI_IN_PLACE cannot be utilized with MPI_Reduce on SX. So, communication buffer
  is explicitly allocated and used.

- Lower limit of depth to adopt horizontal displacement effect can be specified
  with the namelist valuable "min_depth_hde" in "tsun.par".
  Default value is "min_depth_hde=50.0d0" [m].

- Multiple fault parameters are supported.
  Specify "multrupt=1" and "init_disp_fault=1" in "tsun.par" to use it.

--------------------------------------------------------------------------------
2015.12.18 V0334

- DEBUG for MPI & multiple rupture & interpolation has been applied.

- Windows version without MPI parallelization can be made with "Makefile.Windows".
  Only Intel compilers are supported.

--------------------------------------------------------------------------------
2016.02.08 V0335

- DEBUG for A2A3D & MULTI on K-computer has been applied.

--------------------------------------------------------------------------------
2016.10.24 V0400

- Modifications to avoid zero-division on Okada function has been applied.

- DEBUG for offshore boundary conditions with 3rd order upwind difference method has been applied.

- DEBUG for MPI standard violation in displacement_calc_h0_lat1 @ mod_displacement.f90.
When MPI_IN_PLACE is utilized, the value of receive buffer must not be intent(out).

- DEBUG for lack of display digit in loading_initialize @ mod_loading.f90 and
displacement_apply_kj_filter @ mod_displacement.f90 has been applied.

- DEBUG for the problem that the lack of neighbor exchange layer on the North edge
of child domain has been applied.

- Output interval of tgs output can be specified with the parameter "itgrn" in "tsun.par".

- GNU Fortran (gfortran) has been supported.
Note that, the format of ASCII output is affected slightly.

- To compute/output max velocity can be skipped to save time
(most noticeably for linear computation).
This become effective if "SKIP_MAX_VEL=ON" is specified in Makefile.

- Performance optimization for FFT on NEC SX series (padding) has been applied.

- Convergence check on dispersive calc. is done only every 10 steps to save time
if "LESS_CC=ON" is specified in Makefile.

- DEBUG for restart with sinwave/file-input has been applied.

--------------------------------------------------------------------------------
2016.12.20 V0401

Snapshot output on time-step 0 is performed.

--------------------------------------------------------------------------------
2017.02.13 V0402

- Support single-precision calculation except FFT related part again.

- Uplift time "tau" can be specified as less than timestep "dt" (i.e. "tau=0.0d0").
Then, "tau" is reset to "dt".

--------------------------------------------------------------------------------
2017.06.19 V0403

- DEBUG for DOUBLED Kajiura filter when "init_disp_fault=1" and "apply_kj_filter=1"
has been applied.

- Performance optimization for c2p_all on NEC SX has been applied.

--------------------------------------------------------------------------------
2017.06.19 V0404

- VERY MINOR: Missing precision specifier (i.e. "d0" of "1.0d0") is redeemed.

--------------------------------------------------------------------------------
2017.07.11 V0405

- Wave height of dry cell in snapshot file has been changed from zero to
"missing value" (=-1.0d10) on dry cell.

NOTE: Use "ncdmerge.V0181" in case "NCDIO" for this version!

- DEBUG for failing restart with SKIP_MAX_VEL has been applied.

- DEBUG: When "def_bathy=0", hz is changed on dry cell and it can become wet.
It's not good. So, hz is changed only when it's wet cell.

--------------------------------------------------------------------------------
2017.07.12 V0406

- When extremely numerous initial displacements are specified with NCDIO,
one entry can exceed 4 GByte. So, default format is changed into NetCDF4.

NOTE: Use "ncdmerge.V0182" in case "NCDIO" for this version!

--------------------------------------------------------------------------------
2017.07.18 V0407

CRITICAL!!!: DEBUG for exchange_edges_zz on mod_mpi.f90 has been applied.
The form of the array zz is zz(nlon,nlat). But, operation on exchange_edges_zz is
for the array its form is zz(0:nlon,0:nlat).
This bug affect calc. results and can cause SIGSEGV. Affected cases are as follows.

- V0400 or later, "MPI=ON" and,

- "init_disp_fault=1" or "apply_kj_filter=1" or "init_disp_interpolation=1".

Please use this version henceforward!

--------------------------------------------------------------------------------
2017.10.31 V0408

Miner change to the format of tgs output files has been applied.

--------------------------------------------------------------------------------
2018.06.20 V0500

- Pre-decomposition with grd2dsplit.sh for input GMT files and
  post-mergence with split2dgrd.sh or ncdmerge for output GMT or NetCDF files
  become unnecessarily when ONEFILE=ON is specified in Makefile.
  Note that this function is under the verification.

- Sea wall as line-data with water overflow calculation with Hom-ma formula
  is available when BANKFILE=ON is specified in Makefile.
  Specify xyz-format line-data file to 8 th column in gridfile.
  Wall breaking caused by water overflow can be simulated.
  Specify "broken_rate=[real number]" in "tsun.par".
  Then, the height of sea wall becomes [real number]-times when the first overflow occurs.

- Minimum wave height on each grid point during the simulation is output
  when HZMINOUT=ON is specified in Makefile. Use ncdmerge.V0200 or later.

- Tsunami arrival time on each grid point is output
  when "check_arrival_time=1" is specified in "tsun.par".
  The threshold can be specified with "check_arrival_height=[real number]"
  in "tsun.par" (default value is 0.01d0, 1 cm). Use ncdmerge.V0200 or later.

- To specify "maxgrdfn", "mingrdfn" and "vmaxgrdfn" in "tsun.par" becomes optional.

- Debug and enhancement about initial displacement with Gaussian function
  are applied (contributed by N.Yamamoto @ NIED. Thanks!).
  - Initial displacement file becomes available.
  - Amplitude (h0) can be specified on polar coordination case, too.

- In default, elastic loading is calculated only on the root domain and ones on
  child domains are given by interpolation.
  In default, 3rd order spline interpolation is utilized.
  If "use_linear=1" is specified in "tsun.par", linear interpolation is utilized.
  If "elastic_loading_interpolation=0" is specified in "tsun.par", elastic loading
  is calculated on each domain.

- On flux calculation, the argument of sqrt on limiter operations can sometimes
  be negative. To prevent this, 

     d = half*(dz(i+1,j)+dz(i,j)+hz_old(i+1,j)+hz_old(i,j))
     d = half*(dz(i,j+1)+dz(i,j)+hz_old(i,j+1)+hz_old(i,j))

  are modified into,

     d = max(0.0d0, half*(dz(i+1,j)+dz(i,j)+hz_old(i+1,j)+hz_old(i,j)))
     d = max(0.0d0, half*(dz(i,j+1)+dz(i,j)+hz_old(i,j+1)+hz_old(i,j)))

--------------------------------------------------------------------------------
2018.07.17 V0501

CRITICAL!!!: DEBUG for line-data function.
Almost all line-data will be ignored on V0500.
Stop to use V0500 and use V0501 if you need line-data function.

--------------------------------------------------------------------------------

