$Id: Readme.txt 520 2008-10-15 21:54:02Z lendl $

This directory contains various Steim-1 encoded miniSEED examples useful for
(unit-) testing.


 Note: All ASCII dumps are in the same format that the GIPPtools 'mseed2ascii'
       and 'mseedinfo' utilities produce.


Test 001 -- Single EDL miniSEED data record
-------------------------------------------

  This is a single, 4 kilobyte miniSEED record as produced by an EarthData
  logger unit (EDL). There are no pitfalls, no problematic values and no
  strings attached.

  001-single-record.mseed       The miniSEED file (Steim-1 encoded)
  001-single-record.header      ASCII dump of the header
  001-single-record.data        ASCII dump of the sample values
  001-single-record.file        Tabular summary of file content
  001-single-record.summary     One line summary of file content
  001-single-record.overview    One line, easily readable overview
  001-single-record.index       One line per record summary
  001-single-record.checksum    Checksum of the miniSEED data stream


Test 002 -- QLib2 miniSEED file
-------------------------------

  A series of 6000 samples / 4 records in Steim-1 encoding. The miniSEED file
  was created with the 'qmerge' tool by Doug Neuhauser (of 'qlib2' fame) from
  actual EDL data.

  002-qlib2-repacked.mseed      miniSEED file (Steim-1 encoded)
  002-qlib2-repacked.header     ASCII dump of the header
  002-qlib2-repacked.data.gz    ASCII dump of the sample values (gzipped)

  Note:  To create the miniSEED file the following command was used:

           'qmerge -f 2006/10/07.15:17:00 -t 2006/10/07.15:18:00  \
                   -T -v -O STEIM1 -r                             \
                   -o 002-qlib2-repacked.mseed e3359061007150000.pri0'

         Important here is the '-r' switch that forces "repacking" (i.e.
         re-encoding) of the output file. In other words, the output file is
         a "QLib2 miniSEED" file and no longer an "EDL miniSEED" file.


  Note:  The same time series was also encoded using the 'Mini-SEED Library'
         (libmseed) by Chad Trabant (IRIS) via the 'msrepack' command.
         The resulting miniSEED file is binary identical to the file created
         by QLib2 and is therefore not tested again here.


Test 003 -- PASSCAL miniSEED file
---------------------------------

  A miniSEED file created by the 'ref2mseed' utility that comes with the
  PASSCAL software (the source is located inside the 'ref2segy' subdirectory).

  Possible pitfalls:
  - Instead of the commonly used 4 kilobyte records this PASSCAL utility writes
    1024 byte miniSEED records! This follows the SEED standard and is OK.
  - The "data indicator" is not "D" as with EDLs but "R" instead (following
    the SEED V2.4 standard).
    Comment: Using "R" instead of "D" is actually better (indicating "raw"
    unprocessed data instead of just "data"). Maybe we should request EarthData
    to update their software?
  - The ASCII sequence number at the beginning of each miniSEED record is
    "000000" for every record. (Should be increasing.)
  - Different word order for the fixed section of the miniSEED record (the
    first 64 bytes containing the header information) and the time series data
    inside the same miniSEED record!
    This is a violation of the SEED V2.4 standard. The SEED reference manual
    explicitly mentions that for miniSEED ("Data Only SEED") files there can
    only be one word order inside a record. (See SEED V2.4 Reference Manual,
    Appendix G, page 187, item 6,  "The order of the fixed section of the data
    header must be the same as field 4 of blockette 1000 implies.")
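
  The word order pitfall above can be checked mechanically. The following is
  a minimal sketch, not the GIPPtools implementation. It assumes the usual
  SEED V2.4 layout (start time year and day-of-year at byte offsets 20 and 22,
  offset of the first blockette at byte 46, word order flag in byte 5 of
  blockette 1000) and guesses the header word order from the plausibility of
  the year and day values. All function names are hypothetical.

```python
import struct


def header_word_order(record):
    """Guess the fixed header word order: 1 = big-endian, 0 = little-endian.

    Heuristic (an assumption of this sketch): the start time year and
    day-of-year only look plausible when read in the correct byte order.
    """
    for order, fmt in ((1, ">HH"), (0, "<HH")):
        year, day = struct.unpack_from(fmt, record, 20)
        if 1900 <= year <= 2100 and 1 <= day <= 366:
            return order
    raise ValueError("cannot determine header word order")


def blockette_1000_word_order(record):
    """Return field 4 (word order) of blockette 1000 by walking the chain."""
    prefix = ">" if header_word_order(record) == 1 else "<"
    (offset,) = struct.unpack_from(prefix + "H", record, 46)
    seen = set()  # guard against a malformed, circular blockette chain
    while offset and offset not in seen:
        seen.add(offset)
        btype, next_offset = struct.unpack_from(prefix + "HH", record, offset)
        if btype == 1000:
            return record[offset + 5]  # 0 = little-endian, 1 = big-endian
        offset = next_offset
    raise ValueError("no blockette 1000 found")


def word_order_is_consistent(record):
    """True if the header word order matches what blockette 1000 declares."""
    return header_word_order(record) == blockette_1000_word_order(record)
```

  For a record like the ones in this test, 'word_order_is_consistent' would
  be expected to return False, because the fixed section and the word order
  declared for the data disagree.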

  003-passcal-ref2mseed.mseed    miniSEED file (Steim-1 encoded)
  003-passcal-ref2mseed.header   ASCII dump of the header
  003-passcal-ref2mseed.data.gz  ASCII dump of the sample values (gzipped)

  Note:  This file seems to be typical for miniSEED files that were created by
  PASSCAL software. However, there may be other variants! It looks like the
  miniSEED subroutines were duplicated (cut & paste) for use inside the sub-
  directories of the various PASSCAL utilities at different times during
  development. (Or it is just my mistake that I could not identify the central
  library containing the miniSEED routines.)


Test 004 -- Single sample miniSEED record
-----------------------------------------

  The test file contains three miniSEED records that were taken from an EDL
  file by cutting out the first 190 records.

  Possible pitfall:
  - The last record in the test file contains only one single sample! In other
    words, the first and (at the same time) last sample in that record are the
    same, possibly causing havoc when checking the integration constants.
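
  A minimal sketch of such an integration constant check, written so that
  the single sample case needs no special handling. It assumes that the
  forward and reverse integration constants (X0 and Xn) sit in words 1 and 2
  of the first Steim frame and that the data words are big-endian. The
  function names and the 'data_offset' parameter are hypothetical.

```python
import struct


def integration_constants(record, data_offset):
    """Return (X0, Xn) from words 1 and 2 of the first Steim frame.

    Assumes big-endian data words; data_offset is the "beginning of data"
    value taken from the record header.
    """
    return struct.unpack_from(">ii", record, data_offset + 4)


def check_integration_constants(samples, x0, xn):
    """True if the first/last decoded samples match X0 and Xn."""
    if not samples:
        raise ValueError("record decoded to zero samples")
    # With exactly one sample, samples[0] and samples[-1] are the same
    # element, so X0 and Xn must both be equal to it.
    return samples[0] == x0 and samples[-1] == xn
```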

  004-one-sample.mseed    miniSEED file (containing three records)
  004-one-sample.header   ASCII dump of the headers
  004-one-sample.data.gz  ASCII dump of the sample values (gzipped)


Test 005 -- "Padded" frames in miniSEED record
----------------------------------------------

  The test file contains four miniSEED records that were recorded by an EDR
  (no sensor attached).
  
  The problematic "feature" of this miniSEED file is that it contains "zero 
  nibbles" inside (!) the Steim-1 "data frames" indicating that parts of the 
  miniSEED record should be skipped! Usually these zero nibbles are only used 
  to indicate padding at the end of a recording (to fill up the fixed size 
  miniSEED record when the data stream ends early). EarthData however seems
  to use them to align the 64-byte "data frames" with (real world) minute
  boundaries.
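
  To illustrate, here is a minimal Steim-1 decoding sketch that tolerates
  zero nibbles anywhere inside a frame. It is not the GIPPtools code. It
  assumes big-endian data words and the standard two bit codes stored in
  word 0 of each 64-byte frame (00 = no data, 01 = four 8 bit differences,
  10 = two 16 bit differences, 11 = one 32 bit difference). The function
  name is hypothetical.

```python
import struct

# struct formats for the differences packed into one 32 bit word
UNPACK = {1: "4b", 2: "2h", 3: "i"}


def decode_steim1(data, num_samples):
    """Decode big-endian Steim-1 frames, skipping every 00-coded word."""
    if num_samples == 0:
        return []
    diffs = []
    for start in range(0, len(data) - 63, 64):
        words = struct.unpack_from(">16I", data, start)
        nibbles = words[0]
        for w in range(1, 16):
            code = (nibbles >> (30 - 2 * w)) & 0b11
            if code == 0:
                # No data in this word: X0/Xn in the first frame, padding
                # at the end, or padding in the middle of a frame.
                continue
            raw = struct.pack(">I", words[w])
            diffs.extend(struct.unpack(">" + UNPACK[code], raw))
    # Word 1 of the first frame holds X0, the first sample value.
    (x0,) = struct.unpack_from(">i", data, 4)
    samples = [x0]
    # The first difference refers to the previous record and is not needed.
    for d in diffs[1:num_samples]:
        samples.append(samples[-1] + d)
    return samples
```

  The point of the sketch is the 'continue' branch: a decoder that stops at
  the first zero nibble after the integration constants would lose all
  samples stored behind the padding.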
  
  There is also an error in the miniSEED file! The header indicates that the 
  records contain one blockette. However, in reality two blockettes (#1000 
  and #1001) are located in the header. Fortunately, GIPPtools does not care
  about the blockette counter. Instead the utilities just try to access a 
  specific blockette directly.
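
  The mismatch can be detected by following the blockette chain and comparing
  the result with the counter in the fixed header. This is a minimal sketch,
  assuming the SEED V2.4 layout (number of blockettes in byte 39, offset of
  the first blockette at byte 46, and each blockette starting with its type
  and the offset of the next blockette as 16 bit values). The function name
  and the 'big_endian' parameter are hypothetical.

```python
import struct


def count_blockettes(record, big_endian=True):
    """Return (declared, found): header counter and blockette types found."""
    prefix = ">" if big_endian else "<"
    declared = record[39]  # "number of blockettes that follow"
    (offset,) = struct.unpack_from(prefix + "H", record, 46)
    found = []
    seen = set()  # guard against a malformed, circular blockette chain
    while offset and offset not in seen:
        seen.add(offset)
        btype, next_offset = struct.unpack_from(prefix + "HH", record, offset)
        found.append(btype)
        offset = next_offset
    return declared, found
```

  For the records in this test the expected outcome would be a declared
  count of 1 while the chain contains the types 1000 and 1001.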
  
  Note: Both problems are "historic". They have been fixed by an EDR firmware
        update in Nov/Dec 2013. (Nevertheless, the test cases remain just 
        in case.)

  005-padded-frame.mseed    miniSEED file (containing four records)
  005-padded-frame.header   ASCII dump of the four headers
  005-padded-frame.data.gz  ASCII dump of the sample values (gzipped)
  

Test 006 -- Integer under-/overflow in Steim encoding
-----------------------------------------------------

  Steim encoded miniSEED records store sample values as differences between
  successive sample amplitudes. Because of this it is possible to encode
  sample values that are outside of the number range of a four byte integer!
  (Example: Any long enough sequence of positive sample differences eventually
  leads to absolute sample values that are outside the numerical range of a 
  four byte integer.) This "peculiarity" of the Steim encoding formats leads 
  to two issues:
  
  1. Decoding such a miniSEED record results in sample values that cannot be
     stored as (32 bit) integer values. The decoding software must either
     catch the numeric under- or overflow and indicate an error to the user,
     or handle the situation by writing to a data type that can hold the
     sample values. (The latter approach is used by the GIPPtools collection
     since 2015.)
     
     Note: When the integer underflow is not caught by the software, the 
           returned sample values will flip from extreme negative amplitudes
           to extreme positive values in one sample (or vice versa for integer 
           overflows). This incorrect behavior is documented in the two files 
           with the string "wrong" in the filename (see below).
  
  2. Although this is in principle "legal" (?) from the theoretical encoding
     perspective, it is usually a sign of malformed (input) data. After all,
     the source for the time series data is usually a sensor producing "safe"
     integer numbers exclusively. Therefore, a user warning should be given!
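
  Both issues can be sketched in a few lines. The example below is not the
  GIPPtools code. It sums the differences with Python's unbounded integers
  (the "correct" result), emulates what a 32 bit accumulator would return
  (the "wrong" result) and reports whether any sample left the 32 bit range
  so that a warning can be issued. All names are hypothetical.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Emulate the wrap-around of a 32 bit two's complement integer."""
    return (value + 2**31) % 2**32 - 2**31


def integrate(first_sample, diffs):
    """Return (samples, wrapped, out_of_range) for a series of differences.

    samples      values summed without any size limit
    wrapped      values an unchecked 32 bit accumulator would produce
    out_of_range True if at least one sample does not fit into 32 bits
    """
    samples = [first_sample]
    for d in diffs:
        samples.append(samples[-1] + d)
    wrapped = [wrap_int32(s) for s in samples]
    out_of_range = any(not INT32_MIN <= s <= INT32_MAX for s in samples)
    return samples, wrapped, out_of_range
```

  Starting two counts below the upper limit and adding three differences of
  +1 gives correct samples that exceed the limit, while the wrapped series
  jumps to the most negative value within one sample.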
  
  The following two test file sets can be used to verify correct behavior for 
  the under- and overflow case:
  
  006-integer-overflow.mseed     miniSEED file (containing one 4k record)
  006-integer-overflow.data.gz   ASCII dump of the samples (correct result)
  006-integer-overflow.wrong.gz  ASCII dump of the samples (incorrect result)
  006-integer-overflow.error     Standard error output containing the warning
  
  006-integer-underflow.mseed     miniSEED file (containing one 4k record)
  006-integer-underflow.data.gz   ASCII dump of the samples (correct result)
  006-integer-underflow.wrong.gz  ASCII dump of the samples (incorrect result)
  006-integer-underflow.error     Standard error output containing the warning
  
  Note: Both miniSEED input files are synthetic and contain a simple sine 
        wave that was shifted close to the upper and lower limit of the 
        integer number range.

