$Id: Readme.txt 2605 2018-08-31 11:43:21Z lendl $

This directory contains miniSEED files that have caused problems of some
sort. Most cases were encountered while preparing the recorded data of
various experiments for inclusion in the GEOFON archive. These files were
often recorded with instruments that are not in use at the GIPP and
therefore contain various "surprises" that are never seen when using GIPP
loggers.

 Note: All ASCII dumps are in the same format that the GIPPtools
       'mseed2ascii' and 'mseedinfo' utilities produce.


Test 001 -- A miniSEED record containing zero samples
-----------------------------------------------------

  This three-record miniSEED snippet was taken directly from a real file
  that was recorded during the PASSEQ experiment. The issue with this file is
  the second record, which does not contain any time series data (zero trace
  length). In addition, the start time of the second record is before the
  stop time of the first record.

    001-zero-samples.mseed        The miniSEED file
    001-zero-samples.header       ASCII dump of the header
    001-zero-samples.data.gz      ASCII dump of the sample values (gzipped)
    001-zero-samples.file         Tabular summary of file content
    001-zero-samples.index        One-line-per-record summary
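
  A minimal sketch of how a reader could flag both issues described above.
  The names 'RecordInfo' and 'check_records' are hypothetical and not part
  of the GIPPtools; the sketch assumes that start time, sample count and
  sample rate have already been extracted from each record header.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecordInfo:
    """Hypothetical per-record summary (not a GIPPtools structure)."""
    start: datetime      # time of the first sample
    num_samples: int     # sample count taken from the record header
    sample_rate: float   # samples per second

    @property
    def end(self) -> datetime:
        """Time of the last sample (equals the start time for empty records)."""
        if self.num_samples == 0:
            return self.start
        span = (self.num_samples - 1) / self.sample_rate
        return self.start + timedelta(seconds=span)


def check_records(records):
    """Return (index, problem) tuples for empty or backward-stepping records."""
    problems = []
    previous = None
    for index, record in enumerate(records):
        if record.num_samples == 0:
            problems.append((index, "zero samples"))
        if previous is not None and record.start < previous.end:
            problems.append((index, "starts before previous record ends"))
        previous = record
    return problems
```

  Applied to three records shaped like this test case, the sketch reports
  both problems for the second record (index 1) and none for the others.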


Test 002 -- Different data streams with the same name/id
---------------------------------------------------------

  This five-record miniSEED snippet was taken directly from a real file
  that was recorded during the PASSEQ experiment. The "special feature" in
  this snippet is the change in the sample rate between the second and third
  record (from 20 Hz to 100 Hz) without any hint in the miniSEED station or
  channel id (the network and location ids do not change either). The same
  thing happens again in the opposite direction (100 Hz to 20 Hz) between
  the fourth and fifth record.

    002-stream-change.mseed        The miniSEED file
    002-stream-change.header       ASCII dump of the header
    002-stream-change.data.gz      ASCII dump of the sample values (gzipped)
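
  The following sketch shows one way to notice such a silent stream change:
  remember the last sample rate seen for every stream id and report any
  record that deviates from it. The function name and the tuple layout are
  assumptions made for this illustration only.

```python
def find_rate_changes(records):
    """Report sample rate changes within streams that share the same id.

    records: iterable of (stream_id, sample_rate) pairs in file order, where
    stream_id is any hashable value, e.g. a (network, station, location,
    channel) tuple.
    Returns a list of (record_index, stream_id, old_rate, new_rate).
    """
    last_rate = {}
    changes = []
    for index, (stream_id, rate) in enumerate(records):
        old_rate = last_rate.get(stream_id)
        if old_rate is not None and old_rate != rate:
            changes.append((index, stream_id, old_rate, rate))
        last_rate[stream_id] = rate
    return changes
```

  For five records with the rates 20, 20, 100, 100 and 20 Hz under one id,
  the sketch reports a change at the third and at the fifth record.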


Test 003 -- Duplicated samples (identical)
------------------------------------------

  This miniSEED file was encountered while preparing PASSEQ experiment data
  for inclusion into GEOFON. Here the problem is that the file contains all
  samples twice with some jumps along the time axis! Fortunately, at least
  the sample values are identical.

  Note: The original file was several megabytes in size and covered a full
        day (twice). To save disk space and to speed up unit tests, only
        the records necessary to demonstrate the problem were kept.

        The remaining example file contains two (complete) copies of about
        240 seconds of data. The file begins with miniSEED records containing
        the second half of the data, followed by records containing the
        complete 240 seconds. (This obviously duplicates the second half of
        the data.) Finally, the file contains the first 120 seconds (again).

    003-duplicated-samples.mseed   The miniSEED file
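
  A small sketch of how duplicated samples can be merged while checking
  that the copies agree. It assumes that every record has already been
  mapped to an integer position on a common sample grid; 'merge_samples'
  is a hypothetical helper, not a GIPPtools routine.

```python
def merge_samples(records):
    """Merge records that may cover the same time span more than once.

    records: iterable of (first_position, values), where first_position is
    the position of the record's first sample on a common sample grid.
    Returns (merged, identical, conflicting):
      merged      -- dict mapping position to sample value (first copy wins)
      identical   -- number of repeated samples with the same value
      conflicting -- positions where a repeated sample had a different value
    """
    merged = {}
    identical = 0
    conflicting = []
    for first_position, values in records:
        for offset, value in enumerate(values):
            position = first_position + offset
            if position not in merged:
                merged[position] = value
            elif merged[position] == value:
                identical += 1
            else:
                conflicting.append(position)
    return merged, identical, conflicting
```

  With records ordered as in this file (second half, complete span, first
  half) every sample is counted once as a duplicate and the list of
  conflicting positions stays empty.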


Test 004 -- Duplicated samples (different)
------------------------------------------

  Like the previous test case, these files were found in the PASSEQ
  experiment data. About 13 minutes were recorded twice, but with completely
  different sample values.

    004-double-data.1.mseed
    004-double-data.2.mseed
    004-double-data.3.mseed


Test 005 -- Steim-1 encoded data stream does not end at word boundary
---------------------------------------------------------------------

  The Steim-1 and Steim-2 encoding schemes store sample differences in 32-bit
  words. Steim-1, for example, can store either four 1-byte differences, two
  2-byte differences, or a single 4-byte difference in one word.

  This can become problematic at the end of a data stream when there are not
  enough samples left to completely fill the 32-bit word. For example, the
  second miniSEED record in the test file ends with three 1-byte differences.

  One solution for encoding the last three samples is to use one word with
  two 2-byte differences followed by one word with a single 4-byte
  difference. This is the method used by the Steim encoding routines that
  are part of the GIPPtools. A different solution is to use a word of four
  1-byte differences in which only the first three differences are used.
  The value of the fourth difference is undefined! This is the method used
  in the miniSEED file in this test case.

  Like the previous miniSEED files, this file was encountered while
  preparing PASSEQ experiment data for inclusion into GEOFON.

    005-unexpected-end.mseed       The miniSEED file
    005-unexpected-end.header      ASCII dump of the header
    005-unexpected-end.data.gz     ASCII dump of the sample values (gzipped)
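
  The two ways of encoding three trailing differences can be sketched as
  follows. This is an illustration only, not the GIPPtools implementation:
  the function names are hypothetical, big-endian word order is assumed,
  and each word is paired directly with its two-bit Steim-1 layout code
  instead of being placed in a real 64-byte frame.

```python
import struct

# Steim-1 two-bit codes describing the layout of one 32-bit data word.
CODE_FOUR_BYTES = 0b01   # four 1-byte differences
CODE_TWO_SHORTS = 0b10   # two 2-byte differences
CODE_ONE_LONG = 0b11     # one 4-byte difference

FORMATS = {CODE_FOUR_BYTES: ">bbbb", CODE_TWO_SHORTS: ">hh", CODE_ONE_LONG: ">i"}


def pack_tail_exact(diffs):
    """Encode three differences in two words without an unused slot."""
    a, b, c = diffs
    return [
        (CODE_TWO_SHORTS, struct.pack(">hh", a, b)),
        (CODE_ONE_LONG, struct.pack(">i", c)),
    ]


def pack_tail_padded(diffs, filler=0):
    """Encode three 1-byte differences in one word; the fourth slot is filler."""
    a, b, c = diffs
    return [(CODE_FOUR_BYTES, struct.pack(">bbbb", a, b, c, filler))]


def unpack_words(words, remaining):
    """Decode words but stop after 'remaining' differences.

    Stopping at the expected count (e.g. derived from the sample count in
    the record header) is what keeps the undefined filler from being used.
    """
    diffs = []
    for code, raw in words:
        for value in struct.unpack(FORMATS[code], raw):
            if len(diffs) == remaining:
                return diffs
            diffs.append(value)
    return diffs
```

  Both encodings decode to the same three differences as long as the
  decoder stops at the expected count. A decoder that instead consumes
  every slot of the last word would pick up the undefined fourth value.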


Test 006 -- Blockette #1000 at different locations
--------------------------------------------------
  
  These test files feature a blockette #1000 that does NOT directly follow
  the "fixed header section" of the miniSEED record. Although impractical,
  this is legal.
    
    006-two-blockette.mseed      MiniSEED file with a blockette #1001 before 
                                 the mandatory #1000
    006-three-blockette.mseed    MiniSEED file with blockettes #1001, #100 
                                 and #1000 (in that order)
  
  Note: The second file (containing three blockettes) causes additional
        "difficulty" because the three blockettes do not fit into the
        first 64 bytes of a miniSEED record!
        However, by default only 64 bytes of a record are read initially
        by GIPPtools. (The 64 bytes correspond to the smallest legal
        miniSEED record possible.) Hence, special handling (i.e. reading
        more "header bytes") is needed.
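
  A sketch of the kind of special handling meant here: follow the chain of
  blockettes and report how many bytes are required whenever the chain
  leaves the bytes read so far. This is not the GIPPtools code. The
  function names are hypothetical, big-endian headers are assumed, and the
  example record is synthetic (the fixed header is left zeroed except for
  the offset of the first blockette).

```python
import struct

BLOCKETTE_1000_SIZE = 8   # bytes occupied by a blockette #1000


def locate_blockette_1000(buffer, byte_order=">"):
    """Search the blockette chain for blockette #1000.

    buffer must hold at least the 48-byte fixed header section.
    Returns ("found", offset), ("need", byte_count) if more bytes of the
    record must be read first, or ("missing", 0) if the chain ends without
    a blockette #1000.
    """
    # Bytes 46-47 of the fixed header hold the offset of the first blockette.
    (offset,) = struct.unpack_from(byte_order + "H", buffer, 46)
    visited = set()
    while offset != 0 and offset not in visited:
        visited.add(offset)
        # Ask for enough bytes to hold a complete blockette #1000 here.
        if offset + BLOCKETTE_1000_SIZE > len(buffer):
            return ("need", offset + BLOCKETTE_1000_SIZE)
        # Every blockette starts with its type and the next blockette offset.
        blockette_type, next_offset = struct.unpack_from(
            byte_order + "HH", buffer, offset)
        if blockette_type == 1000:
            return ("found", offset)
        offset = next_offset
    return ("missing", 0)


def build_example_record():
    """Synthetic 128-byte record with blockettes #1001, #100 and #1000."""
    record = bytearray(128)
    struct.pack_into(">H", record, 46, 48)          # first blockette at 48
    struct.pack_into(">HH", record, 48, 1001, 56)   # #1001 occupies 48..55
    struct.pack_into(">HH", record, 56, 100, 68)    # #100 occupies 56..67
    # #1000: end of chain, Steim-1 (10), big-endian (1), 2**7 byte record
    struct.pack_into(">HHBBBB", record, 68, 1000, 0, 10, 1, 7, 0)
    return bytes(record)
```

  With only the first 64 bytes of the synthetic record the search asks for
  76 bytes, because blockette #1000 starts at byte 68. Once those bytes
  are available the blockette is found.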
  
  