========================================================
 Dataset: D1D-TRNS4
 Truly Random Number Seeds with Verified Statistical Quality (Jul.2016)
 Release Notes
 Copyright (c) 2016 by Harris V. Georgiou
========================================================
 Release: Jul 31, 2016 - Version: 1.1a - Format: .wav/.bin
========================================================

This file contains important information about the current version of the dataset package. Downloading and using this material implies that you accept the EULA/Terms-of-Use (please read carefully). We welcome your comments and suggestions.

_______________________________________________
WHAT'S IN THIS PACKAGE?

 - Overview
 - Available file formats
 - Files and Datasets
 - License Agreement

_______________________________________________
OVERVIEW

The generation of truly random number streams is a task of fundamental importance in cryptography and computer simulations. The term "truly" refers to the use of a natural phenomenon that is inherently random, e.g. the decay of a radioactive element. A more practical approach is to use pseudo-random number generators (PRNGs) whose output closely resembles such a random process, provided that they have a very long period and their initial seed is inherently random. Additionally, some cryptographically strong algorithms, such as hashing/digest or encryption functions, can also be used, since they act by design as "entropy diffusers".

This package contains four sets of data that combine (a) truly random sources as seeds and (b) cryptography for additional randomness in the output. Specifically, low-quality (noisy) sound recordings of rainfall and RF static are used as input for multiple PGP/GPG encryption steps with random keys. The result is a set of random binary data of very high statistical quality (sizes 690KB-1.27MB), which can be used as-is for one-time pads or as seeds for high-quality PRNG implementations.
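
The two uses named above (one-time pad and PRNG seed) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not part of the dataset tooling: the helper names otp_xor and prng_from_seed are hypothetical, the 32-byte seed length is an arbitrary choice, and the in-memory seed_material stands in for the contents of one of the .bin files. Note that Python's random.Random (Mersenne Twister) is suitable for simulations but is not a cryptographically secure generator, and that one-time pad bytes must never be reused.

```python
import random


def otp_xor(message: bytes, pad: bytes) -> bytes:
    """XOR a message with pad bytes; applying it twice restores the message."""
    if len(pad) < len(message):
        raise ValueError("pad must be at least as long as the message")
    return bytes(m ^ p for m, p in zip(message, pad))


def prng_from_seed(seed_bytes: bytes, n_bytes: int = 32) -> random.Random:
    """Seed Python's Mersenne Twister from the first n_bytes of seed data."""
    if len(seed_bytes) < n_bytes:
        raise ValueError("not enough seed material")
    return random.Random(int.from_bytes(seed_bytes[:n_bytes], "big"))


# ASSUMPTION: placeholder bytes standing in for the contents of a .bin file.
# With the real dataset this would be read from disk in binary mode, e.g.
# open(path, "rb").read(), where path points at one of the .bin files.
seed_material = bytes(range(64))

# One-time pad: encrypt, then decrypt with the same pad bytes.
ciphertext = otp_xor(b"hello", seed_material)
recovered = otp_xor(ciphertext, seed_material)

# PRNG seeding: the same seed bytes always reproduce the same stream.
rng = prng_from_seed(seed_material)
sample = [rng.randrange(100) for _ in range(5)]
```

Reading the seed from the start of the file is a design choice made here for simplicity; any non-overlapping slice of a .bin file could serve equally well, as long as each slice is used only once.
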
The source data are low-quality noisy waveforms: two rainfall samples with (approximately) mean=1.3kHz and stdev=600Hz in both cases; and two RF static samples with (approximately) mean=1.6kHz and stdev=800Hz. These can also be used as-is, but to serve as quality PRNG seeds the raw data must first be properly pre-processed (at least whitened).

The final data are evaluated using the ENT command-line tool, and the resulting statistics are provided in Excel/LibreOffice format. In essence, ENT compares the bit-level statistics of the binary (PGP/GPG) files to the values expected for perfectly random data. This is an example of the test output, confirming the high-quality randomness of the produced data (seed02-run3):

....................
Entropy = 1.000000 bits per bit.

Optimum compression would reduce the size
of this 10707224 bit file by 0 percent.

Chi square distribution for 10707224 samples is 0.59, and randomly
would exceed this value 44.27 percent of the times.

Arithmetic mean value of data bits is 0.4999 (0.5 = random).
Monte Carlo value for Pi is 3.144041925 (error 0.08 percent).
Serial correlation coefficient is -0.000666 (totally uncorrelated = 0.0).
....................
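
To make the bit-level figures above more concrete, the following Python sketch computes three of them (entropy per bit, arithmetic mean of the bits, and chi-square with its tail probability) for any byte string. It is a simplified approximation written for illustration, not ENT's actual implementation, and the function name bit_stats is hypothetical. The tail probability uses the exact identity for a chi-square distribution with one degree of freedom, P(X > x) = erfc(sqrt(x/2)); for x = 0.59 this gives roughly 0.44, in line with the 44.27 percent quoted in the sample output.

```python
import math


def bit_stats(data: bytes) -> dict:
    """Bit-level statistics loosely modelled on ENT's bit-mode report."""
    n_bits = len(data) * 8
    if n_bits == 0:
        raise ValueError("empty input")

    # Count the 1-bits in every byte; the remaining bits are 0-bits.
    ones = sum(bin(b).count("1") for b in data)
    zeros = n_bits - ones

    p0, p1 = zeros / n_bits, ones / n_bits

    # Shannon entropy in bits per bit (1.0 for a perfectly balanced stream).
    entropy = -sum(p * math.log2(p) for p in (p0, p1) if p > 0)

    # Chi-square against the expected 50/50 split (one degree of freedom).
    expected = n_bits / 2
    chi_square = ((zeros - expected) ** 2 + (ones - expected) ** 2) / expected

    # Probability that a truly random stream would exceed this chi-square.
    p_value = math.erfc(math.sqrt(chi_square / 2))

    return {
        "bits": n_bits,
        "mean": p1,
        "entropy": entropy,
        "chi_square": chi_square,
        "p_value": p_value,
    }


# Tiny demonstration input with exactly as many 1-bits as 0-bits.
balanced = bit_stats(b"\x0f\xf0")
```

Balance of 0-bits and 1-bits is a necessary but not sufficient condition for randomness, which is why ENT also reports the Monte Carlo Pi estimate and the serial correlation coefficient; those are omitted from this sketch.
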

For further information see:

 * ENT - A Pseudorandom Number Sequence Test Program:
   http://www.fourmilab.ch/random/
   https://www.gnu.org/software/gnu-crypto/manual/api/gnu/crypto/tool/Ent.html

_______________________________________________
AVAILABLE FILE FORMATS

The datasets are available in the following formats (included):

   *.wav : source sound files (4bit/mono/8kHz/32kbps)
   *.bin : binary files ready for use (PGP/GPG output)

_______________________________________________
FILES AND DATASETS

Root folder: D1D-TRNS4\

   \1:     .wav source file and 3 binary randomized outputs (PGP/GPG)
   \2:     .wav source file and 3 binary randomized outputs (PGP/GPG)
   \3:     .wav source file and 3 binary randomized outputs (PGP/GPG)
   \4:     .wav source file and 3 binary randomized outputs (PGP/GPG)
   \stats: ENT and Excel/LibreOffice statistics for evaluation

_______________________________________________
LICENSE AGREEMENT

This material was produced primarily for academic research and educational purposes. Downloading and using this material implies acceptance of the Creative Commons License: Attribution-NonCommercial-ShareAlike 4.0 International (BY-NC-SA), 2016.

 * http://creativecommons.org/licenses/by-nc-sa/4.0/

Copyright (c) 2016 by Harris V. Georgiou (MSc,PhD)
 -- http://xgeorgio.info --