Harris V. Georgiou (MSc, PhD)
Truly Random Number Seeds with Verified
Statistical Quality (Jul.2016)
Copyright (c) 2016 by Harris V. Georgiou
Release: Jul 31, 2016
- Version: 1.1a
- Format: .wav/.bin
This file contains important information about the current version of the dataset package. Downloading and using this material implies that you accept the EULA/Terms of Use (please read them carefully).
We welcome your comments and suggestions.
WHAT'S IN THIS PACKAGE?
- Available file formats
- Files and Datasets
- License Agreement
The generation of truly random number streams is a task of fundamental importance in cryptography and computer simulations. The term "truly" refers to the use of a natural phenomenon that is inherently random, e.g. the decay of a radioactive element. Since such sources are often impractical, a more common approach is to use pseudo-random number generators (PRNGs) whose output closely resembles that of a truly random process, as long as they have a very long period and their initial seed is inherently random. Additionally, some cryptographically strong algorithms, such as hashing/digest or encryption functions, can also be used, since they act by design as "entropy diffusers".
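As an illustration of the "entropy diffuser" idea, the following minimal Python sketch condenses an arbitrary block of input bytes into a fixed-size seed with SHA-256 and uses it to initialize a PRNG. The function names (seed_from_bytes, make_prng) are hypothetical and are not part of this package; the placeholder input stands in for truly random bytes. Note that Python's random.Random is a Mersenne Twister, which has a very long period but is not cryptographically secure.

```python
import hashlib
import random


def seed_from_bytes(raw: bytes) -> int:
    """Condense arbitrary input bytes into a 256-bit integer seed via SHA-256."""
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest, "big")


def make_prng(raw: bytes) -> random.Random:
    """Return a Mersenne Twister PRNG seeded from the hashed input bytes."""
    return random.Random(seed_from_bytes(raw))


# Placeholder input; in practice this would be truly random data.
rng = make_prng(b"placeholder for truly random bytes")
sample = [rng.randrange(256) for _ in range(8)]
```

The same input always yields the same stream, so the quality of the output depends entirely on how unpredictable the seed bytes are.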
This package contains four sets of data that combine (a) truly random sources as seeds and (b) cryptography for additional randomness in the output. Specifically, low-quality (noisy) sound recordings of rainfall and RF static are used as input for multiple PGP/GPG encryption steps with random keys. The result is a set of random binary data of very high statistical quality (file sizes from 690 KB to 1.27 MB), which can be used as-is for one-time pads or as seeds for high-quality PRNG implementations.
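The one-time pad use mentioned above can be sketched as a byte-wise XOR of a message with pad bytes of at least the same length. This is only an illustrative sketch: the function name otp_xor is hypothetical, and os.urandom is used here as a stand-in for bytes that would in practice be read from one of the *.bin files. Pad bytes must never be reused across messages.

```python
import os


def otp_xor(message: bytes, pad: bytes) -> bytes:
    """XOR each message byte with the corresponding pad byte.

    Applying the function twice with the same pad restores the message.
    """
    if len(pad) < len(message):
        raise ValueError("pad must be at least as long as the message")
    return bytes(m ^ p for m, p in zip(message, pad))


# Stand-in pad; in practice, bytes taken from one of the package's *.bin files.
pad = os.urandom(64)
ciphertext = otp_xor(b"attack at dawn", pad)
recovered = otp_xor(ciphertext, pad)
```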
The source data are low-quality noisy waveforms: two rainfall samples with (approximately) mean=1.3 kHz and stdev=600 Hz in both cases; and two RF static samples with (approximately) mean=1.6 kHz and stdev=800 Hz. The raw waveforms can also be used as-is, but to serve as high-quality PRNG seeds they must be properly pre-processed (at least whitened).
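One classic way to whiten a biased bit stream is the von Neumann extractor, sketched below in Python. This is not the method used to produce the *.bin files in this package (those come from PGP/GPG encryption steps); it is shown only as a simple example of pre-processing. The function name is hypothetical. The extractor removes bias only if successive bits are independent, so it does not remove correlations between neighboring samples.

```python
from typing import List


def von_neumann_whiten(data: bytes) -> List[int]:
    """Debias a bit stream using non-overlapping bit pairs.

    Pair (0, 1) emits 0, pair (1, 0) emits 1, pairs (0, 0) and (1, 1) are dropped.
    """
    # Unpack each byte into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    out = []
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:
            out.append(first)
    return out


# Example input byte 0b01100011 splits into pairs (0,1), (1,0), (0,0), (1,1).
whitened = von_neumann_whiten(bytes([0b01100011]))
```

The output is always shorter than the input, so a large amount of raw data is needed to obtain a usable number of whitened bits.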
The final data are evaluated using the ENT command-line tool, with the corresponding statistics provided in Excel/LibreOffice format. In essence, ENT compares the bit-level statistics of the binary (PGP/GPG) files to the theoretically optimal values expected for perfectly random data.
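To give a rough idea of the kind of statistics involved, the Python sketch below computes three byte-level measures of the sort ENT reports: Shannon entropy (ideal value 8 bits per byte), arithmetic mean of the byte values (ideal value 127.5), and the chi-square statistic against a uniform distribution. It is a simplified approximation under the assumption of byte-level analysis, not a reimplementation of ENT, and the function name byte_stats is hypothetical.

```python
import math
from collections import Counter


def byte_stats(data: bytes) -> dict:
    """Compute simple byte-level randomness statistics for a block of data."""
    n = len(data)
    if n == 0:
        raise ValueError("empty input")
    counts = Counter(data)
    # Shannon entropy in bits per byte (8.0 for a perfectly uniform source).
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Arithmetic mean of the byte values (127.5 for a uniform source).
    mean = sum(data) / n
    # Chi-square statistic against a uniform distribution over 256 values.
    expected = n / 256
    chi_square = sum(
        (counts.get(value, 0) - expected) ** 2 / expected for value in range(256)
    )
    return {
        "entropy_bits_per_byte": entropy,
        "mean": mean,
        "chi_square": chi_square,
    }


# A perfectly uniform block: every byte value appears exactly four times.
stats = byte_stats(bytes(range(256)) * 4)
```

For real data, values close to (but not exactly at) the ideal figures are expected; an exactly uniform block like the one above is itself suspicious for a random source.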
AVAILABLE FILE FORMATS
The datasets are available in the following formats (included):
*.wav : source sound files (4bit/mono/8kHz/32kbps)
*.bin : binary files ready for use (PGP/GPG output)