Comparison of Entropy Coding Mechanisms in the IEEE 1857.2 Lossless Audio Compression Standard

ABSTRACT


INTRODUCTION
Moving forward from the first innovations in audio compression, the demand for higher-quality data has risen across various fields, from medical applications such as ECG and heart-rate analysis to audio data storage and audio streaming. This inevitably leads to a demand for lossless audio compression, as high-quality data generally requires not only large storage but also a high data rate for transmission and analysis [1]. The idea behind lossless audio compression is to remove redundant bits of data without any loss of quality: an encoder contains a predictor that approximates the signal as closely as possible, and the error of this prediction is calculated. This prediction residue is then encoded together with its predictor and the other necessary parameters. When the user receives this encoded data, a decoder reconstructs the original signal. On top of this, the encoder compresses the residual error using entropy coding, a technique from information theory that further reduces the number of redundant bits [2]. The smaller the residual error produced by the predictor, the more effective the entropy coding and the smaller the compressed signal, since the residual will contain more zeroes.
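To make the predict-then-encode idea concrete, the following minimal sketch (our own illustration, not the IEEE 1857.2 predictor, which uses a full LPC model) shows that even a trivial first-order predictor yields small residuals for smooth signals and is exactly invertible:

```python
# Toy first-order predictor (our illustration; the IEEE 1857.2 predictor is a
# full LPC model). The residual e[n] = x[n] - x[n-1] is small for smooth
# signals, and the decoder inverts it exactly, so no information is lost.

def encode(samples):
    prev, residuals = 0, []
    for x in samples:
        residuals.append(x - prev)
        prev = x
    return residuals

def decode(residuals):
    prev, samples = 0, []
    for e in residuals:
        prev += e
        samples.append(prev)
    return samples

signal = [10, 12, 13, 13, 11, 8]
assert decode(encode(signal)) == signal  # lossless round trip
```

The small residuals (here mostly near zero) are what the entropy coder then exploits.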
Overall, the predictor and entropy coding are essential parts of any lossless compression mechanism. In entropy coding, a bitstream may be compressed by removing redundant bits through a generalized algorithm. In the IEEE 1857.2 lossless audio compression standard, both Golomb-Rice and arithmetic coding mechanisms are defined; in this paper, we therefore compare the performance of these coders with respect to the IEEE 1857.2 predictor algorithm, as well as against other algorithms.

In most studies and descriptions of lossless audio compression, the focus falls on the prediction block; however, entropy coding is just as important, as it is the tool that writes the intended compressed bitstream. Generally, the purpose of entropy coding is not limited to audio per se; it is a universal mechanism for compressing bitstream data. Nevertheless, in this paper we compare the two entropy coding mechanisms specifically defined for IEEE 1857.2 lossless audio compression, to find which universal entropy coding may be better suited to this application. In a previous paper, we evaluated the performance of the IEEE 1857.2 Linear Predictive Coding (LPC) block and found interesting relationships between the LPC block and the preprocessing block; that study of the preprocessing block is extended in this paper to its effect on the different entropy mechanisms of the IEEE 1857.2 standard [3].
In the rest of this paper, Section 2 discusses the Golomb-Rice coding mechanism and Section 3 arithmetic coding. Section 4 is divided into four subsections detailing components of IEEE 1857.2: revisiting the pre-processing block concept, entropy selection, the Golomb-Rice encoding algorithm, and the arithmetic encoding algorithm. Following that, Section 5 describes the experimental setup and measurements, and Section 6 presents the results and discussion of this experimentation. We conclude the paper in Section 7.

GOLOMB RICE CODING
Golomb-Rice coding is widely used in popular lossless audio compression tools and standards such as FLAC and MPEG-4 ALS [4], [5]. The reason is that Golomb-Rice coding is a derivative of Huffman coding suited for time-dependent applications, and is thus useful for improving encoding speed, at the expense of compression ratio [6]. In later sections, we will verify whether this holds for lossless audio compression applications.
Firstly, the method is parameterized by a unique value m. A non-negative integer n that we wish to encode is divided into a quotient q = floor(n/m) and a remainder r = n - qm [7]. If m = 2^k, the code word for n consists of the k least significant bits of n, followed by the number formed by the remaining most significant bits of n in unary representation and a stop bit [2]. The length of this code word is k + 1 + floor(n/2^k). During the early development of lossless audio compression, an estimate for the parameter k, used in AudioPaK, was derived from the expectation already computed in the predictor block: k = ceil(log2(E[|e(n)|])), where e(n) is the prediction residual. The parameter k is defined to be constant over an entire frame and takes values between 0 and (b - 1) in the case of b-bit audio samples [2]. This concept is still widely used in current codecs and standards. Nevertheless, in terms of the lossless audio compression block, it is important to bear in mind that Golomb codes are defined to be optimal for exponentially decaying probability distributions of positive integers; because the prediction residuals are not all positive, it may be required to map the error residual to an unsigned value as defined in the equation below [8]:

M(e) = 2e, for e >= 0; M(e) = -2e - 1, for e < 0. (3)
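As an illustration of the Rice special case (m = 2^k) together with the signed-to-unsigned mapping of Eq. (3) and the AudioPaK-style estimate of k, here is a minimal sketch. The function names are ours, and the bit order follows the description above: k least significant bits first, then the unary quotient and a stop bit.

```python
import math

def map_signed(e):
    # Eq. (3): fold signed residuals onto non-negative integers
    # 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
    return 2 * e if e >= 0 else -2 * e - 1

def estimate_k(residuals):
    # AudioPaK-style estimate: k tracks log2 of the mean absolute residual
    mean_abs = sum(abs(e) for e in residuals) / len(residuals)
    return max(0, math.ceil(math.log2(mean_abs))) if mean_abs >= 1 else 0

def rice_encode(n, k):
    """Code word for non-negative n: the k LSBs of n, then the quotient
    n >> k in unary, then a stop bit. Length = k + 1 + (n >> k)."""
    q, r = n >> k, n & ((1 << k) - 1)
    lsb = format(r, '0' + str(k) + 'b') if k else ''
    return lsb + '1' * q + '0'
```

For example, rice_encode(9, 2) gives '01110': remainder bits '01', quotient 2 in unary '11', and the stop bit, for a total length of 2 + 1 + 2 = 5 bits.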

ARITHMETIC CODING
Another alternative universal encoding mechanism for lossless compression is arithmetic coding. In a nutshell, arithmetic coding works from the probabilities of occurrence of each symbol of a finite alphabet in a message. The method incorporates two variables, L and R, where L is the smallest binary value consistent with the code representing the symbols found so far, while R is the product of the probabilities of those symbols. The mechanism of arithmetic coding can be described as the following sequence, with steps 2 to 4 applied recursively as each new symbol is processed [9]:
1) Initialize L = 0 and R = 1.
2) To encode the next symbol (the i-th of the alphabet), set L = L + R * (p_1 + ... + p_(i-1)).
3) Set the new R = R * p_i.
4) Output a bit c for each bit that is already determined in the interval between L and L + R.
Here p_i and p_j are the probabilities of the i-th (current) and j-th symbols of the alphabet, respectively.
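The recursion in steps 2 and 3 can be sketched with exact rational arithmetic. This is a toy illustration of the interval update only; a production coder must renormalize and emit bits incrementally as in step 4.

```python
from fractions import Fraction

# Toy illustration: track the interval (L, L + R) exactly with rationals.
# L is the smallest value consistent with the symbols seen so far; R is the
# product of their probabilities (the interval width).

def arith_interval(symbols, probs):
    L, R = Fraction(0), Fraction(1)
    for s in symbols:
        cum = sum(probs[:s], Fraction(0))  # cumulative probability below s
        L = L + R * cum                    # step 2: shift the interval base
        R = R * probs[s]                   # step 3: shrink the interval width
    return L, R

probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # alphabet {0, 1, 2}
L, R = arith_interval([1, 0, 2], probs)
# Any number in [L, L + R) identifies the message; the code needs about
# -log2(R) bits (here R = 1/32, i.e. about 5 bits).
```

Because R shrinks in proportion to each symbol's probability, frequent symbols cost few bits and rare ones cost many, which is the source of arithmetic coding's compression advantage.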

IEEE 1857.2 STANDARD

Preprocessing Block
As mentioned in our previous paper, one advantage of IEEE 1857.2 lossless compression is that bit errors introduced during transmission can be minimized by the pre-processing block [3]. This is because the audio frames can be decoded at one-frame intervals, since the information in each frame is independent of the others. However, the prediction residues at the start of each frame will be larger than those in the rest of the frame, increasing the dynamic range of the prediction residues. To compensate, the entropy encoder must increase both its calculation complexity and its alphabet size.
The pre-processor block overcomes this situation by taking the prediction residues and down-shifting them to a certain degree. This operation decreases the amplitude, and the envelope of the prediction residues is flattened by adaptive normalization, which has also been shown to improve the compression rate of MPEG-4 ALS [8]. The residual samples are quantized to a power of 2 using the PARCOR coefficient k, ensuring integer values for lossless coding, and are normalized by down-shifting [10], [11]. The terms within the summation are pre-computed in the fixed tables RA_shift and RA_shift12, which are provided by the standard for fast computation and device portability.
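As a rough illustration of the down-shift idea (a hypothetical sketch only: the actual IEEE 1857.2 rule derives the shift from the PARCOR coefficients and the RA_shift tables), shifting residues down by s bits flattens the envelope while keeping the dropped low bits so the operation remains lossless:

```python
# Hypothetical sketch of lossless down-shifting (not the standard's exact
# rule). Each residue is split into a flattened high part and the dropped
# low bits; keeping both makes the operation exactly invertible.

def downshift_split(residues, s):
    high = [r >> s for r in residues]           # flattened envelope
    low = [r & ((1 << s) - 1) for r in residues]  # dropped low bits
    return high, low

def reconstruct(high, low, s):
    return [(h << s) | l for h, l in zip(high, low)]
```

In Python, >> is an arithmetic shift and the bitwise reconstruction also works for negative residues, so the round trip is exact.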

Entropy Coding Selection
As mentioned previously, the IEEE 1857.2 standard offers a selection of two types of entropy coding, Golomb-Rice coding and arithmetic coding, and it utilizes one bit in the output bitstream to switch between the two methods, so that the decoder can recognize which method was used during encoding [12]. This is illustrated in Figure 1 below: the coder selects the entropy type to use and encodes this selection as well. The standard mentions that this selection depends on the users' environment, but for simplicity, in this study we explicitly select the entropy type by parsing an argument to the tool.
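The one-bit switch can be sketched as follows. This is a hypothetical bitstream layout for illustration; the standard's actual frame syntax differs, and only the one-bit-selection idea is taken from it.

```python
# Hypothetical layout: one leading flag bit tells the decoder which entropy
# coder produced the rest of the frame (0 = Golomb-Rice, 1 = arithmetic).

GOLOMB_RICE, ARITHMETIC = 0, 1

def pack_frame(entropy_type, payload_bits):
    return str(entropy_type) + payload_bits

def unpack_frame(bitstream):
    return int(bitstream[0]), bitstream[1:]
```

The decoder reads the flag first and dispatches the remaining bits to the matching entropy decoder.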

Adaptive Golomb Encoding
For the Golomb-Rice entropy selection, the standard utilizes an adaptive Golomb-Rice mechanism, an advanced method whereby, unlike its predecessors, it not only updates the Rice parameters (leading zeros, division, and remainder bits), but also updates the m value after each block is processed, whereas m is fixed in the original Golomb-Rice method.
Firstly, in the encoder, the LPC residue is divided into subblocks parameterized by x, where x can be selected between 1 and 5. The initial m is calculated from the mean of the LPC residual error. As each subblock is encoded, m is updated based on the parameter RICE_NUM_MUL = 32 after the Golomb-Rice encoder. Figure 1 above illustrates this updating process of m; the sum used in the update is the Golomb-Rice cumulative sum, calculated from the cumulative residual sum of each subblock processed up to that point.
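A generic adaptive update in this spirit can be sketched as follows. Note that this is a LOCO-I/Shorten-style rule of our own, not the exact IEEE 1857.2 formula; only the constant name RICE_NUM_MUL is taken from the standard, used here as a fixed-point scale for the running mean.

```python
import math

RICE_NUM_MUL = 32  # fixed-point scaling constant named in the standard

def adapt_m(cum_sum, count):
    """Re-derive m after each subblock from the running mean |residual|.
    Generic LOCO-I/Shorten-style rule, NOT the exact IEEE 1857.2 update;
    RICE_NUM_MUL serves as a fixed-point scale (mean in 1/32 units)."""
    if count == 0:
        return 1
    mean_fp = (cum_sum * RICE_NUM_MUL) // count
    if mean_fp <= RICE_NUM_MUL:  # mean <= 1 -> smallest parameter
        return 1
    k = math.ceil(math.log2(mean_fp / RICE_NUM_MUL))
    return 1 << k
```

The key property shared with the standard's scheme is that m tracks the residual statistics block by block instead of staying fixed for the whole frame.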

Arithmetic Encoding
Next, the arithmetic encoding selection of the IEEE 1857.2 block is similar in its subblocking of the LPC residue, but x only goes up to 3. Firstly, in this block, the mean of the flattened residue is computed and then logarithmically quantized. This quantized mean index is locally dequantized and then used to generate the probability table from a set of probability templates via a scaling procedure. The probability template is essentially a set of probability density values from trained audio data, approximately a Gaussian function (mean -0.1, standard deviation 0.6), but unique to the IEEE 1857.2 standard. Previous work has shown that this trained table achieves better lossless compression performance than the Gaussian function itself, hence its use [10].
Figure 3 describes the overall process of the arithmetic encoder. Also defined in this process is the MSB/LSB split block, which is an advantage over the Golomb-Rice method: the arithmetic encoder can further split the LPC residue into MSB and LSB parts, which further increases the efficiency of the arithmetic encoding algorithm in the case of trailing bits.

EXPERIMENTAL SETUP AND MEASUREMENT
The analysis was conducted using a combination of a MATLAB application and an (.exe) file compiled from C, on a DELL laptop with an Intel Core i7 processor, to measure the performance of each tool. The database consisted of audio books at least 1 hour long, with a combination of English and Arabic book recordings. Table 1 shows a sample of the audio book database used in the investigation. For the measurements, one of the basic measures for a lossless audio codec is the compression ratio, which compares the output file to the original raw file with the following formula:

CR = (1 - compressed size / original size) x 100% (12)

This compressibility determines the proportion of bits that were compressed; the larger this ratio, the more bits were compressed [5]. It is also important to ensure that the rate of compression is efficient, so encoding speed must be measured as well. The encoding/decoding speeds are measured in terms of how fast the encoder processes data relative to the total length of the audio file in MB. This is defined by the following equation:

speed = file size (MB) / encoding time (s) (13)
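In code, the two measures described above (our reading of Eqs. (12) and (13): a percentage of bits saved, where larger means better compression, and a throughput in MB per second) look like this:

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Eq. (12)-style saving: fraction of bits removed, in percent.
    Larger values mean more bits were compressed away."""
    return (1 - compressed_bytes / original_bytes) * 100.0

def encoding_speed(file_size_mb, encode_time_s):
    """Eq. (13)-style throughput in MB per second."""
    return file_size_mb / encode_time_s
```

For example, compressing a 1000-byte file to 600 bytes gives a ratio of 40%, and encoding a 60 MB file in 30 seconds gives a speed of 2 MB/s.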

RESULTS AND ANALYSIS
In this experiment, the internal settings of the IEEE 1857.2 entropy tool were compared by explicitly adding a parsed option to the tool selecting either Golomb-Rice (GR) or Arithmetic Coding (AC), as well as enabling and disabling the preprocessing for each entropy coder. Additionally, the tool was benchmarked against the latest FLAC and MPEG-4 ALS with varying predictor order to find how well each algorithm works. From Figure 5, we can see that the preprocessing block does improve the performance of entropy coding for the Golomb-Rice block at higher predictor orders, by at least 1.29% from predictor order 14 onwards. This improvement increases with the predictor order. However, the same cannot be concluded for arithmetic coding, as there is no clear pattern as to whether the preprocessing method improves it or not.
In terms of compression ratio, the preprocessing block improves the compression ratio by at least 0.37% for Golomb-Rice coding, as shown in Figure 4; however, only an insignificant improvement is seen for the arithmetic block, and its performance even worsens at predictor orders 16 and 20.
Comparing the encoding speed and compression ratio of arithmetic coding and Golomb-Rice coding against each other, the Golomb-Rice method has a much faster encoding speed than the arithmetic coding mechanism, whereas arithmetic coding has the better overall compression ratio for lossless audio compression applications. This is consistent with previous work comparing the two methods for image compression applications.
In addition to this comparison, another interesting finding emerged when running the database through the other algorithms. As seen in Figure 6, in contrast to previous work [10], MPEG-4 ALS has the best compression ratio, while FLAC is the fastest in terms of encoding speed. Overall, the FLAC and MPEG-4 ALS algorithms surpass IEEE 1857.2 in both compression ratio and encoding speed. One possible explanation is that the database used here consists entirely of audio book files, whereas the previous work used music files of shorter length. This explains why MPEG-4 ALS performs best, as that standard uses long-term LPC residue prediction, which is well suited to speech files, while none of the other algorithms contain this [13]. FLAC comes in a close second; it contains audio detection that switches between different types of speech modelling other than LPC, thus producing better error residues than IEEE 1857.2. Additionally, the FLAC code uses its own POSIX syscall interface for its read/write protocol, which allows faster read and write speeds than the stdlib interface. This could be the reason behind its significant difference in encoding speed from the others [4].

CONCLUSION
Overall, the experiment shows that for lossless audio compression, arithmetic coding achieves a higher compression ratio than Golomb-Rice coding in IEEE 1857.2, at the expense of encoding speed, and that the pre-processing block improves the compression ratio of entropy coding, especially Golomb-Rice coding, though it may worsen its encoding speed at higher predictor orders. Nevertheless, when compared with other methods, IEEE 1857.2 still falls behind MPEG-4 ALS and FLAC in terms of both compression ratio and encoding speed, especially in the case of audio books. Ultimately, IEEE 1857.2 may be more suited to music files, but it could be improved further through detection of speech files and a long-term prediction mechanism. In terms of speed, IEEE 1857.2 and MPEG-4 ALS could compare better with the FLAC codec by adopting FLAC's POSIX syscall read/write interface.


ISSN: 2502-4752 Indonesian J Elec Eng & Comp Sci, Vol. 10, No. 1, April 2018: 176-183

Figure 2. Comparison of Arithmetic Coding (left) and Golomb Rice Coding (right) Compression Ratio by Enabling and Disabling Preprocessing Block

Figure 4. Comparison of Compression Ratio (left) and Encoding Speed (right) for Various Algorithms

Table 1. Sample Audio Book Database

For a fair comparison, the tool was recreated in C code on top of the AVS China codec and verified by comparing the output file of the decoder to the original file. The following options, with their descriptions, are parsed to each of the tools:

Comparison of Entropy Coding Mechanism on IEEE1857.2 Lossless Audio... (Fathiah Abdul Muin)