Enhancing Security of Quality Song With Embedding Encrypted Hidden Code in Tolerance Level (SQHTL)

Embedding secret information in a song signal may degrade its audible quality and infringe its originality. Passing a secret code with a song signal therefore requires a precise measure of how much data can be embedded within the embedding tolerance level. In this paper, secret data are embedded using linear coding principles, and security is enhanced by applying elliptic curve cryptography. Audible quality is maintained by calculating the acceptance ratio of the embedded data on the basis of human perception. A modified form of channel capacity is used to determine the trade-off ratio between embedded data and song signal, which provides a guideline for future embedding, and a correlation among the embedded data is fabricated. A comparative study with similar existing techniques has been made for performance analysis, and the experimental results, based on a Microsoft WAVE (".wav") stereo sound file, are supported with mathematical formulae.


INTRODUCTION
Protecting quality songs is one of the primary necessities of the creative industries. Advances in processing technology are making pirated versions of original songs easier to produce. Some people take the original and create their own versions by tactically changing or modifying some portions in order to do business in the contemporary market [1,5]. Original investors are therefore losing revenue. To protect the original version of an audio song, a set of measures needs to be taken [i.e., restoring the copyright property of the original song (IPR)]. Such techniques should not hamper the originality of the audio song but should carry authenticated secret information that shields its properties. A modified song can then be detected easily by verifying the secret information: if any part of the song is modified, the original secret code hidden at specified positions of the audio signal changes directly.
In this paper, a systematic approach is applied to the song signal for embedding secret information, and the corresponding acceptance ratio of embedding impurities is measured with the help of linear coding principles. Two layers of security are incorporated: one based upon linear coding principles integrated with the lower frequency region, and another over some selected region of the song signal with the help of an elliptic curve cryptographic approach. The total embedded data volume is also computed to balance the embedded data against the audible quality of the song signal. A signal embedded with the secret code can easily be distinguished as the original among similar available songs. Finally, the channel coding theorem is incorporated in modified form to calculate the acceptance ratio of hiding a message in the song signal without affecting its quality. It is experimentally observed that the proposed technique does not affect the song quality but provides a level of security against piracy of the song signal without violating the acceptance level.
Electronic copy available at: https://ssrn.com/abstract=3611626
Organization of the paper is as follows. Encoding and embedding of the secret message are presented in sections 2.1 to 2.3. The authentication procedure is depicted in section 2.4, estimation of the embedded message is given in section 2.5, and that of extraction in section 2.6. Experimental results are given in section 3. Conclusions are drawn in section 4. References are given at the end.

THE TECHNIQUE
The scheme fabricates the secret key with the help of a linear coding technique in the lower frequency region [which is not used by audio systems]: the lower frequencies (1-200 Hz) are encoded, and the secret key is then embedded in the encoded frequency region. The operation is done with an approximate estimate of the limit of impurities imposed on the song signal, so that the audible quality of the signal is retained. Algorithms termed SQHTL-ELF, SQHTL-MSE and SQHTL-EED are proposed as a double security measure, the details of which are given in sections 2.1 to 2.3 respectively.

Encoding Lower Frequencies (SQHTL-ELF)
Encoding of lower frequencies in the specific positions of the signal is performed with the help of a linear coding technique over GF(8) ["GF" stands for "Galois field"]. The procedure for encoding lower frequencies is depicted in the following algorithm.
Step 1: Take ... The value of the 1st C matrix needs to be put in specified positions of the higher frequency region so that the original frequency value set can be recovered during the decoding process [which is described in section 2.6]. If the song is of stereo type, the above method may be repeated for the second channel also. Therefore, if any value changes in processing, the above relationship in the lower frequency region will break and the error can easily be detected.
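The construction of the C matrix is not recoverable from the step above, so the following Python sketch only illustrates the underlying idea: a linear relation over GF(8) breaks as soon as one symbol is altered. The reduction polynomial x^3 + x + 1, the 3-bit quantisation of magnitudes into symbols, and the names `gf8_mul`, `check_symbol` and `coeffs` are assumptions made for illustration, not part of the proposed algorithm.

```python
# GF(8) = GF(2^3): elements are the integers 0..7 read as 3-bit polynomials.
# ASSUMPTION: reduction polynomial x^3 + x + 1 (0b1011); the paper names none.
REDUCTION_POLY = 0b1011


def gf8_add(a, b):
    """Addition in GF(2^m) is bitwise XOR."""
    return a ^ b


def gf8_mul(a, b):
    """Shift-and-add multiplication, reduced modulo REDUCTION_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # degree reached 3: reduce
            a ^= REDUCTION_POLY
    return result


def check_symbol(window, coeffs):
    """Hypothetical linear check: sum of coeffs[i] * window[i] over GF(8)."""
    acc = 0
    for c, s in zip(coeffs, window):
        acc = gf8_add(acc, gf8_mul(c, s))
    return acc


# Hypothetical window of quantised low-frequency symbols and nonzero coefficients.
window = [3, 5, 1, 7]
coeffs = [1, 2, 4, 3]
stored = check_symbol(window, coeffs)

# Changing a single symbol always changes the check value, because a nonzero
# coefficient times a nonzero difference is nonzero in a field.
tampered = list(window)
tampered[2] = 6
print(stored, check_symbol(tampered, coeffs))
```

Because every nonzero element of a field has an inverse, a single altered symbol can never leave such a check unchanged, which is the property the tamper detection described above relies on.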

Moulding Secret Encrypted Information (SQHTL-MSE)
As the song signal is represented by a linear combination of data over the Galois field GF(8), the added information has to be converted into a similar pattern, and security and hiding criteria need to be associated with it. After a secret code is accepted, elliptic curve cryptography over GF(8) may be applied to represent the secret code in the specified hidden format. The method is described below.
Step 1: Select two values 'a' and 'b' for the equation of the elliptic curve on a binary field $F_{2^m}$, $y^2 + xy = x^3 + ax^2 + b$, where $b \neq 0$. Here the elements of the finite field are integers of length at most m (= 3) bits.
Step 2: Select the domain parameters for the elliptic curve over $F_{2^m}$, which are m, f(x), a, b, G, n and h. G is the generator point $(x_G, y_G)$, a point on the elliptic curve chosen for cryptographic operations, and n is the order of the generator point G. The scalar for point multiplication is chosen as a number between 0 and n - 1. h is the cofactor, where $h = \#E(F_{2^m})/n$ and $\#E(F_{2^m})$ is the number of points on the elliptic curve. The public key is a point on the curve and the private key is a random number. The public key is obtained by multiplying the generator point G by the private key.
Step 3: Select some positions from the song signal and make the amplitude values at those positions equal for both channels [for a mono song, make the values at two consecutive positions equal]. Apply any standard public key cryptographic algorithm, using the public key generated through steps 1 and 2 above, to the amplitude value of one of the two channels, and put the resultant value in place of the original value.
Step 4: As the amplitude values of the song are not binary polynomials of degree m - 1, step 1 of SQHTL-ELF may be used and the extra values put, in a similar pattern, in higher frequency locations.
In this way, an additional authenticated code is added to the song signal. The total number of amplitude values to be considered depends upon the maximum tolerance level for embedding extra data in the song signal without hampering its audible quality, as described in section 2.5.
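Steps 1 and 2 can be sketched in Python on the toy field $F_{2^3}$. The sketch below is illustrative only: the field polynomial f(x) = x^3 + x + 1, the coefficients a = b = 1, the way G is picked, and the fixed private key are assumptions, and a field of eight elements offers no real security. It uses the standard affine group law for curves of the form $y^2 + xy = x^3 + ax^2 + b$.

```python
# Toy elliptic curve y^2 + xy = x^3 + a*x^2 + b over F_{2^3}.
# ASSUMPTIONS: f(x) = x^3 + x + 1, a = b = 1, fixed private key (illustration only).
POLY = 0b1011
A, B = 1, 1


def mul(a, b):
    """Multiply two field elements (3-bit polynomials) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= POLY
    return r


def inv(a):
    """Multiplicative inverse of a nonzero element, by exhaustive search."""
    return next(b for b in range(1, 8) if mul(a, b) == 1)


def on_curve(P):
    x, y = P
    lhs = mul(y, y) ^ mul(x, y)
    rhs = mul(mul(x, x), x) ^ mul(A, mul(x, x)) ^ B
    return lhs == rhs


def ec_add(P, Q):
    """Group law; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2:
        if y1 ^ y2 == x1:              # Q == -P, since -(x, y) = (x, x + y)
            return None
        lam = x1 ^ mul(y1, inv(x1))    # doubling, x1 != 0 here
        x3 = mul(lam, lam) ^ lam ^ A
        y3 = mul(x1, x1) ^ mul(lam ^ 1, x3)
        return (x3, y3)
    lam = mul(y1 ^ y2, inv(x1 ^ x2))
    x3 = mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    y3 = mul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)


def ec_mul(k, P):
    """Scalar multiplication by double-and-add."""
    result, addend = None, P
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


def order(P):
    """Smallest n > 0 with n*P equal to the point at infinity."""
    n, Q = 1, P
    while Q is not None:
        Q = ec_add(Q, P)
        n += 1
    return n


points = [(x, y) for x in range(8) for y in range(8) if on_curve((x, y))]
G = max(points, key=order)            # assumed choice: a point of largest order
n = order(G)
h = (len(points) + 1) // n            # cofactor h = #E / n (+1 for infinity)

private_key = 5                       # should be random in [1, n-1] in practice
assert 1 <= private_key < n
public_key = ec_mul(private_key, G)
print("G =", G, "n =", n, "h =", h, "public key =", public_key)
```

The public key obtained this way would then feed whichever standard public key algorithm is chosen in step 3; that choice is left open by the text and is not modelled here.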

Embedding Extra Data (SQHTL-EED)
Embedding the stored extra values [from steps 1 and 2 of section 2.1 and step 4 of section 2.2] and the values of the 1st C window in the higher frequency region creates another security criterion over the original song signal without hampering its audible quality. The embedding process is as follows.
i. Make the magnitude values of the two channels of a stereo song equal, as this does not affect the audible quality of the song signal [2,3].
ii. Separate each digit position of the extra value and convert it into an equivalent lower magnitude value. For example, if the extra value is .0200, the separated lower magnitude values are 0, 0.0002, 0 and 0 respectively.
iii. Add the magnitude values in the higher frequency region of the song signal [above 20,000 Hz] as follows. Let the magnitude value be V; V is added at the i-th position, and the same value is then added to the alternate channel at the (i+1)-th position.
iv. Repeat steps i to iii until all extra values are added in consecutive locations of the higher frequency region. The stored 1st C matrix values can also be added in the same region in a similar way, where each matrix element is converted as in step ii.
In the case of a mono song, the lower magnitude values can be added by separating specified positions within the same channel.
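The digit separation and the alternating-channel addition can be sketched in Python as follows. This is one reading of the steps above rather than a definitive implementation: it assumes each digit d is mapped to d × 10^-4 (which reproduces the .0200 example), that positions advance by two per value, and that `left` and `right` hold the high-frequency magnitudes of the two channels; the function names are hypothetical.

```python
from decimal import Decimal


def split_digits(extra, places=4):
    """Split the fractional digits of `extra` (given as a string) into
    per-digit low magnitudes.
    ASSUMPTION: each digit d becomes d * 10**-places."""
    frac = extra.split(".")[1].ljust(places, "0")[:places]
    unit = Decimal(1).scaleb(-places)      # exactly 10**-places
    return [int(d) * unit for d in frac]


def embed(left, right, values, start):
    """Add each value V to left[i] and to right[i + 1].
    ASSUMPTION: the position i then advances by 2 for the next value."""
    i = start
    for v in values:
        left[i] += float(v)
        right[i + 1] += float(v)
        i += 2


parts = split_digits(".0200")
print(parts)                               # digits 0, 2, 0, 0 scaled by 10**-4

# Hypothetical high-frequency magnitudes, already equal in both channels (step i).
left = [0.0] * 10
right = [0.0] * 10
embed(left, right, parts, start=0)
print(left[:4], right[:4])
```

`Decimal` is used for the split so that the small per-digit values are exact before they are added to the floating-point magnitudes.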

Authentication
Embedding extra values in the higher frequency region in the specified manner creates a secure code that is used to identify the original song. The encoded data set in the lower frequency region creates a unique relationship among the magnitude values of the song signal. The addition operation between the k-th and (k+1)-th windows produces a result window of the same size, which is equivalent to the original values of the window that originally constituted the song signal. In addition, the secure code embedded at specified positions of the song signal, as described in section 2.2, provides another level of authentication of the original song signal.
Therefore, any change during processing creates a difference from the authenticating codes present in the higher frequency region of the song signal: changing a position creates a difference from the hidden code in that region, and the linear coding relationship in the lower frequency region breaks as well.

Estimating Limit
Estimating the limit of embedding data in a song signal is one of the major issues when quality is a factor. For this purpose, the boundary for adding impurities without compromising audible quality is estimated with the help of the channel coding theorem in modified form, as follows.
Find the channel capacity with $N_0 W$, where $N_0$ is the message embedding rate (here, the extra data) and W is the highest frequency value of the song signal [i.e., 20,000 Hz]. The Shannon limit may be considered for generating the limit with the help of the associated formula. The relation r = k/N, where r is the channel transmission rate and k is the number of added values (magnitude values), may also be used for calculating the embedding data limit of the song signal.
Channel capacity can be expressed by equation (1). Putting the above values in equation (1), we can easily find the noise value $N_0$.

Spectral density = $N_0/2$ [maximum noise], and (N - k) is the number of added impurities (noise) in the sampled values of the song signal.
Here $r \le C$, according to the channel coding theorem. Therefore, we can conclude that if the above limit is exceeded, the song will lose its audible quality.
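Equation (1) is not reproduced above, so the following Python sketch assumes the standard Shannon-Hartley form $C = W \log_2(1 + P/(N_0 W))$, in which $N_0 W$ is the noise power over bandwidth W. The signal power P is a symbol introduced here for illustration, and the numeric values are hypothetical; the modified form used by the proposed technique may differ.

```python
import math


def shannon_capacity(bandwidth_hz, signal_power, n0):
    """C = W * log2(1 + P / (N0 * W)); assumed standard form of equation (1)."""
    return bandwidth_hz * math.log2(1 + signal_power / (n0 * bandwidth_hz))


def noise_density_for_capacity(bandwidth_hz, signal_power, capacity):
    """Invert the formula above for N0: N0 = P / (W * (2**(C/W) - 1))."""
    return signal_power / (bandwidth_hz * (2 ** (capacity / bandwidth_hz) - 1))


W = 20_000.0        # highest frequency of the song signal, in Hz
P = 1.0             # hypothetical signal power
C = 20_000.0        # hypothetical target capacity

n0 = noise_density_for_capacity(W, P, C)
print("N0 =", n0, "maximum spectral density N0/2 =", n0 / 2)
print("capacity recovered from N0:", shannon_capacity(W, P, n0))
```

Solving for $N_0$ in this way gives the noise level against which the amount of embedded data is compared.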

Extraction
Decoding is performed using calculations similar to those of the encoding technique. The algorithm is given below.

Algorithm:
Input: Modified song signal with embedded authenticating code in the higher frequency range.
Output: Original song signal.

Method:
The details of extraction of the original song signal are given below.
Step 1: Apply FFT over x to get the magnitude values of the song signal in the frequency domain, say Y(n), where n is the total range of frequencies of the song signal.
Step 2: Find the 1st C matrix [as described in the SQHTL-ELF section] in the higher frequency region and, using the elements of this matrix, find the original sequence of magnitude values in the lower frequency region. Then remove all the secret codes from the higher frequency region (above 20,000 Hz).
Step 4: Apply inverse FFT to get back the sampled values of the original song signal.
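The transform, clean and inverse-transform skeleton of the steps above can be sketched in Python. This is a minimal sketch under stated assumptions: removal of the secret code is modelled simply as zeroing every bin above 20,000 Hz, the recovery of the lower frequency values from the C matrix is not modelled, and a plain O(n^2) DFT from the standard library stands in for an FFT so that the example stays self-contained. The function names and the test tones are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT on short inputs)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def strip_high_band(samples, fs, cutoff_hz=20000.0):
    """Transform, zero all bins above cutoff_hz, transform back.
    ASSUMPTION: 'removing the secret codes' means zeroing those bins."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        freq = min(k, n - k) * fs / n      # bin k and its mirror share a frequency
        if freq > cutoff_hz:
            spectrum[k] = 0
    return [v.real for v in dft(spectrum, inverse=True)]


fs = 44100                                 # sampling rate in Hz
n = 147                                    # short frame: bins are 300 Hz apart
tone = [math.sin(2 * math.pi * 900 * t / fs) for t in range(n)]
code = [1e-4 * math.sin(2 * math.pi * 21000 * t / fs) for t in range(n)]
marked = [a + b for a, b in zip(tone, code)]

cleaned = strip_high_band(marked, fs)
print(max(abs(c - a) for c, a in zip(cleaned, tone)))
```

At a 44.1 kHz sampling rate the band above 20,000 Hz extends only to the 22,050 Hz Nyquist limit, so this is the region the sketch clears.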

EXPERIMENTAL RESULTS
The encoding and decoding techniques have been applied to a 10-second recorded song. The complete procedure, along with the results of each intermediate step, is outlined in subsections 3.1.1 to 3.1.4. The results are discussed in two sections: section 3.1 deals with the results associated with SQHTL, and section 3.2 gives a comparison with existing techniques.

Results
For experimental observation, a 10-second strip of a classical song ('100 Miles From Memphis', sung by Sheryl Crow) has been taken. The sampled values of the song are given in table 1 as a two-dimensional matrix. Figure 1 shows the amplitude-time graph of the original signal. SQHTL is applied to this input song signal as the first step of the procedure, and the output generated in the process is shown in figure 2. Figure 3 shows the difference in frequency ratio between the original and modified songs after embedding the secret code. From figure 3, it is seen that the deviation is very small and does not affect the quality of the song.

Original recorded song signal (10 seconds)
The values of the sampled data array x(n,2) from the original song are given in table 1. The graphical representation of the original song, considering all rows (441000) of x(n,2), is given in figure 1, and table 2 shows the properties of the song signals.

Modified song after encoding lower frequencies and adding secure code (10 seconds)
The graphical representation of the modified song signal is shown in the figure 2.

The difference of magnitude values between original and modified signals
The graphical representation of the magnitude differences of original and modified songs is shown in the figure 3.

Estimating limit of embedded code
Applying equation (1) to the above song [16-bit stereo, sampled at 44.1 kHz], the hidden extra data (noise) added for authenticating the original song is far less than the maximum noise (0.0128) and therefore does not affect the overall audible quality of the song, because only about 900 positions out of the 441000 sampled values of the song signal have been altered.

Comparison with existing systems
Various algorithms [6] are available for embedding information in audio signals. They usually do not take audio quality into account, whereas the proposed authentication technique is enforced without changing the quality of the song. A comparison of the properties of the proposed method with Data Hiding via Phase Manipulation of Audio Signals (DHPMA) [4] and Secret Data Hiding within Tolerance Level of Embedding in Quality Songs (DHTL) [8], before and after embedding the secret message or modifying parts of the signal (16-bit stereo audio signals sampled at 44.1 kHz), is given in table 2, table 4 and table 5. Average absolute difference (AD) is used as the dissimilarity measure between the original song and the modified song; a lower value of AD signifies less error in the modified song. Normalized average absolute difference (NAD) is the quantization error normalized to a range between 0 and 1. Mean square error (MSE) is the cumulative squared error between the embedded song and the original song; a lower value of MSE signifies less error in the embedded song. The SNR measures how much a signal has been tainted by noise. It represents the embedding errors between the original song and the modified song and is calculated as the ratio of the signal power (original song) to the power of the noise corrupting the signal; a ratio higher than 1:1 indicates more signal than noise. The PSNR is often used to assess quality between the original and a modified song; the higher the PSNR, the better the quality of the modified song. The experimental results for the benchmarking parameters (NAD, MSE, NMSE, SNR and PSNR) thus indicate that the proposed method obtains better performance without affecting the audio quality of the song. Table 4 gives the experimental results in terms of SNR (Signal to Noise Ratio) and PSNR (Peak Signal to Noise Ratio).
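The benchmarking parameters named above can be computed as in the following Python sketch. The formulas are the definitions commonly used in audio data-hiding benchmarks and are assumed here, since the text does not spell them out; the function name `quality_metrics`, the choice of peak value, and the sample data are hypothetical.

```python
import math


def quality_metrics(original, modified, peak=None):
    """Distortion metrics between two equal-length sample sequences.
    ASSUMED definitions:
      AD   = mean |x - y|            NAD  = sum |x - y| / sum |x|
      MSE  = mean (x - y)^2          NMSE = sum (x - y)^2 / sum x^2
      SNR  = 10 log10(sum x^2 / sum (x - y)^2)   [dB]
      PSNR = 10 log10(peak^2 / MSE)              [dB]
    """
    n = len(original)
    diff = [x - y for x, y in zip(original, modified)]
    abs_sum = sum(abs(d) for d in diff)
    sq_sum = sum(d * d for d in diff)
    energy = sum(x * x for x in original)
    if peak is None:
        peak = max(abs(x) for x in original)   # assumed peak: largest sample
    mse = sq_sum / n
    return {
        "AD": abs_sum / n,
        "NAD": abs_sum / sum(abs(x) for x in original),
        "MSE": mse,
        "NMSE": sq_sum / energy,
        # identical signals have no noise, so the ratios are unbounded
        "SNR": math.inf if sq_sum == 0 else 10 * math.log10(energy / sq_sum),
        "PSNR": math.inf if sq_sum == 0 else 10 * math.log10(peak ** 2 / mse),
    }


# Hypothetical samples: one value out of four is slightly altered.
original = [1.0, -1.0, 0.5, -0.5]
modified = [1.0, -1.0, 0.5, -0.4]
for name, value in quality_metrics(original, modified).items():
    print(name, round(value, 4))
```

For 16-bit audio the `peak` argument could instead be set to the full-scale value, which changes PSNR but none of the other metrics.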
Table 5 presents comparative values of Normalized Cross-Correlation (NC) and Correlation Quality (QC) of the proposed algorithm with DHPMA and DHTL. Table 6 shows the PSNR, SNR, BER (Bit Error Rate) and MOS (Mean Opinion Score) values for the proposed algorithm; all the BER values are 0. Figure 4 summarizes the results of this experimental test and shows that the algorithm's performance is stable for different types of audio signals.
Table 3. Metric for different distortions
Table 4. SNR and PSNR
The quality rating (Mean Opinion Score) is computed by using equation (2), where N is a normalization constant and SNR is the measured signal-to-noise ratio. The ITU-R Rec. 500 quality rating is well suited for this task, as it gives a quality rating on a scale of 1 to 5 [7]. Table 7 shows the rating scale, along with the quality level being represented.

CONCLUSION AND FUTURE WORK
In this paper, an algorithm has been proposed for encoding the lower frequency region with a linear coding technique and for embedding a secret code, with the help of an elliptic curve cryptographic technique, in some specified region of the song signal. A second layer of security has also been incorporated by sequentially adding the extra values deducted from the lower frequency region into the higher frequency region, which does not affect the song quality but ensures that distortion of the song signal characteristics is detected. Additionally, the proposed algorithm is very easy to implement. The technique is developed on the basis of the observed characteristics of different songs together with human audible characteristics, and an approach is also made for estimating the limit of embedded extra data with the help of Shannon's limit in the channel encoding scheme. It can also be extended to embed an image into an audio signal instead of text and audio. A precise estimate of the threshold percentage of sample data of a song that can be allowed to change under normal conditions, considering all possibilities of error in song signal processing, will be made in future work.