Advanced watermarking technique to improve medical images’ security

Introduction
Advances in medical imaging technology have made it possible to use medical images for the early detection of various diseases, including some of the deadliest. According to World Health Organization (WHO) data, stroke and heart disease were the deadliest diseases in Indonesia in 2013: around 3 million Indonesians were diagnosed with heart disease and 2 million with stroke. Among women, cervical cancer and breast cancer were the deadliest diseases in Indonesia in the same year, with 98,692 cases of cervical cancer and 61,682 cases of breast cancer [1]. This high number of cases ranks Indonesia second in the world for cervical cancer [2].
Early detection of these diseases with medical imaging technology is expected to reduce the number of deaths they cause. Medical imaging technology produces visual representations of the human body that support more accurate diagnoses and appropriate treatment decisions. In addition, medical images can be sent over the internet (known as telemedicine), which allows doctors, physicians, and other health care professionals to evaluate, diagnose and treat patients from a distance [3]. Various types, or modalities, of medical images are available, such as radiography, computed tomography, magnetic resonance imaging, and ultrasound imaging [4]. The modalities differ in the types of energy and the acquisition technology used to produce the image. Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) are the modalities most commonly used to diagnose stroke, heart disease, cervical cancer and breast cancer [5,6]. The use of medical imaging technology continues to increase every year. Figure 1 shows that the use of CT and MRI devices in Organization for Economic Co-operation and Development (OECD) countries has increased over the last 30 years [7]. According to the Ministry of Health of Indonesia, governors will support increasing the number of  [8].
The growing use of medical imaging technology can lead to more, and new, cyberthreats, since medical images are transferable over the internet. It is therefore essential to secure medical images against any kind of attack, because any change to a medical image may lead to incorrect diagnoses and treatment decisions for the patient. In medical imaging, Digital Imaging and Communications in Medicine (DICOM) is the international standard for transferring, storing and retrieving medical image information. In short, a DICOM file contains two important parts: the header, which stores important information such as patient and study information, and the pixel data. The DICOM Standard implements several security schemes for both the confidential data and the images, as stated in PS3.15 of the DICOM Standard, Security and System Management Profiles [9].
The DICOM Standard currently implements digital signatures to guarantee the authenticity and integrity of the pixel data; the signature is saved in group tag (FFFA,FFFA). Furthermore, the DICOM Standard implements encryption such as AES and triple DES on selected parts of the DICOM Data Set, one of which is the Electronic Patient Record (EPR). The DICOM Standard also specifies an anonymization scheme that deletes or changes any confidential DICOM Data Set elements so that the DICOM file can be shared publicly for purposes such as research, presentation and study [9]. These security schemes have several limitations: integrity and authenticity are not addressed for the selected DICOM Data Set, and an anonymized DICOM file can no longer be used for further diagnosis because the patient information has been removed permanently. Furthermore, unauthorized people could partly or completely alter the pixel data of the medical image, and the altered pixels cannot be localized. These are the major limitations of the DICOM Standard [10,11].
Given the limitations of current security schemes, the work presented in this paper develops a security scheme for medical images. This study proposes a security technique based on dual-layer fragile image watermarking to ensure the integrity, confidentiality and authenticity of medical images. The remainder of the paper is organized as follows: section 2 reviews the literature on digital image watermarking for medical images; section 3 explains the proposed method; section 4 describes the methodology used; section 5 presents the evaluation results of the proposed method; and finally, section 6 presents the conclusions of the study.

Digital Watermarking for Medical Images
Digital watermarking is a technique for embedding data or information, called the 'watermark', into a multimedia object called the 'cover', such as an image, video, audio or even plain text. According to their functionality, digital watermarks can be classified into two categories: robust and fragile. Robust watermarks are suited to authentication and are characterized by resistance to common signal processing operations such as compression. A fragile watermark, on the other hand, does not survive signal processing, which makes it well suited to verifying data integrity [12]. Therefore, this study develops a fragile watermarking technique to ensure the integrity of the medical image, together with its authenticity and confidentiality. In this study, Least Significant Bit (LSB) modification, a spatial-domain watermarking scheme, is used to build the fragile watermark for medical images. LSB modification works by replacing the least significant bit (LSB) of the cover image pixels with the watermark bits [13]. When the watermark is embedded in the two least significant bits of each pixel value (the 2nd and 1st LSB of the cover image), it is mostly undetectable by the human eye [14]. For example, if the image pixel is 180, which has the binary value 10110100, and the watermark bit is 0, the pixel value stays the same, 180. For the same pixel value with watermark bit 1, the pixel value becomes 10110101, which is 181 in decimal. The human eye can hardly distinguish the gray levels 180 and 181 on a scale where black is 0 and white is 255 [15].
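The pixel example above can be reproduced with a few bit operations. The following is a minimal Python sketch, not the authors' implementation; the helper names `embed_bit` and `extract_bit` and the `plane` parameter (0 for the 1st LSB, 1 for the 2nd LSB) are assumptions introduced for illustration.

```python
def embed_bit(pixel: int, bit: int, plane: int = 0) -> int:
    """Return `pixel` with bit-plane `plane` (0 = 1st LSB) set to `bit`."""
    mask = 1 << plane
    return (pixel & ~mask) | ((bit & 1) << plane)


def extract_bit(pixel: int, plane: int = 0) -> int:
    """Read back the bit stored in bit-plane `plane` of `pixel`."""
    return (pixel >> plane) & 1


# The example from the text: 180 is 10110100 in binary.
print(embed_bit(180, 0))           # 180, the LSB was already 0
print(embed_bit(180, 1))           # 181, i.e. 10110101
print(embed_bit(180, 1, plane=1))  # 182, writing to the 2nd LSB instead
```

Writing to the 2nd LSB changes a pixel by at most 2 gray levels, and writing to both planes by at most 3, which is why two-layer LSB embedding stays visually unobtrusive.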
Several digital watermarking techniques have been proposed for securing medical images. [16] proposed a robust and imperceptible dual watermarking method in which hybrid error correcting codes (a combination of BCH and repetition codes) are used to encode the watermark data before embedding it into the cover image. Since they use another image as the cover image, their method requires a cover image that can hold all the watermark data (the medical image). [17,18] proposed methods that combine digital watermarking with compression and/or cryptography algorithms. Both methods encrypt and compress the watermark data (the EPR and the bits of the ROI) before inserting it into the LSB of the cover image. The difference between them is that [17]'s method uses the Region of Interest (ROI) as the cover image, which may degrade the quality of the ROI. [18]'s method, on the other hand, uses the Region of Non-Interest (RONI) as the cover image, which means the RONI must be large enough to hold the entire watermark.
Instead of only hiding the EPR or the image's bits in a cover image (the same image or an unrelated one), researchers have also developed methods for tamper detection and localization, which determine whether an image has been tampered with and where. [19,20] proposed block-based watermarking methods for tamper detection and localization. However, their methods focus only on whether the image has been tampered with and ignore the DICOM tags, such as the EPR, image properties, and study and series properties. [21] proposed a dual-layer watermarking method for medical images. The method embeds the EPR data into the 2nd LSB of the RONI and block-based image data into the 1st LSB of the medical image for tamper detection and localization.
Another method, proposed by [6], combines cryptography and watermarking to provide confidentiality, authenticity and integrity for the medical image. Both [12]'s and [21]'s methods embed the EPR data into the RONI of the image, which means they depend heavily on the RONI being large enough to hold all the watermark bits.
The study presented in this paper proposes a technique for securing medical images based on dual-layer fragile digital watermarking. The proposed technique uses the hash value of the DICOM tags to ensure their integrity, encrypts the EPR to provide confidentiality and authenticity, and embeds both in the 2nd LSB of the medical image. It then calculates the hash value of each 8x8 non-overlapped block, creates an id for each block, and embeds these into the 1st LSB to provide integrity for the medical image. The proposed technique has been tested on CT and MRI medical images of heart disease, stroke, cervical cancer and breast cancer, which are the deadliest diseases in Indonesia. The details of the proposed technique are discussed in section 3.

The Proposed Security Technique
The proposed technique has two core processes: the watermark embedding process and the watermark extraction process. This section discusses both processes in detail.

Watermark Embedding Process
The watermark embedding process has two main sub-processes: watermark generation and watermark insertion. The flow of the process is summarized in Figure 2. In our study, two watermarks are generated and embedded into the corresponding medical image: the concatenation of the EPR and the hash of the DICOM Data Set (DICOM tags), and the concatenation of the id and hash value of each non-overlapped block. The watermark generation proceeds as follows:
a. Separate the EPR tags and encrypt them with the AES256 encryption algorithm (EncCt), which guarantees the confidentiality and authenticity of the EPR.
The next step is the watermark insertion (embedding) process, which embeds ConDC into the 2nd LSB and the concatenation of the id and hash of each block into the corresponding block. The insertion proceeds as follows:
a. Expand the length of ConDC to fit the size of the cover image.
b. To increase security, insert each bit of ConDC at a random position in the 2nd LSB of the image using the following equation:
where k is the secret prime key with k ∈ [1, n]; x is the bit position in ConDC with x ∈ [1, length of ConDC]; and n is the total number of pixels available for watermark embedding.
c. Move on to the second watermark. Since we use SHA256 on each block, which generates 64 hexadecimal characters, we need to randomly choose 16 hexadecimal characters (64 bits) to fit the total number of bits available in an 8x8 non-overlapped block, using the formula:
where k is the secret prime key with k ∈ [1, n]; x is the character position in the hash result with x ∈ [1, 16]; and n is the total number of pixels available for watermark embedding.
d. Embed the selected characters at random positions in the 1st LSB of the corresponding block using the following formula:
where k is the same secret prime key used in the previous step; x is the bit position in the binary form of the concatenated string with x ∈ [1, 64]; and n is the total number of pixels available for watermark embedding.
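The block-level watermark described in the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the hash is assumed to cover each pixel with its 1st LSB cleared (so that writing the watermark does not invalidate it), the first 16 hexadecimal characters of the digest are kept instead of a key-driven selection, the block id is omitted for brevity, and the hypothetical `order` argument stands in for the key-driven positions, whose formula is not reproduced here.

```python
import hashlib

BLOCK = 8  # block side length used in the text


def block_hash_bits(block):
    """Derive 64 watermark bits from an 8x8 block of integer pixels.

    Assumption: the hash ignores the 1st LSB of every pixel.
    Assumption: the first 16 hex characters of the digest are kept.
    """
    cleared = [p & ~1 for row in block for p in row]
    data = b"".join(p.to_bytes(2, "big") for p in cleared)  # 16-bit pixels
    digest = hashlib.sha256(data).hexdigest()  # 64 hex characters
    selected = digest[:16]                     # 16 hex characters = 64 bits
    return [int(b) for b in format(int(selected, 16), "064b")]


def embed_block(block, order=None):
    """Write the 64 hash bits into the 1st LSB of the 64 pixels of `block`.

    `order` is a permutation of range(64) naming the pixel that receives
    each bit (hypothetical parameter); the default is row-major order.
    """
    bits = block_hash_bits(block)
    order = list(range(BLOCK * BLOCK)) if order is None else order
    out = [row[:] for row in block]
    for bit, pos in zip(bits, order):
        r, c = divmod(pos, BLOCK)
        out[r][c] = (out[r][c] & ~1) | bit
    return out


# Toy 8x8 block with values in the 16-bit range.
block = [[(r * 8 + c) * 100 for c in range(8)] for r in range(8)]
marked = embed_block(block)
# Largest change made to any pixel by the embedding (never more than 1).
print(max(abs(a - b) for ra, rb in zip(block, marked) for a, b in zip(ra, rb)))
```

Because the hash ignores the bit plane that carries it, embedding a block twice gives the same result, and the same function can be reused to recompute the expected bits at extraction time.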

Watermark Extraction Process
The main objective of the watermark extraction process is to check the integrity, confidentiality and authenticity of the image. Details of this process are presented in Figure 3. As depicted in Figure 3, data integrity is checked in two phases: first check whether the image has been tampered with, then check whether the DICOM tags have been tampered with. The watermark extraction proceeds as follows:
a. Divide the medical image into 8x8 non-overlapped blocks and extract the 1st LSB of each block.
b. Calculate the hash of each block and compare the result with the extracted hash value. If the values differ, the image has been altered, and the mismatching blocks show the location of the tampering. Otherwise, continue to the next step.
c. Extract the 2nd LSB of the cover image to get the concatenated string of the DICOM tags hash value and the encrypted EPR.
d. Calculate the hash of the DICOM tags and compare it with the extracted hash of the DICOM tags. If the values are the same, continue to the next step; otherwise, the DICOM tags have been altered.
e. Lastly, after checking the medical image and the DICOM tags, decrypt the encrypted EPR.
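Steps a and b of the extraction can be sketched as follows. This is a self-contained illustration under the same kind of assumptions as before, not the authors' code: the hash is assumed to cover the pixels with their 1st LSB cleared, the first 16 hexadecimal characters of the SHA256 digest are used, and the bits are stored in row-major order without a secret key. The helper names are hypothetical.

```python
import hashlib

BLOCK = 8


def hash_bits(block):
    """64 bits from SHA-256 over the block with the 1st LSB cleared (assumed)."""
    data = b"".join((p & ~1).to_bytes(2, "big") for row in block for p in row)
    value = int(hashlib.sha256(data).hexdigest()[:16], 16)
    return [(value >> (63 - i)) & 1 for i in range(64)]


def get_block(image, br, bc):
    """Return the 8x8 block at block-row `br`, block-column `bc`."""
    return [row[bc * BLOCK:(bc + 1) * BLOCK]
            for row in image[br * BLOCK:(br + 1) * BLOCK]]


def embed(image):
    """Write each block's hash bits into its 1st LSB, in row-major order."""
    out = [row[:] for row in image]
    for br in range(len(image) // BLOCK):
        for bc in range(len(image[0]) // BLOCK):
            bits = hash_bits(get_block(image, br, bc))
            for i, bit in enumerate(bits):
                r, c = br * BLOCK + i // BLOCK, bc * BLOCK + i % BLOCK
                out[r][c] = (out[r][c] & ~1) | bit
    return out


def tampered_blocks(image):
    """Return (block_row, block_col) for every block whose hash mismatches."""
    bad = []
    for br in range(len(image) // BLOCK):
        for bc in range(len(image[0]) // BLOCK):
            block = get_block(image, br, bc)
            stored = [p & 1 for row in block for p in row]
            if stored != hash_bits(block):
                bad.append((br, bc))
    return bad


image = [[(r * 16 + c) * 7 for c in range(16)] for r in range(16)]
marked = embed(image)
print(tampered_blocks(marked))  # empty list: nothing flagged on an untouched image
marked[10][3] ^= 0b100          # flip the 3rd LSB of one pixel
print(tampered_blocks(marked))  # flags block (1, 0), which contains row 10, column 3
```

The check needs only the watermarked image, which matches the blind detection property discussed later in the paper, and the list of mismatching blocks is what a tamper localization map would be drawn from.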

Methodology
This study begins by reviewing previous work and identifying its position and contribution within the topic of securing medical images. The proposed technique is then designed and implemented, followed by the collection of study materials and data. Lastly, the data are analyzed and conclusions are drawn.

Source of Materials
Medical images related to the four deadliest diseases in Indonesia are used in this study to test the proposed watermarking technique. The anonymized medical images are taken from an open website. The medical images are originally in "dcm" format (DICOM format). The CT scan images have a pixel matrix of about 512x512 and the MRI images of about 256x256, with 16 allocated bits and in monochrome mode. These are the common medical image sizes for the CT and MRI modalities [22].

Performance Analysis on Image Quality (Imperceptibility Analysis)
In this research work, the proposed scheme uses the LSB watermarking technique to embed the watermark into the cover image. As a result, some degradation of image quality occurs. A good image watermarking technique lets the image hold as much data as possible while keeping the degradation of image quality low [23]. Various techniques exist to evaluate the difference between an original image and a watermarked image, either qualitatively or quantitatively. Among them, the most commonly used criteria, which have become the standard for image quality measurement, are the Mean Squared Error (MSE) and the Peak Signal to Noise Ratio (PSNR) as quantitative methods and the Structural Similarity Index Metric (SSIM) as a qualitative method.
MSE provides a quantitative score of how similar two signals or images are to each other, that is, the degree of error or distortion between them:
$$\mathrm{MSE} = \frac{1}{NM} \sum_{j=1}^{N} \sum_{k=1}^{M} \left( x_{j,k} - x'_{j,k} \right)^2$$
where NM is the image size, x_{j,k} denotes the (j,k)-th pixel value of the watermarked image and x'_{j,k} denotes the (j,k)-th pixel value of the original image. A smaller MSE value indicates a lower degradation level of the watermarked image.
Peak Signal to Noise Ratio (PSNR) is a mathematical measure of image quality based on the pixel differences between two images: the original image and the watermarked image. The MSE needs to be calculated first, after which the PSNR can be calculated with the following formula:
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{\mathrm{MAX}^2}{\mathrm{MSE}} \right)$$
where MAX is the maximum possible pixel value of the image. Both MSE and PSNR are simple and easy to calculate; however, they are not very well matched to perceived visual quality. For that reason, the Structural Similarity Index Metric (SSIM) was proposed in 2004 by [24]. SSIM is a Human Visual System (HVS) based method that compares the quality of two images by measuring their similarity. Three aspects are calculated to determine the similarity of two images: luminance, contrast and structure. The luminance function l(x, y) for the reference image x (here, the original image) and the test image y (the watermarked image) is:
$$l(x, y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$$
where μx and μy are the mean values of x and y, and C1 is a stabilizing constant.
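The MSE, PSNR and luminance measures described above can be sketched in plain Python. This is an illustrative sketch rather than the evaluation code used in the study. The `max_value` parameter (the peak pixel value) and the default `c1 = (0.01 * 255) ** 2` are assumptions, and the luminance term is computed over the whole image, whereas SSIM is normally evaluated over local windows.

```python
import math


def mse(original, watermarked):
    """Mean squared error between two equally sized 2-D pixel arrays."""
    n, m = len(original), len(original[0])
    total = sum((w - o) ** 2
                for row_o, row_w in zip(original, watermarked)
                for o, w in zip(row_o, row_w))
    return total / (n * m)


def psnr(original, watermarked, max_value=255):
    """PSNR in dB; `max_value` is the peak pixel value (assumed parameter)."""
    error = mse(original, watermarked)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / error)


def luminance(x, y, c1=(0.01 * 255) ** 2):
    """SSIM luminance term l(x, y), computed over whole images."""
    flat_x = [p for row in x for p in row]
    flat_y = [p for row in y for p in row]
    mu_x = sum(flat_x) / len(flat_x)
    mu_y = sum(flat_y) / len(flat_y)
    return (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)


original = [[180, 90], [200, 10]]
watermarked = [[181, 90], [200, 11]]  # two pixels changed by one gray level
print(mse(original, watermarked))            # 0.5
print(round(psnr(original, watermarked), 2))
print(luminance(original, original))         # 1.0
```

For 16-bit images such as those used in this study, `max_value` would have to be chosen to match the convention used for the reported values, which the text does not state.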

Performance Analysis on Tamper Detection (Fragility Analysis)
Several malicious attacks, such as image cropping, copy-paste, collage and constant-average attacks, are commonly used to test the fragility of a watermarking scheme [25]. Image cropping and copy-paste attacks are carried out by cropping or copy-pasting a randomly chosen part of the watermarked image. The collage attack in this study is carried out by copying an authenticated block of the watermarked image into another part of the image. Constant intensity will not resist this kind of attack. These attacks were chosen in order to compare the proposed scheme with the other fragile digital watermarking schemes proposed by [26] and [27]. The tamper detection rate (TDR) can be used to evaluate the quality of a tamper detection method. It measures the share of tampered blocks that are detected in the image; a higher rate indicates a better tamper detection method. The TDR is calculated as follows:
$$\mathrm{TDR} = \frac{\text{number of tampered blocks detected}}{\text{total number of tampered blocks}} \times 100\% \qquad (10)$$
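A tamper detection rate of this kind can be computed from two sets of block coordinates. The sketch below assumes the rate is the percentage of truly tampered blocks that the detector flags; the function name and the (block_row, block_col) representation are illustrative choices, not taken from the paper.

```python
def tamper_detection_rate(tampered, detected):
    """Percentage of truly tampered blocks that the detector flagged.

    `tampered` and `detected` are collections of (block_row, block_col) pairs.
    """
    tampered, detected = set(tampered), set(detected)
    if not tampered:
        raise ValueError("no tampered blocks to evaluate")
    return 100.0 * len(tampered & detected) / len(tampered)


truth = [(0, 0), (0, 1), (1, 0), (1, 1)]
found = [(0, 0), (0, 1), (1, 1), (5, 5)]  # one miss, one false alarm
print(tamper_detection_rate(truth, found))  # 75.0
```

Under this definition false alarms such as block (5, 5) do not lower the rate, so a fuller evaluation would report them separately.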

Results and Discussion
This section presents the results of the study and discusses their interpretation and implications.

Medical Images Dataset
The proposed technique was developed and implemented in the Python programming language. It was then tested on medical images related to the four deadliest diseases in Indonesia. Figure 4 shows the medical image dataset used in this research work, with detailed properties given in Table 1.   Figure 6. After the de-identification process, a DICOM file with complete DICOM tags is ready to be used for testing the performance of the proposed technique.

Performance Analysis on Image Quality (Imperceptibility Analysis)
Perceptual imperceptibility is one of the most significant indexes in watermarking performance analysis: the human eye should not be able to detect the watermark embedded in the cover image. Figure 7 shows the watermarked images generated by the proposed watermarking scheme. The watermarked images in Figure 7 are almost the same as the original images in Figure 4, which shows that the proposed scheme provides satisfactory watermark imperceptibility. The small difference between the images can be seen from their MSE, PSNR and SSIM values in Table 2, which compares the proposed technique with the standard watermarking technique. In the standard technique the watermark bits are embedded into the LSB of the cover image without randomization, whereas the proposed technique randomizes the insertion of the watermark bits. Both the standard and the proposed watermarking techniques have PSNR values above 44 dB and SSIM values of almost 1 (the maximum value of SSIM), which demonstrates the imperceptibility of the watermark objectively. The values in this table are plotted in Figure 8 and Figure 9. In these figures, the PSNR and SSIM values of the proposed technique are consistently higher than those of the standard watermarking technique. This means that randomizing the insertion of the watermark bits yields higher PSNR and SSIM values, which indicates that the proposed technique produces better watermarked image quality than the standard watermarking technique.

Performance Analysis on Tamper Detection (Fragility Analysis)
Malicious attacks are commonly used to test the fragility of digital watermarking [25]. Several classical malicious attacks, namely image cropping, copy-paste, collage and constant-average attacks, were performed on the watermarked images to test the fragility of the proposed watermarking technique. These attacks were chosen in order to compare the proposed technique with the other fragile digital watermarking techniques proposed by [26,27]. The attacks were performed on one of the sample images, the "mr_cervix" image, as shown in Figure 10.
The proposed technique is a blind watermarking technique: the original image is not needed to detect whether the watermarked image has been tampered with. The tamper localization maps obtained by the proposed watermarking technique are shown in Figure 11. The figure shows that the falsified areas are clearly identified by the proposed technique, which means that it achieves good tamper identification and localization results for various malicious attacks. Table 3 compares the fragility of the proposed technique with the other fragile digital watermarking techniques proposed by [26] and [27]. The data in the table show that the proposed technique offers better security in terms of the fragility of the watermarked image. The proposed technique is a blind watermarking technique and is more resistant to collage and constant-average attacks than the reference techniques.
The proposed watermarking technique is then compared with the current DICOM Standard security schemes and the standard watermarking technique described in [12]. The proposed technique provides authenticity and confidentiality for the header data of the medical image by applying AES256 encryption before the embedding process. Moreover, the integrity of the header data can be verified by checking the hash value of the header during the extraction process. The comparison of the proposed technique, the DICOM Standard and the standard watermarking technique is shown in Table 4. Based on the data in Table 4, it can be concluded that the proposed watermarking technique provides better security for medical images than the current DICOM security schemes and the watermarking technique proposed by [12]. The proposed technique succeeds in providing integrity, confidentiality and authenticity for medical images.

Conclusion
In this study, a dual-layer fragile digital watermarking technique is proposed to secure medical images, especially those in DICOM format. In the proposed technique, the confidential DICOM tags are first encrypted using the AES256 encryption algorithm and concatenated with the hash of the other DICOM tags, then embedded at random positions in the 2nd LSB of the medical image. This step guarantees the authenticity, confidentiality and integrity of the header data. Next, to guarantee the integrity of the medical image, the image is divided into 8x8 non-overlapped blocks, and the hash value of each block is calculated using the SHA256 algorithm, concatenated with the row and column of the block, and embedded into the 1st LSB of the image.
The proposed technique has been tested on six CT and MRI medical images related to the four deadliest diseases in Indonesia. The images are in monochrome mode, have 16 bits per pixel, and range in size from 256x256 to 512x512 pixels. From the results, it can be concluded that the proposed technique produces watermarked images of good quality: the human eye cannot detect the embedded watermark data, and the watermarked images have high PSNR values above 44 dB and SSIM values above 0.99, close to the maximum value of 1. Furthermore, the proposed watermarking technique detects several malicious attacks, namely cropping, copy-paste, collage and constant-average attacks, and is able to detect and locate the tampering precisely. Lastly, compared with the current DICOM security standard and another fragile watermarking technique for medical images, the proposed technique covers more aspects of security. The proposed technique succeeds in guaranteeing the authenticity, integrity and confidentiality of medical images for both the header data and the image pixel data.
The proposed dual-layer fragile digital watermarking technique is still far from a perfect security scheme for medical images, and many improvements are needed to close its flaws. The first suggestion for further research is to test the proposed technique on more medical images to obtain more reliable results. The technique also needs to be tested on original medical images, and the quality of the watermarked images needs to be examined directly by doctors, physicians or other experts, since the current results are based only on the calculation of PSNR and SSIM. Moreover, the proposed technique currently focuses only on the MRI and CT modalities, which produce monochrome images, while other modalities produce color images. Therefore, the second suggestion for future research is to extend the proposed technique so that it works properly on color medical images and can be used for any medical imaging modality.