An Antonym Substitution-based Model on Linguistic Steganography Method

ABSTRACT


INTRODUCTION
Text documents, such as letters of appointment, are among the most important media for conveying information. The demand for text documents as a primary carrier of information remains high in business and academia, whereas other media such as image, audio and video are mostly used in entertainment. Text documents therefore deserve attention because of the value of the information they carry and the high risk of the communication channel [1]. An irresponsible third party may tamper with the information for its own purposes. One solution to this issue is a branch of information hiding called steganography.
Steganography is a category of information hiding that conceals secret messages in innocuous-looking media, in a camouflaged manner that avoids suspicion from unintended recipients. Secret messages can be embedded in many types of media, such as images, text documents, video and audio files. Steganography is divided into two categories. The first, technical steganography, covers image, audio, video and other digitally invisible codes [2]. The second, text steganography, hides the secret message in a text medium. The steganographic approach is expected to help safeguard the security of information in text media [3]. Text steganography hides the secret text in the text medium so that a third party is unable to discover the existence of the message. In other words, it makes the secret information invisible and unnoticeable to unauthorized parties, while the intended receivers can still recover it. However, text is the most challenging domain for steganography compared to the other domains, because a text file offers only a small capacity in which to hide information [2].
There are two groups of text steganography: linguistic and format-based. Linguistic steganography depends on the linguistic structure of the sentences in the text. Format-based steganography consists of two kinds of techniques, word-rule based and feature-based. A word-rule based technique embeds the hidden message through a pattern of shifts in the text, and comprises line-shift coding and word-shift coding. Line-shift coding hides the message by shifting text lines vertically, whereas word-shift coding hides the message by shifting words horizontally, that is, by changing the distance between words [2]. The second type of format-based steganography is feature-based, which alters unique feature characteristics of the text according to code words. This technique hides the message in letter patterns or word lengths so that no visible change appears in the text [3]. In linguistic steganography, the level of security of the secret message is low, which makes it prone to attack. The substitution-based approach in linguistic steganography uses synonyms. However, the literature has shown that synonym substitution leads to a stego message that is easy to guess [4]. Detectors can discover that hidden messages exist in the text if they notice the changes in the analyzed text [5]. Low security is also an issue in linguistic steganography, especially in synonym-based substitution [6]. One piece of evidence is a synonym paraphrasing technique for the Spanish language that is easily attacked [7].
To address this issue, this paper proposes an antonym substitution-based approach, in an effort to mask the meaning of the original message. Most existing studies of substitution-based linguistic steganography apply synonym substitution; there is therefore a need to study the antonym-based approach as an alternative for linguistic steganography. The effectiveness of the approach is evaluated through verification and validation.

RELATED WORKS
The presence of intruders in communication technology means that information can be retrieved easily by anyone. Irresponsible intruders or attackers may disclose secret information to uninvolved parties, who may inspect or modify it in order to abuse it [8]. Because easier access to the internet increases the opportunities for attackers and intruders, additional measures must be taken to ensure that the information is well protected. One solution is to secure the communication with steganography, in particular linguistic steganography in this paper. This section reviews several developments and functions in steganography, especially linguistic steganography, beginning with a review of linguistic semantics.

Linguistic Semantics
Linguistic semantics is the study of the meaning that humans express through language [9]. In ordinary language the term often denotes a problem of understanding that comes down to word selection. In English, for example, a synonym of the word ``presence'' is ``attendance'', whereas its antonym is ``absence''. The two relations can be told apart from the definitions of synonym and antonym.
A synonym is a word that has the same, or almost the same, meaning as another word. Synonyms and antonyms are used by teachers, students, writers, editors, poets, and songwriters. A synonym expresses the same meaning with a different word, whereas an antonym is a different word whose meaning is opposed to that of the original word.
An antonym therefore changes a word by giving it the opposite meaning. Table 1 shows example dictionary entries of several words with their antonyms; words that have an antonym are the ones most likely to be used to encode a secret message in the text.
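An antonym wordlist of this kind can be pictured as a pair of lookup tables. The following is a minimal sketch in Python, assuming a one-directional list in which no word appears both as a cover word and as an antonym. Apart from ``presence'' and ``absence'', which come from the example above, the pairs are illustrative and are not the entries of Table 1; the names ANTONYM_PAIRS, TO_ANTONYM, FROM_ANTONYM and is_candidate are hypothetical.

```python
# Illustrative antonym pairs only; these are NOT the entries of Table 1.
ANTONYM_PAIRS = [
    ("presence", "absence"),
    ("increase", "decrease"),
    ("accept", "reject"),
    ("early", "late"),
]

# Forward map: word found in the cover text -> antonym that may replace it.
TO_ANTONYM = dict(ANTONYM_PAIRS)
# Reverse map: antonym found in the stego text -> original cover word.
FROM_ANTONYM = {antonym: word for word, antonym in ANTONYM_PAIRS}


def is_candidate(word: str) -> bool:
    """Return True if the word has an antonym in the list and can carry a bit."""
    return word.lower() in TO_ANTONYM


print(is_candidate("Presence"))  # True
print(is_candidate("table"))     # False
print(FROM_ANTONYM["absence"])   # presence
```

Keeping the two maps disjoint is a design choice of this sketch: it lets a reader of the stego text tell whether a word is in its original form or has been substituted.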

Linguistic Steganography Method
Linguistic steganography hides the secret message by exploiting word usage and linguistic modification of the text. One general technique commonly implemented in linguistic steganography is the synonym substitution method. Synonym substitution can be implemented in any language, on the assumption that the language of the text has synonymous words. Research efforts in the linguistic steganography field within the last decade are shown in Figure 1. From Figure 1, it is found that most implementations of steganography focus on format-based steganography as opposed to linguistic steganography. Within the substitution-based approach in linguistic steganography, only the use of synonyms has been investigated in past research. Synonym substitution, however, is expected to produce a cover text that has the same meaning as the plain text. This paper elaborates the mechanism of antonyms in the substitution-based approach as an alternative to the existing synonym-based approach.

ANTONYM-BASED SUBSTITUTION
This paper discusses a method for hiding a message in a cover text using antonyms from the Thesaurus of Antonyms database. The tool developed is called the Antonym Substitution-based (ASb) Steganographic Tool and is based on linguistic steganography [16]. The approach maintains the syntactic and semantic structure of the cover text such that it appears harmless to any unintended recipient. The hiding process is based on word replacement, with words being replaced by their absolute antonyms. It begins by feeding the system with the cover text, the wordlist and the secret text. The system then reads the cover text word by word. If a word is found in the wordlist and the current bit is 1, the word is replaced by its antonym. Once the whole cover text has been read, the stego text is generated.
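The embedding step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ASb tool itself (which the paper reports as a VB.NET application). The assumptions are: one bit is consumed at each cover word that appears in the wordlist, the text is split on whitespace with no punctuation handling, the replacement antonym is always written in lower case, and the cover text contains no word that is itself an antonym in the list. The wordlist entries and the names text_to_bits and embed_bits are hypothetical.

```python
# Hypothetical wordlist: cover word -> antonym (not the paper's wordlist).
WORDLIST = {"accept": "reject", "increase": "decrease", "early": "late"}


def text_to_bits(secret: str) -> list[int]:
    """Convert secret text to bits, 8 per UTF-8 byte, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in secret.encode("utf-8")
            for shift in range(7, -1, -1)]


def embed_bits(cover: str, bits: list[int], wordlist: dict[str, str]) -> str:
    """Replace candidate words by their antonym wherever the current bit is 1."""
    stego_words = []
    position = 0  # index of the next bit to embed
    for word in cover.split():
        key = word.lower()
        if key in wordlist and position < len(bits):
            # Candidate word: consume one bit; substitute only when bit = 1.
            stego_words.append(wordlist[key] if bits[position] == 1 else word)
            position += 1
        else:
            # Not in the wordlist, or no bits left: copy the word unchanged.
            stego_words.append(word)
    if position < len(bits):
        raise ValueError("cover text has too few candidate words for the secret bits")
    return " ".join(stego_words)


print(text_to_bits("A"))  # [0, 1, 0, 0, 0, 0, 0, 1]
cover = "they accept the plan to increase output early"
print(embed_bits(cover, [1, 0, 1], WORDLIST))
# they reject the plan to increase output late
```

Because each candidate word carries a single bit, a secret of n characters needs at least 8n candidate words in the cover text under this scheme, which illustrates why capacity is a concern in the text domain.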

START
  INPUT cover text, wordlist, secret text
  READ secret text
  CONVERT secret text to binary bits
  EMBED binary bits in cover text
  REPEAT-UNTIL end of cover text
    READ word
    IF bit = 1
      perform substitution
    ELSE-IF bit = 0
      no substitution
    END-IF
  END-LOOP
  PRINT stego text
END

Figure 2. Pseudo code ASb model

Figure 2 shows the pseudo code as the logical design. The process begins with the secret text, which is processed in a substitution environment where ASb is applied. The secret text is then converted to binary bits and embedded into the cover text, whose words are in turn substituted with antonyms. The antonyms are taken from the antonym wordlist and replace the identified words in order to generate a stego message. Finally, when the stego text is decoded, the antonyms that were applied are extracted and the secret text is displayed. In terms of logical design, the process model of ASb is shown in Figure 3.
The proposed approach has been implemented in a tool called the ASb Steganographic Tool. The interface of the tool was built as a GUI in the VB.NET programming language. Figure 3 also shows the user interface, which is concerned with how users add information to the tool and how the tool presents information back to them. The main characteristic of ASb is that the tool provides space for the user to enter the secret text. The ASb approach was tested with two categories of plain text: cover texts, selected from the Reuters-21578 news collection, and secret texts, selected as sentences in the language. The input datasets are classified by the character length of the text and by the bit size capacity of the text [17,18]. Every input dataset was entered into the ASb tool and the generated stego texts were recorded.
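The decoding step mentioned above, in which the applied antonyms are extracted and the secret text is recovered, can be sketched in Python as follows. The paper does not spell out this procedure, so the sketch rests on assumptions: the receiver shares the same one-directional wordlist, a word that appears as an antonym in the list is read as bit 1, a word that appears as a cover word is read as bit 0, and the receiver knows how many bits the message contains. The wordlist and the names extract_bits and bits_to_text are hypothetical.

```python
# Hypothetical shared wordlist: cover word -> antonym (not the paper's wordlist).
WORDLIST = {"accept": "reject", "increase": "decrease", "early": "late"}


def extract_bits(stego: str, wordlist: dict[str, str]) -> list[int]:
    """Read one bit per candidate word: antonym present -> 1, original word -> 0."""
    antonyms = set(wordlist.values())
    bits = []
    for word in stego.split():
        key = word.lower()
        if key in antonyms:
            bits.append(1)   # the word was substituted
        elif key in wordlist:
            bits.append(0)   # the word was left unchanged
    return bits


def bits_to_text(bits: list[int]) -> str:
    """Group bits into bytes (most significant bit first) and decode as UTF-8."""
    usable = len(bits) - len(bits) % 8  # ignore an incomplete trailing byte
    data = bytes(
        int("".join(str(bit) for bit in bits[i:i + 8]), 2)
        for i in range(0, usable, 8)
    )
    return data.decode("utf-8", errors="replace")


print(extract_bits("they reject the plan to increase output late", WORDLIST))
# [1, 0, 1]
print(bits_to_text([0, 1, 0, 0, 0, 0, 0, 1]))  # A
```

Candidate words that follow the end of the message are read as 0 bits, so in practice the message length would have to be agreed in advance or encoded in the first bits.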

EVALUATION RESULTS
This section discusses the analysis and the performance results of the proposed study based on the verification evaluation. Table 2 shows the character length of the stego text for the antonym and synonym approaches, from stego text 1 (ST 1) to stego text 20 (ST 20). Based on Table 2, the number of characters changes from the cover text in every stego text. This means that all of the binary bits were embedded in the resulting stego text. [...] bytes. The experiment also involved 20 secret texts with bit size capacities ranging between 36 bytes and 38 bytes. The relationship between the secret text and cover text input datasets based on bit size is shown in Table 3.
Table 3. List of size bit of stego text and secret text
Table 4 shows the bit size of the antonym and synonym stego texts relative to the secret text. It shows that a larger secret text bit size produces a larger stego text bit size. For both the antonym and the synonym approach, the sizes of the generated stego texts were almost the same for each dataset tested. This means that the antonym approach is comparable with the synonym approach.
The next experiment, on the bit size of the stego text relative to the cover text, is similar to the previous one, but it compares the bit size of the cover text with that of the stego text. It also uses both the antonym and the synonym approach. The result is used to compare the bit size of the generated stego text with the original bit size capacity of the cover text, as shown in Table 5. From Table 5, the bit size of the stego text increased relative to the cover text in both approaches. All stego texts generated with the antonym and synonym approaches show an increase of 0.18% to 0.20% in size in bytes.
The sizes produced by the two approaches are also comparable to each other.
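The two measurements used in this evaluation, character length and size in bytes, together with the percentage change between cover text and stego text, can be computed as in the following Python sketch. It assumes UTF-8 encoding for the byte size; the example strings are illustrative and are not taken from the paper's datasets or tables, and the function name size_report is hypothetical.

```python
def size_report(cover: str, stego: str) -> dict[str, float]:
    """Compare character length and UTF-8 byte size of a cover and stego text."""
    cover_bytes = len(cover.encode("utf-8"))
    stego_bytes = len(stego.encode("utf-8"))
    return {
        "cover_chars": len(cover),
        "stego_chars": len(stego),
        "cover_bytes": cover_bytes,
        "stego_bytes": stego_bytes,
        # Positive when the stego text is larger than the cover text.
        "change_percent": 100.0 * (stego_bytes - cover_bytes) / cover_bytes,
    }


# Illustrative strings only; one word has been replaced by a longer antonym.
report = size_report("they start the plans", "they finish the plans")
print(report["cover_chars"], report["stego_chars"])  # 20 21
print(report["change_percent"])                      # 5.0
```

The size change depends only on the length difference between each substituted word and its antonym, which is consistent with the small percentage changes reported for long cover texts.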

CONCLUSION
This paper has elaborated the development of a new linguistic steganography approach named antonym substitution. The approach is based on a substitution process for hiding that targets words that have an antonym. A tool has been developed to test the proposed approach, and the approach has been verified based on the character length of the stego text relative to the cover text, the bit size of the secret text relative to the stego text, and the bit size of the cover text relative to the stego text. For future work, ASb is expected to continue producing innocuous text and to increase the utilization of cover texts.