Enhancing the accuracy of idle time in cognitive radio networks

This paper addresses an essential part of modelling the primary user (PU) activity pattern in the time domain: choosing the distribution that best represents the idle and busy times. An accurate PU activity model plays a vital role in developing a high-performance cognitive radio (CR) network. This work formulates the PU activity model from empirical data measured on a wireless local area network (WLAN) testbed. The detected idle time is analysed in two different scenarios, and a statistical approach is then applied to find the best fit. The findings show that the generalised Pareto (GP) distribution is the best fit, with D_KS = 0.266, compared to the other distribution fits.


INTRODUCTION
The shift from voice communication to multimedia applications and the evolution of Internet of Things (IoT) technology increase the demand for frequency spectrum in the wireless communication industry. Nevertheless, the use of frequency bands in the radio spectrum is limited because bands are allocated to primary users for particular services. This causes spectrum congestion when many users access the same spectrum band at the same time. On the other hand, many spectrum bands have been found to be underutilised, and the demand for a band varies with the users and the time of use. This leaves certain frequencies unused and available for cognitive radio (CR) users to occupy.
The performance of CR depends on the appearance of idle time in a channel, which is also known as spectrum occupancy. A detected idle time is the length of the opportunity that a secondary user (SU) senses during the spectrum sensing process. The spectrum occupancy in each channel is caused by the activity pattern of the primary user (PU). The detected spectrum occupancy relates to the state of the PU signal, which is categorised as static or dynamic. The state of the PU channel depends on the behaviour of the PU signal during the sensing duration [1]. Hence, modelling an accurate and realistic PU activity pattern is crucial. In [2][3][4][5][6][7], most of the previous PU activity models rely on assumptions and are based on the exponential distribution. Notwithstanding, several empirical measurement studies contradict this assumption. According to [8][9][10][11], in a real system the duration of PU activity is not exponentially distributed, so the exponential model is less accurate. Therefore, an accurate model of spectrum occupancy patterns that describes the real system should be developed to address this issue.
In this paper, an experimental setup to observe a CR system is built, and a wireless local area network (WLAN) is used to emulate a PU and represent random PU activity. The experimental data is analysed and energy detection is performed during the sensing period. An energy detection technique is used because it can detect the signal in both the frequency and time domains [12]. The main contribution of this paper is the proposed model of the empirical idle time from the measured WLAN signal using the generalised Pareto (GP) distribution. An algorithm was developed to detect a series of WLAN signals using energy detection. During the sensing period, spectrum holes are detected and evaluated to provide opportunities for the SU to access a channel without interfering with the PU network. The paper is organised as follows. Section 2 explains the research method in detail and gives an overview of modelling the PU activity based on the empirical model. Section 3 presents the results and analysis for the experimental models. Finally, section 4 concludes the paper.

RELATED WORKS
The understanding of previous studies and gaps in wireless technologies knowledge remained central during this process of identifying potential solutions to the research problem. There are a variety of ways to measure spectrum occupancy in time, space, or frequency domain. The measurement configurations and testbeds depend on study purposes, ranging from simple to more complicated methods.
Wireless local area network (WLAN) signals can be measured to observe spectrum usage in the time domain. The testbed to detect the PU signal can be configured with an antenna for signal detection, together with an access point and a spectrum analyser to display the detected signals. This testbed is known as an antenna-based WLAN setup. This type of measurement setup for the PU activity pattern was studied in [8], [13][14][15].
According to [8], a complex WLAN baseband signal is detected by a vector signal analyser using both an antenna-based setup and an isolated RF setup that is guaranteed to be free of interference from adjacent devices. Meanwhile, in [14] this type of measurement was used to detect two independent WLAN systems in an indoor environment. In contrast, [13] combined a commodity 802.11 wireless LAN card with a 2.4 GHz RF transceiver IC for spectrum measurements. A probabilistic model was built from the empirical data to determine the 802.11 and non-802.11 activity separately. Some studies use a universal software radio peripheral (USRP) to measure the WLAN signal and to validate simulation work. The works in [16][17][18][19] carried out their measurements using a USRP as an interface to software such as GNU Radio, LabVIEW, and MATLAB.

RESEARCH METHOD
An experimental setup is constructed to measure the wireless local area network (WLAN) signal and to emulate random PU activity under real-time wireless conditions. The same WLAN testbed has been used in [14,19] to maximise the SU throughput by clustering the idle time. The experiment consists of two stations (STAs) and wireless access point 1 (AP1). The stations, STA1 and STA2, are connected to AP1 through a wired and a wireless LAN, respectively. STA1 and STA2 share a large data file through AP1: STA2 retrieves the file from STA1 using the MS Windows file sharing facility. The large file ensures that the download does not complete during the measurement. Therefore, the resulting traffic on the WLAN is considered full-buffer traffic.
In the second scenario, two more stations, STA3 and STA4, are added to the same channel. Both are connected to wireless access point 2 (AP2) in the same way as in scenario 1. The specification of the WLAN system used in the experiment is shown in Table 1. The packets in the system are detected by the detecting antenna (DA), which is a wireless LAN omni-antenna. The DA is connected to a real-time spectrum analyser, the SA2600 (Tektronix), to display the spectral activity of the system in an indoor real-time environment. The measurement antenna is located near AP1 and AP2 to maintain the power of the detected signal, which helps to avoid false alarms and missed detections. Figure 1 and Figure 2 illustrate the structure of the experimental setup, and the specification of the signal detector is listed in Table 1.

Modelling of primary user activity based on the empirical model (EM-PuO)
The signals of the wireless network detected from AP1 and AP2 are displayed on the SA2600 spectrum analyser and then saved for offline processing and analysis. The spectrum analyser captures a minimum number of PSD samples in the time dimension, which is defined by the sampling rate and the measurement period. The measurement is carried out indoors as a short-term measurement campaign. The detected signal is an OFDM signal with the specification given in Table 2. The signal is then converted to PSD versus time for evaluation in the time domain. The displayed signal emulates PU activity in a channel at a fixed frequency over time.
The TOL extracted from the WLAN signal is modelled as EM-PuO. The EM-PuO is the empirical measurement data of the PU, which the WLAN system imitates as random PU activity for SU opportunistic transmission. The modelled EM-PuO is used to form a realistic spectrum occupancy of the PU channels based on real measurements of the WLAN system.
In line with the MAC protocol of the IEEE 802.11a standard, the WLAN specification is considered when defining and analysing the state space. IEEE 802.11a has different access modes, namely the distributed coordination function (DCF) and enhanced distributed channel access (EDCA). The IEEE 802.11a MAC protocol has standard interframe space (IFS) lengths, which depend on the previous frame type, the following frame type, the coordination function in use, and the physical layer (PHY) type [20].
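The extraction step described above can be sketched as follows. This is a minimal illustration only, assuming a fixed power threshold applied to equally spaced PSD samples; the function name `extract_idle_times`, the threshold and the sample period are hypothetical and are not taken from the measurement setup. The sketch returns every silent gap, so separating interframe spaces from back-off periods would still have to be done afterwards.

```python
import numpy as np


def extract_idle_times(power_mw, threshold_mw, sample_period_s):
    """Return the duration (in seconds) of each run of below-threshold samples.

    Assumption: a sample is 'busy' when its power is at or above the threshold.
    Runs that touch the start or end of the trace are included, although in a
    real trace they are truncated observations.
    """
    power = np.asarray(power_mw, dtype=float)
    busy = power >= threshold_mw                    # per-sample energy decision
    # Pad with busy samples so that runs touching the edges are closed.
    padded = np.concatenate(([True], busy, [True]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == -1)           # busy -> idle transitions
    ends = np.flatnonzero(change == 1)              # idle -> busy transitions
    return (ends - starts) * sample_period_s


if __name__ == "__main__":
    # Synthetic trace, not measured data.
    trace = [5, 5, 0, 0, 0, 5, 0, 5]
    print(extract_idle_times(trace, threshold_mw=1.0, sample_period_s=0.1))
```

Padding the decision vector with busy samples keeps the start and end indices paired, so the run lengths follow from a single subtraction.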

Characterisation of WLAN signal
The detected WLAN signal is classified into four states: the data packet, the short interframe space (SIFS), the ACK, and a stop period, t_p. The stop period t_p is the combination of the distributed coordination function interframe space (DIFS) and the random back-off time t_RB. Each interframe space (IFS) defines the priority with which a station accesses the wireless channel.
The temporal measurements for scenario 1 and scenario 2 are based on one signal cycle, as illustrated in Figure 3 and Figure 4 respectively. In scenario 1, only one PU is generated in the experiment, and one cycle of the signal consists of the data packet, SIFS, ACK, and the stop period. The segments with 5 mW power shown in Figure 3 are the PU signal. In a standard data packet transmission, the SIFS indicates the end of the data packet, and the ACK signal is sent by the AP to acknowledge the data packet. In Figure 4, there are two signals with different power levels, 5 mW and 2 mW, which represent PU 1 and SU 1 respectively. The characterisation of the detected WLAN signal into the BUSY and IDLE states is shown in Figure 5. The t_SIFS and t_DIFS spaces are classified as BUSY because the interframe spaces have an important function in the WLAN signal sequence and cannot be accessed by the SU. The t_SIFS is the shortest IFS; it follows the end of the data frame and precedes the ACK signal. Importantly, in this system the PU also performs sensing to detect wireless access: if any access is detected during t_RB, the back-off countdown is immediately stopped.
During measurement, the SU transmitter and receiver are configured to communicate over a short range with minimal signal power. However, even though the SU signal power is low, it produces harmful interference to the PU because of the short distance between the SU transmitter and the PU receiver. In this situation, the PU is not able to detect the access to the system and instead suffers from hidden terminal interference. Therefore, to avoid this problem, the SU only exploits the spectrum during the back-off period, t_RB. The data packet, SIFS, ACK, and DIFS periods thereby specify that the channel is in the BUSY state. Meanwhile, while the random back-off time is running, the channel is in the IDLE state of the model. The BUSY and IDLE states are thus defined by these two groups of periods.
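As a rough illustration of how the stop period is composed, the sketch below uses the nominal IEEE 802.11a OFDM PHY timing for a 20 MHz channel (SIFS of 16 µs, slot time of 9 µs, minimum contention window of 15). These constants come from the standard and are an assumption here, not values read from the measurement tables; the helper name `stop_period_us` is hypothetical.

```python
import random

# Nominal IEEE 802.11a OFDM PHY timing (assumed, not measured values).
SIFS_US = 16
SLOT_US = 9
CW_MIN = 15

# DCF interframe space: one SIFS followed by two slot times.
DIFS_US = SIFS_US + 2 * SLOT_US


def stop_period_us(cw=CW_MIN, rng=random):
    """Draw one stop period t_p = DIFS + t_RB and return (t_p, t_RB) in µs.

    The back-off counter is drawn uniformly from 0..cw slots, as in DCF.
    """
    backoff_slots = rng.randint(0, cw)
    t_rb = backoff_slots * SLOT_US
    return DIFS_US + t_rb, t_rb


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(stop_period_us(rng=rng))
```

With these nominal values the DIFS is 34 µs and a first-attempt back-off lies between 0 and 135 µs, in steps of one slot.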
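One way to write the two states described in this section compactly is sketched below. The symbol names t_BUSY, t_IDLE and t_data are assumed notation introduced only for this illustration; the grouping of periods follows the description in the text.

```latex
\begin{align}
t_{\mathrm{BUSY}} &= t_{\mathrm{data}} + t_{\mathrm{SIFS}} + t_{\mathrm{ACK}} + t_{\mathrm{DIFS}}, \\
t_{\mathrm{IDLE}} &= t_{RB}.
\end{align}
```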

Probability of generalised Pareto (GP) distribution
A statistical characterisation of the channel is needed to specify the PU activity traffic accurately. The statistical behaviour is captured by finding the distributions of the idle and busy time components. This section analyses which of the considered CDF models suitably describes the idle and busy times by employing the Kolmogorov-Smirnov (KS) distance test. Based on the lengths of the idle and busy periods extracted from the empirical data, an empirical cumulative distribution function (CDF) was derived and compared with the CDF of each distribution model. The generalised Pareto (GP) distribution has gained attention for parameter estimation in practical applications. Moreover, the GP distribution model has captured the variation of primary user traffic accurately in [3,21,22] and increased the SU throughput by 5% in [23]. The probability density function of the GP distribution is given as [24]:
$$f(x;\xi,\mu,\sigma) = \frac{1}{\sigma}\left(1+\xi\,\frac{x-\mu}{\sigma}\right)^{-\frac{1}{\xi}-1}$$
Meanwhile, the cumulative distribution function of GP($\xi,\mu,\sigma$) is given as:
$$F(x;\xi,\mu,\sigma) = 1-\left(1+\xi\,\frac{x-\mu}{\sigma}\right)^{-\frac{1}{\xi}}$$
where $\xi \neq 0$ denotes the shape parameter, $\mu$ is the location and $\sigma$ is the scale. Note that, for $\xi = 0$, the GP converges to the exponential distribution. The mean value of the distribution is $E(x) = \mu + \frac{\sigma}{1-\xi}$ for shape $\xi < 1$.
The Kolmogorov-Smirnov (KS) test is calculated between the empirical data and each distribution fit to quantify the distance, $D_{KS}$:
$$D_{KS} = \max_{x}\left|F_{e}(x) - F(x)\right|$$
where $F_{e}(x)$ is the empirical CDF of $x$ and $F(x)$ is the CDF of the fitted distribution. After running the KS test, the idle and busy times are approximately exponentially distributed random variables.
MLE is a standard estimation method in statistics. It provides more efficient estimators than other methods and is well established in distribution fitting. The likelihood function is given as:
$$L(\theta) = f(x_1,\ldots,x_n \mid \theta)$$
where $f(x \mid \theta) = P(X = x)$ is the unknown distribution from which a sample $x_1,\ldots,x_n$ was observed. The function $L(\theta)$ gives the likelihood of the observed sample, and its maximiser is:
$$\hat{\theta} = \arg\max_{\theta} L(\theta)$$
The value $\hat{\theta}$ is the MLE, which gives the largest likelihood of the observed data.
In practice, the MLE is often obtained by maximising $\log\{L(\theta)\}$. If the data are independent, the likelihood is $L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$. Since maximising $\log\{L(\theta)\}$ is equivalent to maximising $L(\theta)$, the log-likelihood function can be written as in [25]:
$$\log L(\theta) = \sum_{i=1}^{n} \log f(x_i \mid \theta)$$
A higher value of the MLE (log-likelihood) indicates a better fit of the distribution to the empirical data [26].
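The fitting procedure in this section can be illustrated with a short sketch using SciPy. It is only an approximation of the workflow under stated assumptions: the location is fixed at zero, the starting values passed to the optimiser are arbitrary, the input is synthetic rather than measured, and the function names `ks_distance` and `fit_gp` are hypothetical. SciPy's `genpareto` calls the shape parameter `c`.

```python
import numpy as np
from scipy import stats


def ks_distance(samples, cdf):
    """Largest gap between the empirical CDF of `samples` and a model `cdf`."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    model = cdf(x)
    # The empirical CDF is a step function, so check both sides of each step.
    gap_above = np.max(np.arange(1, n + 1) / n - model)
    gap_below = np.max(model - np.arange(0, n) / n)
    return float(max(gap_above, gap_below))


def fit_gp(idle_times_ms):
    """Fit a generalised Pareto distribution by maximum likelihood.

    Assumption: location fixed at 0; 0.1 and the sample mean are only
    starting guesses for the shape and the scale.
    """
    x = np.asarray(idle_times_ms, dtype=float)
    shape, loc, scale = stats.genpareto.fit(x, 0.1, floc=0.0, scale=x.mean())
    fitted = stats.genpareto(shape, loc=loc, scale=scale)
    return {
        "shape": float(shape),
        "loc": float(loc),
        "scale": float(scale),
        "d_ks": ks_distance(x, fitted.cdf),
        "loglik": float(np.sum(fitted.logpdf(x))),
    }


if __name__ == "__main__":
    # Synthetic idle times in ms, drawn from a GP distribution for illustration.
    demo = stats.genpareto.rvs(0.2, scale=0.1, size=500, random_state=0)
    print(fit_gp(demo))
```

Computing the KS distance by hand, rather than only calling `scipy.stats.kstest`, makes the link to the maximum gap between the empirical and fitted CDFs explicit.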

RESULTS AND ANALYSIS
The total number of detected idle times in both scenarios is displayed in Figure 6. More idle times are detected in scenario 2, with 183 idle times, than in scenario 1. This occurs because two stations, the PU and the SU, compete for the opportunity to transmit in the channel. Meanwhile, in scenario 1 only 156 idle times are detected because there is no competition between stations, as there is only one PU in the channel. Nevertheless, the detected idle times in scenario 1 are longer than those in scenario 2. Table 3 shows the observations of t_RB, which represents the idle time, for the different distribution fits. The accuracy of each fit to the empirical data is compared using D_KS and the MLE. According to the D_KS value, the GP distribution is the best fit for the idle period in both scenarios. The best and most accurate fit is the one with the minimum D_KS, which indicates that the fit describes the empirical data [2]. In addition, among the distribution fits listed in Table 3, the GP shows the highest log-likelihood value for the idle time. The GP distribution also appeared as the best fit to empirical data for other services, as pointed out in [27,28].
Figure 7 shows the empirical CDF and the fitted distributions for the idle time. The result indicates that the distribution fits start to approach the empirical CDF when the idle time reaches 0.4 ms. Meanwhile, at shorter idle times, around 0.2 ms, there is a large gap between the empirical CDF and the fitted distributions. According to the KS test, the GP distribution is the best fit for the idle period with D_KS = 0.266. The graph also indicates that 50% of the detected idle times in scenario 1 are shorter than 0.1 ms. The empirical CDF and the distribution fits for the idle time in scenario 2 are illustrated in Figure 8. This figure shows that almost 50% of the detected idle times are below 0.06 ms, which is shorter than in scenario 1.
The best fit for t_RB is the GP distribution with D_KS = 0.3370 and MLE = 3185. In scenario 2, where two users, PU1 and SU1 (see Figure 2), access the channel, the idle time (t_RB) is slightly shorter than in scenario 1 (Figure 1). The shorter idle time in scenario 2 arises from the competition among the users accessing the channel at the same time: the WLAN system generates a larger number of t_RB periods to give the other users an equal opportunity to transmit in the channel. The longer idle time in scenario 1, in contrast, opens a longer vacant period for the SU to exploit.
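The selection rule used in this section (smallest D_KS, highest log-likelihood) can be sketched as a ranking over several candidate fits. The candidate list below is an assumption made for illustration and may differ from the distributions actually compared in Table 3; the location is fixed at zero for every fit, and `rank_fits` is a hypothetical helper.

```python
import numpy as np
from scipy import stats

# Assumed candidate set; the distributions compared in the paper may differ.
CANDIDATES = {
    "exponential": stats.expon,
    "generalised Pareto": stats.genpareto,
    "lognormal": stats.lognorm,
    "gamma": stats.gamma,
}


def rank_fits(idle_times):
    """Fit each candidate by MLE and return (name, D_KS, log-likelihood) rows,
    sorted so that the smallest KS distance comes first."""
    x = np.asarray(idle_times, dtype=float)
    rows = []
    for name, dist in CANDIDATES.items():
        params = dist.fit(x, floc=0.0)          # location fixed at 0 (assumption)
        d_ks = float(stats.kstest(x, dist.cdf, args=params).statistic)
        loglik = float(np.sum(dist.logpdf(x, *params)))
        rows.append((name, d_ks, loglik))
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Synthetic positive durations, for illustration only.
    demo = stats.lognorm.rvs(1.0, scale=0.1, size=300, random_state=2)
    for name, d_ks, loglik in rank_fits(demo):
        print(f"{name:20s} D_KS={d_ks:.4f} logL={loglik:.1f}")
```

Sorting by D_KS mirrors the primary criterion in the text; the log-likelihood column allows the second criterion to be checked against the same ranking.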

CONCLUSION
This paper analysed the idle time detected in a WLAN and compared the duration of the idle time in two different scenarios. A statistical approach was then applied to the detected idle and busy times to find the best fit. The generalised Pareto distribution outperformed the other distributions in characterising the idle time, with the lowest value of D_KS = 0.266, which means that this distribution accurately approximates the empirical data.