Towards Reliable IEEE 802.15.4g SUN with Re-transmission Shaping and Adaptive Modulation Selection

In this paper, we propose and evaluate two mechanisms aimed at improving the communication reliability of IEEE 802.15.4g SUN (Smart Utility Networks) in industrial scenarios: RTS (Re-Transmission Shaping), which uses acknowledgements to track channel conditions and dynamically adapt the number of re-transmissions per packet, and AMS (Adaptive Modulation Selection), which uses reinforcement learning based on MAB (Multi-Armed Bandits) to choose the modulation that provides the best reliability for each packet re-transmission. Both mechanisms are evaluated through computer simulations using a dataset obtained from a real-world deployment and two widely used metrics, the PDR (Packet Delivery Ratio) and the RNP (Required Number of Packet transmissions). The PDR measures the ratio between received and transmitted packets, whereas the RNP is the number of packet repetitions before a successful transmission. The results show that both mechanisms increase communication reliability without jeopardizing the battery lifetime constraints of end devices. For example, when three re-transmissions per packet are allowed, RTS reaches a PDR of 98% with an RNP of 2.03, whereas AMS reaches a PDR of 96% with an RNP of 1.32. Additionally, the combination of both proposed mechanisms reaches a 99% PDR with an RNP of 1.7, making IEEE 802.15.4g SUN compliant with the stringent data delivery requirements of industrial applications.


Introduction
LPWAN (Low-Power Wide Area Networks) are a key enabling technology for the IIoT (Industrial Internet of Things). They combine robust modulation techniques with low data rates, allowing the creation of networks with star topologies that support large communication distances between the gateway and the end devices. In turn, this simplifies the deployment and maintenance of such networks, which is a critical aspect for their adoption in an industrial context. However, LPWAN technologies face network reliability and scalability issues due to the use of unlicensed spectrum (i.e., the 868 MHz band in Europe, the 915 MHz band in America, or the 2.4 GHz band worldwide) and the use of low data rates (i.e., in the order of kbps). In particular, using unlicensed spectrum has an impact on network reliability, as transmissions from other devices within the network and devices from other networks create interference that leads to packet collisions and re-transmissions. Also, combining long range communications with low data rates has an impact on scalability, as it limits the number of devices that can be supported by the network. On top of that, industrial applications have stringent reliability requirements that are difficult to meet given these constraints.
Pere Tuset-Peiró, peretuset@uoc.edu. Extended author information available on the last page of the article.
One LPWAN technology that specifically focuses on industrial automation is Wi-SUN, targeting SUN (Smart Utility Networks) and FAN (Field Area Networks) applications, which allow to remotely monitor and control industrial equipment in the electric grid and other utilities. The core of Wi-SUN is the 6LoWPAN (IPv6 over Low power Wireless Personal Area Networks) stack, as defined in [21], and the IEEE 802.15.4 standard, as defined in [10]. On one hand, the 6LoWPAN stack combines a reduced version of IPv6 and the related transport and application protocols (i.e., TCP/UDP, CoAP). On the other hand, IEEE 802.15.4 defines the data-link and the physical layers, which are responsible for organizing and managing the underlying network and delivering frames between end devices. This allows end devices to connect directly to the network backbone while having a high reliability and maintaining a low-power profile that allows them to operate using batteries for multiple years.
Considering the rising interest in IEEE 802.15.4g SUN for LPWAN and its potential impact in the IIoT, in this paper we explore two data-link layer mechanisms that are aimed at improving its reliability. First, RTS (Re-Transmission Shaping) exploits acknowledgements to track channel conditions and dynamically adapt the number of re-transmissions per packet. Second, AMS (Adaptive Modulation Selection) exploits reinforcement learning to determine the most suitable IEEE 802.15.4g modulation (i.e., FSK, OQPSK and OFDM) for each packet re-transmission. To evaluate these mechanisms we perform computer simulations based on a dataset captured from a real-world IEEE 802.15.4g SUN deployment [23] in an industrial scenario, and we use the PDR (Packet Delivery Ratio) and the RNP (Required Number of Packet transmissions) as the metrics to measure their impact on link reliability. The PDR is the ratio between received and transmitted packets, whereas the RNP is the number of packet repetitions before a successful transmission.
The results show that both AMS and RTS are effective mechanisms to improve link reliability. As a baseline, when using up to 3 re-transmissions per packet and a random modulation selection strategy, a PDR=93.4% and an RNP=1.49 are achieved. In comparison, RTS reaches a PDR=98.2% with an RNP=2.03, whereas AMS reaches a PDR=96% with an RNP=1.32. Moreover, we also demonstrate that combining both mechanisms is feasible and beneficial to improve link reliability, reaching a PDR=99% while maintaining an RNP=1.70. Note that the work presented in this paper extends the research presented in [8,22] and [18], where we explored the feasibility of using RTS and AMS independently. Specifically, the dataset presented in [22] is the driver of our evaluation. Moreover, compared to [8], we introduce the use of reinforcement learning to determine the most suitable IEEE 802.15.4g SUN modulation in AMS, and, compared to [18], we combine RTS with AMS to further improve PDR while reducing the RNP.
The remainder of the paper is organized as follows. Section 2 presents an overview of the IEEE 802.15.4g SUN standard and the research related to improving the reliability of LPWAN networks. Section 3 presents an overview of the proposed mechanisms aimed at improving the reliability of IEEE 802.15.4g SUN, namely re-transmission shaping and adaptive modulation selection. Section 4 presents the system model that we use to evaluate the proposed mechanisms, as well as the system model where both mechanisms are combined. Section 5 presents the methodology, based on computer simulations using an existing IEEE 802.15.4g dataset, that we have used to evaluate the proposed mechanisms. Section 6 presents and discusses the results obtained from computer simulations. Finally, Section 7 concludes the paper.

Background and Related Work
This section introduces the IEEE 802.15.4g SUN standard and presents the research focused on improving its reliability under the field of LPWANs.

Overview of IEEE 802.15.4g SUN
The IEEE 802.15.4g amendment was included in the IEEE 802.15.4-2015 standard as an alternative to implement LPWANs, which require long-range communications and low-power operation. It defines three new modulations targeted to SUN applications: SUN-FSK, SUN-OQPSK and SUN-OFDM. A total of 31 different physical layer configurations are supported, with bit-rates ranging between 6.25 kbps and 800 kbps. These configurations allow to trade data rate, power consumption, and occupied bandwidth, while providing robust long range communications.
First, the SUN-FSK modulation is included mainly due to its power efficiency and to ensure compatibility with legacy systems, and targets low data rates and high energy efficiency applications. Three different operation modes are defined for each frequency band supported in the standard. They define modulation and channel parameters, such as the modulation type (BFSK or 4FSK), the channel spacing, and the modulation index. The data rate may vary from 50 to 200 kbps.
Second, the SUN-OQPSK modulation is combined with DSSS (Direct Sequence Spread Spectrum) to allow a better resistance to interference. It defines four different rate modes within each frequency band, providing data rates ranging from 6.25 kbps to 500 kbps, depending on the spreading factor. For some bands, it is possible to use an alternative spreading mode, called Multiplexed DSSS (MDSSS) [10]. The OQPSK modulation was also considered in the IEEE 802.15.4-2006 standard, but in the IEEE 802.15.4g, other operation modes were defined, allowing its use in different frequency bands.
Finally, the SUN-OFDM modulation provides high data rates and long communication range, while dealing with interference and multi-path propagation. Four different OFDM options are defined, each one with a different number of active tones. For each option, a set of Modulation and Coding Schemes (MCS), numbered from 0 to 6, may be used. The MCS defines the modulation (i.e., BPSK, QPSK, or 16-QAM), the frequency repetition configuration (i.e., 4x, 2x or no frequency repetition), and the code rate (i.e., 1/2 or 3/4).

Reliability of IEEE 802.15.4g SUN
Over the past few years, several papers have evaluated different aspects of IEEE 802.15.4g SUN, including coverage and link reliability depending on the different physical layer configurations and the target application scenarios.
In [12], the authors evaluate all 31 physical layer configurations of the IEEE 802.15.4g for environmental observations. The results show that the longest radio links were obtained when using SUN-FSK or SUN-OQPSK, compared to SUN-OFDM. In [13], the authors evaluate three IEEE 802.15.4g configurations operating at 2.4 GHz: O-QPSK, OFDM with frequency repetition, and OFDM without frequency repetition, and show that channel hopping makes sense even when using SUN-OFDM, as the channel width is small and all sub-carriers are influenced in a similar way by multi-path fading. In [14], the authors also evaluate the IEEE 802.15.4g using the 2.4 GHz ISM band, for smart building applications. The results show that for the considered indoor scenario, which is severely impacted by multi-path propagation, SUN-OFDM can provide better reliability than SUN-OQPSK. Overall, these papers show that the different physical layer configurations may present different levels of quality for different scenarios. However, the authors do not propose or evaluate any mechanism to deal with the different challenges of low-power wireless communications.
Diversity schemes are widely used in wireless networks to improve the communication reliability, and to deal with the temporal and spatial variations in link quality. Different strategies may be adopted at different layers. At the physical layer, antenna and coding diversity are two well-established mechanisms, but they are not widely used in low-power networks as they require additional or complex hardware. At the data-link layer, packet replication in the time and frequency domains are two simple and widely used diversity schemes. However, packet replication increases node energy consumption and impacts network congestion, whereas frequency diversity requires time-synchronization among nodes, thus increasing complexity.
Several papers have proposed the use of diversity schemes in IEEE 802.15.4 networks. For example, the authors of [15] focus on the data-link layer and propose an adaptive algorithm based on MAC (Medium Access Control) parameters (i.e., macMinBE, macMaxCSMABackoffs, and macMaxFrameRetries) for minimizing power consumption while guaranteeing reliability and delay constraints in the packet transmission. The works in [27] and [26] proposed the use of time synchronization and channel hopping (i.e., sending subsequent packets over different frequency channels) at the physical layer as a means to combat both multi-path propagation and external interference. The protocol proposed in [7] combines multichannel communication, real-time link quality estimation, and dynamic channel allocation, to deal with the problems that affect the link quality in industrial environments.
Several recent papers have applied machine learning in IEEE 802.15.4 networks. In [2], different supervised-learning algorithms are evaluated for the inference of the radio-link state, i.e., LoS (Line-of-Sight) or NLoS (Non Line-of-Sight) radio links. By monitoring the link state in real-time it is possible to dynamically adapt the transmission scheme to improve reliability. However, only the O-QPSK modulation at 2.4 GHz is considered in the evaluations, and no diversity mechanism is proposed.
In [6,20], and [3], MAB (Multi-Armed Bandit) algorithms are used to optimize IEEE 802.15.4-TSCH (Time Slotted Channel Hopping) networks. In the first one, the scheduling problem is modeled in terms of a combinatorial MAB process, with the goal of computing the optimal schedule based on real-time interactions with the wireless network. In the second one, the channel quality estimation process is modeled as a MAB problem, in order to classify the channels and manage the blacklists (i.e., the list of channels that are not allowed to be used by the nodes). In [3], the authors also use MAB algorithms to select channels in TSCH networks. In [9], the distributed channel selection problem is modeled as a MAB problem to select channels in IEEE 802.15.4g networks under the interference of Sigfox and LoRaWAN devices.
Some articles already discussed in this section propose adaptive diversity strategies to improve network reliability, but they do not consider the use of multiple modulations. Modulation diversity is a method to improve the reliability of communications by using different modulations. That is, consecutive packets can be transmitted using two or more modulations (e.g., FSK or PSK), taking advantage of their different properties regarding propagation and interference effects. For example, it is well known that narrowband modulations, such as FSK, are more robust against interference, whereas wideband modulations, such as OQPSK-DSSS, provide better tolerance against multipath propagation.
Regarding modulation diversity, some papers have applied this concept in different ways and for different purposes. In [28], a dual mode IEEE 802.15.4 receiver is proposed. The receiver can choose between an MSK (Minimum Shift Keying) detector or an OQPSK detector to trade energy consumption, latency, and reliability. In addition, it can define the mode based on an SNR indicator to optimize performance. However, the authors do not consider the use of the SUN modulations, nor do they propose the use of different modulations to transmit the packets. In [19], the authors propose the use of cooperative modulation diversity to improve reliability. In particular, nodes rotate a QPSK constellation and interleave the phase and quadrature components independently. However, this approach requires modification at the physical layer, and the use of relay nodes, which can be difficult in sparse networks. Finally, in [17] the authors propose using modulation diversity for LoRa networks to improve a localization algorithm. The modulation diversity is obtained by changing bandwidth, spreading factor and code rate. Overall, these papers propose the use of modulation diversity, but they do not propose or evaluate adaptive modulation selection strategies for low-power wireless networks.
More recently, some papers have proposed using different physical layers in TSCH networks. In [16], the 6TiSCH implementation of the OpenWSN operating system is evaluated using three different physical layer configurations: SUN-FSK Option 1 (at 50 kbps and using the 868 MHz band), SUN-OFDM Option 1 MCS 3 (at 800 kbps and using the 868 MHz band), and 6TiSCH default physical layer (O-QPSK at 250 kbps and using the 2.4 GHz band). The results show that SUN-FSK provides the best network formation time, reliability and latency, while the O-QPSK provides the longest lifetime. The SUN-OFDM presents balanced results.
In [4], the use of different MCS of the SUN-OFDM in TSCH networks is proposed. The slot bonding concept is introduced, to deal with the different bit rates of the different PHY configurations, and a mixed integer linear program model is described, to determine the optimal configuration aiming to maximize the PDR and minimize energy consumption. However, the model does not consider the temporal and spatial variations in link quality that occur in dynamic scenarios (e.g., in industrial environments). In practice, the proposed slot bonding scheduling would need to be recomputed continuously to deal with these variations, which could incur in a high overhead. Although [16] and [4] have evaluated different physical layer configurations, including SUN modulations, the different modulations were not used in a combined way to improve reliability and no adaptive mechanism to select the modulations was proposed.
In [25], a multi-PHY TSCH protocol is proposed using fixed time slots, but allowing several packets to be transmitted inside a slot when a modulation with high bit rate is used. In addition, the protocol dynamically selects the most appropriate modulation to be used by a node using an RSSI (Received Signal Strength Indicator) based link quality estimator. As can be observed in [23], it may be difficult to assess the quality of the link using only RSSI. However, other estimators could be integrated into the proposed protocol. The protocol was evaluated experimentally considering two modulations (i.e., SUN-FSK at 50 kbps and the non-standardized 4-GFSK at 1 Mbps). Although this paper has proposed the use of dynamic modulation selection, the concept of modulation diversity to transmit consecutive packets using different modulations to improve reliability is not considered.
In [8], three different adaptive modulation diversity strategies are proposed for IEEE 802.15.4g SUN networks, called 1M, 2M and 3M. These strategies use a simple link quality estimation mechanism based on the ACK Reception Ratio. In the current paper, we compare our adaptive modulation selection strategy based on MAB algorithms to the 3M mechanism, which presents the best performance in the evaluations described in [8].
In summary, although some recent works have proposed the use of modulation diversity, to the best of our knowledge, this is the first work that proposes and evaluates the combination of two mechanisms (i.e., re-transmission shaping and adaptive modulation selection) aimed at improving the communication reliability of IEEE 802.15.4g SUN networks in industrial scenarios. In addition, unlike the other solutions described, this work uses reinforcement learning to choose the modulation to be used in each packet re-transmission.

Overview of Re-transmission Shaping and Adaptive Modulation Selection
In this section, we introduce the RTS (Re-Transmission Shaping) and the AMS (Adaptive Modulation Selection) mechanisms, both proposed to improve the link reliability of IEEE 802.15.4g SUN networks.

Re-transmission Shaping
The re-transmission of packets is a common mechanism used at the data-link layer to guarantee the delivery of data packets between an end-device and a gateway using a wireless communication technology. Whenever the acknowledgment packet from the gateway is not received at the end-device due to physical layer effects (i.e., multipath propagation or internal/external interference), the end-device re-transmits the original data packet to provide another opportunity for the packet to be successfully delivered. However, since physical layer effects are not deterministic and packet re-transmission increases the energy consumption of the end-device and the network load, a maximum number of re-transmissions per data packet is typically set. Nevertheless, the use of a fixed number of re-transmissions per packet may not be optimal. In particular, if channel conditions are too adverse, an originating end-device could need more than the fixed number of re-transmissions to deliver a given data packet. Conversely, when the link conditions are favorable, the end-device does not need to use all the allowed re-transmission attempts for most data packets.
In order to address the aforementioned problem, RTS can dynamically adapt the number of maximum retransmissions per packet according to channel conditions in order to meet both the data delivery requirements of the application and the target battery lifetime of end-devices. That is, given the average number of re-transmissions per data packet, RTS keeps track of the number of retransmissions that have not been used to transmit previous data packets (e.g., when packets have been received by the gateway at the first transmission attempt). These unused re-transmission attempts are accumulated and can be used in the future when channel conditions are adverse and the average number of re-transmissions per data packet is not sufficient to guarantee a successful delivery.

Adaptive Modulation Selection
As introduced in Section 2, the IEEE 802.15.4g SUN standard defines multiple modulations (i.e., SUN-FSK, SUN-OQPSK and SUN-OFDM) and each modulation has different properties (i.e., occupied bandwidth, data rate and robustness mechanisms) against physical layer effects. Hence, the possibility of using different modulations for each packet re-transmission emerges naturally to take advantage of the varying nature of channel conditions at the physical layer. The decision process to decide which modulation to use for each packet re-transmission can be performed using simple learning techniques based on the probability of success of each modulation. That is, for each packet re-transmission we can use the acknowledgment packet from the gateway as a reward to evaluate the performance of each modulation, thus assigning the modulation associated with the highest probability of receiving an acknowledgement packet.
The problem of recurrently choosing an action from a set (either finite or infinite) based on the expected reward that the action will produce has been widely studied and formalized in the model known as Multi-Armed Bandit (MAB) problem. In this specific setting, we also have to deal with the non-stationarity of the wireless channel, but the MAB problem lacks a complete formal solution for such case. Nevertheless, there is a variety of algorithms that have proven their effectiveness for this task [11]. Among them, in this paper we evaluate some of the most common ones: Epsilon Greedy (EG), Boltzmann Exploration (BE), also known as Softmax, and the Discounted UCB1 algorithm (D-UCB) [5].
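As a rough illustration of how these three selection rules differ, the following Python sketch applies each of them to a set of per-arm reward estimates. The function names, the parameters (`epsilon`, `temperature`, `gamma`, `xi`) and their default values are assumptions made for this example, not settings taken from the evaluation. The D-UCB index follows the commonly published form for rewards bounded in [0, 1].

```python
import math
import random


def epsilon_greedy(means, epsilon, rng=random):
    """With probability epsilon explore a random arm, otherwise pick the best mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return max(range(len(means)), key=lambda arm: means[arm])


def boltzmann(means, temperature, rng=random):
    """Sample an arm with probability proportional to exp(mean / temperature)."""
    weights = [math.exp(m / temperature) for m in means]
    return rng.choices(range(len(means)), weights=weights)[0]


class DiscountedUCB:
    """Discounted UCB sketch; gamma and xi are illustrative values (assumptions)."""

    def __init__(self, n_arms, gamma=0.95, xi=0.6):
        self.gamma, self.xi = gamma, xi
        self.counts = [0.0] * n_arms  # discounted number of pulls per arm
        self.sums = [0.0] * n_arms    # discounted sum of rewards per arm

    def select(self):
        # Play every arm once before trusting the confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        total = sum(self.counts)

        def index(arm):
            mean = self.sums[arm] / self.counts[arm]
            bonus = 2.0 * math.sqrt(self.xi * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(range(len(self.counts)), key=index)

    def update(self, arm, reward):
        # Discount all past observations, then add the new one.
        self.counts = [self.gamma * c for c in self.counts]
        self.sums = [self.gamma * s for s in self.sums]
        self.counts[arm] += 1.0
        self.sums[arm] += reward
```

The discount factor is what lets D-UCB cope with a non-stationary channel: old acknowledgements lose weight, so an arm that has not been tried recently regains a large exploration bonus and is eventually sampled again.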

System Model
In this section we present the system models that describe the operation of each of the mechanisms introduced in the previous section.
In general, we consider a network with n end-devices, equipped with a battery of capacity C (mAh), that periodically transmit a data packet with length L (bytes) and period T (seconds), and one gateway that receives the packets transmitted by the end-devices. Upon successfully receiving a data packet from an end-device, the gateway transmits an acknowledgment packet (ACK) back to the originating end-device. If the originating end-device does not receive the ACK, either because the original data packet is not successfully received or the ACK is lost, then the end-device re-transmits the data packet, if the maximum number of re-transmissions has not been reached yet.

Re-transmission Shaping
For RTS, considering the values of C, L and T, we define $N_{AVERAGE}$ as the average number of re-transmissions allowed per data packet to be transmitted to the gateway. We assume that the value of $N_{AVERAGE}$ is set in advance in order to meet the battery lifetime of the end device in the worst case scenario, i.e., when each data packet requires the maximum allowed number of re-transmissions. For example, if $N_{AVERAGE} = 3$, then the end device is allowed to perform 3 transmission attempts per data packet while operating for 1 year. Then, when the source end device succeeds in transmitting a data packet $i$ with a number $X_i$ of re-transmissions, with $0 \le X_i < N_{AVERAGE}$, the number $U_i = N_{AVERAGE} - X_i$ of unused re-transmissions can be accumulated for the re-transmission of subsequent data packets.
Hence, as depicted in Figure 1, the model of the re-transmission shaping mechanism is based on 5 variables, $N_{AVERAGE}$, $N_{MAXIMUM}$, $N_{ALLOWED}$, $N_{USED}$ and $N_{AVAILABLE}$, as described next:
- $N_{AVERAGE}$ is an input value that represents the average number of transmission attempts per data packet that are allowed while ensuring the lifetime of the end device. It is not required to be an integer, as it can be directly derived from the energy consumption constraint of the end device;
- $N_{MAXIMUM}$ is an input value that represents the number of extra re-transmissions per packet that are allowed in addition to $N_{AVERAGE}$;
- $N_{ALLOWED}$ is an output value that represents the maximum number of re-transmissions that are allowed for the current data packet being transmitted;
- $N_{USED}$ is an input value that represents the number of re-transmissions that have been required to successfully deliver the previous data packet to the gateway;
- $N_{AVAILABLE}$ is an internal state variable that accumulates the number of re-transmissions that have not been spent in previous data packet transmissions and, hence, can be used in the future. It is initialized to zero.
Figure 1: Diagram of the re-transmission shaping mechanism with the input ($N_{AVERAGE}$, $N_{MAXIMUM}$ and $N_{USED}$), output ($N_{ALLOWED}$) and internal state ($N_{AVAILABLE}$) variables, and its relationship with the data-link layer.
Regarding $N_{MAXIMUM}$, notice that its value is set to prevent a packet transmission that experiences adverse instantaneous channel conditions from depleting all the $N_{ALLOWED}$ re-transmissions available. Hence, its value has to be set depending on the context of each deployment. For environments presenting short deep drops in the link reliability it can be set to a high value, which strongly increases the number of re-transmissions for short periods of time. In contrast, for environments with long shallow drops in link reliability it can be set to a low value, which extends the effects of re-transmission shaping for a longer period of time.
Using these variables, the operating principle of the re-transmission shaping mechanism is the following. Before a data packet transmission starts, the re-transmission shaping mechanism calculates the number $N_{ALLOWED}$ of re-transmissions available as
$$N_{ALLOWED} = N_{AVERAGE} + \min(N_{AVAILABLE}, N_{MAXIMUM}) \tag{1}$$
so that, if unused re-transmissions are available, they are added to $N_{AVERAGE}$ without exceeding the threshold set by $N_{MAXIMUM}$.
The $N_{ALLOWED}$ value is then used by the data-link layer to perform re-transmissions until the data packet is either successfully delivered (i.e., including the reception of the ACK) or the number of remaining re-transmissions becomes zero and no more re-transmissions can be performed. In either case, the re-transmission shaping module receives the number $N_{USED}$ of re-transmissions used for that particular data packet and performs the following operation to update the internal $N_{AVAILABLE}$ variable:
$$N_{AVAILABLE} \leftarrow N_{AVAILABLE} + N_{AVERAGE} - N_{USED} \tag{2}$$
Notice that $N_{AVAILABLE}$ cannot be negative because $N_{USED}$ never exceeds $N_{ALLOWED}$, which depends on $N_{AVAILABLE}$ as in Eq. 1.
Since in the first iteration $N_{AVAILABLE} = 0$, the formula that expresses $N_{AVAILABLE}$ for a generic time step $k$ (i.e., after $k$ data packets) is:
$$N_{AVAILABLE}^{(k)} = k \, N_{AVERAGE} - \sum_{i=1}^{k} N_{USED}^{(i)} \tag{3}$$
where $N_{USED}^{(i)}$ is the number of re-transmissions used for the $i$-th data packet.
Notice that using these variables, the re-transmission shaping mechanism can emulate the usual re-transmission strategy where $N_{AVERAGE}$ is set to a constant value per data packet. This behavior can be achieved by setting $N_{AVERAGE}$ to an integer value and making $N_{MAXIMUM}$ equal to zero. In that case, for every data packet transmission the maximum number of re-transmission attempts is always equal to $N_{AVERAGE}$, and the re-transmission shaping mechanism does not add anything to the basic case with a fixed number of re-transmissions. In fact, as can be seen from Eq. 1, even though $N_{AVAILABLE}$ increases, the value of $N_{ALLOWED}$ is always upper bounded by $N_{AVERAGE}$. Hereinafter, we will refer to the particular case in which $N_{MAXIMUM}$ is set to zero as no re-transmission shaping (i.e., noRS), as this represents the base scenario against which the performance gains provided by our proposal can be compared.
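The bookkeeping described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the simulator used in the paper: the class and method names are hypothetical, and rounding the allowed budget down to an integer is an assumption made here so that the data-link layer receives a whole number of attempts.

```python
from dataclasses import dataclass


@dataclass
class RetransmissionShaper:
    """Sketch of the RTS budget; attributes mirror the variables in the text."""
    n_average: float          # average re-transmissions allowed per packet
    n_maximum: float          # extra re-transmissions allowed per packet
    n_available: float = 0.0  # accumulated unused re-transmissions

    def allowed(self) -> int:
        """Budget for the next packet: the average plus saved attempts, capped."""
        # Rounding down is an assumption of this sketch, not stated in the text.
        return int(self.n_average + min(self.n_available, self.n_maximum))

    def update(self, n_used: int) -> None:
        """Credit the re-transmissions that the last packet did not spend."""
        self.n_available += self.n_average - n_used


shaper = RetransmissionShaper(n_average=3, n_maximum=2)
history = []
for needed in [0, 0, 5, 3]:          # re-transmissions each packet would need
    n_allowed = shaper.allowed()
    n_used = min(needed, n_allowed)  # the link layer never exceeds the budget
    shaper.update(n_used)
    history.append((n_allowed, n_used, shaper.n_available))
```

In this run the first two packets spend nothing, so the budget for the third packet grows from 3 to 5 and absorbs a burst of losses. Setting `n_maximum=0` reproduces the fixed-budget case, since the allowed value then never exceeds the average.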

Adaptive Modulation Selection
In order to practically implement AMS and to allow for a meaningful comparison of results with the re-transmission shaping, we have implemented a basic re-transmission scheme on top of it. Hence, we assume that we can re-transmit each packet up to a fixed number of attempts which we denote as $N_{ALLOWED}$, in analogy with the RTS mechanism presented earlier. In any practical implementation, this value is set by taking into account the values of C, L, and T defined earlier in order to ensure that the energy consumption of the end device is below the desired boundary.
For each re-transmission attempt, the end device may choose between three available modulations (i.e., SUN-FSK, SUN-OQPSK, and SUN-OFDM). The gateway has three radio modules, and may receive packets using the three different modulations simultaneously. Depending on the channel conditions, these modulations present different physical properties and different probabilities of successfully delivering a packet. Hence, the task for each re-transmission is to estimate and select the modulation associated with the highest packet delivery probability, based on the history of previous packet re-transmission attempts. The general framework that we have adopted is the MAB problem, as introduced in Section 3.2.
To model this scenario, as depicted in Figure 2, we use 4 variables:
- $N_{ALLOWED}$ is an integer input value of the modulation selection block that specifies the (fixed) number of allowed re-transmissions per packet;
- $MOD_{AVAILABLE}$ is a state variable that contains the information about the available modulations, with a quality index associated with each modulation;
- $MOD_{K}$ is an output variable that represents the modulation selected by the modulation selection block to transmit the packet in the link layer;
- ACK is a boolean input variable of the modulation selection block, and an output of the data-link layer, that indicates whether the data packet has been received by the gateway or not. It is used to update the quality indexes stored in $MOD_{AVAILABLE}$ as well as to stop the current packet re-transmissions.
The modulation selection block iteratively selects the value of $MOD_{K}$ that presents the highest quality index, as stored in $MOD_{AVAILABLE}$. Each data packet is re-transmitted up to $N_{ALLOWED}$ times, as in the basic re-transmission mechanism. When the current data packet is successfully delivered, the link layer returns a positive ACK and the subsequent packet is transmitted. The ACK value (1 if positive, 0 if negative) is used as a reward in the MAB formulation to update the quality index of the modulation $MOD_{K}$.
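The selection loop above can be sketched as follows. This is a hedged example, not the implementation evaluated in the paper: it assumes an epsilon-greedy rule with the mean ACK rate as the quality index, treats the allowed value as the maximum number of attempts per packet, and uses a hypothetical `link` callable with made-up success probabilities in place of the real channel.

```python
import random

MODULATIONS = ("SUN-FSK", "SUN-OQPSK", "SUN-OFDM")


class ModulationSelector:
    """Keeps one quality index per modulation (here: the mean ACK rate)."""

    def __init__(self, epsilon=0.1, rng=None):
        self.epsilon = epsilon                # exploration rate (assumed value)
        self.rng = rng or random.Random()
        self.quality = {m: 0.0 for m in MODULATIONS}
        self.pulls = {m: 0 for m in MODULATIONS}

    def select(self):
        """Explore with probability epsilon, otherwise use the best index."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(MODULATIONS)
        return max(MODULATIONS, key=lambda m: self.quality[m])

    def update(self, modulation, ack):
        """Use the ACK (1 or 0) as the reward and update the running mean."""
        self.pulls[modulation] += 1
        step = 1.0 / self.pulls[modulation]
        self.quality[modulation] += step * (float(ack) - self.quality[modulation])


def send_packet(selector, n_allowed, link):
    """Try one data packet; `link(modulation)` returns True if the ACK arrives."""
    for attempt in range(1, n_allowed + 1):
        mod_k = selector.select()
        ack = link(mod_k)
        selector.update(mod_k, ack)
        if ack:
            return True, attempt
    return False, n_allowed


# Toy channel with made-up success probabilities (not taken from the dataset).
rng = random.Random(1)
success = {"SUN-FSK": 0.5, "SUN-OQPSK": 0.7, "SUN-OFDM": 0.9}
selector = ModulationSelector(epsilon=0.1, rng=rng)
results = [send_packet(selector, 3, lambda m: rng.random() < success[m])
           for _ in range(1000)]
```

A purely greedy selector (`epsilon=0`) can lock onto the first modulation it tries, which is why some exploration is kept in this sketch.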

Combining Re-transmission Shaping and Adaptive Modulation Selection
The RTS and AMS mechanisms can be easily merged into a single algorithm with minimal changes to their original structure, as depicted in Figure 3. The RTS block works as in the previous case. For each packet, the value of $N_{ALLOWED}$ is the output of the RTS block and taken as input by the AMS block. In addition, the AMS block is adapted to provide the $N_{USED}$ variable as an output. This variable represents the number of packet re-transmissions before receiving a positive ACK or reaching the maximum number of available packet re-transmissions. As in the pure re-transmission shaping scenario, this value is used to update $N_{AVAILABLE}$. Finally, the data link layer block behaves exactly as in the pure modulation selection scenario.
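The way the two blocks are chained can be sketched in a single self-contained loop. The code below is an illustration under assumptions, not the paper's simulator: the function name `simulate`, the epsilon-greedy rule, the `link(modulation, rng)` callable and the integer rounding of the budget are all hypothetical choices, and every transmission attempt is counted against the budget.

```python
import random

MODULATIONS = ("SUN-FSK", "SUN-OQPSK", "SUN-OFDM")


def simulate(link, n_packets, n_average, n_maximum, epsilon=0.1, seed=0):
    """Chain an RTS budget with epsilon-greedy modulation selection.

    Returns one (delivered, n_used) tuple per data packet.
    """
    rng = random.Random(seed)
    n_available = 0.0                        # RTS internal state
    quality = {m: 0.0 for m in MODULATIONS}  # AMS quality indexes
    pulls = {m: 0 for m in MODULATIONS}
    log = []
    for _ in range(n_packets):
        # RTS block: compute the budget for this packet (rounded down).
        n_allowed = int(n_average + min(n_available, n_maximum))
        delivered, n_used = False, 0
        # AMS block and data-link layer: pick a modulation for each attempt.
        while n_used < n_allowed and not delivered:
            if rng.random() < epsilon:
                mod_k = rng.choice(MODULATIONS)
            else:
                mod_k = max(MODULATIONS, key=quality.get)
            delivered = link(mod_k, rng)
            n_used += 1
            pulls[mod_k] += 1
            quality[mod_k] += (float(delivered) - quality[mod_k]) / pulls[mod_k]
        # Feed the number of attempts used back into the RTS state.
        n_available += n_average - n_used
        log.append((delivered, n_used))
    return log
```

The only coupling between the blocks is the pair of values exchanged per packet: the budget going into the selection loop and the number of attempts coming back out.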

Evaluation Methodology
In this section, we introduce the metrics, the dataset, and the simulator that we use to evaluate the performance of the retransmission shaping and the adaptive modulation selection mechanisms described in Section 3. We also introduce the complementary modulation selection strategies that serve

Performance Metrics
To evaluate the suitability of the re-transmission shaping and the adaptive modulation selection mechanisms we use two performance metrics:
- PDR (Packet Delivery Ratio): defined as the ratio between the number of successfully delivered data packets and the total number of transmitted data packets at the application layer.
- RNP (Required Number of Packet transmissions): defined as the average number of packet transmission attempts before a data packet is successfully delivered.
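As an illustration of the two definitions, the following sketch computes both metrics from a list of per-packet outcomes. The function names and the outcome format are hypothetical. The sketch also assumes that the RNP is averaged over delivered packets only, which is one possible reading of the definition above.

```python
def pdr(outcomes):
    """Packet Delivery Ratio.

    outcomes: list of (delivered, attempts) tuples, one per application packet.
    """
    if not outcomes:
        raise ValueError("at least one packet is required")
    delivered = sum(1 for ok, _ in outcomes if ok)
    return delivered / len(outcomes)


def rnp(outcomes):
    """Average number of attempts of the packets that were delivered.

    Assumption: packets that were never delivered are excluded from the average.
    """
    attempts = [n for ok, n in outcomes if ok]
    if not attempts:
        return float("nan")
    return sum(attempts) / len(attempts)


# Four packets: three delivered (after 1, 2 and 1 attempts) and one lost.
example = [(True, 1), (True, 2), (False, 3), (True, 1)]
print(pdr(example))  # 0.75
print(rnp(example))
```

If the RNP is instead meant to reflect the energy spent on all packets, the attempts of lost packets would also enter the average.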
Both the PDR and the RNP metrics have been selected according to [1], where the authors present a survey of radio link quality estimation techniques specifically targeted at low-power wireless communications. Both metrics provide complementary information regarding the link quality and make it possible to study the benefits introduced by each proposed mechanism, as well as by the combination of both mechanisms. Note that the objective of both the re-transmission shaping and the adaptive modulation selection mechanisms is to provide the highest possible PDR while maintaining the lowest possible RNP for the instantaneous link quality between a given end device and the gateway. This minimizes the loss of sensor data due to transmission failures and, at the same time, ensures that end devices can operate for the planned duration.

Dataset Overview
The dataset used to evaluate the mechanisms presented in this paper is detailed in [22], and is publicly available in a GitHub repository. In summary, the network has been deployed in an industrial warehouse located in Madrid (110,044 m²) for 99 days, and consists of 11 sensing end devices and 1 gateway that provides Internet connectivity. Both the sensing end devices and the gateway are built using OpenMote-B boards [24], which include the Atmel AT86RF215 radio transceiver that fully supports the IEEE 802.15.4g SUN standard operating in the Sub-GHz and the 2.4 GHz bands.
Regarding network operation, the end devices sample the sensors (i.e., temperature, relative humidity, pressure and light) once every minute and transmit a packet with 32 bytes of payload nine times, three times with each IEEE 802.15.4g SUN modulation. The gateway can receive packets using the three modulations simultaneously without requiring any synchronization from the end devices. Also, notice that all data and acknowledgement packet transmissions between the end devices and the gateway (and vice versa) are performed at the same data rate (i.e., 50 kbps) regardless of the IEEE 802.15.4g SUN modulation being used, as described in [22].
Given the deployment characteristics, we have grouped the end devices based on their distance from the gateway and on their position in the warehouse. Specifically, end devices deployed within 80 m of the gateway (i.e., EUI-16: 56-53, 55-AD and 55-E4) have been included in the close group, end devices located between 80 m and 150 m from the gateway (i.e., EUI-16: 55-99, 55-DD, 55-65 and 56-0B) have been included in the medium group, and all the remaining ones (i.e., EUI-16: 56-32, 55-B3, 55-63 and 63-0A) have been included in the far group.
Finally, Figure 4 shows the median and the IQR (Inter-Quartile Range) of the PDR for each of the IEEE 802.15.4g SUN modulations for the different end device groups. The average PDR at the physical layer is 73.3% for the close group, 76.4% for the medium group and 77.4% for the far group. However, as can be observed, there is a high variability in the PDR, with values ranging from 0% to values close to 100% regardless of the end device group. Moreover, the close group presents a worse average PDR than the medium group and the far group. Although counter-intuitive, this can be attributed to the fact that devices in the close group operate in an environment with higher radio-frequency interference levels. As all nodes use a CCA (Clear-Channel Assessment) mechanism prior to transmitting a data packet to avoid collisions, nodes in the close group refrain from transmitting more often than nodes in the other groups, thus creating an artificially lower PDR value.

Simulation Process
To evaluate the re-transmission shaping and the adaptive modulation selection mechanisms using the PDR and the RNP metrics, we have written a Python simulator that uses the trace files obtained from the dataset presented earlier.
Each trace file has been created from the original dataset by estimating the packet transmission success probability for each IEEE 802.15.4g modulation. In particular, for each modulation we compute the frequency of successful packet transmissions over a window of n minutes (n = 5 was used in our analysis). Note that the windows do not overlap; each new window starts where the previous one ends. Also, to avoid windows with a success probability equal to zero, each window has been extended until the first successful transmission was found, and the resulting width of each window has been recorded. If no successful transmission has been recorded in more than 75 minutes, the corresponding window has been removed from the trace file. This is done to remove Internet connectivity outage periods where the gateway could not transmit any packet to the database.
For each n-minute window, n new packet transmission simulations have been run. All the re-transmissions of a specific packet have been considered as instantaneous; i.e., the success probability is considered to be equal for all the re-transmission attempts of the same packet and there is no fixed limit for the number of re-transmissions within a window. When the gateway receives a packet, it transmits an acknowledgement packet using the same modulation as the incoming packet. The probability of the acknowledgement packet being received is considered equal to the probability of successfully transmitting the corresponding data packet (i.e., the channel is considered to be symmetric in the two directions of the link).
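The window construction described above can be sketched as follows for a single modulation. This is an illustrative reconstruction and not the code of the simulator: the function name `build_trace`, the input format (one `(successes, attempts)` tuple per minute) and the handling of a trailing window without any success (it is dropped) are assumptions made for the example.

```python
def build_trace(per_minute, n=5, max_width=75):
    """Estimate per-window success probabilities for one modulation.

    per_minute: list of (successes, attempts) tuples, one entry per minute.
    Returns a list of (success_probability, width_in_minutes) tuples.
    """
    trace = []
    start = 0
    while start < len(per_minute):
        # Nominal, non-overlapping window of n minutes
        end = min(start + n, len(per_minute))
        successes = sum(s for s, _ in per_minute[start:end])
        # Extend the window minute by minute until a success is found
        while successes == 0 and end < len(per_minute):
            successes += per_minute[end][0]
            end += 1
        width = end - start
        attempts = sum(a for _, a in per_minute[start:end])
        # Drop windows without any success or wider than max_width minutes
        if successes > 0 and width <= max_width:
            trace.append((successes / attempts, width))
        start = end  # the next window starts where this one ends
    return trace
```

For example, five minutes with three successes out of three attempts, followed by seven minutes without any success and one minute with a single success, produce a first window of width 5 with probability 1.0 and a second window of width 8 with probability 1/24.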

Modulation Selection Strategies
To evaluate the RTS and the AMS mechanisms presented earlier, we also define the RANDOM and the BEST modulation selection strategies. On the one hand, the RANDOM modulation selection strategy simply chooses a modulation for each transmission attempt at random from a uniform distribution. On the other hand, the BEST modulation selection strategy always chooses the modulation associated with the highest probability of delivering the packet. Hence, the RANDOM strategy represents a lower performance bound, whereas the BEST strategy represents an upper bound for each specific case.
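A minimal sketch of the two reference strategies is given below. It assumes that the success probability of each modulation in the current window is available as a dictionary; the function names and the probability values in the example are hypothetical and are not taken from the dataset.

```python
import random


def select_random(success_prob, rng):
    """RANDOM: pick any modulation with equal probability."""
    return rng.choice(sorted(success_prob))


def select_best(success_prob):
    """BEST: pick the modulation with the highest success probability."""
    return max(success_prob, key=success_prob.get)


def expected_single_attempt_success(success_prob, strategy):
    """Expected success probability of one attempt under each strategy."""
    values = list(success_prob.values())
    if strategy == "BEST":
        return max(values)
    if strategy == "RANDOM":
        return sum(values) / len(values)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical window probabilities for the three modulations
window = {"FSK": 0.6, "OQPSK": 0.9, "OFDM": 0.75}
print(select_best(window))                               # OQPSK
print(expected_single_attempt_success(window, "BEST"))   # 0.9
print(expected_single_attempt_success(window, "RANDOM")) # 0.75
```

Note that BEST requires knowing the success probabilities in advance, which is only possible in a trace-driven simulation; this is why it serves as a reference rather than as a deployable strategy.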
In addition to the RANDOM and BEST modulation selection strategies, we also compare the MAB-based adaptive modulation selection strategies to the 3M strategy, which was introduced and evaluated in [8]. In contrast to the MAB-based strategies, for each transmission 3M computes the probability P a (t) of using each modulation a, where parameter w is used to control the differences between the calculated probabilities and ARR(a) is the ACK Reception Ratio, defined as the ratio between the number of ACKs received successfully and the number of transmitted packets in a given interval with modulation a. As defined in [8], the width of the interval used to compute ARR(a) has been set to 10 consecutive re-transmissions, while the value of w has been set to 20. Once P a (t) has been computed, the algorithm samples the modulation from the obtained probability distribution.

Results
This section presents the results obtained when applying the RTS and the AMS mechanisms with the objective of improving link reliability. In particular, Section 6.1 presents the results for RTS, Section 6.2 presents the results for AMS, and, finally, Section 6.3 presents the results when combining both techniques. In all cases, we use the IEEE 802.15.4g SUN dataset, as well as the PDR and RNP metrics described earlier. Also, the results present the average of 30 repetitions for each value of the maximum number of packet re-transmissions. Note that the lines shown in each plot are synchronized in time, making it possible to observe the common behaviour of the series when a drop in channel reliability is caused by environmental conditions.
We use the following parameters for the different

Re-transmission Shaping
As described earlier, RTS accumulates the re-transmissions left unused by each packet transmission and uses them at a later time to compensate for the lower PDR values that may occur unexpectedly due to propagation and interference conditions. Hence, we expect RTS to improve the PDR, as more packets will be delivered. However, we also expect an increase in the RNP, as more packets will be transmitted.
Figure 5 presents the evolution of the PDR (a) and RNP (b) metrics for the RANDOM and BEST modulation selection strategies with and without RTS for N AVERAGE = 3. As can be observed, RTS increases the PDR. Regarding RNP, we observe that RTS increases the average RNP for both the RANDOM and the BEST strategies. Specifically, RTS increases the RNP by 40% (from 1.50 to 2.10) for the RANDOM strategy and by 27% (from 1.26 to 1.60) for the BEST strategy.
We now consider different average re-transmissions per packet (i.e., N AVERAGE = {2, 3, 6, 9}) to show their impact on the PDR and the RNP metrics. Notice that in all cases the maximum number of re-transmissions per packet has been set to N MAXIMUM = 9 to ensure that devices can operate for the established lifetime according to the values of C, L and T, as defined in Section 4.
As can be observed in Figure 6 and in Table 1, for N AVERAGE = 1 the PDR is 77.4% and 88.9% for the RANDOM and BEST strategies, respectively, regardless of whether re-transmission shaping is enabled or not. Also, when N AVERAGE = 1 we have that RNP = 1 in all cases. Both results are expected, as we only allow a single re-transmission per packet and, hence, re-transmission shaping has no effect.
As we increase N AVERAGE to N AVERAGE = 2, it is interesting to notice that for the RANDOM modulation selection strategy using the RTS mechanism the PDR improves by 6.9 percentage points (from 89.3% to 96.2%, a 7.7% relative increase) while the RNP increases by 35.9% (from 1.31 to 1.78). In contrast, setting N AVERAGE = 3 for the RANDOM modulation selection strategy without RTS enabled would increase the PDR by 4.6% in relative terms (from 89.3% to 93.4%) while the RNP would increase by 13.7% (from 1.31 to 1.49). With N AVERAGE = 6 and re-transmission shaping the PDR reaches 98.8%, whereas with N AVERAGE = 9 the PDR reaches 99.2%. Of course, the RNP rises to 2.20 and 2.35, respectively, indicating that more packets are required on average.
To summarize the obtained results, Figure 7 shows the absolute PDR and RNP values for the RTS mechanism with the RANDOM and the BEST modulation selection strategies, whereas Table 2 shows the percentage variation of the final PDR and RNP values for the BEST and RANDOM strategies with respect to the RANDOM strategy without the RTS mechanism. As can be observed in Table 2 for the All group, the RTS mechanism with the RANDOM strategy improves the PDR by between 1.0% and 7.7% and increases the RNP by between 18% and 36%.
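A change in PDR can be expressed either in percentage points or as a relative variation with respect to the baseline, and the two figures differ. As a worked check, the following sketch applies both conventions to the N AVERAGE = 2 values quoted above for the RANDOM strategy with RTS (PDR from 89.3% to 96.2%, RNP from 1.31 to 1.78). The helper names are hypothetical.

```python
def percentage_points(baseline, value):
    """Absolute difference between two percentages."""
    return value - baseline


def relative_variation(baseline, value):
    """Variation with respect to the baseline, expressed in percent."""
    return 100.0 * (value - baseline) / baseline


pdr_points = percentage_points(89.3, 96.2)
pdr_relative = relative_variation(89.3, 96.2)
rnp_relative = relative_variation(1.31, 1.78)

print(f"PDR: +{pdr_points:.1f} points, +{pdr_relative:.1f}% relative")
# PDR: +6.9 points, +7.7% relative
print(f"RNP: +{rnp_relative:.1f}% relative")
# RNP: +35.9% relative
```

The 7.7% relative variation matches the upper end of the PDR improvement range reported for the All group.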
In summary, these results show the benefits of using RTS in terms of increasing the PDR while keeping the RNP bounded. As expected, adding more re-transmissions per packet allows to increase the PDR, but further increasing the PDR presents diminishing returns as the value approaches 100%. Also, adding more re-transmissions per packet increases the RNP and, hence, the energy consumption of the node. However, in all cases the mean RNP value is well below the N AVERAGE set for each experiment, indicating that the node will be able to operate for the planned duration. Hence, adding RTS can contribute to increasing the reliability while ensuring the durability of the network.
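The accumulation behaviour of RTS discussed in this section can be illustrated with a budget that banks unused re-transmissions. The sketch below is an assumption-laden illustration, not the update rule of the paper: it assumes that each packet contributes N AVERAGE attempts to the budget, that unused attempts are carried over in N AVAILABLE without an upper limit, and that N ALLOWED is capped at N MAXIMUM. The class and method names are hypothetical.

```python
class RetransmissionShaper:
    """Hypothetical re-transmission budget (illustrative only)."""

    def __init__(self, n_average, n_maximum):
        self.n_average = n_average  # attempts granted per packet on average
        self.n_maximum = n_maximum  # hard cap on attempts for a single packet
        self.n_available = 0        # unused attempts banked from earlier packets

    def n_allowed(self):
        """Attempts granted to the next packet: its share plus savings, capped."""
        return min(self.n_average + self.n_available, self.n_maximum)

    def update(self, n_used):
        """Bank the attempts that were not used, or spend the savings that were."""
        self.n_available += self.n_average - n_used


shaper = RetransmissionShaper(n_average=3, n_maximum=9)
for n_used in (1, 1, 1, 1):  # four packets delivered at the first attempt
    shaper.update(n_used)
print(shaper.n_allowed())    # 9: the savings are large enough to hit the cap
shaper.update(9)             # a bad period consumes nine attempts
print(shaper.n_allowed())    # 5
```

Under these assumptions the long-run number of attempts per packet cannot exceed N AVERAGE, because a packet can never use more than its share plus what earlier packets left unused.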

Adaptive Modulation Selection
As presented earlier, AMS exploits reinforcement learning based on MAB algorithms (i.e., Epsilon Greedy (EG), Boltzmann Exploration (BE) and Discounted UCB) to determine the best modulation to use for each packet re-transmission based on channel conditions. Although propagation and interference effects in the wireless channel are random, each IEEE 802.15.4g SUN modulation has different physical layer properties (i.e., occupied bandwidth, sensitivity, etc.) that allow the AMS strategies to adapt better to these effects and, thus, perform better on average. Hence, we expect the MAB-based AMS strategies to perform better in terms of PDR and RNP than the RANDOM and 3M modulation selection strategies, closing the gap with the BEST strategy.
Figure 8 presents the evolution of the average PDR (a) and the RNP (b) metrics for the different AMS strategies, as presented in Section 3.2, when using up to 3 re-transmissions per packet (i.e., N AVERAGE = 3) without RTS. As expected, the BEST and RANDOM strategies present the highest and lowest PDR, respectively, and the lowest and the highest RNP values for all the groups. Note that the drops in the PDR and the corresponding increases in the RNP are caused by a drop in the quality of all the available modulations, as shown by the fact that the BEST strategy follows the same general pattern as the other modulation selection strategies. Also notice that all the MAB-based AMS strategies perform similarly and improve over the RANDOM strategy for both the PDR and the RNP. In particular, taking BEST as the reference, the EG algorithm provides the highest PDR among them, whereas the BE algorithm provides the lowest RNP.
Taking into account the average re-transmissions per packet (i.e., N AVERAGE = {2, 3, 6, 9}), Figure 9 and Table 3 show the evolution of the resulting PDR and RNP metrics.
As can be observed, when allowing 1 re-transmission per packet (i.e., N AVERAGE = 1) the EG strategy is able to bring the PDR from the baseline of 77% (i.e., RANDOM strategy) to 86%. Similarly, when allowing up to 3 re-transmissions per packet (i.e., N AVERAGE = 3) the EG algorithm increases the base PDR by three percentage points (from 93% to 96%) with respect to the RANDOM strategy, while also decreasing the RNP by 11.4% (from 1.49 to 1.32). Finally, when allowing up to 9 re-transmissions per packet (i.e., N AVERAGE = 9), the EG algorithm reaches an average PDR=99% with an RNP=1.67. In contrast, the RANDOM and 3M strategies only provide a PDR=98% with an RNP=1.98 and an RNP=1.71, respectively.
To summarize the obtained results, Figure 10 shows the absolute PDR and RNP values for the different AMS strategies, whereas Table 4 shows the percentage variation of the final PDR and RNP values of the 3M and EG strategies with respect to the RANDOM strategy. In both cases the results are presented as a function of the allowed number of re-transmissions per packet (i.e., N AVERAGE ). Also, note that re-transmission shaping is not used in either case, meaning that each packet is re-transmitted up to N AVERAGE times regardless of the outcome of the previous packet transmissions. As can be observed, EG performs slightly better than the 3M strategy in terms of both the PDR and the RNP. This can be explained by the fact that EG balances exploration and exploitation by choosing between them randomly, whereas 3M selects each modulation based on its predicted success probability. Hence, once 3M has determined the probabilities of each modulation, it becomes more vulnerable to instantaneous changes in the channel conditions of the preferred modulation, which can lead to increased packet loss.
Overall, these results show that the proposed MAB-based AMS strategies can improve over both the RANDOM and 3M modulation selection strategies, resulting in more robust and efficient packet transmissions. In fact, using large N AVERAGE values (i.e., N AVERAGE = 9) makes it possible to reach a PDR=99% while keeping an RNP=1.67. That is, thanks to AMS a node with a battery dimensioned to transmit up to two repetitions per packet on average will be able to reach the PDR required by industrial applications (i.e., PDR=99%) without an impact on its predicted battery lifetime.

Combining Re-transmission Shaping with Adaptive Modulation Selection
As demonstrated in Sections 6.1 and 6.2, both the RTS and AMS mechanisms substantially increase the PDR while keeping the RNP bounded, which translates into more robust and efficient packet transmissions. Based on these results, we now explore the benefits that can be obtained by combining RTS and AMS. Figure 11 presents the evolution of the PDR (a) and RNP (b) metrics for all the AMS strategies with RTS enabled, and compares them to the RANDOM and BEST AMS strategies without RTS enabled. In both cases the results are presented
Considering the average number of re-transmissions per packet (i.e., N AVERAGE = {2, 3, 6, 9}), Figure 12 and Table 5 show the evolution of the resulting PDR and RNP metrics when combining RTS and AMS. As can be observed, the EG AMS strategy with RTS enabled can reach a PDR=99% with an RNP=1.70 with only up to 3 re-transmissions per packet (i.e., N AVERAGE = 3). In contrast, when RTS is not enabled the EG AMS strategy requires up to 9 re-transmissions per packet (i.e., N AVERAGE = 9) with an RNP=1.67. Alternatively, with up to 3 re-transmissions per packet (i.e., N AVERAGE = 3), the EG strategy without RTS enabled could only provide a PDR=96% with an RNP=1.32.
To summarize the obtained results, Figure 13 depicts the benefits of combining AMS and RTS by showing the absolute PDR and RNP values for the different AMS strategies with and without RTS enabled. Similarly, Table 6 summarizes the percentage variation of the final PDR and RNP values for the 3M and EG strategies with RTS enabled, with respect to the RANDOM strategy without RTS. As can be observed, the combination of AMS and RTS surpasses the BEST strategy without RTS in terms of PDR, and narrows the gap with the BEST strategy with RTS when the number of re-transmissions per packet is equal to or greater than 3 (i.e., N AVERAGE ≥ 3). Moreover, the results show that EG performs better than 3M in all cases and for both metrics, indicating that MAB-based AMS strategies are more suitable considering the random nature of the wireless channel.
In summary, the obtained results demonstrate the benefits brought by combining the AMS and the RTS techniques. In particular, combining AMS and RTS makes it possible to reach a PDR=99% with an RNP=1.7 while requiring N AVERAGE = 3. Hence, combining AMS and RTS should be considered to ensure the link reliability of IEEE 802.15.4g networks while maintaining a number of packet re-transmissions that is below the target.

Conclusions
In this paper, we have presented and evaluated the application of RTS (Re-Transmission Shaping) and AMS (Adaptive Modulation Selection) mechanisms to improve the communication reliability of IEEE 802.15.4g SUN (Smart Utility Networks). On the one hand, RTS uses acknowledgements to track channel conditions and dynamically adapts the number of re-transmissions per packet. On the other hand, AMS is based on MAB (Multi-Armed Bandit) algorithms and determines the best combination of IEEE 802.15.4g SUN modulations (i.e., FSK, OQPSK and OFDM) to transmit a given number of packet repetitions (i.e., 1, 2, 3, 6 or 9).
Both RTS and AMS have been applied to an IEEE 802.15.4g SUN dataset obtained from a real-world scenario, and we have used the PDR (Packet Delivery Ratio) and the RNP (Required Number of Packet transmissions) metrics to determine the suitability of each approach. The results, based on computer simulations, show that both RTS and AMS are useful tools to improve the link reliability of an IEEE 802.15.4g SUN network. Each mechanism alone improves over the baseline metrics and, as expected, combining them achieves the best results. In particular, combining both mechanisms makes it possible to reach a target PDR=99% while only requiring an RNP=1.7.
Hence, we conclude that applying the proposed adaptive modulation selection and re-transmission shaping strategies should be considered for IEEE 802.15.4g SUN deployments, as it makes it possible to reach the target PDR of 99% while minimizing the energy expenditure. Given the interest of the results and their potential applicability to a real-world environment, as future work we will focus on implementing and deploying both mechanisms in a real-world setup to validate the results from both the link reliability and the energy consumption perspectives.