An adaptive IoT architecture using combination of concept-drift and dynamic software product line engineering

Internet of things (IoT) architectures need to adapt autonomously to their environment and operating conditions to maintain service quality. A common problem in IoT architectures is managing the reliability of data services, such as sensors that only send data to a collector via a gateway. When such a service is disrupted, system reliability is hard to manage. Building an adaptive environment based on software reconfiguration that provides better services is therefore a considerable challenge. In this work, software product line engineering (SPLE) reconfigures the edge devices via rules and software architecture. Disruptions of data services are identified from anomalous and truncated data. Our work uses concept drift to provide a recommendation to the system manager, which is important to avoid misconfiguration of the system. We demonstrate our method using an open-source IoT portal system integrated with a cluster of sensors, which is attached to a specific gateway before the data are collected in cloud storage for further processing. In identifying drifting data, the adaptive sliding window (ADWIN) method outperforms Page-Hinkley (PH), with more selective identification and more sensitive readings.


INTRODUCTION
In recent years, the Internet of things (IoT) has become a common technology that supports daily human activities in almost every sector. In 2019, at least 4.81 billion IoT units had been installed globally [1], and more than $389 billion [1] is predicted to be spent on IoT systems within three years. Demand for IoT has also grown because smart technologies for transportation, vehicles and residences mostly rely on IoT to provide high-end services. IoT services have likewise become a primary means of improving public services such as health systems [2], intelligent transportation [3], traffic management [4], civil works [5] and many other sectors. In operation, IoT produces a vast amount of data retrieved from the large number of sensors and devices attached to the system. The incoming data are transferred from many sensors, mostly to a cloud data collector, before being analysed for further use.
In general, an IoT architecture provides services that help users make decisions based on sensor readings and other inputs for a specific purpose, with the help of machine learning and big data technology. Machine learning processes that use real-time input data need a stable stream, because streaming data behave differently from the data used in static analysis. An unstable flow of data may therefore lead to wrong decisions that affect end users. For example, a flood emergency system depends on water flow measurements obtained from groups of sensors installed along the monitored river. If some sensors have trouble sending their information during the rainy season, the emergency warning system may generate a wrong recommendation for the operator. Another critical example is the application of IoT in healthcare, such as an intensive care unit (ICU), which relies on accurate measurement. If the readings from a combination of sensors, such as cardiac monitoring and ventilators, are even slightly intermittent, the system might raise a fatality alarm, which causes errors in critical health analysis. A mechanism to identify data behaviour should therefore be developed to address these problems.
Although IoT has been widely implemented in many sectors, the accuracy and significance of the data to be processed, especially by machine learning, are still in question. This work focuses on building an adaptive IoT architecture that is concerned with data drift. This is important because sensor data [6] may be distorted by incorrect installation, positioning or malfunction. For example, a temperature sensor reads temperature changes correctly under normal conditions, but placing it a short distance from a heat generator such as a water boiler leads to wrong information. An adaptive system provides an autonomous capability to adjust itself to differences so that the system remains reliable and performs well [7]. An adaptive system, such as one in cloud computing [8], can be realized as an adaptation to a specific condition. For example, Murwantara et al. [8] use a transition model to explicate how a change should proceed under a certain condition.
In this work, we focus on data as the sensing element that determines whether an architecture needs a new reconfiguration. Some methods that combine dynamic software product line engineering (DSPLE) [9], [10] with applied data science [11] have used data as feedback for autonomous systems. In concept and data drift techniques [12]-[14], changes of pattern and data also provide information about change in real time. This paper makes three contributions, as follows:
− To detect anomalies in data transactions, we need to monitor all data transactions in real time and assess them on the fly against existing data to find any drift or change of pattern, which may indicate anomalous information. Comparing data in real time may increase the CPU cost. We therefore need a method that can predict whether the data may drift and when that will occur. In this work, we use a concept and data drift method based on machine learning to gain a better understanding of the data pattern.
− Recommending which software package should be included or excluded to improve operation, based on changes in the data pattern, is challenging. We therefore need to work out how to automate [15] the software reconfiguration mechanism. An IoT implementation may have hundreds or thousands of sensors and IoT gateways, which are not easy to reconfigure manually. This work takes advantage of the automation that is common in software engineering environments and in cloud computing systems, where an automation tool such as Ansible [16] helps manage multiple remote systems at massive scale.
− To manage change in an IoT architecture, we need to monitor and identify simultaneously not only the Thinger condition but also the data pattern streamed from groups of sensors, as a reference for better system performance.
As a result, an adaptive architecture is an important success factor. In this work, we adopt the dynamic software product line engineering approach to obtain an adaptive IoT architecture from the software engineering point of view. This paper explicates a mechanism that supports dynamic changes of software component configuration, using the concept-drift approach to sense and monitor system services. In an adaptive system, real-time adjustment is the most critical task. To change the software architecture in real time, we adopt the transition model introduced in [17], which helps us manage the reconfiguration.

BACKGROUND
Concept drift
Concept drift detection is a statistical approach that uses supervised online learning to identify anomalies in a data stream by analyzing the relation between input and output data [14] over time. In a dynamic environment, data change continuously over time in expected or unexpected ways, usually together with environmental events [18]. Data drift degrades machine learning model accuracy over time. In handling concept drift, the mining and analysis of streaming data are the most critical tasks. One method to cope with this problem is the adaptive sliding window (ADWIN) algorithm [19]. It adapts efficiently by dynamically changing the length of the data window, and it performs well in predicting stream data. ADWIN monitors the stream data using sliding windows of varying size and width rather than a statically fixed window. Another method is the Hoeffding tree [20], which constructs a decision tree that performs well using constant memory and time per instance. The work in [21] enhanced the Hoeffding tree by analyzing the consistency of the error rate. Another change detection monitor is the Page-Hinkley (PH) test [22], an efficient sequential analysis technique designed to detect a change in a Gaussian signal.
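To make the variable-length window idea concrete, the following is a minimal sketch of a naive ADWIN-style detector. It is not the implementation used in this work: it stores raw values and tests every split of the window, without the bucket compression of the published algorithm. It assumes readings scaled to [0, 1], and the class name, the default `delta` and the synthetic stream are illustrative assumptions.

```python
import math
from collections import deque


class NaiveAdwin:
    """Naive ADWIN sketch: stores raw values and tests every window split."""

    def __init__(self, delta=0.002):
        self.delta = delta      # confidence parameter; 0.002 is an assumed default
        self.window = deque()   # oldest value on the left, newest on the right

    def update(self, value):
        """Add one value in [0, 1]; return True if old data were dropped."""
        self.window.append(value)
        shrunk = False
        while self._has_significant_split():
            self.window.popleft()   # forget the oldest observation
            shrunk = True
        return shrunk

    def _has_significant_split(self):
        """Check whether any old/new split has means further apart than eps_cut."""
        values = list(self.window)
        n = len(values)
        total = sum(values)
        head_sum = 0.0
        for n0 in range(1, n):
            head_sum += values[n0 - 1]
            n1 = n - n0
            mean_old = head_sum / n0
            mean_new = (total - head_sum) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            # Hoeffding-style cut threshold with delta' = delta / n
            eps_cut = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
            if abs(mean_old - mean_new) > eps_cut:
                return True
        return False


# Synthetic stream (assumed data): the mean jumps from 0.2 to 0.8 at index 100.
detector = NaiveAdwin()
stream = [0.2] * 100 + [0.8] * 100
detections = [i for i, x in enumerate(stream) if detector.update(x)]
```

On this synthetic stream the window only shrinks after the change point, so the retained data gradually become dominated by the new regime. A production detector would use the bucketed variant to keep memory logarithmic in the window length.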

Dynamic software product line engineering
DSPLE augments software product line engineering (SPLE) with adaptive behavior to cope with concurrent or real-time problems. In SPLE, the members of a product line derive new software products from existing artefacts, used as the core reference, by configuring software components with high quality, in a short time and at low cost [23]. Further background on SPLE can be found in [24]. The reusability offered by SPLE has also been applied in IoT development [25], [26]. In this work, we focus on managing a DSPLE approach that engages with dynamic software architecture for IoT.

RESEARCH METHOD
This work adopts the ThingerIO [27] portal system, which provides the cloud services that collect data and an application programming interface (API) that serves as a gateway for further analytical processing. As shown in Figure 1, we extend the IoT framework with an SPLE [28] approach to obtain an adaptive mechanism. Our framework has five layers with specific functionalities, as follows:
- The lowest layer, edge devices, manages the most basic architecture, with sensors and devices that collect data via sensors and execute commands using actuators. In this work, we use a Raspberry Pi 4 as the main controller in this layer.
- The next layer is the edge layer, which provides a gateway for groups of Thingers (devices) to send data to the highest level of the architecture. As shown in Figure 2, this layer only passes the data through without any detection or modification; its main service is to maintain connectivity between the edge devices layer and the cloud services.
- The fog layer has three functionalities. First, it manages the load balancer to reduce intermittency in collecting data and relaying them to the cloud services. Second, it maintains temporary storage as input for online learning analysis in the highest layer. Third, it works as the reconfiguration manager, using the deployment manager to apply dynamic software configuration to the load balancer in the same layer and to the Raspberry Pi in the edge devices layer.
- The cloud layer performs the most critical operations in this framework. It manages the cloud services and storage, and it performs analytic operations to maintain the reliability and availability of data. This layer carries out two kinds of analytic computation, batch and online processing. In batch processing, sensor data have been accumulated in the data storage. In online analytical processing, this layer reads and predicts the data pattern and behaviour to ensure reliability. To recommend whether to change the software configuration, this layer reads the batch data to determine data consistency and latency, observing communication and traffic, which may result in reconfiguring the load balancer. Online learning additionally identifies problems that may occur because of failures in the edge devices layer. We use the Hoeffding tree (HT) and the Hoeffding adaptive tree (HAT) to obtain information on the data pattern and on the performance of this approach. To predict data drift, we implement the ADWIN and Page-Hinkley models, which estimate when the data have a problem, how severe the data inconsistency is, and when the data transmission will return to normal. This information is important because we need to decide whether the problem should be addressed by executing a software reconfiguration on specific devices.
- The application layer provides client integration with the system, delivering data and predictions via web services. The framework supports two services: an application programming interface (API) gateway, through which users access the data seamlessly, and client connectivity via a direct client-server connection.

RESULTS AND ANALYSIS
Our adaptive IoT architecture, shown in Figure 1, comprises five layers, with adaptive management in two of them, the cloud layer and the fog layer. To identify data drift, we use concept drift to observe the incident, so that we can predict it and create an action plan to return the system to normal operation. Note that our work concerns the IoT software architecture, not the hardware system. In our experiment, the Raspberry Pi is connected to sensors and actuators and sends data to the cloud data collector based on the ThingerIO IoT architecture [27]. We measure and collect the data, then apply the data drift identification model on the fly, as online learning, to estimate which action should be taken according to our transition model. We collected data for one month, gathering more than 100,000 records of sensor data.
In this work, concept drift is used to recommend an action plan that enables an adaptive mechanism in an IoT software architecture. Concept drift detection via online learning thus supports the software reconfiguration. In predicting possible data drift, the ADWIN method monitors the distribution and is highly sensitive to changes in the stream pattern, whereas the Page-Hinkley algorithm uses sequential analysis [29]. Table 1 shows the results with ADWIN as the detector, and Table 2 shows the results with Page-Hinkley. The ADWIN detector identified more data, as can be seen from the interval index. The Page-Hinkley model identified fewer index intervals, which corresponds to less detected data drift.
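For comparison with the window-based detector, the sequential analysis behind Page-Hinkley can be sketched in a few lines. This is a minimal illustration, not the detector configuration used in the experiment: the parameter values (`delta`, `threshold`, `min_samples`) and the synthetic temperature stream are assumptions, and this one-sided form only reacts to an increase of the mean.

```python
class PageHinkley:
    """One-sided Page-Hinkley test: signals an upward shift of the stream mean."""

    def __init__(self, delta=0.005, threshold=5.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change (assumed)
        self.threshold = threshold      # alarm level, often written lambda (assumed)
        self.min_samples = min_samples  # warm-up length before alarms (assumed)
        self.reset()

    def reset(self):
        self.n = 0          # samples seen since the last reset
        self.mean = 0.0     # running mean of the stream
        self.cum = 0.0      # cumulative deviation m_T
        self.cum_min = 0.0  # smallest cumulative deviation seen so far, M_T

    def update(self, x):
        """Feed one observation; return True when drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        if self.n >= self.min_samples and self.cum - self.cum_min > self.threshold:
            self.reset()    # start a fresh reference after an alarm
            return True
        return False


# Synthetic temperature stream (assumed data): a step from 20.0 to 25.0 at index 100.
ph = PageHinkley()
stream = [20.0] * 100 + [25.0] * 100
detections = [i for i, x in enumerate(stream) if ph.update(x)]
```

Because the test accumulates deviations from a single running mean, it tends to raise one alarm per sustained shift, which is consistent with the smaller number of flagged intervals reported for Page-Hinkley.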
To evaluate the recommendation model, we use the Hoeffding tree (HT) and the Hoeffding adaptive tree (HAT) to gain more information, as depicted in Figure 2. The accuracies of HT and HAT are 0.4993 and 0.4983, respectively, which are nearly the same, while HAT can identify drift and adapt to changes of workload [30]. Kappa, shown in Figure 2, is a sensitive measure of the predictive performance of a streaming classifier, because the number of instances may change in the data stream. It compares the observed accuracy with the accuracy expected by chance. In the performance comparison, the kappa results for HAT are slightly lower and more stable than those for HT, which spike in several time frames. Further information about kappa can be found in [31].
In handling the transitions between conditions and recommendations, we adopt a transition diagram to guide the construction of a DSPLE mechanism. As shown in Figure 3, the configuration of our IoT architecture is divided into three sets. The first is the high feature configuration, in which all sensors send data in normal data transactions. In the medium feature configuration, the Raspberry Pi sends data from only half of the sensors and disables the sensors with readability problems. The low feature configuration includes only a minimal number of sensors.
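The kappa measure mentioned above can be written out explicitly. The sketch below computes Cohen's kappa over a batch of labels as (p0 - pc) / (1 - pc), where p0 is the observed accuracy and pc is the agreement expected by chance. The function name and the toy labels are illustrative assumptions; a streaming evaluation would typically apply the same formula over a sliding window of recent predictions.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed accuracy corrected for chance agreement."""
    n = len(y_true)
    # p0: fraction of predictions that match the true label
    p0 = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # pc: probability that labels agree by chance, from the marginal frequencies
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    pc = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if pc == 1.0:
        return 0.0  # degenerate case: only one class on both sides
    return (p0 - pc) / (1.0 - pc)


# Toy labels (assumed data) on a balanced binary problem.
perfect = cohen_kappa([0, 0, 1, 1], [0, 0, 1, 1])
chance_level = cohen_kappa([0, 0, 1, 1], [0, 1, 0, 1])
```

A classifier that is always right reaches a kappa of 1, while one whose accuracy equals the chance agreement scores 0, which is why kappa is more informative than raw accuracy when class proportions shift in the stream.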

Figure 3. Transition Diagram
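The three configurations can be illustrated with a small selection function. The transition conditions of Figure 3 are not reproduced here, so the rule below is a hypothetical stand-in: it moves to the medium configuration while at least half of the sensors are still healthy and falls back to the low configuration otherwise. The sensor names, the `minimal_set` argument and the thresholds are all assumptions.

```python
from enum import Enum


class FeatureLevel(Enum):
    HIGH = "high"      # all sensors send data
    MEDIUM = "medium"  # sensors with readability problems are disabled
    LOW = "low"        # only a minimal set of sensors stays enabled


def select_configuration(sensors, drifting, minimal_set):
    """Pick a feature level and the sensors to keep enabled.

    Hypothetical rule (not taken from the paper's transition diagram):
    no drifting sensor -> HIGH; at least half still healthy -> MEDIUM;
    otherwise -> LOW with only the minimal set.
    """
    healthy = [s for s in sensors if s not in drifting]
    if not drifting:
        return FeatureLevel.HIGH, list(sensors)
    if 2 * len(healthy) >= len(sensors):
        return FeatureLevel.MEDIUM, healthy
    return FeatureLevel.LOW, [s for s in sensors if s in minimal_set]


# Illustrative sensor names (assumed).
sensors = ["temperature", "humidity", "light", "gas"]
level, enabled = select_configuration(sensors, {"gas"}, {"temperature"})
```

Keeping the decision in one pure function makes the transition rule easy to replace once the actual conditions of the transition diagram are encoded.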
To manage the reconfiguration process, we developed a distributed automated IoT configuration model, shown in Figure 4. During normal operation, the analytic engine continuously checks the data pattern using batch and real-time processing to identify any disruption or failure. This is important because we cannot put more workload on the edge devices without risking problems in data transmission. The gateway provides interaction between the rule/policy manager and the edge devices with minimal data exchange. When the analytic engine recommends reconfiguring the edge devices, the rule/policy manager sends an action plan to change the device configuration (the Raspberry Pi sensor configuration). The deployment manager (reconfiguration/Ansible) then calls the affected edge devices to receive an execution order. The order is sent over the SSH protocol by Ansible, and the devices have been prepared to receive orders from the deployment manager. As shown in Algorithm 1, the deployment manager sends a request to disable or enable the GPIO port on the Raspberry Pi (the edge device). With this concept, we can reconfigure edge devices in any location as long as they have a connection to the deployment manager.
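As a rough illustration of the hand-off from the rule/policy manager to the deployment manager, the sketch below assembles an `ansible-playbook` command line for the affected devices. It only builds the argument list and does not execute it. The inventory file, playbook name, host names and the `raspi_config` variable structure are hypothetical; the `raspi` tag mirrors the tag used in Algorithm 1.

```python
import json
import shlex


def build_reconfigure_command(hosts, enable_rgpio,
                              inventory="inventory.ini",    # hypothetical file name
                              playbook="reconfigure.yml"):  # hypothetical file name
    """Build the ansible-playbook argument list for the affected edge devices."""
    # Assumed variable layout: a raspi_config dict carrying the enable_rgpio flag.
    extra_vars = {"raspi_config": {"enable_rgpio": enable_rgpio}}
    return [
        "ansible-playbook",
        "-i", inventory,
        playbook,
        "--limit", ",".join(hosts),   # restrict the run to the affected devices
        "--tags", "raspi",            # run only the tasks tagged raspi
        "--extra-vars", json.dumps(extra_vars),
    ]


# Hypothetical host names for two edge devices.
cmd = build_reconfigure_command(["pi-edge-01", "pi-edge-02"], enable_rgpio=False)
print(shlex.join(cmd))
```

Passing the list to a process runner, rather than a shell string, avoids quoting problems with the JSON payload; the printed form is only for logging.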
We are still improving the adaptive mechanism, especially the online learning that assesses the data pattern. A stable interconnection is required for full automation. We have tried running on low bandwidth, which is typical of IoT environments. However, some enhancements are needed to avoid a runaway configuration, which usually occurs when communication fails during the reconfiguration process. For example, when an edge device is reconfigured, it should report an acknowledgement message to the deployment manager indicating that the reconfiguration stage has been completed. If there is a communication problem, the deployment manager will have no information about the reconfiguration process. Our proposed framework has shown us that an adaptive system for IoT can alternatively be built using a DSPLE approach with concept drift as the control mechanism. It has also shown us how important it is to pay attention to data drift in an IoT architecture.

Algorithm 1. An Excerpt to Enable/Disable GPIO in Raspberry Pi via Ansible
    # The variable is written raspi_config here, because a hyphenated name
    # is not a valid Jinja2 identifier; the raspi-config tool keeps its hyphen.
    - name: enable/disable remote GPIO
      command: "raspi-config nonint do_rgpio {{ 0 if raspi_config.enable_rgpio else 1 }}"
      when: "'enable_rgpio' in raspi_config and raspi_config.enable_rgpio != raspi_rgpio_enabled"
      tags:
        - raspi

CONCLUSION
We have proposed an adaptive IoT architecture that responds to changes in stream data behavior to manage a dynamic IoT system. This work adopts the dynamic software product line engineering (DSPLE) approach to handle changes of architecture configuration. To sense data drift from a sensor or feature, our technique uses concept drift to detect anomalies and misbehavior in the data pattern. Despite the bandwidth limitation in most IoT infrastructure, we managed to create a dynamic IoT architecture that autonomously reconfigures the edge devices to return them to normal working condition. Our work also sheds light on an alternative adaptive IoT architecture that relies not only on device monitoring but also on data patterns as indicators of problems within the ecosystem. Our results show that the adaptive sliding window (ADWIN) method outperforms Page-Hinkley, with more selective identification and greater sensitivity to data readings. In future work, we would like to add a mechanism that handles failure detection during a reconfiguration process with minimal supervision by the deployment manager. This is important to reduce data traffic inside the IoT ecosystem.