An Efficient Scheme for Applying Software Updates in Pervasive Computing Applications

The Internet of Things (IoT) offers a vast infrastructure of numerous inter-connected devices capable of communicating and exchanging data. Pervasive computing applications can be formulated on top of the IoT involving nodes that can interact with their environment and perform various processing tasks. Any task is part of intelligent services executed in nodes or the back-end infrastructure for supporting end users' applications. In this setting, one can identify the need for applying updates in the software/firmware of the autonomous nodes. Updates are extensions or patches important for the efficient functioning of nodes. Legacy methodologies deal with centralized approaches where complex protocols are adopted to support the distribution of the updates in the entire network. In this paper, we depart from the relevant literature and propose a distributed model where each node is responsible to, independently, initiate and conclude the update process. Nodes monitor a set of metrics related to their load and the performance of the network and, through a time-optimized scheme, identify the appropriate time to conclude the update process. We report on an infinite horizon optimal stopping model on top of the collected performance data. The aim is to make nodes capable of identifying when their performance and the performance of the network are of high quality to efficiently conclude the update process. We provide specific formulations and the analysis of the problem.


Motivation
The rapid evolution of the Internet of Things (IoT) in combination with pervasive computing sets new challenges for the development of new services and applications. The adoption of wireless technologies (e.g., Wireless Sensor Networks - WSNs) and the Internet, accompanied by the respective hardware, gives the opportunity to numerous nodes to be interconnected. The transition from closed networks to interconnected autonomous nodes capable of interacting with their environment and performing simple processing tasks increases the quality of services that end users enjoy. The basic concept of adopting many autonomous devices is the pervasive presence, around end users, of various technologies such as sensors, actuators, mobile phones and so on [5]. IoT nodes usually have different characteristics and capabilities that pose a set of requirements for any novel application. For instance, communications should be realized on top of specific rules adopted for the exchange of the collected data, while intelligent interfaces may assist in the efficient management of the observed heterogeneity.
IoT nodes can collect and process ambient data in very dynamic environments. They come with the software/firmware of the corresponding manufacturers. Two main actions may positively affect their performance: (i) the application of software updates extending their functionalities; (ii) the application of firmware updates covering any potential gaps. As IoT nodes could manage sensitive user data, it is imperative to apply the discussed updates during their functioning. The challenge is to apply these updates in the entire network. Timing is everything; thus, the application of the updates should not be delayed.
Once nodes are in operation, they will start receiving updates that should be applied as soon as possible. However, every node performs some tasks significant for supporting applications, thus, applying updates should not disturb them from their initial goal. Updates must be delivered in a way that conserves the limited bandwidth and intermittent connectivity and eliminates the possibility of compromising functional safety.
In legacy approaches, a central server is responsible for defining the steps for distributing the updates based on a delivery protocol. Apart from the simplicity of the solution (updates are located in a single point and transferred to all nodes at the same time), one can identify a number of disadvantages in this centralized approach: (i) the central server is mainly responsible for delivering the updates, thus, it should apply complex protocols to distribute them without affecting the performance of the network; (ii) the central server cannot be aware of which nodes are connected to the network, thus, it cannot know whether all nodes have received the updates; (iii) when the central server distributes the updates, nodes should interrupt their processing tasks to apply them; (iv) the network is flooded by update distribution messages, limiting the available bandwidth and reducing the performance of the network.
In this paper, we propose a distributed model for applying updates that avoids the aforementioned disadvantages. Through our approach, the central server need not be aware of the status of each node and the network; thus, it acts as a 'repository'. Nodes undertake the responsibility of downloading and applying the updates when they observe that it is the appropriate time to do so. Nodes' course of action is realized through our time-optimized, performance-aware mechanism defined by means of the Optimal Stopping Theory (OST). The proposed scheme aims to intelligently support the nodes and provide a decision making model that finds the appropriate time for initiating and concluding the update process.
We focus on an infinite horizon scheme with each node having no deadline for concluding the process. In such scenarios, updates are considered as 'non critical' (e.g., software extensions), however, they should be applied as soon as possible as they positively affect nodes' performance. We consider a reward function and a discount factor for each time step where nodes delay the process, thus, we try to 'force' them towards its immediate conclusion. The reward function involves multiple metrics monitored by nodes that depict their load and the performance of the network.

Our Idea, Contribution & Paper Organization
We propose an update scheme building on the autonomous nature of nodes, i.e., each node is an autonomous entity capable of performing lightweight processing on top of collected data. Nodes are capable of collecting information related to the performance of the network. Specific metrics could be taken into account, like the bandwidth, the error rate and so on. Our scheme is distributed, i.e., each node autonomously decides when to download and apply software updates (from this point forward, we use 'updates' to denote software/firmware updates, extensions and patches).
Legacy systems, as we discuss in Section 2, deal with centralized schemes. The most common model in such systems is broadcasting, thus, one of the key challenges is the messaging overhead [39], [65]. Nodes rebroadcast any new incoming data packet, increasing the number of unnecessary transmissions.
For instance, if a software update of x packets is to be sent over a network of y nodes, potentially x · y broadcast packets could be sent out [39].
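As a back-of-the-envelope illustration of this flooding cost (a hypothetical sketch; real deployments depend on topology and on any suppression mechanism the protocol applies):

```python
# Hypothetical illustration: naive flooding of a software update.
# If every one of y nodes rebroadcasts each of the x update packets
# once, on the order of x * y transmissions occur in the network.

def flooding_transmissions(update_packets: int, nodes: int) -> int:
    """Upper bound on broadcast transmissions under naive flooding."""
    return update_packets * nodes

# A 100-packet update flooded over a 50-node network:
print(flooding_transmissions(100, 50))  # 5000 transmissions
```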
This may have a negative effect on the performance of the network, i.e., a high number of broadcasts could lead to higher cumulative power consumption to support communications. In addition, the increased messaging overhead could introduce more collisions, thus, it may affect the reliability of the channel. Due to the increased size or the multiple assignments of the updates, a set of techniques has been proposed to alleviate the burden of the multiple broadcasted messages. Incremental updates, compression, diffusion and various dissemination protocols are adopted to find the appropriate means for delivering the updates. However, all these techniques have specific drawbacks.
Incremental updates require an efficient management mechanism to maintain the updates history, while compression and, accordingly, decompression require complex management. The additional overhead for the management of incremental and compression/decompression schemes becomes significant, limiting the central system's and nodes' performance. Incremental techniques should also incorporate mechanisms for handling the heterogeneity of nodes, while diffusion mechanisms require an increased number of messages, especially in dense networks. In any case, the design of a reliable dissemination protocol that could manage the heterogeneity of nodes is a real challenge. Heterogeneity makes it difficult to apply a protocol that is capable of covering all types of nodes.
Various operating systems like Contiki [13], LiteOS [9] and RETOS [8] already support dynamic linking and update loading; however, the adopted ELF format (Contiki adopts this format) leads to binaries potentially too large for transfer [62]. In these cases, the minimization of the disseminated messages [54] can reduce the load of the network. In general, legacy dissemination protocols assume an ad-hoc network and use a form of controlled flooding [6]. Epidemic approaches combined with scalable multicasting, through which updates are periodically transferred to nodes, constitute another approach (e.g., [42], [45], [27]). Apart from the increased number of messages, one can observe an increased complexity in the resources required for maintaining data related to the management of the updates (e.g., [45], [61], [27], [53]). These data are related to the patterns of the updates, their advertisement, the combination of their parts and so on.
There is a trade-off between the overhead of managing the updates and the time required for their final conclusion.
The aforementioned problems of the centralized techniques combined with: (i) the absence of knowledge about whether nodes are connected to the network during the dissemination, (ii) the absence of knowledge about whether nodes are available to receive and conclude the updates at the time of the broadcasting, and (iii) the complexity of the presented dissemination protocols, constitute the motivation behind the current work. The proposed approach can be applied in any network involving autonomous devices, i.e., IoT and WSNs. It should be noted that we do not aim to provide a methodology on how updates will be installed but on when updates should be retrieved and installed towards the maximization of the performance of the network and nodes. The computational and storage complexity of our model is minimal, thus, it can be easily executed by devices with limited capabilities. The novelty compared to the state of the art is that our scheme does not 'impose' the application of updates and leaves nodes to independently decide their course of action. Hence, we can avoid all the negative consequences of centralized systems for the dissemination, the recreation and the application of the updates locally.
Every node monitors its current load (e.g., the tasks waiting for execution) and the status of the network. We distinguish tasks into: (i) regular tasks (e.g., temperature/humidity monitoring, complex events reasoning); (ii) tasks related to the application of the updates. Nodes, when applying updates, cannot work efficiently until the process is finished, especially when the update size is large. This time is critical as nodes should continuously operate and deliver their results.
Our time-optimized scheme derives when the updates could be applied based on the current status (e.g., current load, current status of the network) and the estimation of the future status (i.e., the expectation of the adopted parameters).
The central system sends to nodes only a lightweight message containing the indication about the presence of an update. We consider two schemes: (i) the server defines a deadline for the application of the update; (ii) the server does not impose a deadline, however, the sooner the update is applied, the better for nodes' performance. The following list reports on the benefits of the proposed solution compared to the remaining research efforts in the domain:
• the central server should not be aware of the status of each node;
• the central system should not spend resources to handle the heterogeneity of nodes;
• the proposed scheme alleviates the complexity of the central system's behaviour as it is not necessary to define and adopt complex protocols and dissemination schemes;
• each node independently runs the proposed scheme and selects the appropriate time for applying the update;
• the proposed model takes into consideration the dynamic nature of nodes, i.e., nodes initiate the process by themselves, thus, the application of the updates is secured;
• the proposed model is fully adapted to the performance of the network, securing the uninterrupted application of the updates. Nodes will decide to initiate the process only when they 'see' that the network's and their own performance are at high levels.
The contribution of our work is summarized as follows:
• a novel time-optimized, performance-aware mechanism based on OST, which decides when an update process, P, should be activated. As mentioned, the proposed model does not deal with how P should be installed but when P should be concluded towards the maximization of the performance of the network and the node;
• a method for delivering the optimal time t* for initiating P. P consists of two stages: (i) get the software from the central system; (ii) install the component locally. At t*, the node enjoys the best possible performance to support P. As the best performance, we define the highest possible network performance and the limited load of nodes;
• a method that alleviates the overhead required by legacy systems to disseminate and maintain the updates. The overhead is related to messaging as well as the required resources for combining parts of the updates and their patterns;
• a mechanism that is characterized by a lower complexity than legacy systems, giving nodes room to act autonomously and define their own scheduling for P;
• a model that requires only lightweight advertisement messages compared to legacy systems that require additional messaging for controlled flooding or multicasting parts of P;
• extensive simulations and an analysis of the results where we reveal the strengths of the proposed model and compare it with baseline solutions and a centralized approach.
The paper is organized as follows. Section 2 reports on the related work. In Section 3, we describe and formulate the problem under consideration. Section 4 presents the analysis of the problem and provides specific formulations for its solution. Simulations reveal the performance of the proposed model in Section 5. Finally, in Section 6, we conclude our paper by presenting our future research agenda.

Related Work
The pervasive nature of the IoT is enhanced by the presence of sensors either in the 'standalone' mode or embedded into users' smart devices. An IoT node consists of two parts: (i) the hardware that makes the device capable of recording/collecting/sending data from/to the environment, and, (ii) the software that makes the node capable of processing the collected data and producing some outcomes. The produced knowledge could be adopted to deliver decisions related to the presence of events and the provision of the appropriate responses. In this section, we shortly review the IoT domain by presenting the taxonomies of the devices, the communication schemes as well as the envisioned applications.
Finally, we report on schemes for delivering software updates in IoT nodes.
Specific taxonomies have already been defined for describing the components required by the IoT vision [7], [63], [64]. There are three components that enable the pervasive computing aspect [21]: (i) the required hardware, i.e., sensors, actuators and communication hardware; (ii) the required middleware, i.e., functionalities related to storage, processing and support for intelligent analytics; (iii) the required presentation, i.e., tools that support visualization and interpretation. Hardware defines a number of requirements related to power, connectivity and security that should be handled to provide a node ready to support high quality applications. For instance, a device cannot execute energy consuming tasks when the underlying hardware poses strict constraints on the energy consumption. In addition, when a node has limited power or computational capabilities, it cannot be used in applications that require the continuous presence of the node in its environment. For communication purposes, the technology that is widely adopted is RFID. RFIDs help in the automatic identification of anything to which they are attached and act as electronic bar codes [31], [69]. In addition, the WSN technology leads to the adoption of low power, miniaturized nodes capable of supporting remote sensing applications. Sensors enable the collection, processing, analysis and dissemination of the observed information gathered in a variety of environments [2]. Interestingly, sensors can easily support both distributed and centralized applications and, especially, provide the basis for deriving intelligent analytics. Nodes' communication capabilities offer functionalities for connecting them to, e.g., the Cloud to invoke specific services or to fire more complex processing.
Smart IoT nodes are based on the appropriate middleware to be capable of performing the required functionalities. The middleware is a mechanism to combine cyber infrastructure with Service Oriented Architecture (SOA) and WSNs to have access to heterogeneous sensor resources [18]. The management of heterogeneity, accompanied by a platform-independent middleware, is of high importance in future applications. The Open Sensor Web Architecture (OSWA) [58] offers a set of operations and standard data formats to 'cover' the retrieved data and provide a uniform view of the sensors' results. The middleware connects different, complex, new or existing software components that are not designed to be connected. The architectural aspects of middleware, the requirements and the available methodologies have already been discussed in various research efforts.
Concerning the application domains, these can be classified according to the type of networks, coverage, scalability management, heterogeneity, user involvement and impact [19]. Four categories of IoT application domains are identified [21]: (i) Personal and home applications. Intelligent applications are delivered on top of the collected data, fully adapted to users' characteristics and the dynamics of the environment; (ii) Enterprise applications. The information is collected by enterprise networks and intelligent applications are delivered in domains like environmental monitoring [1], [20], [23], [24], [35], [36], smart IoT environments [19], [44], etc.; (iii) Utilities. They involve intelligent applications on top of information retrieved by networks adopted to produce solutions for service optimization. Typical examples are the Smart Grid and smart energy management applications like those presented in [14], [17], [47], [55]; (iv) Mobile applications. They are built on top of the information conveyed by mobile nodes. A smart transportation system [4], [10], [32], [71] is the typical representative of such applications.
Software updates for IoT nodes are necessary after deployment to keep the nodes capable of efficiently performing the assigned tasks. A survey on the adopted methodologies is presented in [6]. Firmware updates are necessary in non-modular sensor operating systems [41]. Apart from firmware, generic re-programming (i.e., software components that are not directly related to the firmware) is another problem that aims to extend or correct devices' functionalities. Re-programming is related to updates in the software components that aim to provide extensions, i.e., new functionalities, or solve possible errors. As mentioned, the envisioned algorithms aim to minimize the time required for applying any update through the minimization of the amount of data that will be transferred to nodes. Incremental updates and data compression could be adopted to reduce the size of messages [62]. However, splitting the data into multiple parts requires an increased number of messages for concluding the whole process. When only the differences with the previous version of the software are sent, a mechanism that combines the new version with the old one without jeopardizing the functioning of nodes is necessary. The incremental management of the updates does not eliminate the necessary process for maintaining the updates history and the aforementioned combination.
Data dissemination protocols are the key part of a dissemination strategy for transferring software updates to any member of a WSN. Some protocol examples are discussed in [28], [30], [38], [52], [40], [60], [66]. The use of a controlled flooding differs from the data collection protocols in storage, coverage and data flow. However, controlled flooding does not eliminate the need for an increased number of messages that will be distributed in the network, especially, in dense networks. In WSNs, resources are limited and nodes cannot cache overheard packets which might not be useful [28]. The normal pattern for dissemination protocols is a three-step process [6]: (i) the advertisement of available software; (ii) the selection of a source; and (iii) the reliable downloading to the target.
A subscription approach could be adopted; however, this results in a significant overhead in the network and, more specifically, in the server where the updates are present.
Some widely cited research efforts in the domain are as follows. Trickle [42] is an algorithm that disseminates and, accordingly, maintains software updates in WSNs. Trickle adopts an epidemic approach with scalable multicasting through which updates are periodically transferred. Epidemic approaches, in general, may involve the transmission of several copies to random nodes, thus, there is an increased cost for the management of the received messages. In addition, there is no guarantee that nodes will always be connected to the network to receive the envisioned messages. Special attention is paid by the data discovery and Dissemination Protocol (DIP) to the elements that could be exchanged between nodes [46]. The protocol tries to randomly scan the network and detect new items while maintaining the latency at low levels. DHV [12] is an efficient code consistency maintenance protocol that ensures that nodes will, eventually, have the same code. The Multicast-based Code redistribution Protocol (MCP) [45] is another protocol that performs code maintenance. MCP 'forces' every node to maintain a table that contains the information of known applications.
The table supports the delivery of multicast-based code dissemination requests.
In any case, the use of additional data structures increases the storage complexity of the corresponding models. The Multi-hop, Over-the-Air code distribution Protocol (MOAP) [61] uses a store-and-forward approach providing a pattern of updates. The updated code is broadcasted on a neighbour-by-neighbour basis, forcing nodes to disseminate the incoming code to reduce the latency. Deluge [27] is a protocol that builds on top of algorithms related to density-aware, epidemic maintenance protocols and includes several optimisations. It adopts Trickle for the advertisement of code and splits the code into a set of fixed-size pages. Through this approach, the time required for the propagation of large components is reduced. In any case, the adoption of multiple optimizations increases the complexity of the proposed solution, especially for the recreation of the updates from multiple parts. Stream [53] adopts Deluge and optimizes the code parts sent in the network. Stream deals with pre-installing the re-programming application in each node. Hence, Stream transmits the minimal support (about one page) needed for the activation of the re-programming image.
Resource-awareness, time-efficiency, and the integration of security solutions are involved in the model presented in [48]. A multi-hop propagation scheme is proposed enhanced by security codes and means from fuzzy control theory.
In any case, the definition of fuzzy logic rules that cover all the aspects of real scenarios is very difficult. MELETE [73] is another code dissemination protocol designed to support multiple concurrent applications in a WSN. It assumes that the network is a set of groups of nodes that execute different tasks. The framework adopts a group-keyed model to selectively distribute the code to only the interested nodes, and reactively distribute the code only when it is required.
A monitoring process (like in our model) over various parameters before performing a set of actions is adopted in various domains. In [57], the authors describe a distributed scheduler that opportunistically schedules data transmissions, with a view of minimizing the energy consumption of a wireless device. By exploiting the stochastic characteristics of the channel, the model postpones the communication up to an acceptable time deadline until it finds the best expected channel conditions. In [51], the authors study Device-to-Device (D2D) communications and demonstrate the energy, capacity and Quality of Service (QoS) benefits of link-aware opportunistic D2D communications. In [11], the authors study the energy efficiency of channel-aware random access with multiple parallel channels under a collision channel. An asymptotic relationship between the energy efficiency and the total bandwidth is described, which shows that the relationship depends on the energy consumption properties of sensors. In [16], the authors focus on the Distributed Opportunistic Scheduling (DOS) model that exploits multiuser diversity in wireless networks without the requirement of a central scheduler. With DOS, users take their own scheduling decisions based on local observations related to the channel. In [67], the authors propose an online scheduling algorithm designed to decide the optimal action in each time slot (i.e., to transmit or hold the packet on the top of the transmitter queue) based on the predicted channel condition and the packet queue status. Finally, [43] reports on a model that builds on top of the prediction of the channel SNR to schedule the desired transmissions.

Problem Statement
In this section, we discuss the problem under consideration and present basic information about our model. In Table 1, we provide the basic notation adopted throughout the paper.

Preliminaries
We envision a setting where a set of IoT nodes N = {n_1, n_2, . . . , n_|N|} perform specific processing tasks. These tasks are independent of each other even if some of the nodes could perform the same task (e.g., temperature or humidity monitoring). An IoT node, n_i, is a physical device embedded with electronics and software having capabilities of collecting, processing and exchanging data. An update process P is an independent task that alters the software of a node. An update could be [62]: (i) an update of the operating system; (ii) an update of an application; (iii) an addition of a new application; (iv) a modification of parameters in an existing application. A server S, a software component, is responsible for storing, managing and disseminating the updates. We try to alleviate the load of S when serving a high number of nodes. When many nodes try to contact S to download and install the available updates, bottlenecks could appear. In such cases, S should adopt an intelligent algorithm or be accompanied by powerful hardware to efficiently serve the increased load. We propose a distributed scheme that each node adopts to be able to get and apply P.
P (from the node perspective) is concluded after the reception of the corresponding advertisement message from S and consists of the following steps: (i) n_i connects to S; (ii) n_i downloads P; (iii) n_i applies P locally. We assume that S, when P is available, sends a lightweight message (e.g., a single packet) announcing P's presence. Based on the criticality of P, the message could be accompanied by a deadline. We define the update epoch UE as the time required to apply P. UE is an interval [1, U] in which the entire set of nodes should have concluded P.
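The node-side steps (i)-(iii) above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the `Server` interface and function names are hypothetical stand-ins for the deployment-specific connect/download/apply calls.

```python
# Minimal sketch of the node-side update process P, assuming a
# hypothetical Server interface standing in for the repository S.

class Server:
    """Stub for the update repository S."""

    def connect(self, node_id: str) -> None:
        # In a real deployment: open a session to S.
        print(f"{node_id} connected to S")

    def download(self) -> bytes:
        # In a real deployment: fetch the update payload.
        return b"update-payload"


def conclude_update(node_id: str, server: Server) -> bool:
    """Run steps (i)-(iii) of P; return True on success."""
    server.connect(node_id)          # (i) n_i connects to S
    payload = server.download()      # (ii) n_i downloads P
    return len(payload) > 0          # (iii) apply P locally (stubbed)


print(conclude_update("n1", Server()))  # True
```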
When U = ∞, we consider the infinite horizon version of our scheme; otherwise, we deal with the finite horizon version of the problem. When no deadline is set, n_i has unlimited time to conclude P. In any case, n_i should conclude P as soon as possible, however, under specific constraints related to n_i's performance that secure the successful conclusion of P. We focus on the infinite horizon version of the aforementioned problem. The finite horizon version is studied in [34] where we provide specific formulations and the solution of the problem through backward induction. The finite version 'suffers' from the need to meet the pre-defined deadline, thus, nodes are forced to immediately conclude P.
This can affect the performance as multiple nodes may decide to initiate P at the same time. The identified research challenges are as follows:
• Research Challenge 1. Each node should monitor its performance and the status of the network to retrieve the corresponding data. These data will become the basis for the decision making related to the conclusion of P. The challenge is to select the appropriate performance metrics and build the discussed mechanism.
• Research Challenge 2. Based on the collected performance data, for every U E, we should find the optimal stopping time t * ∈ [1, U] where nodes will stop the monitoring process and initiate/conclude P.

Update Management Optimization
A performance metric m k measures the activities and the performance of an entity. Let M be the set of the adopted metrics, i.e., M = {m k , k = 1, 2, . . . , |M|} on top of which the decision for concluding P is made. For instance, when the node enjoys a high bandwidth and exhibits a low load, it can initiate P. In such cases, the execution of P will not affect the performance of the node and it will not disturb the node from the execution of the assigned tasks.
In our model, we focus on two types of metrics related to (i) the network, and (ii) the performance of a node. Network performance metrics can be categorized into [22]: (i) availability; (ii) packet loss and error; (iii) delay; (iv) bandwidth. A representative bandwidth metric is the available bandwidth [29], [72], which refers to the maximum unused bandwidth at a link or end-to-end path. This depends on the link capacity and the traffic load during a certain time period. The monitoring process of the aforementioned metrics adds an overhead; however, this overhead could be eliminated if the data collection is characterized by low frequency. Recent studies show that when the reporting frequency is high, the impact on the traffic performance increases [25]. The frequency of the monitoring process affects the throughput and the end-to-end delay. When the reporting interval is set to a high value, e.g., 15 seconds, the impact is almost zero [25]. In this paper, we consider that nodes adopt a frequency that leads to almost zero overhead for network monitoring. An analytical study on the constraints of the network monitoring overhead is beyond the scope of the paper.
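A low-frequency monitoring loop of this kind can be sketched as follows. This is only an illustration under stated assumptions: `read_metric` is a hypothetical stub for real sensor/network probes, and the 15-second interval mirrors the reporting interval discussed above.

```python
import random
import time

def read_metric() -> float:
    """Stub: return a normalized metric value in [0, 1]."""
    return random.random()

def monitor(samples: int, interval_s: float = 15.0, sleep=time.sleep):
    """Collect `samples` readings, pausing `interval_s` between them.

    The `sleep` callable is injectable so the loop can be exercised
    without real delays.
    """
    history = []
    for _ in range(samples):
        history.append(read_metric())
        sleep(interval_s)
    return history

# Exercise the loop with a no-op sleep to avoid real 15 s pauses:
readings = monitor(4, sleep=lambda s: None)
print(len(readings))  # 4
```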
Consider the discrete time T with t ∈ T. Suppose an advertisement message indicating a new update arrives at n_i. A new UE_j starts and, at t ∈ UE_j, n_i checks every m_k, k = 1, 2, . . . , |M| and calculates the 'reward' it will gain if it initiates P. The reward is based on the observed values and is affected proportionally or inversely according to the type of m_k.
For proportional metrics, the higher the value is, the higher the reward becomes (e.g., the bandwidth). For inverse proportional metrics, the lower the value is, the higher the reward becomes (e.g., the load of each node). We define the function I_k that incorporates the information on whether m_k is proportional or not. With every metric value normalized in [0, 1], I_k(t) = m_k(t) if m_k is proportional, and I_k(t) = 1 − m_k(t) otherwise. Based on I_k, n_i calculates the reward r_t as follows: r_t = (1/|M|) Σ_{k=1}^{|M|} I_k(t). r_t gives an indication about the current status of the node and the network.
r t can be easily calculated and acts as a weighted aggregation scheme where all metrics are of equal weight (the node pays equal attention to all of the observed metrics). n i tries to maximize r t and 'safely' conclude P. Based on r t , n i should decide when it is the appropriate time to initiate and conclude P. Hence, at t, n i should take one of the following decisions: • D1. Stop the monitoring process, initiate and conclude P.
• D2. Continue the monitoring process without deviating from the current task fulfilment.
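As a minimal sketch (in Python; the function names, the metric pairs, and the normalization of all metric values to [0, 1] are illustrative assumptions, not taken from the paper), the reward aggregation and the per-step decision could look as follows:

```python
def indicator(value, proportional):
    """I_k: map a normalized metric value in [0, 1] to a reward contribution."""
    return value if proportional else 1.0 - value

def reward(metrics):
    """r_t: equal-weight aggregation over (value, is_proportional) pairs."""
    return sum(indicator(v, p) for v, p in metrics) / len(metrics)

def decide(r_t, limit):
    """D1: stop monitoring, initiate and conclude P; D2: continue monitoring."""
    return "D1" if r_t >= limit else "D2"

# Example: high bandwidth (0.9, proportional) and low load (0.2, inverse).
r = reward([(0.9, True), (0.2, False)])  # (0.9 + 0.8) / 2 = 0.85
```

Here `limit` stands for the reward threshold derived later in the analysis; the exact stopping criterion is discussed in the model analysis.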

Optimal Stopping Theory
The OST [56] can be adopted for determining the best time to take an action (decision) based on sequentially observed random variables. An optimal stopping problem is defined by a sequence of random variables X 1 , X 2 , . . . whose joint distribution is known and a sequence of real-valued reward functions y 1 (x 1 ), y 2 (x 1 , x 2 ), . . .. Let (Ω, B, P ) be the probability space, and G t be the sub-σ-field of B generated by X 1 , . . . , X t . We have the increasing sequence of σ-fields G 1 ⊆ G 2 ⊆ · · · ⊆ B; a stopping time t is adapted to this sequence, and the goal is to choose t so as to maximize the expected reward E[y t (X 1 , . . . , X t )].

Model Analysis
In our infinite horizon problem, each node has no upper time limit to conclude P. At t, n i experiences a 'disturbance' in the performance metrics depicted by the m k random values. We focus on independent metrics and do not consider any adaptation process, especially in the underlying network. The reward at t is realized through a function y t = f t (y t−1 , r t ), t ∈ U E j , j = 1, 2, . . .. In our case, we adopt y t = max{y t−1 , r t }, as n i tries to find the maximum possible performance to conclude P. Based on y t , we consider the discounted reward function z t = β^t · y t. It should be noted that no recall is permitted because n i cannot adopt any previous realizations of the performance metrics (the node's and the network's performance are dynamically updated). The discount factor β ∈ (0, 1) affects n i 's behaviour as follows: n i should delay the decision in the anticipation of a better z t when β → 1, and should not delay the decision when S requires an immediate conclusion of P (β → 0). If n i never stops, the reward is considered equal to zero; thus, we assume z 0 = z ∞ = −∞.
The two main problems that should be solved are: • Identify t * where the expected reward is maximized.
• Find an optimal stopping rule, such that n i terminates the monitoring process to maximize the expected reward Z t , i.e., E[Z t ] (Z is the random variable depicting the n i 's reward).
We can treat the first problem as an infinite horizon problem where n i receives z t and has no 'pressure' on the final decision. However, β makes n i conclude P within a reasonable time interval.
Once n i observes z t , it decides whether to continue the process or not, by examining the expectation of the future reward without recall. In other words, n i has to find a t * where the supremum in Eq (3) is attained i.e., sup t E[Z t ].
Suppose that at some t, n i has observed Z t = z and it is optimal to continue the process. Then, at t + 1, if Z t+1 is still z, because y t+1 ≤ z, it is optimal to continue due to the invariance of the problem in time [15]. Hence, based on the principle of optimality, this problem can be solved as an optimal stopping problem with discounted future reward and without recall. This means that the reward can be considered as Z t = β t Y t = β t max{r 1 , . . . , r t } and the problem assumes the same solution as the following one: find a t * such that sup t E[Z t ] is attained.
Consider the rewards Z 0 , Z 1 , . . . , Z ∞ where Z t = f (r 1 , r 2 , . . . , r t ). The sequence {r t , F t } is defined on a probability space Ω with an increasing sequence of sub-σ-algebras {F t } ∞ 1 , and a sequence of random variables R i with r i ∈ F t and finite E[r i ], ∀i. The following two conditions should hold true for an optimal stopping time to exist [15]: [A] E[sup t Z t ] < ∞, and [B] lim sup t→∞ Z t ≤ Z ∞ almost surely. The interested reader could refer to [15] for more details.

Theorem 1. For the model defined by Eq (3), an optimal stopping time exists.

Proof. We have |Z t | = β t max{|r 1 |, . . . , |r t |} ≤ β t Σ_{j=1}^{t} |r j |. From the law of large numbers, we get (1/t) Σ_{j=1}^{t} |r j | → E[|r|], while t β t → 0. Hence, lim sup t Z t ≤ Z ∞ = 0 and condition [B] is satisfied.
Based on the above, the optimal stopping rule and the optimal stopping time are given by the principle of optimality, i.e., t * = min {t ≥ 0 : Z t ≥ Z * }, where Z * is the expected return as defined by the optimal stopping rule. As discussed, the problem is invariant in time; thus, the principle of optimality will never require the recall of a previous observation. Hence, Z * satisfies the optimality equation Z * = β E[max{R, Z * }] (Eq (4)). The adoption of the Uniform and the Exponential distributions for the performance metrics aims to cover as many real cases as we can. When applying the Uniform distribution, we assume that the r t s are of equal probability to be observed by n i .
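A sketch of the resulting rule (in Python; drawing the rewards from Uniform(0, 1) and the threshold value 0.72 are illustrative assumptions — since no recall is permitted, stopping at the first r t that reaches Z * is one direct reading of the rule):

```python
import random

def stopping_time(beta, z_star, rng, max_steps=10_000):
    """Stop at the first t whose reward reaches Z*; return (t*, beta^t * r_t)."""
    for t in range(1, max_steps + 1):
        r_t = rng.random()              # r_t ~ Uniform(0, 1), for illustration
        if r_t >= z_star:
            return t, beta ** t * r_t   # realized discounted reward Z_t
    return max_steps, 0.0               # never stopped within the horizon

t_star, z_t = stopping_time(beta=0.95, z_star=0.72, rng=random.Random(7))
```

With a Uniform reward, each step stops with probability 1 − Z*, so the stopping time is geometrically distributed and the discount β only shapes the realized reward, not the rule itself.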

Expected Reward maximization
By applying the Exponential distribution, we aim to cover multiple scenarios where the performance metrics are affected by the rate of the distribution. It should be noted that there is no reason to adopt a distribution biased towards large values (r t → 1) as, in these cases, the intelligent mechanism is useless.
By solving Eq(4), we can get the reward limit Z * above which n i should stop the monitoring process and conclude P.
Proof 3. The pdf h R of the random variable R, i.e., a sum of exponentials with different rates, is given by [3]: h R (r) = Σ_{i=1}^{|M|} ( Π_{j≠i} λ j / (λ j − λ i ) ) λ i e^{−λ i r}, r ≥ 0. By applying h R in Eq(4) and through calculations, we conclude the recursive equation as the Lemma indicates.
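The closed form of this density (the hypoexponential pdf) can be sanity-checked numerically; the sketch below (Python, with illustrative rates λ 1 = 1, λ 2 = 2, which are assumptions for the example) compares the formula against a Monte Carlo simulation of a sum of independent exponentials:

```python
import math
import random

def hypoexp_pdf(x, rates):
    """pdf of a sum of independent Exp(lambda_i) with pairwise distinct rates."""
    total = 0.0
    for i, li in enumerate(rates):
        coef = 1.0
        for j, lj in enumerate(rates):
            if j != i:
                coef *= lj / (lj - li)   # partial-fraction coefficient
        total += coef * li * math.exp(-li * x)
    return total

# Monte Carlo check of P(R <= 1): empirical frequency vs integral of the pdf.
rng = random.Random(0)
samples = [rng.expovariate(1.0) + rng.expovariate(2.0) for _ in range(100_000)]
empirical = sum(s <= 1.0 for s in samples) / len(samples)
integral = sum(hypoexp_pdf((k + 0.5) / 1000, [1.0, 2.0]) for k in range(1000)) / 1000
```

For rates (1, 2) the formula reduces to the known closed form 2(e^{−x} − e^{−2x}), and the empirical frequency agrees with the integral of the density.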

Estimation of the Expected Return
For evaluating the final value of Z * , we adopt a Monte Carlo simulation. Monte Carlo methods (or Monte Carlo experiments) rely on repeated random sampling to obtain numerical results. In our case, we perform a large number of simulations to obtain the distribution of Z * . In each simulation, we randomly generate the realizations of β. Accordingly, we record the Z * values that satisfy the equations provided by the aforementioned Lemmas; thus, we can derive the final distribution. In Figure 2, we present the histograms of the Z * distribution. It should be noted that the mean values of Z * are 0.48 and 0.32 for the Uniform and the Exponential distributions, respectively.
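A sketch of such an estimation for a single β (Python; it assumes the optimality equation takes the discounted form Z * = β E[max{R, Z * }] with R ∼ Uniform(0, 1), for which the closed-form solution is Z * = (1 − √(1 − β²))/β — these modelling choices are assumptions made for illustration):

```python
import math
import random

def z_star_uniform(beta, n_samples=100_000, iters=60, seed=1):
    """Fixed-point iteration on Z = beta * E[max(R, Z)] over Monte Carlo samples."""
    rng = random.Random(seed)
    samples = [rng.random() for _ in range(n_samples)]  # R ~ Uniform(0, 1)
    z = 0.0
    for _ in range(iters):
        z = beta * sum(max(r, z) for r in samples) / n_samples
    return z

beta = 0.95
z_mc = z_star_uniform(beta)
z_exact = (1 - math.sqrt(1 - beta ** 2)) / beta  # closed-form solution
```

Repeating the computation over randomly generated β values, as described above, yields the distribution of Z * rather than a single estimate.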

Methodology and Experimental Setup
We elaborate on the performance of the proposed model. We adopt the bandwidth b as one of the most important network performance metrics; b assesses the amount of data that a node can transfer through the network. Node performance parameters are related to a node's behaviour and load concerning the executed tasks (we define the parameter l to depict the current 'availability' of each node).
When a node has a lot of tasks to execute, l will be low. For simplicity, we do not focus on hardware related issues (e.g., memory size, CPU speed, storage used). However, our model can be easily extended to include more metrics into the decision scheme.

Evaluation Metrics & Simulation Setup
We report on the performance of the OSS concerning two aspects: (i) the time required for deciding the initiation of P; and (ii) the quality of the decision in terms of the network's and the node's performance (our model aims to identify the highest possible value for the observed metrics that secures the optimal performance). We report on the optimal stopping time t * and the 'stopping' values of b and l, i.e., b * , l * . The optimal performance is realized when t * → 0 and b * , l * are the maximum possible for the specific update epoch U E j . Without loss of generality, we consider b, l ∈ [0, 1]; thus, the optimal performance is achieved when b * , l * → 1.0.

We define the τ metric to depict the time required to conclude a stopping decision. As τ → 0.0, n i requires limited time to conclude P; the opposite stands when τ → ∞. The quality of the result (i.e., the maximum possible b * and l * for the epoch U E j ) is evaluated by the γ and δ metrics: γ = (1/E) Σ_{e=1}^{E} b *_e and δ = (1/E) Σ_{e=1}^{E} l *_e, where E is the number of experiments. γ depicts the average bandwidth while δ depicts the average 'availability' of each node. The higher γ and δ are, the better the performance the OSS exhibits; the optimal performance is achieved when γ → 1.0, δ → 1.0.

We also define the ω metric depicting the percentage of nodes deciding to initiate P at each t, i.e., ω = |N t s | / |N |, where |N t s | is the number of nodes deciding to initiate P at t and |N | is the total number of nodes. ω shows how many nodes decide to start P at the same time. Recall that the stopping decision and the initiation of P are based on the network performance as well as the availability of each node. As we rely on the random behavior of nodes, it is an exceptional case to have all the available nodes deciding to initiate P at the same time; even if nodes observe the same network performance, their availability could differ.
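The evaluation metrics above can be sketched as follows (Python; the per-experiment result tuples and the decision horizon are illustrative assumptions):

```python
def evaluate(results, horizon):
    """results: one (t_star, b_star, l_star) tuple per experiment."""
    E = len(results)
    tau = sum(t for t, _, _ in results) / E        # mean stopping time
    gamma = sum(b for _, b, _ in results) / E      # mean 'stopping' bandwidth b*
    delta = sum(l for _, _, l in results) / E      # mean 'stopping' availability l*
    # omega_t: fraction of nodes/experiments deciding at each time step t
    omega = [sum(t == step for t, _, _ in results) / E for step in range(horizon)]
    return tau, gamma, delta, omega

tau, gamma, delta, omega = evaluate(
    [(1, 0.8, 0.7), (2, 0.9, 0.6), (1, 0.7, 0.8)], horizon=4)
```

In this toy run, two of the three experiments stop at t = 1 and one at t = 2, so the ω profile peaks in the early rounds, as in the results reported below.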
It is very difficult for all nodes to have the same load and the same complexity in the tasks they are going to execute, and, consequently, to take the same decision for the initiation of P.
In any case, apart from the dynamic nature of their internal status, when a number of nodes decide to initiate P, the remaining nodes will experience a different network performance affecting their future decisions. The adoption of ω aims at revealing this randomness in nodes' behavior, proving that nodes' decisions are 'distributed' in time and add value to the proposed model compared to a broadcasting scenario.
We compare the proposed model (i.e., the OSS) with a distributed deterministic model DM . The DM concludes P immediately when an update message is received, only if b and l are over a pre-defined threshold. In our simulations, this threshold is set equal to 0.70 (also produced by simulations).

We also define the difference D for each metric to depict the difference in performance between the OSS and DM , M AE, EW M A: D = ((P OSS − P REF ) / P REF ) · 100%, where P OSS is the performance of the OSS and P REF is the performance of the reference model (DM , M AE or EW M A).

For producing 'fluctuations' in the bandwidth of the network, we consider that every µ seconds, nodes send messages to a random peer. We get µ ∈ {20, 120} aiming to simulate two types of network load. When µ = 20, there is an increased number of messages in the network apart from the messages related to P. If µ = 120, the number of messages is lower than in the previous case.
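The DM baseline and the relative difference D can be sketched as follows (Python; names and sample values are illustrative):

```python
def dm_decides(b, l, threshold=0.70):
    """DM: conclude P immediately upon an update message iff both metrics pass."""
    return b >= threshold and l >= threshold

def difference(p_oss, p_ref):
    """D: relative performance difference (%) of the OSS over a reference model."""
    return (p_oss - p_ref) / p_ref * 100.0

d = difference(0.80, 0.64)  # OSS performs 25% better than the reference
```

A positive D means the OSS outperforms the reference model for that metric; the percentages reported in the tables below follow this convention.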

Performance Assessment
Initially, we report on the complexity of the proposed model, which depends on the length of U E j , i.e., the epoch 'fired' just after the reception of P's advertisement message. Nodes should monitor the adopted metrics and perform the required calculations (e.g., the calculation of the reward). At t, the realization of each performance metric is stored in a list of historical values, e.g., L B for b and L L for l.
The size of L B and L L could be at most equal to the length of U E j ; thus, the computational complexity of the proposed model is bounded by the epoch length.

We report on the probability density estimation (pde) of t * , b * and l * . In Figure 3, we plot pde(t * ) for β ∈ {0.2, 0.5, 0.95}. In general, t * is below 35, which indicates that the stopping decision requires at most 35 steps. Recall that β applies 'pressure' on nodes to conclude P as soon as possible. In any case, the t * derived by adopting the Uniform distribution is lower than the t * derived through the adoption of the Exponential distribution. In Figure 4, we observe that λ does not affect the realization of t * ; however, nodes require more than 1,000 steps (the highest value) to conclude P. Apart from that, the combination of λ 1 and λ 2 does not affect t * . These results show that the many fluctuations in the realization of b and l, as depicted by the Uniform distribution, make the proposed model immediately conclude P.

Our results related to l * are presented in Figures 8 & 9. We observe similar results as in the experimental outcomes for b * . Figure 8 shows our results when the Uniform distribution is adopted while Figure 9 depicts our results for the Exponential distribution. In the case of the Exponential distribution (Figure 9), we observe that λ 1 and λ 2 do not significantly affect the performance of the OSS, as already discussed for the b * outcomes. In the Cooja simulations, we get an average l * equal to 0.61 and 0.79 for µ = 20 and µ = 120, respectively. The medians are 0.71 and 0.88. We observe that the availability of nodes is high; thus, they could be able to support P.

γ and δ results are presented in Figures 11 & 12. When the Uniform distribution is adopted (Figure 11), any increment in β will slightly increase γ and δ. In any case, γ and δ are kept above 0.70 for β ∈ {0.2, 0.5, 0.95}. Each node enjoys high b * and l * when it decides to stop the monitoring process and conclude P.
When b and l follow the Exponential distribution, we observe that γ and δ are below 0.60; γ and δ get low values compared to the results related to the Uniform distribution. Our model exhibits better performance when the monitored data are characterized by many fluctuations due to the randomness of the observations. Recall that the Exponential distribution is biased towards low or high values depending on the λ we choose. Let us now report on the comparison of the OSS with the DM , M AE and EW M A. In Table 2, we present the comparison between the OSS and the DM .
Recall that the DM stops just after the reception of the message indicating the presence of an update in S, and only when b and l are over the pre-defined thresholds. We observe that the OSS outperforms the DM concerning τ D , which means that it requires less time to conclude P. However, this comes at the expense of γ and δ: the OSS outcomes are of lower quality except when β = 0.5 and the metric under consideration is γ. The aforementioned results stand for the scenario where the Uniform distribution is adopted. In Table 3, we present the results delivered when the Exponential distribution is adopted. There, the OSS outperforms the DM for the entire set of the examined metrics. The reason is that the Exponential distribution leads b and l to exhibit limited fluctuations; thus, the DM hardly finds both parameters over the pre-defined thresholds.
The comparison between the OSS and the M AE is presented in Tables 4 & 5. We observe that the OSS outperforms the M AE for the entire set of metrics. The difference is high when the Exponential distribution is adopted. The increased performance related to the γ and δ metrics is accompanied by the limited time required to conclude P.

In Tables 6 & 7, we present our results concerning the comparison between the OSS and the EW M A. In these results, we observe that the OSS also outperforms the EW M A. The average difference is 33.55% and 35.06% for γ and δ, respectively (the Uniform distribution is adopted for b and l). For the Exponential distribution, the average difference is 123.75% and 123.09% for γ and δ, respectively. The highest difference is observed for β = 0.2 (γ metric) and β = 0.5 (δ metric) when the Uniform distribution is adopted. The scenario involving λ 1 = 5 and λ 2 = 1 leads the OSS to exhibit the highest difference with the EW M A concerning γ; the scenario involving λ 1 = λ 2 = 1 leads to the highest difference concerning δ.

In Figures 13 & 14, we present our results related to the ω metric. In these experiments, we instruct 100 nodes to take a decision in 100 time steps. We get similar results when the Uniform distribution is adopted (see Figure 13). The use of the Uniform distribution 'allocates' the nodes to the first decision rounds, aligned with the results delivered for the optimal stopping time; nodes are forced to conclude the process as already explained in the provided experimental results. In this set of experiments, the maximum number of nodes deciding to conclude P at the same t is 32. We also observe that the use of the Exponential distribution distributes the stopping time in the available interval. Apart from the last decision round (i.e., t = 100), only a limited number of nodes decide to initiate P in the remaining rounds. In all the experimental scenarios, the maximum number of nodes taking their decision at the same time is 36 out of 100 nodes.
Even in this case, the proposed model saves resources for the conclusion of P compared to a broadcasting scenario. Finally, we compare the OSS with the centralized system Deluge concerning the number of required messages and the time to conclude P. Our results for the OSS are retrieved through the use of the Cooja simulator. We adopt the same experimental scenario described in [27] and compare our model with the basic form of Deluge. In Deluge's basic form, every node occasionally advertises the most recent version of the data object it has available to whatever nodes can hear its local broadcast. Nodes identifying a difference between the advertised data object and their local copy may request it from their neighbours. Nodes receiving requests then broadcast the requested data, and nodes receiving the new data objects advertise the newly received data in order to propagate it further. Additionally, if a node has not completely received its data after making a number of requests, it searches for a new neighbour to request data from. Compared to Deluge, we note that the proposed model is not affected by the communication model and the topology of the network. Every message has 1,104 bytes per page and each data packet has a payload of 23 bytes.
In Deluge, the required number of messages for the distribution of 20 pages in 75 nodes is equal to 9,966. In the OSS, we need one advertisement message

Conclusions & Future Work
IoT and pervasive computing demand novel applications on top of the autonomous nature of independent nodes. In this paper, we propose a distributed, time-optimized, performance-aware model that aims to assist autonomous nodes to initiate and conclude an update process. The proposed scheme alleviates the central servers from the burden of supporting complex protocols for the distribution of the updates while being aware of nodes' specific characteristics. Each node, independently, decides when it will conclude the update process according to the result of a monitoring process. The monitoring process aims to provide a view of the performance of the network and the node itself. When the performance is of high quality, there is room for applying the updates without distracting the node from its assigned tasks; the decision, then, is to realize the communication with the central server and conclude the update process. In contrast to centralized systems, the network is not flooded by update messages and nodes' performance remains at high levels. We adopt an infinite horizon time-optimized model applied on top of multiple performance metrics. The model yields the time when the update process should be concluded, taking into consideration the dynamic nature of nodes. The proposed mechanism is fully adapted to the performance of the network, securing the uninterrupted application of the updates. Future extensions of our work involve the adoption of an adaptive model fully aligned with the nodes' needs. The adaptive model will try to handle the uncertainty related to the state of the environment and nodes' behaviour. With this approach, we will offer a complete model for concluding updates either in the short or the long term.