Subscribing to fuzzy temporal aggregation of heterogeneous sensor streams in real-time distributed environments

SUMMARY Because of the deployment of heterogeneous sensors in intelligent environments, data fusion and information processing become an arduous and complex process. The fusion of sensor data and the design of real-time information processing are key aspects in generating feasible solutions. To shed light on this context, we present an approach for distributing and processing heterogeneous data based on a representation with fuzzy linguistic terms. In this way, the heterogeneous data from sensor streams are computed and summarized by means of fuzzy temporal aggregations ubiquitously within mobile and ambient devices. This innovative approach provides an intuitive linguistic representation of mobile and ambient sensors and implies a drastic reduction of the communication burden. To provide high scalability in network communication, the information from sensors is spread under the publication-subscription paradigm, where subscribers receive asynchronous events when the aggregation degree of the linguistic terms exceeds a threshold (α-cut). Finally, to illustrate the usefulness and effectiveness of our proposal, we present the results of the fuzzy temporal aggregation of sensor streams with α-cut subscriptions in a case study where an inhabitant performs daily activities in an intelligent environment. Copyright © 2016 John Wiley & Sons, Ltd.


INTRODUCTION
Intelligent environments are interactive spaces where technological devices are adapted to solve people's everyday problems. They are built on networks of physical objects, so the Internet of Things (IoT) [1] has arisen as a new paradigm where ambient intelligence [2] and ubiquitous computing [3] converge [4] to provide connected smart things within intelligent environments.
These recent paradigms properly locate the information processing in everyday objects, but new challenges must be addressed in order to develop systems that expedite the communication between humans and devices. In the following, the paradigms that have produced significant advances in different areas of knowledge in this context are discussed.
A first challenge that intelligent environments must deal with is the data fusion from heterogeneous sensors [5]. Currently, a wide range of heterogeneous sensors is deployed in intelligent environments to collect multiple and heterogeneous data from mobile, wearable, and ambient devices [6]. Mobile devices integrate user-interaction data (movement, location, etc.) by means of multiple sensors and mobile communication protocols (Bluetooth, NFC, Wi-Fi, etc.). Meanwhile, ambient devices collect information from our environment using low-power protocols

APPROACH FOR DISTRIBUTING AND PROCESSING OF HETEROGENEOUS DATA BASED ON A REPRESENTATION WITH FUZZY LINGUISTIC TERMS
In this section, we present a light, scalable approach in which heterogeneous sensors are modeled by linguistic terms and delivered to dynamic subscribers in real time. To do so, the main features and advantages of our proposal are indicated, and the fuzzy linguistic terms and fuzzy aggregation operations for processing sensor streams are presented in detail. Finally, the proposed fuzzy model focuses on defining the α-subscription of linguistic terms in distributed environments. The architecture for distributing and processing heterogeneous data based on a representation with fuzzy linguistic terms is illustrated in Figure 1. We show how the raw data streams are translated into linguistic terms and how the degree is used to filter relevant messages to subscribers based on α-cuts. In this way, each subscriber is able to obtain expressive summaries of the sensor stream and to reduce the communication burden dynamically using the α-subscription. In the following, the proposed architecture is described with its main features.
Our linguistic approach is proposed to facilitate the fusion and description of heterogeneous data from sensors, using linguistic terms that provide an intuitive and interpretable representation without structures or ontologies. Specifically, fuzzy linguistic terms describe the state of the sensor stream, such as movement is high or inhabitant is close to kitchen, together with fuzzy linguistic temporal terms, such as awhile or just now.
The proposed structure of linguistic terms is related to protoforms, which were proposed by Zadeh [48] as a useful knowledge model for reasoning [49], summarization [50], and fusion [51] of data under uncertainty. We have started from the classical protoform in the form of X is A and included a linguistic temporal term, generating protoform instances in the shape of: motion is just now low. We note the relevance of temporal processing, integrated in this contribution by means of fuzzy logic, which has offered an intuitive and flexible representation of temporal knowledge in several contexts [52][53][54]. This issue will be further discussed in Section 2.1.
At the middleware layer, previous solutions deal with data using heavy computing techniques on desktop computers, relegating light devices to wrappers of raw data. For example, GSN uses MySQL to manage and store the data streams, and OpenIoT provides subscription services only on the server side [55]. Moreover, there are some problems in integrating GSN into light devices with other platforms or languages [56]. Our approach aims to place the initial information processing in mobile and ambient computers, providing more scalable solutions without database persistence. Regarding subscription services, streaming sensor data under the subscription paradigm and temporal windows has become an active research area related to ubiquitous and ambient environments because it enables real-time capabilities in intelligent systems. In these contexts, the processing of data from ambient, mobile, and wearable sensors has focused on statistical and spectral features [57,58], which are calculated over short windows, such as a fixed static window size [59] or a dynamic window size [60].
In this work, we propose an approach focused on the flexibility in defining subscriptions to sensors through linguistic terms. Specifically, clients subscribe to sensors by defining a protoform by means of one linguistic term and one temporal term, for example, movement is just now high or kitchen door has been awhile opened, together with an α-cut threshold and a minimal time interval between notifications to subscribers. Each sensor calculates the fuzzy degrees, which represent a fuzzy aggregate value of the matching of these linguistic terms with the data stream of the sensor, and sends a real-time response to subscribers when the degree exceeds the α-cut. In this way, each sensor processes and distributes information from the data streams in real time using a model similar to top-k/w subscriptions [27] under a fuzzy approach. The α-subscription of linguistic terms is detailed in Section 2.2.

Fuzzy linguistic terms and fuzzy aggregation of sensor streams
In this section, we describe the use of linguistic terms to describe sensor streams in the proposed approach, which uses fuzzy logic to provide the capability of fusing heterogeneous sensor data to achieve high-level inferences about objects and situations [29,61].
The aim of the fuzzy linguistic processing of this work is to relate mobile and ambient sensors to linguistic variables, which are defined by a set of linguistic terms. In the fuzzy logic context, the semantics of the linguistic terms is given by fuzzy sets described by membership functions [28]. Formally, the membership function $\mu_A(x)$ describes the membership degree of the elements $x$ of the base set $X$ in the fuzzy set $A$, $\mu_A : X \to [0, 1]$.
In our approach, each sensor data stream $s^j$ is represented as a set of measures $s^j = \{m_i^j\}$, where each measure is represented by $m_i^j = \{v_i^j, t_i^j\}$; $v_i^j$ represents a value that depends on each sensor, for example, temperature or heart rate; and $t_i^j$ represents the timestamp, that is, the time at which the value $v_i^j$ was provided by the device. So, the superscript $j$ refers to a given sensor and the subscript $i$ refers to a given measure of the sensor stream $s^j$.
So, we propose to model the data stream from each sensor by means of linguistic terms. To achieve this, we define an independent processing of each component $v_i^j$ and $t_i^j$ and a subsequent method to fuse both linguistic terms.
Firstly, for each sensor data stream $s^j$, we define a fuzzy linguistic variable, the number of terms, their linguistic terms, and their membership functions, which are all based uniquely on the nature of the sensor. The general process is described as translating the measures to a new fuzzy set $V = \{V_r;\ r = 1, \ldots, n\}$, which represents the linguistic terms. $V_r$ is characterized by a membership function $\mu_{V_r}(v_i^j)$, which is interpreted as the degree of membership of a value $v_i^j$ in the fuzzy set $V_r$ for each $v_i^j \in m_i^j \in s^j$. For the sake of simplicity, we write $V_r$ instead of $\mu_{V_r}(v_i^j)$.

In Figure 2, an example is illustrated by the linguistic variable movement from a mobile accelerometer sensor with two linguistic terms, $V = \{V_1, V_2\}$ = {low, high}, with two trapezoidal membership functions.
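As a minimal sketch of how such terms could be evaluated, the following Python fragment defines a generic trapezoidal membership function and the two terms low and high. The breakpoints (in m/s^2) and the names `trapezoid`, `MOVEMENT_TERMS`, and `fuzzify` are assumptions made for illustration; the actual shapes are those drawn in Figure 2.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


# Hypothetical breakpoints in m/s^2; not taken from the paper.
MOVEMENT_TERMS = {
    "low": lambda v: trapezoid(v, -1.0, 0.0, 0.5, 1.5),
    "high": lambda v: trapezoid(v, 0.5, 1.5, 50.0, 51.0),
}


def fuzzify(value, terms):
    """Return the membership degree of `value` in every linguistic term."""
    return {name: mu(value) for name, mu in terms.items()}
```

With these assumed breakpoints, a value between 0.5 and 1.5 m/s^2 belongs partially to both terms, which is the overlap that makes the linguistic description gradual rather than crisp.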
Secondly, we associate fuzzy linguistic terms $T = \{T_k;\ k = 1, \ldots, m\}$ with the temporal component of the data stream. The degree of this temporal linguistic term is obtained from the temporal membership function and the temporal period that is defined by the distance $\Delta t_i^j = t_0 - t_i^j$, $t_0 > t_i^j$, from a reference point of time $t_0$ to the timestamp of the measurement $t_i^j$. Each temporal linguistic term $T_k$ relates the timestamp of the measurement $t_i^j$ to a fuzzy set $T_k$, which is characterized by a membership function $\mu_{T_k}(t_0 - t_i^j)$. For a given temporal linguistic term, we can write

$$T_k = \mu_{T_k}(\Delta t_i^j) = \mu_{T_k}(t_0 - t_i^j). \qquad (1)$$

In Figure 3, an example of temporal linguistic terms is illustrated, where the universe of discourse is measured in seconds since the current time, with two linguistic temporal terms, $T = \{T_1, T_2\}$ = {just now, awhile}, which represent fuzzy time intervals of up to 5 and 20 s, respectively, between the current time and the timestamp of the measure.
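The temporal terms can be sketched in the same way. The fragment below is an illustration only: it assumes that both terms are decreasing shoulder functions that reach zero at 5 s and 20 s, and the points where they start to decrease (2 s and 10 s) are hypothetical, as are the names `decreasing_shoulder`, `TEMPORAL_TERMS`, and `temporal_degree`.

```python
def decreasing_shoulder(delta, full, zero):
    """Degree 1 up to `full` seconds, falling linearly to 0 at `zero` seconds."""
    if delta < 0:
        return 0.0  # the measure would lie after the reference time t0
    if delta <= full:
        return 1.0
    if delta >= zero:
        return 0.0
    return (zero - delta) / (zero - full)


# Supports of 5 s and 20 s follow the text; the plateau ends are assumed.
TEMPORAL_TERMS = {
    "just now": lambda d: decreasing_shoulder(d, 2.0, 5.0),
    "awhile": lambda d: decreasing_shoulder(d, 10.0, 20.0),
}


def temporal_degree(term, t0, timestamp):
    """Degree of a temporal term for the distance t0 - timestamp (seconds)."""
    return TEMPORAL_TERMS[term](t0 - timestamp)
```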
Once the linguistic information for each sensor has been defined, the sensor streams must be aggregated over a time period. To do so, we propose to compute the relevance of a linguistic value term $V_r(v)$ in a fuzzy temporal term $T_k(\Delta t)$ in two steps:
• Firstly, for each sensor measurement, we define a t-norm operation [62] to fuse both fuzzy sets, jointly relating the value and temporal components:

$$V_r \cap T_k(m_i^j) = V_r(v_i^j) \cap T_k(\Delta t_i^j) \qquad (2)$$

In Eq. (2), the independent terms $V_r$ and $T_k$ are applied to each measure $m_i^j$ in the sensor data stream, providing a transformation from $\{v_i^j, t_i^j\}$ to the membership degree of the linguistic term $V_r \cap T_k$, for example, $m = \{0.2\ \text{m/s}, 5\ \text{s}\} \to$ is awhile low (0.8).
• Secondly, the degrees of membership of all measures in the sensor data stream are aggregated using the t-conorm operator in order to obtain a single degree of the fuzzy sets $V_r \cap T_k$ for the given sensor data stream over a period of time:

$$V_r \cap T_k(s^j) = \bigcup_{m_i^j \in s^j} V_r(v_i^j) \cap T_k(\Delta t_i^j) \qquad (3)$$

The linguistic terms $T_k$ and $V_r$, which are instantiated in the sensor stream $s^j$, represent a protoform in the shape of: $s^j$ is $T_k$ $V_r$. For example, the protoform instance: inhabitant motion is awhile low. Each protoform instance is measured with the degree of fuzzy aggregation from sensor data in accordance with Eq. (3).
In this way, we highlight that protoforms require an accurate assessment to properly express their semantics and granularity. Basically, we propose to aggregate measures depending on the nature of the sensor stream. So, we propose two models for solving Eq. (3) in order to compute the aggregation degree of data streams based on the variable and terms included in the protoform instances:
• Min-max. In this first case, the aggregation degree is evaluated in Eq. (3) using the maximum and minimum operators, $\cup = \max$ and $\cap = \min$, which calculates the highest degree (t-conorm = max) over all measures, each evaluated with the t-norm = min. If we replace the operators in Eq. (3) with $\cup = \max$ and $\cap = \min$, we obtain:

$$V_r \cap T_k(s^j) = \max_{m_i^j \in s^j} \min\left(V_r(v_i^j),\ T_k(\Delta t_i^j)\right)$$

It represents the best value of the measures in the temporal term. The semantics of this aggregation is useful to aggregate binary open/close sensors or the best references to other objects, for example, the following protoform instances: the door is awhile open, inhabitant location is awhile close to microwave, where the degree of the linguistic term is raised if the door has been opened, or the user has been close, at least once.
• Fuzzy weighted average. In this second case, the aggregation degree is evaluated in Eq. (3) using $\cup = \sum$ and $\cap = \times$. We note that the t-norm operator of this model corresponds to $\cap = \times$ and the t-conorm operator to $\cup = \sum$, so a normalization of the degree to [0,1] is necessary, which is introduced by the following expression:

$$V_r \cap T_k(s^j) = \frac{\sum_{m_i^j \in s^j} V_r(v_i^j) \times T_k(\Delta t_i^j)}{\sum_{m_i^j \in s^j} T_k(\Delta t_i^j)}$$

This semantics is related to the fuzzy weighted average [63], whose range $\in [0, 1]$ represents the frequency degree of the linguistic term $V_r$ weighted by the term $T_k$ in the sensor stream.
This fuzzy aggregation operation is suitable for aggregating measures with high sample rates, for example, from wearable or mobile sensors, where the degree is related to the presence of relevant measures in the temporal term, as in the following protoform instances: the movement is just now high or the heart rate is awhile low.
In this section, we have provided two aggregation operators, min-max and fuzzy weighted average, but other operators could be considered according to the needs.
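The two aggregation models can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the authors' implementation: each measure is assumed to have already been mapped to a pair (value degree, temporal degree), and the function names `min_max` and `weighted_average` are hypothetical. Returning 0 for an empty stream, or when all temporal degrees are 0, is also an assumption.

```python
def min_max(degrees):
    """Max over measures of min(value degree, temporal degree)."""
    return max((min(v, t) for v, t in degrees), default=0.0)


def weighted_average(degrees):
    """Value degrees averaged with the temporal degrees as weights."""
    pairs = list(degrees)
    total_weight = sum(t for _, t in pairs)
    if total_weight == 0.0:
        return 0.0  # no measure falls inside the temporal term
    return sum(v * t for v, t in pairs) / total_weight


# Three measures as (V_r degree, T_k degree) pairs.
stream = [(1.0, 0.2), (0.0, 1.0), (0.5, 0.8)]
best_match = min_max(stream)          # dominated by the single best measure
frequency = weighted_average(stream)  # reflects how often the term holds
```

On this small stream the min-max degree is driven by the third measure alone, whereas the weighted average is pulled down by the recent measure with value degree 0, which illustrates why the former suits binary events and the latter suits high-rate signals.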

2.2. α-subscription of linguistic terms
In this section, we detail the proposed approach based on the Real-time Publish-Subscribe model with M2M [64] communication, in which the degrees of aggregation of protoforms from sensors are distributed under subscriptions.
This paradigm focuses the information processing of linguistic terms on the mobile or ambient devices where the sensors are integrated. To do so, we propose that each sensor be represented as a publisher, to which clients subscribe with the linguistic terms that configure a protoform and an α-cut threshold in order to be notified asynchronously. The proposed approach provides the following three advantages:
• The information processing is distributed over several central processing units, such as mobile and ambient computers. In this way, each device (i) describes the linguistic terms related to its sensors, (ii) computes the aggregation of the protoforms defined by subscribers, and (iii) publishes degrees to its subscribers using an M2M connection. For example, the mobile devices calculate the degrees of aggregation of the protoforms related to each accelerometer sensor, notifying several mobile, desktop, or ambient subscribers in real time. In the same way, each ambient sensor is evaluated by means of remote services that can be deployed in distributed ambient computers.
• The subscription parameters are intuitive, because subscribers merely choose the relevant linguistic terms. In addition, subscribers can define a fuzzy threshold, called the α-cut, in order to receive only the degrees of protoforms that exceed the α-cut. This enables each subscriber individually to optimize the asynchronous events received and to avoid the sending of irrelevant degrees of protoforms.
• The subscribers obtain an understandable output: a fuzzy degree of aggregation of the linguistic terms relevant to them, which facilitates the fusion of interpretable information from heterogeneous sensors.
Next, we present the parameters of the remote services defined on the publisher side and the subscriber side.
Publisher side
• $s^j$, a label that represents the sensor stream, such as movement or temperature. Based on the nature of the sensor stream $s^j$, a $\cap\cup$ aggregation operator is associated with it, min-max or fuzzy weighted average, in order to calculate the degree of Eq. (3).
• $V^j = \{V_1^j, \ldots, V_n^j\}$, a set of linguistic terms, such as is high or is normal.
• $T^j = \{T_1^j, \ldots, T_m^j\}$, a set of temporal linguistic terms, such as awhile or just now.
• $\cap\cup$ aggregation operator. Subscribers specify the operators used to aggregate the data: (i) fuzzy weighted average or (ii) min-max.
Subscriber side
The parameters of the service used to subscribe to a sensor publisher are:
• $s^j$, the label of the sensor to subscribe to.
• $V_r^j \in V^j$, a linguistic term from the set $V^j$.
• $T_k^j \in T^j$, a temporal linguistic term from the set $T^j$.
• α-cut, the threshold that the degree of aggregation of the linguistic terms $V_r^j$ and $T_k^j$, computed with Eq. (3), has to exceed.
• $\beta_t$, the minimal time interval in milliseconds that the publisher waits before processing and notifying the subscriber again.
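The subscriber-side parameters map naturally onto a small record type. The following Python sketch is only an illustration under stated assumptions: the class name `Subscription`, its field names, and its methods are hypothetical, and "exceeding" the α-cut is read here as greater than or equal to, so that α = 0 lets every degree through.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    sensor: str          # s^j, label of the sensor stream
    value_term: str      # V_r^j, e.g. "high"
    temporal_term: str   # T_k^j, e.g. "just now"
    alpha_cut: float     # threshold in [0, 1]
    beta_t_ms: int       # minimal interval between notifications (ms)
    last_sent_ms: float = float("-inf")

    def __post_init__(self):
        if not 0.0 <= self.alpha_cut <= 1.0:
            raise ValueError("alpha_cut must lie in [0, 1]")
        if self.beta_t_ms < 0:
            raise ValueError("beta_t_ms must be non-negative")

    def protoform(self):
        """Readable protoform instance, e.g. 'movement is just now high'."""
        return f"{self.sensor} is {self.temporal_term} {self.value_term}"

    def should_notify(self, degree, now_ms):
        """True when beta_t has elapsed and the degree reaches the alpha-cut."""
        due = now_ms - self.last_sent_ms >= self.beta_t_ms
        return due and degree >= self.alpha_cut  # assumption: >= rather than >

    def mark_sent(self, now_ms):
        self.last_sent_ms = now_ms
```

Keeping the α-cut and $\beta_t$ on the subscription rather than on the sensor is what lets each subscriber tune its own event rate independently of the others.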
In this way, the information processing of the sensor streams is carried out in each publisher by means of four modules, illustrated in Figure 4, which are responsible for collecting and synchronizing the data stream and for publishing the results of the fuzzy linguistic processing to subscribers.
• Dynamic collection module. It collects data dynamically from the sensor stream using a circular buffer [65], whose size is large enough to represent the fuzzy temporal terms, allocating a minimal time span of data $\Delta t_{\max} = \max(\Delta t_i^j),\ \forall T_k \in T,\ T_k(\Delta t_i^j) \neq 0$.
• Timer task scheduler module. It handles the notification time for the subscribers, notifying them by means of asynchronous events when the time interval between the last sent event and the current time exceeds $\beta_t$.
• Dynamic list module. It maintains the list of subscribers, who can dynamically subscribe and unsubscribe using an M2M service.
• Fuzzy controller module. In this module, the aggregated membership degree of the fuzzy linguistic and temporal terms is calculated according to Eq. (3) and the $\cap\cup$ aggregation operator of the subscriber.
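How the four modules could cooperate is sketched below in Python. It is a simplified, single-process illustration under several assumptions: the class `Publisher` and its methods are hypothetical, a time-bounded deque stands in for the circular buffer, only the min-max operator is shown, plain callbacks replace the M2M channel, and `tick` is expected to be called periodically by some external timer.

```python
from collections import deque


class Publisher:
    """Buffers recent measures and pushes degrees to subscribed callbacks."""

    def __init__(self, value_terms, temporal_terms, t_max):
        self.value_terms = value_terms        # name -> membership over values
        self.temporal_terms = temporal_terms  # name -> membership over delta t (s)
        self.t_max = t_max                    # widest temporal support (s)
        self.buffer = deque()                 # (value, timestamp), oldest first
        self.subscribers = {}                 # dynamic list of subscriptions

    def subscribe(self, sub_id, value_term, temporal_term, alpha_cut, beta_t, callback):
        self.subscribers[sub_id] = dict(v=value_term, t=temporal_term,
                                        alpha=alpha_cut, beta=beta_t,
                                        cb=callback, last=float("-inf"))

    def unsubscribe(self, sub_id):
        self.subscribers.pop(sub_id, None)

    def add_measure(self, value, timestamp):
        """Dynamic collection: keep only measures younger than t_max."""
        self.buffer.append((value, timestamp))
        while self.buffer and timestamp - self.buffer[0][1] > self.t_max:
            self.buffer.popleft()

    def degree(self, value_term, temporal_term, now):
        """Fuzzy controller: min-max aggregation over the buffered measures."""
        mu_v = self.value_terms[value_term]
        mu_t = self.temporal_terms[temporal_term]
        return max((min(mu_v(v), mu_t(now - t)) for v, t in self.buffer),
                   default=0.0)

    def tick(self, now):
        """Timer task: notify subscribers whose beta_t interval has elapsed."""
        for sub in self.subscribers.values():
            if now - sub["last"] < sub["beta"]:
                continue
            d = self.degree(sub["v"], sub["t"], now)
            if d >= sub["alpha"]:  # assumption: >= rather than >
                sub["cb"](d)
                sub["last"] = now
```

In this sketch the interval is measured from the last event actually sent, following the description of the timer task scheduler, so a subscriber whose α-cut is not reached is re-evaluated on every tick.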

CASE STUDY
Intelligent systems for smart environments require a proper treatment of uncertainty to obtain high performance and accurate results [39] in relevant topics, such as activity recognition [38,66], which provides a key solution to the problem of the aging population [67] by helping elderly people to stay in their homes with the best quality of life for as long as possible. So, the fuzzy linguistic processing of this work has focused on describing daily activities from the heterogeneous sensors of a complex scene to provide an interpretable, intuitive, and flexible representation that can be analyzed by humans and intelligent computer systems.
In this section, a description of a daily activity in an intelligent environment is presented, which is modeled under our proposed approach for fuzzy linguistic processing and the subscription model.
Firstly, the smart lab and the set of deployed sensors are presented; then, the features of the middleware that has been developed for this case study are described. Next, the fuzzy linguistic terms of the set of sensors are indicated and, finally, the results in terms of the reduction of the communication burden achieved by the fuzzy temporal aggregation of sensor streams are provided and discussed. The presented scene shows two inhabitants in a smart lab equipped with mobile and ambient sensors.

Smart lab and deployed sensors
The case study was carried out in the smart lab of the Center for Advanced Studies in Information Technology and Communication of the University of Jaén ¶ . In the scene, inhabitant A is resting in the living room while another inhabitant, B, enters the lab. Inhabitant B turns the hall lights on and then cooks dinner using the kitchenware and the microwave. Inhabitants A and B meet in the kitchen when inhabitant B goes to drink water, and afterwards inhabitant B goes back into the room. Finally, inhabitant B eats his dinner in the living room beside inhabitant A.
To carry out the aforementioned scene, we have included several ambient and mobile sensors whose objective is to describe the activities of users from different perspectives. The aim is to receive homogeneous linguistic descriptions from a wide range of heterogeneous sensors. To do this, we have included the following sensors from ambient and mobile devices, which are illustrated in Figure 5:
• Vision sensors. Two vision sensors capture the user localization by means of ambient vision devices. To do so, we have integrated a proposal where inhabitants wear shirts with four markers that allow the vision sensors to recognize who the inhabitants are and where they are located, using the open-source project Minimal library for Augmented Reality applications (ArUco) [68] to detect the markers on the shirts of the inhabitants.
• Mobile sensors. Inhabitants A and B each carry a mobile device that collects several data streams from the following sensors:
◦ Accelerometer sensor, which describes the inhabitant motions in the scene.
◦ Bluetooth location sensor, which, combined with the iBeacons, provides closeness to rooms and objects using the Bluetooth Low Energy (BLE) protocol [69].
• Open/close sensors, which are located at the doors of some home appliances (microwave, fridge, etc.) as well as at the doors of kitchen furniture.
• Z-Wave lights. The smart lab integrates remote control of lights by means of the Z-Wave protocol [70]; the lights can be turned on/off from the mobile devices of the inhabitants.
The floor plan of the smart lab and the locations of the set of sensors are illustrated in Figure 6.

Middleware deployment
Because each sensor depends on its manufacturer's platform, a metalanguage middleware is required in order to deploy the subscription services.
In this case study, we have implemented in Java the model of fuzzy linguistic terms to describe sensor streams, detailed in Section 2.1, as well as the α-subscription of linguistic terms described in Section 2.2, using ZeroC Ice. This implementation has been integrated into (i) Android devices, which monitor the mobile sensors, and (ii) an ambient computer, which monitors the ambient sensors. The distribution of real-time data has been implemented on the basis of a previous work by the authors [71]. This proposal uses an object-oriented middleware, ZeroC Ice [24], which provides the spread of data in real time under the Data Distribution Service paradigm, using subscriptions and sending data through channels of real-time events. It also supports transparent communication across several languages (C++, .NET, Java, Python, Objective-C, Ruby, PHP, and JavaScript) and protocols (TCP, UDP, and SSL).
¶ http://ceatic.ujaen.es/

Fuzzy linguistic terms of sensors
In this section, we detail the fuzzy linguistic terms related to the temporal terms and to the mobile and ambient sensors of the smart lab.
In this case study, we propose the use of protoforms that are specified by means of variables and linguistic terms. As detailed previously, we have started from the classical protoform in the form of X is A, which represents possibilistic constraints. Based on this prototypical form, other extensions have proved useful, for example, in quantification [72] or in time series [73].
In this work, we propose the use of the protoform $s^j$ is $T_k^j$ $V_r^j$ to represent the linguistic terms $V_r^j$ and the temporal terms $T_k^j$ that describe the sensor stream $s^j$, for example, the protoform instance: inhabitant motion is awhile low. The degree of each protoform instance is computed as the degree of fuzzy aggregation of sensor data given by Eq. (3).
To instantiate the protoforms, it is necessary to define the variables and linguistic terms related to each sensor stream. This information is provided below for each sensor.
• Accelerometer mobile sensor, which measures linear acceleration in m/s^2 and represents the (A/B) inhabitant motion, with two linguistic terms, low and high, whose membership functions are illustrated in Figure 7.
• BLE proximity sensor, which measures the distance from the mobile device to a beacon sensor in meters (m) and represents close to (microwave/sofa), with the linguistic term close, whose membership function is illustrated in Figure 8. We note that the same linguistic term can be used for the kitchen and the living room because of their similar sizes.
• Vision sensor, which detects the shirt marker of the (A/B) inhabitant, with the linguistic term true, which represents that the (A/B) inhabitant is located in the (kitchen/living room), and whose membership function is illustrated in Figure 9.
• Switch sensor, which detects the open/close status of the (microwave/cupboard), with the linguistic term open, whose membership function is illustrated in Figure 10.
• Light (Z-Wave), which detects the status of the lights, with the linguistic term on, whose membership function is illustrated in Figure 11.
In this case study, two temporal linguistic terms, $T$ = {just now, awhile}, have been defined to obtain a flexible representation of the fuzzy temporal intervals where the sensor streams are relevant (Figure 3).

Results and discussion
In this section, the results of the fuzzy linguistic processing of protoforms and the events generated by the sensor streams in the case scene are presented and discussed. Table I.
Based on the values of the sensor streams, the subscribers received in real time the evolution of the membership degrees resulting from the fuzzy aggregation operation on the linguistic terms. The degrees are represented over time in Figure 12.
Regarding the sensor data, we highlight the robustness gained by integrating two location systems: BLE and vision tracking. Vision tracking provides a prompt response but loses tracking in some frames because of non-frontal captures of the shirt markers. Meanwhile, BLE provides a consistent stream of locations even though it presents a delay in adapting the closeness to beacons. Both are complementary in obtaining a robust and accurate location of the inhabitants in real time. In Figure 12C, because inhabitant B is opening the door of the smart lab in the hall and is walking to the kitchen during the time interval of [0,10] seconds, we observe a delay in the detection of presence in the kitchen, but we receive high movement because he is walking before entering.
The accelerometer sensor, by means of the linear acceleration of the mobile devices, presents fluctuating data, which need to be integrated with other sensor data to be useful in human activity descriptions. Finally, the ambient binary sensors (Z-Wave light and open/close) provide reliable, no-latency data from smart objects. We note how the linguistic transformation of binary sensors, where each $V_r$ represents a crisp state such as hall light is on, takes on a fuzzy evolution when they are aggregated with temporal terms, as in hall light is just now on.
Regarding the fuzzy information processing, we note the relevance of the fuzzy aggregation in smoothing the degrees of the linguistic terms over the course of the scene. In this way, the fuzzy processing provides (i) temporal persistence for crisp binary sensors, such as the ambient sensors, by means of a temporal linguistic term; (ii) smoothing of noise or sawtooth values related to high-frequency data sampling, such as motion or BLE closeness; and (iii) a decrease in the communication burden using time interval subscriptions without losing relevant information.
In addition, the decrease in the communication burden can be tuned depending on the context by adjusting the value of the minimal time interval $\beta_t$ or the α-cut threshold. In Table II, we compare the number of raw data produced by each sensor with the number of events generated by computing linguistic protoforms when their degree exceeds the α-cut. In this way, these data represent the reduction ratio over the data/event stream that we obtain when distributing events with the degree of linguistic protoforms instead of raw sensor data. In addition, we have included different α-cut values for subscribers (α = 0, 0.5, and 0.9) in order to show their influence on the reduction ratio. We highlight the reduction in sensors with a high sample rate, where the effect of the fuzzy aggregation process with α-subscriptions is very significant. However, in binary sensors, the temporal persistence of fuzzy linguistic terms increases the number of received events, but it provides an aggregated membership degree of linguistic terms that is richer than a single sample.
Table II. Sensor, raw data from sensor, and ratio reduction of α-subscribers.

CONCLUSIONS AND FUTURE WORK
In this work, we propose the use of variables and linguistic terms to represent and process the stream data from heterogeneous sensors. We relate the linguistic terms to protoforms in which the temporal representation is highlighted. These linguistic expressions provide the advantage of interpretability and understanding close to human knowledge over data or semantic structures focused on computer processing. At the same time, we propose two models for computing the aggregation degree of a data stream based on the variable and terms included in the protoform instances.
In addition, the proposed model has been deployed under the Real-time Publish-Subscribe model, enabling subscribers to define the protoforms, the aggregation operators, the α-cut, and the minimal time interval $\beta_t$ between notifications. In the case study, it provides an encouraging stream reduction from raw data to high-rate asynchronous events with $\beta_t$ = 1 s. The data reduction reaches a ratio of up to 1/50, depending on the α-cut.
The fuzzy linguistic approach presented in this work develops descriptions close to the language used by human users, but in other contexts, they may need to be complemented with structured data or statistical summaries. For example, a heart rate monitoring system could include as input the aggregation degree of the protoform instance heart rate is awhile high, but it would be naive to assume that the average, maximum, or standard deviation of the heart rate will not be required in higher layers of information processing.
A useful future extension of this work is the development of fusion subscribers that integrate heterogeneous linguistic terms from different sensors and match them against fuzzy IF-THEN rules. This kind of subscriber would generate new high-level knowledge or activate alert processes based on the inference of expert knowledge, for example, IF heart rate is awhile low AND oxygen level is awhile scarce THEN send supervisor SMS.
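As a rough illustration of what such a fusion subscriber might compute, the Python sketch below evaluates one rule over the latest degrees received for each protoform. Everything in it is an assumption made for the example: the function name `rule_fires`, the use of the minimum as the fuzzy AND, the firing threshold of 0.5, and the degrees themselves.

```python
def rule_fires(degrees, antecedents, threshold=0.5):
    """Evaluate a fuzzy AND (minimum) over the antecedent protoform degrees.

    degrees: latest degree received for each protoform instance.
    Returns the firing strength and whether it reaches the threshold.
    """
    if not antecedents:
        raise ValueError("a rule needs at least one antecedent")
    # A protoform with no received degree counts as 0 (not satisfied).
    strength = min(degrees.get(p, 0.0) for p in antecedents)
    return strength, strength >= threshold


# Illustrative degrees only; they are not results from the case study.
latest = {
    "heart rate is awhile low": 0.8,
    "oxygen level is awhile scarce": 0.6,
}
rule = ["heart rate is awhile low", "oxygen level is awhile scarce"]
strength, fired = rule_fires(latest, rule)
```

Because the inputs are already aggregated degrees, a subscriber of this kind would not need access to the raw streams of either sensor.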