On the Imputation of Power System Measurement Streams with Imperfect Wireless Communication

Dependable measurement data are essential for the accuracy and integrity of power system state estimation and actuation. However, distribution grids empowered by wireless connectivity are subject to missing sensor observations due to channel stochasticity. In this paper, we leverage the potential of dynamical models to extract knowledge from the ambient measurement space under unfavorable and ideal network conditions. A rigorous assessment of various network configurations based on empirical evaluations reveals meaningful performance trends for model fitting and recovery of incomplete trajectories. In particular, while the imperfect connectivity may curtail the ability of dynamical models to exploit the intrinsic spatio-temporal structure, measurement imputation under ideal network configuration exhibits favorable performance. Our research outcomes highlight the impact of the underlying wireless connectivity on measurement data acquisition and handling in distribution grids.


I. INTRODUCTION
In recent years, key investments in power systems have remarkably improved the quality of their associated communication and information infrastructure [1]. Dependable measurement data streams, provided by phasor measurement units (PMUs) [2], in conjunction with expanded computational resources and effective techniques for knowledge extraction drive a paradigm shift in monitoring and control. Instrumental to this evolution is the advent of wireless communication (i.e., 5G and beyond) networks offering ubiquitous coverage, low end-to-end latency, high throughput, and secure connectivity among system components [3]. Nevertheless, their utilization, mainly in distribution grids, comes inadvertently with challenges which involve distortions and missing data due to the unreliable nature of the shared wireless medium. Addressing the degradation of received data is essential for high fidelity estimation of state variables and informed system control.
Power system data streams consist of spatially co-evolving time-series, characterized by the key conceptual traits of physical context and temporal smoothness [4]. Such data stem from sensor installations which exist at disparate spatial locations and report several local electrical quantities. The underlying power grid which comprises the physical process giving rise to the collected measurements induces contextual information, while temporal smoothness is associated with the inherent correlation among successive recordings of the same stream. The aforementioned notions elucidate the hypothesis that the exploitation of spatio-temporal cross-correlations among measurements holds the promise of extracting the underlying dynamics which govern their behavior. This is done, among others, in an effort to curtail the loss of data quality attributed to communication channel imperfections.
The problem of missing data recovery in time-series can be addressed by a multitude of strategies; at its foundational level, the following schemes are discussed. The most intuitive approach is conventional linear, cubic, or nearest neighbor, among others, interpolation. Such a scheme, albeit exploiting the temporal smoothness of a single data stream and resonating computational efficiency, fails to incorporate information leveraged from correlations among different measurement sources [5]. This cogently justifies the utilization of methods based on singular value decomposition (SVD). Notably, missing values can be inferred by applying an iterative technique based on consecutive low-rank decompositions [6]. The prototypical hallmark of SVD-based approaches constitutes the ability to deduce linear correlations among measurement trajectories and, thus, reconstruct missing values in a data stream from the rest. It is to be noted that the existence of invariances among low-rank embeddings and the potential presence of transient content pose perils which may reduce the effectiveness of such schemes [7]. A method based on matrix factorization for dealing with missing entries in highdimensional time-series is presented in [8], with the ability to learn and incorporate temporal dependencies. Dynamical systems, capable of exploiting spatio-temporal correlations among data streams, can be utilized in an iterative strategy for estimating missing values [5].
In the framework of power systems, several works have addressed the problem of missing data estimation. In [9], occluded data among PMU streams are recovered through the utilization of a low-rank matrix completion scheme, which has the ability to be deployed as a real-time application. Simultaneous reconstruction of missing values and correction of bad data is proposed in [10] by employing the Hankel matrix of PMU measurements and its low-rank feature. By modeling measurement data as samples from a Gaussian random process with known covariance matrix, Bayesian singular 978-1-7281-6264-5/20/$31.00 ©2020 IEEE value thresholding is introduced in [11] as an efficient, iterative approach to data recovery in low voltage distribution systems, which exhibits the ability to estimate the optimal threshold in each iteration. A real-time missing data estimation strategy based on matrix completion with penalization of singular values by means of Schatten-q quasi-norm is outlined in [12]. An online scheme for learning parameters of hidden Markov models with contextual information is employed in [4], aiming at the recovery of missing PMU data.
There are two main desiderata in the framework of this paper. First, in an effort to regain perspective on the mechanisms which causally induce occlusions, we investigate the lingering influence of communication impairments to the emergence of missing data among measurement streams at the level of a fusion center. In particular, we explore the impact of various network configurations on the quality of the received measurement data with the aid of a tractable expression for the outage probability. Second, we examine the ability of interpretable dynamical systems i) to fit the observed data at the fusion center in the presence of less favorable network connectivity conditions; and ii) to impute missing values under ideal decoding of received measurements. It should be underlined that this work is not inclined towards the identification and correction of bad data among decoded measurement values. It is rather a first step towards the design of appropriate connectivity enablers which allow for decoded values with meaningful spatio-temporal correlations.
Material in this manuscript is organized as follows. The employed network model and the respective probability of sensor connectivity outage are described in Section II. The utilized scheme for data imputation is outlined in Section III. Results pertaining to the performance of model fitting and quality of missing data reconstruction based on numerical simulations under imperfect and ideal decoding conditions, respectively, are presented in Section IV. Section V is reserved for conclusion and discussion of the path forward.

II. NETWORK MODEL
We consider a power grid monitoring scenario composed of a fusion center and a number of deployed sensors. Each spatially allocated sensor captures a multitude of electrical quantities depending on its functionality. For instance, a PMU is typically capable of reporting voltage and current magnitudes and phases, frequency, and rate of change of frequency, while a remote terminal unit associated with conventional supervisory control and data acquisition (SCADA) infrastructure may provide measurements of voltage and current magnitudes, as well as power flows and injections.
A cellular network topology is considered where the sensors, equipped with radio interfaces, transmit their measurements to the fusion center which is co-located with their associated base station (BS). A closest BS association scheme is adopted in this paper. We employ a stochastic geometry framework for the positions of the BSs and sensors to facilitate the modeling of network topology and the tractable analysis of uplink sensor data transmission [13]. In particular, we assume that BSs are distributed on the plane according to a homogeneous Poisson point process (PPP) Ψ b of intensity λ b , while sensors are spatially distributed according to a homogeneous PPP Ψ u = {u k ; i = 1, 2, 3, . . . } with intensity λ u . We further assume saturated traffic conditions, i.e., sensors always have data to transmit, and λ u is considered high enough such that each BS will serve at least one sensor per channel.

A. Channel Model and Power Control
A generic power-law path-loss model is considered where the signal power decays with the propagation distance r at the rate r −α , where α denotes the path-loss exponent. In addition to path-loss attenuation, we assume a Rayleigh fast fading model for a tractable representation of the random channel effects 1 . The channel power gains, denoted by g, are assumed to be independent of each other and of the spatial locations, and identically distributed (i.i.d), following an exponential distribution with unit mean. Let p k denote the transmit power of sensor k. All sensors are considered to have equal maximum transmit power p max and employ full channel-inversion power control to compensate for the path loss. That is, each sensor controls its p k such that the average received signal power at the serving BS is equal to a threshold ρ. The threshold ρ constitutes a network design parameter and takes values greater than the receiver sensitivity ρ min .

B. Outage Probability
An uplink sensor transmission is considered to be successfully decoded at the serving BS if and only if the signalto-interference-plus-noise ratio (SINR) of the useful signal is greater than a certain threshold γ th . Based on the stationarity property of the PPP [14], the SINR experienced at the origin BS for a target sensor (k = 0) can be expressed as where the numerator corresponds to the useful signal power and σ 2 is the noise power. The second term in the denominator of Eq. (1) denotes the aggregate intra-cell interference conditioned on the number of neighboring sensors N s , while the third term pertains to the inter-cell interference from other uplink sensor transmissions on the same timefrequency resource. For analytical tractability, the locations of the interfering sensors are considered to follow a PPP with intensityλ u and their transmit powers p k are independent. Based on the SINR analysis conducted in [15] and for α = 4, the SINR outage probability, P o = P(SINR < γ th ), for a sensor transmission can be calculated as where µ = cλ b cλ b +λu and c = 3.575 is a constant related to the Voronoi tessellation in the R 2 . The impact of γ th and λ b on the SINR outage probability is illustrated in Fig. 1. It can be observed that a high SINR threshold may severely degrade the achieved performance leading to a high number of missing sensor measurements. In addition, a lower BS intensity further exacerbates this detrimental effect. Due to the imperfect communication, the received sensor measurements at the fusion center, co-located with the serving BS, can be represented by a partially observable time sequence Z = [z 1 , z 2 , . . . , z T ], where each vector z t contains the received measurements at time-step t from the deployed sensors 2 . By incorporating the outage probability P o (given by Eq. (2)), the entries z t,k of z t can be expressed as where x t,k denotes the transmitted measurement from sensor k at time-step t and υ t,k is a zero-mean white Gaussian noise with covariance matrix Ω. It is noted that Eq. (3) takes into account the impact of the wireless channel lossy nature on sensor transmissions which may result in a received vector z t with intermittent measurements. In order to impute the unavailable sensor observations, the conditional expectation of the missing measurements with respect to the observed ones over the time-frame from t = 1 to T needs to be estimated.

III. MODEL-BASED HANDLING OF MEASUREMENT DATA STREAMS A. Dynamical Systems
The utilization of dynamical systems as an interpretable scheme for gaining insight into the behavior of measurement streams stems from their ability to extract knowledge from the spatio-temporal synergy among the respective trajectories. A linear dynamical system (LDS) [16] constitutes the foundational block upon which more sophisticated models, such as the switching linear dynamical system (SLDS) [17] and the recurrent switching linear dynamical system (rSLDS) [18], are built; each of these systems can be graphically described by a Dynamic Bayesian Network (DBN), as shown in Fig. 2.
In the case of LDS, the governing dynamics are captured by latent variables h t , which exhibit smaller dimensionality than z t , through linear mappings. According to the mathematical formulation outlined in [5], the equations which describe an LDS are shown in Eqs. (4) and (5), as follows where the former expresses temporal correlations among latent variables of measurement streams and the latter captures spatial interactions among different measurements in the same time-step. Gaussian transition and observation noises are assumed, in order to account for the variation in observed and latent variables caused by sundry factors which cannot be predicted from the deterministic transitions and emissions. Based on the factorized form of Fig. 2, the joint distribution of latent and observation variables is expressed as The parameters which need to be estimated consist of the entries in linear mappings F and G, as well as noise covariances Λ and Σ. The latter are assumed to constitute diagonal matrices. Term p ( h t+1 | h t ) in Eq. (6) can be expressed as The remaining terms can be straightforwardly deduced. Learning parameters θ = {F, Λ, G, Σ} can be achieved by means of Expectation-Maximization algorithm [16] or in a Bayesian framework using blocked Gibbs sampling by setting conjugate prior distributions over all parameters [19].
Since it is plausible that measurement trajectories may not be sufficiently summarized by a single LDS due to temporal structural changes, the use of SLDS which switches among a predetermined number of LDS can be made [17]. Discrete latent variable s t serves as the switching enabler for each respective LDS, which is now characterized by its unique parameters F st , G st , Λ st , and Σ st . It is common that parameter sharing pertaining to emission matrix G st and covariance matrix Σ st among the different LDS is imposed. Finally, SLDS is extended to rSLDS through the explicit incorporation of a dependence of discrete hidden variable s t+1 on continuous latent variable h t , which introduces non-Gaussian potentials ψ t,t+1 (h t , s t+1 ) = p (s t+1 |h t ) with further implications in the sampling of hidden variables and Bayesian parameter learning [18]. It should be stated that the increased expressive power of SLDS and rSLDS comes with the cost of elevated computational complexity which may pose a challenge for real-time applications.

B. Iterative Scheme for Missing Measurement Imputation
We adopt the general framework of [5] for iterative missing values imputation. The main differentiation is that, instead of employing an LDS with parameter learning by means of Expectation-Maximization algorithm, we substitute it with an LDS based on Bayesian parameter learning, as well as an SLDS and an rSLDS.

A. Settings
Data provided by 6 PMUs as part of the EPFL Smart Grid project [20] are utilized as input to this work. From each PMU, we retain 3 measurement streams which correspond to three-phase voltage magnitudes. For the simulation setup of the network topology, we consider a PPP for the BSs and their co-located phasor data concentrators (PDCs) with intensity λ b , while the PMUs are randomly spread over the area until all PDCs are associated with at least one PMU, i.e., to achieve traffic saturation conditions. Regarding resource allocation, a dedicated set of uplink time-frequency slots has been considered for the PMU transmissions. Missing data stem from either the lossy PMU transmissions which may result in data decoding failures at the PDC, as indicated by Eq. (3), or the creation of synthetic occlusions due to a denial-of-service attack or PMU hardware failure. For all experiments, we have utilized the respective codes which are provided in [21] and [22] for LDS, SLDS, and rSLDS. Each experiment was repeated 20 times and we averaged the achieved performance. Reported results in this section stem from the best average performance achieved by the union of the aforementioned dynamical systems in the examined case scenarios.
To quantify the quality of model fitting and data imputation, the root mean squared error (RMSE) is employed as where z t,k andẑ t,k pose a dual role; in the case of examining model fitting, they correspond to the decoded measurements and predicted values by the dynamical model, respectively, at each time-step t where observed values exist, while for quantifying the quality of data imputation they pertain to the ground truth of occluded values and the imputed values, respectively, at each time-step t where missing values exist. For a given number of measurement streams M and timeseries segment duration T , the dimensionality of continuous latent variables h t and the highest integer value of discrete latent variable s t comprise hyper-parameters which need to be judiciously tuned. The performance of dynamical models is contingent upon the selection of suitable values for the aforementioned quantities. Considering computational resources, we carry out a limited hyper-parameter search where the dimensionality of h t and the maximum value of s t may take values in {8, 10, 12, 14, 16} and in {2, 3, 4}, respectively, for 10 randomly selected time-series segments.

B. Model Fitting Performance with Lossy PMU Transmissions
In Figs. 3-6, we illustrate the model fitting performance in terms of RMSE for different network configurations. Our goal is to shed light on the impact of various network parameters, as defined in Section II, with respect to the ability of dynamical models to summarize the decoded measurement stream. It is worth noting that, besides the detrimental effect of missing observations due to the PMU connectivity outage, the imperfect decoding accuracy of the observed PMU data also affects the model fitting performance. Fig. 3 illustrates the RMSE performance with respect to the SINR threshold γ th . It can be observed that the RMSE levels increase with increasing γ th , as the aggregate interference leads to significant signal degradation and results in a high number of PMU links in SINR outage (as also shown in Fig. 1). Subsequently, the increased outage probability in the high γ th regime, impacts the achieved RMSE and leads to degraded performance in the model fitting process. Fig. 4 depicts the evolution of the RMSE performance for varying BS intensity λ b . Note that each PDC is considered to be co-located with its respective BS, therefore λ b also represents the PDC intensity. We observe that model fitting performance improves (i.e., the RMSE decreases) with increasing λ b . This can be intuitively explained as follows. When the BSs/PDCs are dense enough, the distance between a generic PMU and its corresponding serving BS/PDC decreases, resulting in higher quality of the received measurements and lower outage probability levels.   5 depicts the RMSE performance with varying power threshold ρ. It can be observed that RMSE levels decrease with increasing ρ, as a higher received-power threshold at the BS leads to lower SINR outage (for relatively low interference levels, as in our case) and lower probability for lost PMU observations. Finally, Fig. 6 shows the impact of the noise power σ 2 on the model fitting performance. As expected, when σ 2 values turn from negligible to significant, the received SINR levels of PMU measurements deteriorate and the RMSE slightly increases along with the SINR outage levels.

C. Reconstruction Performance with Denial-of-Service or PMU Failure
In contrast to the previous subsection, we now assume an ideal network configuration with high data-decoding accuracy for the PMU transmissions. Synthetic dropouts can be made by uniformly selecting space-time points for occlusion, which can be attributed to the effect of denial-of-service or PMU random hardware failure. A PMU hardware failure is often associated with the ability to capture or transmit a portion of its associated measurement streams. It is to be noted that the RMSE recorded for the mismatch between decoded measurements and their re-   spective predictions by the dynamical systems exhibited values between 0.14 and 0.16 for all experiments conducted in this subsection. This is in stark contrast to the level of mismatch registered in subsection IV-B, which can be explained by the fact that the dynamical systems are capable of achieving a fairly good fit to the observed entries of measurement streams under ideally decoded transmitted data.
In Table I, the performance of imputing missing values among random measurement streams and time-steps is shown. In this case, dynamical systems are capable of drawing insight from the perfectly decoded measurement values to make valid inference for the missing data. Imputation performance expectedly registers a decline with rising percentage of missing entries in the fusion center, albeit not at prohibitive levels. Fig. 7 has been devised by conducting two types of experi- ments, investigating primarily the effect of ensuing occlusions: (a) we randomly opted for 1, 2, and 3 measurement streams to exhibit consecutive missing values of varying length starting at a uniformly distributed common point in time, and (b) we opted for all streams to simultaneously report missing values. The visual assessment of Fig. 7 suggests that reconstruction of missing values in the case of experiment (a) does not suffer from the limitations imposed by potentially unfavorable network configuration parameters. The imputation mechanism takes advantage of spatio-temporal correlations based on the perfectly decoded measurement streams which are consistent with the underlying physical process. Nevertheless, even under such ideal network conditions, the recovery of missing data in experiment (b) suffers from the fact that in the absence of any measurement stream for an extended period of time there is non-existent spatial information in each time-step for the dynamical model to exploit. Moreover, the temporal information is limited to the relatively distant time-steps with recorded measurement values. Hence, the predicted values are mainly influenced by the outcome of linear interpolation from the initial stage of data imputation process, showing significant mismatch with respect to the ground truth values.

V. CONCLUSION
In this paper, we explored the ability of dynamical systems to mine received measurement streams under imperfect and perfect data decoding conditions. It is deduced that when network parameters are such that significant streamwise distortions are present among the decoded values, the utilized dynamical systems are not capable of capturing the underlying spatio-temporal interactions and, thus, the validity of estimating missing values is curtailed.
In the path forward, we will direct our efforts towards the design of appropriate connectivity enablers for high-quality received measurement streams whose potentially missing values are primed for reconstruction. It is to be noted that since the process of imputation poses elevated merit as a realtime application, the design of network solutions will consider dynamical models which can handle measurement streams of short duration and high dimensionality with guarantees against overfitting.