Probabilistic Low-Voltage State Estimation Using Analog-Search Techniques

Power systems are becoming more complex, and the need for increased awareness at the lower voltage levels of the distribution grid requires new tools that provide a reliable and accurate estimation of the system state. This paper describes an innovative state estimation method for low voltage (LV) grids that analyses similarities between a real-time snapshot, comprising only a subset of smart meters with real-time communications, and fully observed system states present in historical data. Real-time estimates of voltage magnitudes are obtained by smoothing the most similar past snapshots with a data-driven methodology that does not rely on full knowledge of the grid topology and electrical characteristics. Moreover, the output of the LV state estimator is a conditional probability distribution obtained with kernel density estimation. The results show highly accurate estimation of voltage magnitude, even in a scenario characterized by a strong integration of photovoltaic (PV) microgeneration.


I. INTRODUCTION
Presently, the nature of LV grids is changing, demanding a paradigm shift in terms of monitoring and control. Distribution System Operators (DSO) are installing advanced metering infrastructures that bring more data and monitoring capacity, but these data are, in general, only processed in batch mode once per day. Due to technical and economic constraints, real-time monitoring of the voltage magnitude in LV grid nodes is not readily available. On the other hand, the growing penetration of renewable energy resources and storage units, including electric vehicles, demand-side-management strategies and microgrids with independent controllability and islanding capabilities, requires new techniques to increase the situational awareness of human operators and enable real-time decision making [1], [2].
Considering all these challenges, the operation of distribution systems can become more efficient only if new operational methods and tools are developed. A tool of unquestionable value for this purpose is a state estimation (SE) algorithm suitable for LV grids. Such a tool will help the DSO monitor and operate the grid in quasi-real time, similarly to what already happens in transmission grids.
By taking advantage of the large volumes of data provided by the advanced metering infrastructure, it is possible to identify patterns and capture correlations in such a way that, even with a small subset of real-time data sources, a quite accurate estimation of the system's state can be obtained. LV grids often lack topology and equipment characterization. In this context, having a data-driven state estimator based exclusively on historical data and a small subset of smart meters with real-time communication of active power and voltage measurements is very advantageous. For instance, the results from a demonstration campaign of real-time monitoring of an LV grid conducted in the FP7 European project IDE4L showed the following benefits: detection of current imbalances; identification of local voltage deviations; and support for the implementation of decentralized control strategies to improve power quality [3].
Several works have tried to offer a solution to deal with the changing paradigm in the operation of distribution grids. However, traditional approaches do not properly accommodate the different acquisition rates of analog and digital devices. One very important source is the phasor measurement unit (PMU), which has been tentatively included within the traditional state estimation algorithm [4], ignoring analog-based measurements [5], or in a linear post-processing step [6], although with unavoidable information losses. The new dynamics found at the distribution grid level weaken the typical quasi-static scenarios, and some works explore the time-varying nature of the variables with Kalman filter-based estimators [5], [7], [8], although with limitations due to the process noise covariance matrix.
Additionally, a single outlier can severely contaminate all the data, requiring criteria other than the traditional weighted least squares, for example the correntropy in [9] and the LWS (least winsorized square) in [10]. Notwithstanding, the application of Newton's method to solve state estimation formulations has proved to be numerically unstable, leading to local optima or even failing to converge. This kind of approach becomes even less appealing when one takes into account the lack of characterization of the LV grid, the absence of good levels of redundancy within the measurement dataset and the limited observability to perform real-time estimation. Data-driven methods can overcome these limitations and extract value from smart meter data.
In [11], a data-driven SE is performed over distribution grids using auto-associative neural networks, the autoencoders (AE), which only require a historical database and a few real-time measurements to perform an effective state estimation. Missing quantities are reconstructed using Evolutionary Particle Swarm Optimization (EPSO). The method is very flexible regarding the type of measurements that are used as input, allowing a full exploitation of the available metering infrastructure, and avoids two major drawbacks of conventional approaches: (i) modeling the complex three-phase equations, which may lead to heavy iterative algebraic calculations and numerical/convergence problems and (ii) characterizing all the grid parameters. A similar concept was explored in [12], but using extreme learning machines combined with EPSO. In both, only deterministic estimations of voltages were produced.
In [13], instead of using the results produced by the last state estimation to initialize the state variables in Newton's method, a Bayesian approach based on historical data search via kernel ridge regression is used. After identifying a group of measurement sets with the smallest distances to the current measurement set, the group of sets is used to compute an initial guess for the iterative algorithm that calculates the current state. Another data-driven SE is proposed in [14], where a load/generation forecasting method is used to feed a standard SE algorithm and produce a probabilistic estimation of the node's voltage. In this work, the authors assume knowledge of active power in the MV/LV substation, and the proportions to allocate that load per LV client are calculated from the previous day's smart meter data. The method shows the following limitations: (i) balanced loads are assumed across the three phases, and are represented as constant active/reactive power demands with no voltage dependency; (ii) probabilistic estimations of voltage are calculated by considering the unconditional historical error distribution of load forecasting.
The work of the present paper was inspired by the approach developed in [15] to generate aggregated wind power forecasts, based on the search for similarities between current wind speed forecasts and historical wind speed forecasts. The method was extended to low voltage state estimation, and this paper produces the following original contributions:
• Real-time state estimation in LV grids is performed using a very limited number of real-time metering points, although assuming that smart meter voltage measurements are sent to the historical database periodically (e.g. every 24 hours or every week).
• In contrast to [14], the estimation results express the conditional uncertainty involved, in the form of a set of quantiles. This feature is particularly relevant to increase the awareness of the human operator by defining probabilistic alarms for the occurrence of under/overvoltages.
• Grid observability, in this case defined by the number of nodes for which there is historical data or real-time measurements, is not required. Since the SE is applied to each node individually, the existence of a portion of the grid without real-time telemetry or smart meters does not prevent the execution of the estimator.
• Neither topological information nor the electrical characteristics of the grid elements are necessary. Nonetheless, knowing the phase to which each measured value corresponds improves the estimation. This also contrasts with [14], which applied a standard SE and assumed balanced phases.
• Weather measurements/forecasts (exogenous variables) can be included to improve the estimation, benefiting grids with a strong presence of renewable energy resources (even if under self-consumption schemes). Compared to the state of the art, this work is the first to include information about weather in SE.
In summary, this work describes a novel data-driven probabilistic LV state estimator (LVSE), which takes advantage of information collected by the smart grid infrastructure, and includes exogenous information like weather forecasts and calendar information (hour of the day, day of the week, etc.).
The remainder of this paper is organized as follows: section II discusses the motivation for a data-driven state estimation in LV grids; section III describes the methodology to derive point and probabilistic voltage estimations; section IV presents the results for an LV grid test case; finally, section V presents the conclusions and future work.

II. MOTIVATION FOR A DATA-DRIVEN APPROACH
As previously stated, LV grids present different challenges that undermine the use of classical approaches to the SE problem. For instance, real-time metering is not possible due to technical-economic reasons, and the electrical characteristics of grid elements are often unavailable (or contain gross errors). On the other hand, a detailed analysis of some of the particularities of these grids makes it clear that patterns and dependencies between electrical quantities in different nodes are strong, and this can justify a data-driven approach. Figure 1 depicts the temporal variation of the voltage magnitude in a node's phase of a typical LV grid, where the daily patterns are visible. This periodic behavior is a good indicator that there is valuable information in past voltage measurements (serial dependency).
Moreover, the common radial structure of these grids creates identifiable dependencies between the voltage magnitudes across the grid. Figure 2 presents the voltage magnitude dependency between two different nodes of the same LV grid. The left picture shows a strong, linear dependency between the voltages when the nodes are connected to the same phase. Still, even when they are connected to different phases (right picture), strong dependencies can be detected.
This work assumes that the LV grids have smart meters capable of gathering voltage magnitude measurements every 30 minutes (at least) and sending them periodically to the DSO control center or to a data concentrator located in the MV/LV substation. In this context, it is possible to construct a historical database to feed the data-driven method. Additionally, the existence of some real-time telemetry is assumed for a subset of meters.
Exogenous variables, like calendar variables (e.g., hour of the day, day of the week) and weather measurements or forecasts, can also contain relevant information to infer an online (or real-time) snapshot of voltage profiles.

III. LOW VOLTAGE STATE ESTIMATOR (LVSE)
A. General Framework
The basic principle behind the proposed LVSE is to search for analog voltage events in the historical dataset, using a set of explanatory variables, and extrapolate the current operating state from past information. This naturally relies on information collected online (or in real time) from a subset of smart meters, but also explores other types of information, mainly related to the autocorrelation of the voltage time series and the influence of weather and calendar variables on the load patterns that result in voltage variations along the day.
The analog-search procedure is described in section III.B and Figure 3 illustrates a set of potential explanatory variables. In brief, the method explores recent and current measurements collected by the subset of smart meters with real-time communication and by the MV/LV substation meter, together with voltage observations from the previous day and from all the meters installed in the LV grid and MV/LV substation. Information about the most recent numerical weather predictions (NWP), like global horizontal irradiance or ambient temperature, can also be integrated in the model. The same is valid for measurements collected by a weather station. Information about demand response actions or dynamic price signals is another potential explanatory variable, if available.
The outcome is a deterministic estimation (i.e., expected value) of the voltage magnitude in the smart meters without real-time communication, which, combined with the other meters, provides a real-time snapshot of the system state. The lack of full observability leads to uncertainty in the estimated variables, which is also conditional on the current operating conditions of the grid (e.g., level of PV generation, observability). Section III.C describes a methodology based on kernel density estimation (KDE) to derive a conditional and non-parametric uncertainty estimation. This statistical method requires a set of hyper-parameters that need to be estimated offline (to avoid a "flat start") and online (to adjust to changes in the grid structure and measurements). The model's tuning process is described in section III.D.

B. Deterministic State Estimation Formulation
This estimation methodology relies on the idea that information regarding the current state of the system can be used to quantify how analogous a given known past state is [15]. When running, the state estimator searches for similarities for each node's phase $n$ and computes the estimated voltage for the current instant $t$, $\hat{V}_{t,n}$, as a weighted average of the voltages observed at past instants $h$ (Eq. 1):

$$\hat{V}_{t,n} = \frac{\sum_{h} w_{t,h,n}\, V_{h,n}}{\sum_{h} w_{t,h,n}} \quad (1)$$

The smoothing coefficients, $w_{t,h,n}$, are in this method calculated considering:
i) The distance $d_{t,h,n}$ between the explanatory variables at instant $t$ and at each instant of the past, $h$. In this work, we considered the absolute distance (Eq. 2) for the explanatory variables contained in vector $\mathbf{x}$, with components $x_{\cdot,k}$:

$$d_{t,h,n} = \sum_{k} \left| x_{t,k} - x_{h,k} \right| \quad (2)$$

ii) The bandwidth $b_{t,n}$ that defines the selection window of data according to the distances. Here, it is computed as a percentage $p$ (tuning parameter) of the range of distances between $d_{\min}$ and $d_{\max}$ (Eq. 3):

$$b_{t,n} = \frac{p}{100}\,\left( d_{\max} - d_{\min} \right) \quad (3)$$

Other alternatives to the range of distances could be chosen, like the median or the mean distance [15].
iii) A function that weights the past instants according to their distances and within the bandwidth (Eq. 4), where $\alpha$ is a tuning parameter, $\mu$ is the center of the distribution of distances, $a$ is the age in hours of the selected historical MV/LV substation element and $\lambda$, with $0 < \lambda < 1$, is the forgetting factor. The value of $\alpha$ regulates how local the weighting is: the larger its value, the more localized the model is. The objective is to give the largest weights to the nearest observations. The age-weighting coefficient gives the algorithm a higher capacity to adapt to possible topology changes by favoring more recent observations over older ones.
The estimation methodology is rather straightforward though it requires tuning some hyper-parameters.
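The analog-search steps of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `analog_estimate`, the parameter names `p`, `alpha` and `lam`, their default values, the Gaussian-shaped weighting, and the choice of the smallest distance as the center of the distances are all hypothetical, since the exact weighting function is not reproduced here. The bandwidth is passed as a fraction rather than a percentage, and the explanatory variables are assumed to be already scaled to comparable ranges.

```python
import math

def analog_estimate(x_now, history, p=0.5, alpha=20.0, lam=0.999):
    """Estimate the current voltage of one node phase from analog past states.

    history: list of (x_past, v_past, age_hours) tuples.
    p, alpha, lam: bandwidth fraction, locality parameter and forgetting
    factor (hypothetical names and default values).
    """
    # Absolute (L1) distance between current and past explanatory variables.
    dists = [sum(abs(a - b) for a, b in zip(x_now, x_past))
             for x_past, _, _ in history]
    d_min, d_max = min(dists), max(dists)
    # Selection window: a fraction of the range of distances.
    band = p * (d_max - d_min)
    if band == 0.0:
        band = 1.0  # all analogs equally distant: only the age weighting acts

    weights = []
    for d, (_, _, age) in zip(dists, history):
        if d - d_min > band:
            weights.append(0.0)  # outside the selection window
        else:
            # ASSUMED weighting: Gaussian-shaped decay centred on d_min,
            # multiplied by the age-based forgetting term lam ** age.
            local = math.exp(-alpha * ((d - d_min) / band) ** 2)
            weights.append(local * lam ** age)

    total = sum(weights)  # > 0, since the nearest analog is always selected
    return sum(w * v for w, (_, v, _) in zip(weights, history)) / total

# Made-up history: ([real-time voltage in p.u., hour of day], voltage, age in hours)
history = [
    ([0.99, 14.0], 0.985, 24.0),
    ([1.01, 14.0], 1.004, 48.0),
    ([1.03, 13.0], 1.021, 72.0),
    ([0.97, 20.0], 0.962, 96.0),
]
v_hat = analog_estimate([1.00, 14.0], history)
print(round(v_hat, 4))
```

Because the estimate is a weighted average with non-negative weights, it always lies between the smallest and largest selected past voltages, which is one reason the method cannot extrapolate beyond what is present in the historical data.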

C. Probabilistic Reconstruction of the System State
In order to reach a higher level of awareness and provide insights regarding the confidence in the results of the SE, the KDE method is applied to derive conditional probability density functions (pdf) for the estimated variables (with no assumptions on the shape of the conditional distributions).
The KDE is a memory-based learning method that estimates an unknown density function by smoothing out the observations. The flexibility arising from its non-parametric nature makes KDE a very popular approach for data drawn from a complicated distribution [16].

Let $X_1, \ldots, X_N \in \mathbb{R}$ be an independent, identically distributed random sample from an unknown distribution with density function $f$. The KDE can be written as

$$\hat{f}(x) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left( \frac{x - X_i}{h} \right)$$

where $K : \mathbb{R} \to \mathbb{R}$ is a smooth function called the kernel function and $h > 0$ is the smoothing bandwidth that controls the amount of smoothing.
Different kernels are available, like the Gaussian (normal), spherical, Epanechnikov and beta kernels, among others. The bandwidth requires special care: if it is too small, there will be many wiggles in the density estimate; if it is too large, important features can be smoothed out. In fact, the choice of the kernel function generally does not play as important a role as the selection of the bandwidth. In this work, the Gaussian kernel was considered.
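The effect of the bandwidth described above can be illustrated with a small Python sketch of a Gaussian KDE. The helper names `gaussian_kde` and `count_modes`, the voltage values and the two bandwidths are made up for illustration only; counting local maxima on a grid is just a rough proxy for the "wiggles" of the estimate.

```python
import math

def gaussian_kde(samples, h):
    """Return the Gaussian kernel density estimate f_hat(x) for bandwidth h."""
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return f_hat

def count_modes(f, lo, hi, steps=400):
    """Count local maxima of f on a regular grid (a rough measure of wiggles)."""
    ys = [f(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

voltages = [0.97, 0.98, 0.985, 1.00, 1.01, 1.015, 1.03]  # made-up p.u. values

# Very small bandwidth: one narrow peak per observation.
many = count_modes(gaussian_kde(voltages, 0.001), 0.95, 1.05)
# Large bandwidth: a single smooth bump that hides the structure of the data.
smooth = count_modes(gaussian_kde(voltages, 0.05), 0.95, 1.05)
print(many, smooth)
```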
Taking into consideration the conditional construction of the voltage at the current instant from weighting analogous occurrences in its past, as described by Eq. 1, the conditional density function is computed as a weighted KDE in which each past voltage observation contributes in proportion to its smoothing coefficient. The resulting density estimate is used to produce a set of quantiles, which requires the cumulative distribution function (cdf). To obtain the cdf from the estimated pdf, two additional steps are needed. First, the final pdf is determined by properly normalizing the estimated pdf, so that its integral is equal to one. Second, the cdf is obtained by numerical integration of the "normalized" pdf [17]. Once the cdf is estimated, extracting the quantiles is straightforward and computationally cheap. Altogether, the procedure avoids quantile crossing, so the following monotonicity property is satisfied, where $\hat{q}_{t,n}^{(\tau)}$ denotes the estimated quantile with nominal proportion $\tau$:

$$\hat{q}_{t,n}^{(\tau_1)} \le \hat{q}_{t,n}^{(\tau_2)}, \quad \forall\, \tau_1 \le \tau_2$$

D. Hyper-parameters Tuning
As previously mentioned, the LVSE methodology depends on a set of hyper-parameters that should be tuned, namely: the bandwidth percentage $p$, the weighting parameter $\alpha$, the forgetting factor $\lambda$ and the KDE bandwidth $h$. The first two were found to have little impact as long as they are kept above an empirical value ($p > 40$ and $\alpha > 15$). The third one is hard to optimize unless multiple reconfigurations of the grid are tested, hence an educated guess was applied. The fourth deserves special attention since it affects the quality of the probabilistic estimation.
According to Figure 4, an EPSO [18] is run initially over a part of the historical data to tune the bandwidth per feeder. Nevertheless, due to the changing nature of the distribution grid, this parameter is continuously optimized, individually for each node's phase, as the process advances, using the Nelder-Mead dynamic simplex method (DynSimplex) [19]. It should be noted that the calculation of the performance metrics (see section IV.C) requires the real values of the voltage magnitudes. This means that, in every iteration of the DynSimplex, the evaluation of the simplex is applied to a past instant already present in the historical database, in such a way that the last evaluation before an update is applied to the last voltages in the database.
The online approach, on top of the forgetting factor $\lambda$, makes the LVSE fully online and capable of handling concept drift (e.g., change in grid topology, new consumer) in the data generation mechanism.
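To make the role of the KDE bandwidth in the probabilistic output more concrete, the sketch below builds a weighted Gaussian KDE from past voltages, normalizes it, integrates it numerically and reads quantiles off the resulting cdf, following the steps of section III.C. It is a minimal illustration under assumptions: the function name `conditional_quantiles`, the grid size, the trapezoidal rule and the example numbers are hypothetical choices, not taken from the paper.

```python
import math

def conditional_quantiles(past_v, weights, h, taus, grid_size=2001):
    """Quantiles of a weighted Gaussian KDE built from past voltages.

    past_v, weights: analog voltages and their smoothing coefficients.
    h: KDE bandwidth; taus: nominal proportions in increasing order.
    """
    lo, hi = min(past_v) - 5.0 * h, max(past_v) + 5.0 * h
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    # Weighted kernel density on the grid (not yet normalized).
    pdf = [sum(w * math.exp(-0.5 * ((v - pv) / h) ** 2)
               for pv, w in zip(past_v, weights)) for v in grid]
    # Cumulative trapezoidal integration, then normalization so the cdf ends at 1.
    cdf = [0.0]
    for i in range(1, grid_size):
        cdf.append(cdf[-1] + 0.5 * (pdf[i - 1] + pdf[i]) * step)
    total = cdf[-1]
    cdf = [c / total for c in cdf]
    # Invert the cdf: first grid point whose cdf reaches each tau.
    # The index j never decreases, so the quantiles cannot cross.
    quantiles, j = [], 0
    for tau in taus:
        while cdf[j] < tau:
            j += 1
        quantiles.append(grid[j])
    return quantiles

past_v = [0.98, 0.99, 1.00, 1.01, 1.02]   # made-up analog voltages (p.u.)
weights = [0.1, 0.2, 0.4, 0.2, 0.1]       # made-up smoothing coefficients
narrow = conditional_quantiles(past_v, weights, h=0.002, taus=[0.05, 0.5, 0.95])
wide = conditional_quantiles(past_v, weights, h=0.02, taus=[0.05, 0.5, 0.95])
print(narrow, wide)
```

With the same analogs and weights, the larger bandwidth widens the interval between the 5% and 95% quantiles, which is why this parameter directly affects the sharpness and calibration of the probabilistic estimation.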

IV. TEST CASE AND RESULTS
A. Test Case Description
The LVSE tests were conducted using a 33-node LV grid available in [20] and represented in Figure 6, with different levels of renewable generation penetration:
• Scenario no_PV: no presence of PV.
• Scenario med_PV: medium presence of PV (historical data with a ratio between generated energy and consumption of 18%).
• Scenario high_PV: heavy presence of PV (ratio of 36%).
Figure 6 - 33-node LV grid used for testing the methodology. Next to each load and generator is represented the phase where they are connected.
The load time series were taken from a smart metering trial conducted by the Commission for Energy Regulation (CER) in Ireland [21] and the PV generation was collected in Évora, Portugal, in the framework of the SuSTAINABLE project [22]. From this data, an unbalanced power flow algorithm was applied to calculate historical data of voltage magnitude.
The following explanatory variables were considered:
• Hour and week day.
• Irradiance (homogeneous across all PV panels).
• Real-time voltage measurements in all phases of nodes 1, 2, 3 and 4.
• Voltage magnitude measurement in t-24 (previous day).

B. Benchmark Model: Autoencoders
The deterministic version of the LVSE will be compared against a state-of-the-art technique based on Autoencoders (AE) coupled with a metaheuristic to reconstruct missing quantities [11]. Similarly, this technique only requires a historical database and a few quasi-real-time measurements to perform an effective state estimation. The AE method uses the same explanatory variables as the proposed LVSE.

C. Evaluation Metrics
To compare the deterministic results of the LVSE with the AE, the mean absolute error (MAE) and the maximum absolute deviation (MAD) were used. To assess the probabilistic estimations, the evaluation metrics for probabilistic forecasting described in [23] were applied to the SE problem.
Let us consider the indicator variable $\xi_t^{(\tau)}$. Given a quantile estimation $\hat{q}_t^{(\tau)}$ for voltage $V$, with nominal proportion $\tau$, and $V_t$ the current voltage, the indicator is given by:

$$\xi_t^{(\tau)} = \begin{cases} 1, & \text{if } V_t \le \hat{q}_t^{(\tau)} \\ 0, & \text{otherwise} \end{cases}$$

The first metric is the calibration, which measures the mismatch between the empirical probabilities (or long-run quantile proportions) and nominal (or subjective) probabilities. The difference between empirical and nominal probabilities can be called the bias of the uncertainty estimation and is calculated for each quantile nominal proportion (Eq. 9).
The sharpness measures the "degree of uncertainty" of the probabilistic state estimation, which numerically corresponds to computing the average interval size between two symmetric quantiles, e.g., 10% and 90%, centered on the 50% quantile (median).
The two previous metrics can be quantified together in a single scoring rule. Eq. 10 presents the quantile score (QS) metric for the set of quantiles M, which is positively oriented and admits a maximum value of 0 for perfect probabilistic estimations. This metric is a generalization of the "pinball" function used in quantile regression.
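The three probabilistic metrics can be sketched as follows. This is an illustration under assumptions rather than the paper's exact formulation: the function names are hypothetical, the bias is taken here as empirical minus nominal proportion (the sign convention is an assumption), and the quantile score uses a common form from the probabilistic forecasting literature that is positively oriented with a maximum of 0, consistent with the description above.

```python
def indicator(v_obs, q_hat):
    """1 if the observed voltage is at or below the estimated quantile, else 0."""
    return 1 if v_obs <= q_hat else 0

def calibration_bias(observations, quantile_estimates, tau):
    """Empirical minus nominal proportion for one nominal proportion tau.

    The sign convention (empirical - nominal) is an assumption of this sketch.
    """
    hits = [indicator(v, q) for v, q in zip(observations, quantile_estimates)]
    return sum(hits) / len(hits) - tau

def sharpness(lower_q, upper_q):
    """Average width of the interval between two symmetric quantiles."""
    return sum(u - l for l, u in zip(lower_q, upper_q)) / len(lower_q)

def quantile_score(v_obs, quantiles):
    """Score of one instant; quantiles maps nominal proportion tau -> estimate.

    Every term is <= 0, so the score is at most 0 (reached when all
    estimated quantiles coincide with the observation).
    """
    return sum((indicator(v_obs, q) - tau) * (v_obs - q)
               for tau, q in quantiles.items())

# Made-up example: two instants, 10% and 90% quantiles.
obs = [1.00, 0.99]
q10 = [0.98, 0.97]
q90 = [1.02, 1.01]
print(calibration_bias(obs, q10, 0.10), calibration_bias(obs, q90, 0.90))
print(sharpness(q10, q90))
print(quantile_score(obs[0], {0.10: q10[0], 0.90: q90[0]}))
```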

D. Results of the Deterministic SE
Both the LVSE and the AE performed state estimations for a period of 2500 instants with 30-minute resolution. The AE was subjected to an initial training of 1300 instants that preceded the test period. Figure 7 shows a small portion of the deterministic state estimation, comparing both methods. Table I presents the results of running both methods for the three scenarios, showing that the LVSE outperforms the AE in every scenario. Both methods produce better estimations in the presence of more PV integration, which may indicate that the use of the irradiance as an explanatory variable has a positive effect on the estimations. Moreover, the existence of renewable energy close to consumption points has the effect of decreasing the voltage oscillations, thus facilitating the estimation. In terms of computation time, the AE needed 3630 seconds to train and test, while the LVSE runs in 86 seconds.

E. Results of the Probabilistic SE
In contrast to its deterministic version, the probabilistic LVSE first runs an EPSO in offline (or batch) mode to obtain a reasonable value for the kernel bandwidth. For this process, the QS (Eq. 10) was used as the objective function. The real-time probabilistic estimation is then performed for 2500 observations with 30-minute temporal resolution, and all the results presented next refer to this test period. Table II shows the benefits of using the online optimization with the DynSimplex versus keeping constant the bandwidth estimated by the EPSO algorithm. Likewise, the QS rule was used in the DynSimplex method. The results show a considerable improvement of the average (over all smart meters) probabilistic bias (Eq. 9) if this online hyper-parameter tuning is employed.
Figure 8 illustrates the probabilistic voltage estimation for one node in the scenario with high PV integration. The plot shows that the conditional uncertainty intervals are sharp, which is confirmed by Figure 9, which shows a maximum interval width of around 0.02 p.u. This result is expected due to the high dependency between the voltages (as depicted in Figure 2), which leads to low uncertainty even when only 4 meters out of 33 (12% of the meters) have real-time communication. Moreover, it is possible to see that periods with high voltage are characterized by wider uncertainty intervals.
A trade-off between sharpness and calibration is known in the literature. Thus, sharp estimations have a tendency to present a poor performance in calibration. However, as shown in Figure 10, the average probabilistic bias is smaller than 5% in magnitude (it stays above -5%), which is a good result considering the presence of narrow uncertainty intervals. Nevertheless, some LV nodes show a bias close to -12% for some quantiles.
This plot also shows a tendency to underestimate the quantile proportions, which for the lower voltage limit might lead to underestimation of the voltage violation risk, and for the upper voltage limit can lead to overestimation of the risk. In summary, the obtained probabilistic estimations can be used to create probabilistic alarms with small bias in the associated probabilities and also with a small number of false alarms since the intervals are sharp.
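As an illustration of how such probabilistic alarms could be derived from the estimated quantiles, the sketch below raises an alarm when a low quantile falls at or below a lower voltage limit, or a high quantile reaches an upper limit. The function name `voltage_alarms` and the limits of 0.90 and 1.10 p.u. are hypothetical values chosen for the example; they are not taken from the paper.

```python
def voltage_alarms(q_low, q_high, v_min=0.90, v_max=1.10):
    """Flag under/overvoltage risk from two symmetric estimated quantiles.

    q_low: estimated quantile at a small nominal proportion (e.g. 5%).
    q_high: estimated quantile at the symmetric proportion (e.g. 95%).
    v_min, v_max: voltage limits in p.u. (illustrative assumptions).

    If q_low <= v_min, the estimated probability of an undervoltage is at
    least the nominal proportion of q_low; the same reasoning applies to
    q_high and the upper limit.
    """
    return {
        "undervoltage": q_low <= v_min,
        "overvoltage": q_high >= v_max,
    }

# Made-up 5% and 95% quantiles (p.u.) for two node phases.
print(voltage_alarms(0.89, 1.02))
print(voltage_alarms(0.97, 1.04))
```

Since the estimated quantile proportions carry some bias, the nominal risk level attached to such an alarm should be read as approximate.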

V. CONCLUSIONS
This paper proposes an innovative data-driven probabilistic state estimator for LV grids that takes advantage of information collected by the smart grid infrastructure, both historical data and real-time measurements from a subset of meters. It can also include exogenous information like weather forecasts and dynamic price signals. The methodology does not depend on the knowledge of the grid's topology and electrical characteristics, and is not affected by the typical observability issues that affect the traditional state estimators when applied to the distribution grid. Moreover, it is a modular approach where an estimator is applied to each LV node and the set of input features can be easily modified in case of communication failure of some smart meters.
In a first stage, the method's capability of providing deterministic state estimations was compared against a state-of-the-art technique based on Autoencoders. The results showed that it was possible not only to improve the estimation results but also to do so in a fraction of the computational time. The results also showed that it is possible to create sharp and reliable estimations of uncertainty from limited observability in real time. This is particularly relevant to enhance situational awareness and create probabilistic alarms for human operators. An online method was also proposed for hyper-parameter tuning, which resulted in an improved performance.
Future work consists of applying metric learning techniques to improve the robustness of the state estimator and deriving empirical rules for the location of real-time metering devices.