
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 12, NO. 4, AUGUST 2018 593

Power System State Estimation Under Model Uncertainty

Saurabh Sihag, Student Member, IEEE, and Ali Tajer, Senior Member, IEEE

Abstract—This paper considers the general state estimation problem in power systems when the system model is not fully known. Model uncertainty might be caused by a lack of full information about the network model, or by unpredicted disruptions or changes to the grid topology, model, or parameters. This paper focuses on a setting for state estimation in which, besides the nominal model, the system might follow one of a group of alternative models. Including alternative possibilities for the system model introduces a new dimension to state estimation. Specifically, the state estimator needs to detect whether the system model has deviated from its nominal model and, if it is deemed to have deviated, also isolate the actual model. These estimation, detection, and isolation decisions are inherently coupled because isolating the true model is never perfect (due to noisy measurements), and the effect of imperfect isolation transcends the isolation process and affects the estimation routine as well. This paper establishes the fundamental interplay between the detection, isolation, and estimation routines, designs the optimal attendant rules, and provides an algorithm for implementing these rules in a unified framework. The optimal framework is applied to the IEEE 14-bus and IEEE 118-bus system models, and its performance is compared against the existing relevant approaches.

Index Terms—Model isolation, model uncertainty, non-linear systems, state estimation.

I. INTRODUCTION

A. Overview

CONSIDER the problem of state estimation in a power system in which the data collected from the measurement units, $Z \in \mathbb{R}^{m \times 1}$, is leveraged to recover the state of the system, denoted by $X \in \mathbb{R}^{n \times 1}$. When the topology and the model of the system are known perfectly, $Z$ is related to $X$ according to

\[ Z = g_0(X) + N , \tag{1} \]

where $g_0$ is a general non-linear function that captures the system model and $N \in \mathbb{R}^{m \times 1}$ accounts for the measurement noise. In practice, however, the full extent of the model embedded in $g_0$ might not be known perfectly.

Manuscript received October 16, 2017; revised February 19, 2018; accepted March 19, 2018. Date of publication April 16, 2018; date of current version July 27, 2018. This work was supported in part by the U.S. National Science Foundation under Grant DMS-1737976 and the CAREER Award ECCS-1554482. The guest editor coordinating the review of this paper and approving it for publication was Dr. Javier Contreras. (Corresponding author: Ali Tajer.)

The authors are with the Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180 USA (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSTSP.2018.2827322

Furthermore, when the system undergoes unknown disruptions, its model deviates from the known nominal model $g_0$. Motivated by such circumstances, the state estimation routine should be designed to also accommodate the possibility of the actual system model being different from the nominal one. In this paper, we analyze the general state estimation problem in which, besides the nominal model $g_0$, the true model might be one of $p$ other possible models $\{g_i : i \in \{1, \dots, p\}\}$. Under model $i \in \{1, \dots, p\}$, the relationship in (1) changes to

\[ Z = g_i(X) + N . \tag{2} \]

The selection of the nominal and alternative models depends on the context of the disruption under investigation. For instance, when state estimation under possible line outages is of interest, the nominal model is the model of the network in which there is no line outage, and the different possible combinations of line outages constitute the set of alternative models. The set of models in which one line is in outage gives rise to $L$ models ($L$ is the number of lines), while two-line outages constitute $L(L-1)$ models. The number of possible models, and subsequently the computational complexity of estimating the state from (2), therefore grows with the extent of the disruption considered: the number of models $p$ grows linearly with $L$ under the possibility of single-line outages, and quadratically under the possibility of double-line outages. As discussed in Section V, such complexity can, however, be controlled when the structure of the models $\{g_0, \dots, g_p\}$ is leveraged judiciously.
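As a rough illustration of how the model set grows, the following sketch enumerates outage-based alternative models as sets of out-of-service line indices. The function name `outage_models` and the value $L = 20$ are hypothetical choices made here for illustration. Note that counting unordered pairs of lines yields $L(L-1)/2$ double-outage models, half of the $L(L-1)$ ordered pairs; in either convention the growth is quadratic in $L$.

```python
from itertools import combinations
from math import comb

def outage_models(num_lines, max_outages):
    """Enumerate alternative models as tuples of out-of-service line indices.

    Each tuple is one alternative model; the empty outage set (nominal
    model) is deliberately excluded.
    """
    models = []
    for k in range(1, max_outages + 1):
        models.extend(combinations(range(num_lines), k))
    return models

L = 20  # hypothetical number of lines, chosen only for illustration
single = outage_models(L, 1)         # L models
up_to_double = outage_models(L, 2)   # L + L(L-1)/2 models (unordered pairs)
print(len(single), len(up_to_double))
```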

Under the assumption that none of the models in (2) renders the system unobservable, the uncertainty in the true model introduces a new dimension to the state estimation problem. Specifically, since the estimator structure depends on the current system model, forming an estimate necessitates forming a decision about the true model as well. Driven by the premise that the detection routine is never perfect due to the presence of noise, we will show that the detection rule for isolating the true model and the state estimation routine are strongly coupled. This observation motivates the theory presented in this paper for state estimation in a system prone to changes in its nominal model. Two key aspects of this theory are:

1) The quality of the state estimate when facing model uncertainty is inferior to that of a setting with a perfectly known model.

2) There exists an inherent interplay between the quality of the model detection decision and that of the estimate.

1932-4553 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications standards/publications/rights/index.html for more information.


In order to capture this interplay and also quantify the level of degradation in estimation quality, we identify three figures of merit: $q \in [1, +\infty)$ and $\alpha, \beta \in (0, 1]$, where $q$ captures the estimation cost when facing model uncertainty normalized by that of the setting with a known model, and $\alpha$ and $\beta$ are the constraints on the likelihoods of misclassifying the true model when it is the nominal model and when it is not the nominal model, respectively. In this paper, we establish the intertwined characteristic of the state estimation and model detection qualities, and determine the decision rules that minimize $q(\alpha, \beta)$ under the constraints on the detection error probabilities. We also provide a case study based on the IEEE 14-bus system, in which we illustrate the improvement in estimation performance yielded by the optimal decision rules developed in this paper over that yielded by the commonly used methods that decouple the two sub-routines. In those methods, the model is often detected first, and the detection decision is then deemed perfect and leveraged for state estimation.

B. Related Studies

Inference in a system facing model uncertainties has been studied extensively in several domains, including power systems [1]–[14] and control theory [15], [16]. State estimation algorithms for general dynamic systems and power grids are often studied under the assumption of complete knowledge of the system and its parameters (cf. [17]–[19]). However, when the true model of the system deviates from the nominal model (for instance, due to line outages in a power grid), the estimates formed using the state estimation algorithms under the prior assumption on the system model are not accurate. To cope with such model uncertainties, robust estimation routines that can reliably generate state estimates in dynamic systems under given bounds on the uncertainties in the model parameters are analyzed in [14]–[16]. In these studies, the uncertainties in the system model are captured by bounded additive perturbation terms, and guarantees on the worst possible estimation performance are provided.

Inference under model uncertainty in power grids, where the grid topology can change due to line outages, is studied in [1]–[13]. Specifically, outage detection and identification in a power system operating in a quasi-steady state is investigated in [1]–[10]. In [1], an optimal outage detector is developed for a power system that is approximated by a Gaussian linear model. In [2] and [3], the outage detection algorithms to detect single line and double line outages are developed based on the measurements received from phasor measurement units (PMUs) in the network, which are deemed to be accurately known. This assumption, however, may not be valid in practice. In [5], a quickest change detection approach is used for identifying line outages. In [6]–[9], the statuses of branches are treated as state variables and generalized state estimation based techniques are developed to identify topology errors and open circuit breakers. In [10], a computationally efficient scheme is developed to identify the status of circuit breakers, which is followed by the state estimation. Line outage detection in dynamic models is studied in [11] and [12]. In [11], power systems with cascading outages are considered and the expressions for the joint posterior of cascades and system states are developed. Specifically, in [12], the measurements of the power network are modeled by a hidden Markov model and the line outage detection problem is treated as an inference problem. An outage detection framework based on the maximum likelihood detection of outage hypotheses using real-time power flow measurements in a power transmission network with a tree structure is developed in [13].

In this paper, we consider a static power system model, in which the true model can be one of the $p \in \mathbb{N}$ different models distinct from the nominal model. Similar formulations have been studied for power grids in [1] and [11], where the system model can change due to line outages, based on which the outage detection and state estimation problem is posed as a multi-hypothesis testing problem. Specifically, in [1], the variations in the topology due to line outages are modeled as a set of hypotheses and a closed-form expression for the joint posterior distribution of the line statuses and system states is developed under the assumptions of linearity and Gaussian distributions for the system states and the measurement noise, which is used for designing an optimal detector. The theory in [1] is extended to a linear dynamic system in [11]. The studies in [20]–[22] investigate the problems that combine state estimation and model uncertainty using real-time measurements. In [20] and [21], the measurement residuals from the state estimation routine are utilized to detect topology errors in power grids, without any guarantees on the optimality. In [22], the uncertainty in the circuit breaker statuses is incorporated into the generalized state estimation cost to design a heuristic algorithm to jointly verify the statuses of the unobserved circuit breakers and form the state estimate.

The aforementioned studies, irrespective of the differences in their models, share one property: they decouple the model detection and state estimation routines. In principle, however, these routines are fundamentally interconnected. Decoupling the detection and state estimation subroutines does not guarantee optimality in the estimation performance, as has been noted in [23] and [24] in the signal processing domain. For example, using a Neyman-Pearson or maximum a posteriori (MAP) based detection rule to identify the correct hypothesis corresponding to the outage event, followed by Bayesian state estimation, does not incorporate the uncertainty of the detection step into the estimation subroutine. Hence, an optimal state estimator can be characterized only by the inter-dependence of the estimator design and the routine for isolating the true system model. We formulate state estimation under model uncertainty as a multiple composite hypothesis testing problem, in which the detection and isolation rules are designed by incorporating the estimation costs under the constraints on the detection and isolation power. Optimal combined detection and estimation frameworks are developed in [25] and [26]. Specifically, in [26], a binary hypothesis testing problem is considered, with composite models consisting of unknown parameters under both the hypotheses. The joint detection and estimation problem in linear Gaussian models is also considered in [27], where a closed-form expression for the joint posterior distribution of the unknown parameters and the hypotheses is developed. The theory developed in [27] is applied to the outage detection routine developed in [1].


In this paper, we compare the estimation performance of our decision rules with that of detection-driven approaches on the IEEE 14-bus and IEEE 118-bus systems. We consider line outages and variations in the statistical properties of the measurements to be the events that can modify the system model. Note that the number of possible outage events in a power system increases exponentially with the number of branches in the system. Also, Bayesian state estimation becomes computationally intractable for a large number of measurements. There exist studies that aim to circumvent the complexity in detecting outage events [28]–[34] and in state estimation [35]–[40]. Compressive sensing based approaches are developed in [28] and [29] for outage detection. Graphical models are developed in [30]–[32] for identifying the grid topology and anomalous lines, if any. A variational inference approach is developed in [33], where efficient methods are developed to perform marginal inference on the status of each line in the power network. A feature sharing neural network architecture is proposed in [34] to identify the correct topology. Multi-area state estimation is often considered to lower the complexity of state estimation in large systems. A review of multi-area state estimation methods is provided in [35]. A hierarchical approach to multi-area state estimation is considered in [36] and [37], where the local state estimators operate independently and exchange information with a central coordinator. A decentralized state estimation approach is adopted in [39] for bad data identification. A weighted least squares distributed algorithm for state estimation is proposed in [40].

II. DATA MODEL

As discussed in Section I, and specified in (1)–(2), we consider a nominal model for the grid, denoted by $g_0$, and $p \in \mathbb{N}$ possible alternative models denoted by $\{g_i : i \in \{1, \dots, p\}\}$. Under model $g_i$, we have

\[ Z = g_i(X) + N , \quad \text{for } i \in \{1, \dots, p\} . \tag{3} \]

Each alternative model can, in general, represent uncertainty or disruption to the normal operations of the grid. We assume that the noise term has a known distribution with the probability density function (pdf) $f_N$. Furthermore, we assume that the state $X$ has pdf $\pi$, which can be determined based on the historical data on the state $X$. It is noteworthy that the cases in which no prior information about the distribution of $X$ is known can be accommodated by setting $\pi$ as a uniform distribution. Finally, in order to signify that some models can occur more frequently than others, we define $\varepsilon_i$ as the likelihood of $g_i$ being the true model. Clearly,

\[ \sum_{i=0}^{p} \varepsilon_i = 1 . \tag{4} \]
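The data model can be made concrete with a small sampling sketch. Everything specific below is an assumption introduced for illustration: the scalar linear maps standing in for $g_0, g_1, g_2$, the Gaussian choices for $\pi$ and $f_N$, the prior values, and the helper name `sample_measurement`. The sketch only shows the generative order implied by the text: draw the model index according to $\{\varepsilon_i\}$, draw the state from $\pi$, and add noise to $g_i(X)$.

```python
import random

def sample_measurement(models, priors, sample_state, sample_noise, rng):
    """Draw (i, X, Z): model index i ~ priors, X ~ pi, Z = g_i(X) + N."""
    i = rng.choices(range(len(models)), weights=priors, k=1)[0]
    x = sample_state(rng)
    z = models[i](x) + sample_noise(rng)
    return i, x, z

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical scalar stand-ins for g_0 (nominal), g_1 and g_2 (alternatives).
models = [lambda x: 2.0 * x, lambda x: 1.0 * x, lambda x: 0.5 * x]
priors = [0.8, 0.1, 0.1]  # eps_0, eps_1, eps_2; they sum to one as in (4)

draws = [
    sample_measurement(
        models, priors,
        lambda r: r.gauss(0.0, 1.0),   # assumed prior pi on the state
        lambda r: r.gauss(0.0, 0.1),   # assumed noise pdf f_N
        rng,
    )
    for _ in range(2000)
]
nominal_fraction = sum(1 for i, _, _ in draws if i == 0) / len(draws)
print(nominal_fraction)
```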

A. Estimation Model

We are interested in leveraging the data $Z$ and forming an estimate $\hat{X}(Z)$ for $X$. Since the optimal estimator takes different structures under different models, we further denote the estimate of $X$ under model $g_i$ by $\hat{X}_i(Z)$, for $i \in \{0, \dots, p\}$. In order to quantify the fidelity of an estimate $\hat{X}$, we adopt the estimation cost function $C(X, \hat{X})$, which measures the closeness of $X$ and $\hat{X}$. In this paper, we specifically set the cost function based on the mean squared error (MSE) criterion given by

\[ C(X, U) \triangleq \| X - U \|^2 , \tag{5} \]

for any generic estimator $U$ of $X$. Under model $g_i$, given that $Z$ is distributed according to $f_i$, we define the average posterior cost function as

\[ C_{p,i}(U \mid Z) \triangleq \mathbb{E}_i \left[ C(X, U) \mid Z \right] , \quad \text{for } i \in \{0, \dots, p\} , \tag{6} \]

and define the optimal estimation cost as

\[ \hat{C}_{p,i}(Z) \triangleq \inf_{U} C_{p,i}(U \mid Z) , \quad \text{for } i \in \{0, \dots, p\} . \tag{7} \]

Hence, the optimal estimate of $X$ under model $g_i$ is given by

\[ \hat{X}_i(Z) \triangleq \arg\inf_{U} C_{p,i}(U \mid Z) , \quad \text{for } i \in \{0, \dots, p\} . \tag{8} \]
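For the squared-error cost in (5), the minimizer of the posterior cost in (6) is the posterior mean, and the optimal cost in (7) is the trace of the posterior covariance. The sketch below works this out for a scalar linear-Gaussian model $Z = hX + N$, which is an assumption made here purely for illustration (the paper's $g_i$ are general non-linear maps); the function name and numerical values are hypothetical.

```python
def posterior_mean_and_cost(z, h, var_x, var_n):
    """MMSE estimate and optimal posterior cost for Z = h*X + N.

    Assumes X ~ N(0, var_x) and N ~ N(0, var_n), independent.
    Returns (x_hat, cost), where cost is the posterior variance of X given Z.
    """
    var_z = h * h * var_x + var_n      # marginal variance of Z
    gain = h * var_x / var_z           # linear MMSE gain
    x_hat = gain * z                   # posterior mean, the arg inf in (8)
    cost = var_x * var_n / var_z       # inf in (7); independent of z here
    return x_hat, cost

x_hat, cost = posterior_mean_and_cost(z=1.5, h=2.0, var_x=1.0, var_n=0.5)
print(x_hat, cost)
```

In this Gaussian special case the optimal posterior cost does not depend on $Z$; with non-Gaussian priors or non-linear $g_i$ it generally does.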

III. PROBLEM FORMULATION

When the system model deviates from the nominal model, forming a reliable estimate for $X$ strongly depends on the correct isolation of the true model. In order to emphasize this coupling, we pose the joint decision as a composite hypothesis testing problem, which involves forming a detection decision on whether the system is operating under the nominal model $g_0$. When it is deemed otherwise, we also form an isolation decision to identify the true non-nominal model $g_i$ for $i \in \{1, \dots, p\}$. Concurrently with the detection and isolation decisions, we also need to form an estimate $\hat{X}$. Hence, these decisions can be formed concurrently by solving the composite hypothesis test

\[ \begin{aligned} H_0 &: Z \sim f_0(Z \mid X), \ \text{with } X \sim \pi(X) \\ H_1 &: Z \not\sim f_0(Z \mid X), \ \text{with } X \sim \pi(X) . \end{aligned} \tag{9} \]

When the decision is that the true model is not the nominal model, we further isolate a model according to the following n-dimensional composite hypothesis test for $i \in \{1, \dots, p\}$:

\[ H_i : Z \sim f_i(Z \mid X), \ \text{with } X \sim \pi(X) . \tag{10} \]

Forming the detection and estimation decisions independently does not ensure achieving the optimum performance. A decoupling approach for isolating the network model (e.g., via the Neyman-Pearson criterion) in the first step, followed by even an optimal estimation in the second step, does not incorporate the uncertainty of the isolation decision in the first step into the estimation structure. Therefore, we formulate the decision rules for the problem in (9) and (10) such that the quality of estimators and the detection/isolation rules are integrated properly to form jointly optimal decisions.

A. Performance Measures

We model the detection and isolation decision rules for the composite hypothesis testing problem in (9) and (10) using a randomized test $\delta \triangleq [\delta_0(Z), \dots, \delta_p(Z)]$, where $\delta_i(Z)$ is the probability of deciding in favor of $H_i$, for $i \in \{0, \dots, p\}$. Clearly,

\[ \sum_{i=0}^{p} \delta_i(Z) = 1 , \tag{11} \]

and a randomized test subsumes the deterministic ones, which place the entire mass on one hypothesis, i.e., $\delta_i(Z) = 1$ for some $i \in \{0, \dots, p\}$. By defining $D \in \{H_0, \dots, H_p\}$ as the decision formed and $T \in \{H_0, \dots, H_p\}$ as the true hypothesis, the probability of erroneously deciding in favor of hypothesis $H_j$ when the true hypothesis is $H_i$, for $i \neq j \in \{0, \dots, p\}$, is given by

\[ P(D = H_j \mid T = H_i) = \int_{Z} \delta_j(Z) f_i(Z) \, dZ . \tag{12} \]

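We define $P_{\rm md}$ as the probability of missing or misclassifying the true model when it is not the nominal model:

\[ P_{\rm md} \triangleq \frac{1}{P(T \neq H_0)} \sum_{i=1}^{p} P(D \neq H_i \mid T = H_i) \, P(T = H_i) . \tag{13} \]

By leveraging (12) and the definition of $\varepsilon_i$, $P_{\rm md}$ can also be restated as

\[ P_{\rm md} = \sum_{j=1}^{p} \sum_{\substack{i=0 \\ i \neq j}}^{p} \frac{\varepsilon_j}{1 - \varepsilon_0} \int_{Z} \delta_i(Z) f_j(Z) \, dZ . \tag{14} \]

We also define $P_{\rm fa}$ as the probability of declaring that the system model has deviated from the nominal model while the true model is in fact the nominal model, which is given by

\[ P_{\rm fa} \triangleq \sum_{i=1}^{p} P(D = H_i \mid T = H_0) . \tag{15} \]

Based on (12), $P_{\rm fa}$ can be rewritten as

\[ P_{\rm fa} = \sum_{i=1}^{p} \int_{Z} \delta_i(Z) f_0(Z) \, dZ . \tag{16} \]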
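To make the two error probabilities concrete, the following sketch evaluates them in closed form for a deterministic rule in a toy setting. All specifics are assumptions for illustration: a scalar measurement whose marginal under $H_i$ is $N(0, \sigma_i^2)$, a rule that picks the hypothesis according to which interval contains $|Z|$, and the numerical values of the standard deviations, priors, and thresholds.

```python
from statistics import NormalDist

def region_probs(sigma, thresholds):
    """P(D = H_k | Z ~ N(0, sigma^2)) for k = 0..p.

    The rule decides H_k when |Z| falls in (t_k, t_{k+1}], with t_0 = 0
    and t_{p+1} = infinity.
    """
    nd = NormalDist(0.0, sigma)
    edges = [0.0] + list(thresholds)
    # cdf_abs[k] = P(|Z| <= edge_k); a final 1.0 closes the last region.
    cdf_abs = [2.0 * nd.cdf(e) - 1.0 for e in edges] + [1.0]
    return [cdf_abs[k + 1] - cdf_abs[k] for k in range(len(edges))]

def error_probabilities(sigmas, priors, thresholds):
    # conf[i][j] = P(D = H_j | T = H_i), the integral in (12)
    conf = [region_probs(s, thresholds) for s in sigmas]
    p_fa = sum(conf[0][1:])                                   # (15)-(16)
    p_md = sum(priors[j] / (1.0 - priors[0]) * (1.0 - conf[j][j])
               for j in range(1, len(sigmas)))                # (13)-(14)
    return p_fa, p_md

sigmas = [1.0, 2.0, 4.0]     # hypothetical marginal std of Z under H_0, H_1, H_2
priors = [0.8, 0.1, 0.1]     # hypothetical eps_0, eps_1, eps_2
p_fa, p_md = error_probabilities(sigmas, priors, thresholds=[2.0, 5.0])
print(p_fa, p_md)
```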

For any generic estimator $U_i$ of $X$ under $H_i$ for $i \in \{0, \dots, p\}$, the corresponding estimation cost is given by $C(X, U_i)$. Since the stochastic model of the measurements $Z$ depends on the choice of hypothesis, the estimation cost $C(X, U_i)$ is relevant only under the event of deciding in favor of $H_i$. We define $J_i(\delta_i, U_i)$ as the average estimation cost given that the decision is $H_i$, i.e.,

\[ J_i(\delta_i, U_i) \triangleq \mathbb{E} \left[ C(X, U_i) \mid D = H_i \right] , \tag{17} \]

where the expectation is with respect to $X$ and $Z$. We also define $J(\delta, U)$ as the maximum of the average estimation costs $J_i(\delta_i, U_i)$, i.e.,

\[ J(\delta, U) \triangleq \max_{i \in \{0, \dots, p\}} J_i(\delta_i, U_i) , \tag{18} \]

where we have defined $U \triangleq \{U_0, \dots, U_p\}$. Based on the figures of merit defined for the detection, isolation, and estimation qualities in (14), (16), and (18), in the next subsection we design the decision rules $\delta$ and $U$ that minimize $J(\delta, U)$ under pre-specified constraints on $P_{\rm fa}$ and $P_{\rm md}$.

B. State Estimation Framework

We now formulate the estimation problem under uncertainty in the system model due to certain events, which also characterizes the interplay between the estimation quality and the detection performance. Note that perfect (error-free) detection and isolation is not possible due to the presence of noise in the measurements, and at the same time, the estimation quality hinges on successful isolation of the true system model. Hence, based on the definitions in (14) and (16), we aim to control $P_{\rm md}$ and $P_{\rm fa}$ and obtain the optimal decision rules $\delta$ and $U$ that minimize $J(\delta, U)$. This can be formalized as

\[ \mathcal{P}(\alpha, \beta) \triangleq \begin{cases} \min_{(\delta, U)} & J(\delta, U) \\ \text{s.t.} & P_{\rm md} \le \beta \\ & P_{\rm fa} \le \alpha \end{cases} . \tag{19} \]

Definition 1: For given $\alpha, \beta \in (0, 1)$ as the constraints controlling the qualities of detection and isolation, we define

\[ q(\alpha, \beta) \triangleq \frac{\mathcal{P}(\alpha, \beta)}{J_0(U)} , \tag{20} \]

which quantifies the ratio of the estimation costs in the settings with and without model uncertainty. The average estimation cost corresponding to estimator $U$ when the model is known is given by

\[ J_0(U) \triangleq \mathbb{E}_0 \left[ C(X, U) \right] . \tag{21} \]

Note that for the given distribution $f_0$, the average cost function under the nominal system model, i.e., $\min_U J_0(U)$, is a constant. Therefore, characterizing $q(\alpha, \beta)$ defined in (20) is equivalent to solving $\mathcal{P}(\alpha, \beta)$.

Remark 1 (Feasibility). By forming the detection rules for the composite hypothesis testing problem in (9) under the constraint on $P_{\rm fa}$ using Neyman-Pearson theory [41], it can be readily verified that corresponding to any given $\alpha$, such that $P_{\rm fa} \le \alpha$, there exists a value $\beta^*(\alpha)$, which specifies the smallest feasible value for $P_{\rm md}$. Therefore, the constraints on the probabilities $P_{\rm md}$ and $P_{\rm fa}$ cannot be made arbitrarily small simultaneously.
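The trade-off in Remark 1 can be illustrated in the simplest binary case ($p = 1$). The sketch below assumes, only for illustration, that $Z$ is scalar with marginal $N(0, \sigma_0^2)$ under $H_0$ and $N(0, \sigma_1^2)$ under $H_1$ with $\sigma_1 > \sigma_0$. In that case the likelihood ratio is increasing in $|Z|$, so the Neyman-Pearson test thresholds $|Z|$, and $\beta^*(\alpha)$ has a closed form. The function name is hypothetical.

```python
from statistics import NormalDist

def smallest_feasible_beta(alpha, sigma0, sigma1):
    """beta*(alpha) for H0: Z ~ N(0, sigma0^2) vs H1: Z ~ N(0, sigma1^2).

    Intended for sigma1 > sigma0 and 0 < alpha < 1. The Neyman-Pearson
    test decides H1 when |Z| > tau.
    """
    std = NormalDist()
    tau = sigma0 * std.inv_cdf(1.0 - alpha / 2.0)   # P(|Z| > tau | H0) = alpha
    return 2.0 * std.cdf(tau / sigma1) - 1.0         # P(|Z| <= tau | H1)

for alpha in (0.01, 0.05, 0.2):
    print(alpha, smallest_feasible_beta(alpha, sigma0=1.0, sigma1=3.0))
```

Tightening $\alpha$ raises the threshold and therefore increases the smallest achievable miss probability, which is the simultaneous-infeasibility effect noted in the remark.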

We remark that the cost function $J(\delta, U)$ defined in (18), which serves as the basis for optimizing the decision rules, captures the worst-case estimation cost among all the different models $\{g_i : i \in \{0, \dots, p\}\}$. While optimizing this measure provides a worst-case guarantee, the remaining cost functions in $\{J_i(\delta_i, U_i) : i \in \{0, \dots, p\}\}$ are often smaller than the worst-case cost. In order to shed light on the range of $\{J_i(\delta_i, U_i) : i \in \{0, \dots, p\}\}$, in Section V we compare $q(\alpha, \beta)$, which corresponds to the optimal choice of $\delta$ and $U$, with the following average and minimum costs. Specifically, for given detection and isolation rules $\delta$ and estimators $U$ that solve $\mathcal{P}(\alpha, \beta)$, we define

\[ q_{\rm avg}(\alpha, \beta) \triangleq \frac{J_{\rm avg}(\delta, U)}{J_0(U)} , \tag{22} \]

where $J_{\rm avg}(\delta, U)$ is the average estimation cost across all models, i.e.,

\[ J_{\rm avg}(\delta, U) \triangleq \sum_{i=0}^{p} \varepsilon_i J_i(\delta_i, U_i) , \tag{23} \]

and define

\[ q_{\min}(\alpha, \beta) \triangleq \frac{J_{\min}(\delta, U)}{J_0(U)} , \tag{24} \]

where

\[ J_{\min}(\delta, U) \triangleq \min_{i \in \{0, \dots, p\}} J_i(\delta_i, U_i) . \tag{25} \]

Note that $q_{\min}(\alpha, \beta)$, $q_{\rm avg}(\alpha, \beta)$, and $q(\alpha, \beta)$ are the estimation metrics normalized with respect to the estimation quality under the nominal model, and they satisfy $q_{\min}(\alpha, \beta) \le q_{\rm avg}(\alpha, \beta) \le q(\alpha, \beta)$.
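The three normalized metrics are simple functions of the per-model costs. The following sketch computes them from a list of costs $J_i$, the priors $\varepsilon_i$, and the nominal cost $J_0(U)$; the numerical values are made up for illustration and do not come from the paper's case studies.

```python
def normalized_costs(costs, priors, nominal_cost):
    """Return (q_min, q_avg, q) following (20) and (22)-(25).

    costs[i] is J_i, priors[i] is eps_i (summing to one), and
    nominal_cost is J_0(U) from (21).
    """
    j_max = max(costs)                                  # worst case, (18)
    j_avg = sum(e * j for e, j in zip(priors, costs))   # (23)
    j_min = min(costs)                                  # (25)
    return j_min / nominal_cost, j_avg / nominal_cost, j_max / nominal_cost

# Hypothetical per-model costs and priors.
q_min, q_avg, q = normalized_costs([0.12, 0.30, 0.21], [0.8, 0.1, 0.1], 0.10)
print(q_min, q_avg, q)
```

Because the priors are non-negative and sum to one, the average is a convex combination of the per-model costs, which is why the ordering $q_{\min} \le q_{\rm avg} \le q$ always holds.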

IV. OPTIMAL STATE ESTIMATOR UNDER MODEL UNCERTAINTY

A. Decision Rules

In this section, we provide the design of the estimators $\{U_i : i \in \{0, \dots, p\}\}$ and the detectors $\{\delta_i : i \in \{0, \dots, p\}\}$ characterized by the optimal solutions to the problem $\mathcal{P}(\alpha, \beta)$. The dependence of $P_{\rm md}$ and $P_{\rm fa}$ on $\delta$ is established in (14) and (16), respectively. By using the expansions in (14) and (16), the problem of interest in (19) becomes

\[ \mathcal{P}(\alpha, \beta) = \begin{cases} \min_{(\delta, U)} & J(\delta, U) \\ \text{s.t.} & \displaystyle \sum_{j=1}^{p} \sum_{\substack{i=0 \\ i \neq j}}^{p} \frac{\varepsilon_j}{1 - \varepsilon_0} \int_{Z} \delta_i(Z) f_j(Z) \, dZ \le \beta \\ & \displaystyle \sum_{i=1}^{p} \int_{Z} \delta_i(Z) f_0(Z) \, dZ \le \alpha \end{cases} . \tag{26} \]

Note that the estimators $\{U_i : i \in \{0, \dots, p\}\}$ appear only in the objective function $J(\delta, U)$ in (26), which allows the optimization problem $\mathcal{P}(\alpha, \beta)$ to be broken down into two sub-problems. This observation is formalized in Theorem 1.

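Theorem 1: The problem $\mathcal{P}(\alpha, \beta)$ can be equivalently stated in the following form

\[ \mathcal{P}(\alpha, \beta) = \begin{cases} \min_{\delta} & J(\delta, \hat{X}) \\ \text{s.t.} & \displaystyle \sum_{j=1}^{p} \sum_{\substack{i=0 \\ i \neq j}}^{p} \frac{\varepsilon_j}{1 - \varepsilon_0} \int_{Z} \delta_i(Z) f_j(Z) \, dZ \le \beta \\ & \displaystyle \sum_{i=1}^{p} \int_{Z} \delta_i(Z) f_0(Z) \, dZ \le \alpha \end{cases} , \tag{27} \]

where

\[ \hat{X} \triangleq \arg\min_{U} J(\delta, U) , \tag{28} \]

and

\[ J(\delta, \hat{X}) \triangleq \min_{U} J(\delta, U) . \tag{29} \]

By using Theorem 1, we first provide the solution to (28), which provides the design of the optimal estimators.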

Theorem 2 (State Estimator). The optimal estimator under hypothesis $H_i$ that minimizes the estimation cost $J_i(\delta_i, U_i)$ is given by

\[ \hat{X}_i(Z) = \arg\inf_{U_i} C_{p,i}(U_i \mid Z) . \tag{30} \]

Consequently, the cost function $J(\delta, \hat{X})$ is given by

\[ J(\delta, \hat{X}) = \max_{i} \left\{ \frac{\int_{Z} \delta_i(Z) \, \hat{C}_{p,i}(Z) \, f_i(Z) \, dZ}{\int_{Z} \delta_i(Z) \, f_i(Z) \, dZ} \right\} . \tag{31} \]

Proof. See Appendix A. ∎

Given the estimator design obtained by forming the optimal solution to $J(\delta, \hat{X})$, the corresponding optimal detection rules are established by solving the optimization problem in (27).
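The ratio of integrals in (31) can be evaluated numerically once the marginal pdf and the optimal posterior cost are available. The sketch below does this for one hypothesis under assumptions chosen only so that the posterior cost varies with $Z$: a scalar model $Z = hX + N$ with $X$ uniform on $\{-1, +1\}$ and Gaussian noise, for which $\mathbb{E}[X \mid Z = z] = \tanh(hz/\sigma^2)$. The integration limits, step count, and the two example rules are likewise hypothetical.

```python
from math import exp, pi, sqrt, tanh

def marginal_pdf(z, h, var_n):
    """f_i(z) for Z = h*X + N, X uniform on {-1, +1}, N ~ N(0, var_n)."""
    norm = 1.0 / sqrt(2.0 * pi * var_n)
    return 0.5 * norm * (exp(-(z - h) ** 2 / (2.0 * var_n))
                         + exp(-(z + h) ** 2 / (2.0 * var_n)))

def posterior_cost(z, h, var_n):
    """Optimal posterior MSE: 1 - E[X|Z=z]^2 with E[X|Z=z] = tanh(h*z/var_n)."""
    return 1.0 - tanh(h * z / var_n) ** 2

def conditional_cost(delta, h, var_n, lo=-12.0, hi=12.0, steps=4800):
    """Ratio of integrals in (31) for one hypothesis, via the midpoint rule."""
    dz = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps):
        z = lo + (k + 0.5) * dz
        w = delta(z) * marginal_pdf(z, h, var_n) * dz
        num += w * posterior_cost(z, h, var_n)
        den += w
    return num / den

# Rule that always decides this hypothesis vs. one that decides it only for large |z|.
always = conditional_cost(lambda z: 1.0, h=1.0, var_n=1.0)
tail = conditional_cost(lambda z: 1.0 if abs(z) > 1.5 else 0.0, h=1.0, var_n=1.0)
print(always, tail)
```

In this toy model the posterior cost decreases with $|z|$, so restricting the decision region to large $|z|$ lowers the conditional cost; this is one way to see how the decision rule shapes the estimation cost.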

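Theorem 3 (Detection and Isolation). The optimal detection and isolation rules are given by:

\[ \delta_i(Z) = \begin{cases} 1, & \text{for } i = i^* \\ 0, & \text{for } i \neq i^* \end{cases} , \tag{32} \]

where

\[ i^* = \arg\min_{i \in \{0, \dots, p\}} A_i . \tag{33} \]

The quantities $\{A_i : i \in \{0, \dots, p\}\}$ are defined as

\[ A_0 \triangleq \ell_0 f_0(Z) \left( \hat{C}_{p,0}(Z) - u \right) + \ell_{p+1} \sum_{i=1}^{p} \frac{\varepsilon_i}{1 - \varepsilon_0} f_i(Z) , \tag{34} \]

and for $i \in \{1, \dots, p\}$,

\[ A_i \triangleq \ell_i f_i(Z) \left( \hat{C}_{p,i}(Z) - u \right) + \ell_{p+1} \sum_{\substack{j=1 \\ j \neq i}}^{p} \frac{\varepsilon_j}{1 - \varepsilon_0} f_j(Z) + \ell_{p+2} f_0(Z) , \tag{35} \]

where the non-negative constants $\{\ell_i : i \in \{0, \dots, p+2\}\}$ are the Lagrangian multipliers selected such that

\[ \sum_{i=0}^{p+2} \ell_i = 1 , \tag{36} \]

and the constraints in the following convex optimization problem (which is equivalent to the problem in (27)) are satisfied.

\[ \mathcal{P}(\alpha, \beta) = \begin{cases} \min_{\delta} & u \\ \text{s.t.} & \displaystyle \int_{Z} \delta_i(Z) f_i(Z) \left( \hat{C}_{p,i}(Z) - u \right) dZ \le 0 , \quad \forall i \\ & \displaystyle \sum_{j=1}^{p} \sum_{\substack{i=0 \\ i \neq j}}^{p} \frac{\varepsilon_j}{1 - \varepsilon_0} \int_{Z} \delta_i(Z) f_j(Z) \, dZ \le \beta \\ & \displaystyle \sum_{i=1}^{p} \int_{Z} \delta_i(Z) f_0(Z) \, dZ \le \alpha \end{cases} \tag{37} \]

Also, the optimum maximum average estimation cost is given by $\mathcal{P}(\alpha, \beta)$.

Proof. See Appendix B. ∎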
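For a single measurement, the rule in (32)–(35) reduces to computing $p+1$ numbers and picking the smallest. The sketch below assumes the values $f_i(Z)$, the optimal posterior costs, the priors, the multipliers, and the level $u$ are already available; the function names and all numerical inputs are hypothetical and only illustrate the arithmetic.

```python
def decision_statistics(f, c_hat, priors, ell, u):
    """Compute A_0..A_p of (34)-(35) at one measurement.

    f[i]      : f_i(Z) evaluated at the measurement
    c_hat[i]  : optimal posterior cost under H_i at the measurement
    priors[i] : eps_i
    ell       : multipliers ell_0..ell_{p+2} (length p + 3)
    u         : candidate cost level
    """
    p = len(f) - 1
    # Weighted alternative-model densities; index 0 is computed but unused.
    w = [priors[j] / (1.0 - priors[0]) * f[j] for j in range(p + 1)]
    miss_all = sum(w[1:])
    stats = [ell[0] * f[0] * (c_hat[0] - u) + ell[p + 1] * miss_all]   # (34)
    for i in range(1, p + 1):
        stats.append(ell[i] * f[i] * (c_hat[i] - u)
                     + ell[p + 1] * (miss_all - w[i])
                     + ell[p + 2] * f[0])                               # (35)
    return stats

def decide(f, c_hat, priors, ell, u):
    """Deterministic rule (32)-(33): all mass on the arg min of A_i."""
    stats = decision_statistics(f, c_hat, priors, ell, u)
    i_star = min(range(len(stats)), key=stats.__getitem__)
    return [1 if i == i_star else 0 for i in range(len(stats))]

# Hypothetical inputs for p = 2.
delta = decide(f=[0.05, 0.20, 0.10], c_hat=[0.3, 0.4, 0.5],
               priors=[0.8, 0.1, 0.1], ell=[0.2] * 5, u=0.35)
print(delta)
```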


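Algorithm 1 Finding the optimal solution of $\mathcal{P}(\alpha,\beta)$
 1: Compute $\beta^{*}(\alpha)$ based on Remark 1
 2: Set the corresponding estimation cost to $u_1$
 3: if $\beta < \beta^{*}(\alpha)$ then
 4:     $\mathcal{P}(\alpha,\beta)$ is not feasible
 5:     break
 6: else
 7:     Initialize $u_0 = 0$
 8:     Evaluate optimal posterior estimation costs in (7)
 9:     repeat
10:         $u \leftarrow (u_0 + u_1)/2$
11:         for every $\ell \succeq 0$ such that $\|\ell\|_1 = 1$ do
12:             Compute $\delta$ from (32)
13:             Compute $M(\ell) \triangleq \mathcal{J}(\alpha,\beta,u)$
14:         end for
15:         if $\min_{\ell} M(\ell) \le 0$ then
16:             $u_1 \leftarrow u$
17:             $\ell^{*} \leftarrow \arg\min_{\ell} M(\ell)$
18:         else
19:             $u_0 \leftarrow u$
20:         end if
21:     until $u_1 - u_0 \le \varepsilon$, for $\varepsilon$ sufficiently small
22:     $\mathcal{P}(\alpha,\beta) \leftarrow u_1$
23:     return Decision rules in (32)
24: end if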

Therefore, in summary, Theorem 2 provides the design of the optimal estimators under all possible models, which are shown to be the estimators that minimize the respective average posterior cost functions. Theorem 3 provides the design of the decision rules to decide upon the true system model. Note that the estimation quality is incorporated in the decision rules designed in the form of average posterior cost functions.

B. Implementation

The complete procedure for numerically solving the problem $\mathcal{P}(\alpha,\beta)$ is described in Algorithm 1. Given the constraints $\alpha$ and $\beta$ on the detection performance, the optimal decision rules are determined by a procedure similar to the bisection search. We develop an equivalent auxiliary convex optimization problem, $\mathcal{J}(\alpha,\beta,u)$, using $\mathcal{P}(\alpha,\beta)$ in (37), where $u$ lies in the interval containing $\mathcal{P}(\alpha,\beta)$. Starting with an interval $[u_0,u_1]$ containing $\mathcal{P}(\alpha,\beta)$, we determine the minimum value of $u \in [u_0,u_1]$ for which $\mathcal{J}(\alpha,\beta,u)$ is feasible. This is done in the outer loop starting at line 9, where we iteratively search for the optimal $u$ in the interval $[u_0,u_1]$. Inside this loop, we solve $\mathcal{J}(\alpha,\beta,u)$ by the method of Lagrangians, and $\ell$ represents the vector of Lagrangian coefficients, where the optimal Lagrangian coefficients satisfy $\|\ell\|_1 = 1$ and $\ell \succeq 0$. In lines 11–14 of Algorithm 1, we compute $\mathcal{J}(\alpha,\beta,u)$ for every possible $\ell$ that satisfies the conditions in line 11 and store the values in the vector $M$. The optimal value of $\mathcal{J}(\alpha,\beta,u)$ is given by $\min_{\ell} M(\ell)$, and we perform a feasibility check in line 15. Using lines 15–20, we update the values $u_0$ and $u_1$. The outer loop starting at line 9 is repeated until $u_0$ and $u_1$ are sufficiently close, after which the algorithm returns the value $u$ and the decision rules corresponding to $\min_{\ell} M(\ell)$ in the last iteration before the stopping condition in line 21 is satisfied.
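The outer loop described above can be sketched as a plain bisection over $u$. In the snippet below, `aux_value` is a hypothetical callable standing in for the optimal value of the auxiliary problem at a given $u$ (the inner search over the multipliers is not reproduced), and the toy oracle at the end is purely illustrative.

```python
def bisect_cost(aux_value, u_lo, u_hi, tol=1e-6):
    """Bisection over u, mirroring lines 9-22 of Algorithm 1.

    aux_value(u) is assumed to return the optimal value of the auxiliary
    problem at level u; a non-positive value means u is achievable.
    u_hi is assumed achievable and u_lo is assumed to be a lower bound.
    """
    while u_hi - u_lo > tol:
        u = 0.5 * (u_lo + u_hi)
        if aux_value(u) <= 0:
            u_hi = u        # u is achievable: tighten the upper end
        else:
            u_lo = u        # u is not achievable: raise the lower end
    return u_hi


# Toy oracle (hypothetical): every u >= 0.37 is achievable.
u_star = bisect_cost(lambda u: 0.37 - u, u_lo=0.0, u_hi=1.0)
```

Each iteration halves the interval, so reaching a tolerance `tol` from an initial width $u_1 - u_0$ takes about $\log_2((u_1 - u_0)/\text{tol})$ evaluations of the inner problem.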

We developed an estimation routine that minimizes the measure $q$ of estimation performance under the constraints on the detection performance. This also provides us with a tool to characterize the degradation in the estimation performance as the constraints on the detection performance are relaxed.

V. CASE STUDY

In this section, we evaluate the performance of the optimal framework on the IEEE 14-bus system, a 4-bus model, and the IEEE 118-bus system. We consider line outages to be one of the events that can lead to multiple possible true models. Note that the number of possible models grows exponentially with the number of possible outages and the size of the system.

By judiciously leveraging the structure of the observations $Z$ and their relationship with the state parameter $X$, the complexity can be reduced significantly. Specifically, only a limited number of state parameters contribute to each measurement. Hence, the entire state estimation problem can be decomposed into a number of problems, each with a considerably smaller dimension. Physically, this means that the grid can be divided into multiple subnetworks, in which each subnetwork has only limited shared state parameters with other subnetworks. This allows for providing the subnetworks with a level of autonomy in forming their own state estimates, based on which each subnetwork forms its local decision about its local state parameters. Once these decisions are formed, the neighboring subnetworks that share common parameters can exchange the relevant information in order to reach a joint decision about the estimates for their shared parameters.

Therefore, we adopt a multi-area approach in this case study. In the experiments on the IEEE 14-bus system, we divide the network into four areas, as depicted in Fig. 1, and evaluate the estimation performances on two of them. In the detection-driven approach used in the experiments on the IEEE 14-bus system, we use a multilayer perceptron to design the outage detector and compare the resulting estimation performance with that yielded by our decision rules.

A. Comparison with Detection-driven Approaches

We divide the IEEE 14-bus system into four areas $\{S_1, S_2, S_3, S_4\}$ (as done in [37]), with their internal measurements given by

$$S_1 : \{P_{1-2}, P_{1-5}, P_{2-5}, P_1\}, \tag{38}$$

$$S_2 : \{P_{3-4}, P_{4-7}, P_{7-8}\}, \tag{39}$$

$$S_3 : \{P_{6-11}, P_{6-12}, P_{6-13}, P_{12-13}, P_{12}\}, \tag{40}$$

$$S_4 : \{P_{9-10}, P_{9-14}, P_9\}, \tag{41}$$

where $P_{i-j}$ is the power flow measurement from bus $i$ to bus $j$ and $P_i$ is the power injection measurement at bus $i$. Under the nominal model in Fig. 1, the DC power flow equation for an


Fig. 1. IEEE 14-bus system with specified areas.

Area $S_i$, for $i \in \{1,2,3,4\}$, is given by

$$\mathbf{Z}_i = H_i^0 \mathbf{X}_i + \mathbf{N}_i, \tag{42}$$

where $\mathbf{Z}_i$ is the measurement vector, $\mathbf{X}_i$ is the state vector to be estimated, $\mathbf{N}_i$ is the additive noise component, and $H_i^0$ is the Jacobian matrix for area $S_i$.

First, we focus on area $S_1$. Under a DC model, the state parameters of this area are the phase angles $X_1, X_2, X_5$, where $X_i$ is the phase angle at bus $i$. We assume bus 1 to be the local reference bus and form the estimates at bus 2 and bus 5 with respect to bus 1. Under the nominal model, the Jacobian matrix $H_1^0$ is given by

$$H_1^0 = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & -1 \\ 2 & -1 & -1 \end{bmatrix},$$

and the relationship between the measurements $\mathbf{Z}_1$ and the state parameters $\mathbf{X}_1$ is captured by

$$\mathbf{Z}_1 = H_1^0 \mathbf{X}_1 + \mathbf{N}_1, \tag{43}$$

where $\mathbf{Z}_1 = [P_{1-2}, P_{1-5}, P_{2-5}, P_1]^T$, $\mathbf{X}_1 = [X_1, X_2, X_5]^T$, and $\mathbf{N}_1$ is the noise component with probability distribution $\mathcal{N}(0, \sigma^2 I_{4\times 4})$, where $\sigma^2 = 0.1$. $X_2$ and $X_5$ are assumed to be independent and identically distributed (i.i.d.) with pdf $\mathrm{Unif}[0, 0.5\pi]$. We consider the following two outage events in Area $S_1$.

1) Outage in the branch 1–5, with probability $\epsilon_1 = 0.3$. The modified Jacobian matrix is

$$H_1^1 = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & -1 \\ 1 & -1 & 0 \end{bmatrix},$$

where row 2 being zero indicates that there is no power flow between bus 1 and bus 5, so the measurement $P_{1-5}$ consists of only the noise component. Row 1 and row 4 being identical indicates that the measurements $P_1$ and $P_{1-2}$ both measure the power injection at bus 1 under this topology.

2) Outage in the branch 2–5, with probability $\epsilon_2 = 0.2$. The modified Jacobian matrix is

$$H_1^2 = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 0 & 0 \\ 2 & -1 & -1 \end{bmatrix},$$

with row 3 being zero indicating that there is no power flow between buses 2 and 5, so the measurement $P_{2-5}$ consists of only noise.
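The measurement model for Area $S_1$ under the nominal model and the two outage events can be simulated along the following lines. This is only a sketch: the function name `sample_area1`, the seed, and the convention of fixing the reference angle $X_1$ to zero are assumptions of this example, while the matrices, the probabilities $(0.5, 0.3, 0.2)$, the noise variance, and the uniform prior come from the description in this section.

```python
import numpy as np

# Jacobians for Area S1: nominal, outage of branch 1-5, outage of branch 2-5.
H = np.array([
    [[1, -1, 0], [1, 0, -1], [0, 1, -1], [2, -1, -1]],
    [[1, -1, 0], [0, 0, 0], [0, 1, -1], [1, -1, 0]],
    [[1, -1, 0], [1, 0, -1], [0, 0, 0], [2, -1, -1]],
], dtype=float)
EPS = np.array([0.5, 0.3, 0.2])   # eps_0, eps_1, eps_2
SIGMA2 = 0.1                      # measurement noise variance


def sample_area1(n, rng):
    """Draw n (model, state, measurement) triples for Area S1."""
    models = rng.choice(3, size=n, p=EPS)
    X = np.zeros((n, 3))                              # columns: X1 (reference), X2, X5
    X[:, 1:] = rng.uniform(0.0, 0.5 * np.pi, size=(n, 2))
    noise = rng.normal(0.0, np.sqrt(SIGMA2), size=(n, 4))
    Z = np.einsum("nij,nj->ni", H[models], X) + noise  # Z = H^i X + N per sample
    return models, X, Z


rng = np.random.default_rng(0)    # arbitrary seed for reproducibility
models, X, Z = sample_area1(1000, rng)
```

Samples of this kind could serve both for training a detector and for Monte Carlo evaluation of the estimation cost.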

We assume that the outage events described above cannot occur simultaneously. Therefore, the probability that no outage event occurs is given by $\epsilon_0 = 0.5$. We compare the average estimation cost for a detection-driven approach, where the correct network model is identified using a trained multilayer perceptron followed by Bayesian estimation, with that obtained using the decision rules developed in this paper. In order to train the multilayer perceptron, we generate $6\times 10^{5}$ data samples. Each sample belonging to class $i \in \{0,1,2\}$ is generated by randomly generating the elements of $\mathbf{X}_1$ according to $\mathrm{Unif}[0, 0.5\pi]$ and using the Jacobian matrix $H_1^i$ to generate the measurements $\mathbf{Z}_1$. An equal number of samples are generated for each class. Two-thirds of the generated data is used for training a multilayer perceptron with one hidden layer consisting of 60 units, and one-third of the data is used for testing the accuracy. The accuracy of the multilayer perceptron on the testing data is used to estimate the error rates $\alpha$ and $\beta$. The average estimation cost is determined by using the multilayer perceptron to identify the correct topology over the space of measurements $\mathbf{Z}_1$, followed by using the estimator corresponding to the identified topology. The degradation in the average estimation cost as compared to the average estimation cost under outage prone setting, for this detection-driven decoupled approach and the decision rules in this paper, is illustrated in Fig. 2. To generate the Bayesian estimates and posterior cost functions, we discretize the space of measurements and evaluate the discretized pdfs. Given the discretized posterior pdf of the state vector $\mathbf{X}_1$, the average CPU time taken to evaluate its Bayesian estimate using a 3.2 GHz Intel Core i5 based processor is 0.01243 seconds.

Note that by increasing the size of training data, the detection performance of the multilayer perceptron can be further improved. However, in order to compare the performance of our


Fig. 2. Normalized estimation performance versus β for α = 0.354.

decision rules with that of a detection-driven approach under the same setting, we use the multilayer perceptron to design a detector whose detection performance is comparable to the detection constraints in our setting. As observed in Fig. 2, the decision rules determined in this paper outperform a decoupled detection-driven approach in terms of estimation performance. This figure also compares the average and minimum cost terms captured by $q_{\rm avg}$ and $q_{\rm min}$, where it is observed that both consistently, and often substantially, outperform the worst-case cost, corresponding to which the decision rules are optimized.

For the next experiment, we focus on area $S_3$. The state parameters to be estimated in Area $S_3$ are the phase angles $\mathbf{X}_3 = [X_6, X_{11}, X_{12}, X_{13}]^T$, where $X_i$ is the phase angle at bus $i$. We assume bus 6 to be the local reference bus and form estimates at buses 11, 12 and 13. Under the baseline topology, the Jacobian matrix $H_3^0$ is given by

$$H_3^0 = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ -1 & 0 & 2 & -1 \end{bmatrix},$$

and the relationship between the measurements in Area $S_3$, $\mathbf{Z}_3$, and the state parameters $\mathbf{X}_3$ is given by

$$\mathbf{Z}_3 = H_3^0 \mathbf{X}_3 + \mathbf{N}_3, \tag{44}$$

where $\mathbf{Z}_3 = [P_{6-11}, P_{6-12}, P_{6-13}, P_{12-13}, P_{12}]^T$ and $\mathbf{X}_3 = [X_6, X_{11}, X_{12}, X_{13}]^T$. $\mathbf{N}_3$ is the additive noise component with pdf $\mathcal{N}(0, \sigma^2 I_{5\times 5})$ with $\sigma^2 = 0.05$, and $X_{11}$, $X_{12}$ and $X_{13}$ are i.i.d. with pdf $\mathrm{Unif}[0, 0.5\pi]$. We assume that the outage in the branch 6–13 is the only possible outage event in Area $S_3$, which occurs with probability $\epsilon_1 = 0.4$. The modified Jacobian matrix under the outage event is

$$H_3^1 = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ -1 & 0 & 2 & -1 \end{bmatrix},$$

Fig. 3. Normalized estimation performance versus β for α = 0.158.

where row 3 being zero indicates that there is no power flow between bus 6 and bus 13, so the measurement $P_{6-13}$ consists of only noise. To test the performance of the detection-driven decoupled approach, we train a multilayer perceptron with one hidden layer consisting of 20 hidden units to identify the correct topology, followed by Bayesian estimation. We generate $3\times 10^{4}$ samples of the measurements $\mathbf{Z}_3$ in a similar fashion as described for Area $S_1$. An equal number of samples are generated for both classes. Two-thirds of the generated data is used for training the multilayer perceptron, and one-third of the generated data is used for testing the detection performance. Given the discretized posterior pdf of the state vector $\mathbf{X}_3$, the average CPU time taken to evaluate its Bayesian estimate using a 3.2 GHz Intel Core i5 based processor is 0.01496 seconds. The degradation in the average estimation cost for the decoupled approach and the decision rules developed here is depicted in Fig. 3. As observed in Fig. 3, the decision rules developed in this paper outperform the decoupled detection-driven approach in terms of estimation performance. Similarly to the observations in Fig. 2, it is also observed in this figure that $q_{\rm avg}$ and $q_{\rm min}$ substantially outperform the worst-case cost.

B. Comparison with Residue-based Tests for Topology

Next, we compare the estimation performance of our decision rules with that obtained using residual analysis. Residual analysis can be used to detect topology errors in the grid, as studied in [17], [20], and [21]. Specifically, the error in topology can be modeled as

$$H_t = H_s + E, \tag{45}$$

where $H_t$ is the true Jacobian matrix, $H_s$ is the assumed incorrect Jacobian matrix, and $E$ is the Jacobian error matrix. The residual vector for this setting is defined as

$$r \triangleq Z - H_s \hat{X}, \tag{46}$$

where $\hat{X} = (H_s^T R^{-1} H_s)^{-1} H_s^T R^{-1} Z$ and $R$ is the covariance matrix of the measurement noise. It can be readily shown that the residual vector is related to the vector of branch flow errors $F$ due to the error in topology by

$$r = (I - K_e) M F, \tag{47}$$

where $K_e \triangleq H_s (H_s^T R^{-1} H_s)^{-1} H_s^T R^{-1}$, and $M$ is the measurement-to-branch incidence matrix. Let $T =$


Fig. 4. A 4-bus model.

TABLE I: COMPARISON OF DOT PRODUCT OF COLUMNS OF T WITH THE RESIDUAL VECTOR

$(I - K_e)M$. Then, if branch $j$ has a topology error or outage, the $j$th column of $T$ will be collinear with the residual vector $r$. This claim is borrowed from Section 8.6.1 in [17]. The collinearity can be measured by the normalized dot product

$$\cos(\theta_j) = \frac{T_j^T r}{\|T_j\|\,\|r\|}, \tag{48}$$

where $T_j$ is the $j$th column of $T$. The use of $\cos(\theta_j)$ as a metric to identify the outage in branch $j$ in the system is illustrated using the power system in Fig. 4. Denote by $\eta_{ij}$ the normalized dot product given in (48) corresponding to branch $i$–$j$ between bus $i$ and bus $j$ in the system. The set of measurements for this bus system is given by $Z = [P_{1-2}, P_{1-3}, P_{1-4}, P_{4-3}, P_{2-3}, P_3]$ and the states are $X = [X_1, X_2, X_3, X_4]$. Note that only the outage in line 1–3 is detectable using the residual-based test used here. We assume that bus 1 is the reference bus for this system and that line 1–3 is prone to outage. We generate 20000 samples of the elements of the state $X = [X_1, X_2, X_3, X_4]$ independently of each other with uniform distribution $\mathrm{Unif}[0, 0.5\pi]$. Letting $H_s$ be the Jacobian matrix corresponding to the system shown in Fig. 4 and $H$ be the changed Jacobian matrix due to an outage in line 1–3, these samples are used to generate two sets of measurements: $Z$ according to $Z = HX$ and $Z_s$ according to $Z_s = H_s X$, with no measurement noise included in any sample. The average values of $\eta_{ij}$ for all branches $i$–$j$ in the system are summarized in Table I.
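The residual test in (46)–(48) can be sketched numerically as follows. The matrices `Hs` and `M` below are small hypothetical placeholders and do not correspond to the 4-bus system of Fig. 4; the function names are likewise illustrative. The example injects a noise-free flow error on one branch and confirms that the matching column of $T$ is collinear with the residual.

```python
import numpy as np


def wls_residual(Hs, R, Z):
    """Residual r = Z - Hs X_hat with the weighted least squares estimate X_hat."""
    Rinv = np.linalg.inv(R)
    X_hat = np.linalg.solve(Hs.T @ Rinv @ Hs, Hs.T @ Rinv @ Z)
    return Z - Hs @ X_hat


def sensitivity(Hs, R, M):
    """T = (I - K_e) M with K_e = Hs (Hs^T R^-1 Hs)^-1 Hs^T R^-1."""
    Rinv = np.linalg.inv(R)
    Ke = Hs @ np.linalg.solve(Hs.T @ Rinv @ Hs, Hs.T @ Rinv)
    return (np.eye(Hs.shape[0]) - Ke) @ M


def collinearity(T, r):
    """cos(theta_j) between each column T_j and the residual r, as in (48)."""
    return (T.T @ r) / (np.linalg.norm(T, axis=0) * np.linalg.norm(r))


# Hypothetical 5-measurement, 3-state example (not the paper's 4-bus system).
Hs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float)
R = 0.1 * np.eye(5)
M = np.eye(5)                       # assumed: one branch per measurement
X = np.array([0.3, 0.1, 0.4])
F = np.zeros(5)
F[2] = 0.8                          # flow error on "branch" 2 only
Z = Hs @ X + M @ F                  # noise-free measurements
r = wls_residual(Hs, R, Z)
cos = collinearity(sensitivity(Hs, R, M), r)
```

Because $r = (I - K_e) M F$ in the noise-free case, the entry of `cos` for the perturbed branch equals one up to rounding.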

It can be readily concluded from Table I that $\eta_{13}$ can be used as a metric to identify a line outage in branch 1–3. Denote by $\gamma$ the threshold on $\eta_{13}$, such that an outage in line 1–3 is declared if $\eta_{13} > \gamma$. Based on different values of $\gamma$, we can estimate $P_{\rm fa}$ and $P_{\rm md}$ to develop an equivalent setting to our decision model. After forming the decision on the outage based on the residue test, we update the state estimate according to the correct topology if the outage is deemed to exist. For a fair comparison with the estimation quality from our decision rules, we formulate the decision rules by setting the constraints $\alpha$ and $\beta$ to be $P_{\rm fa}$ and $P_{\rm md}$ obtained from the residue test, respectively. Since only one outage is considered in the nominal model, the total number of hypotheses is $(p + 1) = 2$. The problem $\mathcal{P}(\alpha,\beta)$ for this setting is given by

$$\mathcal{P}(\alpha,\beta) = \begin{cases} \min_{\delta} & J(\delta,\hat{X}) \\ \text{s.t.} & \frac{\epsilon_1}{1-\epsilon_0}\int_{Z}\delta_0(Z)\, f_1(Z)\,dZ \le \beta \\ & \int_{Z}\delta_1(Z)\, f_0(Z)\,dZ \le \alpha \end{cases}. \tag{49}$$

The optimal decision rules $\{\delta_0, \delta_1\}$ are determined by finding a solution to $\mathcal{P}(\alpha,\beta)$ numerically using Algorithm 1. The specific steps of Algorithm 1 for this setting are listed below:

1) Check the feasibility condition in Step 3. If the feasibility condition is satisfied, proceed.

2) Evaluate the posterior estimation costs $C_{p,0}(Z_0)$ and $C_{p,0}(Z_1)$ by discretizing the interval containing $\mathrm{Supp}(Z_0)$ and $\mathrm{Supp}(Z_1)$, respectively.

3) Initialize $u_0$ and $u_1$.

4) Set $u$ to be the mean of $u_0$ and $u_1$. Discretize a $(p + 3) = 4$-dimensional simplex consisting of vectors $\ell$ such that $\ell \succeq 0$ and $\|\ell\|_1 = 1$. Using all such vectors $\ell$, determine the vector $M$ using the steps 11–14. Update $u_0$ and $u_1$ according to the steps 15–20.

5) Repeat the previous step until $u_0$ and $u_1$ converge sufficiently.

6) The cost $\mathcal{P}(\alpha,\beta)$ is given by $u_1$ after the algorithm terminates based on the convergence test in step 21.
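One simple way to discretize the simplex of multiplier vectors used in step 4 is a uniform lattice, sketched below. The lattice resolution `N` and the function name `simplex_grid` are assumptions of this example; the paper does not specify how the simplex is discretized.

```python
from itertools import combinations
from math import comb


def simplex_grid(k, N):
    """All length-k vectors with entries in {0, 1/N, ..., 1} that sum to one.

    Uses the stars-and-bars construction: choosing k-1 bar positions among
    N + k - 1 slots splits N stars into k non-negative counts.
    """
    points = []
    for bars in combinations(range(N + k - 1), k - 1):
        counts, prev = [], -1
        for b in bars:
            counts.append(b - prev - 1)      # stars between consecutive bars
            prev = b
        counts.append(N + k - 2 - prev)      # stars after the last bar
        points.append([c / N for c in counts])
    return points


# p = 1 gives p + 3 = 4 multipliers; N = 10 is an arbitrary resolution.
grid = simplex_grid(k=4, N=10)
```

The number of lattice points is $\binom{N+k-1}{k-1}$, which is 286 for this example and grows quickly with $p$, so the resolution trades accuracy against the cost of the inner loop.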

We use an approximation of the average root mean squared error (ARMSE) as the metric for comparing the two different estimation performances. This is similar to the metric used in [42] for comparing the performances of a linear Bayesian estimator and a weighted least squares estimator. ARMSE for an estimator $\hat{X}$ of $X = [X_1, \dots, X_n]$ is defined as

$$R(\hat{X}) \triangleq \left(\frac{1}{n}\sum_{k=1}^{n} E\big[|\hat{X}_k - X_k|^2\big]\right)^{1/2}. \tag{50}$$

ARMSE can be approximated by computing the following metric

$$\hat{R}(\hat{X}) \triangleq \left(\frac{1}{nT}\sum_{t=1}^{T} \|\hat{X}^{t} - X^{t}\|^2\right)^{1/2}, \tag{51}$$

where $T$ is the number of Monte Carlo samples, $X^{t}$ is the state in the $t$-th sample, and $\hat{X}^{t}$ is the corresponding estimate. Denote by $R_{\rm res}$ and $R_{\rm opt}$ the estimation metrics corresponding to the residue test and our decision rules, respectively. Table II compares the estimation performance for both settings.
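The Monte Carlo approximation in (51) amounts to a few lines of array arithmetic. In the sketch below, the function name `empirical_armse` and the convention of storing one sample per row are assumptions of this example.

```python
import numpy as np


def empirical_armse(X_true, X_est):
    """Empirical ARMSE of (51); rows are Monte Carlo samples, columns are states."""
    X_true = np.asarray(X_true, dtype=float)
    X_est = np.asarray(X_est, dtype=float)
    T, n = X_true.shape
    return np.sqrt(np.sum((X_est - X_true) ** 2) / (n * T))


# Toy check: every component is off by exactly 0.5.
armse = empirical_armse(np.zeros((4, 2)), np.full((4, 2), 0.5))
```

Applying the same function to the estimates from the residue test and from the decision rules, on a common set of samples, would give comparable values of the two metrics.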

From Table II, it can be concluded that the decision rules developed in this paper outperform a detection-driven approach based on error residuals.

C. Application in the AC Model

Next, we apply our decision rules to the IEEE 14-bus system under an AC model. The internal measurements for the areas


TABLE II: COMPARISON OF ESTIMATION PERFORMANCE FROM OPTIMAL DECISION RULES AND HEURISTIC APPROACH

Fig. 5. Normalized estimation performance versus β for α = 0.18.

Fig. 6. Normalized estimation performance versus β for α = 0.2.

are given by

$$S_1 : \{P_{1-2}, Q_{1-2}, P_{1-5}, Q_{1-5}, P_{2-5}, Q_{2-5}, P_1, Q_1\},$$

$$S_2 : \{P_{3-4}, Q_{3-4}, P_{4-7}, Q_{4-7}, P_{7-8}, Q_{7-8}\},$$

$$S_3 : \{P_{6-11}, Q_{6-11}, P_{6-12}, Q_{6-12}, P_{6-13}, Q_{6-13}, P_{12-13}, Q_{12-13}, P_{12}, Q_{12}\},$$

$$S_4 : \{P_{9-10}, Q_{9-10}, P_{9-14}, Q_{9-14}, P_9, Q_9\},$$

where $P_{i-j}$ is the real power flow measurement from bus $i$ to bus $j$, $P_i$ is the real power injection measurement at bus $i$, $Q_{i-j}$ is the reactive power flow measurement from bus $i$ to bus $j$, and $Q_i$ is the reactive power injection measurement at bus $i$. For each area, the state parameters consist of the voltage magnitudes and phase angles at the respective buses, and their relationship with the measurements is established according to the AC power flow model given in [17]. For area $S_1$, the state variables are $X = [V_1, V_2, V_3, V_4, X_1, X_2, X_3, X_4]$, where $V_i$ is the voltage magnitude at bus $i$ and $X_i$ is the phase angle at bus $i$. Similarly, the state vector for area $S_3$ can be defined. Assuming that all the measurements are distributed according to $\mathrm{Unif}[0, 0.3\pi]$ and

Fig. 7. Normalized estimation performance versus β for α = 0.3.

Fig. 8. Normalized estimation performance versus β for α = 0.15.

are affected by the measurement noise distributed according to $\mathcal{N}(0, 0.06)$, we evaluate the performance of our decision rules for Area $S_1$ under the possibility of outages in branches 1–5 and 2–5 and for Area $S_3$ under the possibility of an outage in line 6–13.

The degradation in average estimation cost for areas $S_1$ and $S_3$ is depicted in Fig. 5 and Fig. 6, respectively.

D. Application in the IEEE 118-Bus System

We illustrate the application of our decision rules to the IEEE 118-bus system. We divide the bus system into 9 areas, as done in [36], and evaluate the performance of our decision rules on areas 5 and 8. Note that areas 5 and 8 are the two largest sub-areas in the power system with respect to the number of internal buses. We evaluate the performance for the DC model; however, the application can be readily extended to an AC model as illustrated previously. In area 5, we assume that at most one line outage can occur, leading to 15 possible alternative models. In area 8, we assume that at most one of the lines 92–94, 94–96 and 80–96 can experience an outage, leading to 3 possible alternative models. Given the discretized posterior pdf of the state vector in Area 5, the average CPU time taken to evaluate its Bayesian state estimate using a 3.2 GHz Intel Core i5 based processor is 0.03704 seconds. The corresponding time for forming a Bayesian state estimate for Area 8 is 0.0349 seconds. The degradation in average estimation cost for areas 5 and 8 is depicted in Fig. 7 and Fig. 8, respectively.


VI. CONCLUSION

In this paper, we have analyzed state estimation in power systems when there exists a chance that the system model undergoes a change and deviates from the nominal model. Such deviations can occur when the instantaneous network models are not fully known, or the topology or model undergoes a change due to a disruption or failure in the network. Motivated by the possibility of such circumstances, we have provided a framework in which the state estimator is co-designed with an optimal rule for detecting a change in the model, and isolating the true model from a group of alternative models. Closed-form expressions for the decision rules and the state estimator are delineated, where it is shown that these rules are inherently coupled. We have compared the performance of this optimal framework with those of the existing approaches that decouple model isolation and state estimation, using the IEEE 14-bus system and the IEEE 118-bus system.

APPENDIX A
PROOF OF THEOREM 2

From (17) we have

$$J_i(\delta_i,U_i) = E\big[C(X,U_i) \mid D = H_i\big] \tag{52}$$

$$= \frac{\int_{Z}\int_{X}\delta_i(Z)\, C(X,U_i)\, f_i(Z \mid X)\,\pi(X)\,dX\,dZ}{\int_{Z}\delta_i(Z)\, f_i(Z)\,dZ}. \tag{53}$$

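Using the definition of $C_{p,i}(U_i \mid Z)$ from (6), a lower bound on $J_i(\delta_i,U_i)$ is given by

$$J_i(\delta_i,U_i) = \frac{\int_{Z}\delta_i(Z)\, C_{p,i}(U_i \mid Z)\, f_i(Z)\,dZ}{\int_{Z}\delta_i(Z)\, f_i(Z)\,dZ} \tag{54}$$

$$\ge \frac{\int_{Z}\delta_i(Z)\, \inf_{U} C_{p,i}(U_i \mid Z)\, f_i(Z)\,dZ}{\int_{Z}\delta_i(Z)\, f_i(Z)\,dZ}, \tag{55}$$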

which implies that

$$J_i(\delta_i,U_i) \ge \frac{\int_{Z}\delta_i(Z)\, C_{p,i}(Z)\, f_i(Z)\,dZ}{\int_{Z}\delta_i(Z)\, f_i(Z)\,dZ}. \tag{56}$$

Based on the definition of $\hat{X}_i(Z)$ provided in (8), this lower bound is clearly achieved when the estimator $U_i$ is chosen to be

$$\hat{X}_i(Z) = \arg\inf_{U_i} C_{p,i}(U_i \mid Z), \tag{57}$$

which proves that the estimator characterized in (30) is an optimal estimator that minimizes the cost $J_i(\delta_i,U_i)$. The corresponding minimum average estimation cost is

$$J_i(\delta_i,\hat{X}_i) = \frac{\int_{Z}\delta_i(Z)\, C_{p,i}(Z)\, f_i(Z)\,dZ}{\int_{Z}\delta_i(Z)\, f_i(Z)\,dZ}. \tag{58}$$

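Next, we prove that

$$\max_{i}\min_{U}\{J_i(\delta_i,U_i)\} \equiv \min_{U}\max_{i}\{J_i(\delta_i,U_i)\}. \tag{59}$$

Recall from (18) that the overall estimation cost $J(\delta,U)$ is

$$J(\delta,U) = \max_{i}\{J_i(\delta_i,U_i)\}.$$

Define $C(\Omega,\delta,U)$ as a convex function of $\{J_i(\delta_i,U_i) : i \in \{0,\dots,p\}\}$, given by

$$C(\Omega,\delta,U) \triangleq \sum_{i=0}^{p}\Omega_i J_i(\delta_i,U_i), \tag{60}$$

where $\Omega = [\Omega_0,\dots,\Omega_p]$, and $\Omega_i$ satisfy

$$\sum_{i=0}^{p}\Omega_i = 1, \quad \text{and} \quad \Omega_i \in [0,1]. \tag{61}$$

We can redefine $J(\delta,U)$ as a function of $C(\Omega,\delta,U)$ in the following form

$$J(\delta,U) = \max_{\Omega} C(\Omega,\delta,U). \tag{62}$$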

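Let $\Omega^{*} = \{\Omega_j^{*} : j = 0,\dots,p\}$ be defined as

$$\Omega^{*} = \arg\max_{\Omega} C(\Omega,\delta,U), \tag{63}$$

where $\Omega_j^{*} = 1$ if

$$j = \arg\max_{i}\{J_i(\delta_i,U_i)\}. \tag{64}$$

From (57) and (58), we observe that

$$\max_{\Omega}\min_{U} C(\Omega,\delta,U) = \max_{\Omega} C(\Omega,\delta,\hat{X}) \tag{65}$$

$$\ge \min_{U}\max_{\Omega} C(\Omega,\delta,U). \tag{66}$$

Also, at the same time, we have

$$\max_{\Omega} C(\Omega,\delta,U) \ge \max_{\Omega}\min_{U} C(\Omega,\delta,U), \tag{67}$$

which implies that

$$\min_{U}\max_{\Omega} C(\Omega,\delta,U) \ge \max_{\Omega}\min_{U} C(\Omega,\delta,U). \tag{68}$$

From (65) and (68), it is easily concluded that

$$\max_{\Omega}\min_{U} C(\Omega,\delta,U) = \min_{U}\max_{\Omega} C(\Omega,\delta,U), \tag{69}$$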


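which completes the proof for (59). Using the results in (59) and (58), the cost function $J(\delta,\hat{X})$ is given by

$$J(\delta,\hat{X}) = \min_{U}\max_{i}\{J_i(\delta_i,U_i)\} \tag{70}$$

$$= \max_{i}\min_{U}\{J_i(\delta_i,U_i)\} \tag{71}$$

$$= \max_{i}\{J_i(\delta_i,\hat{X}_i)\} \tag{72}$$

$$= \max_{i}\left\{\frac{\int_{Z}\delta_i(Z)\, C_{p,i}(Z)\, f_i(Z)\,dZ}{\int_{Z}\delta_i(Z)\, f_i(Z)\,dZ}\right\}. \tag{73}$$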

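APPENDIX B
PROOF OF THEOREM 3

The function $J_i(\delta_i,U_i)$ is a quasi-convex function in $\delta_i \in [0,1]$. To show this, let $\delta_i^1$ and $\delta_i^2$ be two possible values of $\delta_i$ such that $\delta_i = \lambda\delta_i^1 + (1-\lambda)\delta_i^2$ for some $\lambda \in [0,1]$. We have

$$J_i(\delta_i,U_i) = \frac{\int_{Z}\int_{X}\big(\lambda\delta_i^1(Z) + (1-\lambda)\delta_i^2(Z)\big)\, C(X,U_i)\, f_i(Z \mid X)\,\pi(X)\,dX\,dZ}{\int_{Z}\big(\lambda\delta_i^1(Z) + (1-\lambda)\delta_i^2(Z)\big)\, f_i(Z)\,dZ} \tag{74}$$

$$= \frac{\lambda\int_{Z}\int_{X}\delta_i^1(Z)\, C(X,U_i)\, f_i(Z \mid X)\,\pi(X)\,dX\,dZ}{\lambda\int_{Z}\delta_i^1(Z)\, f_i(Z)\,dZ + (1-\lambda)\int_{Z}\delta_i^2(Z)\, f_i(Z)\,dZ} + \frac{(1-\lambda)\int_{Z}\int_{X}\delta_i^2(Z)\, C(X,U_i)\, f_i(Z \mid X)\,\pi(X)\,dX\,dZ}{\lambda\int_{Z}\delta_i^1(Z)\, f_i(Z)\,dZ + (1-\lambda)\int_{Z}\delta_i^2(Z)\, f_i(Z)\,dZ}. \tag{75}$$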

Note that, for any $a, b, c, d > 0$,

$$\frac{a+b}{c+d} \le \max\left\{\frac{a}{c}, \frac{b}{d}\right\}. \tag{76}$$
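For completeness, a short justification of (76) is sketched here; it is a standard argument and is not part of the original proof. Assume without loss of generality that $a/c \le b/d$, which for positive $c$ and $d$ is the same as $a \le bc/d$.

```latex
\frac{a+b}{c+d}
  \;\le\; \frac{\tfrac{bc}{d} + b}{c+d}
  \;=\; \frac{b\,(c+d)}{d\,(c+d)}
  \;=\; \frac{b}{d}
  \;=\; \max\left\{\frac{a}{c}, \frac{b}{d}\right\}.
```

In (75), $a$ and $b$ play the role of the two numerators weighted by $\lambda$ and $1-\lambda$, and $c$ and $d$ the two correspondingly weighted terms of the denominator, so that $a/c$ and $b/d$ are the costs under $\delta_i^1$ and $\delta_i^2$.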

Therefore,

$$J_i(\delta_i,U_i) \le \max\{J_i(\delta_i^1,U_i),\, J_i(\delta_i^2,U_i)\}, \tag{77}$$

which implies that $J_i(\delta_i,U_i)$ is quasi-convex in $\delta_i$. Since the weighted maximum function preserves quasi-convexity, it can be concluded that $J(\delta,\hat{X})$ is a quasi-convex function from its definition in (31). Therefore, we can find the solution to the optimization problem in (27) by solving a sequence of feasibility problems given below [43]. For $u \in \mathbb{R}_{+}$, it is easily observed that $J(\delta,\hat{X}) \le u$ holds if and only if, for all $i \in \{0,\dots,p\}$, we have

$$\int_{Z}\delta_i(Z)\, f_i(Z)\,(C_{p,i}(Z) - u)\,dZ \le 0. \tag{78}$$

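Hence, the feasibility problem that is equivalent to (27) is given by

$$\mathcal{P}(\alpha,\beta) = \begin{cases} \min_{\delta} & u \\ \text{s.t.} & \int_{Z}\delta_i(Z)\, f_i(Z)\,(C_{p,i}(Z) - u)\,dZ \le 0, \quad \forall i \\ & \sum_{j=1}^{p}\sum_{i=0,\, i\neq j}^{p} \frac{\epsilon_j}{1-\epsilon_0}\int_{Z}\delta_i(Z)\, f_j(Z)\,dZ \le \beta \\ & \sum_{i=1}^{p}\int_{Z}\delta_i(Z)\, f_0(Z)\,dZ \le \alpha \end{cases}. \tag{79}$$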

Therefore, if the above problem is feasible for a given $u$, the solution to (27) satisfies $\mathcal{P}(\alpha,\beta) \le u$. On the other hand, the non-feasibility of $\mathcal{P}(\alpha,\beta)$ implies that $\mathcal{P}(\alpha,\beta) > u$. Given an interval $[u_0,u_1]$ containing $\mathcal{P}(\alpha,\beta)$, the optimum detection rules $\delta$ and optimum estimation cost $\mathcal{P}(\alpha,\beta)$ can be determined by a bisection search between $u_0$ and $u_1$ iteratively, where the feasibility problem is solved in each iteration. The complete procedure for the numerical search is described in Algorithm 1. To solve the feasibility problem, we define an auxiliary convex optimization problem

$$\mathcal{J}(\alpha,\beta,u) \triangleq \begin{cases} \min_{\delta} & \eta \\ \text{s.t.} & \int_{Z}\delta_i(Z)\, f_i(Z)\,(C_{p,i}(Z) - u)\,dZ \le \eta, \quad \forall i \\ & \sum_{j=1}^{p}\sum_{i=0,\, i\neq j}^{p} \frac{\epsilon_j}{1-\epsilon_0}\int_{Z}\delta_i(Z)\, f_j(Z)\,dZ \le \beta + \eta \\ & \sum_{i=1}^{p}\int_{Z}\delta_i(Z)\, f_0(Z)\,dZ \le \alpha + \eta \end{cases}. \tag{80}$$

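Note that $\mathcal{J}(\alpha,\beta,u) \le 0$ if and only if $\mathcal{P}(\alpha,\beta)$ is feasible. Algorithm 1 summarizes the steps for determining $\mathcal{P}(\alpha,\beta)$. To solve the problem in (80), a Lagrangian function is constructed

$$Q(\delta,\eta,\ell) \triangleq \left(1 - \sum_{i=0}^{p+2}\ell_i\right)\eta + \sum_{i=0}^{p}\ell_i\int_{Z}\delta_i(Z)\, f_i(Z)\,(C_{p,i}(Z) - u)\,dZ + \ell_{p+1}\left[\sum_{j=1}^{p}\sum_{i=0,\, i\neq j}^{p}\frac{\epsilon_j}{1-\epsilon_0}\int_{Z}\delta_i(Z)\, f_j(Z)\,dZ - \beta\right] + \ell_{p+2}\sum_{i=1}^{p}\int_{Z}\delta_i(Z)\, f_0(Z)\,dZ - \alpha\,\ell_{p+2}, \tag{81}$$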


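where $\ell \triangleq [\ell_0,\dots,\ell_{p+2}]$ are the non-negative Lagrangian multipliers selected to satisfy the constraints in (27), such that

$$\sum_{i=0}^{p+2}\ell_i = 1. \tag{82}$$

The Lagrangian dual function is

$$L(\ell) = \min_{\delta,\eta} Q(\delta,\eta,\ell) \tag{83}$$

$$= \min_{\delta}\sum_{i=0}^{p}\int_{Z}\delta_i(Z)\, A_i\,dZ - \beta\,\ell_{p+1} - \alpha\,\ell_{p+2}, \tag{84}$$

where

$$A_0 = \ell_0 f_0(Z)\,(C_{p,0}(Z) - u) + \ell_{p+1}\sum_{i=1}^{p}\frac{\epsilon_i}{1-\epsilon_0}\, f_i(Z), \tag{85}$$

and for $i \in \{1,\dots,p\}$,

$$A_i = \ell_i f_i(Z)\,(C_{p,i}(Z) - u) + \ell_{p+1}\sum_{j=1,\, j\neq i}^{p}\frac{\epsilon_j}{1-\epsilon_0}\, f_j(Z) + \ell_{p+2} f_0(Z). \tag{86}$$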

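Therefore, the optimum detection rules that achieve the minimum in (84) are given by

$$\delta_i(Z) = \begin{cases} 1, & \text{for } i = i^{*} \\ 0, & \text{for } i \neq i^{*} \end{cases}, \tag{87}$$

where

$$i^{*} = \arg\min_{i\in\{0,\dots,p\}} A_i. \tag{88}$$

Hence, the proof is concluded.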

REFERENCES

[1] Y. Zhao, J. Chen, A. Goldsmith, and H. V. Poor, “Identification of outagesin power systems with uncertain states and optimal sensor locations.” IEEEJ. Sel. Topics Signal Process., vol. 8, no. 6, pp. 1140–1153, Nov. 2014.

[2] J. E. Tate and T. J. Overbye, “Line outage detection using phasor anglemeasurements.” IEEE Trans. Power Syst., vol. 23, no. 4, pp. 1644–1652,Nov. 2008.

[3] J. E. Tate and T. J. Overbye, “Double line outage detection using phasorangle measurements,” in Proc. Power Energy Soc. General Meet., 2009,pp. 1–5.

[4] Y. Zhao, A. Goldsmith, and H. V. Poor, “On PMU location selection forline outage detection in wide-area transmission networks,” in Proc. PowerEnergy Soc. General Meet., 2012, pp. 1–8.

[5] Y. C. Chen, T. Banerjee, A. D. Domınguez-Garcıa, and V. V. Veeravalli,“Quickest line outage detection and identification,” IEEE Trans. PowerSyst., vol. 31, no. 1, pp. 749–758, Feb. 2016.

[6] G. N. Korres and P. J. Katsikas, “Identification of circuit breaker statusesin WLS state estimator,” IEEE Trans. Power Syst., vol. 17, no. 3, pp. 818–825, Aug. 2002.

[7] E. M. Lourenco, A. S. Costa, and K. A. Clements, “Bayesian-basedhypothesis testing for topology error identification in generalized stateestimation,” IEEE Trans. Power Syst., vol. 19, no. 2, pp. 1206–1215,May 2004.

[8] G. N. Korres, P. J. Katsikas, and G. E. Chatzarakis, “Substation topologyidentification in generalized state estimation,” Int. J. Electrical PowerEnergy Syst., vol. 28, no. 3, pp. 195–206, Mar. 2006.

[9] E. M. Lourenco, A. J. A. S. Costa, K. A. Clements, and R. A. Cernev,“A Topology Error Identification Method Directly Based on CollinearityTests,” IEEE Trans. Power Syst., vol. 21, no. 4, pp. 1920–1929, Nov. 2006.

[10] E. Caro, A. J. Conejo, and A. Abur, “Breaker status identifica-tion,” IEEE Trans. Power Syst., vol. 25, no. 2, pp. 694–702, May2010.

[11] Y. Zhao, J. Chen, A. Goldsmith, and H. V. Poor, “Dynamic joint outageidentification and state estimation in power systems,” in Proc. AsilomarConf. Signals, Syst. Comput., 2014, pp. 1138–1142.

[12] Q. Huang, L. Shao, and N. Li, “Dynamic detection of transmission lineoutages using hidden Markov models,” IEEE Trans. Power Syst., vol. 31,no. 3, pp. 2026–2033, Aug. 2016.

[13] R. A. Sevlian, Y. Zhao, R. Rajagopal, A. Goldsmith, and H. V. Poor,“Outage detection using load and line flow measurements in power distri-bution systems,” IEEE Trans. Power Syst., vol. 33, no. 2, pp. 2053–2069,Jul. 2017.

[14] P. Ren, H. Lev-Ari, and A. Abur, “Robust continuous-discrete extendedKalman filter for estimating machine states with model uncertainties,” inProc. Power Syst. Comput. Conf., 2016, pp. 1–7.

[15] A. H. Sayed, “A framework for state-space estimation with uncertainmodels,” IEEE Trans. Autom. Control, vol. 46, no. 7, pp. 998–1013,Jul. 2001.

[16] B. T. Polyak, S. A. Nazin, C. Durieu, and E. Walter, “Ellipsoidal parameteror state estimation under model uncertainty,” Automatica, vol. 40, no. 7,pp. 1171–1179, Jul. 2004.

[17] A. Abur and A. G. Exposito, Power System State Estimation: Theory andImplementation. Boca Raton, FL, USA: CRC Press, 2004.

[18] A. Monticelli, State Estimation in Electric Power Systems: A GeneralizedApproach. Berlin, Germany: Springer Science and Business Media, 1999,vol. 507.

[19] M. S. Grewal and A. P. Andrews, Kalman Filtering. Hoboken, NJ, USA:Wiley, 2008.

[20] F. F. Wu and W. H. E. Liu, “Detection of topology errors by state es-timation,” IEEE Trans. Power Syst., vol. 4, no. 1, pp. 176–183, Feb.1989.

[21] K. A. Clements and P. W. Davis, “Detection and identification of topologyerrors in electric power systems,” IEEE Trans. Power Syst., vol. 3, no. 4,pp. 1748–1753, Nov. 1988.

[22] V. Kekatos and G. B. Giannakis, “Joint power system state estimation andbreaker status identification,” in Proc. North Amer. Power Symp., 2012,pp. 1–6.

[23] D. Middleton and R. Esposito, “Simultaneous optimum detection andestimation of signals in noise,” IEEE Trans. Inf. Theory, vol. 14, no. 3,pp. 434–444, May 1968.

[24] O. Zeitouni, J. Ziv, and N. Merhav, “When is the generalized likelihoodratio test optimal?” IEEE Trans. Inf. Theory, vol. 38, no. 5, pp. 1597–1602,Sep. 1992.

[25] G. V. Moustakides, G. H. Jajamovich, A. Tajer, and X. Wang, “Jointdetection and estimation: Optimum tests and applications,” IEEE Trans.Inf. Theory., vol. 58, no. 7, pp. 4215–4229, Jul. 2012.

[26] G. H. Jajamovich, A. Tajer, and X. Wang, “Minimax-optimal hypothesistesting with estimation-dependent costs,” IEEE Trans. Signal Process.,vol. 60, no. 12, pp. 6151–6165, Dec. 2012.

[27] J. Chen, Y. Zhao, A. Goldsmith, and H. V. Poor, “Optimal joint detectionand estimation in linear models,” in Proc. IEEE Conf. Decision Control,2013, pp. 4416–4421.

[28] W. Pan, Y. Yuan, H. Sandberg, J. Goncalves, and G. B. Stan, “Real-timefault diagnosis for large-scale nonlinear power networks,” in Proc. IEEE52nd Annu. Conf. Decision Control, 2013, pp. 2340–2345.

[29] H. Zhu and G. B. Giannakis, “Sparse overcomplete representations forefficient identification of power line outages,” IEEE Trans. Power Syst.,vol. 27, no. 4, pp. 2215–2224, May 2012.

[30] J. Chen, Y. Zhao, A. Goldsmith, and H. V. Poor, “Line outage detection in power transmission networks via message passing algorithms,” in Proc. Asilomar Conf. Signals, Syst. Comput., Pacific Grove, CA, USA, Nov. 2014, pp. 350–354.

[31] J. Heydari and A. Tajer, “Quickest localization of anomalies in power grids: A stochastic graphical approach,” IEEE Trans. Smart Grid, to be published.

[32] J. Heydari, Z. Sun, and A. Tajer, “Quickest line outage localization under unknown model,” in Proc. IEEE Global Conf. Signal Inf. Process., Montreal, QB, Canada, 2017, pp. 1065–1069.

[33] Y. Zhao, J. Chen, and H. V. Poor, “Learning to infer: A new variational inference approach for power grid topology identification,” in Proc. IEEE Stat. Signal Process. Workshop, 2016, pp. 1–5.

[34] Y. Zhao, J. Chen, and H. V. Poor, “Efficient neural network architecture for topology identification in smart grid,” in Proc. IEEE Global Conf. Signal Inf. Process., 2016, pp. 811–815.

[35] A. Gomez-Exposito, A. de la Villa Jaen, C. Gomez-Quiles, P. Rousseaux, and T. Van Cutsem, “A taxonomy of multi-area state estimation methods,” Elect. Power Syst. Res., vol. 81, no. 4, pp. 1060–1069, 2011.

[36] L. Zhao and A. Abur, “Multi area state estimation using synchronized phasor measurements,” IEEE Trans. Power Syst., vol. 20, no. 2, pp. 611–617, May 2005.

[37] G. N. Korres, “A distributed multiarea state estimation,” IEEE Trans. Power Syst., vol. 26, no. 1, pp. 73–84, Apr. 2011.

[38] V. Kekatos and G. B. Giannakis, “Distributed robust power system state estimation,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1617–1626, Oct. 2013.

[39] E. Caro, A. J. Conejo, and R. Minguez, “Decentralized state estimation and bad measurement identification: An efficient Lagrangian relaxation approach,” IEEE Trans. Power Syst., vol. 26, no. 4, pp. 2500–2508, Jun. 2011.

[40] G. N. Korres, A. Tzavellas, and E. Galinas, “A distributed implementation of multi-area power system state estimation on a cluster of computers,” Elect. Power Syst. Res., vol. 102, pp. 20–32, Sep. 2013.

[41] H. V. Poor, An Introduction to Signal Detection and Estimation, 2nd ed. New York, NY, USA: Springer-Verlag, 1998.

[42] L. Schenato, G. Barchi, D. Macii, R. Arghandeh, K. Poolla, and A. V. Meier, “Bayesian linear state estimation using smart meters and PMUs measurements in distribution grids,” in Proc. Int. Conf. Smart Grid Commun., Venice, Italy, 2014, pp. 572–577.

[43] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.

Saurabh Sihag (S’17) received the B.Tech and M.Tech degrees in electrical engineering from the Indian Institute of Technology Kharagpur, India, in 2016. Since Fall 2016, he has been working toward the Ph.D. degree at the Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA. His research interests include statistical signal processing, information theory, and high-dimensional statistics.

Ali Tajer (S’05–M’10–SM’15) received the M.A. degree in statistics and the Ph.D. degree in electrical engineering from Columbia University, New York, NY, USA. He is currently an Assistant Professor with the Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA. During 2010–2012, he was with Princeton University as a Postdoctoral Research Associate. His research interests include mathematical statistics and network information theory, with applications in data analytics and power grids. He serves as an Editor for the IEEE TRANSACTIONS ON COMMUNICATIONS and IEEE TRANSACTIONS ON SMART GRID. In the past, he has also served as the Guest Editor-in-Chief for the IEEE TRANSACTIONS ON SMART GRID and as a Guest Editor for IEEE Signal Processing Magazine. He was a recipient of the United States NSF CAREER Award in 2016.