Relay Work Group (RWG)

2018 Misoperations Report Highlights

May

155 North 400 West | Suite 200 | Salt Lake City, Utah 84103

www.wecc.org

Introduction

Background

The NERC 2019 State of Reliability Report (SOR) published the annual misoperation rates for each

region to evaluate the performance of protection systems. The misoperation rate is the ratio of

protection system misoperations to total protection system operations. NERC’s Event Analysis Process

has identified that protection system misoperations significantly increase the severity of events. Even

though misoperations in the Western Interconnection are below the NERC overall average,

misoperations are still a major area of concern because many reported misoperations involve human

factors that can be corrected. That is why WECC worked with entities in the Western Interconnection to

develop reduction strategies for protection system misoperations. Strategy components are referenced

and linked throughout the document.
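The rate defined above reduces to a simple ratio. The following Python sketch is illustrative only: the function name and the sample counts are assumptions chosen to show the arithmetic, not figures taken from the SOR.

```python
def misoperation_rate(misoperations: int, operations: int) -> float:
    """Return misoperations as a percentage of total protection system operations."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    if misoperations < 0:
        raise ValueError("misoperations cannot be negative")
    return 100.0 * misoperations / operations


# Hypothetical counts, used only to illustrate the calculation.
example_rate = misoperation_rate(misoperations=57, operations=1000)
print(f"{example_rate:.2f}%")
```

Expressing the rate as a percentage matches how the regional values are presented in Figure 1.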

Figure 1: Five-year protection system misoperation rate by region, Q4 2013 through Q3 2018

Figure 1 data (misoperation rate by region): FRCC 7.96%; MRO 10.19%; NPCC 7.58%; RF 13.29%; SERC 7.78%; Texas RE 7.02%; WECC 5.69%; NERC 8.56%.

Purpose

The Relay Work Group (RWG) performs a quarterly review of the misoperations that are reported

through the NERC Misoperation Information Data Analysis System (MIDAS) tool for the Western

Interconnection. Part of this review includes performing a detailed analysis, which is used to:

• Provide trend analysis of protection system misoperation data and possible root cause identification

• Form conclusions and recommendations from the analysis to reduce the likelihood of future misoperations

• Develop guidance and best practices for industry through technical documents and webinars pertaining to protection system misoperation trends, conclusions, and recommendations

• Along with WECC’s Event Analysis Team, publish results to WECC’s Event and Performance Analysis Subcommittee (EPAS) and WECC members

The RWG’s focus is on misoperations by cause to potentially identify ways to reduce future

occurrences resulting from similar causes. Each of the eight categories of misoperation causes was

analyzed in an individual and a group setting.

The impact of a misoperation on the BES was not considered in this analysis. The impact of a

misoperation on the BES is captured through the ERO Event Analysis Process if the misoperation is

involved in a reportable event.

Data

Misoperation data from January 1 through December 31, 2018, was used for a one-year analysis. Data

from January 1, 2016, to December 31, 2018, was used for trending.

• The data was obtained by WECC from the NERC 1600 reporting template with defined categories and causes.

• WECC entities reported 267 misoperations during 2018.

• The 2018 data was compared to data collected since 2016 for trending and analysis. The NERC 1600 data reporting has only been in effect since 2016.

• The reported corrective actions, event description, and cause of the misoperation were used to assist in root cause identification.

• The 2018 misoperation data was reviewed quarterly by the RWG. During this review, the RWG identifies submissions that need clarification or include errors. WECC staff works with the entities to address these issues and resubmit these records to MIDAS.

2018 Misoperation Analysis

This section presents an analysis of the 2018 data with comparisons to the 2016 through 2018 data sets

for trending analysis. The sub-group analysis, conclusions, and recommendations for misoperations by

the eight causes follow.

Misoperation by Cause Category

A high-level summary of misoperations can be created from those attributed to human error or to a

protection system component.

1. 47% of all misoperations can be attributed to these cause categories that involve human error:

• Incorrect settings/logic/design errors

• As-left personnel error

2. 35% of all misoperations can be attributed to these protection system component type cause

categories:

• AC system

• Communication failures

• DC system

• Relay failures/malfunctions

*The “Unknown/unexplainable” and “Other/Explainable” categories are not included in this breakdown.

The reduction strategies have identified specific factors that lead to misoperations caused by human

error. These include human performance during commissioning, and the relay setting validation

process in place within companies.

From this general breakdown, the RWG investigated the distribution of misoperations by cause as

shown in Figure 2. “Incorrect setting/logic/design errors” was the largest cause, constituting 36% of all

misoperations, followed by “Relay failures/malfunctions” at 17%. Thus, more than half

of all misoperations fall into these two causes.
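The grouped percentages above can be reproduced by summing the rounded per-cause shares reported for Figure 2. The sketch below is a minimal Python illustration; the dictionary, tuple, and function names are assumptions introduced here, and because the shares are rounded they do not sum to exactly 100%.

```python
# Rounded 2018 shares of misoperations by cause, in percent (Figure 2 values).
CAUSE_SHARE = {
    "AC system": 10,
    "As-left personnel error": 11,
    "Communication failures": 7,
    "DC system": 1,
    "Incorrect settings/logic/design errors": 36,
    "Other/Explainable": 10,
    "Relay failures/malfunctions": 17,
    "Unknown/unexplainable": 9,
}

# Groupings as described in the text; the variable names are illustrative.
HUMAN_ERROR = ("Incorrect settings/logic/design errors", "As-left personnel error")
COMPONENT = (
    "AC system",
    "Communication failures",
    "DC system",
    "Relay failures/malfunctions",
)
TOP_TWO = ("Incorrect settings/logic/design errors", "Relay failures/malfunctions")


def group_share(causes):
    """Sum the percentage shares of the named causes."""
    return sum(CAUSE_SHARE[cause] for cause in causes)


human_error_share = group_share(HUMAN_ERROR)
component_share = group_share(COMPONENT)
top_two_share = group_share(TOP_TWO)
print(human_error_share, component_share, top_two_share)
```

The two groups intentionally exclude the “Unknown/unexplainable” and “Other/Explainable” causes, which is why their shares do not add up to the full total.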

Figure 2: Misoperation by cause 2018 data

The bar chart of Figure 3 shows the trending of misoperations by cause from 2016 through 2018. From

the trend, the top two causes for misoperations within the Western Interconnection are “Incorrect

setting/logic/design errors” and “Relay failures/malfunctions.” Averaged over the three-year period,

“As-left personnel error” is the third-most-frequent cause. Two of the three most common

causes involve human error. Human error may be largely attributable to limited experience and

less-than-adequate processes. This coincides directly with the “knowledge transfer” aspect of the

reduction strategies. Many entities lack an intentional way of retaining and transferring knowledge

within their departments. Using templates, developing proven processes, implementing a mentor

program, and keeping a current succession plan are a few ways entities can transfer knowledge within

their protection departments.

Figure 2 data (2018 Misoperation By Cause): AC System 10%; As-left personnel error 11%; Communication failure 7%; DC System 1%; Incorrect setting/logic/design errors 36%; Other/Explainable 10%; Relay failures/malfunctions 17%; Unknown/unexplainable 9%.

Figure 3: Misoperation by cause 2016 to 2018 trending

Misoperation by Cause Category Conclusions and Recommendations:

The RWG recommends that utilities develop a strategy to reduce misoperations caused by “Incorrect

setting/logic/design errors” and “As-left personnel error” through internal controls and processes. This will

also help entities prepare for the implementation of PRC-027-1 on April 1, 2021, which will require

entities to develop a process for new and revised protection system settings and review implemented

settings within a certain time or based on fault current levels.

Misoperation by Category

“Unnecessary Trips” dominate the number of misoperations when compared to “Failure to Trip” or

“Slow Trip.” The number reflects both widespread redundancy in design and the design of failsafe

measures in protection systems to ensure faults are quickly removed from the system, which biases

protection toward dependability. The pie chart in Figure 4 shows the distribution of misoperations reported

by category. The “Incorrect setting/logic/design errors” have a large impact on the “Unnecessary Trips

during fault” category.

Figure 3 chart (Misoperation By Cause Category): yearly counts for 2016, 2017, and 2018 by cause.

Figure 4: Misoperation by category 2018 data

The bar chart of Figure 5 shows the comparison of the misoperation by category for the period 2016

through 2018.

Figure 5: Misoperation by category 2016 to 2018 trending

Misoperation Category Conclusions and Recommendations:

“Unnecessary Trip during fault” misoperations reduce the reliability of the BES due to unexpected loss

of multiple BES facilities. The “Incorrect setting/logic/design errors” category dominates the

unnecessary trips. Reducing “Incorrect setting/logic/design errors” will improve reliability of the BES

by decreasing the number of events involving the loss of multiple Facilities.

Figure 4 data (2018 Misoperation By Category): Failure to Trip 4%; Slow Trip 2%; Unnecessary Trip other than fault 51%; Unnecessary Trip during fault 43%.

Figure 5 chart (Misoperation By Category): yearly counts for 2016, 2017, and 2018 across Failure to Trip, Slow Trip, Unnecessary Trip other than fault, and Unnecessary Trip during fault.

Misoperation by Voltage Class

There are no observable trends based on misoperations by voltage class from 2016 through 2018 except

for a slight, unexplainable downward trend of misoperations on systems greater than 400 kV.

Voltage Class Misoperation Conclusions and Recommendations:

Under the new NERC 1600 reporting template, WECC receives the numbers of operations and

misoperations per voltage class, which provide one indicator of reliability.

A second indicator of reliability is obtained from the number of Elements in each voltage class, taken

from the Transmission Availability Data System (TADS). The number of misoperations per number of

Elements provides a better trend of what is happening within each voltage class, as shown in Table 1.

Table 1: 2018 WECC TADS Elements

Voltage Class  AC Circuit  Converter  DC Circuit  Transformer  Misoperation Ratio %
0-99 kV        170         0          0           42           5.66
100-199 kV     2552        0          0           151          4.62
200-299 kV     1523        4          3           518          4.76
300-399 kV     166         2          0           146
400-599 kV     278         0          5           214          2.86

The lower misoperations ratio for the >400 kV voltage class could be due to entities having more

experienced engineers create the settings, and a more rigorous setting validation process (such as RTDS

testing) performed on this voltage class.
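The per-Element indicator can be sketched as a short calculation. The Python example below assumes the ratio is the number of misoperations divided by the sum of all Element types in a voltage class; that summation, the function names, and the misoperation count used in the example are assumptions made for illustration, since per-class misoperation counts are not listed in this section. The Element counts are those in Table 1.

```python
# Element counts per voltage class from Table 1, in the order
# (AC circuits, converters, DC circuits, transformers).
TADS_ELEMENTS = {
    "0-99 kV": (170, 0, 0, 42),
    "100-199 kV": (2552, 0, 0, 151),
    "200-299 kV": (1523, 4, 3, 518),
    "300-399 kV": (166, 2, 0, 146),
    "400-599 kV": (278, 0, 5, 214),
}


def total_elements(voltage_class: str) -> int:
    """Sum all Element types for a voltage class (an assumed denominator)."""
    return sum(TADS_ELEMENTS[voltage_class])


def misoperation_ratio(misoperations: int, voltage_class: str) -> float:
    """Return misoperations per Element, as a percentage."""
    return 100.0 * misoperations / total_elements(voltage_class)


# Hypothetical misoperation count for one class, for illustration only.
example_ratio = misoperation_ratio(12, "0-99 kV")
print(f"{example_ratio:.2f}%")
```

Normalizing by Element count allows voltage classes with very different populations to be compared on the same scale.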

Misoperation by Relay Technology

Relay technology refers to one of three broad types of relays: electromechanical, representing the earliest generation of relay technology using basic electrical circuits in conjunction with moving parts; solid-state, representing a second generation of relay technology using transistorized components; and microprocessor, representing the third and latest generation of relay technology using integrated circuit components. The pie chart of Figure 6 shows the distribution of misoperations as reported by relay technology. In 2018, 75% of misoperations involved microprocessor technology, far outpacing the other two types of relay technology.

Figure 6: Misoperation by relay technology 2018 data

A further analysis of misoperations by relay technology showed the two major causes are “Incorrect

settings/logic/design errors” and “Relay failures/malfunctions.” The inventory of relays by technology

is not known, so it is difficult to draw firm conclusions about these misoperation rates. What is

known is that entities are continuing to replace electromechanical and solid-state relays with

microprocessor relays.

Misoperation by Relay Technology Conclusions and Recommendations:

Most microprocessor relays have a published life cycle much shorter than that observed for

electromechanical relays. Entities will need to follow industry reports of specific microprocessor relay failures and

address the aging installed base of relays with a replacement strategy.

Sub-Group Analysis and Observations for Misoperations by Cause

AC and DC Systems Cause Analysis

There were 29 misoperations attributed to AC systems and DC systems for 2018. For analysis purposes,

the misoperations due to AC and DC systems were combined into one cause.

Figure 6 data (2018 Misoperation by Relay Technology): Electromechanical 13%; Microprocessor 75%; Solid State 4%; Other 7%.

Figure 7: AC and DC systems misoperation totals 2016-2018

Based on descriptions of events reported by entities and their corrective action plans, the RWG

separated AC and DC systems into various triggers. Referring to Figure 8, the largest sources of

misoperations in 2018 were equipment failures at 38%, followed by wiring problems/damage at 34%. In

2017, wiring problems/damage was the leading cause at 47%. The reason for the increase in

misoperations associated with wiring problems/damage from 2017 to 2018 is not known.

Figure 7 data (AC and DC, total misoperations by year): 2016: 17; 2017: 30; 2018: 29.

Figure 8: AC and DC systems misoperation triggers 2018 data

In 2018, two-thirds of the misoperations in the AC and DC systems category resulted in an

“Unnecessary Trip other than fault,” with similar numbers over the past three years. The data indicates

that failures in the AC and DC protection system components typically lead to elements unnecessarily

being removed from service.

AC and DC System Misoperation Subcategory Conclusions and Recommendations:

The increase in the number of misoperations attributed to this cause from 2016 correlates with a

decrease in the number reported to the “Unknown/unexplainable” category over this same time. It

appears the review process RWG performs is ensuring the correct cause is being reported.

The RWG maintains that routine maintenance and inspection of wiring and equipment may find most

of the problems before a misoperation occurs. The RWG continues to recommend that regular maintenance

practices extend beyond visual inspection to include, for example, fuse replacement, wire checking,

or additional asset/component replacement.

Figure 8 data (2018 AC and DC System Misoperation Triggers): Equipment Failure 38%; Wiring problem/damage 34%; CT Saturation 14%; PT Transient Response 7%; DC Noise 7%.

Incorrect Setting/Logic/Design Errors Cause Analysis

In 2018, there were 95 misoperations attributed to “Incorrect setting/logic/design errors,” making it the

largest of all cause categories. This is consistent with previous years. Although still the largest cause

category, the total number of misoperations for this cause is lower in 2018 than the previous two years.

While it is premature to consider this improvement a trend, it may be reflective of the concerted effort

that WECC and the RWG have put into thoroughly reviewing reported misoperations, engaging

member utilities on potential areas of improvement, and developing strategies for improvement.

Microprocessor relay technology was involved in 94% of all 2018 misoperations caused by “Incorrect

setting/logic/design errors.” The large percentage of misoperations suggests that, while providing

enhanced system reliability and event analysis, the complexity of these devices may also be a

contributor to misoperations. However, without knowledge of the total populations of the relay

technologies, this cannot be confirmed.

The “Incorrect setting/logic/design errors” cause can be subdivided into Setting, Logic, and Design

errors. In 2018, Setting errors made up 82% of the misoperations, while Logic and Design errors

accounted for 18%. This balance is consistent with 2017. This high number of Setting errors, in conjunction

with statistics on relay type, may be indicative of the complexity of microprocessor relays and their

applications.

Further investigation of the Setting and Design errors, as shown in Figure 9, revealed that

two significant causes of misoperations are incorrect ground overcurrent settings and

miscoordinated transfer trip scheme settings. This data has only been tracked since 2016, but all years

show similar results. Furthermore, in 2018, the number of “Incorrect setting/logic/design”

misoperations for which the cause remained Undetermined dropped to only 1%. This improvement is a

positive sign. Potential contributing factors include a higher installed percentage of microprocessor

relays which create event record data and support misoperations analysis.

Figure 9: 2018 misoperations by Incorrect setting/logic/design errors subdivided into root cause

Misoperation Incorrect Setting/Logic/Design Errors Cause Category Conclusions and

Recommendations:

Entities should:

• Perform peer review that verifies the fault system model is correct, the coordination study is complete, the contingencies within the study are correct, proper setting values are applied to the elements, and the elements required for the application are enabled.

• Develop standards/guidelines pertaining to fault studies and a process for review of new and

existing settings to ensure changes to the system do not result in misoperations. The new

standard PRC-027 will address the periodic review of protection systems.

• Establish a training program for protection schemes and applications.

• Develop a method for applications-based testing and apply it as a quality assurance measure to

new and modified relay applications.

• Review the IEEE Power System Relaying Subcommittee report, “Processes, Issues, Trends and Quality Control of Relay Settings” (Working Group C3 of Power System Relaying Committee of IEEE Power Engineering Society, March 2007), to provide more technical guidance for quality control of protective relay settings.

Figure 9 data (2018 Settings/Design Breakdown): Ground Overcurrent 25%; Ground Distance 4%; Transfer Trip 9%; Undetermined 1%; Other 61%.

Communication Failure Cause Analysis

There were 18 misoperations attributed to Communication Failure during 2018.

The Communication Failure cause code grouping can be subdivided into broad areas. In 2018,

misoperations were reported in five subcategories. Ten of 18 events involved “Unnecessary Trip other

than fault.”

Figure 10: 2018 Communication related protection misoperations by sub-cause

Figure 10 illustrates the reported misoperations in communication-related cause codes within WECC

from 2016 through 2018. The various cause codes assigned to reported system events represented

multiple causes and showed a noticeable increase in the Bad Wires category for 2018. For the previous

four years, this had not been an issue. On review, there were several entities reporting misoperations

caused by bad pilot/copper wires on Hybrid Circuit Breaker (HCB) differential schemes. In addition,

the Bad Wires misoperation count was skewed due to multiple misoperations on the same circuit. For

example, there may have been three misoperations on the same circuit caused by coax cable failure.

The reporting of multiple misoperations on the same circuit is a reasonable explanation for the uptick

in the Bad Wires category.

Figure 11: Communication related misoperations

Misoperation Communication Failure Conclusions and Recommendations:

The number of Communication Failure misoperations within WECC has been small throughout the

2016–2018 timeframe. The category represented 20 of 301 misoperations (6.6%) in 2016 and 18 of 268

misoperations (6.7%) in 2018.

In the future, Communication failures may increase as entities are moving to new technologies and

packet-based communication systems for use in protection systems. Migration to new technology,

specifically packet-based communication, has not widely occurred yet. The impact of implementing

new communication system technology on misoperations is not well understood; it may be that current

installations are not causing misoperations, or that there is not enough detail in the misoperation

submission data fields to identify the type of technology used. When employing new technology,

thorough testing and verification should be performed to assure protection system reliability.

Relay Failures/Malfunctions Cause Analysis

There were 37 misoperations attributed to “Relay failures/malfunctions” in 2018. The data was divided

by relay technology as shown in Figure 12. The failure rates for electromechanical and solid-state

technologies remained consistent from previous years, but failures of microprocessor relays were down

about 30%.

The failures were analyzed versus the misoperation category. Most relay failures caused an

“Unnecessary Trip other than fault.” The data indicated that failures are prone to cause a trip. Figure 12

shows misoperations by relay failures for each category of relay technology.

Figure 12: Misoperation by Relay failures/malfunctions for 2018 subdivided by relay technology

Microprocessor relays consistently experience more failures than the other technologies. Inventories of

each technology are not known individually, so a meaningful comparison is not readily available. One

possible explanation is the aging of the first and second generations of microprocessor relays.

Failures in electromechanical and solid-state relays do appear to be trending downward.

Relay Failures/Malfunctions Cause Category Conclusions and Recommendations:

A thorough investigation of the misoperation is important to understand the root cause and determine

proper corrective actions to mitigate similar issues throughout the entity’s protection systems. Many

failed relays are simply replaced as a corrective action with no further investigation. While this may

resolve the issue on that failed unit, it does not provide details on the reason for the failure. Many

entities will work with the manufacturer to understand the cause of failure. Once the root cause is found,

the fix can be applied throughout the entity’s fleet.

As-left/Personnel Error Cause Analysis

The three-year average from 2016–2018 moves the “As-left/Personnel Error” to the third-highest cause

for misoperations within WECC. There were 26 reported “As-left/Personnel Error” events for 2018,

with 12 of these misoperations resulting in an “Unnecessary Trip other than fault” event. Four “As-

left/Personnel Error” misoperations led to “Failure to Trip” events. Failure to Trip is considered a high-

impact event to the reliability of the system, as the fault remains on the system longer and will require

additional elements to be removed from service.

From the review of the event description and the corrective action plan, the “As-left/Personnel Error”

was divided into three major contributors: wiring errors, testing errors, and switching errors. The data

suggests that wiring and testing errors led to the most misoperations in 2018.

Figure 12 data (Relay Failures by Technology in 2018): Electromechanical 10; Solid State 5; Microprocessor 23.

Misoperation As-Left/Personnel Errors Cause Category Conclusions and Recommendations:

Due to the relatively small number of events in this category, few recommendations are proposed. The largest groups of events were due to wiring and testing errors. These two types of errors can be reduced through commissioning and maintenance processes. Some of the best practices in the industry to avoid “As-left/personnel error” are as follows:

• To avoid leaving incorrect settings on a relay, the technicians performing the work should

compare the “As-Left” settings on the relay to the desired settings given by the setting engineer.

• To avoid wiring errors, the relay scheme should be fully functionally tested to make sure all

inputs and outputs are functioning as desired with the proper response.

• To avoid leaving wiring open, loose, or missing, which can lead to a failure to trip or false trip,

each company should develop and implement a process to be used by the persons performing

the work to ensure all wiring and switches are left in a desired state.
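The first practice above, comparing “As-Left” settings against the desired settings, lends itself to an automated check. The following Python sketch shows one possible approach under stated assumptions: settings are represented as simple name/value dictionaries, and the setting names and values are hypothetical. It is not based on any particular relay vendor's file format or tool.

```python
# Marker used when a setting exists on one side only (illustrative choice).
MISSING = "<missing>"


def compare_settings(desired: dict, as_left: dict) -> dict:
    """Return {name: (desired_value, as_left_value)} for every mismatch."""
    differences = {}
    for name in sorted(set(desired) | set(as_left)):
        want = desired.get(name, MISSING)
        found = as_left.get(name, MISSING)
        if want != found:
            differences[name] = (want, found)
    return differences


# Hypothetical setting names and values, for illustration only.
desired = {"51G_PICKUP_A": 0.5, "51G_TIME_DIAL": 2.0, "21G_ENABLE": True}
as_left = {"51G_PICKUP_A": 0.5, "51G_TIME_DIAL": 3.0, "TEST_MODE": True}

diffs = compare_settings(desired, as_left)
for name, (want, found) in diffs.items():
    print(f"{name}: desired={want!r}, as-left={found!r}")
```

Reporting settings that appear on only one side, in addition to changed values, helps catch elements left disabled or test configurations left in place after the work is complete.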

Unknown/Unexplained Cause Analysis

In 2018 there were 23 events reported with the “Unknown/unexplainable” cause category.

Unknown/unexplained is used when no clear cause can be determined. After extensive investigation,

the submitting entity may select this cause when no other option is suitable, or the operation is still

under investigation.

In 2018, “Unknown/unexplainable” misoperations represented 8.6% of all reported misoperations. In

comparison, the 2016 and 2017 data represented 11% and 6% respectively of all reported misoperations.

When the reason for a misoperation is unknown, corrective actions cannot be taken to prevent another

misoperation from occurring at that terminal, nor can knowledge be gained that would allow the

prevention of a similar misoperation from occurring at another terminal. Therefore, it is desirable to

reduce the number of misoperations that cannot be explained and are categorized as

“Unknown/unexplainable.”

The bar graph of Figure 13 shows the total number of misoperations reported as

“Unknown/unexplainable” for each year from 2016 through 2018. A trend line is provided to demonstrate that

misoperations reported as “Unknown/Unexplainable” are overall trending downward.
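A trend line of this kind can be approximated with an ordinary least-squares fit. The Python sketch below is an illustration under assumptions: the report does not state how its trend line was computed, and the inputs used here are the yearly percentages quoted in the text (11%, 6%, and 8.6%) rather than the counts plotted in the figure. The function name is hypothetical.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


years = [2016, 2017, 2018]
# Share of misoperations reported as "Unknown/unexplainable", in percent.
unknown_share = [11.0, 6.0, 8.6]

slope, intercept = linear_trend(years, unknown_share)
direction = "downward" if slope < 0 else "upward or flat"
print(f"slope = {slope:.2f} percentage points per year ({direction})")
```

With only three data points, the slope is determined entirely by the first and last years, so the fitted direction should be read as indicative rather than conclusive.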

Figure 13: Percentage of misoperations for cause Unknown/unexplainable, trending 2016-2018

Of note is that many of the unknown causes have a corrective action plan to test the system, monitor or

work with the manufacturer. This category is often used when entities perform their quarterly

reporting while still attempting to find the root cause. The RWG has observed that some misoperations reported with

an unknown cause have a corrective action plan and a known cause in MIDAS. When an

“Unknown/unexplainable” misoperation is resolved and a cause is determined, the entity should resubmit the record with the

correct cause category to avoid skewing the numbers.

Misoperation Unknown/Unexplainable cause code Conclusions and Recommendations:

The number of misoperations reported with the “Unknown/unexplainable” cause has increased slightly from 2017 but is still well below what has been reported in previous years. Some entities have found success by strategically placing Digital Fault Recorders (DFRs) on sections of their systems that have more electromechanical relays or a history of operations with unknown causes. These DFRs can provide helpful information about the event that electromechanical relays do not.

Conclusions

The trending of 2016 through 2018 indicates the total number of misoperations has remained constant from year to year. “Incorrect setting/logic/design errors” is the leading cause of misoperations. The majority of misoperations due to setting errors are preventable. Best practices and techniques used to prevent the application of incorrect settings for new protection systems include peer reviews, increased training, more extensive fault studies, and standard templates for setting standard schemes using complex relays. In addition, processes should be created for the installed fleet that include review of existing settings, both periodically and whenever the system topology changes. PRC-027 may affect the number of misoperations in the future as it becomes enforceable.

In 2018, WECC saw the number of misoperations reported as “Unknown/unexplainable” decrease from the

initial collection of data. In previous years, misoperations reported as “Unknown/unexplainable” were either

categorized incorrectly or never updated after the entity concluded its investigation. The

change to the data can be attributed to the RWG's review of the submitted quarterly reports.

In 2018, the NERC 1600 reporting template added sub-cause categories to the “Incorrect

setting/logic/design errors” and “Relay failures/malfunctions” causes. Although the sub-cause entry is not

required, entities that choose to use it provide more detail about the cause, which helps in identifying the root cause.

The NERC MIDAS Work Group Data Reporting Instruction being drafted in 2019 will improve the

quality of data reporting by providing better documentation and examples. The RWG found that event

descriptions continue to improve but often still do not establish the root cause. A root cause is

necessary to determine the proper corrective action to apply to the protection system, to entity

processes, or across all similar installations in the entity’s system.

Recommendations

1. The RWG's review of quarterly misoperation reports is beneficial and should continue.

2. “Unnecessary Trip during fault” reduces the reliability of the BES due to unnecessary loss of

multiple BES facilities. To reduce multi-facility loss for a fault, WECC entities should target

“Incorrect setting/logic/design errors,” which contribute to “Unnecessary Trip during fault,” or

perform periodic review of settings.

3. A second indicator of reliability is the ratio of the number of misoperations to the number

of Elements in a voltage class. Trending is required to verify improvement to the reliability of

the BES within the Western Interconnection.

4. For analysis, the RWG should use the same voltage class ranges that entities use when reporting

operations to NERC MIDAS.

5. Entities should:

a. Perform peer review, which consists of verifying that the system fault model is correct, the

coordination study is complete, the contingencies within the study are correct, proper

setting values are applied to the elements, and the elements needed for the application are enabled.

b. Develop standards/guides pertaining to fault studies and a process for review of new and

existing settings to ensure changes to the system do not result in misoperations. The new

standard PRC-027 will address the periodic review of protection systems.

c. Establish a training program for protection schemes and applications.

d. Develop an applications-based testing methodology and apply as a quality assurance

measure to new and modified relay applications.

6. The RWG maintains that routine maintenance and inspection of wiring and equipment may

find most problems before they cause a misoperation. The RWG continues to recommend that regular

maintenance practices extend beyond visual inspection to include, for example, fuse

replacement, wire checking, or additional asset/component replacement.

7. Identifying the actual cause of a “Relay failure/malfunction” is important for understanding the root cause and

determining the proper corrective action to apply either to the protection system or across all

similar installations. Many entities either replaced the relay or are working with the

manufacturer to understand the cause. Once the root cause is found, the fix should be applied

throughout the entity’s fleet.

8. Best practices in the industry to avoid “As-left/personnel error” are as follows:

a. To avoid leaving incorrect settings on a relay, the technicians performing the work should

compare the “As-Left” settings on the relay to the desired settings given by the setting

engineer.

b. To avoid wiring errors, the relay scheme should be fully functionally tested to make sure all

inputs and outputs are functioning as desired with the proper response.

c. To avoid leaving wiring open, loose, or missing, which can lead to a failure to trip or false

trip, each company should develop and implement a process to be used by the persons

performing the work to ensure all wiring and switches are left in a desired state.
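
The second reliability indicator named in Recommendation 3, misoperations per Element in a voltage class, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class labels, the counts, and the names `VoltageClassCounts` and `misoperations_per_element` are hypothetical and do not come from the report or from NERC MIDAS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoltageClassCounts:
    """Counts for one voltage class (hypothetical structure, not a MIDAS schema)."""
    voltage_class: str   # placeholder label for a voltage class range
    misoperations: int   # misoperations reported in the period
    elements: int        # Elements in this voltage class


def misoperations_per_element(counts: VoltageClassCounts) -> Optional[float]:
    """Return misoperations divided by Elements, or None if no Elements exist."""
    if counts.elements <= 0:
        return None
    return counts.misoperations / counts.elements


# Hypothetical counts for illustration only; these are not values from the report.
sample = [
    VoltageClassCounts("class A", 12, 400),
    VoltageClassCounts("class B", 3, 150),
    VoltageClassCounts("class C", 0, 0),
]

ratios = {c.voltage_class: misoperations_per_element(c) for c in sample}

for name, ratio in ratios.items():
    print(name, "n/a" if ratio is None else f"{ratio:.3f}")
```

Computing the same ratio for each reporting year and comparing the values by voltage class is one way the trending called for in Recommendation 3 could be carried out.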