
Massachusetts Institute of Technology

Artificial Intelligence Laboratory

Memo No. 394, November 1976

LOCAL METHODS FOR LOCALIZING FAULTS

IN ELECTRONIC CIRCUITS

by Johan de Kleer

Abstract: The work described in this paper is part of an investigation of the issues involved in making expert problem solving programs for engineering design and for maintenance of engineered systems. In particular, the paper focuses on the troubleshooting of electronic circuits. Only the individual properties of the components are used, and not the collective properties of groups of components. The concept of propagation is introduced, which uses the voltage-current properties of components to determine additional information from given measurements. Two propagated values can be discovered for the same point. This is called a coincidence. In a faulted circuit, the assumptions made about components in the coinciding propagations can then be used to determine information about the faultiness of these components. In order for the program to deal with actual circuits, it handles errors in measurement readings and tolerances in component parameters. This is done by propagating ranges of numbers instead of single numbers. Unfortunately, the comparing of ranges introduces many complexities into the theory of coincidences. In conclusion, we show how such local deductions can be used as the basis for qualitative reasoning and troubleshooting.

Work reported herein was conducted in part at the Artificial Intelligence Laboratory at the Massachusetts Institute of Technology and the Intelligent Instructional Systems Group at Bolt Beranek and Newman. The Artificial Intelligence Laboratory is supported in part by the Advanced Research Projects Agency of the Department of Defense and monitored by the Office of Naval Research under Contract Number N00014-75-C-0643. The Intelligent Instructional Systems Group is supported in part under contract number MDA903-76-C-0108 jointly sponsored by Advanced Research Projects Agency, Air Force Human Resources Laboratory, Army Research Institute, and Naval Personnel Research & Development Center.



currents on other terminals. For example, the expert for a transistor might, when it sees a base-emitter voltage of less than .55 volts, infer a zero current through the collector.

This propagation scheme is very similar to that used in EL (Sussman and Stallman 1975; Stallman and Sussman 1976). Although the two are similar in that both are based on propagation of constraints, the different goals of analysis and troubleshooting lead to many differences in the details of the two propagation schemes. Therefore, we include a very terse description of our propagation scheme, and the reader is referred to the two EL papers for a deeper explanation of propagation of constraints.

Since EL is primarily interested in analysis, it must discover every value in the circuit. When conventional numeric propagation fails, it resorts to propagating variables and solving algebraic equations. Since we are mainly interested in explanation and not analysis, the propagation of variables and solving of equations is not done.

In order to give explanations for deductions, a record is kept as to which expert made the particular deduction. Most propagations make assumptions about the components involved in making them, and these are stored on a list along with the propagated value. Propagations are represented as:

(<type> <location> (<expert> <arguments>) <assumptions>)

<type> is VOLTAGE or CURRENT.

<location> is a pair of nodes for a voltage and a terminal for a current.

Note that every such propagation has a value associated with it. For those examples where the exact numerical value is important, exact numbers will be included.

The simplest kinds of propagations require no assumptions at all. These are the Kirchhoff voltage and current laws.


[Figure: circuit fragment with terminals T/1, T/2 and T/3 and nodes N1, N2 and N3.]

The circuit consists of components such as resistors and capacitors. The terminals of these components are connected to nodes, at which two or more terminals are joined. In the above diagram T/1, T/2 and T/3 are terminals and N1, N2 and N3 are nodes. Currents are normally associated with terminals, and voltages with nodes.

Kirchhoff's current law states that if all but one of the terminal currents of a component or node are known, the last terminal current can be deduced.

(CURRENT T/1)

(CURRENT T/2)

(CURRENT T/3 (KCL N1) NIL)

Since faults in circuit topology are not considered, KCL makes no new assumptions about the circuit.

Kirchhoff's voltage law states that if two voltages are known relative to a common point, the voltage between the two other nodes can be computed:

(VOLTAGE (N1 N2))

(VOLTAGE (N2 N3))

(VOLTAGE (N1 N3) (KVL N1 N2 N3) NIL)

As with KCL, KVL makes no new assumptions about the circuit.
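The two Kirchhoff deductions can be sketched in Python. This is only an illustration of the idea, not the program described here: the function names `kcl` and `kvl`, the tuple layout of the records, and the sign convention (currents into a node sum to zero) are all assumptions made for the example.

```python
def kcl(node, unknown_terminal, known_currents):
    """Deduce the one unknown terminal current at a node.

    known_currents maps terminal name -> current flowing into the node
    (an assumed sign convention). Returns (value, propagation record).
    """
    value = -sum(known_currents.values())
    # KCL adds no assumptions, so the assumption list is empty (NIL).
    return value, ("CURRENT", unknown_terminal, ("KCL", node), [])


def kvl(n1, n2, n3, v12, v23):
    """Deduce the voltage from n1 to n3 given n1-n2 and n2-n3."""
    value = v12 + v23
    # KVL adds no assumptions either.
    return value, ("VOLTAGE", (n1, n3), ("KVL", n1, n2, n3), [])


i3, i3_record = kcl("N1", "T/3", {"T/1": 0.002, "T/2": 0.003})
v13, v13_record = kvl("N1", "N2", "N3", 2.0, 3.0)
```

Both records carry an empty assumption list, mirroring the NIL in the propagations above.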

One of the most basic types of circuit element is the resistor. Assuming the resistance of the resistor to be correct, the voltage and current can be deduced from each other using Ohm's law:

(CURRENT R1)

(VOLTAGE (N1 N2) (RESISTORI R1) (R1))

(VOLTAGE (N1 N2))

(CURRENT R1 (RESISTORV R1) (R1))

(In all the example propagations presented so far it was assumed that the prerequisite values had no assumptions; otherwise they would have been included in the final assumption list.)
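A minimal sketch of the two resistor deductions, showing how the assumption list grows when a prerequisite value already carries assumptions. The function names, argument order and record layout are hypothetical; only the expert names RESISTORI and RESISTORV come from the text.

```python
def resistor_i_to_v(name, resistance, current, current_assumptions=()):
    """RESISTORI: deduce the voltage across a resistor from its current."""
    voltage = current * resistance
    # The result depends on the resistor being correct and on whatever
    # the incoming current already depended on.
    assumptions = sorted(set(current_assumptions) | {name})
    return voltage, ("VOLTAGE", ("RESISTORI", name), assumptions)


def resistor_v_to_i(name, resistance, voltage, voltage_assumptions=()):
    """RESISTORV: deduce the current through a resistor from its voltage."""
    current = voltage / resistance
    assumptions = sorted(set(voltage_assumptions) | {name})
    return current, ("CURRENT", ("RESISTORV", name), assumptions)


# A measured current through R1 gives a voltage that assumes R1;
# feeding that voltage to R2 gives a current that assumes R1 and R2.
v, v_record = resistor_i_to_v("R1", 1000.0, 0.002)
i, i_record = resistor_v_to_i("R2", 500.0, v, v_record[2])
```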

These three kinds of propagations suggest a simple propagation theory. First, Kirchhoff's voltage law can be applied to every new voltage discovered in the circuit. Then, for every node and component in the circuit, Kirchhoff's current law can be applied. Finally, for every component which has a newly discovered current into it or voltage across it, its VIC is studied to determine further propagations. If this produces any new voltages or currents, the procedure is repeated.
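The repeat-until-nothing-new procedure can be sketched as a fixed-point loop. The circuit below (R1 of 1000 ohms between N1 and N2 in series with R2 of 2000 ohms between N2 and N3), the string keys, and the rule functions are all hypothetical; the sketch only shows the control structure, not the actual propagator.

```python
def propagate(facts, rules):
    """Apply every rule repeatedly until no rule yields a new quantity.

    facts maps a quantity name to (value, assumptions). Each rule takes
    the current facts and returns a dict of deduced facts (maybe empty).
    """
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for key, fact in rule(facts).items():
                if key not in facts:  # keep only the first derivation
                    facts[key] = fact
                    changed = True
    return facts


def ohm_r1(facts):
    if "V(N1,N2)" in facts:
        v, assumed = facts["V(N1,N2)"]
        return {"I(R1)": (v / 1000.0, assumed | {"R1"})}
    return {}


def kcl_n2(facts):
    # Series connection: the current through R1 also flows through R2.
    return {"I(R2)": facts["I(R1)"]} if "I(R1)" in facts else {}


def ohm_r2(facts):
    if "I(R2)" in facts:
        i, assumed = facts["I(R2)"]
        return {"V(N2,N3)": (i * 2000.0, assumed | {"R2"})}
    return {}


def kvl_n1_n3(facts):
    if "V(N1,N2)" in facts and "V(N2,N3)" in facts:
        (v1, a1), (v2, a2) = facts["V(N1,N2)"], facts["V(N2,N3)"]
        return {"V(N1,N3)": (v1 + v2, a1 | a2)}
    return {}


# One measured voltage (no assumptions) is enough to derive the rest.
facts = propagate({"V(N1,N2)": (1.0, frozenset())},
                  [kvl_n1_n3, ohm_r2, kcl_n2, ohm_r1])
```

The rules are deliberately listed in an unhelpful order, so several passes are needed before the loop reaches a fixed point.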

The current through a capacitor is always zero, so the current contribution of a capacitor terminal to a node can always be determined.

(CURRENT C (CAPACITOR C) (C))



corroborations, the simple troubleshooting scheme used the principle that a coincidence indicated that all of the components in the assumption list were cleared from suspicion. This principle must be studied with much greater scrutiny, as there are a number of cases for which it doesn't hold.

In order to do this we must examine the precise nature of the propagations and, more importantly, examine the relation between a single value used in a propagation and the final propagated value. Consider a propagated value derived from studying the component D. Let the resulting current or voltage value be f(D). The propagator is entirely linear, so the propagated value at any point can be written as a linear expression of sums of products involving measured and propagated values. For every component, current and voltage vary directly with each other and not inversely. Hence, in the expression for the final propagated value, f(D) can never appear in the denominator. So the final value can be written as:

value = a f(D) + b

where a and b are arbitrary expressions not involving D. The relation between f(D) and the final propagated value is characterized by a. By studying the nature of component experts, the structure of a can be determined. Every expert derives f(D) either by multiplying the incoming value v(D) by a parameter, or by applying a simple comparison test to v(D). As many such comparison tests can be involved in a single propagation, each propagation can have a predicate associated with it indicating what conditions must be true for the propagation to hold. With both kinds of propagations there is a problem if a is zero. In that case, f(D) has no influence on the final value and so a coincidence says nothing about the validity of f(D).
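The role of the coefficient a can be illustrated numerically. Because the final value is linear in f(D), a can be recovered by perturbing f(D) and observing the change in the final value. The helper `influence` and the two example chains are hypothetical, chosen only to contrast a nonzero a with a zero one.

```python
def influence(final_value, f_d, delta=1.0):
    """Recover a in value = a*f(D) + b by perturbing f(D).

    For a linear final_value the difference quotient is a itself.
    """
    return (final_value(f_d + delta) - final_value(f_d)) / delta


def depends_on_d(f):
    # Final value really depends on f(D): a = 3, b = 2.
    return 3.0 * f + 2.0


def cancels_d(f):
    # f(D) enters and then cancels again: a = 0, b = 5.
    return (f + 5.0) - f


a_nonzero = influence(depends_on_d, 1.0)
a_zero = influence(cancels_d, 1.0)
```

In the second chain a coincidence at the final value would match whatever f(D) was, so it could not clear D from suspicion.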

A corroboration with a propagation involving a predicate only indicates that the incoming value v(D) of the predicate lies within the tested range, thus saying little about the assumptions which were used to derive v(D). Note, however, that in a contradiction the predicate may be testing an erroneous value, and thus v(D) might be incorrect. We shall call these assumptions, which corroborations do not remove from suspicion, the secondary assumptions of the propagation, and the remaining, the primary assumptions.
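The different treatment of primary and secondary assumptions can be sketched as a suspect-set update. The function name and the use of Python sets are assumptions of this sketch, and narrowing the suspects on a contradiction presumes a single-fault theory.

```python
def update_suspects(suspects, primary, secondary, corroborated):
    """Return the suspect set after one coincidence.

    corroborated=True: the values agree, so only the primary
    assumptions are cleared; secondary ones stay under suspicion.
    corroborated=False: a contradiction, so (assuming a single fault)
    the fault lies among the primary or secondary assumptions.
    """
    if corroborated:
        return suspects - set(primary)
    return suspects & (set(primary) | set(secondary))


suspects = {"R11", "Q1", "R9", "Q2", "R3"}

# A corroboration whose propagation assumed R9 and tested Q2 in a predicate.
after_corroboration = update_suspects(suspects, ["R9"], ["Q2"], True)

# A contradiction involving primaries R11, Q1, R9 and secondary Q2.
after_contradiction = update_suspects(
    suspects, ["R11", "Q1", "R9"], ["Q2"], False)
```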

The situation for which a is zero can be partially characterized. Using the same assumption more than once in a propagation is relatively rare. In such a single-assumption propagation a must



must wait till it is proven before using this information. However, in a single fault theory a very interesting deduction can still be made. It is easier to see in formal terms: A splitting B really says valid(A) ⊃ valid(B), while A corroborating B says valid(A) ≡ valid(B). Consider valid(A) ⊃ valid(B). If the assumptions of A and B are not disjoint, construct a B* that does not mention the common assumptions. Now valid(A) ⊃ valid(B*) also implies invalid(B*) ⊃ invalid(A). But the assumptions of B* and A are disjoint and the circuit can have only one fault, so B* and A cannot both be invalid. Hence B* must be perfectly correct. In summary, the split of B by A in a single fault theory implies that all the assumptions involved with B are correct (i.e. a corroboration of B with truth) and nothing about the assumptions of A. This corresponds with our intuition; a split is a kind of corroboration in which one of the propagations is much stronger than the other, and as such the corroboration only comments on the weaker of the two propagations.
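The single-fault argument can be checked by brute force on a small example. The component names and the helper functions are hypothetical; the sketch enumerates every single-fault world (plus the no-fault world) and keeps those consistent with valid(A) implying valid(B*).

```python
def valid(assumptions, faulty):
    """A propagation is valid when none of its assumptions is the fault."""
    return faulty is None or faulty not in assumptions


def worlds_consistent_with_split(a_assumptions, b_star_assumptions, components):
    """Single-fault worlds in which valid(A) implies valid(B*)."""
    worlds = [None] + list(components)  # None stands for no fault at all
    return [w for w in worlds
            if (not valid(a_assumptions, w)) or valid(b_star_assumptions, w)]


a = {"R1", "R2"}
b_star = {"R3", "Q1"}  # disjoint from the assumptions of A
components = ["R1", "R2", "R3", "Q1", "R4"]

consistent = worlds_consistent_with_split(a, b_star, components)
b_star_always_valid = all(valid(b_star, w) for w in consistent)
```

Every surviving world leaves B* valid, while worlds in which R1 or R2 is faulted survive too, so nothing is learned about A.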

Although the range mechanism was introduced to handle errors in measurements and component parameters, it can also be used to deal with new kinds of propagations that would have been impossible in the simple scheme. Noticing that the collector current of a transistor is large leads to the deduction that its base-emitter voltage must be between .5 and 1 volt. With the range mechanism this kind of propagation can now be included: propagate the range [.5 , 1]. There are many possible uses for this idea. Every diode could propagate a non-negative current through itself. Every transistor could propagate a base-emitter voltage of less than 1 volt. The voltage at every node could be asserted to be less than the sum of the voltage sources in the circuit. More interestingly, it could handle the problem of having a range propagated over a discontinuous device: a [-1 , +1] current range propagated into a diode should have its lower limit modified to 0 (i.e. [0 , +1]).
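A minimal sketch of range propagation, including the diode adjustment just described. The `Range` class, its method names and the example numbers are assumptions of this sketch rather than the program's representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    lo: float
    hi: float

    def __add__(self, other):
        """Sum of two uncertain quantities (as in KVL or KCL)."""
        return Range(self.lo + other.lo, self.hi + other.hi)

    def scale(self, k_lo, k_hi):
        """Multiply by a parameter known only to lie in [k_lo, k_hi]."""
        products = [self.lo * k_lo, self.lo * k_hi,
                    self.hi * k_lo, self.hi * k_hi]
        return Range(min(products), max(products))

    def clip_low(self, bound):
        """Raise the lower limit, e.g. a diode current cannot be negative."""
        return Range(max(self.lo, bound), max(self.hi, bound))


# A current of 1 to 2 mA through a 1000 ohm resistor with 10% tolerance.
voltage = Range(0.001, 0.002).scale(900.0, 1100.0)

# A [-1, +1] current range entering a diode becomes [0, +1].
diode_current = Range(-1.0, 1.0).clip_low(0.0)
```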

When a significant propagation occurs which overlaps a test point of a discontinuous component, the best strategy is to interpret that measurement as having too wide an error associated with it and stop the propagation there. In general, when error tolerances in propagated values become absurd (a significant fraction or multiple of the central value) the propagation should be artificially stopped.
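One way such a stopping rule might look is sketched below. The function name and the threshold of one half are arbitrary assumptions; the text only says that the error should not be a significant fraction or multiple of the central value.

```python
def too_uncertain(lo, hi, max_relative_width=0.5):
    """True when the spread of [lo, hi] is large relative to its centre."""
    centre = (lo + hi) / 2.0
    width = hi - lo
    if centre == 0.0:
        # Any spread around zero is infinitely large in relative terms.
        return width > 0.0
    return width / abs(centre) > max_relative_width


keep_going = not too_uncertain(0.012, 0.016)   # spread is about 29% of centre
stop_here = too_uncertain(0.0036, 0.475)       # spread is about twice the centre
```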



measurements about which it already knows something (so as to produce a coincidence).

Since only measurements at points about which something is explicitly known are considered, the information provided by coincidences between solely propagated values (the result of incompleteness in the propagator) cannot enter into consideration. Thus the basic approach of the troubleshooter is to make no hypothetical measurements and look only at those propagations with unverified assumptions as predictions to try to coincide with. Unexpected information, such as that provided by coincidences between propagated values, cannot be considered in that paradigm (although making hypothetical measurements would handle this problem).

If we are only prepared to look ahead one measurement, our original search scheme remains reasonable. The binary search for the best measurement must, of course, be reorganized. Since a corroboration may eliminate a different number of components from suspicion than a contradiction, the search is not purely binary. A workable solution is to just take the average of the number of components which would be verified in each case as the measurement's score. Then the measurement whose score is nearest to half the number of possibly faulted components could be chosen as the next measurement.
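The scoring rule can be written down directly. The function name, the dictionary layout and the candidate measurements are hypothetical; the counts would in practice come from the assumption lists of the propagations at each candidate point.

```python
def choose_measurement(candidates, n_suspects):
    """Pick the measurement whose score is nearest half the suspect count.

    candidates maps a measurement name to a pair
    (cleared_if_corroborated, cleared_if_contradicted).
    """
    def score(pair):
        # Average of the components verified in the two possible outcomes.
        return (pair[0] + pair[1]) / 2.0

    target = n_suspects / 2.0
    return min(candidates, key=lambda m: abs(score(candidates[m]) - target))


candidates = {
    "V(N1,N2)": (1, 5),       # score 3.0
    "I(R9)": (2, 2),          # score 2.0
    "V(N2,GROUND)": (5, 3),   # score 4.0
}
best = choose_measurement(candidates, n_suspects=8)
```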

There remains the issue of generating an explanation for this choice. Although the above argument for deriving a future choice of measurement could be made understandable to humans, it does not always admit a very good explanation. A large part of the explanation for a future choice of measurement involves indicating why a certain component cannot be faulted. Once a component is eliminated from suspicion for any reason it is never considered again. However, a later measurement might give a considerably better explanation for its non-faultiness. The problem of generating good explanations, of course, also must take into account a model of the student and what he knows about electronics and the particular circuit in question.

The above scheme for selecting measurements does not take into account how close the measurement is to the actual components in question. For example, a voltage measurement across two unverified resistors is just as good as a measurement many nodes away which also has only those two resistors as unverified assumptions. Fortunately these can be easily detected: just remove from the list of possible measurements all those which are propagated from other elements on the



(= (CURRENT R11 (KCL N3) (Q1 R9) (Q2)) [.00015 , .00011])

(= (VOLTAGE (N1 N3) (RESISTORI R11) (Q1 R9 R11) (Q2)) [.26 , .18])

(= (CURRENT C/Q1 (TRANOFF Q1) (R11 Q1 R9) (Q2)) [1.E-6 , 4.0E-5])

A contradiction occurs. The new propagation is "better" than the old one. The old propagation cannot be removed in favor of the new propagation because it is an antecedent of the new propagation. We conclude that one of R11, Q1, R9 or Q2 must be faulted.

Consider the problem of R9 being open:

(= (CURRENT C/Q2 (MEAS 110001) NIL NIL) [.00033 , .00036])

(= (CURRENT B/Q2 (BETA Q2 C/Q2) (Q2) NIL) [2.2E-6 , 7.2E-6])

(= (CURRENT E/Q2 (BETA Q2 C/Q2) (Q2) NIL) [-.00037 , -.00033])

(= (VOLTAGE (N2 GROUND) (MEAS 110002) NIL NIL) [44 , 49])

(= (CURRENT R9 (RESISTORV R9) (R9) NIL) [.012 , .0163])

(= (CURRENT C/Q1 (KCL N2) (R9) (Q2)) [.012 , .016])

(= (CURRENT B/Q1 (BETA Q1 C/Q1) (Q1 R9) (Q2)) [8E-5 , .00033])

(= (CURRENT E/Q1 (BETA Q1 C/Q1) (Q1 R9) (Q2)) [.017 , .012])

(= (CURRENT R11 (KCL N3) (Q1 R9) (Q2)) [2.6E-6 , .0003])

(= (VOLTAGE (N1 N3) (RESISTORI R11) (R11 Q1 R9) (Q2)) [.0036 , .475])

(= (CURRENT C/Q1 (TRANOFF Q1) (R11 Q1 R9) (Q2)) [1.E-6 , 4.E-5])

This contradiction indicates that one of R11, Q1, R9 or Q2 is faulted.

In this example the circuit has no faults.



method of attack will be from two directions. First, problems inherent in the earlier propagation scheme can be alleviated with other knowledge about the circuit. Second, many of the kinds of troubleshooting strategies we see in humans cannot be captured even by a generalization of the proposed scheme. One of the basic issues is that of teleology. The more teleological information one has about the circuit, the more different the troubleshooting process becomes. Currently, most of the ideas presented in this paper so far have been implemented in a program, so that much of the discussion derives its observations from actual interactions with the program.

The most arresting observation is that the propagator sometimes cannot propagate values very far, and at other times it propagates values beyond the point of absurdity. Examining those propagations which go too far, the most dominant characteristic is that either the value itself has too high an error associated with it, or the propagation itself is not relevant to the issues in question. The former problem can be more easily answered by more stringent controls on the errors in propagations. The latter requires an idea of localization of interaction. This idea of a theater of interactions would limit senseless propagation; however, it requires a more hierarchical description of the circuit.

The idea that every measurement must have a purpose points out the basic problem: our troubleshooter cannot make intelligent measurements until it has, by accident, limited the number of possible faults to a small subset of all the components in the circuit. After this discovery has been made, which the troubleshooter is not given and must make by itself, fairly intelligent suggestions can be made. However, as such a discovery is usually made when the set of possible faults is reduced to about five components, it can only intelligently troubleshoot in the last few (two or three) measurements that are made in the circuit.

Clearly, many measurements are made before this discovery and the troubleshooter cannot do anything intelligent during this period. Still, the propagation scheme and the ideas of corroborations and contradictions can be effectively used even during this period.

The only way intelligent measurements can be made during this period is by knowing something about how the circuit should be behaving. This requires teleological information about the circuit. For example, just to know that the circuit is faulted and requires troubleshooting requires teleology. In the situations where the propagator did not propagate very far, the problem usually was that some simple teleological assumption could have been made. The voltages and currents at many points in the circuit remain relatively constant for all instantiations of the circuit, and furthermore many of them can be easily deduced (e.g. knowing certain voltage and current sources such as the power supply, knowing contributions by certain components to be small, etc.). Propagation can then proceed much further. Of course, the handling of coincidences requires modifications, and a new kind of strategy to deal with teleological coincidences needs to be developed.

Coincidences provided information only about the assumptions of the propagations involved. Since the only kind of assumptions we were considering were those about the faultedness of components, the consequences of violating assumptions were obvious. The consequences of violating a teleological assumption are not at all obvious and require more knowledge about the circuit. The point is that the ability to propagate teleological assumptions is just a small step towards dealing with teleology.

In his thesis Brown deals primarily with how to represent and use teleological knowledge in troubleshooting. Although propagation plays only a small role in his theory, many of his ideas address the problems that we have been discussing in this section.

FUTURE RESEARCH

The previous sections have sketched out the necessity for more teleological and non-local knowledge. Since Brown addressed this problem, one obvious direction for research is to try to incorporate his ideas. This direction suffers from two difficulties. First, Brown never implemented his ideas and thus they require a major effort to become actually utilizable. (The troubleshooter based on the ideas of this paper (INTER) is working and requires a practical theory of teleology.) Second, Brown's troubleshooting theory would not be usable in a tutoring context where the expert must be able to understand the student's troubleshooting strategy.

Fortunately, there appears to be a rather simple strategy based on the existing propagator which can be used to deal with non-local knowledge. The idea is based on observations that students often reason something like: "If the voltage limiter is off and it should be off, then the constant voltage source cannot be contributing to the observed symptom." Note that this argument is not in terms of numerical quantities, but is in terms of states of the components and sections. The component experts can be modified to determine what state the components are in. These observations could then be asserted in a data-base.

This collection of assertions forms a qualitative description of the state of the circuit. Of course, the assertions, like propagations, have their assumptions stored with them. Circuit specific theorems can then be encoded referring to assertions in the description space. The rule of the previous paragraph might be encoded as:

(STATE voltage-limiter off) ∧ (CORRECT-STATE voltage-limiter off) ⊃ (OK constant-voltage-source)

It appears that only a small number of such theorems are necessary to determine what is known about a circuit from a set of measurements. The theorems are, of course, very circuit specific. Since only a few of them are required for any specific circuit, the principle is still usable.
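A data base of qualitative assertions with circuit-specific theorems might be sketched as a small forward-chaining loop. The tuple encoding of assertions, the function `apply_theorems` and the assumption label Q5 are hypothetical; only the voltage-limiter rule itself is taken from the text.

```python
def apply_theorems(assertions, theorems):
    """Forward-chain theorems over assertions until nothing new appears.

    assertions maps a fact tuple to the set of assumptions it rests on.
    Each theorem is a pair (list_of_premises, conclusion).
    """
    derived = dict(assertions)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in theorems:
            if conclusion not in derived and all(p in derived for p in premises):
                # A conclusion inherits the assumptions of its premises.
                support = set()
                for p in premises:
                    support |= derived[p]
                derived[conclusion] = support
                changed = True
    return derived


assertions = {
    ("STATE", "voltage-limiter", "off"): {"Q5"},  # Q5 is a made-up assumption
    ("CORRECT-STATE", "voltage-limiter", "off"): set(),
}
theorems = [
    ([("STATE", "voltage-limiter", "off"),
      ("CORRECT-STATE", "voltage-limiter", "off")],
     ("OK", "constant-voltage-source")),
]
description = apply_theorems(assertions, theorems)
```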

The local reasoning strategy isolates the qualitative reasoner from worrying about many of the idiosyncrasies of propagating numerical values by describing the circuit in qualitative terms. This is giving us the opportunity to try many different kinds of qualitative reasoning strategies. The failings of the local troubleshooting strategy are also showing exactly where this qualitative reasoning is required.



REFERENCES:

Brown, A.L., Qualitative Knowledge, Causal Reasoning, and the Localization of Failures: a Proposal for Research, Artificial Intelligence Laboratory, WP-61, Cambridge: M.I.T., 1974.

Brown, A.L., Qualitative Knowledge, Causal Reasoning, and the Localization of Failures, Artificial Intelligence Laboratory, forthcoming TR, Cambridge: M.I.T., 1976.

Brown, A.L., and G.J. Sussman, Localization of Failures in Radio Circuits: a Study in Causal and Teleological Reasoning, Artificial Intelligence Laboratory, AIM-319, Cambridge: M.I.T., 1974.

Brown, John Seely, Richard R. Burton and Alan G. Bell, SOPHIE: A Sophisticated Instructional Environment for Teaching Electronic Troubleshooting (An example of AI in CAI), Final Report, B.B.N. Report 279, A.I. Report 12, March 1974.

Stallman, R.M., and G.J. Sussman, Forward Reasoning and Dependency-Directed Backtracking in a System for Computer-Aided Circuit Analysis, Artificial Intelligence Laboratory, AIM-380, Cambridge: M.I.T., 1976.

Sussman, G.J., and R.M. Stallman, Heuristic Techniques in Computer Aided Circuit Analysis, Artificial Intelligence Laboratory, AIM-328, Cambridge: M.I.T., 1975.
