068-a bist tpg for low power dissipation and high fault coverage


  • 8/14/2019 068-A BIST TPG for Low Power Dissipation and High Fault Coverage


    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 15, NO. 7, JULY 2007 777

    A BIST TPG for Low Power Dissipation and High Fault Coverage

    Seongmoon Wang

    Abstract: This paper presents a low hardware overhead test pattern generator (TPG) for scan-based built-in self-test (BIST) that can reduce switching activity in circuits under test (CUTs) during BIST and also achieve very high fault coverage with reasonable lengths of test sequences. The proposed BIST TPG decreases transitions that occur at scan inputs during scan shift operations and hence reduces switching activity in the CUT. The proposed BIST consists of two TPGs: LT-RTPG and 3-weight WRBIST. Test patterns generated by the LT-RTPG detect easy-to-detect faults and test patterns generated by the 3-weight WRBIST detect faults that remain undetected after LT-RTPG patterns are applied. The proposed BIST TPG does not require modification of mission logics, which can lead to performance degradation. Experimental results for ISCAS89 benchmark circuits demonstrate that the proposed BIST can significantly reduce switching activity during BIST while achieving 100% fault coverage for all ISCAS89 benchmark circuits. Larger reduction in switching activity is achieved in large circuits. Experimental results also show that the proposed BIST can be implemented with low area overhead.

    Index Terms: Built-in self-test (BIST), heat dissipation during test application, low power testing, power dissipation during test application, random pattern testing.

    I. INTRODUCTION

    Since in built-in self-test (BIST), test patterns are generated and applied to the circuit-under-test (CUT) by on-chip hardware, minimizing hardware overhead is a major concern of BIST implementation. Unlike stored pattern BIST, which requires high hardware overhead due to memory devices required to store precomputed test patterns, pseudorandom BIST, where test patterns are generated by pseudorandom pattern generators such as linear feedback shift registers (LFSRs) and cellular automata (CA), requires very little hardware overhead. However, achieving high fault coverage for CUTs that contain many random pattern resistant faults (RPRFs) only with (pseudo) random patterns generated by an LFSR or CA often requires unacceptably long test sequences, thereby resulting in prohibitively long test time. The random pattern test length required to achieve high fault coverage is often determined by only a few RPRFs [1].

    Several techniques have been proposed to address this problem. Reseedable and/or reconfigurable LFSRs are proposed in [2]-[4]. In [5] and [6], random patterns that do not detect any new faults are mapped into deterministic tests for RPRFs. In test point insertion (TPI) techniques [7], [8], control and observation points are inserted at selected gates to improve detection probabilities of RPRFs. In weighted random pattern testing [1], [9]-[11], the outputs of the test pattern generator (TPG) are biased to generate test sequences that have nonuniform signal probabilities to increase detection probabilities of RPRFs that escape pseudorandom test sequences, which have a uniform signal probability of 0.5. Random pattern generators proposed in [12] and [13] use Markov sources to exploit spatial correlation between state inputs that are consecutively located in the scan chain. A 3-weight weighted random BIST (3-weight WRBIST) can be classified as an extreme case of conventional weighted random pattern testing BIST. However, in contrast to conventional weighted random pattern testing BIST where various weights, e.g., 0, 0.25, 0.5, 0.75, 1.0, can be assigned to outputs of TPGs, in 3-weight WRBIST, only three weights, 0, 0.5, and 1, are assigned. Since only three weights are used, circuitry to generate weights is simple; weight 1 (0) is obtained by fixing a signal to a 1 (0) and weight 0.5 by driving a signal by an output of a pseudorandom pattern generator, such as an LFSR. Weight sets are calculated from test cubes for RPRFs.

    Manuscript received May 24, 2006; revised December 22, 2006. The author is with NEC Laboratories, America, Princeton, NJ 08540 USA.

    Digital Object Identifier 10.1109/TVLSI.2007.899234

    Though the attainment of high fault coverage with practical lengths of test sequences is still one major concern of BIST techniques, reducing switching activity has become another important objective. It has been observed that switching activity during test application is often significantly higher than that during normal operation [14]. The correlation between consecutive random patterns generated by an LFSR is low; this is a well-known property of LFSR generated patterns [1], [15]. On the other hand, significant correlation exists between consecutive patterns during the normal operation of a circuit. Hence, switching activity in a circuit can be significantly higher during BIST than that during its normal operation. Finite-state machines are often implemented in such a manner that vectors representing successive states are highly correlated to reduce power dissipation [16]. However, use of design-for-testability (DFT) techniques such as scan significantly decreases the correlation between consecutive state vectors. Use of scan allows patterns that cannot appear during normal operation to be applied to the state inputs of the CUT during test application. Furthermore, the values applied at the state inputs of the CUT during scan shift operations represent shifted values of test vectors and circuit responses and have no particular temporal correlation. Excessive switching activity due to low correlation between consecutive test patterns can cause several problems [14], [17]-[19].

    Since heat dissipation in a CMOS circuit is proportional to switching activity, a CUT can be permanently damaged due to excessive heat dissipation if switching activity in the circuit during test application is much higher than that during its normal operation. Heat dissipated during test application is already influencing the design of test methodologies for practical circuits [14], [19], [20].

    1063-8210/$25.00 2007 IEEE


    Metal migration (electromigration) causes erosion of conductors and subsequent failure of circuits [21]. Since temperature and current density are major factors that determine the electromigration rate, elevated temperature and current density caused by excessive switching activity during test application will severely decrease reliability of CUTs. This is even more severe in circuits equipped with BIST since such circuits are tested frequently.

    To test a bare die, power must be supplied during the period of test through probes, which typically have higher inductances than power and ground pins of the circuit package. Hence, the bare die under test will experience higher power/ground noise, which is given by $L\,(di/dt)$, where $L$ is the inductance of the power and ground lines and $di/dt$ is the rate of change of current flowing in the power and ground lines. Excessive power/ground noise can erroneously change the logic state of circuit lines, causing some good dies to fail the test, leading to unnecessary loss of yield.

    In this paper, we propose a low hardware overhead scan-based BIST technique that can achieve very high fault coverage without the risk of damaging CUTs due to excessive switching activity during BIST. Recently, techniques to reduce switching activity during BIST have been proposed in [17] and [22]-[25]. A straightforward solution is to reduce the speed of the test clock during scan shift operations. However, since most test application time of scan-based BIST is spent for scan shift operations (typically $m/(m+1)$, where $m$ is the number of scan flip-flops in the longest scan chain), this will increase test application time by about a factor of $r$ if scan flip-flops are clocked at $1/r$ speed during scan shift operations. Furthermore, reducing the clock speed does not solve high power/ground noise that is caused by a large number of simultaneous transitions in the circuit. A technique that uses enhanced flip-flops to isolate mission logics of the CUT from scan chains is proposed in [17]. Major disadvantages of this technique are performance degradation and area overhead entailed by adding extra logics to isolate mission logics from scan chains. Techniques to schedule tests under power constraints are proposed in [14], [19], and [20]. Test scheduling techniques can reduce overall chip power dissipation. However, these techniques cannot solve the hot spot problem that is caused by temperature being excessively elevated at a small area of the chip.

    Though techniques to reduce switching activity during scan BIST have been extensively studied, very few papers simultaneously address both excessive switching activity and fault coverage. A BIST TPG that can achieve high fault coverage and also reduce switching activity during BIST is proposed for single scan chain designs in [26], which augments the LT-RTPG [23] with the serial fixing 3-weight WRBIST [27]. It is shown that the 3-weight WRBIST can achieve very high fault coverage with low hardware overhead [27]. The LT-RTPG proposed in [23] generates correlated test patterns that can reduce transitions at state inputs during scan shift operations. The serial fixing 3-weight WRBIST can also generate test patterns that cause less switching activity during BIST. This paper is a significant extension of [26]; in particular, a technique to optimize TPGs for multiple scan chain designs is proposed.

    The rest of this paper is organized as follows. Section II briefly introduces the serial fixing 3-weight WRBIST [27]. The techniques that are used in this paper to reduce switching activity during BIST are illustrated in Section III. The architecture of the proposed TPG and the outline of the algorithm used to design the proposed BIST TPG are described in Section IV. A technique to minimize hardware overhead for implementing the proposed BIST TPG for circuits with multiple scan chains is presented in Section V. Section VI reports experimental results. Finally, Section VII presents the conclusions.

    II. 3-WEIGHT WRBIST

    A. Generator

    In this paper, we assume that the sequential CUT has primary and state inputs, and employs full-scan. Even though the proposed BIST TPG is applicable to scan designs with multiple scan chains, we assume, only for clarity and convenience of illustration, that all primary and state inputs are driven by a single scan chain unless stated otherwise (application to multiple scan chains is discussed separately in Section V). A test cube is a test pattern that has unspecified inputs. The detection probability of a fault is defined as the probability that a randomly generated test pattern detects the fault [1]. In the 3-weight WRBIST scheme, fault coverage for a random pattern resistant circuit is enhanced by improving detection probabilities of RPRFs; the detection probability of an RPRF is improved by fixing some inputs of the CUT to the values specified in a deterministic test cube for the RPRF. A generator (or weight set) is a vector that represents weights that are assigned to inputs of the circuit during 3-weight WRBIST. Inputs that are assigned weight 1 (0) are fixed to 1 (0) and inputs that are assigned weight 0.5 are driven by outputs of the pseudorandom pattern generator, such as an LFSR or a CA. A generator is calculated from a set of deterministic test cubes for RPRFs.

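    Consider a sequential circuit that has $m$ primary and state inputs. $C = \{c^1, c^2, \ldots, c^n\}$ denotes a set of test cubes for RPRFs in the CUT, where $c^j = \{c^j_1, c^j_2, \ldots, c^j_m\}$ is an $m$-bit test cube, $c^j_k \in \{0, 1, X\}$, and $X$ is a don't care. Fig. 1 shows test cube set $C$, which consists of four test cubes, $c^1$, $c^2$, $c^3$, and $c^4$. Generator $gen(C) = \{g_1, g_2, \ldots, g_m\}$ for the circuit with $m$ inputs is denoted as an $m$-bit tuple, where $g_k \in \{0, 1, X, U\}$ and $k = 1, 2, \ldots, m$.

    If input $p_k$ is assigned only a 1 (0) or an $X$ in every test cube and assigned a 1 (0) in at least one test cube in $C$, then input $p_k$ is assigned a 1 (0) in the generator, i.e., $g_k = 1$ (0). If input $p_k$ is assigned a 1 in a test cube $c^a$, i.e., $c^a_k = 1$, and assigned a 0 in another test cube $c^b$, i.e., $c^b_k = 0$, then input $p_k$ is assigned a $U$ in the generator, i.e., $g_k = U$. Otherwise ($p_k$ is assigned an $X$ in every test cube), input $p_k$ is assigned an $X$ in the generator, i.e., $g_k = X$. In summary, $g_k$ is defined as follows:

    $$
    g_k = \begin{cases}
    1 & \text{if } c^j_k = 1 \text{ or } X \text{ in } \forall c^j \in C \text{ and at least one } c^j_k = 1 \\
    0 & \text{if } c^j_k = 0 \text{ or } X \text{ in } \forall c^j \in C \text{ and at least one } c^j_k = 0 \\
    U & \text{if } \exists\, c^a_k = 1 \text{ and } c^b_k = 0, \text{ where } c^a, c^b \in C \\
    X & \text{otherwise}
    \end{cases}
    \tag{1}
    $$

    Inputs that are assigned $U$s in a generator are called conflicting inputs of the generator.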
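    The rule in (1) can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `compute_generator`, the string encoding of test cubes (characters `0`, `1`, `X`, with `U` marking a conflicting input), and the example cubes are all assumptions made here for illustration.

```python
def compute_generator(cubes):
    """Return the generator (weight set) of equal-length test cubes per rule (1).

    Each cube is a string over '0', '1', 'X'. The result uses 'U' for a
    conflicting input and 'X' for an input unspecified in every cube.
    """
    generator = []
    for column in zip(*cubes):            # one column per circuit input
        specified = set(column) - {"X"}   # binary values seen at this input
        if specified == {"0", "1"}:
            generator.append("U")         # some cube needs 0, another needs 1
        elif specified:
            generator.append(specified.pop())  # only 0s (or only 1s) and Xs
        else:
            generator.append("X")         # never specified
    return "".join(generator)


# Hypothetical cubes (not those of Fig. 1):
print(compute_generator(["0X1X", "01XX", "0X0X"]))  # 01UX
```

    In this made-up example the first two inputs can be fixed, the third is a conflicting input and must stay random, and the fourth is unspecified.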


    WANG: BIST TPG FOR LOW POWER DISSIPATION AND HIGH FAULT COVERAGE 779

    Fig. 1. Example test cube sets. (a) Test cube set $C$. (b) Test cube subsets $C^0$ and $C^1$.

    In the test cube set $C$ shown in Fig. 1(a), one input is assigned only a 0 or an $X$ in every test cube (0 in two of the test cubes and $X$ in the other two). Therefore, this input is assigned a 0 in generator $gen(C)$. On the other hand, another input is assigned a 1 in generator $gen(C)$ since it is assigned only a 1 or an $X$ in every test cube (1 in two of the test cubes and $X$ in the other two). In contrast, an input that is assigned both 0 and 1 (0 in one test cube and 1 in two others) is assigned a $U$ in $gen(C)$.

    Assume that test cubes $c^1$, $c^2$, $c^3$, and $c^4$ are the only test cubes that detect faults $f^1$, $f^2$, $f^3$, and $f^4$, respectively. Under this assumption, the detection probability of fault $f^j$ is simply given by $(1/2)^{|c^j|}$, where $|c^j|$ denotes the number of inputs that are assigned binary values in $c^j$. If $g_k = 1$ (0), then input $p_k$ can be fixed to a 1 (0) without making any of these faults untestable since input $p_k$ is never assigned a 0 (1) in any test cube. Note that fixing input $p_k$ to a 1 (0) increases the number of binary values by 1 in all test cubes. Hence, fixing $p_k$ to a 1 (0) improves detection probabilities of faults that require a 1 (0) at input $p_k$ for their detection by a factor of 2. When $g_k = U$, as for two of the inputs in Fig. 1(a), there are test cubes in $C$ that conflict at input $p_k$, so that fixing input $p_k$ to a binary value $v$, where $v \in \{0, 1\}$, may make faults for which test cubes are assigned $\bar{v}$ at $p_k$ undetectable.
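    The factor-of-2 argument can be checked numerically. The sketch below assumes, as the text does, that a single test cube is the only test for its fault and that every unfixed input is driven with signal probability 0.5. The function name `detection_probability` and the string encoding (`0`, `1`, `X`, `U`) are illustrative assumptions.

```python
from fractions import Fraction


def detection_probability(cube, generator=None):
    """Probability that one generated pattern matches `cube`.

    `generator` fixes inputs marked '0' or '1'; inputs marked 'U' or 'X'
    are treated as random with probability 1/2 (an assumption of this sketch).
    """
    if generator is None:
        generator = "U" * len(cube)       # nothing fixed: purely random patterns
    prob = Fraction(1)
    for bit, g in zip(cube, generator):
        if bit == "X":
            continue                      # unspecified input: any value is fine
        if g in "01":
            if g != bit:
                return Fraction(0)        # input fixed to the opposite value
        else:
            prob /= 2                     # random input must match by chance
    return prob


# Hypothetical cube with two specified inputs:
print(detection_probability("0X1X"))          # 1/4
print(detection_probability("0X1X", "01UX"))  # 1/2
```

    Fixing the first input to the value the cube requires removes one random match, which doubles the detection probability. Fixing it to the opposite value would make the fault undetectable by this generator.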

    If a circuit contains a large number of RPRFs, then test cubes for RPRFs may be assigned conflicting values in many inputs, resulting in a generator where a large number of inputs are assigned $U$s. Hence, only a few inputs can be fixed in such generators without making any fault untestable. If a circuit has a large number of RPRFs, then multiple generators, each of which is calculated from test cubes for a subset of RPRFs in the circuit, may be required to achieve high fault coverage with a reasonable length of random pattern sequence.

    In this paper, the generator for subset $C^i$ is denoted by $gen(C^i) = \{g^i_1, g^i_2, \ldots, g^i_m\}$, where $m$ is the number of inputs of the CUT. If we partition the four test cubes $c^1$, $c^2$, $c^3$, and $c^4$ shown in Fig. 1(a) into the two subsets $C^0$ and $C^1$ shown in Fig. 1(b), then five inputs can be fixed to 0, 0, 1, 0, and 1, respectively, in $gen(C^0)$ and four inputs to 1, 0, 0, and 0, respectively, in $gen(C^1)$ without any conflict. Note that the numbers of conflicting inputs, i.e., $U$s, of both $gen(C^0)$ and $gen(C^1)$ are reduced to 1 and the detection probabilities of all four faults $f^1$, $f^2$, $f^3$, and $f^4$ increase to $1/2$.

    Since test cubes are grouped into two subsets $C^0$ and $C^1$, the entire test session is also divided into two subsessions. In the first subsession, five inputs are fixed to 0, 0, 1, 0, and 1, respectively, to generate test patterns by using $gen(C^0)$. In the second subsession, four inputs are fixed to 1, 0, 0, and 0, respectively, to generate test patterns by using $gen(C^1)$. In this paper, the same number of test patterns is generated by using every generator $gen(C^i)$ to simplify hardware for the BIST controller. However, different numbers of test patterns can also be generated to reduce test application time at the expense of higher hardware overhead. Adequate numbers of test patterns should be generated by using each generator to detect all faults targeted by the generator. A probability-based procedure to compute a suitable time interval for each generator is described in [28].

    Fig. 2. Example generators (weight sets).

    Fig. 3. Exemplary 3-weight WRBIST.

    B. Architecture of 3-Weight WRBIST

    Fig. 2 shows a set of generators and Fig. 3 shows an implementation of the 3-weight WRBIST for the generators shown in Fig. 2. The shift counter is an $(m+1)$-modulo counter, where $m$ is the number of scan elements in the scan chain (since the generators are 9 bits wide, the shift counter has $\lceil \log_2(9+1) \rceil = 4$ stages). When the content of the shift counter is $k$, where $k = 0, 1, \ldots, m-1$, a value for the corresponding input is scanned into the input of the scan chain. The generator counter selects appropriate generators; when the content of the generator counter is $i$, test patterns are generated by using $gen(C^i)$, where $i = 0, 1, 2$. Pseudorandom pattern sequences generated by an LFSR or a CA are modified (fixed) by controlling the AND and OR gates with overriding signals $s_0$ and $s_1$; fixing a random value to a 0 is achieved by setting $s_0$ to a 1 and $s_1$ to a 0, and fixing a random value to a 1 is achieved by setting $s_1$ to a 1 (since a random value can be fixed to a 1 by setting $s_1$ to a 1 independent of the state of $s_0$, the state of $s_0$ is a don't care). Overriding signals $s_0$ and $s_1$ are driven by the outputs of $T$ flip-flops $TF_0$ and $TF_1$. The inputs of $TF_0$ and $TF_1$ are in turn driven by the outputs of the decoding logic, $D_0$ and $D_1$, respectively, which are generated from the outputs of the shift counter and the generator counter.


    The shift counter is required by all scan-based BIST techniques and is not particular to the proposed 3-weight WRBIST scheme. All BIST controllers need a pattern counter that counts the number of test patterns applied. The generator counter can be implemented from the $\lceil \log_2 d \rceil$ MSB (most significant bit) stages of the existing pattern counter, where $d$ is the number of generators, and no additional hardware is required for the generator counter. Hence, hardware overhead for implementing a 3-weight WRBIST is incurred only by the decoding logic and the fixing logic, which includes two toggle flip-flops ($T$ flip-flops), an AND gate, and an OR gate. Since the fixing logic can be implemented with very little hardware, overall hardware overhead for implementing the serial fixing 3-weight WRBIST is determined by hardware overhead for the decoding logic.

    Consider generating test patterns by the TPG shown in Fig. 3. Assume that if $g^i_k = X$ ($i = 0, 1, 2$ and $k = 1, 2, \ldots, 9$), both $D_0$ and $D_1$ are set to 0s and hence the $T$ flip-flops hold their previous states in cycles when a scan value for input $p_k$ is scanned in. Also assume that flip-flop $TF_0$ is initialized to a 1 and flip-flop $TF_1$ is initialized to a 0 before each scan shift operation starts. As shown in Fig. 3, scan flip-flops are placed in the scan chain in a descending order of their subscript numbers, which determines the order in which the values for the inputs are scanned in.

    Test patterns are generated by using generator $gen(C^0)$ (see Fig. 2) first. Since the first input to be scanned in (shift counter = 0) is assigned a 0 in $gen(C^0)$, $s_0$ (the output of $TF_0$) should be set to a 1 to set the output of the AND gate to a 0 (to fix the value for this scan input to a 0) and $s_1$ (the output of $TF_1$) should be set to a 0 to propagate the 0 at the output of the AND gate to the output of the OR gate. Since the initial state of $TF_0$ is a 1 and the initial state of $TF_1$ is a 0, both $D_0$ and $D_1$ are set to 0s and $TF_0$ and $TF_1$ hold their initial states. At the next scan shift cycle (shift counter = 1), both $D_0$ and $D_1$ are assigned a 0 to make $TF_0$ and $TF_1$ hold their previous states. Since the third input to be scanned in is assigned a $U$, both $s_0$ and $s_1$ should be set to a 0 to scan in a random value, which is generated by the LFSR, for this input. Hence, $D_0$ is set to a 1 to toggle the state of $TF_0$ to a 0 when the shift counter is 2. For the next three inputs, generator $gen(C^0)$ requires no change of the overriding signals. Hence, $TF_0$ and $TF_1$ hold their previous states, i.e., $TF_0 = TF_1 = 0$, for the next three cycles (shift counter = 3, 4, and 5). Both $TF_0$ and $TF_1$ also hold their previous states when the shift counter is 6. Next, since the following input is assigned a 1, $s_1$ is set to a 1 to fix the value for this scan input to a 1. This requires $D_1$ to be set to a 1 to toggle the state of $TF_1$ to a 1. Finally, when the shift counter is 8, both $D_0$ and $D_1$ are assigned a 0 and both $TF_0$ and $TF_1$ hold their previous states. This sequence is repeatedly generated at $D_0$ and $D_1$ until all test patterns are applied to the scan inputs. $D_0$ and $D_1$ values for generators $gen(C^1)$ and $gen(C^2)$ can be determined in similar manners.
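    The serial fixing scheme can be modelled with a short simulation. The sketch below is illustrative only: the function names `toggle_controls` and `scan_in`, the generator string `0XUUUUX1X` (listed in scan-in order and chosen only to be consistent with the toggling sequence described in the text, not read from Fig. 2), and the assumption that a toggle takes effect in the same shift cycle are all simplifications introduced here.

```python
def toggle_controls(generator_in_scan_order, tf0=1, tf1=0):
    """Derive per-cycle (D0, D1) toggle inputs for T flip-flops TF0 and TF1.

    TF0 drives s0 (forces the scanned value to 0), TF1 drives s1 (forces 1).
    """
    controls = []
    for g in generator_in_scan_order:
        if g == "0":
            want0, want1 = 1, 0          # s0 = 1, s1 = 0 fixes a 0
        elif g == "1":
            want0, want1 = tf0, 1        # s1 = 1 fixes a 1; s0 is a don't care
        elif g == "U":
            want0, want1 = 0, 0          # pass the random value through
        else:                            # 'X': hold the previous states
            want0, want1 = tf0, tf1
        controls.append((tf0 ^ want0, tf1 ^ want1))  # toggle only on change
        tf0, tf1 = want0, want1
    return controls


def scan_in(random_bits, controls, tf0=1, tf1=0):
    """Apply the fixing logic: (random AND NOT s0) OR s1, cycle by cycle."""
    out = []
    for r, (d0, d1) in zip(random_bits, controls):
        tf0 ^= d0
        tf1 ^= d1
        out.append((r & (1 - tf0)) | tf1)
    return out


controls = toggle_controls("0XUUUUX1X")      # hypothetical generator
print(controls)
print(scan_in([1, 1, 0, 1, 0, 1, 1, 0, 0], controls))
```

    For this hypothetical generator, $D_0$ and $D_1$ are each 1 in exactly one cycle (shift counter 2 and 7, respectively). Note that inputs marked `X` simply inherit whatever the flip-flops currently force, which is why holding state costs no minterms.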

    Random patterns generated by the LFSR can be fixed by controlling the AND/OR gates directly by the decoding logic without the two $T$ flip-flops. However, this scheme will incur larger hardware overhead for the decoding logic and also cause more transitions in the CUT during BIST (see Section III) than the scheme with the $T$ flip-flops. Fig. 4(a) shows the $D_0$ and $D_1$ values, along with the states of $TF_0$ and $TF_1$ (including their initial states), for the scheme with $T$ flip-flops that is implemented for the three generators $gen(C^0)$, $gen(C^1)$, and $gen(C^2)$ shown in Fig. 2. $D_0$ is assigned 1s at five locations (one in $gen(C^0)$, two in $gen(C^1)$, and two in $gen(C^2)$) and $D_1$ is assigned 1s at four locations (one in $gen(C^0)$, one in $gen(C^1)$, and two in $gen(C^2)$). Hence, the on-set of the function for $D_0$ has five minterms and the on-set of the function for $D_1$ has four minterms. Hardware overhead required to implement a decoding logic is in general determined by the number of minterms in the on-set (or off-set) of the function of the decoding logic.

    Fig. 4. $D_0$ and $D_1$ values: (a) with toggle flip-flops $TF_0$ and $TF_1$ and (b) without toggle flip-flops.

    Now consider a version of 3-weight WRBIST that is implemented without the $T$ flip-flops between the outputs of the decoding logic and the inputs of the AND and OR gates. $D_0$ and $D_1$ values for all three generators $gen(C^0)$, $gen(C^1)$, and $gen(C^2)$ are shown in Fig. 4(b) for comparison with the scheme with $T$ flip-flops. For every input $p_k$ that is assigned a 0 in a generator $gen(C^i)$, i.e., $g^i_k = 0$, its corresponding $D_0$ value should be a 1 and its $D_1$ value should be a 0. If $p_k$ is assigned a 1 in a generator $gen(C^i)$, i.e., $g^i_k = 1$, then the corresponding $D_1$ value should be a 1 (the $D_0$ value is a don't care). The on-set (off-set) of the function for $D_0$ has eight (five) minterms and the on-set (off-set) of the function for $D_1$ has six (13) minterms. Hence, the function for the decoding logic of this scheme requires more minterms in its on-set and off-set than that for the decoding logic of the scheme with $T$ flip-flops. Significant reduction in the number of minterms can be achieved by inserting $T$ flip-flops for large designs.
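    The minterm counts for the scheme without $T$ flip-flops follow directly from how often each symbol occurs in the generators. The sketch below makes that bookkeeping explicit under stated assumptions: a 0 entry needs $D_0 = 1$, $D_1 = 0$; a 1 entry needs $D_1 = 1$ with $D_0$ a don't care; a conflicting (`U`) entry needs $D_0 = D_1 = 0$; an `X` entry is a don't care for both. The function name `direct_decode_sets` and the three generator strings are hypothetical; the strings were chosen only so that their symbol totals (eight 0s, six 1s, five `U`s, eight `X`s) reproduce the counts quoted in the text, and they are not the generators of Fig. 2.

```python
from collections import Counter


def direct_decode_sets(generators):
    """Count on-set/off-set minterms of D0 and D1 without toggle flip-flops."""
    n = Counter("".join(generators))
    return {
        # D0 must be 1 for every fixed 0, and 0 for every random (U) input.
        "D0": {"on": n["0"], "off": n["U"]},
        # D1 must be 1 for every fixed 1, and 0 for fixed 0s and random inputs.
        "D1": {"on": n["1"], "off": n["0"] + n["U"]},
    }


hypothetical = ["0XUUUUX1X", "00011XUXX", "0000111XX"]
print(direct_decode_sets(hypothetical))
```

    With toggle flip-flops, only changes of the overriding signals contribute minterms, which is why long runs of equal symbols make the flip-flop scheme cheaper.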

    C. ATPG to Generate Suitable Test Cubes

    Each generator is computed from a set of deterministic test cubes for RPRFs. A test cube set is dynamically constructed by continuously adding test cubes into the test cube set. The test cube set $C^i$ that is currently under construction, i.e., into which test cubes are currently added, is called the current test cube set. Test cubes are added into the current test cube set until placing any more test cubes into $C^i$ makes the number of conflicting inputs, i.e., $U$s, in the generator $gen(C^i)$ greater than a predefined threshold. Whenever a test cube is placed into $C^i$, generator $gen(C^i)$ is updated according to (1). Upon the completion of generating a test cube set $C^i$, a new current test cube set $C^{i+1}$ is created and the test cubes generated later are placed into the new test cube set $C^{i+1}$.

    Since each generator requires a different sequence of control bits at the output of the decoding logic, hardware overhead for the decoding logic is determined also by the number of generators. A special automatic test pattern generation (ATPG) is used to generate deterministic test cubes for RPRFs that are suitable to minimize the number of generators. In order to minimize the number of generators (placing more test cubes into each test cube set will result in a smaller number of generators), the proposed ATPG generates each test cube taking all test cubes existing in the current test cube set into consideration. At the heart of the ATPG technique are three cost functions: controllability, observability, and test generation cost functions, which are obtained by modifying the traditional SCOAP-like [29] cost functions. The test generation process of the ATPG, which is based on PODEM [30], is guided by the cost functions to generate suitable test cubes.

    The controllability cost of input $p_k$, $C_v(p_k)$, is defined by considering the generator of the current test cube set $gen(C^i)$ as follows:

    $$
    C_v(p_k) = \begin{cases}
    0 & \text{if } g^i_k = U \\
    0 & \text{if } g^i_k = v \\
    h & \text{if } g^i_k = \bar{v} \\
    1 & \text{if } g^i_k = X
    \end{cases}
    \quad \text{where } h \gg 1
    \tag{2}
    $$

    where $v$ is a binary value 0 or 1.

    The purpose of the controllability cost is to estimate the number of conflicting inputs that would be caused by adding into the current test cube set a test cube where input $p_k$ is assigned a binary value $v$. If $g^i_k = U$, the current test cube set already contains test cubes that conflict at input $p_k$ (at least one test cube in the current test cube set is assigned a 1 at $p_k$ and another test cube is assigned a 0 at $p_k$). Hence, assigning any binary value to $p_k$ does not cause any more adverse effect. Hence, $C_v(p_k) = 0$. When $g^i_k = 1$ (0), adding a test cube whose input $p_k$ is assigned a 1 (0) causes no conflict with any test cube in the current test cube set. Hence, $C_1(p_k) = 0$ ($C_0(p_k) = 0$). If $g^i_k = 1$ (0), all test cubes existing in the current test cube set are assigned only 1 (0) or $X$ at input $p_k$. Hence, adding a test cube whose input $p_k$ is assigned the opposite value 0 (1) causes a conflict at input $p_k$ with other test cubes in the current test cube set. Hence, high cost $h$ is given. Finally, if $g^i_k = X$ (input $p_k$ is not specified in any test cube in the current test cube set), adding a test cube that is assigned $v$ at input $p_k$ may increase one minterm in the on-set for the function of the decoding logic. Hence, a small cost 1 is given.
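    The four cases of the input controllability cost translate into a small lookup. In the sketch below the function name `input_cost` and the concrete value `HIGH_COST = 1000` are assumptions: the text only says that a high cost is given when the requested value conflicts with the current generator.

```python
HIGH_COST = 1000  # assumed stand-in for the "high cost"; not a value from the paper


def input_cost(g, v, high=HIGH_COST):
    """Cost of assigning binary value v ('0' or '1') to an input whose
    current generator entry is g ('0', '1', 'X', or 'U')."""
    if g == "U" or g == v:
        return 0      # already conflicting, or agrees with the fixed value
    if g == "X":
        return 1      # newly specified input: may add a decoding minterm
    return high       # opposite of the fixed value: creates a new conflict


for g in "U10X":
    print(g, input_cost(g, "1"))
```

    Because the ATPG backtraces along minimum-cost paths, inputs whose assignment would create a new conflicting input are avoided whenever an alternative exists.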

    The controllability costs for internal circuit line $l$ in the circuit, which are computed in a similar manner to the testability measures used in [29], are given by

    $$
    C_v(l_o) = \begin{cases}
    \min_{l_a} \{ C_c(l_a) \} & \text{if } v = c \oplus i \\
    \sum_{l_a} C_{\bar{c}}(l_a) & \text{otherwise}
    \end{cases}
    \tag{3}
    $$

    where $l_a$ and $l_o$ are, respectively, the inputs and the output of a gate with controlling value $c$ and inversion $i$. The controlling value of a gate is the binary value that, when applied to any input of a gate, determines the output value of that gate independent of the values applied at the other inputs of the gate. The controllability cost functions guide the ATPG to select the backtrace paths that require the minimum cost (number of conflicting inputs), whenever there is a choice of several paths to backtrace from the target line to the inputs.
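    The gate-level rule can be illustrated for a single gate. This is a sketch under assumptions: the function name `gate_output_cost`, the dictionary encoding of per-input costs, and the example cost values are made up here, and no constant is added per gate level because the cost is meant to count conflicting inputs rather than logic depth.

```python
def gate_output_cost(input_costs, v, controlling, inversion):
    """Cost of setting a gate output to v.

    input_costs: one dict {0: cost, 1: cost} per gate input.
    controlling: controlling value c of the gate (0 for AND/NAND, 1 for OR/NOR).
    inversion:   1 if the gate inverts (NAND/NOR), else 0.
    """
    if v == controlling ^ inversion:
        # One input at the controlling value is enough: take the cheapest.
        return min(c[controlling] for c in input_costs)
    # Otherwise every input must carry the non-controlling value.
    return sum(c[1 - controlling] for c in input_costs)


# Hypothetical 2-input NAND gate (c = 0, i = 1):
a = {0: 0, 1: 1000}   # input a is already fixed to 0 in the generator
b = {0: 1, 1: 1}      # input b is unspecified so far
print(gate_output_cost([a, b], 1, controlling=0, inversion=1))  # 0
print(gate_output_cost([a, b], 0, controlling=0, inversion=1))  # 1001
```

    Setting the NAND output to 1 is free because input a already carries a 0, while setting it to 0 would require a 1 on input a and therefore a new conflict.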

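    The observability cost functions are recursively computed from primary outputs to primary inputs. The observability cost of line $l$ is given by

    $$
    O(l) = \begin{cases}
    \min_{j} \{ O(l_j) \} & \text{if } l \text{ is a stem with branches } l_j \\
    O(l_o) + \sum_{l_s} C_{\bar{c}}(l_s) & \text{otherwise}
    \end{cases}
    \tag{4}
    $$

    where in the latter case $l_o$ is the output of the gate with input $l$ and $l_s$ are all inputs of the gate other than $l$. The observability cost functions are used to guide the objective selection.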

    The proposed ATPG will now be described for the stuck-at fault model. In order to generate a test cube to detect a stuck-at-$v$ (s-a-$v$) fault at line $l$, first the fault should be activated by setting line $l$ to $\bar{v}$. The cost to activate $l$ s-a-$v$ is $C_{\bar{v}}(l)$. Then, the activated fault effect should be propagated to one or more outputs. The cost to propagate the activated fault effect at line $l$ is $O(l)$. Hence, the test generation cost to generate a test cube for $l$ s-a-$v$ is defined as the sum of two cost functions

    $$
    T(l \text{ s-a-}v) = C_{\bar{v}}(l) + O(l)
    \tag{5}
    $$

    The test generation cost is used to select a best target fault from the fault list.

    Since test cubes generated by the proposed ATPG are oftenover-specified, a few bits that are assigned binary values by the

    proposed ATPG can be relaxed to dont cares while ensuring

    the detection of targeted faults. Test cubes with fewer specified

    inputs have fewer conflicting inputs with test cubes already in

    the current test cube set so that more test cubes can be placed

    in the current test cube set. Whenever a test cube is generated,

    inputs that are assigned binary values are ordered according to

    the cost incurred by assigning each input to its binary value. The

    binary value assigned to each of these inputs is flipped in this

    order. If all the targeted faults can still be detected even after an

    input is flipped to its opposite value, the value assigned to the

    input is relaxed to a dont care.If a circuit has any reconvergent fanout, an input assignment

    required to satisfy some objectives may conflict with that re-

    quired to satisfy other objectives, causing the proposed ATPG

    to select an objective or backtrace path with a high cost. Hence,

    in circuits with reconvergent fanouts, the actual cost of the test

    cube generated for a fault may be much higher than the cost

    of the fault given by the estimate test generation cost function

    shown in (5). To prevent adding such test cubes to the current

    test cube set, if the actual cost of a generated test cube is higher

    by a certain number (say, 100) than the estimate test generation

    cost of the fault, the generated test cube is discarded. Test gen-

    eration is then carried out for alternative target faults until a test

    cube is found for a fault whose actual cost is close to the es-timate test generation cost of the fault or all faults in the fault

  • 8/14/2019 068-A BIST TPG for Low Power Dissipation and High Fault Coverage

    6/13

    782 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 15, NO. 7, JULY 2007

    list are tried. Even in the worst case where all faults in the fault

    list need to be tried, generating test cubes is required only for a

    few faults. The estimate test generation cost of a fault is always

    an optimistic approximate of the actual cost for the fault in the

    sense that the actual cost of any test cube for the fault cannot be

    less than the estimate test generation cost. Hence, if the estimate

    test generation cost of a fault is greater than the actual cost ofthe test cube that has the minimum actual cost among the test

    cubes that have been generated but discarded due to high actual

    cost, then the actual cost of any test cube for the fault cannot

    be smaller than the current minimum actual cost. Hence, we do

    not need to generate a test cube for the fault. If test cubes for

    all faults in the fault list have very high actual cost, then the test

    cube that has the minimum actual cost is chosen to be a new

    member of the current test cube set.
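    The discard-and-retry rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name select_test_cube, the callbacks estimated_cost and generate_cube (a stand-in for the proposed ATPG), and the choice to try faults in order of increasing estimated cost are ours, and "higher by a certain number" is read as "exceeds the estimate by more than slack".

```python
def select_test_cube(faults, estimated_cost, generate_cube, slack=100):
    """Pick a test cube whose actual cost is close to its estimated cost.

    estimated_cost(fault) -> optimistic (lower-bound) cost estimate.
    generate_cube(fault)  -> (cube, actual_cost); hypothetical ATPG stand-in.
    Returns (fault, cube, actual_cost), or None if the fault list is empty.
    """
    best_discarded = None  # (actual_cost, fault, cube) with minimum actual cost
    for fault in sorted(faults, key=estimated_cost):
        estimate = estimated_cost(fault)
        # The estimate is a lower bound on the actual cost, so a fault whose
        # estimate already exceeds the best discarded cube cannot do better.
        if best_discarded is not None and estimate > best_discarded[0]:
            continue
        cube, actual = generate_cube(fault)
        if actual - estimate <= slack:
            return fault, cube, actual          # close enough: accept
        if best_discarded is None or actual < best_discarded[0]:
            best_discarded = (actual, fault, cube)
    if best_discarded is None:
        return None
    # Every cube was too costly: fall back to the cheapest discarded one.
    actual, fault, cube = best_discarded
    return fault, cube, actual


if __name__ == "__main__":
    estimates = {"f1": 10, "f2": 20, "f3": 500}
    actual_costs = {"f1": 300, "f2": 250, "f3": 520}
    result = select_test_cube(
        estimates, estimates.get, lambda f: ("cube_" + f, actual_costs[f])
    )
    print(result)  # ('f2', 'cube_f2', 250); no cube is generated for f3
```

    In this toy run the cubes for f1 and f2 are both discarded, f3 is skipped because its estimate (500) already exceeds the best discarded actual cost (250), and the cheapest discarded cube is returned.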

    III. MINIMIZING SWITCHING ACTIVITY DURING BIST

    The BIST TPG proposed in this paper reduces switching ac-

    tivity in the CUT during BIST by reducing the number of tran-

    sitions at scan inputs during scan shift cycles. In this paper,

    we assume that the sequential CUT is implemented in CMOS,

    and employs full-scan. If a scan input is assigned a binary value

    at one scan shift cycle and the opposite value at the next

    cycle, then a transition occurs at that scan input at the later cycle. The transition that

    occurs at the scan input can propagate into internal circuit lines

    causing more transitions. During scan shift cycles, the response

    to the previous scan test pattern is also scanned out of the scan

    chain. Hence, transitions at scan inputs can be caused by both

    test patterns and responses. Since it is very difficult to generate

    test patterns by a random pattern generator that cause minimal

    number of transitions while they are scanned into the scan chain

    and whose responses also cause minimal number of transitions while they are scanned out of the scan chain, we focus on

    minimizing the number of transitions caused only by test patterns

    that are scanned in. Even though we focus on minimizing the

    number of transitions caused only by test patterns, our exten-

    sive experiments show that the proposed TPG can still reduce

    switching activity significantly during BIST (see Section VI).

    Since circuit responses typically have higher correlation among

    neighboring scan outputs than test patterns, responses cause

    fewer transitions than test patterns while being scanned out.

    A transition at the input of the scan chain at scan shift cycle

    , which is caused by scanning in a value that is opposite to

    the value that was scanned in at the previous scan shift cycle, continuously causes transitions at scan inputs while the

    value travels through the scan chain for the following scan shift

    cycles. Fig. 5 describes scanning a scan test pattern 01100 into a

    scan chain that has five scan flip-flops. Since a 0 is scanned into

    the scan chain at time , the 1 that is scanned into the scan

    chain at time causes a transition at the input of the scan

    chain and continuously causes transitions at the scan flip-flops

    it passes through until it arrives at its final destination at time

    . In contrast, the 1 that is scanned into the scan chain at

    the next cycle causes no transition at the input of the scan

    chain and arrives at its final destination at time without

    causing any transition at the scan flip-flops it passes through.

    This shows that transitions that occur in the entire scan chain can be reduced by reducing transitions at the input of the scan

    Fig. 5. Transitions at scan chain input.

    Fig. 6. LT-RTPG.

    chain. Since transitions at scan inputs propagate into internal

    circuit lines causing more transitions, reducing transitions at the

    input of scan chain can eventually reduce switching activity in

    the entire circuit.
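    The relation between transitions at the scan chain input and transitions inside the chain can be checked with a small simulation. The sketch below is an illustration only: it assumes that the chain initially holds all 0s, that one bit enters per shift cycle in the listed order, and that response data are ignored. The function name count_shift_transitions is ours.

```python
def count_shift_transitions(bits_in_shift_order, chain_length, initial=0):
    """Count flip-flop value changes while a pattern is shifted into a scan chain.

    Assumptions (ours): every flip-flop starts at `initial`, one bit enters
    the chain per shift cycle, and captured responses are not modeled.
    """
    chain = [initial] * chain_length
    total = 0
    for bit in bits_in_shift_order:
        shifted = [bit] + chain[:-1]  # every value moves one position down
        total += sum(old != new for old, new in zip(chain, shifted))
        chain = shifted
    return total


# Two 5-bit patterns with the same number of 1s.
clustered = [0, 0, 1, 1, 0]    # two transitions at the chain input
alternating = [0, 1, 0, 1, 0]  # four transitions at the chain input

print(count_shift_transitions(clustered, 5))    # 4
print(count_shift_transitions(alternating, 5))  # 10
```

    Both patterns contain two 1s, but the pattern with fewer transitions at the chain input causes fewer than half as many flip-flop changes during shifting.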

    A. LT-RTPG

    The LT-RTPG proposed in [23] reduces switching activity

    during BIST by reducing transitions at scan inputs during scan

    shift operations. An example LT-RTPG is shown in Fig. 6. The

    LT-RTPG is comprised of an r-stage LFSR, a K-input AND gate,

    and a toggle flip-flop (T flip-flop). Hence, it can be implemented

    with very little hardware. Each of the K inputs of the AND

    gate is connected to either a normal or an inverting output of

    the LFSR stages. If a large K is used, large sets of neighboring

    state inputs will be assigned identical values in most test patterns,

    resulting in a decrease in fault coverage or an increase in

    test sequence length. Hence, like [23], in this paper, LT-RTPGs

    with only K = 2 or 3 are used. Since a T flip-flop holds its previous

    value until the input of the T flip-flop is assigned a 1,

    the same binary value is repeatedly scanned into

    the scan chain until the value at the output of the AND gate

    becomes 1. Hence, adjacent scan flip-flops are assigned identical

    values in most test patterns and scan inputs have fewer transitions

    during scan shift operations. Since most switching activity

    during scan BIST occurs during scan shift operations (a capture

    cycle occurs only once per scan test pattern), the LT-RTPG can reduce

    heat dissipation during overall scan testing. Various properties

    of the LT-RTPG are studied and a detailed methodology for its

    design is presented in [23].
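    The behavior of the LT-RTPG can be illustrated with a short software model. This is a sketch under our own assumptions, not the design of [23]: the 16-bit LFSR with taps 16, 14, 13, 11 and seed 0xACE1 is a commonly used textbook example, the AND gate is connected to arbitrarily chosen normal (non-inverting) stage outputs, and all function names are hypothetical.

```python
def lfsr16_states(seed=0xACE1):
    """Yield successive states of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed
    while True:
        yield state
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)


def lfsr_stream(n_cycles, seed=0xACE1):
    """Serial output of a regular LFSR: one stage output per shift cycle."""
    states = lfsr16_states(seed)
    return [next(states) & 1 for _ in range(n_cycles)]


def lt_rtpg_stream(n_cycles, and_stages=(0, 5), seed=0xACE1):
    """Serial output of the T flip-flop driven by an AND of LFSR stage outputs."""
    states = lfsr16_states(seed)
    t_ff = 0
    out = []
    for _ in range(n_cycles):
        state = next(states)
        if all((state >> stage) & 1 for stage in and_stages):
            t_ff ^= 1  # the T flip-flop toggles only when the AND output is 1
        out.append(t_ff)
    return out


def transitions(bits):
    """Number of positions where consecutive bits differ."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


n = 4096
t_lfsr = transitions(lfsr_stream(n))
t_and2 = transitions(lt_rtpg_stream(n, and_stages=(0, 5)))
t_and3 = transitions(lt_rtpg_stream(n, and_stages=(0, 5, 10)))
print(t_lfsr, t_and2, t_and3)
```

    With a two-input AND gate the T flip-flop toggles in roughly one out of four cycles, and with a three-input AND gate in roughly one out of eight, so the serial stream has far fewer transitions than the raw LFSR output, at the price of longer runs of identical values.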

    It has been observed that many faults that escape random pat-

    terns are highly correlated with each other and can be detected

    by continuously complementing values of a few inputs from a

    parent test vector. This observation is exploited in [26], [27], [31],

    and [32] to improve fault coverage for circuits that have large

    numbers of RPRFs. We have also observed that tests for faults that escape LT-RTPG test sequences share many common input


    assignments. This implies that RPRFs that escape LT-RTPG test

    sequences can be effectively detected by fixing selected inputs

    to binary values specified in deterministic test cubes for these

    RPRFs and applying random patterns to the rest of inputs. This

    technique is used in the 3-weight WRBIST to achieve high fault

    coverage for random pattern resistant circuits. In this paper we

    demonstrate that augmenting the LT-RTPG with the serial fixing 3-weight WRBIST proposed in [27] can attain high fault

    coverage without excessive switching activity or large area

    overhead even for circuits that have large numbers of RPRFs.

    B. Property of 3-Weight WRBIST to Reduce Switching Activity

    If a large set of scan inputs that are consecutively located in

    the scan chain are assigned identical values (X is identical to

    any binary value 0 or 1) in a generator, then the flip-flops

    and of 3-Weight WRBIST (see Fig. 3) stay at the

    same state for many scan shift cycles. While holds a 1,

    the output of the OR gate in the fixing logic is set to a 1 and 1s

    are continuously scanned into the scan chain and no transitions

    occur at the input of the scan chain. Likewise, while holds

    a 1, random pattern values generated by the LFSR are blocked at

    the AND gate and no transition occurs at the input of scan chain

    provided that the other flip-flop does not toggle. Hence,

    in order to significantly reduce the number of transitions at the

    input of scan chain, either or should be assigned a 1

    and stays at the 1 for long periods of scan shift cycles.

    Typically, the majority of scan inputs are assigned Xs (don't

    cares) in every generator. Since all the faults that are targeted

    by a generator can be detected independent of the scan values

    applied to the scan inputs that are assigned s in the gener-

    ator, the values of and for those scan inputs are assigned

    such that the number of minterms in the on-sets of the functions for and is minimized (to minimize hardware overhead

    for the decoding logic) and either or stays at 1 for

    long periods of scan shift cycles (to minimize the number of

    transitions at the input of the scan chain). In order to minimize

    the number of minterms in the on-sets of functions for and

    and for the scan inputs that are assigned s in a

    generator should be assigned 0. Note that does not

    toggle (holds its previous state) when the state of is 0.

    Assume that test patterns are currently generated by gener-

    ator . Also, assume that scan input is assigned an

    while its predecessor is assigned a care bit (0, 1, or

    ) in . Let be the first scan input in the scan chain that is assigned a care bit in after , i.e., . In

    other words, consecutive scan inputs , which

    are located between and , are assigned s in .

    and , respectively, denote the states of and

    in a scan shift cycle when a value for scan input is

    scanned into the scan chain. In this paper, the decoding logic

    is designed such that the states of and for the scan

    inputs are always the same as those of

    and for input , i.e., and values for the inputs

    are always 0 to minimize the number of

    minterms in the on-sets of the function for the decoding logic.

    and values for input are determined by considering

    values assigned at and in generator . If either or is 1, i.e., is assigned a binary

    value 0 or 1 in , then both and for are as-

    signed 0 to minimize the number of minterms in the on-sets of

    the decoding logic function. Note that this minimizes also the

    number of transitions since when either or is set to

    1 for scan shift cycles for inputs , no transi-

    tions occur during the scan shift cycles for these scan inputs.

    On the other hand, if both and are 0, i.e., , then we check , i.e., the value assigned

    at in , to determine the values of and for

    scan input (recall that both and values for scan in-

    puts are always assigned 0). If ,

    then we assign for input to 1 to make

    and ( and ). This adds

    one minterm in the on-set of for all the scan inputs,

    . In this case, a transition can occur at the

    input of the scan chain only in the scan shift cycles when a value

    for scan input is scanned in among all scan shift cycles for

    inputs . As the last case, if both and

    , i.e., both inputs that flank the consecutive scan inputs

    are assigned in , then we determine the values of and for scan input based on the number

    of scan inputs between and , i.e., . If , where

    is a predefined natural number, then we arbitrarily select ei-

    ther or and assign it to a 1 for input to suppress tran-

    sitions at the input of scan chains. Otherwise, we assign both

    and for scan input to 0 to minimize the number of

    minterms in the on-sets of functions for and . If is

    large, then transitions can occur at the input of the scan chain

    in many scan shift cycles. Hence adding one more minterm in

    the on-set of functions for the decoding logic to suppress a large

    number of transitions is worthwhile.

    For example, consider the set of generators shown in Fig. 4(a). Consecutive scan inputs and are assigned Xs and

    input , which precedes in the scan chain, is assigned a

    in . Since is assigned a and hence both

    , we check input , which is the first scan input

    that is assigned a care bit (0, 1, or ) in after the con-

    secutive scan inputs , and , which are assigned s in

    . Since is assigned a 0 in for is as-

    signed a 1 to toggle the state of in the scan shift cycles

    for input . On the other hand, in generator , consec-

    utive inputs , and are assigned s and inputs and

    , which flank the inputs , and , are assigned both .

    Hence, the and values for the scan input are deter-

    mined by considering the predefined number and the number

    of consecutive scan inputs that are assigned s between and

    , i.e., 3. If is used, then for input is assigned a 1

    to toggle the state of to 1 to suppress transitions. If

    is used, then both and for input are assigned 0s to

    minimize the number of minterms in the on-sets of functions

    for the decoding logic.

    IV. PROPOSED TEST PATTERN GENERATOR

    The proposed BIST is comprised of two TPGs: an LT-RTPG

    [23] and a 3-weight WRBIST [27] (see Fig. 7). The multiplexer,

    which drives the input of scan chain, selects a test pattern source

    between the LT-RTPG and the 3-weight WRBIST. In the first test session, test patterns generated by the LT-RTPG are selected


    Fig. 7. Proposed BIST TPG.

    and scanned into the scan chain to detect easy-to-detect faults.

    In the second session, test patterns that are generated by the

    3-weight WRBIST are selected to detect the faults that remain

    undetected after the first session. Considering the fact that an

    LT-RTPG can be implemented with very little hardware overhead (only one flip-flop and one AND gate in addition to an

    LFSR), overall hardware overhead to implement the proposed

    TPG is determined by hardware overhead for the decoding logic

    of the 3-weight WRBIST.

    An outline of the overall procedure to design an optimized

    BIST TPG by the proposed method is described in the fol-

    lowing.

    1) Apply a sequence of test patterns generated by the

    LT-RTPG to the circuit and drop all detected faults.

    2) .

    3) Initialize the current test cube set, , and generator

    . and unmark all faults in the fault list.

    4) If there are no more faults in the fault list, then exit. Select

    an unmarked fault that has the minimum test generation

    cost and generate a test cube for the fault by the proposed

    ATPG.

    5) Add the test cube to the current test cube set,

    . .

    6) Update generator according to the definition de-

    scribed in Section II-A. If the number of conflicting inputs

    is smaller than or equal to is a positive integer,

    then mark all faults detected by test cube and go to Step

    4).

    7) . Update generator and generate

    3-weight WRBIST patterns by using . Run

    fault simulation to drop the faults that are detected by the

    generated 3-weight WRBIST patterns. and go to

    Step 3).
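    The inner loop of the procedure (Steps 3 to 6) can be sketched in a few lines. The generator definition itself is given in Section II-A, which is not reproduced here, so the sketch rests on assumptions of ours: test cubes are strings over 0, 1, and X; a generator position is 0 or 1 when all cubes that specify it agree, X when no cube specifies it, and U (conflict) when cubes disagree; and a cube that pushes the number of conflicting inputs above the limit is left out of the current set. All function names are hypothetical, and ATPG and fault simulation are not modeled.

```python
def compute_generator(cubes):
    """Combine test cubes (strings over '0', '1', 'X') into one generator string."""
    generator = []
    for column in zip(*cubes):
        specified = {value for value in column if value != "X"}
        if not specified:
            generator.append("X")              # no cube cares about this input
        elif len(specified) == 1:
            generator.append(specified.pop())  # all specifying cubes agree
        else:
            generator.append("U")              # cubes conflict at this input
    return "".join(generator)


def count_conflicts(generator):
    """Number of conflicting inputs in a generator."""
    return generator.count("U")


def build_cube_set(candidate_cubes, max_conflicts):
    """Add cubes in the given order while the conflict limit is respected."""
    selected = []
    for cube in candidate_cubes:
        trial = selected + [cube]
        if count_conflicts(compute_generator(trial)) <= max_conflicts:
            selected = trial
        else:
            break  # assumed behavior: close this cube set, start the next one
    return selected, compute_generator(selected)


cubes = ["1X0X", "1XX1", "0X01"]
print(compute_generator(cubes))   # UX01
print(build_cube_set(cubes, 0))   # (['1X0X', '1XX1'], '1X01')
print(build_cube_set(cubes, 1))   # (['1X0X', '1XX1', '0X01'], 'UX01')
```

    Raising the conflict limit lets more cubes share one generator, which reduces the number of generators but leaves more inputs to random values.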

    V. EXTENSION TO CIRCUIT WITH MULTIPLE SCAN CHAINS

    The proposed TPG can be extended for circuits with multiple

    scan chains by constructing STUMPS-like scan-path archi-

    tecture [1]. Fig. 8 illustrates a straightforward implementation

    of the proposed TPG for a scan design with scan chains,

    , where the fixing logic of each scan chain

    is controlled by a separate decoding logic output pair and . Hence the decoding logic of 3-weight WRBIST has

    Fig. 8. BIST TPG for scan design with multiple scan chains.

    Fig. 9. Splitting a scan chain. (a) Single scan chain. (b) Two scan chains.

    pairs of outputs, .

    Output , where , drives the input of

    toggle flip-flop of the corresponding fixing logic.

    Since scan chain length, which is defined as the number

    of the scan flip-flops in the longest scan chain, is one of the

    factors that determine test application time and in turn test cost,

    long scan chains are often split into several shorter scan chains

    to reduce test application time. Splitting a long scan chain

    into several shorter scan chains does not change the number of

    minterms in the on-sets of functions for the decoding logic if the

    order of scan flip-flops in the original scan chain is preserved in every split scan chain. Fig. 9(a) shows a generator and its

    corresponding , and values for a circuit with

    13 scan flip-flops that are connected into a single scan chain.

    If we assume that and are initialized to 1 and 0,

    respectively, (the initial values of and are given in the

    columns in Fig. 9) before each scan shift operation starts,

    then the function of the decoding logic for the generator has

    four minterms in its on-set ( is assigned a 1 only for one scan

    input and is assigned 1s for three scan inputs ,

    and ). Now assume that the scan chain is split into two scan

    chains and between scan inputs and so that chain

    is composed of and chain is composed

    of as shown in Fig. 9(b). denotes th scan input of scan chain , where 1 and 2. Note that we simply


    split the longer scan chain into two shorter scan chains keeping

    the order of scan flip-flops in the original scan chain. The values

    required at , and , where 1 and 2, for the two scan

    chain version of the circuit are shown in Fig. 9(b) (assume that

    the two pairs of flip-flops / and

    are initialized both to 1 and 0, respectively, before a scan shift

    operation starts). The function of the decoding logic for the two scan chain version also has four minterms in its on-set.

    (However, the two scan chain version will require one more

    fixing logic, which can however be implemented with very little

    hardware.) As this example shows, the number of minterms in

    the on-sets of functions for the decoding logic does not change

    substantially unless there are drastic changes in the order of

    scan flip-flops, even if long scan chains are split into shorter

    scan chains.

    In the previous paragraph, we assume that the decoding logic

    is designed such that a separate pair of outputs and are

    assigned for each scan chain , where . How-

    ever, it is not necessary to assign a separate pair of outputs to

    each scan chain. In the following, we present a method, which is based on compatibility analysis, to reduce the number of

    outputs of the decoding logic for circuits with multiple scan chains

    to reduce hardware overhead.

    If the value assigned at every scan input of scan chain

    , where , is identical to the value assigned at the

    corresponding scan input of another scan chain in every

    generator , where , where is the

    number of generators, then scan chain is said to be compatible

    with scan chain (don't care is identical to any value

    , and ). Otherwise, scan chains and are not compatible.

    For example, in Fig. 10(a), which shows a set of generators

    computed for a circuit with four scan chains , and , the value assigned at every scan input in scan chain

    , where , is identical to the value assigned at

    in every generator and . Scan

    inputs and , where , are also assigned

    identical values in every generator. Hence, scan chains and

    and also and are compatible pairwise. In Fig. 10(b),

    the nodes represent the scan chains and the arcs depict compat-

    ibility relationships between scan chains (if scan chain and

    scan chain are compatible with each other, then the node for

    scan chain and the node for scan chain are connected by

    an arc). denotes the value assigned at th scan input of scan

    chain in generator . denotes that is

    identical to while denotes that is not iden-

    tical to . Scan chains and are not compatible because

    while input of scan chain is assigned a 0 in ,

    i.e., of is assigned a 1 in the same generator,

    i.e., . and are assigned nonidentical values at

    other inputs too;

    , and .

    If scan chain is compatible with scan chain , then the

    fixing logics for scan chains and can share a common de-

    coding logic output pair. Note that in Fig. 10(d), the inputs of

    flip-flops for compatible scan chains and ( and ) are

    driven by the same output pair and ( and

    ) of the decoding logic. If a circuit has a large number of scan chains, then typically there are also a large number of

    Fig. 10. Merging compatible scan chains. (a) Generators before merging. (b) Graph representation of compatible scan chains. (c) Generators after merging.

    (d) Decoding logic for merged scan chains.

    compatible scan chains. Reducing the number of outputs of the

    decoding logic by merging compatible scan chains can reduce

    hardware overhead for the decoding logic. In this paper, mini-

    mizing the number of outputs of the decoding logic is achieved

    by finding maximal numbers of compatible scan chains, which

    can be formulated as the clique problem [33]. If a set of com-

    patible scan chains are merged to be driven by the same pair

    of decoding logic outputs, then generators for the merged scan

    chains are also updated accordingly as follows. If input

    is assigned a care value , where , in generator

    , then values for all th scan inputs of the scan

    chains that are merged together are updated to in the new generator.

    Fig. 10(c) shows the new generators after generators of compatible scan chains are updated. Fig. 10(d) shows an

    implementation of 3-weight WRBIST for the merged generators

    shown in Fig. 10(c). Since the circuit has four scan chains, the

    3-weight WRBIST has four fixing logics. However, since the

    two pairs of compatible scan chains and , and and

    are merged, the decoding logic has only two pairs of outputs,

    , and .
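    The compatibility test and the merging of generators can be sketched as follows. The sketch is illustrative and rests on our own assumptions: each scan chain is represented by one string per generator, a don't care is written X and matches any symbol, and chains are grouped by a simple greedy pass rather than by an exact clique algorithm, so the grouping is not guaranteed to be minimal. The function names and the example data are hypothetical and are not taken from Fig. 10.

```python
def compatible(chain_a, chain_b):
    """True if two chains agree at every scan input in every generator.

    Each chain is a list of strings, one string per generator. 'X' is
    treated as identical to any symbol.
    """
    return all(
        a == b or a == "X" or b == "X"
        for row_a, row_b in zip(chain_a, chain_b)
        for a, b in zip(row_a, row_b)
    )


def merge(chain_a, chain_b):
    """Merged generators: a care value from either chain replaces an 'X'."""
    return [
        "".join(b if a == "X" else a for a, b in zip(row_a, row_b))
        for row_a, row_b in zip(chain_a, chain_b)
    ]


def group_chains(chains):
    """Greedily group compatible chains; each group shares one output pair."""
    groups = []  # each entry: [list of chain names, merged generators]
    for name, generators in chains.items():
        for group in groups:
            # Comparing with the merged generators checks the new chain
            # against every member of the group at once.
            if compatible(group[1], generators):
                group[0].append(name)
                group[1] = merge(group[1], generators)
                break
        else:
            groups.append([[name], list(generators)])
    return groups


chains = {
    "c1": ["1X0", "X1X"],
    "c2": ["0XX", "X10"],
    "c3": ["1XX", "01X"],
    "c4": ["0X1", "XX0"],
}
for names, merged in group_chains(chains):
    print(names, merged)
# ['c1', 'c3'] ['1X0', '01X']
# ['c2', 'c4'] ['0X1', 'X10']
```

    Compatibility is not transitive (an X matches both a 0 and a 1), which is why each candidate is compared with the merged generators of the whole group rather than with a single member.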

    VI. EXPERIMENTAL RESULTS

    Table I compares results obtained by applying test sequences

    generated by regular LFSRs (LFSR sequences, for short)

    and results obtained by applying test sequences generated by the proposed TPGs (the proposed TPG sequences, for


    TABLE II

    EXPERIMENTAL RESULTS FOR PROPOSED TPG WITH MULTIPLE SCAN CHAINS

    TABLE III

    COMPARISONS WITH PRIOR WORK

    together (see Section V). Note that numbers of output pairs

    of decoding logics (columns # Dec output) for multiple scan

    chain versions of large benchmark circuits such as s38417 and s38584 are much smaller than numbers of scan chains in these

    circuits (versions) since many scan chains are compatible and

    thus merged together. Even if there are no merged scan chains

    (the number of scan chains is the same as that of decoding

    logic output pairs), gate equivalents of the decoding logic

    for versions that have shorter scan chains are slightly smaller

    than those of decoding logics for versions with longer scan

    chains. For example, even though no scan chains are merged in

    either 128 or 64 scan chain length version of s5378 and s9234,

    gate equivalents of decoding logics for single scan chains are

    slightly larger than those of decoding logics for 128 scan chain

    length versions and gate equivalents of decoding logics for 128 scan chain length versions are in turn slightly larger than those

    of decoding logics for 64 scan chain length versions. Average

    numbers of transitions are shown in the columns under the

    headings #Trans.; columns LT-(3W-) give average numbers of

    transitions per test clock cycle caused by LT-RTPG test se-

    quences (3-weight WRBIST sequences). In general, LT-RTPG

    sequences cause smaller numbers of transitions for versions

    that have shorter scan chains.

    Table III compares the proposed method with recent prior

    work [12], [36]–[38]. Like the proposed TPG, the TPGs proposed

    in [12] and [36] also achieve 100% fault efficiency for

    every ISCAS89 benchmark circuit. Unlike the proposed TPG,

    however, neither [12] nor [36] considers power dissipation

    during BIST. Hence, direct comparisons of the proposed TPG

    with [12] and [36] may not be fair. For s9234, s15850, and

    s38417, the gate equivalents of the proposed TPG are significantly

    smaller than those of [12] while the gate equivalent of [12] is much smaller than that of the proposed TPG for s13207

    and s38584. Note that the gate equivalent of the proposed TPG

    for s38417, the largest circuit, is smaller than half the gate equiv-

    alent of [12] while the number of patterns to achieve 100% is

    even smaller than that of [12]. The gate equivalents of [36] are

    a little smaller than those of the proposed TPG for most circuits

    except s9234 and s15850. The gate equivalent of [36] for s9234

    is significantly smaller than that of the proposed TPG while the

    gate equivalent of [36] for s15850 is even larger than that of the

    proposed TPG.

    The TPGs proposed in [37] and [38] can reduce switching

    activity during BIST. The columns % AP reduction show reduction in the average number of transitions against regular

    LFSRs. Fault efficiency and coverage achieved are shown in the

    columns FE% and FC%. Even though more test patterns were

    applied than the proposed TPG, fault efficiencies (coverages)

    achieved by the TPGs proposed in [37] and [38] are much lower

    than 100%. The proposed TPG (with LT-RTPG with )

    achieves even larger reduction in the average number of transi-

    tions than [36]. The TPG [37] achieves larger reduction in the

    average number of transitions than the proposed TPG.

    Table IV shows experimental results for three industrial de-

    signs. The column grid cnt gives the size of each design in the

    number of grids (the grid count does not include area occu-

    pied by embedded memories) and the column # FFs gives the

    number of scan flip-flops in the design. The column # Init pat


    TABLE IV

    EXPERIMENTAL RESULTS ON INDUSTRIAL DESIGNS (3-WEIGHT BIST)

    gives the number of test patterns that were generated to detect

    easy-to-detect faults and the column Init FC% gives fault cov-

    erage achieved by the initial test sequence. The number of test

    patterns generated by the 3-weight BIST to detect the faults re-

    maining undetected by the initial test sequence is reported in the

    column # 3W pat and final fault coverage achieved is reported in

    the column Final FC%. The column is the maximum number

    of conflicting bits allowed in each test pattern. The column 3-W

    Area reports area overhead for the 3-weight BIST in grid count

    and also as a percentage of the grid count of the circuit

    (shown in parentheses in the same column). Area overhead

    for 3-weight WRBIST for large designs such as and is

    only about 2% or even less. Note that the grid count reported

    in the column grid cnt does not include area occupied by em-

    bedded memories. Hence real area overhead for 3-weight WR-

    BIST will be even less. The experimental results clearly show

    that the proposed TPG can be implemented for large industrial

    designs with low hardware overhead. Since a simulation tool

    that can calculate the number of transitions during scan testing

    for large industrial designs is currently not available to us, the

    number of transitions is not reported.

    VII. CONCLUSION

    This paper presents a low hardware overhead TPG for scan-based BIST that can reduce switching activity in CUTs during

    BIST and also achieve very high fault coverage with a reason-

    able length of test sequence. Unacceptably long test sequences

    are often required to attain high fault coverage with pseudo-

    random test patterns for circuits that have many random pattern

    resistant faults. The main objective of most recent BIST tech-

    niques has been the design of TPGs that achieve high fault cov-

    erage at acceptable test lengths for such circuits. While this ob-

    jective still remains important, reducing heat dissipation during

    test application is also becoming an important objective. Since

    the correlation between consecutive patterns applied to a circuit

    during BIST is significantly lower, switching activity in the circuit can be significantly higher during BIST than that during its

    normal operation. Excessive switching activity during test ap-

    plication can cause several problems.

    The proposed TPG reduces the number of transitions that

    occur at scan inputs during scan shifting by scanning in the test

    patterns where neighboring bits are highly correlated.

    The proposed BIST is comprised of two TPGs: LT-RTPG

    [23] and 3-weight WRBIST [27]. Test sequences generated by

    the LT-RTPG detect easy-to-detect faults. Faults that escape

    LT-RTPG test sequences are detected by test patterns generated

    by the 3-weight WRBIST. The number of weight sets (gener-

    ators) is minimized by guiding the proposed ATPG with cost

    functions that reflect the number of conflicting inputs to be incurred by setting an input to a binary value. An algorithm to

    design the 3-weight WRBIST that requires minimal hardware

    overhead and whose patterns cause minimal number of transi-

    tions during scan shift cycles is presented. Hardware overhead

    for the proposed TPG is further reduced by identifying compatible

    scan chains in multiple scan chain designs. Experimental results

    for ISCAS89 benchmark circuits demonstrate that the proposed

    BIST can significantly reduce switching activity during BIST while achieving 100% fault coverage for all benchmark

    circuits. Larger reduction in switching activity is achieved for

    large circuits, which have long scan chains. The proposed BIST

    structure does not require modification of mission logic which

    can cause performance degradation. Experimental results for

    large industrial circuits demonstrate that the proposed TPG can

    significantly improve fault coverage of LFSR generated test se-

    quences with low hardware overhead.

    REFERENCES

    [1] P. H. Bardell, W. H. McAnney, and J. Savir, Built-In Test for VLSI: Pseudorandom Techniques. New York: Wiley, 1987.

    [2] S. Hellebrand, J. Rajski, S. Tarnick, S. Venkataraman, and B. Courtois, "Built-in test for circuits with scan based on reseeding of multiple-polynomial linear feedback shift registers," IEEE Trans. Comput., vol. 44, no. 2, pp. 223–233, Feb. 1995.

    [3] N. Zacharia, J. Rajski, and J. Tyszer, "Decompression of test data using variable-length seed LFSRs," in Proc. IEEE 13th VLSI Test Symp., 1995, pp. 426–433.

    [4] S. Hellebrand, S. Tarnick, and J. Rajski, "Generation of vector patterns through reseeding of multiple-polynomial linear feedback shift registers," in Proc. IEEE Int. Test Conf., 1992, pp. 120–129.

    [5] N. A. Touba and E. J. McCluskey, "Altering a pseudo-random bit sequence for scan-based BIST," in Proc. IEEE Int. Test Conf., 1996, pp. 167–175.

    [6] M. Chatterjee and D. K. Pradhan, "A new pattern biasing technique for BIST," in Proc. VLSITS, 1995, pp. 417–425.

    [7] N. Tamarapalli and J. Rajski, "Constructive multi-phase test point insertion for scan-based BIST," in Proc. IEEE Int. Test Conf., 1996, pp. 649–658.

    [8] Y. Savaria, B. Lague, and B. Kaminska, "A pragmatic approach to the design of self-testing circuits," in Proc. IEEE Int. Test Conf., 1989, pp. 745–754.

    [9] J. Hartmann and G. Kemnitz, "How to do weighted random testing for BIST," in Proc. IEEE Int. Conf. Comput.-Aided Design, 1993, pp. 568–571.

    [10] J. Waicukauski, E. Lindbloom, E. Eichelberger, and O. Forlenza, "A method for generating weighted random test patterns," IEEE Trans. Comput., vol. 33, no. 2, pp. 149–161, Mar. 1989.

    [11] H.-C. Tsai, K.-T. Cheng, C.-J. Lin, and S. Bhawmik, "Efficient test-point selection for scan-based BIST," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 6, no. 4, pp. 667–676, Dec. 1998.

    [12] W. Li, C. Yu, S. M. Reddy, and I. Pomeranz, "A scan BIST generation method using a Markov source and partial BIST bit-fixing," in Proc. IEEE-ACM Design Autom. Conf., 2003, pp. 554–559.

    [13] N. Z. Basturkmen, S. M. Reddy, and I. Pomeranz, "Pseudo random patterns using Markov sources for scan BIST," in Proc. IEEE Int. Test Conf., 2002, pp. 1013–1021.

    [14] Y. Zorian, "A distributed BIST control scheme for complex VLSI devices," in Proc. VLSI Testing Symp., 1993, pp. 4–9.

    [15] S. W. Golomb, Shift Register Sequences. Laguna Hills, CA: Aegean Park, 1982.

    [16] C.-Y. Tsui, M. Pedram, C.-A. Chen, and A. M. Despain, "Low power state assignment targeting two- and multi-level logic implementation," in Proc. IEEE Int. Conf. Comput.-Aided Des., 1994, pp. 82–87.

    [17] P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, "A test vector inhibiting technique for low energy BIST design," in Proc. VLSI Test. Symp., 1999, pp. 407–412.

    [18] V. Dabholkar, S. Chakravarty, I. Pomeranz, and S. Reddy, "Techniques for minimizing power dissipation in scan and combinational circuits during test application," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 17, no. 12, pp. 1325–1333, Dec. 1998.

    [19] R. M. Chou, K. K. Saluja, and V. D. Agrawal, "Scheduling tests for VLSI systems under power constraints," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 5, no. 2, pp. 175–185, Jun. 1997.

  • 8/14/2019 068-A BIST TPG for Low Power Dissipation and High Fault Coverage

    13/13

    WANG: BIST TPG FOR LOW POWER DISSIPATION AND HIGH FAULT COVERAGE 789

    [20] T. Schuele and A. P. Stroele, Test scheduling for minimal energy con-sumption under power constrainits, in Proc. VLSI Test. Symp., 2001,pp. 312318.

    [21] N. H. E. Weste andK. Eshraghian, Principles of CMOSVLSI Design:ASystems Perspective, 2nd ed. Reading, MA: Addison-Wesley, 1992.

    [22] S. Gerstendorfer and H.-J. Wunderlich, Minimized power consump-tion for scan-based BIST, in Proc. IEEE Int. Test Conf., 1999, pp.7784.

    [23] S. Wang and S. K. Gupta, LT-RTPG: A new test-per-scan BIST TPGfor low heat dissipation, IEEE Trans. Comput.-Aided Des. I ntegr. Cir-cuits Syst., vol. 25, no. 8, pp. 15651574, Aug. 2006.

    [24] S. Wang and S. K. Gupta, DS-LFSR: A BIST TPG for low heat dis-sipation, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol.21, no. 7, pp. 842851, Jul. 2002.

    [25] F. Corno, M. Rebaudengo, M. S. Reorda,G. Squillero, andM. Violante,Low power BIST via non-linear hybrid celluar automata, in Proc.VLSI Test. Symp., 2000, pp. 2934.

    [26] S. Wang, Generation of low power dissipation and high fault coveragepatterns for scan-basedBIST, in Proc. IEEE Int. Test Conf., 2002, pp.834843.

    [27] S. Wang, Low hardware overhead scan based 3-weight weightedrandom BIST, in Proc. IEEE Int. Test Conf., 2001, pp. 868877.

    [28] S. Wang, Minimizing Heat Dissipation During Test Application,Ph.D. dissertation, EE-Systems Dept., Univ. Southern California, LosAngeles, 1998.

    [29] L. H. Goldsteinand E. L. Thigpen, SCOAP: Sandiacontrollability/ob-servability analysis program, in Proc. IEEE-ACM Des. Autom. Conf.,1980, pp. 190196.

    [30] P. Goel, An implicit enumeration algorithm to generate tests for com-binational logic circuits, IEEE Trans. Comput., vol. C-30, no. 3, pp.215222, Mar. 1981.

    [31] K.-H. Tsai, J. Rajski, and M. Marek-Sadowska, Star test: The theoryand its applications, IEEE Trans. Comput.-Aided Des. Integr. CircuitsSyst., vol. 19, no. 9, pp. 10521064, Sep. 2000.

    [32] I. Pomeranz and S. Reddy, 3-weight pseudo-random test generationbased on a deterministic test set for combinational and sequential cir-

    cuits, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 12,pp. 10501058, Jul. 1993.

    [33] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algo-rithm. Cambirdge, MA: MIT Press, 1990.

    [34] J. Savir, Skewed-load transition test: Part I, calculus, in Proc. IEEEInt. Test Conf., 1992, pp. 705713.

    [35] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A.Saklanha, H. Savoj, P. R. Stephan, R. K. Brayton, and A. Sangiovanni-Vincentelli, SIS: A system for sequential circuit synthesis, Electron.Res. Lab. Memorandum, Univ. California, Los Angeles, UCB/ERL

    M92/41, 1992.[36] L. Li and K. Chakrabarty, Test set embedding for deterministic BISTusing a reconfigurable interconnect network, IEEE Trans. Comput.-

    Aided Des. Integr. Circuits Syst., vol. 23, no. 9, pp. 12891305, Sep.2004.

    [37] N.-C. Lai, S.-J. Wang, and Y.-H. Fu, Low-power BIST with asmoother and scan-chain reorder under optimal cluster size, IEEETrans. Comput.-Aided Des. Integr. Circuits Syst., vol. 25, no. 11, pp.

    25862594, Nov. 2006.[38] N. Z. Basturkmen, S. M. Reddy, and I. Pomeranz, A low power

    pseudo-random BIST technique, J. Electron. Test.: Theory Appl., vol.19, no. 6, pp. 637644, Dec. 2003.

    Seongmoon Wang received the B.S. degree in electrical engineering from Chungbuk National University, Chungbuk, Korea, in 1988, the M.S. degree in electrical engineering from Korea Advanced Institute of Science and Technology, Daejeon, Korea, in 1991, and the Ph.D. degree in electrical engineering from University of Southern California, Los Angeles, in 1998.

    He is currently a Senior Research Staff Member at NEC Laboratories America, Princeton, NJ. He was previously a Design Engineer at GoldStar Electron, Korea, and a DFT Engineer at Syntest Technologies, Sunnyvale, CA, and 3Dfx Interactive, San Jose, CA. His main research interests include design for testability and computer-aided design.