

Low-Power Scan Testing: A Scan Chain Partitioning and Scan Hold Based Technique

Efi Arvaniti & Yiorgos Tsiatouhas

Received: 20 December 2013 / Accepted: 12 May 2014 / © Springer Science+Business Media New York 2014

Abstract Power consumption during scan testing operations can be significantly higher than that expected in the normal functional mode of operation in the field. This may affect the reliability of the circuit under test (CUT) and/or invalidate the testing process, increasing yield loss. In this paper, a scan chain partitioning technique and a scan hold mechanism are combined for low power scan operation. Substantial power reductions can be achieved without any impact on the test application time or the fault coverage and without the need to use scan cell reordering or clock and data gating techniques. Furthermore, the proposed design solution for scan power alleviation permits the efficient exploitation of X-filling techniques for capture power reduction or the use of extreme (power independent) compression techniques for test data volume reduction.

Keywords Scan testing · Design for test (DfT) · Low power scan

1 Introduction

Among testing techniques, scan testing is a valuable solution for both built-in self test (BIST) and non-BIST (external) testing schemes. In a scan design the memory elements of a circuit are dynamically configured as a shift register, aiming to increase the controllability and observability of internal circuit nodes. There are two distinct phases in the scan operation: the shift phase, where test data are shifted in/out of the chain, and the capture phase, where the responses of the combinational logic are captured.

Today, power consumption during integrated circuit testing procedures is a great concern since it can be several times higher than that in the normal mode of operation. This situation can affect the reliability or even cause structural damage of the CUT due to overheating and electromigration phenomena [19]. In addition, the elevated temperature can degrade the speed performance of the CUT and result in erroneous test responses that will invalidate the testing process and lead to yield loss. Scan testing power consumption is an open issue. The excessive switching activity of the CUT during scan operations may violate the specification limitations on power supply IR and L·di/dt drop, which in turn increases the probability of noise induced test failures.

Various techniques have been proposed in the literature for the reduction of dynamic power dissipation during test application. A first method to reduce power in sequential circuit testing is to decrease the test frequency [20, 28], but the test time increases. Moreover, the power supply can be lowered during testing to further reduce power consumption [20]. To eliminate the dynamic power dissipation of the combinational logic during the shift operations in scan testing, data gating techniques at the outputs of the scan cells can be exploited [14]. The drawback in that case is the delay penalty during the normal mode of operation. Several scan cell local clock gating techniques have been proposed [1, 25, 32] for low power, although clock skew problems in the normal mode turn out to be a major disadvantage. In addition, a method that uses two non-overlapping clocks working at half of the initial frequency and feeding separate partitions of the scan chain is discussed in [5]. A dedicated power supply gating technique has been presented in [4] in order to avoid power consumption in the combinational logic. However, applying this scheme requires transistor level redesign of a large number of standard cells in a library. A simple approach for low power testing is the re-ordering of the used test vectors [8, 13, 15, 27] or the scan cells [11] to minimize the switching activity. Although test vector re-ordering does not induce overhead in test application time, it is characterized by high computation times due to the problem complexity [20]. Moreover, scan cell re-ordering may increase design and silicon area cost. A technique to deactivate part of the parallel scan chains that are not involved in a scan session during the capture phase and to feed them with constant values during the shift phase has been proposed in [12]. This approach may affect the coverage of un-modeled faults, while for the reduction of the scan-out switching activity local clock gating is required. Low power oriented test pattern generation techniques have also been proposed [29]. Equivalently, a practical solution is the re-assignment of don’t care bits in the test cubes (X-filling) such that the switching activity is reduced [3, 6, 19, 21, 24, 31]. However, test cube bit re-assignment techniques usually cannot achieve the same amount of power reduction as hardware-based techniques do [19].

Responsible Editor: P. Girard

This research has been co-funded by the European Union (European Social Fund) and Greek national resources under the framework of the “Thales” project of the “Education & Lifelong Learning” Operational Program.

E. Arvaniti · Y. Tsiatouhas (*)
Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece
e-mail: [email protected]

E. Arvaniti
e-mail: [email protected]

J Electron Test
DOI 10.1007/s10836-014-5453-9

In [26] a scan chain modification is introduced for scan power reduction by inserting logic gates in-between the scan cells. This approach requires large computational effort for the determination of the insertion points in large cores and, depending on the CUT, its effectiveness may be limited. A scan chain partitioning technique which uses multiplexers and scan cell re-ordering is presented in [18] in order to avoid scan power dissipation and reduce test application time. A partition is not involved in the scan-in/out operations when all cells in it have don’t care values in both the test vector and the corresponding response vector. The main drawback of this technique is the need for scan cell re-ordering in order to achieve acceptable results. In addition, the coverage of un-modeled faults may be significantly reduced. A similar topology for scan testing acceleration has been presented in [23]. The jump scan architecture has been proposed in [10]. Each flip-flop in the chain is modified so that its master and slave latches work as two independent latches during the scan mode of operation, with the use of an additional multiplexer in-between them. The scan-in/out power is reduced since half of the slave latches are bypassed for each bit that is shifted in the scan chain. A major disadvantage of this approach is that the above modification reduces the speed performance of the functional circuit during the normal mode of operation. Another low power scan chain partitioning technique has been proposed in [9]. In this work, the parallel scan chains (segments) of an Illinois topology, except one that is used as reference, are divided into an equal number of partitions with the use of multiplexers. Initially, input test data are scanned-in only to the reference segment while the remaining segments are loaded with zeros. In parallel, all segments upload the response data they store. Finally, during the shift operations at the last partition of the reference segment, all partitions in the remaining segments are fed with the same data that feed the corresponding partition in the reference segment. The main limitation of this approach is that it can reduce only the scan-in power but not the scan-out power in a chain. Moreover, it is applicable only to Illinois-based scan chains. In [22] an efficient technique has been proposed that can reduce the scan-out power. The concept of the reference segment from [9] is exploited. Again, the scan chains are partitioned and extra compactors are inserted in-between the partitions. The compacted response test data are scanned-out through the reference segment for power reduction. By combining the two techniques in [9] and [22], as proposed in [22], both scan-in and scan-out power reductions can be achieved. However, the applicability of the combined version is still limited to Illinois-based designs. Recently, in [16, 17] a scan segmentation technique for low power testing has been presented, which is supported by scan freeze flip-flops and a proper status register. Before test vector insertion the status register is loaded. Test data are shifted through a single segment (or group of segments) under the control of the status register, while the remaining segments remain “frozen”. This scheme increases the test time since it requires extra clock cycles per test vector to scan-in configuration data in the status register, especially when the target is not to decrease the fault coverage of un-modeled faults. Furthermore, this technique is not BIST compliant. Finally, in [33] and [7] two-stage scan architectures are presented, where flip-flops are included as “leaf cells” in a scan chain for test application time reduction; limited scan power reductions are reported as a side effect.

A scan chain partitioning technique for low power scan testing is presented in this work. It is suitable for BIST and non-BIST test environments to effectively reduce/adjust scan power dissipation down to the required levels. By exploiting the proposed scheme, low power oriented scan cell reordering, clock, data or power supply gating techniques are avoided. In addition, the new technique can be efficiently combined a) with existing X-filling techniques to reduce capture power and b) with test data compression techniques for test data volume reduction. A preliminary poster version of this work was presented in [2]. The paper is organized as follows. In Section 2, the proposed scan chain architecture is introduced and its operation is analyzed. In Section 3, comparisons with existing low power scan solutions are discussed. Next, in Section 4, experimental results from the application of the new technique on benchmark circuits are presented in order to validate its low power efficiency. Finally, in Section 5 the conclusions are drawn.

Fig. 1 (a) Original scan chain of length L, between the Scan-In and Scan-Out ports, and (b) scan chain partitioning into p partitions of length L/p, with multiplexers controlled by MODE1 … MODEp, and shift operation

2 Low Power Scan Architecture

During scan testing operations, the dynamic power consumption depends on the switching activity (transitions) of the scan chain, the combinational logic and the clock distribution tree. Thus, circuit switching activity is commonly used for power dissipation estimation. The target in low power oriented scan testing techniques is the switching activity of the combinational logic. The weighted transition count (WTC) metric [8, 24] is a well known and widely accepted power consumption estimation tool in scan chain based designs. According to this metric, the power consumption for a given vector depends on the number of transitions between successive bits in it and the relative position of these transitions. The WTC at the pseudo-primary inputs of the combinational logic during the scan-in/out process is strongly correlated to the pertinent switching activity on the internal nodes of the CUT [8, 24]. Consequently, the higher the WTC of the scanned-in test vector or the scanned-out response vector, the higher the power consumption in the CUT.

Fig. 2 Scan cell with hold mode of operation (Scan-Hold FF; normal mode with Scan_EN=“low”, scan (shift) mode with Scan_EN=“high”)

Fig. 3 Partitioning of multiple parallel scan chains (segments) into partition clusters 1 … p, between the test source and the test sink

Next, let us consider a scan chain of length L. Moreover, assume a test vector t_j = (t_{j,1}, t_{j,2}, …, t_{j,L}), where t_{j,s+1} is scanned-in before t_{j,s} and so on, and the corresponding response vector r_{j-1} = (r_{j-1,1}, r_{j-1,2}, …, r_{j-1,L}) in the scan chain, from the application of the previous test vector t_{j-1}, where r_{j-1,s+1} is scanned-out before r_{j-1,s} and so on. The total scan chain WTC for the combination of these two vectors, during the pertinent scan-in/out session, is given by the following expression:

$$\mathrm{WTC} = L\,\bigl(t_{j,L} \oplus r_{j-1,1}\bigr) + \sum_{i=1}^{L-1} i\,\bigl(t_{j,i} \oplus t_{j,i+1}\bigr) + \sum_{i=1}^{L-1} (L-i)\,\bigl(r_{j-1,i} \oplus r_{j-1,i+1}\bigr) \tag{1}$$
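As an illustration of Eq. (1), the following Python sketch computes the WTC of one scan-in/out session and cross-checks it against a brute-force count of flip-flop toggles in a simulated shift register. The function names `wtc` and `count_toggles`, and the list-based bit representation (index 0 holds bit 1, the cell next to the scan-in port), are assumptions made for this example and do not come from the original work.

```python
def wtc(t, r_prev):
    """Weighted transition count following Eq. (1).

    t      -- test vector bits (t[0] is t_{j,1}, t[-1] is t_{j,L})
    r_prev -- response bits held in the chain (r_prev[0] is r_{j-1,1})
    """
    L = len(t)
    assert len(r_prev) == L, "test and response vectors must have equal length"
    # Boundary between the first scanned-in bit and the last scanned-out bit.
    total = L * (t[L - 1] ^ r_prev[0])
    for i in range(1, L):  # i = 1 .. L-1, as in Eq. (1)
        total += i * (t[i - 1] ^ t[i])
        total += (L - i) * (r_prev[i - 1] ^ r_prev[i])
    return total


def count_toggles(t, r_prev):
    """Shift t into a chain that holds r_prev and count every cell toggle."""
    state = list(r_prev)
    toggles = 0
    for bit in reversed(t):  # t_{j,L} is scanned in first
        new_state = [bit] + state[:-1]
        toggles += sum(a != b for a, b in zip(state, new_state))
        state = new_state
    return toggles


t = [1, 0, 1, 0]
r_prev = [0, 0, 1, 1]
print(wtc(t, r_prev), count_toggles(t, r_prev))
```

For this four-bit example both functions return 8, so the weighted sum agrees with the number of cell toggles observed during the L shift cycles.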

The idea behind the proposed low power scan operation is the partitioning of the scan chain and the application of the scan-in/out operations in each partition separately, while the rest of the partitions remain stable in a hold mode of operation. This way, during the scan shift operations, the number of signal transitions at the pseudo-primary inputs of the combinational logic (inputs driven by scan cells) is drastically reduced, and consequently the same holds for the power consumption.
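A rough estimate of the achievable reduction follows from Eq. (1). The derivation below is only a sketch under assumptions that are not part of the original text: all test and response bits are taken as independent and uniformly random, so that every XOR term has an expected value of 1/2, and L is taken as a multiple of p. With the inactive partitions held, each of the p partitions is treated as an independent chain of length L/p.

```latex
\begin{align*}
E[\mathrm{WTC}_{\text{orig}}]
  &= \tfrac{1}{2}\Bigl[L + \sum_{i=1}^{L-1} i + \sum_{i=1}^{L-1} (L-i)\Bigr]
   = \tfrac{1}{2}\bigl[L + L(L-1)\bigr] = \frac{L^{2}}{2},\\
E[\mathrm{WTC}_{\text{part}}]
  &= p \cdot \frac{(L/p)^{2}}{2} = \frac{L^{2}}{2p},\\
\text{reduction}
  &= 1 - \frac{E[\mathrm{WTC}_{\text{part}}]}{E[\mathrm{WTC}_{\text{orig}}]}
   = 1 - \frac{1}{p}.
\end{align*}
```

Under these assumptions, p = 2, 4 and 10 give reductions of about 50 %, 75 % and 90 %, which is close to the WTC reductions listed in Table 2.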

2.1 Scan Chain Structure and Operation

The original and the proposed scan chains are presented in Fig. 1. The scan chain is partitioned and multiplexers are placed in-between the partitions. By setting to “high” the select input (signal MODEj) of a single multiplexer at the output of a partition, while the select inputs of the remaining multiplexers are kept “low”, we permit scan-in/out operations only to this particular partition (active partition) while the remaining partitions are bypassed (Fig. 1b). However, by performing scan-in/out operations on a single partition the data in the remaining partitions are corrupted, unless the corresponding flip-flops are set in a hold mode of operation to retain their data. A simple way to achieve this hold mode of operation is to block the clock signal in the pertinent partitions using local clock gating techniques like in [32]. In each partition a single AND gate is inserted in the clock distribution network. This gate is fed by the clock signal and the MODEj signal of the corresponding multiplexer in order to block the clock. In the normal mode of operation all MODEj signals are “high”. Although this is a low cost approach, it is not a preferable design style due to the clock skew related problems that local clock gating techniques introduce in the clock distribution network. We adopt an alternative solution for the hold mode of operation, where the data stored in a scan flip-flop re-feed its input.

In Fig. 2 a scan flip-flop with a re-feeding mechanism to implement the hold mode of operation is presented. The three-state buffers used are controlled by the select signal MODEj of the multiplexer at the output of the partition to which the flip-flop belongs. Since these buffers are not on critical paths for the normal operation of the circuit, they can be replaced by simple pass-gates to reduce the pertinent cost.

Fig. 4 The Mode-Register (at the initial state), with outputs MODE1 … MODEp and control signals Scan_EN, MR-CLK and MR-CD

Fig. 5 (a) Mode-Register clock generation (C-Counter, P-Counter) and (b) operation flowchart

The operation of the proposed scan chain is as follows. Let us consider that the scan chain is divided into p partitions. In the scan mode, the partitions are successively activated to perform the required scan-in/out operations. Initially the rightmost partition is activated (though in general the order of partition activation can be random), that is MODEp=“1” and MODEj=“0” ∀ j<p. The partition is fed through the scan-in port Scan-In with the corresponding data at the rightmost part of the next test vector, while the remaining partitions are in the hold mode of operation and bypassed. In parallel, the response data of the previous test vector that are stored in this partition are scanned-out through the scan-out port Scan-Out. Next, the subsequent partition to the left is activated for the pertinent scan-in/out operations, while the remaining partitions are in the hold mode of operation and bypassed, and so on until all p partitions in the design have been sequentially activated (see Fig. 1b). At the end, the test pattern has been applied to exactly the same bit positions as in the original (standard) scan chain while the response vector has been extracted from the scan chain. Thus, the fault coverage is not affected. Moreover, the number of clock cycles required for the completion of the scan-in/out operations in the whole scan chain, according to the proposed design technique, is exactly the same as the corresponding number of clock cycles required in the original scan chain. Thus the test application time is not increased.
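The activation sequence described above can be sketched as a small behavioural simulation. This is a simplified model and not the authors' implementation: it assumes equal-length partitions, ignores the capture phase, and the function names `standard_scan` and `partitioned_scan` are hypothetical. It checks that the partitioned chain ends in the same state, scans out the same response bits in the same order, and needs the same number of shift cycles as the standard chain.

```python
def standard_scan(chain, vector):
    """Shift `vector` into an unpartitioned chain; return (state, scanned_out, cycles)."""
    state, out = list(chain), []
    for bit in reversed(vector):      # rightmost bit of the vector enters first
        out.append(state[-1])         # bit leaving through Scan-Out
        state = [bit] + state[:-1]    # one shift towards Scan-Out
    return state, out, len(vector)


def partitioned_scan(chain, vector, p):
    """Shift `vector` in partition by partition; inactive partitions hold their data."""
    L = len(chain)
    assert L % p == 0, "this sketch assumes equal-length partitions"
    size = L // p
    state, out, cycles = list(chain), [], 0
    for k in reversed(range(p)):      # rightmost partition is activated first
        lo, hi = k * size, (k + 1) * size
        for bit in reversed(vector[lo:hi]):
            out.append(state[hi - 1])                 # active partition drives Scan-Out
            state[lo:hi] = [bit] + state[lo:hi - 1]   # only the active partition shifts
            cycles += 1
    return state, out, cycles


response = [0, 0, 1, 1, 0, 1, 0, 1]   # previous response held in the chain
vector = [1, 0, 1, 0, 1, 1, 0, 0]     # next test vector
# Same final state, same scanned-out stream, same cycle count as the standard chain.
assert partitioned_scan(response, vector, p=4) == standard_scan(response, vector)
```

In this model only the cells of the active partition change value in any given cycle, which is the property the hold mode is meant to provide.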

The proposed technique can be extended to multiple parallel scan chains (segments), as shown in Fig. 3 where s segments are considered. In all segments, partitions with identical partition numbers (a partition cluster) are activated in parallel. Again, the number of clock cycles required for the completion of the scan-in/out procedures in all segments is exactly the same as the corresponding number in the standard design. The capture phase is not affected.

In order to realize the new scan chain architecture in Fig. 3, a dedicated control signal MODEj is required for each partition cluster. For the generation of these signals during the scan mode of operation, a p-bit auxiliary shift register is exploited. We will call this register the Mode-Register (see Fig. 4). The Mode-Register is initialized (by sequentially setting the Scan_EN (scan enable) and MR-CD (clear) signals to “low”) to the all zero state, except the rightmost cell that is set to “high”. During the scan mode of operation (where Scan_EN=“high”), a single pulse is shifted in the Mode-Register (from right to left) to successively activate the partition clusters. In addition, two counters can be used to automate the shifting of the pulse in the Mode-Register (see Fig. 5a). The first counter (P-Counter) is log2(p) bits wide and counts the number of partitions. The second counter (C-Counter) is log2(L/p) bits wide (where L is the length of each segment) and counts the number of cells in each partition. Each time the P-Counter is incremented, the Mode-Register is shifted one position for the activation of a new cluster. Then, the C-Counter performs a complete count cycle (L/p counts) and provides the necessary cycles for the completion of the scan-in/out operation in the activated cluster. Next, the C-Counter is reset and the P-Counter is triggered to activate the next cluster, and so on until the completion of the scan-in/out operations in all partition clusters. At this point, the new test vector has been applied, the previous response vector has been extracted and the P-Counter has accomplished a complete count cycle (p counts). Afterwards, the Scan_EN signal is turned to “low” and sets a) the circuit into the normal mode of operation in order to capture the response data of the applied test vector and b) the Mode-Register to the all “high” state. Then, both counters are reset and the Mode-Register is initialized.
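The control sequence of the shift phase can be sketched as follows. The model is a behavioural abstraction written for this example, not the authors' circuit: the function names `counter_widths` and `mode_schedule` are hypothetical, L is assumed to be a multiple of p, and rounding the counter widths up with `ceil` for values that are not powers of two is an assumption.

```python
from math import ceil, log2


def counter_widths(L, p):
    """Bit widths of the P-Counter and C-Counter for segment length L and p partitions."""
    return ceil(log2(p)), ceil(log2(L // p))


def mode_schedule(L, p):
    """Yield the one-hot (MODE1, ..., MODEp) tuple for each of the L shift cycles."""
    size = L // p                       # cells per partition = full count of the C-Counter
    for active in range(p, 0, -1):      # the pulse moves from MODEp towards MODE1
        mode = tuple(int(j == active) for j in range(1, p + 1))
        for _ in range(size):           # one full C-Counter cycle per cluster
            yield mode


schedule = list(mode_schedule(L=8, p=4))
print(len(schedule))            # number of shift cycles per test vector
print(schedule[0])              # MODE signals while the first cluster is active
print(schedule[-1])             # MODE signals while the last cluster is active
print(counter_widths(L=8, p=4))
```

For L = 8 and p = 4 the schedule has 8 entries, starting with (0, 0, 0, 1) and ending with (1, 0, 0, 0), so the shift phase still takes L cycles in total.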

Fig. 6 Signal waveforms (CLK, Scan_EN, MR-CLK, MR-CD) over shift phases 1 … p and capture; each shift phase lasts L/p cycles, and test vector insertion and response extraction take L cycles in total

Fig. 7 The proposed scan chain partitioning scheme in a BIST topology (BIST controller with TPG, ORA, C and P counters and Mode-Register)


The above procedure is repeated until the application of all desired test vectors, as presented in the flowchart of Fig. 5b. Note that in the normal mode of operation the Mode-Register remains stable in the all “high” state (Scan_EN=“low”). In Fig. 6 the signal waveforms for the operation of the proposed scan chain architecture are illustrated.

The above design for testability circuitry is embedded in the CUT and can be controlled either by external automatic test equipment (ATE) or by a BIST controller. When a BIST solution is utilized, the overall scheme is as shown in Fig. 7, where the two counters and the Mode-Register are included in the BIST controller along with the Test Pattern Generator (TPG) and the Output Response Analyzer (ORA).

2.2 Partial Implementation

If either the silicon area requirements of the proposed scan chain architecture are considered too high or the power reduction exceeds that required by the specifications for test efficiency, a partial implementation can be adopted instead. In Fig. 8, the partial implementation of the new scheme is illustrated. Among the parallel scan chains of the design only a portion (selected using cost and power criteria) is modified according to the proposed technique. The rest of the chains remain standard scan chains utilizing the standard scan flip-flop. The operation of this scheme follows exactly the procedures discussed in Figs. 5b and 6. The operation of the standard scan chains is not affected by these procedures, since the scan cycles are exactly the same as in the original scan chain, and the scan-in/out process is performed in the traditional way.

2.3 At Speed Scan Testing

In modern nanometer technology designs, at speed scan testing techniques are mandatory [30]. The proposed architecture supports at speed testing either in its full or partial implementation without any extra cost. In Fig. 9, signal waveforms for the application of the well known launch-on-capture at speed testing technique on the proposed topology are presented. After L cycles for the scan-in of the new vector (initialization vector) and the scan-out of the previous test response, the scan enable signal (Scan_EN) is deactivated and a pair of fast pulses is applied to launch the test vector and capture the test response respectively. Note that the deactivation of the scan enable signal sets the Mode-Register to the all “high” state (see Fig. 4). Thus, the scan chains are set to the normal mode of operation for the application of the pair of launch and capture pulses. Then, a clear operation is performed on the Mode-Register (MR-CD=“low”) and another session of L cycles follows for the next pattern insertion, and so on until the completion of the test set.

Fig. 8 Partial implementation of the proposed scan chain partitioning scheme (modified and standard scan chains between the test source and the test sink)

Fig. 9 Launch-on-capture at speed testing signal waveforms (CLK, Scan_EN, MR-CLK, MR-CD)

2.4 Illinois Scan Implementation

The Illinois scan chain is a well known scan architecture for test application time reduction [30]. The Illinois scan architecture can be easily applied on the proposed scan testing scheme according to Fig. 10. As in the standard scan chain, simple multiplexers are inserted at the input of each parallel scan chain (except the first one). These multiplexers are controlled by the serial/broadcast selection signal S/B. In the broadcast mode all scan chains are fed in parallel with test data, while in the serial mode all scan chains form a single scan chain to serially scan-in test data. In both the serial and the broadcast mode of operation, only the scan partitions of the active cluster are in the shift mode of operation, while the remaining partitions are in the hold mode. Thus, the same WTC reduction as before can be achieved.

3 Comparisons

As mentioned earlier, the proposed low power scan architecture can be applied either in BIST or non-BIST scan-based testing environments. In the BIST case, and especially when pseudo-random test vectors are used, software based techniques for low power scan operation, like X-filling [3, 6, 19, 21, 24, 31], are not applicable. Other effective hardware-based approaches, like [16, 17], are not compatible with BIST techniques. The design for testability scheme introduced in this work is an efficient solution for low power scan-based BIST.

In a non-BIST environment, and particularly in deterministic testing, the proposed scan architecture can be easily combined with X-filling techniques [3, 6, 19, 21, 24, 31] to further reduce both the shift and the capture power consumption during scan testing. However, since in our case the shift power is significantly alleviated, X-filling effort can be exclusively devoted to reducing the capture power (e.g. with the use of the preferred fill technique [8]).

Furthermore, test data compression techniques can be more effective without the need to consider low power issues during the scan operations (as stated in [8]), given that the proposed technique is exploited.

Last but not least, we have to mention that the new scan architecture: a) does not require the re-ordering of the scan cells in the chain (as is the case in [11]), b) does not affect the fault coverage of either modeled or un-modeled faults (as is the case in [12, 16–18, 23]), c) does not increase the test application time (as is the case in [16, 17, 28]), d) does not require the use of clock gating techniques (as is the case in [1, 12, 25, 32]), e) has negligible influence on the speed performance and the power consumption of the functional circuit during the normal mode of operation (which is not the case in [4, 10, 14]), f) it is compliant with the launch-on-

Fig. 10 Illinois version of the proposed scan chain design technique (partition clusters 1 … p, with input multiplexers controlled by the S/B signal, between the test source and the test sink)

Table 1 Benchmark circuits’ characteristics

| Benchmark | Inputs | Outputs | # Gates | # Flip-Flops |
|---|---|---|---|---|
| s38417 | 28 | 106 | 22,179 | 1,636 |
| s38584 | 12 | 278 | 19,523 | 1,452 |
| usb_funct | 112 | 98 | 20,980 | 1,746 |
| tv80 | 13 | 32 | 12,031 | 359 |
| systemcaes | 258 | 129 | 16,340 | 670 |
| pci_bridge32 | 158 | 164 | 36,665 | 3,359 |
| aes_core | 258 | 129 | 26,620 | 530 |
| ac97_ctrl | 54 | 47 | 23,445 | 2,199 |
| ethernet | 93 | 105 | 94,428 | 10,544 |
| wb_conmax | 1,128 | 1,416 | 54,151 | 770 |


Table 2 Experimental results on WTC and implementation cost

| Circuit | Segments (s) | Partitions (p) | Weighted transition count (WTC) | WTC reduction (%) | Size (unit transistors) | Size increment (%) |
|---|---|---|---|---|---|---|
| s38584 | 1 | 1 | 82,994,860 | – | 296,411 | – |
| | | 2 | 41,689,564 | 49.77 | 306,223 | 3.31 |
| | | 3 | 27,477,052 | 66.89 | 306,334 | 3.35 |
| | | 4 | 20,839,276 | 74.89 | 306,545 | 3.42 |
| | | 6 | 13,854,044 | 83.31 | 306,667 | 3.46 |
| | | 8 | 10,397,662 | 87.47 | 306,889 | 3.53 |
| | | 10 | 8,603,392 | 89.63 | 307,111 | 3.61 |
| s38417 | 1 | 1 | 51,059,343 | – | 281,208 | – |
| | | 2 | 25,016,079 | 51.01 | 292,137 | 3.89 |
| | | 3 | 16,728,231 | 67.24 | 292,248 | 3.93 |
| | | 4 | 12,338,895 | 75.83 | 292,359 | 3.97 |
| | | 6 | 8,597,127 | 83.16 | 292,581 | 4.04 |
| | | 8 | 6,283,599 | 87.69 | 292,803 | 4.12 |
| | | 10 | 5,468,999 | 89.29 | 293,025 | 4.20 |
| tv80 | 1 | 1 | 59,249,424 | – | 132,216 | – |
| | | 2 | 29,617,385 | 50.01 | 135,321 | 2.35 |
| | | 3 | 19,771,752 | 66.63 | 135,432 | 2.43 |
| | | 4 | 1,478,858 | 75.04 | 135,543 | 2.52 |
| | | 6 | 9,864,342 | 83.35 | 135,765 | 2.68 |
| | | 8 | 7,361,674 | 87.58 | 135,987 | 2.85 |
| | | 10 | 5,876,482 | 90.08 | 136,209 | 3.02 |
| aes_core | 1 | 1 | 86,974,932 | – | 251,676 | – |
| | | 2 | 43,483,662 | 50.00 | 255,849 | 1.66 |
| | | 3 | 28,877,124 | 66.80 | 255,960 | 1.70 |
| | | 4 | 21,725,064 | 75.02 | 256,071 | 1.75 |
| | | 6 | 14,477,760 | 83.35 | 256,293 | 1.83 |
| | | 8 | 10,868,644 | 87.50 | 256,515 | 1.92 |
| | | 10 | 8,688,738 | 90.01 | 256,737 | 2.01 |
| systemcaes | 1 | 1 | 67,120,663 | – | 198,285 | – |
| | | 2 | 33,562,373 | 50.00 | 203,323 | 2.54 |
| | | 3 | 22,382,145 | 66.65 | 203,434 | 2.60 |
| | | 4 | 16,804,567 | 74.96 | 203,545 | 2.65 |
| | | 6 | 11,201,739 | 83.31 | 203,767 | 2.76 |
| | | 8 | 8,411,325 | 87.47 | 203,989 | 2.88 |
| | | 10 | 6,717,617 | 89.99 | 204,211 | 2.99 |
| | 2 | 1 | 33,562,477 | – | 198,285 | – |
| | | 2 | 16,793,451 | 49.96 | 203,275 | 2.52 |
| | | 3 | 11,200,701 | 66.63 | 203,399 | 2.58 |
| | | 4 | 8,404,071 | 74.96 | 203,523 | 2.64 |
| | | 6 | 5,555,883 | 83.45 | 203,771 | 2.77 |
| | | 8 | 4,210,854 | 87.45 | 204,019 | 2.89 |
| | | 10 | 3,363,859 | 89.98 | 204,267 | 3.02 |
| wb_conmax | 1 | 1 | 246,732,620 | – | 543,961 | – |
| | | 2 | 123,214,610 | 50.06 | 549,614 | 1.04 |
| | | 3 | 82,424,944 | 66.59 | 549,725 | 1.06 |
| | | 4 | 61,749,906 | 74.97 | 549,836 | 1.08 |
| | | 6 | 41,165,288 | 83.32 | 550,058 | 1.12 |
| | | 8 | 30,895,414 | 87.48 | 550,280 | 1.16 |
| | | 10 | 24,698,654 | 89.99 | 550,502 | 1.20 |

Table 2 (continued)

Circuit       Segments (s)  Partitions (p)  Weighted transition count (WTC)  WTC reduction (%)  Size (unit transistors)  Size increment (%)
wb_conmax     2             1               123,164,610                      –                  543,961                  –
wb_conmax     2             2               61,771,306                       49.85              549,566                  1.03
wb_conmax     2             3               41,173,260                       66.57              549,690                  1.05
wb_conmax     2             4               30,827,041                       74.97              549,814                  1.08
wb_conmax     2             6               20,578,381                       83.29              550,062                  1.12
wb_conmax     2             8               15,447,765                       87.46              550,310                  1.17
wb_conmax     2             10              12,341,390                       89.98              550,558                  1.21
usb_funct     1             1               1,758,690,097                    –                  317,254                  –
usb_funct     1             2               879,882,901                      49.97              328,850                  3.66
usb_funct     1             3               586,870,395                      66.63              328,961                  3.69
usb_funct     1             4               438,607,537                      75.06              329,072                  3.73
usb_funct     1             6               293,389,861                      83.32              329,294                  3.80
usb_funct     1             8               219,537,221                      87.52              329,516                  3.87
usb_funct     1             10              176,967,943                      89.94              329,738                  3.94
usb_funct     2             1               889,882,901                      –                  317,254                  –
usb_funct     2             2               438,672,720                      50.70              328,802                  3.64
usb_funct     2             3               293,389,861                      67.03              328,926                  3.68
usb_funct     2             4               219,618,692                      75.32              329,050                  3.72
usb_funct     2             6               146,223,379                      83.57              329,298                  3.80
usb_funct     2             8               110,028,278                      87.64              329,546                  3.87
usb_funct     2             10              88,224,106                       90.09              329,794                  3.95
ac97_ctrl     1             1               773,239,939                      –                  371,362                  –
ac97_ctrl     1             2               386,869,007                      49.97              385,701                  3.86
ac97_ctrl     1             3               257,813,397                      66.66              385,812                  3.89
ac97_ctrl     1             4               193,555,369                      74.97              385,923                  3.92
ac97_ctrl     1             6               129,036,379                      83.31              386,145                  3.98
ac97_ctrl     1             8               96,601,816                       87.51              386,367                  4.04
ac97_ctrl     1             10              77,290,810                       90.00              386,589                  4.10
ac97_ctrl     3             1               257,575,439                      –                  371,362                  –
ac97_ctrl     3             2               129,017,280                      49.91              385,635                  3.84
ac97_ctrl     3             3               85,922,985                       66.64              385,772                  3.88
ac97_ctrl     3             4               64,501,106                       74.96              385,909                  3.92
ac97_ctrl     3             6               42,992,124                       83.31              386,183                  3.99
ac97_ctrl     3             8               32,235,114                       87.49              386,457                  4.06
ac97_ctrl     3             10              25,775,230                       89.99              386,731                  4.14
ac97_ctrl     6             1               128,669,163                      –                  371,362                  –
ac97_ctrl     6             2               64,323,563                       50.01              385,639                  3.70
ac97_ctrl     6             3               42,878,171                       66.68              385,815                  3.75
ac97_ctrl     6             4               32,141,187                       75.02              385,991                  3.79
ac97_ctrl     6             6               21,466,981                       83.32              386,343                  3.88
ac97_ctrl     6             8               16,070,959                       87.51              386,695                  3.97
ac97_ctrl     6             10              12,847,187                       90.02              387,047                  4.05
pci_bridge32  1             1               5,685,778,630                    –                  581,273                  –
pci_bridge32  1             2               2,842,307,479                    50.01              602,617                  3.67
pci_bridge32  1             3               1,894,316,972                    66.68              602,728                  3.69
pci_bridge32  1             4               1,420,891,234                    75.01              602,839                  3.71
pci_bridge32  1             6               947,447,111                      83.34              603,061                  3.75
pci_bridge32  1             8               710,330,302                      87.51              603,283                  3.79
pci_bridge32  1             10              568,366,902                      90.00              603,505                  3.82

capture technique for at-speed testing and g) can be easily applied to Illinois-based scan topologies for power reduction in both the serial and the broadcast modes of operation.

4 Experimental Results

The IWLS05, as well as two large ISCAS89 (s38417 and s38584), benchmark circuits were used for the evaluation of the proposed technique. Table 1 provides the characteristics of these circuits. The general architecture in Fig. 3 was considered. Depending on the number of flip-flops, the scan chain is divided into segments (up to eight) and each segment into partitions (up to sixteen). All scan cells support the hold mode of operation, as in Fig. 2. The test vectors used in the experiments were extracted using the ATALANTA tool, with the random fill option activated for X-bit assignment.

Table 2 (continued)

Circuit       Segments (s)  Partitions (p)  Weighted transition count (WTC)  WTC reduction (%)  Size (unit transistors)  Size increment (%)
pci_bridge32  3             1               1,895,546,478                    –                  581,273                  –
pci_bridge32  3             2               948,116,798                      49.98              602,552                  3.66
pci_bridge32  3             3               631,943,348                      66.66              602,689                  3.68
pci_bridge32  3             4               484,039,118                      74.46              602,826                  3.71
pci_bridge32  3             6               315,964,926                      83.33              603,100                  3.76
pci_bridge32  3             8               237,986,078                      87.44              603,374                  3.80
pci_bridge32  3             10              189,521,262                      90.00              603,648                  3.85
pci_bridge32  6             1               129,036,209                      –                  581,273                  –
pci_bridge32  6             2               64,695,882                       49.86              602,556                  3.53
pci_bridge32  6             3               43,250,832                       66.48              602,732                  3.56
pci_bridge32  6             4               32,514,183                       74.80              602,908                  3.59
pci_bridge32  6             6               21,840,637                       83.07              603,260                  3.64
pci_bridge32  6             8               16,445,271                       87.26              603,612                  3.70
pci_bridge32  6             10              13,222,154                       89.75              603,964                  3.76
ethernet      1             1               226,883,623,870                  –                  1,896,115                –
ethernet      1             2               113,439,283,342                  50.00              1,960,691                3.41
ethernet      1             3               75,631,946,582                   66.66              1,960,802                3.41
ethernet      1             4               56,727,199,414                   75.00              1,960,913                3.42
ethernet      1             6               37,816,570,898                   83.33              1,961,135                3.43
ethernet      1             8               28,362,485,009                   87.50              1,961,357                3.44
ethernet      1             10              22,690,298,350                   90.00              1,961,579                3.45
ethernet      1             16              14,181,438,786                   93.75              1,962,245                3.49
ethernet      4             1               56,727,197,274                   –                  1,896,115                –
ethernet      4             2               28,362,483,510                   50.00              1,960,621                3.40
ethernet      4             3               18,908,283,458                   66.67              1,960,771                3.41
ethernet      4             4               14,181,438,786                   75.00              1,960,921                3.42
ethernet      4             6               9,458,756,230                    83.33              1,961,221                3.43
ethernet      4             8               7,090,674,626                    87.50              1,961,521                3.45
ethernet      4             10              5,667,707,024                    90.01              1,961,821                3.47
ethernet      4             16              3,546,178,075                    93.75              1,962,721                3.51
ethernet      8             1               28,412,493,510                   –                  1,896,115                –
ethernet      8             2               14,181,438,786                   50.09              1,960,651                3.40
ethernet      8             3               9,454,195,678                    66.73              1,960,853                3.41
ethernet      8             4               7,092,509,100                    75.04              1,961,055                3.42
ethernet      8             6               4,733,023,768                    83.34              1,961,459                3.45
ethernet      8             8               3,545,582,826                    87.52              1,961,863                3.47
ethernet      8             10              2,836,548,241                    90.02              1,962,267                3.49
ethernet      8             16              1,883,684,671                    93.37              1,963,479                3.55

The experimental results are provided in Table 2. The first column presents the circuit under consideration. The second and the third columns provide the number of segments and partitions, respectively, used for the implementation of the scan architecture. The rows where the number of partitions is equal to one (p=1) refer to the original scan design. The fourth column gives the scan-in/out WTC for the application of the whole test set to the CUT, while the fifth column shows the percentage reduction in the WTC for each configuration (s and p combination, where p>1) with respect to the original scan design (p=1). The circuit size (expressed as the required equivalent number of minimum/unit size transistors) in each configuration is provided in the sixth column of the table. Finally, the seventh column presents the percentage increase in the circuit size.
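As a worked check of how the fifth and seventh columns follow from the fourth and sixth, the short Python sketch below recomputes both percentages for the s38584 configuration with s=1 and p=2, using the values listed in Table 2. The function and variable names are illustrative only and are not part of the original experimental flow.

```python
def percent_reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


def percent_increase(baseline: float, value: float) -> float:
    """Percentage by which `value` is higher than `baseline`."""
    return 100.0 * (value - baseline) / baseline


# s38584, s = 1: original scan design (p = 1) versus p = 2 (values from Table 2).
wtc_reduction = percent_reduction(82_994_860, 41_689_564)
size_increment = percent_increase(296_411, 306_223)

print(f"WTC reduction:  {wtc_reduction:.2f} %")
print(f"Size increment: {size_increment:.2f} %")
```

Rounded to two decimals, the results agree with the 49.77 % and 3.31 % entries of the corresponding table row.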

As expected, the WTC reduction achieved is proportional to the number of partitions in the design. The same holds for the average WTC. In Fig. 11, an indicative 3-D graph of the WTC reduction with respect to the number of segments and partitions in the Ethernet circuit is presented. In general, the experiments showed that the WTC reduction is almost independent of the CUT, the test vectors’ application order and the ordering of the scan cells in the chain.
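One way to read this proportionality is sketched below, under the assumption (ours, not stated explicitly in the text) that a reduction by a factor of p corresponds to a percentage reduction of 100(1 − 1/p). The sketch compares this idealized figure with the measured s38584 values (s=1) from Table 2; all names are illustrative.

```python
def ideal_reduction_percent(p: int) -> float:
    """Idealized WTC reduction (%) if the WTC drops by exactly a factor of p (assumption)."""
    return 100.0 * (1.0 - 1.0 / p)


# Measured WTC reduction (%) for s38584 with s = 1, copied from Table 2.
measured_s38584 = {2: 49.77, 3: 66.89, 4: 74.89, 6: 83.31, 8: 87.47, 10: 89.63}

# Deviation of each measured value from the idealized figure, in percentage points.
deviations = {
    p: round(measured - ideal_reduction_percent(p), 2)
    for p, measured in measured_s38584.items()
}

for p, dev in deviations.items():
    print(f"p={p:2d}  ideal={ideal_reduction_percent(p):6.2f} %  "
          f"measured={measured_s38584[p]:6.2f} %  deviation={dev:+.2f}")
```

For this circuit, the measured values stay within half a percentage point of the idealized figure.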

An interesting parameter in low power scan testing is the peak WTC per test vector. According to the experimental results, the peak WTC reduction during the shifting operations is also proportional to the number of the scan chain partitions. In Fig. 12, 3-D graphs are provided to illustrate the reduction of the peak WTC with respect to the number of partitions and the number of segments in the design, for the three IWLS05 circuits with the highest flip-flop count. The same trend holds for all the remaining circuits in Table 2.
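To make the distinction between the total and the peak WTC concrete, the following Python sketch computes both for a tiny, made-up test set. It assumes the commonly used weighted-transition formulation for a scan-in vector, in which a transition between adjacent bit positions j and j+1 (1-based) of a vector of length L is weighted by L − j; the exact metric, bit ordering and scan-out handling used in the experiments may differ, and the vectors below are hypothetical.

```python
from typing import Iterable, Tuple


def weighted_transition_count(vector: str) -> int:
    """Weighted transition count of one scan-in vector given as a '0'/'1' string.

    Assumed weighting: a transition between positions j and j+1 (1-based)
    contributes (L - j), where L is the vector length.
    """
    length = len(vector)
    return sum(
        length - j
        for j in range(1, length)
        if vector[j - 1] != vector[j]
    )


def total_and_peak_wtc(vectors: Iterable[str]) -> Tuple[int, int]:
    """Return (sum of WTC over the test set, largest WTC of any single vector)."""
    counts = [weighted_transition_count(v) for v in vectors]
    return sum(counts), max(counts)


# Hypothetical 5-bit test set, for illustration only.
test_set = ["10001", "00000", "10101"]
total, peak = total_and_peak_wtc(test_set)
print(f"total WTC = {total}, peak WTC = {peak}")
```

The total reflects the energy-related cost of applying the whole test set, whereas the peak identifies the single vector with the heaviest shifting activity.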

Fig. 11 WTC reduction with respect to s and p parameters for the Ethernet circuit

Fig. 12 Peak WTC reduction with respect to s and p parameters

Among the various scan power reduction techniques discussed in Section 1, those related to data [14] and power [4] gating schemes are the most effective, since they are capable of eliminating the signal transitions in the combinational logic. However, these techniques seriously affect the circuit performance in the normal mode of operation. Scan chain partitioning techniques are the next most effective. In more detail, we observe the following. According to the experimental results in Table 2, the WTC reduction of the proposed scan testing technique is proportional to the product of the number of segments (s) and the number of partitions (p) in each segment ($s \cdot p$ times reduction with respect to the case of a single segment without partitions). In clock gating techniques, like in [32], the WTC reduction is proportional to the square of the number of segments in the design ($s^2$ times reduction). The scan enable gating technique [12] (possibly combined with clock gating) provides a theoretical maximum WTC reduction which is also proportional to the square of the number of segments in the design ($s^2$ times reduction). However, this reduction depends on the applied test data and the corresponding test response for each circuit under test, and it is certainly less than this upper bound. The combined use of the techniques presented in [9] and [22], as proposed in [22] for both scan-in and scan-out power minimization, provides WTC reductions that are proportional to $s \cdot p/(s+p-1)$ [22] (as the experimental results also prove). Finally, note that for the X-filling technique in [19], which is a test data dependent technique, the maximum reported reduction for the ISCAS89 and ITC99 benchmark circuits is $1.72\,s$ times (the average reduction for the circuits used is $1.37\,s$ times).
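The reduction factors quoted above can be tabulated with a few lines of Python. The sketch below is only an illustration of the three closed-form expressions in the text (the product of s and p for the proposed technique, the square of s for the gating techniques of [32] and [12], and the ratio quoted for [9] combined with [22]); the function names are ours.

```python
def proposed_factor(s: int, p: int) -> int:
    """WTC reduction factor of the proposed technique: s * p."""
    return s * p


def clock_gating_factor(s: int) -> int:
    """s^2: factor quoted for clock gating [32]; upper bound for scan enable gating [12]."""
    return s ** 2


def combined_factor(s: int, p: int) -> float:
    """s * p / (s + p - 1): factor quoted for the combined use of [9] and [22]."""
    return s * p / (s + p - 1)


# (segments, partitions) pairs chosen for illustration.
configs = [(1, 4), (4, 4), (4, 8), (4, 16), (8, 8), (8, 16), (16, 16)]
for s, p in configs:
    print(f"s={s:2d} p={p:2d}  proposed={proposed_factor(s, p):3d}  "
          f"s^2={clock_gating_factor(s):3d}  "
          f"[9]+[22]={combined_factor(s, p):.2f}")
```

For s=4 and p=8, for example, this gives 32 for the proposed technique, 16 for the gating techniques and about 2.91 for the combined scheme.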

Table 3 summarizes the comparison results for the five techniques above, with respect to: the WTC reduction, the cell area overhead, the performance degradation in the normal mode, the test application time increase, the degradation of the coverage of un-modeled faults, the increase of clock skew related problems and the increase of the test data volume.

Finally, note that the implementation cost due to the additional circuitry used in the proposed scheme is small and increases only slightly with the number of partitions (p). However, the routing cost of the MODEj signals must also be included in the above cost; this routing cost is the main drawback of the proposed technique.

5 Conclusion

Low power scan operation is very important for circuit reliability during and after manufacturing testing. In this paper a scan partitioning technique is proposed, along with a scan hold mechanism, to significantly reduce the power dissipation during the shift phase in a scan chain. The proposed approach is applicable in both BIST and non-BIST based test schemes and can be easily combined with X-filling methodologies to also reduce the power consumption in the capture phase. It is test data compression friendly and can alleviate the effort of defect oriented X-filling techniques [3]. The test application time and the fault coverage are not affected by the application of the proposed technique. Furthermore, no scan cell

Table 3 Comparison results

                                         Scan chain architecture
Metric                                   Segments (s)  Partitions (p)  Proposed  [5]   [15]   [26]+[27]  [1]
WTC reduction                            1             1               1         1     1      1          <2
WTC reduction                            1             4               4         1     1      1          <2
WTC reduction                            1             8               8         1     1      1          <2
WTC reduction                            1             16              16        1     1      1          <2
WTC reduction                            4             4               16        16    ≤16    2.29       <8
WTC reduction                            4             8               32        16    ≤16    2.91       <8
WTC reduction                            4             16              64        16    ≤16    3.37       <8
WTC reduction                            8             8               64        64    ≤64    4.27       <16
WTC reduction                            8             16              128       64    ≤64    5.57       <16
WTC reduction                            16            16              256       256   ≤256   8.26       <256
Area overhead                                                          <4 %      NG    NG     NA         NO
Performance degradation                                                NG        NO    NO     NG         NO
Test application time increase                                         NO        NO    NO     NO         NO
Un-modeled faults coverage degradation                                 NO        NO    YES    YES        YES
Clock skew problems                                                    NO        YES   YES    NO         NO
Test data volume increase                                              NO        NO    YES    NO         NO

NG negligible, NA not available

reordering is required, nor is the use of clock, data or power supply gating techniques.

References

1. Almukhaizim S, Alsubaihi S, Sinanoglou O (2010) On the application of dynamic scan chain partitioning for reducing peak shift power. Springer J Electron Test Theory Appl 26:465–481

2. Arvaniti E, Tsiatouhas Y (2012) Low power scan by partitioning and scan hold, IEEE Symp. on Design and Diagnostics of Electronic Circuits and Systems, pp 262–265

3. Balatsouka S, Tenentes V, Kavousianos X, Chakrabarty K (2010) Defect aware x-filling for low-power scan testing, IEEE/ACM Design Automation and Test in Europe Conference, pp 873–878

4. Bhunia S, Mahmoodi H, Ghosh D, Mukhopadhyay S, Roy K (2005) Low-power scan design using first level supply gating. IEEE Trans VLSI Syst 13(3):384–395

5. Bonhomme Y, Girard P, Guiller L, Landrault C, Pravossoudovitch S (2001) A gated clock scheme for low power scan testing of logic ICs or embedded cores, IEEE Asian Test Symposium, pp 253–258

6. Butler K, Saxena J, Fryars T, Hetherington G, Jain A, Lewis J (2004) Minimizing power consumption in scan testing: pattern generation and DFT techniques, IEEE Int. Test Conference, pp 355–364

7. Chalkia M, Tsiatouhas Y (2012) The leafs scan-chain for test application time and scan power reduction, IEEE Int. Conference on Electronics, Circuits and Systems, pp 749–752

8. Chandra A, Chakrabarty K (2002) Low-power scan testing and test data compression for System-on-Chip. IEEE Trans CAD Integr Circ Syst 21(5):597–604

9. Chandra A, Ng F, Kapur R (2008) Low power Illinois scan architecture for simultaneous power and test data volume reduction, IEEE/ACM Design Automation and Test in Europe Conference, pp 462–467

10. Chiu M-H, Li J C-M (2005) Jump scan: a DFT technique for low power testing, IEEE VLSI Test Symposium, pp 277–282

11. Ghosh S, Basu S, Touba N (2003) Joint minimization of power and area in scan testing by scan cell reordering, IEEE Comp Soc Annu Symp VLSI, pp 246–249

12. Czysz D, Kassab M, Lin X, Mrugalski G, Rajski J, Tyszer J (2008) Low power scan shift and capture in the EDT environment, IEEE International Test Conference, p 13.2

13. Dabholkar V, Chakravarty S, Pomeranz I, Reddy SM (1998) Techniques for minimizing power dissipation in scan and combinational circuits during test application. IEEE Trans CAD Integr Circ Syst 17(12):1325–1333

14. Gerstendorfer S, Wunderlich H (2000) Minimized power consumption for scan based BIST. J Electron Test Theory Appl 16(3):203–212

15. Girard P, Guiller L, Landrault C, Pravossoudovitch S (1999) A test vector ordering technique for switching activity reduction during test application, IEEE Great Lakes Symp. on VLSI, pp 24–27

16. Kim H-S, Kang S, Hsiao M (2008) A new scan architecture for both low-power testing and test volume compression under SoC test environment. Springer J Electron Test Theory Appl 24:365–378

17. Kim H-S, Kim C-G, Kang S (2008) A new scan partition scheme for low-power embedded systems. ETRI J 30(3):412–420

18. Lee I-S, Hur Y-M, Ambler T (2004) The efficient multiple scan chain architecture reducing power dissipation and test time, IEEE Asian Test Symposium, pp 94–97

19. Li J, Xu Q, Hu Y, Li X (2010) X-filling for simultaneous shift- and capture-power reduction in at-speed scan-based testing. IEEE Trans VLSI Syst 18(7):1081–1092

20. Nicolici N, Al-Hashimi B (2003) Power-constrained testing of VLSI circuits, Kluwer Academic Publishers

21. Remersaro S, Lin X, Zhang Z, Reddy S, Pomeranz I, Rajski J (2006) Preferred fill: a scalable method to reduce capture power for scan based designs, IEEE International Test Conference, p 32.2

22. Saeed SM, Sinanoglou O (2011) Expedited response compaction for scan power reduction, IEEE VLSI Test Symposium, pp 40–45

23. Samaranayake S, Sitchinava N, Kapur R, Amin MB, Williams TW (2002) Dynamic scan: driving down the cost of test. IEEE Comput 35(2):63–68

24. Sankaralingam R, Oruganti RR, Touba NA (2000) Static compaction techniques to control scan vector power dissipation, IEEE VLSI Test Symposium, pp 35–40

25. Sankaralingam R, Pouya B, Touba N (2001) Reducing power dissipation during test using scan chain disable, IEEE VLSI Test Symposium, pp 319–324

26. Sinanoglou O, Bayractaroglou I, Orailoglou A (2002) Test power reduction through minimization of scan chain transitions, IEEE VLSI Test Symposium, pp 166–171

27. Tudu J, Larsson E, Singh V, Agrawal V (2009) On minimization of peak power for scan circuit during test, IEEE European Test Symposium, pp 25–30

28. Vranken H, Waayers T, Fleury H, Lelouvier D (2001) Enhanced reduced-pin-count test for full-scan design, IEEE Int. Test Conference, pp 738–747

29. Wang S, Gupta SK (1998) ATPG for heat dissipation minimization during test application. IEEE Trans Comput 47(2):256–262

30. Wang L-T, Stroud C, Touba N (2008) System on chip test architectures, Morgan and Kaufmann Pub

31. Wen X, Miyase K, Suzuki T, Yamato Y, Kajihara S, Wang L-T, Saluja KK (2006) A highly-guided x-filling method for effective low-capture-power scan test generation, IEEE International Conference on Computer Design, pp 251–258

32. Whetsel L (2000) Adapting scan architectures for low power operation, IEEE Int. Test Conference, pp 863–872

33. Xiang D, Li K, Fujiwara H, Thulasiraman K, Sun J (2007) Constraining transition propagation for low-power scan testing using a two stage scan architecture. IEEE Trans Circ Syst − II Exp Briefs 54(5):450–454

Efi Arvaniti received the B.S. degree in electronic and computer engineering in 2008 from the Technical University of Crete, Greece and the M.S. degree in computer science in 2011 from the University of Ioannina, Greece. His research interests include VLSI circuit design and design for testability.

Yiorgos Tsiatouhas received the B.S. degree in physics in 1990, the M.S. degree in electronic automation in 1993 and the Ph.D. degree in computer science in 1998, all from the University of Athens, Greece. From 1992 to 1996 he was with the National Center of Scientific Research “Demokritos”, Athens, Greece. From 1998 to 2002 he was with Integrated Systems Development (ISD) S.A. as cooperative projects director and technical manager of the Advanced Silicon Solutions Div. In 2002 he joined the Department of Computer Science and Engineering, University of Ioannina, Greece, where he is currently an Associate Professor. His research interests are in the area of integrated circuit design and design for testability. Dr. Tsiatouhas is a member of the EDAA, the IEEE and the IEEE Test Technology Technical Council. He has been a member of the program committees of many conferences in the area of VLSI design and testing. He received the best paper awards of the 2002 IEEE International Symposium on Quality Electronic Design and the 2009 IEEE International Symposium on Design and Diagnostics of Electronic Circuits and Systems.
