
Hardware-Software Approaches to In-Circuit Emulation for Embedded Processors

Chung-Fu Kao

National Sun Yat-Sen University

Hsin-Ming Chen

Andes Technology

Ing-Jer Huang

National Sun Yat-Sen University

An in-circuit emulator (ICE) is part of the development environment for a microprocessor- or microcontroller-based system, called a target system. (We use the terms microprocessor and microcontroller interchangeably unless we need to differentiate them.) While retaining the same functionality and physical features as the original microprocessor, the ICE provides extra debug and test mechanisms to support designers in the test, development, debug, and maintenance of target systems’ hardware and software. These mechanisms include single stepping, breakpoint setting and detection, tracing, internal resource monitoring, and modification.

Traditionally, designers have used an ICE mainly when debugging a microprocessor-based system design at the PCB (printed circuit board) level, as Figure 1 shows. (Although test, development, debug, and maintenance are different activities, they involve similar operations. We use the term debug to include all these activities unless there is a need to differentiate them.) To debug the system, designers pull the target microprocessor chip out of its slot on the board and insert the ICE into the slot to act as the target microprocessor [1]. The host computer’s software controls the ICE’s operation via a communication channel. After debug is complete, the designer disconnects the ICE and places the original target microprocessor chip back in its slot. In this scenario, the ICE functions only during debug and doesn’t exist in the final product. The ICE’s cost and performance affect the development system but not the final product.

In the SoC era, however, the ICE no longer plays a negligible role. Responding to the needs of higher performance, more functionality, and higher integration levels, manufacturers are permanently embedding an ICE with the microprocessor core in the final product. For example, in microprocessors developed by ARM [2] and IBM [3], there is no way to remove the ICE from the chip. ICE performance, cost, power consumption, test and debug support, and hardware-software interfacing have become important considerations in microprocessor-based platform design. Thus, it has become necessary to comprehensively investigate the effects of embedding an ICE in a microprocessor core. Unfortunately, the design of ICEs has mainly followed an ad hoc approach. Architecture platforms, hardware-software interfaces, and operating methodologies vary widely among ICEs for different microprocessors. Furthermore, many ICEs are proprietary commercial products for which in-depth design information is unavailable. Most available information is in the form of user manuals or application notes, which provide very limited design information. Therefore, it is difficult to perform a fair comparison of on-chip debug approaches and select appropriate approaches for future designs under various application requirements.

In-circuit emulators have become part of the permanent structure of microprocessor cores to support on-chip test and debug activities in highly integrated environments such as SoCs. However, ICE design styles and operation principles are quite diverse. This article presents a taxonomy based on the notions of foreground and background operations and hardware-software implementation alternatives to organize existing in-circuit emulation approaches.

In-Circuit Emulation

0740-7475/08/$25.00 © 2008 IEEE Copublished by the IEEE CS and the IEEE CASS IEEE Design & Test of Computers

Authorized licensed use limited to: National Sun Yat Sen University. Downloaded on September 29, 2009 at 02:19 from IEEE Xplore. Restrictions apply.

In this article, our goal is to demystify ICE designs and their impact on the SoC environment. We classify existing ICE approaches, identify a basic design for each major category, and show how to instantiate it with an ARM7-based microprocessor. Finally, we conduct experiments to quantitatively analyze the hardware, software, and operational features of these on-chip debug approaches and draw conclusions about their applications in embedded-system design.

Classification of in-circuit emulation approaches

We divide in-circuit emulation operations into two modes: background debug mode (BDM) and foreground debug mode (FDM). In BDM, the user program executes normally, except that the ICE is active at the same time to monitor system status for trigger conditions such as timer timeout, breakpoint and watchpoint matching, single stepping, and trace buffer full. (Although breakpoints, watchpoints, single stepping, and traces are different activities, they can be implemented with similar basic operations. To simplify our discussion, we focus on the breakpoint activity.) Once a trigger condition is met, the operation mode switches into FDM, in which the ICE, rather than the user program, takes control of the system.

In FDM, while the user program is halted, the ICE can observe or configure the microprocessor’s internal system status, including memory, registers, and other control or I/O signals. Alternatively, the ICE can communicate with the host computer to receive debug commands from the host and execute them or send back the internal system status to the host through a communication channel. Finally, the ICE can switch the operation mode back to BDM to resume user program execution.
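The two-mode operation described above can be pictured as a small state machine. The following is only an illustrative sketch in Python: the class name `IceModel`, its methods, and the use of a program counter as the sole trigger source are assumptions made for this example, not part of any ICE design discussed in the article.

```python
from enum import Enum


class Mode(Enum):
    BDM = "background"  # user program runs while the ICE watches for triggers
    FDM = "foreground"  # user program is halted and the ICE is in control


class IceModel:
    """Toy model of the BDM/FDM switch (all names are hypothetical)."""

    def __init__(self, breakpoints):
        self.breakpoints = set(breakpoints)
        self.mode = Mode.BDM
        self.pc = 0

    def step_user_program(self):
        """Run one user instruction in BDM, then check the trigger condition."""
        if self.mode is not Mode.BDM:
            raise RuntimeError("user program is halted in FDM")
        self.pc += 1
        if self.pc in self.breakpoints:
            self.mode = Mode.FDM  # trigger met: the ICE takes control

    def resume(self):
        """Leave FDM and hand control back to the user program."""
        self.mode = Mode.BDM
```

In this sketch, stepping is refused while the model is in FDM, which mirrors the statement that the ICE, rather than the user program, controls the system in that mode.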

We can refine these two modes into more sophisticated debug modes. For example, in one variation of FDM, the user program can continue execution within a limited and safe context instead of halting completely, while the ICE communicates with the host. (Because of space limitations, we don’t go into such details here.)

Figure 1. Microprocessor-based development system. (PCB: printed circuit board.)

September/October 2008

Both modes can be implemented with either software or hardware. Therefore, we can place all possible in-circuit emulation approaches into the four classes listed in Table 1. The software emulation class uses the all-software approach for both FDM and BDM, and hardware emulation uses the all-hardware approach. Hybrid emulation 1 uses software for FDM and hardware for BDM, and hybrid emulation 2 uses hardware for FDM and software for BDM. The table uses the notations F and B for foreground and background, and S and H for software and hardware. Thus, FSBH indicates a software foreground and a hardware background implementation and represents hybrid emulation 1.
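The naming scheme can be written down compactly. The sketch below simply encodes the four class names given above; the function name `classify` and the single-letter arguments are choices made for this illustration.

```python
VALID = {"S", "H"}  # S: software, H: hardware

NAMES = {
    "FSBS": "software emulation",
    "FHBH": "hardware emulation",
    "FSBH": "hybrid emulation 1",
    "FHBS": "hybrid emulation 2",
}


def classify(fdm, bdm):
    """Return the (code, class name) pair for the given FDM and BDM choices."""
    if fdm not in VALID or bdm not in VALID:
        raise ValueError("each mode must be 'S' or 'H'")
    code = f"F{fdm}B{bdm}"
    return code, NAMES[code]
```

For example, `classify("S", "H")` yields the FSBH code for hybrid emulation 1.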

BDM operations

BDM includes two major tasks: detecting trigger conditions while the user program is executing, and suspending the user program and switching to FDM when the conditions are met.

Software BDM approaches

The two basic approaches to BDM software implementation are instrumenting and single stepping. Instrumenting creates a specialized version of the user program. Programmers construct this version by patching special instructions into the target locations of the original user program. Executing these instructions raises a software interrupt that transfers control to an exception handler (also called an interrupt service routine), causing the processor to enter FDM immediately. Alternatively, the instrumented program can perform simple condition checking at the cost of performance overhead. If the conditions are met, the processor enters FDM; otherwise, it resumes user program execution. This pure software approach can be implemented in almost all microprocessors. Its disadvantage is that the instrumented program is different from the original user program; a bug-free instrumented user program doesn’t guarantee a bug-free original user program.
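As a rough illustration of instrumenting, the sketch below patches a trap word into chosen locations of a program image and keeps the displaced instructions so they can be restored later. It is a simplified model, not a debugger implementation: the program is a plain list of 32-bit words, the helper names are hypothetical, and the trap word uses the ARM encoding of `SWI 0` purely as an example value.

```python
TRAP = 0xEF000000  # example trap word: ARM-state "SWI 0" (assumed for illustration)


def instrument(program, targets, trap=TRAP):
    """Return (patched copy, saved originals) with trap words at the targets."""
    patched = list(program)
    saved = {}
    for index in targets:
        saved[index] = patched[index]  # remember the displaced instruction
        patched[index] = trap
    return patched, saved


def restore(patched, saved):
    """Undo the patching and return the original program image."""
    original = list(patched)
    for index, word in saved.items():
        original[index] = word
    return original
```

The patched image differs from the original at every target location, which is exactly why a clean run of the instrumented program says nothing definite about the original.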

Table 1. Classification of in-circuit emulation approaches.*

| Approach | Foreground debug mode (FDM) | Background debug mode (BDM) | Advantages | Disadvantages | Examples |
| --- | --- | --- | --- | --- | --- |
| Software emulation (FSBS)** | Software | Software | Flexible, easily modified | Large amount of system memory, longer time to detect breakpoint and return to user program | Motorola HC08 Monitor Mode [4], Motorola MPC565 [5], Infineon Tricore [6], Intel x86 debug instructions [7], ARM Angel debug monitor [8], Intel IA-32/64 [7, 9, 10] |
| Hardware emulation (FHBH)** | Hardware | Hardware | Real-time breakpoint detection, support for sophisticated breakpoint conditions | Gate count overhead, modification inflexibility | ARM embedded ICE [2], hardware breakpoint in ARM RealView Debugger [11], Nexus 5001 [12, 13] |
| Hybrid emulation 1 (FSBH)** | Software | Hardware | Similar to FHBH, flexible FDM implementation | Longer time for FDM operations | Intel x86 debug register [7, 14], Intel IA-32/64 [7, 9, 10] |
| Hybrid emulation 2 (FHBS)** | Hardware | Software | Similar to FSBS, smaller supporting software for FDM operations | Higher hardware cost than FSBS | Motorola M68300/M68HC16 [15], software breakpoint in ARM RealView Debugger [11], any simple microprocessor core with JTAG port and boundary scan cells and an appropriate software interrupt instruction |

* ICE: in-circuit emulator.
** F: foreground; B: background; H: hardware; S: software.

The single-stepping approach can be used in microprocessors, such as Intel’s x86 microprocessors, that support the single-stepping exception [7]. This single-stepping mechanism, once enabled, causes an exception to be raised after the execution of each instruction in the user program, and the corresponding exception handler takes control. The exception handler then activates the software monitor to check trigger conditions and decide whether to execute the next instruction in the user program or to switch to FDM. This approach’s advantage is that it achieves debug without instrumenting the user program. On the other hand, debugging with this approach is very slow, so it is infeasible for even a medium-size user program. To avoid this problem, the user program must be instrumented to turn on the single-stepping mechanism only within a small range in the program.
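The cost of single stepping comes from the handler running once per user instruction. The sketch below models only that counting behavior; `run_single_stepped` and its `trigger` callback are hypothetical names, and instruction execution itself is not simulated.

```python
def run_single_stepped(num_instructions, trigger):
    """Run until trigger(pc) is true.

    Returns (halt pc or None, number of handler invocations).
    """
    handler_calls = 0
    for pc in range(num_instructions):
        # The user instruction at pc would execute here.
        handler_calls += 1  # single-step exception after every instruction
        if trigger(pc):     # software monitor checks the trigger condition
            return pc, handler_calls
    return None, handler_calls
```

Under these assumptions, reaching a breakpoint 500 instructions into a program costs 500 handler invocations, which illustrates why the text recommends enabling the mechanism only within a small range.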

In summary, the advantages of the software BDM approaches are that they are applicable to most microprocessors, flexible in design, easy to modify, and require little hardware support. They support a flexible number of breakpoints. In addition, they allow adjustment of the priority among the breakpoint exception and other hardware and software exceptions to protect critical tasks from being disrupted by the debug activity.

On the other hand, software BDM takes up exception (interrupt) vector space and precious memory space, which might be limited in the SoC environment. Second, identifying a possible trigger condition takes a significant number of instruction cycles, making software BDM inappropriate for real-time debug. Third, software overhead makes it infeasible to detect trigger conditions with complications such as masking, data dependency, and range checking. Finally, software BDM can detect only software logic bugs; it has difficulty detecting hardware and timing-related bugs.

Hardware BDM approaches

The basic hardware support approach provides a mechanism to control the target microprocessor’s program execution flow [16]. Implementing BDM in hardware usually involves a hardware comparator to monitor address and data buses, control signals, internal states, and I/O signals. The comparator contains a set of registers that can be programmed for several trigger conditions. The trigger conditions can be more sophisticated than those of the software approach, including masking and data dependency (equal, not equal, greater than, less than, range, and so forth), because these are easy to implement in hardware. Once the trigger conditions are met, the comparator stops the core clock or raises an exception to halt the user program and enter FDM.
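The data-dependency conditions listed above amount to a small set of comparisons on an observed bus value. The following sketch models them in software for illustration only; in a real ICE they would be combinational logic, and the function name `condition_met` and its keyword arguments are assumptions of this example.

```python
import operator

# Comparison kinds named in the text, mapped to Python operators.
COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
}


def condition_met(kind, observed, target=0, low=0, high=0):
    """Evaluate one trigger condition against an observed bus value."""
    if kind == "range":
        return low <= observed <= high  # inclusive range check
    return COMPARATORS[kind](observed, target)
```

A range condition such as `condition_met("range", addr, low=0x100, high=0x1FF)` would need several instructions per access in a software monitor, whereas hardware evaluates it alongside normal execution.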

Implementing BDM in hardware is a simple concept but requires careful design. An important issue is proper timing in halting the microprocessor after the trigger conditions are met to keep it in a stable, precise state. Another issue is handling instruction parallelism such as pipelining and superscalar execution so as to retain the logical sequence and eliminate false conditions caused by parallel execution.

The advantages of the hardware approach are that it allows trigger condition checking in real time, and the trigger conditions can be sophisticated because these extra functions take only a few extra gates. In addition, system status that is not directly accessible by software can be handled by hardware. Therefore, the hardware approach can detect hardware, software, and timing bugs. The disadvantages are hardware overhead, longer design and verification time, and inflexibility in modifying the ICE (such as increasing the number of supported breakpoints) after integration in the SoC.

FDM operations

FDM consists of three major tasks: accessing and modifying internal system status and configuring BDM, interacting with the host computer, and switching back to BDM and resuming the user program.

Software FDM approaches

The software FDM implementation has the form of a system service routine or software monitor that usually resides in the system memory area [16]. The software monitor consists of a command loop that interacts with the host computer to receive commands from the user and feed information back to the user through a communication channel. Upon receiving the user’s command, the software monitor decodes the command and calls the corresponding service subroutine, such as setting breakpoints, accessing memory, accessing registers, resuming the user program, single stepping, or tracing.
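A command loop of this kind is essentially decode and dispatch. The sketch below shows one possible shape, assuming a made-up text protocol (`break`, `mem`, `reg`, `resume`) and dictionary-backed memory and registers; none of these details come from the article or from any real monitor.

```python
class SoftwareMonitor:
    """Toy software monitor; the command names are hypothetical."""

    def __init__(self, memory, registers):
        self.memory = memory        # dict: address -> word
        self.registers = registers  # dict: register name -> value
        self.breakpoints = set()
        self.handlers = {
            "break": self.set_breakpoint,
            "mem": self.read_memory,
            "reg": self.read_register,
        }

    def set_breakpoint(self, addr):
        self.breakpoints.add(int(addr, 16))
        return "ok"

    def read_memory(self, addr):
        return hex(self.memory.get(int(addr, 16), 0))

    def read_register(self, name):
        return hex(self.registers[name])

    def command_loop(self, commands):
        """Decode and dispatch commands until 'resume'; return the replies."""
        replies = []
        for line in commands:
            if not line.strip():
                continue  # ignore blank lines
            name, *args = line.split()
            if name == "resume":
                break  # leave FDM; the user program would continue here
            handler = self.handlers.get(name)
            replies.append(handler(*args) if handler else "error: unknown command")
        return replies
```

Keeping the handlers in a table makes it easy to add service subroutines, which reflects the flexibility the text attributes to software FDM.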

Although a procedure call can invoke the software monitor, it is more efficient to invoke it through an interrupt (or exception) such as a software interrupt. The exception handler backs up the user program’s system status, checks the exception’s source, performs system mode switching (if necessary), and finally calls the software monitor. Once the service of the software monitor is complete, the system leaves FDM and returns to BDM by simply resuming the execution of the user program.

The main advantages and disadvantages of software FDM are similar to those of software BDM. An additional advantage is the smooth transition software provides between BDM and FDM: Entering (or leaving) the software monitor automatically suspends (or restarts) the user program. There is no need to release (or hold) the system clock to activate (or deactivate) the user program, as in the case of hardware FDM.

Hardware FDM approaches

Implementing FDM in hardware usually requires an I/O port specifically dedicated to debug, a set of registers for storing related information, and a debug controller for handling communication with the external world and executing FDM operations. Hardware FDM is independent of the microprocessor core and is thus driven by the test clock, which is different from the core clock. While the system is in FDM, the core clock halts and the hardware FDM is under the test clock’s control. To switch back to BDM, the test clock halts and the core clock resumes.

Although there are many possible hardware FDM implementations, designs based on standard test mechanisms make core integration and software development easier. Such mechanisms include the IEEE 1149.1 JTAG architecture [17], which provides serial test access to the chip, and the newer IEEE 1500 architecture, which provides both serial and parallel access to the chip [18].

Hardware FDM has two main advantages. First, the debug circuit is independent of the microprocessor core and thus takes no programming resources from the user program. No exception or service routine is necessary. Second, the test clock can run faster than the core clock to speed up debug operations, because the debug circuit is far simpler than the microprocessor core, which is not active during FDM. The main disadvantages are similar to those of hardware BDM.

FDM communication channels

An important distinction between software and hardware FDM is their communication channels. Figure 2 shows a generic block diagram of an SoC with an ICE. The SoC has two communication channels: the external I/O bus connected to the microprocessor’s system (memory) bus, and the external test bus connected to the test access mechanism.

In software, the debug channel can be regarded as a regular I/O port, accessible through memory-mapped I/O addresses or distinct I/O ports, as defined by the instruction set architecture. Therefore, software FDM communicates with the external world through the external I/O bus. The advantage of this approach is its simplicity. The disadvantage is that other SoC components might be blocked from accessing the system bus or might have to share bus use with software FDM. Thus, the approach can slow down both system and debug performance.

Hardware FDM communicates with the external world through the external test bus. This bus is visible only to the test access mechanism, not the microprocessor software. The advantage is that debug access doesn’t interfere with other activities on the system bus. The disadvantage is that additional I/O pins are necessary.

Figure 2. ICE communication channels in an SoC. (DMA: direct memory access; DSP: digital-signal processor.)

Classification examples

The debugger program for the Motorola 68HC11 evaluation board is an example of software emulation [19]. For FDM, it uses a software monitor called Buffalo, which resides at the top of the memory. The user can input a set of commands from the keyboard. The program performs a BDM breakpoint through a software interrupt (SWI) instruction patched into the target address. It performs single stepping through a counter (the OC5 timer) that generates an interrupt to halt the user program. The value set in the counter is the exact time required to run through the monitor and execute the next user instruction.

Another example of software emulation is the ARM Angel debug monitor [8]. Angel is a program that lets developers debug applications running on ARM-based hardware. Angel requires ROM or flash memory to store the debug monitor code, and RAM to store data. A typical Angel system’s two main components are a host debugger and a debug monitor, which communicate through a physical connection such as a serial cable. The host debugger, acting as the FDM, runs on the host computer. The Angel debug monitor, acting as the BDM, runs on the target system. Angel uses its Angel Debug Protocol to communicate between the host and the target systems.

Intel’s x86 microprocessors, such as the IA-32/64, support both software emulation and hybrid emulation 1 [7, 10]. In software emulation, the software FDM resides in the INT1 and INT3 handlers. In BDM, a breakpoint instruction (0CCh) patched into the target address causes an INT3 trap and activates the breakpoint. Turning on the trap flag, which causes an INT1 trap after each instruction, achieves single stepping. Alternatively, in hybrid emulation 1, the hardware comparator handles breakpoints. There are four breakpoint registers in hardware. The breakpoint comparison occurs in the linear address space, that is, before the physical address translation.

ARM’s microprocessor ICEs are examples of hardware emulation. A hardware comparator, called the ICEBreaker, serves as the hardware BDM. The supported breakpoint conditions are very sophisticated, including masking, data dependency, chaining, and range check. User program execution halts when a breakpoint is matched. A JTAG port serves as the hardware FDM. There are two scan chains for the microprocessor core’s I/O pins, and one scan chain for configuring the ICEBreaker. The RISCWatch debugger of IBM’s PowerPC microprocessors uses a similar hardware technique [3].

ARM’s ICE hardware supports only two breakpoints (which we call BP0 and BP1). To overcome this limitation, ARM’s RealView Debugger, running on the host, uses an interesting technique to combine the hardware and software emulations [11]. RealView provides one hardware breakpoint and an unlimited number of software breakpoints. The hardware breakpoint refers to BP0 in hardware. The so-called software breakpoints in RealView are actually accomplished by BP1 in hardware, as opposed to software interrupts in the previously described software emulation method.

When a programmer places software breakpoints in the program under debug, RealView replaces the instructions in the corresponding locations with the same specific binary pattern (for example, 0xFFFFFFFF). In addition, RealView configures BP1 in the ICE hardware as a watchpoint, with the binary pattern as the target value under watch. When program execution reaches such locations, the binary pattern is fetched as an instruction from program memory and appears on the data bus. The binary pattern appearing on the data bus triggers the watchpoint and causes the processor to halt accordingly. The host debugger software can then read back the program counter through the JTAG port to determine the halted location. With this technique, classified as hybrid emulation 2, a single breakpoint circuit in hardware can support an unlimited number of software breakpoints.
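The pattern-watchpoint technique can be mimicked with a small simulation. This is a sketch of the idea only, not RealView’s implementation: program memory is a dictionary keyed by word address, instructions are never actually executed, and the function names are invented for this example. The pattern value follows the example given in the text.

```python
PATTERN = 0xFFFFFFFF  # example binary pattern from the text


def place_software_breakpoints(program, addresses, pattern=PATTERN):
    """Replace the instructions at the given addresses with the pattern."""
    patched = dict(program)
    saved = {addr: patched[addr] for addr in addresses}
    for addr in addresses:
        patched[addr] = pattern
    return patched, saved


def run_until_watchpoint(program, start, watch_value=PATTERN, step=4):
    """Fetch sequentially; halt when the fetched word equals the watched value.

    Returns the halted program counter, or None if execution runs off the end.
    """
    pc = start
    while pc in program:
        fetched = program[pc]       # the word that appears on the data bus
        if fetched == watch_value:  # the single hardware watchpoint fires
            return pc
        pc += step
    return None
```

One comparison value covers every patched location, so the number of software breakpoints is limited only by how many locations the debugger is willing to patch.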

The National Sun Yat-Sen University’s retargetable embedded ICE module is another example of hardware emulation based on the JTAG architecture [20]. To make the ICE module retargetable to a wide range of microprocessor architectures, the developers decided that its operations should be controlled only through test access port (TAP) instructions, not through instruction set architecture features such as instructions, system flags, or proprietary configuration registers. Therefore, they defined a TAP instruction set extension and additional hardware for the module’s JTAG architecture.

The Nexus 5001 Forum defined the IEEE-ISTO 5001-2003 debug interface specification to standardize the processor debug interface in embedded systems [12]. The standard adopts the hardware emulation approach. It uses the JTAG port to access the internal debug circuit and allows optional extra pins, defined by the designer, for higher debug throughput or more complex control. At least two hardware breakpoints are required to meet the standard. Vendors such as IPextreme provide Nexus 5001-compliant debug modules for microprocessors such as the ARM7 and ARM9, and on-chip bus interfaces such as the Advanced High-Performance Bus (AHB) [13].

Finally, any microprocessor core with a basic JTAG port and boundary scan cells and appropriate software interrupt capability is a typical example of hybrid emulation 2. The JTAG-related circuits serve as the hardware FDM, and the software interrupt instruction can be patched into the user program to serve as the software BDM.

Representative ICE designs

We have presented a classification scheme of in-circuit emulation approaches from the hardware and software perspective. However, quantitatively analyzing and comparing such approaches is still difficult because existing designs are implemented on significantly different platforms and for different purposes. Here, we identify a typical design for each class of ICE and show how to implement it on the same ARM7 microprocessor platform, thus allowing fair analysis and comparison.

Software emulation (FSBS)

Figure 3a shows a block diagram of the software emulation scheme for the ARM7 microprocessor. At the right is the ARM7 microprocessor core. External memory is connected to the address and data buses of the microprocessor core. The external memory is conceptually partitioned into three portions: system memory, user memory, and the communication device. Software FDM and software BDM are located in system memory and user memory, respectively. Software FDM is implemented with a software program segment called SoftFDM, activated by the SWI exception handler, which also resides in system memory. Software BDM is the instrumented user program under debug. The communication devices are memory-mapped I/O devices.

Figure 3b shows the memory layout in more detail. The upper part (system memory) contains the table that stores breakpoint information, the pool that preserves the register contents of the user program upon entering FDM, the I/O buffers that hold information while SoftFDM communicates with the host computer, and SoftFDM, which is part of the SWI handler. The instrumented user program resides in the user program memory. A target location in the user program, where the user intends to set a breakpoint, is replaced with the special SWI instruction, which serves as the FDM trigger.

Figure 3. Memory organization of software emulation for the ARM7 microprocessor: block diagram (a) and memory layout (b). (SWI: software interrupt.)

Figure 4 presents the basic structure

of the SWI exception handler. Before

entering SoftFDM, the SWI exception

handler must back up the user register

file and read the SWI instruction’s data

field to determine the exception service

vector. After entering SoftFDM, the pro-

gram’s first task is to restore the registers

polluted by the SWI exception handler.

These housekeeping activities constitute

software emulation’s major performance

overhead. SoftFDM is a command loop

that receives commands from the host

computer, decodes them, and performs

corresponding operations.
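The handler flow just described (back up registers, read the SWI data field, then run a command loop) might be sketched as follows. This is a simplified Python model, not the authors' ARM7 assembly code: the command names, the `fdm_vector` value, and the dictionary-based register file and memory are all assumptions made for illustration, and the step in which SoftFDM restores registers clobbered by the handler is not modelled.

```python
def swi_data_field(instruction):
    """Return the 24-bit data field of an ARM SWI instruction word."""
    return instruction & 0x00FFFFFF


def soft_fdm(commands, registers, memory):
    """Hypothetical SoftFDM loop: decode host commands until 'resume'."""
    output = []  # stands in for the output buffer read by the host
    for cmd in commands:
        op = cmd[0]
        if op == "read_reg":
            output.append(registers[cmd[1]])
        elif op == "write_reg":
            registers[cmd[1]] = cmd[2]
        elif op == "read_mem":
            output.append(memory[cmd[1]])
        elif op == "write_mem":
            memory[cmd[1]] = cmd[2]
        elif op == "resume":
            break  # leave FDM and return to the user program
        else:
            output.append("unknown command")
    return output


def swi_handler(instruction, user_registers, memory, commands, fdm_vector=0x42):
    """Back up the register file, check the SWI data field, then enter SoftFDM."""
    pool = dict(user_registers)            # register pool: saved user context
    if swi_data_field(instruction) != fdm_vector:
        return None                        # some other SWI service; not modelled
    output = soft_fdm(commands, pool, memory)
    user_registers.update(pool)            # restore (possibly edited) context on exit
    return output


regs = {"r0": 1, "r1": 2}
mem = {0x100: 7}
host_commands = [("read_reg", "r0"), ("write_reg", "r0", 5),
                 ("read_mem", 0x100), ("resume",), ("read_reg", "r1")]
feedback = swi_handler(0xEF000042, regs, mem, host_commands)
```

In this model the command after `resume` is never executed, and the register edit made during FDM becomes visible to the user program only when the handler exits.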

Hardware emulation (FHBH)

Figure 5 shows a block diagram of

the hardware emulation scheme for the

ARM7 microprocessor core. The hard-

ware monitor is the hardware BDM. The

JTAG controller and its related compo-

nents, such as the five I/O pins and the boundary

scan chains, serve as the hardware FDM. The

hardware monitor is connected to the microproces-

sor core’s address and data buses. The hardware

monitor checks the trigger conditions on the buses.

The hardware monitor’s major component is a

comparator. Figure 6 shows the circuit diagram of the

comparator, which supports two breakpoints. The

figure shows the details of one breakpoint. Three kinds

of information are necessary to configure a break-

point: the control, data, and address signals. Each

signal is further controlled by two parameters: the

mask and the target value. It takes a total of six

configuration registers to control a breakpoint setting.

These configuration registers allow breakpoint check-

ing to be data dependent and bitwise maskable. The

hardware monitor is controlled by the debug-enable

I/O pin and the hardware FDM. When a breakpoint is

triggered, output signal breakpt is asserted. This

disables the microprocessor core clock at the proper

cycle to halt user program execution and switch the

system into FDM, in which the system is under test

clock control.
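The masked comparison performed by the hardware monitor can be modelled in a few lines of Python. This is a behavioral sketch only, not the circuit of Figure 6: the field names, the bus values, and the meaning of the control bit are hypothetical, and the mask polarity (a 1 bit means "compare this bit") is an assumption, since the article does not state it.

```python
from dataclasses import dataclass


@dataclass
class BreakpointConfig:
    """Six configuration registers: a mask and a target value per signal group."""
    addr_mask: int
    addr_value: int
    data_mask: int
    data_value: int
    ctrl_mask: int
    ctrl_value: int


def field_matches(observed, mask, value):
    # Assumed convention: a 1 bit in the mask selects that bit for comparison.
    return (observed & mask) == (value & mask)


def breakpoint_matches(cfg, addr, data, ctrl):
    """True when the address, data, and control comparisons all match."""
    return (field_matches(addr, cfg.addr_mask, cfg.addr_value)
            and field_matches(data, cfg.data_mask, cfg.data_value)
            and field_matches(ctrl, cfg.ctrl_mask, cfg.ctrl_value))


def breakpt(breakpoints, debug_enable, addr, data, ctrl):
    """Model of the breakpt output: asserted if debug is enabled and any breakpoint hits."""
    return debug_enable and any(
        breakpoint_matches(cfg, addr, data, ctrl) for cfg in breakpoints)


# Example: trigger on an access in 0x4000-0x40FF whose low data byte is 0xFF,
# with control bit 0 set (a hypothetical "write" flag).
cfg = BreakpointConfig(addr_mask=0xFFFFFF00, addr_value=0x00004000,
                       data_mask=0x000000FF, data_value=0x000000FF,
                       ctrl_mask=0x1, ctrl_value=0x1)

hit = breakpt([cfg], True, 0x4010, 0x123456FF, 0x1)
miss = breakpt([cfg], True, 0x4110, 0x123456FF, 0x1)
disabled = breakpt([cfg], False, 0x4010, 0x123456FF, 0x1)
```

Each additional breakpoint adds another set of six registers and comparators, which is why the comparator dominates the gate count in the hardware analysis.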

Hardware FDM is implemented with the IEEE

1149.1 JTAG architecture. The serial access imposed

by the JTAG standard could cause a performance

bottleneck during debug. To improve debug perfor-

mance, designers can use newer architectures with

parallel test access, such as IEEE 1500, for the FDM

implementation, at the cost of higher hardware

overhead.

Hybrid emulation 1

Hybrid emulation 1 (FSBH) uses the software FDM

from software emulation and the hardware BDM from

hardware emulation. Figure 7 shows the block dia-

gram and the memory layout for hybrid emulation 1.

These are similar to those of software emulation


Figure 4. Basic structure of the SWI exception handler.


because the FDM is implemented with software.

However, a few modifications are worth noting. First,

an additional hardware module, the hardware mon-

itor, connects to the memory data and address buses.

The hardware monitor implements the hardware

BDM. Second, there is no instrumented code in the

user program, because the hardware

monitor performs breakpoint checking

in the background. Third, the hardware monitor behaves much like a memory controller, such as a memory management unit. Thus, instead of holding the

core clock for the microprocessor core

as in hardware emulation, the hardware

monitor halts the microprocessor core

and enters the FDM by generating a data

abort signal (using its breakpt output

signal). Fourth, the software FDM is in

the data abort exception handler, in-

stead of the software interrupt handler as

in software emulation. Fifth, an addition-

al field called the hardware monitor

registers is allocated in the system

memory for configuration of the hard-

ware monitor.

Hybrid emulation 2

Hybrid emulation 2 (FHBS) uses the

hardware FDM from hardware emulation

and the software BDM from software emulation.

Figure 8 shows the block diagram and the memory

layout for hybrid emulation 2. These illustrations are

similar to those of hardware emulation because the

FDM is implemented with hardware. However, again,

we note a few modifications. First, the hardware


Figure 5. Hardware emulation for the ARM7 microprocessor. (BDM:

background debug mode; FDM: foreground debug mode; nTRST: test reset;

TCK: test clock; TMS: test mode select; TDI: test data in; TDO: test data out.)

Figure 6. The major BDM hardware: the comparator. (BP: breakpoint.)


module connected to the

memory data and address

buses in the hardware em-

ulation scheme is not nec-

essary here, because the

software BDM checks trig-

ger conditions. Second,

although hardware per-

forms the major FDM task,

this design still needs a

small software interface to

manage the FDM control-

ler. This is the FDM control

routine in the SWI excep-

tion handler in Figure 8b.

Third, because there is no

hardware monitor to hold

the core clock, the FDM

control routine must hold

the core clock by writing

to a memory-mapped I/O

address (0x0000001C in

Figure 8b). While the clock

is held, system control can

be safely transferred to the

FDM controller. The user

can reactivate the core

clock by properly config-

uring the related I/O circuit

through the FDM controller.

Table 2 summarizes the

implementation features of

the four emulation ap-

proaches for the ARM7

microprocessor core.

Quantitative comparisons

We constructed an FPGA-

based prototyping system

to verify and demonstrate

various in-circuit emula-

tion approaches. We built

the ICE designs just de-

scribed with an academ-

ic synthesizable micropro-

cessor core that implements the ARM7 instruction set.

We downloaded the ICEs to the prototyping system for

experiments, synthesized them to standard cells, and

analyzed their gate-level features.

Hardware analysis

We synthesized the ICEs with TSMC’s 0.35-micron

standard cell library. The ARM7 core requires 46,167

gates. Table 3 presents our quantitative analysis of the


Figure 7. Hybrid emulation 1 (FSBH) for the ARM7 microprocessor: block diagram (a)

and memory layout (b).


ICE hardware for each of the four

emulation approaches. Compared with

the ARM7 core, the gate count overheads

of the FSBS, FHBH, FSBH, and FHBS

approaches are 0%, 15%, 11%, and 4%, respectively.

The major gate count contributor is the

hardware comparator. Therefore, the

designer must be careful in deciding

how many breakpoints and conditions

are supported by hardware. Regarding

the core clock speed, FHBH and FHBS

have the same speed because they have

the same critical path. FSBS and FSBH

also have the same speed. In addition, the core clock period of FHBH and FHBS is only about 0.4% longer than that of FSBS and FSBH (17.10 ns versus 17.03 ns, per Table 3). The overhead is due to the scan

cells on the critical path. The experi-

ment shows that most of the ICE

hardware components are not on the

critical path and don’t affect system

performance.

FHBH and FHBS have another clock,

the test clock, which drives the hardware

while the core clock is halted. The

experiment shows that the test clock period is about 20% shorter than the core clock period (13.78 ns versus 17.10 ns),

because most of the complex system

hardware modules are not used during

test. This indicates that the hardware

debug mechanism can operate at a

faster speed than normal system speed.
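The percentages quoted in this hardware analysis can be reproduced directly from the gate counts and clock periods reported in the text and in Table 3. The short Python calculation below is only a worked check of that arithmetic; the variable names are illustrative.

```python
# Figures taken from the text and Table 3.
CORE_GATES = 46167
ICE_GATES = {"FSBS": 0, "FHBH": 6992, "FSBH": 4912, "FHBS": 2046}
CORE_CLOCK_NS = {"FSBS": 17.03, "FHBH": 17.10, "FSBH": 17.03, "FHBS": 17.10}
TEST_CLOCK_NS = 13.78

# ICE gate count as a percentage of the ARM7 core gate count.
gate_overhead_pct = {name: round(100 * gates / CORE_GATES)
                     for name, gates in ICE_GATES.items()}

# Extra core clock period of the JTAG-based designs relative to the others.
core_clock_overhead_pct = round(
    100 * (CORE_CLOCK_NS["FHBH"] - CORE_CLOCK_NS["FSBS"]) / CORE_CLOCK_NS["FSBS"], 1)

# How much shorter the test clock period is than the core clock period.
test_period_reduction_pct = round(
    100 * (1 - TEST_CLOCK_NS / CORE_CLOCK_NS["FHBH"]), 1)

print(gate_overhead_pct)          # {'FSBS': 0, 'FHBH': 15, 'FSBH': 11, 'FHBS': 4}
print(core_clock_overhead_pct)    # 0.4
print(test_period_reduction_pct)  # 19.4
```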

Software aspects

The related software modules are

written in the ARM7 assembly language,

and assembled and linked with the ARM SDT v2.5 development tool. The machine code is downloaded into the

embedded memory in the chip. Table 4

presents our quantitative analysis of the

ICE software for the four emulation

approaches. Of all the approaches,

FHBH needs no software code or re-

sources. FSBH needs the largest software

code and consumes one exception

vector resource, but it can also debug

the original user program. On the other

hand, FHBS has the same software

debug mechanism as FSBS but requires


Figure 8. Hybrid emulation 2 (FHBS) for the ARM7 microprocessor: block

diagram (a) and memory layout (b).


Table 2. Implementation features of the four emulation approaches for the ARM7 microprocessor.

| Feature | Software emulation (FSBS)* | Hardware emulation (FHBH)* | Hybrid emulation 1 (FSBH)* | Hybrid emulation 2 (FHBS)* |
|---|---|---|---|---|
| FDM approach | SoftFDM program in software exception handler | JTAG controller | SoftFDM program in data abort exception handler | JTAG controller with simple control routine in software exception handler |
| BDM approach | Instrumented code (SWI instruction) | Hardware monitor | Same as hardware emulation | Same as software emulation |
| Mode switch: BDM to FDM | Execute SWI instruction | breakpt signal stops core clock | breakpt signal raises data abort exception | Same as software emulation |
| Mode switch: FDM to BDM | Exit SWI exception handler | Input and execute JTAG restart instruction to resume core clock | Exit data abort exception handler | Same as hardware emulation |
| Suspend user program | Jump to SWI exception handler | Hold core clock with breakpt signal of hardware monitor | Jump to data abort exception handler | Hold core clock with SWI exception handler |
| Communication interface | Memory-mapped I/O | JTAG port | Same as software emulation | Same as hardware emulation |
| Set breakpoint | Use SWI instruction to patch instruction at breakpoint address | Scan breakpoint values into hardware monitor registers | Use store instruction to store target values in hardware monitor register buffer of system memory | Same as software emulation |
| Register access | Execute memory store instruction to store register values in output buffer of system memory | Scan in and execute memory store instruction to put register values on memory data bus; then scan out bus through boundary scan chain | Same as software emulation | Same as hardware emulation |
| Memory access | Execute memory load and store instructions to copy memory content to output buffer of system memory accessible by host | Scan in and execute memory load instruction to read memory content in register file; then use register access to output content to host | Same as software emulation | Same as hardware emulation |
| Single-step | Patch consecutive instructions | Set new breakpoint on next instruction address or use end-of-instruction signal as breakpoint trigger | Same as hardware emulation | Same as software emulation |

* F: foreground; B: background; H: hardware; S: software.

Table 3. Quantitative comparison of ICE hardware in the four emulation approaches for the ARM7 microprocessor.

| Features | Software emulation (FSBS)* | Hardware emulation (FHBH)* | Hybrid emulation 1 (FSBH)* | Hybrid emulation 2 (FHBS)* |
|---|---|---|---|---|
| Gate count | 0 | 6,992 | 4,912 | 2,046 |
| Core clock, test clock (ns) | 17.03, NA | 17.10, 13.78 | 17.03, NA | 17.10, 13.78 |

* F: foreground; B: background; H: hardware; S: software.


far less code in the SWI exception handler, thus saving

precious system memory space.

ICE features

Table 5 presents our quantitative analysis of the ICE

features for the four emulation approaches. The first

three ICE features in the table are related to the ICE

capability. FHBH has the most complex and inflexible

design, because everything is in hardware. FSBS is the

simplest and most flexible, because everything is in

software. FSBH and FHBS have medium complexity,

and they are complementary to each other in the

flexibility of FDM and BDM. FHBH and FSBH can

provide sophisticated breakpoint conditions, where-

as FSBS and FHBS detect only instruction accesses.

The analysis shows that when choosing the appro-

priate in-circuit emulation approach, the designer

must consider the flexibility requirements for FDM

and BDM in a specific SoC development environ-

ment.

The last five ICE features in Table 5 are the latencies for various ICE operations. We report latency in physical time rather than cycle counts, because operations in FHBH and FHBS involve both the core clock and the test clock, each running at its own speed. In addition, because some operations involve interactions between the SoC and the external world, the bandwidth of the communication channel also affects the latencies.

Therefore, Table 5 shows two versions of latencies,

whenever appropriate, designated by S and P to

indicate serial and parallel access. For the hardware

FDM, serial access refers to the IEEE 1149.1 JTAG

architecture, and parallel access refers to the IEEE

1500 architecture. For the software FDM, serial access

and parallel access refer to an external I/O bus with 1-

bit and 32-bit bandwidth, respectively. Furthermore,

some ICE operations can be broken down into three steps, whose latencies are listed in parentheses in the table:

• set up the debug command,

• execute the command, and

• send feedback to the user.

The analysis of the ICE operation latencies shows that FHBH has the shortest latencies, especially for breakpoint detection, where the only latency is the time spent waiting for the current instruction to complete before the break can be taken. This feature makes FHBH the best candidate for real-time debug. The next-best candidate is FHBS. The worst one is FSBH: once the hardware monitor detects a breakpoint and raises the data abort exception, the system must wait for the current instruction to complete, preserve the system status, and then transfer control to the software FDM.

Moreover, the latency breakdown analysis indicates that the major contributor to the latency is the time spent receiving commands from and sending feedback to the user, rather than the time spent executing the debug command. This observation suggests that ICE performance for an SoC can be greatly improved by two strategies. First, develop a communication channel with high bandwidth and an efficient protocol. Second, store macros of ICE operations on chip, similar to the concept of microprogramming, to avoid sending a large number of primitive operations through the channel.
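As a worked check of this observation, the Python snippet below computes what fraction of the register-access latency in Table 5 is not spent executing the command. It treats the setup and feedback components of each breakdown as communication time, which is an interpretation of the three-step breakdown rather than a definition given in the article; the function and variable names are illustrative.

```python
# (setup, execute, feedback) latencies in ns for accessing one register word,
# copied from Table 5. "S" is serial access and "P" is parallel access.
REGISTER_ACCESS_NS = {
    ("FSBS", "S"): (14714, 34, 545),
    ("FSBS", "P"): (460, 34, 17),
    ("FHBH", "S"): (2260, 34, 978),
    ("FHBH", "P"): (248, 34, 55),
}


def communication_share(breakdown):
    """Fraction of the total latency spent on setup plus feedback."""
    setup, execute, feedback = breakdown
    return (setup + feedback) / (setup + execute + feedback)


shares_pct = {key: round(100 * communication_share(value), 1)
              for key, value in REGISTER_ACCESS_NS.items()}

for (approach, access), pct in shares_pct.items():
    print(f"{approach} ({access}): {pct}% of latency is setup and feedback")
```

Under this reading, roughly 90% to almost 100% of the register-access latency lies outside command execution, for both the serial and the parallel channel.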


Table 4. Quantitative comparison of ICE software in the four emulation approaches for the ARM7 microprocessor.

| Features | Software emulation (FSBS)* | Hardware emulation (FHBH)* | Hybrid emulation 1 (FSBH)* | Hybrid emulation 2 (FHBS)* |
|---|---|---|---|---|
| Code size (bytes): equivalent hardware gate count | 888 | NA | 920 | 44 |
| Resources used | One exception vector: SWI | NA | One exception vector: data abort exception | SWI |
| Debug on original user program | No | Yes | Yes | No |

* F: foreground; B: background; H: hardware; S: software.


Application domains

Table 6 gives the suitable SoC application domains

for the four emulation approaches.

THE QUANTITATIVE ANALYSES show that the FHBH

hardware emulation is suitable for SoC designs where

the extensive hardware cost is affordable and real-time

hardware-software debug is a strict requirement. The

FSBS software emulation is suitable for SoC designs

with a rich memory resource and a simple I/O

structure, in which functional software debug, and

not timing behavior, is the primary concern. It can also be used as a supplement to FHBH hardware emulation to provide extra capacity that the hardware does not provide, such as more breakpoints. The FSBH

approach is suitable for SoC designs requiring low-


Table 5. Quantitative comparison of ICE features in the four emulation approaches for the ARM7 microprocessor.

| Features | Software emulation (FSBS)* | Hardware emulation (FHBH)* | Hybrid emulation 1 (FSBH)* | Hybrid emulation 2 (FHBS)* |
|---|---|---|---|---|
| Design complexity | Simple, flexible | Complex, inflexible | Medium, flexible FDM but inflexible BDM | Medium, inflexible FDM but flexible BDM |
| Breakpoint number | Unlimited, subject only to memory size | 2, fixed in number | 2, fixed in number | Unlimited, subject only to memory size |
| Breakpoint condition | Instruction address only | Instruction or data access with masking, data dependency checking | Instruction or data access with masking, data dependency checking | Instruction address only |
| Breakpoint setup** | S: 20947 (20828, 119, 0); P: 885 (766, 119, 0) | S: 3790 (152, 3638, 0); P: 317 (69, 248, 0) | S: 124761 (124251, 510, 0); P: 4393 (3883, 510, 0) | S: 11268 (303, 10965, 0); P: 865 (138, 727, 0) |
| Latency for breakpoint detection (ns)** | 732 (0, 732, 0) | 17 to 272 (0, 17 to 272, 0); wait for one instruction to complete execution | 749 to 1,004 (17 to 272, 732, 0); wait for one instruction to complete execution | 633 (0, 633, 0) |
| Latency to resume user program (ns)** | S: 12210 (11444, 766, 0); P: 1124 (358, 766, 0) | S: 6818 (303, 6515, 0); P: 949 (138, 811, 0) | S: 11751 (11444, 307, 0); P: 665 (358, 307, 0) | S: 14797 (303, 14494, 0); P: 1458 (138, 1320, 0) |
| Latency to access one memory word (ns)** | S: 15702 (15072, 85, 545); P: 920 (818, 85, 17) | S: 6527 (3541, 51, 2935); P: 657 (441, 51, 165) | S: 15702 (15072, 85, 545); P: 920 (818, 85, 17) | S: 6527 (3541, 51, 2935); P: 657 (441, 51, 165) |
| Latency to access one register word (ns)** | S: 15293 (14714, 34, 545); P: 511 (460, 34, 17) | S: 3272 (2260, 34, 978); P: 337 (248, 34, 55) | S: 15293 (14714, 34, 545); P: 511 (460, 34, 17) | S: 3272 (2260, 34, 978); P: 337 (248, 34, 55) |

* F: foreground; B: background; H: hardware; S: software.

** S: serial access; P: parallel access. Numbers in parentheses indicate time to set up the debug command, time to execute this command, and time to send feedback to the user.

Table 6. Suitable SoC application domains of the four emulation approaches for the ARM7 microprocessor.

| Software emulation (FSBS)* | Hardware emulation (FHBH)* | Hybrid emulation 1 (FSBH)* | Hybrid emulation 2 (FHBS)* |
|---|---|---|---|
| Functional software debug with rich memory resource and simple I/O structure and timing requirement | Real-time low-level hardware and software debug with complex I/O structure and timing requirement | Low-level hardware and software debug with flexibility in FDM but less real-time response | Functional software debug with limited memory resource |

* F: foreground; B: background; H: hardware; S: software.


level hardware and software debug support and

flexibility in the FDM implementation. FHBS is suitable for functional software debug when memory is tight. In the future, we’d like to investigate the in-circuit emulation problems in superscalar processors, VLIW processors, multiprocessors, and hierarchical-core-based systems.

Acknowledgments

We thank the editors and reviewers for their

valuable suggestions in improving this work. This

work was partially funded by the National Science

Council (Taiwan) under contract 91-2218-E-110-005.

References

1. M. Rafiquzzaman, Microprocessors and Microcomputer Development Systems, Harper & Row, 1984.

2. S. Furber, ARM System-on-Chip Architecture, 2nd ed., Addison-Wesley, 2000.

3. “RISCWatch Debugger,” IBM, http://www-306.ibm.com/chips/products/powerpc/tools/riscwatc.html.

4. K. Kikuchi and J. Suchyta, HCS08 Background Debug Mode versus HC08 Monitor Mode, Motorola application note AN2497/D, June 2003; http://e-www.motorola.com/files/microcontrollers/doc/app_note/AN2497.pdf.

5. MPC565/MPC566 User’s Manual, MPC565UM/D revision 2, Motorola, 2002.

6. Tricore 1 Architecture Manual, v1.3.3, Infineon Technologies, 2002.

7. Pentium Processor Family User’s Manual, vol. 3, Architecture and Programming Manual, Intel, 1994.

8. “Angel Debug Monitor,” ARM, http://www.arm.com/products/DevTools/AngelDebugMonitor.html.

9. “Debugging Support, Embedded Intel386DX Microprocessor,” data sheet, Intel, 1995.

10. Intel 64 and IA-32 Architectures Software Developer’s Manual, vol. 3b, System Programming Guide, 2006; http://www.intel.com/design/processor/manuals/253669.pdf.

11. “RealView Debugger,” ARM, http://www.arm.com/products/DevTools/RVD.html.

12. The Nexus 5001 Forum Standard for a Global Embedded Processor Debug Interface, IEEE Industry Standards and Technology Organization, 23 Dec. 2003; http://www.nexus5001.org/standard.html.

13. “Freescale Nexus 5001 Software Debug Interfaces,” IPextreme, http://www.ip-extreme.com/IP/nexus_5001.html.

14. M. El Shobaki and L. Lindh, “A Hardware and Software Monitor for High-Level System-on-Chip Verification,” Proc. 2nd Int’l Symp. Quality Electronic Design, IEEE CS Press, 2001, pp. 56-61.

15. C. Melear, “Emulation Techniques for Microcontrollers,” Proc. WESCON/97 Conf., IEEE Press, 1997, pp. 532-541.

16. A.B.T. Hopkins and K.D. McDonald-Maier, “Debug Support for Complex Systems on-Chip: A Review,” IEE Proc. Computers and Digital Techniques, vol. 153, no. 4, 3 July 2006, pp. 197-207.

17. IEEE Std. 1149.1-2001, Test Access Port and Boundary-Scan Architecture, IEEE, 2001.

18. IEEE Std. 1500, Embedded Core Test (SECT), IEEE Computer Society, 2005; http://grouper.ieee.org/groups/1500/index.html.

19. W.C. Wray, J.D. Greenfield, and R. Bannatyne, Using Microprocessors and Microcomputers: The Motorola Family, 4th ed., Prentice Hall, 1998.

20. I.-J. Huang et al., “A Retargetable Embedded In-Circuit Emulation Module for Microprocessors,” IEEE Design & Test, vol. 19, no. 4, July/Aug. 2002, pp. 28-38.

Chung-Fu Kao completed the work described in

this article while completing his PhD at National Sun

Yat-Sen University, Taiwan. His research interests

include SoC platform design, design for verification,

and hardware-software coverification. He has a BS in

computer science and information engineering from

Tamkang University, Taiwan, and an MS and a PhD in

computer science and engineering from National Sun

Yat-Sen University.

Hsin-Ming Chen is an advanced engineer at

Andes Technology. He completed the work de-

scribed in this article while completing his MS at

National Sun Yat-Sen University. His research inter-

ests include embedded-ICE design and DFT. He has

a BS in computer information science from National

Chin-Yi University of Technology, Taiwan, and an MS

in computer science and engineering from National

Sun Yat-Sen University.

Ing-Jer Huang is a professor in the Department of

Computer Science and Engineering at National Sun

Yat-Sen University. His research interests include

microprocessors, SoC design, design automation,

system software, embedded systems, and hardware-

software codesign. He has a BS in electrical

engineering from National Taiwan University, and an

MS and a PhD in computer engineering from the


University of Southern California. He is a member of

the IEEE and the ACM.

Direct questions and comments about this article to Ing-Jer Huang, Embedded Systems Laboratory (F5014), Dept. of Computer Science and Engineering, National Sun Yat-Sen University, 70 Lien-Hai Rd, Kaohsiung City, Taiwan 80424 ROC; ijhuang@cse.nsysu.edu.tw.
