A LOW-POWER SINGLE-PHASE CLOCK MULTIBAND

FLEXIBLE DIVIDER

ABSTRACT

In this paper, a wideband 2/3 prescaler is verified in the design of the proposed wideband

multimodulus 32/33/47/48 prescaler. A dynamic-logic multiband flexible integer-N divider is

designed that uses the wideband 2/3 prescaler and the multimodulus 32/33/47/48 prescaler. Since

the multimodulus 32/33/47/48 prescaler has a maximum operating frequency of 6.2 GHz, the P and

S counters can be programmed to divide over the whole range of frequencies. In practice, the P

and S counters are programmed according to the band of interest. The proposed multiband flexible

divider also uses an improved loadable bit-cell for the swallow (S) counter. It consumes 0.96 mW

and 2.2 mW in the 2.4 GHz and 5 GHz bands, respectively, and provides a solution for low-power

PLL synthesizers for Bluetooth, Zigbee, IEEE 802.15.4, and IEEE 802.11a/b/g WLAN applications

with variable channel spacing.

CHAPTER 1

INTRODUCTION

1.1. GENERAL

Wireless LAN (WLAN) systems in the multigigahertz bands, such as HiperLAN II and IEEE

802.11a/b/g, are recognized as leading standards for high-rate data transmissions, and

standards like IEEE 802.15.4 are recognized for low-rate data transmissions. The demand

for lower-cost, lower-power, multiband RF circuits has increased in conjunction with the

need for a higher level of integration. The frequency synthesizer, usually implemented by a

phase-locked loop (PLL), is one of the power-hungry blocks in the RF front-end and the

first-stage frequency divider consumes a large portion of power in a frequency

synthesizer. Integrated synthesizers reported for WLAN applications at 5 GHz consume

up to 25 mW in CMOS realizations, where the first-stage divider is

implemented using an injection-locked divider which consumes large chip area and has a

narrow locking range. The best published frequency synthesizer at 5 GHz consumes 9.7

mW at a 1-V supply; its complete divider consumes around 6 mW, and the

first-stage divider is implemented using a source-coupled logic (SCL) circuit, which

allows higher operating frequencies but uses more power. Dynamic latches are faster and

consume less power than static ones. One reported frequency synthesizer

uses a prescaler as its first-stage divider, but the divider still consumes considerable power. Most IEEE

802.11a/b/g frequency synthesizers employ SCL dividers as their first stage, while

dynamic latches are not yet adopted for multiband synthesizers. In this paper, a dynamic

logic multiband flexible integer-N divider based on pulse-swallow topology is proposed

which uses a low-power wideband 2/3 prescaler and a wideband multimodulus

32/33/47/48 prescaler. The divider also uses an improved low-power loadable bit-cell for

the swallow (S) counter. The frequency synthesizer is one of the basic building blocks in

modern communication systems. The operating frequency of the frequency synthesizer is

limited by the frequency divider and the voltage-controlled oscillator. The function of

channel selection in the frequency synthesizer demands programmable division ratios for

the frequency divider. The integer-N frequency synthesizer is more practical, less costly

and of low spurious sideband performance as compared with the fractional-N frequency

synthesizer. It is usually formed by a prescaler, a program counter (P counter), and a

swallow counter (S counter). Such a topology can provide a programmable division ratio

of N × P + S, where N, P, and S are the division ratios of the three blocks, respectively. The

prescaler provides a dual modulus of N/(N+1). The P counter provides a fixed division

ratio according to the requirement of the overall division ratio, while the continuous

division ratios from 3 to 2^n are achieved through the S counter by periodically reloading

the divide-by-2 stages, where n is the number of stages of the S counter. The continuous

division ratio is used to select the desired channels. Much research has been

focused on the prescaler design for its highest operating frequency . However, in the

modern communication system, there is an increasing demand for multi-standards

applications. The requirements for wideband and high-resolution operation continue to

be a problem. To satisfy these requirements, different reference frequencies and

different arrangements of the N, P, and S counters are selected for different applications. For

example, some reported designs cover only the UNII bands. In this paper, a new wideband,

high-resolution programmable frequency divider is proposed. The wide band and high

resolution are obtained by using the all-stage programmable topology in both counters.

The high-speed frequency divider is a key block in frequency synthesis. The prescaler is

the most challenging part in the high-speed frequency-divider design because it operates

at the highest input frequency. A dual-modulus prescaler usually consists of a divide-by-

2/3 (or 4/5) unit followed by several asynchronous divide-by-2 units. The operation of the

divide-by-2/3 unit at the highest input frequency makes it the bottleneck of the prescaler

design. To achieve the two different division ratios, D flip-flops (DFFs) and additional

logic gates, which reduce the operating frequency by introducing an additional

propagation delay, are used in the unit. The power consumption of this divide-by-2/3

unit, which is the greatest portion of the total power consumption in the prescaler,

significantly increases due to the power consumption of the additional components.
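The pulse-swallow relation N × P + S described in this section can be checked with a short behavioural model. The Python sketch below is illustrative only: the function name and the example values are assumptions made for this illustration, and it models counting behaviour rather than any circuit in this work.

```python
def pulse_swallow_division_ratio(n, p, s):
    """Count the input cycles in one output cycle of a pulse-swallow divider.

    n: lower modulus of the N/(N+1) prescaler
    p: programmed value of the program (P) counter
    s: programmed value of the swallow (S) counter, with 0 <= s <= p
    """
    if not 0 <= s <= p:
        raise ValueError("the swallow count S must satisfy 0 <= S <= P")
    input_cycles = 0
    swallow_remaining = s
    for _ in range(p):                 # the P counter counts p prescaler pulses
        if swallow_remaining > 0:
            input_cycles += n + 1      # prescaler divides by N+1 while S counts
            swallow_remaining -= 1
        else:
            input_cycles += n          # prescaler divides by N afterwards
    return input_cycles


# Hypothetical values, chosen only to exercise the formula N * P + S.
assert pulse_swallow_division_ratio(32, 75, 0) == 32 * 75 + 0
assert pulse_swallow_division_ratio(32, 75, 5) == 32 * 75 + 5
```

Each of the S swallowed pulses adds exactly one extra input cycle, which is why the total reduces to N × P + S.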

1.2. OBJECTIVE

1. To propose a dynamic-logic multiband flexible integer-N divider based on pulse-swallow

topology that uses a low-power wideband 2/3 prescaler and a wideband

multimodulus 32/33/47/48 prescaler.

2. To achieve high-rate data transmissions.

3. To achieve low-rate data transmissions.

1.3. EXISTING SYSTEM

1. The demand for lower-cost, lower-power, multiband RF circuits has increased in

conjunction with the need for a higher level of integration. The frequency synthesizer,

usually implemented by a phase-locked loop (PLL), is one of the power-hungry

blocks in the RF front-end and the first-stage frequency divider consumes a large

portion of power in a frequency synthesizer.

2. Integrated synthesizers reported for WLAN applications at 5 GHz

consume up to 25 mW in CMOS realizations, where the first-stage divider is

implemented using an injection-locked divider which consumes large chip area and

has a narrow locking range.

3. The best published frequency synthesizer at 5 GHz consumes 9.7 mW at a 1-V supply; its complete

divider consumes around 6 mW, and the first-stage divider is implemented

using the source-coupled logic (SCL) circuit, which allows higher operating

frequencies but uses more power.

1.3.1 LITERATURE SURVEY

1.3.1.1 A 13.5-mW 5-GHz frequency synthesizer with dynamic-logic frequency divider- S.;

Levantino, S.; Samori, C.; Lacaita, A.L.-Feb. 2004.

DESCRIPTION: The adoption of dynamic dividers in CMOS phase-locked loops for

multigigahertz applications allows the power consumption to be reduced substantially without

impairing the phase noise and the power supply sensitivity of the phase-locked loop (PLL). A 5-

GHz frequency synthesizer integrated in a 0.25-μm CMOS technology demonstrates a total

power consumption of 13.5 mW. The frequency divider combines the conventional and the

extended true-single-phase-clock logics. The oscillator employs a rail-to-rail topology in order to

ensure a proper divider function. This PLL intended for wireless LAN applications can

synthesize frequencies between 5.14 and 5.70 GHz in steps of 20 MHz. A low-power 5-GHz

CMOS frequency synthesizer for wireless LAN transceivers has been presented. The PLL

integrated in a 0.25-μm CMOS technology consumes only 13.5 mW, thanks to a dynamic TSPC

divider. This class of dividers is demonstrated to be suitable for multigigahertz synthesizers,

since it does not impair the power supply rejection or the phase noise performance. Wireless

LAN systems in the 5–6-GHz band, such as HiperLAN II and IEEE 802.11a, are recognized as

the leading standards for high-rate data transmissions. Being intended for mobile operations, the

radio transceiver has a limited power budget. The frequency synthesizer, usually implemented by

a phase-locked loop (PLL), is one of the most critical blocks in terms of average current

dissipation since it operates extensively for both receiving and transmitting. The best published

integrated synthesizers around 5 GHz suitable for wireless LAN receivers consume up to

25 mW in both CMOS and bipolar realizations. Other synthesizers embedded in 802.11a-

compliant transceivers can consume up to 200 mW.

DISADVANTAGES OF EXISTING:

This high power consumption is mainly due to the first stages of the frequency divider

that often dissipate half of the total power. Due to the high input frequency, the first

stage of the divider cannot be implemented in conventional static CMOS logic.

Instead, it is commonly realized in source-coupled logic (SCL), which allows higher

operating frequency, but burns more power.

ADVANTAGES OF PROPOSED:

Dynamic latches are known to be faster and more compact than static ones. The true-

single-phase-clock (TSPC) design allows the dynamic latch to be driven with a single clock

phase, thus avoiding the skew problem.

The use of dynamic logic is not only possible up to 6 GHz, but also extremely effective

in reducing the synthesizer power dissipation.

1.3.1.2 Design and Optimization of the Extended True Single-Phase Clock-Based Prescaler

-Xiao Peng Yu; Manh Anh Do; Wei Meng Lim; Kiat Seng Yeo; Jian-Guo Ma- :

Nov.2006.

DESCRIPTION: The power consumption and operating frequency of the extended true

single-phase clock (E-TSPC)-based frequency divider are investigated. The short-circuit power

and the switching power in the E-TSPC-based divider are calculated and simulated. A low-power

divide-by-2/3 unit of a prescaler is proposed and implemented using a CMOS technology.

Compared with the existing design, a 25% reduction of power consumption is achieved. A

divide-by-8/9 dual-modulus prescaler implemented with this divide-by-2/3 unit using a 0.18-μm

CMOS process is capable of operating up to 4 GHz with low power consumption. The

prescaler is implemented in low-power high-resolution frequency dividers for wireless local area

network applications. The design and optimization of a high-speed E-TSPC-based prescaler has

been carried out by investigation of the operating frequency and power consumption of the E-

TSPC circuit. A new divide-by-2/3 unit with low power consumption has been proposed. It is

suitable for the high-speed CMOS prescaler design. A divide-by-8/9 dual-modulus prescaler

built with the proposed unit has been implemented to achieve ultra-low power

consumption. The dual-modulus operation above 4 GHz in the TSPC-based prescaler has first

been achieved. The prescaler has been implemented in high-resolution frequency dividers. It is

suitable for the wireless communication system below 4 GHz. The operation of this proposed

prescaler and frequency divider have also been silicon verified. The high-speed frequency divider

is a key block in frequency synthesis. The prescaler is the most challenging part in the high-

speed frequency-divider design because it operates at the highest input frequency. A dual-

modulus prescaler usually consists of a divide-by-2/3 (or 4/5) unit followed by several

asynchronous divide-by-2 units. The operation of the divide-by-2/3 unit at the highest input

frequency makes it the bottleneck of the prescaler design.
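The structure described above, a divide-by-2/3 unit followed by asynchronous divide-by-2 units, can be illustrated with a behavioural model. The Python sketch below is only an assumption-laden illustration: the function name, the choice of which ripple state triggers the divide-by-3 cycle, and the edge-counting approach are not taken from the cited design, whose feedback logic is implemented with gates and flip-flops.

```python
def simulate_dual_modulus_prescaler(num_div2_stages, extend, num_input_edges):
    """Behavioural model: a divide-by-2/3 unit followed by ripple divide-by-2 stages.

    Returns the number of input edges between successive rising edges of the
    final output. With `extend` set, the 2/3 unit divides by 3 once per output
    cycle (assumed here to happen when every ripple stage holds 0), which adds
    one input edge to the output period.
    """
    ripple = [0] * num_div2_stages     # state of each asynchronous divide-by-2 stage
    edges_in_unit = 0                  # input edges counted by the 2/3 unit
    edges_since_output = 0
    periods = []
    for _ in range(num_input_edges):
        edges_since_output += 1
        edges_in_unit += 1
        modulus = 3 if (extend and not any(ripple)) else 2
        if edges_in_unit < modulus:
            continue
        edges_in_unit = 0              # the 2/3 unit emits one pulse
        for i in range(num_div2_stages):
            ripple[i] ^= 1
            if ripple[i] == 0:         # falling edge: the next stage is not clocked
                break
        else:                          # the last stage produced a rising edge
            periods.append(edges_since_output)
            edges_since_output = 0
    return periods[1:]                 # drop the first, partial period


# Two divide-by-2 stages after the 2/3 unit give a divide-by-8/9 prescaler.
assert set(simulate_dual_modulus_prescaler(2, False, 200)) == {8}
assert set(simulate_dual_modulus_prescaler(2, True, 200)) == {9}
```

In this model only the 2/3 unit sees every input edge, which mirrors why that unit is the speed and power bottleneck of the prescaler.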


The extended true single-phase clock (E-TSPC) logic is proposed to increase the

operating frequency. However, this causes additional power consumption.

In modern wireless communication systems, the power consumption is a key

consideration for the longer battery life. The MOS current mode logic (MCML) circuit,

which is of high power consumption, is commonly used to achieve the high operating

frequency, while a true single-phase clock (TSPC) dynamic circuit, which only

consumes power during switching, has a lower operating frequency.


In this paper, the power consumption and operating frequency in the E-TSPC logic style

are evaluated. The two major sources of power consumption, namely, the short-circuit

power and the switching power, in the E-TSPC divide-by-2 unit are calculated and

simulated.

Based on the analysis, a new divide-by-2/3 unit is proposed to achieve low power

consumption by reducing the switching activities and the short-circuit current in the DFFs

of the unit, and a dual-modulus prescaler implemented with the unit is proposed.

1.3.1.3 A Dynamic-Logic Frequency Divider for 5-GHz WLAN Frequency Synthesizer-

Yue-Fang Kuo and Ro-Min Weng- National Dong Hwa University Hualien, Taiwan,

Republic of China.

DESCRIPTION: A dynamic-logic frequency divider for fully integrated CMOS frequency

synthesizer is presented in this paper. The divider based on the dual-modulus prescaler and

dynamic logic circuit is designed to reduce the power consumption, transistor-counts, and chip

area. The simulation results show the proposed circuit achieved the operating frequency band

from 5.15GHz to 5.825GHz for wireless local area network applications. A simple architecture

of the dynamic-logic frequency divider has been demonstrated in a standard 0.18μm CMOS

technology. The frequency divider is designed without counters and the simulation results show

the advantages in low power consumption and less chip area. The proposed frequency divider

covers the operating frequency band from 5.15 GHz to 5.825 GHz in steps of 20 MHz, which

covers 15 channels in WLAN applications. IEEE 802.11a and HiperLAN are standards of

wireless data networks with frequency band operated from 5 to 6GHz which covers fifteen

channels with a channel spacing of 20MHz. The frequency synthesizers are widely used to

generate local oscillation (LO) signals in modern communication systems. In order to cover the

required carriers and operate from an input frequency of 5 GHz, the division ratio of the divider has to be

programmed from 257 to 294. The operating frequency of a frequency synthesizer is limited by

the frequency divider as well as the voltage controlled oscillator (VCO).
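In an integer-N loop the output frequency equals the division ratio times the reference frequency. The Python sketch below maps the quoted ratio range of 257 to 294 onto output frequencies, assuming that the reference frequency equals the 20 MHz channel spacing. That reference value is an assumption, since the text does not state it, and the function name is hypothetical.

```python
def output_frequency_mhz(division_ratio, f_ref_mhz):
    """Locked output frequency of an integer-N PLL, in MHz."""
    return division_ratio * f_ref_mhz


F_REF_MHZ = 20                         # assumed: reference equals the channel spacing
RATIOS = range(257, 294 + 1)           # ratio range quoted in the text

frequencies = [output_frequency_mhz(n, F_REF_MHZ) for n in RATIOS]

# Under this assumption the ratios span 5140 MHz to 5880 MHz in 20 MHz steps.
assert frequencies[0] == 5140
assert frequencies[-1] == 5880
assert all(b - a == F_REF_MHZ for a, b in zip(frequencies, frequencies[1:]))
```

Incrementing the division ratio by one moves the output by exactly one reference period's worth of frequency, which is why the channel spacing of an integer-N synthesizer is tied to its reference.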

DISADVANTAGES IN EXISTING:

For the WLAN standard, the most common high-speed frequency dividers are based on a pulse-swallow

architecture. The architectures require two additional counters for generation of a desired

division ratio. It occupies many gate-counts, large chip area, and consumes extra power.

The high power consumption is mainly due to the first stages of the frequency divider

that often consume half of the total power. The first stage of the divider cannot be

implemented in a dynamic TSPC circuit.


This paper proposes a new frequency divider keeping the same function as a conventional

one without employing a swallow counter, which would consume extra power and unnecessary

chip area.

The proposed topology is based on the counter-less and dual-modulus counter detector.

1.3.1.4 Design of a low power wideband high resolution programmable frequency divider-

X. P. Yu et al.- Sep. -2005

DESCRIPTION: The design of a high-speed wide-band high resolution programmable

frequency divider is investigated. A new reloadable D flip-flop for the high speed programmable

frequency divider is proposed. It is optimized in terms of propagation delay and power

consumption as compared with the existing designs. Measurement results show that an all-stage

programmable counter implemented with this D flip-flop using the Chartered 0.18 μm CMOS

process is capable of operating up to 1.8 GHz for a 1.8 V supply voltage and a 5.8-mW power

consumption. By using this counter, an ultra-wide range high resolution frequency divider is

achieved with low power consumption for 5–6-GHz wireless LAN applications. The design

difficulties of the wide-band high resolution programmable frequency divider for multi-standard

application are investigated. A high speed low power counter is successfully implemented for

multi-standard operations. Measurement results show that the first GHz all-stage programmable

divider with low power consumption is achievable with the proposed bitcell.

DRAWBACKS OF EXISTING:

The frequency synthesizer is one of the basic building blocks in modern communication

systems. The operating frequency of the frequency synthesizer is limited by the

frequency divider and the voltage-controlled oscillator. The function of channel selection

in the frequency synthesizer demands programmable division ratios for the frequency

divider.

Much research has been focused on the prescaler design for its highest operating

frequency. However, in the modern communication system, there is an increasing

demand for multi-standards applications.


A new wide-band high resolution programmable frequency divider is proposed. The wide

band and high resolution are obtained by using the all-stage programmable topology in

both counters.

1.3.1.5 4.2 mW frequency synthesizer for 2.4 GHz ZigBee application with fast settling

time performance- S. Shin et al.,- Jun. 2006 .

DESCRIPTION: A new frequency synthesizer with low-power and short settling time is

introduced. With two-point channel controls for an integer-N PLL, we have achieved a near zero

settling time for any frequency change in 2.4GHz ZigBee band. By utilizing a vertical-NPN

parasitic transistor for the VCO biasing, the close-in phase noise has been improved by 5dB from

the case of MOS biasing. A modified-TSPC topology is proposed for low-voltage frequency

divider circuits. Using a 1.2 V supply voltage in 0.18 μm CMOS, the power consumption is

only 4.2 mW. A new frequency synthesizer architecture with very low power and short frequency

settling time was introduced. A two-point channel control scheme was used for our proposed

frequency synthesizer in which a DAC with tunable gain and a linearized VCO are used to

effectively compensate the gain mismatch between the two control paths. Despite the use of an

integer-N architecture with narrow 20kHz bandwidth, we have achieved near zero frequency

settling time within the accuracy of the measurement equipment for the 75MHz frequency

jump from 2.4 GHz. The battery life for mobile applications is inversely proportional to the

energy consumption of mobile devices. Thus it is important to minimize the energy

consumption by minimizing both the active duty-cycle and the active power consumption of a

wireless terminal concurrently.


The frequency settling time of a PLL decreases as the loop-bandwidth increases.

However, this requires a high-frequency fractional controller and thus increases the hardware

complexity and active power consumption.


In this paper, a new frequency synthesizer with very short frequency settling time and

low active power consumption is introduced.

For short frequency settling time, a two-point channel control scheme composed of a

direct-VCO control (compensation-path) and a divider control (main-path) is used.

1.4 PROPOSED SYSTEM

The proposed wideband multimodulus prescaler can divide the input frequency by 32, 33,

47, and 48. It is similar to the 32/33 prescaler, but with an additional inverter and a multiplexer.

The proposed prescaler performs the additional divisions (divide-by-47 and divide-by-48) without

any extra flip-flop, thus saving a considerable amount of power and also reducing the complexity

of the multiband divider, which will be discussed in Section V. The multimodulus prescaler consists

of the wideband 2/3 (N1/(N1+1)) prescaler, four asynchronous TSPC divide-by-2 circuits

(AD=16), and combinational logic circuits to achieve multiple division ratios. Besides the usual

MOD signal for controlling the N/(N+1) divisions, the additional control signal sel is used to switch

the prescaler between 32/33 and 47/48 modes.
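The four division ratios can be derived from the 2/3 prescaler and the factor AD = 16 with simple arithmetic. The Python sketch below shows one consistent decomposition: in each output cycle the 2/3 prescaler completes 16 cycles, and at most one of them uses the other modulus. The function name and the polarity of the mod and sel arguments are assumptions made for illustration, not the control logic of the actual circuit.

```python
AD = 16   # four asynchronous divide-by-2 stages: 2 ** 4
N1 = 2    # lower modulus of the wideband 2/3 prescaler


def multimodulus_ratio(sel, mod):
    """Overall division ratio of the 32/33/47/48 prescaler (arithmetic model).

    sel: 0 selects the 32/33 mode, 1 selects the 47/48 mode
    mod: assumed polarity; 1 keeps all 16 cycles at the base modulus,
         0 lets exactly one of the 16 cycles use the other modulus
    """
    base = N1 + 1 if sel else N1       # modulus used for most cycles
    other = N1 if sel else N1 + 1      # modulus used for the exceptional cycle
    exceptional = 0 if mod else 1      # number of cycles at the other modulus
    return (AD - exceptional) * base + exceptional * other


assert multimodulus_ratio(sel=0, mod=1) == 32   # 16 x 2
assert multimodulus_ratio(sel=0, mod=0) == 33   # 15 x 2 + 3
assert multimodulus_ratio(sel=1, mod=1) == 48   # 16 x 3
assert multimodulus_ratio(sel=1, mod=0) == 47   # 15 x 3 + 2
```

The decomposition shows why no extra flip-flop is needed: the 47/48 mode reuses the same 2/3 unit and divide-by-2 chain, with only the roles of the two moduli swapped.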

ADVANTAGES OF PROPOSED SYSTEM

1. Best performance on power consumption

2. Efficient architecture in silicon verification.

CHAPTER 2

A LOW-POWER SINGLE-PHASE CLOCK MULTIBAND

FLEXIBLE DIVIDER

2.1 GENERAL

Wireless LAN (WLAN) systems in the multigigahertz bands, such as HiperLAN II and IEEE

802.11a/b/g, are recognized as leading standards for high-rate data transmissions, and standards

like IEEE 802.15.4 are recognized for low-rate data transmissions. The demand for lower-cost,

lower-power, multiband RF circuits has increased in conjunction with the need for a higher level of

integration. The frequency synthesizer, usually implemented by a phase-locked loop (PLL), is

one of the power-hungry blocks in the RF front-end and the first-stage frequency divider

consumes a large portion of power in a frequency synthesizer. The battery life for mobile

applications is inversely proportional to the energy consumption of mobile devices. Thus it is

important to minimize the energy consumption by minimizing both the active duty-cycle and

the active power consumption of a wireless terminal concurrently. The active duty-cycle of a

ZigBee wireless node strongly depends on the frequency settling time of a PLL, since the

settling time is a dominant portion of the total active period. Generally speaking, the frequency

settling time of a PLL decreases as the loop-bandwidth increases. Since a fractional-N PLL with

higher reference frequency can achieve a wider loop bandwidth, it has been favored for small

active duty-cycle. However, it requires a high-frequency fractional controller such as a ΣΔ

modulator, and thus increases the hardware complexity and active power consumption. In this

paper, a new frequency synthesizer with very short frequency settling time and low active power

consumption is introduced. For short frequency settling time, a two-point channel control scheme

composed of a direct-VCO control (compensation-path) and a divider control (main-path) is

used. A tunable-gain DAC and linearized varactors are used in the compensation-path to

effectively compensate the gain mismatch between the two control paths. For low active power

consumption, a 1.2 V power supply voltage is used, instead of the typical 1.8 V for 0.18 μm

CMOS technology. This is made possible by adopting a modified-TSPC circuit with 2-transistor

stacks for the high frequency divider circuits and other low-voltage circuit techniques. We also

utilized a parasitic Vertical-NPN (VNPN) transistor, which is available in a conventional triple-

well CMOS process, as a VCO current source transistor to improve the close-in phase noise

performance. As the feature size of MOSFETs continues to shrink, a proportional

downscaling in the supply voltage is mandatory to maintain gate-oxide reliability. However, in

consideration of the subthreshold leakage and the noise margin required by the digital integrated

circuits, the scaling rate of the threshold voltage is relatively slow compared with that of the

supply voltage. Consequently, the overdrive voltage of the transistors progressively decreases as

the technology advances. It has become an inevitable trend to operate the MOS devices in

moderate or weak inversion for certain mixed-signal and RF integrated circuits, motivating the

development of low-voltage design techniques exclusively for deep-submicrometer CMOS

technologies. In an RF receiver frontend, the low-noise amplifier (LNA) and the down-

conversion mixer are considered the most important building blocks. Typically, these circuits

suffer from significant degradation in the RF properties, especially for gain, noise figure, and

linearity, as the transistors operate in weak inversion. To overcome the limitations on the supply

voltage and the transistor overdrive, a complementary current-reused topology has been

proposed for the RF frontend circuits. Using a standard 0.18-μm CMOS process, an ultra-low-

voltage LNA and mixer suitable for operations with microwatt power consumption are realized

at the 5-GHz frequency band. The behavior of the MOSFETs biased at a reduced overdrive

voltage and its impact on the circuit performance of RF frontends are also investigated. Though

the fabricated circuits are not targeted at a specific wireless standard, the developed techniques

and design guidelines can be easily applied for various short-range wireless applications such as

ZigBee, Bluetooth, wideband personal area network (WPAN), and wireless sensor networks.

The design choices discussed here were adopted for a remote sensing microsystem operating in the 433 MHz

European ISM (Industrial, Scientific, Medical) band. However, the same tradeoffs can be

encountered in the design of most short-range, battery-powered transceivers operating in the

UHF range. Therefore most proposed choices should be applicable to all these systems.

Potential applications are broadly ranged, from computer peripherals and home automation to

surveillance systems. Actually, these high-volume consumer products all share the need for a

small sized, low cost, ultralow power transceiver. The first part of the paper will discuss the

specifications of such a system, and explain shortly the architectural choices, together with the

related tradeoffs. In the second part, the practical realization of some key blocks will be briefly

described. The discussion will be accompanied by simulated and measured results, as functional

versions of most blocks have already been realized and tested in a standard 0.5 μm CMOS digital

process. Finally, in the third part, a summary of the expected performances of the whole system

will be presented. Although some specifications may vary from application to application, some

generic objectives can still be stated. The total volume of a network node, including battery and

antenna, has to be on the order of 10 cm³ (1 in³). This is mostly a limitation for the battery

capacity and a challenge for the antenna. The tradeoff between antenna efficiency and power

consumption leads to the choice of a carrier frequency in the UHF range. ISM bands are usually

preferred because they are not bound to a particular standard, giving more freedom for

implementing power saving strategies. The frequencies of interest are therefore 433MHz

(Europe) and 917MHz (USA). In order to reduce power consumption, avoid DC-DC converters

and be ready for next-generation deep-submicron processes with limited supply voltage, single

battery operation is desirable. Therefore, ultimately, the circuits should be able to operate with a

supply as low as 1 V, corresponding to the battery end-of-life voltage. As each node should

exhibit some “intelligence” and be able to interact with its environment, cost and size

requirements favor a “system-on-a-chip” approach, leading to choose a standard CMOS process

(on chip microcontroller and interfaces). In order to ease process migration and keep costs down,

a fully digital technology is preferred. From our experience, a 0.5 μm process is sufficient for

433 MHz operation, whereas a 0.35 μm process would be more appropriate in the 917 MHz band.

Cost reduction and reliability also motivates the minimization of the number of external

components. The data rate can be relatively low, in the 1 to 100kbit/s range, half duplex, because

each node should mostly receive data request and transmit the measurement of some slowly

varying physical quantities. For some applications, such as computer interfaces, the availability of

several channels (2-4) is desirable. A raw BER (Bit Error Rate) of (before error correction) is

usually acceptable, corresponding to a minimum SNR (Signal-to-Noise Ratio) at the

demodulator of 12dB. The maximum radiated power is enforced in the -2 to 10dBm range by

the ISM standards and by the battery internal resistance. As wave propagation in buildings can

vary widely according to configuration, the maximum distance between two network nodes can

be specified in free range (i.e. without any obstacles) and usually lies between 10 and 100m. For

most sensing applications, nodes spend far more time receiving (monitoring a potential

incoming call) than emitting (answering a request). Therefore, even though the power level

required in transmit mode is an order of magnitude higher, the receiver power consumption is

most critical. In our case, the target was not higher than 1.2 mW (1 mA, 1.2 V supply) in order to

obtain a sufficient battery lifetime. This high power consumption is mainly due to the first stages

of the frequency divider that often dissipates half of the total power. Due to the high input

frequency, the first stage of the divider cannot be implemented in conventional static CMOS

logic. Instead, it is commonly realized in source-coupled logic (SCL), which allows higher

operating frequency, but burns more power. A more efficient alternative to the first SCL divider

is the injection locking divider employed. However, this resonant /5 divider requires a tank

whose area is larger than the oscillator’s tank, and it suffers from pulling phenomena. Dynamic

latches are known to be faster and more compact than static ones. The true-single-phase-clock

(TSPC) design allows the dynamic latch to be driven with a single clock phase, thus avoiding the

skew problem. However, the adoption of this class of latches in frequency dividers has been so

far limited to PLLs up to 900 MHz. In this brief, a TSPC divider is employed within a PLL

integrated in a 0.25-μm CMOS technology. The use of dynamic logic is not only possible up to 6

GHz, but also extremely effective in reducing the synthesizer power dissipation. This divider

implementation causes no appreciable degradation of the PLL spur performance, nor additional

phase noise. We report results on a 0.5 V RF front end operating at 900 MHz, for zero-IF or

near-zero-IF receivers. This work is a continuation of earlier work at IF frequencies. A 0.5 V

front end has been presented before using fully-depleted CMOS/SIMOX SOI technology; our

circuits use instead a standard mixed-mode CMOS 0.18 μm technology. LNA design in standard

CMOS, operating at 0.5 V, has been reported recently. The chief limitation in 0.5 V design is the

reduced voltage headroom due to the substantial gate voltage overdrive needed to maintain

device speed. The threshold voltage, VT, is anticipated to stay at about 200 mV to reduce OFF

currents in nanoscale CMOS technologies. We used the low-VT transistors in the 0.18 μm

process with a VT close to that value. The growing number of users and demand for high speed

wireless communications has motivated designers to move from the 1-2GHz range towards

higher frequency bands. Recently, new standards in the 5GHz range for wireless local area

network (WLAN) applications have been defined, such as the IEEE 802.11a standard for the

FCC unlicensed national information infrastructure (U-NII) band in the US, and the high

performance radio LAN (HIPERLAN) standard in Europe. Traditionally, radio frequency

integrated circuits (RFICs) were implemented in GaAs or SiGe bipolar technologies, because of

their relatively high unity gain cutoff frequencies fT (i.e. >65GHz) and their superior noise

performance. However, as the minimum feature size of CMOS devices decreases, the fT of the

transistors continues to improve to the point where it is becoming comparable to those of GaAs

and SiGe processes. Deep submicron CMOS devices with fT values exceeding 100 GHz and minimum

noise figures (NF) less than 0.5 dB at 2 GHz have been demonstrated. Because of these promising

RF performances, together with the advantages of low cost and ease of integration with baseband

digital circuitry, CMOS is becoming a viable alternative for RF applications, with continuous

efforts towards implementing higher frequency circuitry operating from lower supply

voltages. IEEE 802.11a and HiperLAN are standards of wireless data networks with a frequency

band from 5 to 6 GHz, which covers fifteen channels with a channel spacing of 20 MHz.

The frequency synthesizers are widely used to generate local oscillation (LO) signals in modern

communication systems. In order to cover the required carriers and operate from an input frequency

of 5 GHz, the division ratio of the divider has to be programmable from 257 to 294. The operating

frequency of a frequency synthesizer is limited by the frequency divider as well as the voltage

controlled oscillator (VCO). For the WLAN standard, the most common high-speed frequency dividers are based

on a pulse-swallow architecture. It usually comprises a dual-modulus prescaler (DMP), a 6-bit

pulse (P) counter, and a 5-bit swallow (S) counter. The two counters generate the given

division ratio of the divider and select the divide-by value of the DMP. To keep up with the high input frequency

and reduce the power consumption, the design of the DMP is a trade-off between speed and divide-by

value. On the other hand, the architecture requires two additional counters to generate the

desired division ratio, which occupy many gates and a large chip area and consume extra power.

This paper proposes a new frequency divider that keeps the same function as a conventional one

without employing a swallow counter, which would consume extra power and unnecessary chip area.

A phase-locked loop (PLL) is an electronic feedback system that generates a signal, the phase of

which is locked to the phase of an input reference signal. PLLs are widely used in radio,

telecommunication, computer and other electronic systems. Frequency synthesizers are one of the

most critical parts in a wireless transceiver. Most frequency synthesizers are of the phase-locked

loop (PLL) type. In the frequency synthesizer, only the VCO and prescaler operate at the

highest frequency. It is relatively easy to design a multi-gigahertz voltage controlled oscillator

(VCO) in the current advanced CMOS process, so the dual modulus prescaler remains the most

critical component with the trend of applying CMOS to higher frequency systems. Operating

at the highest frequency, the dual modulus prescaler divides the output frequency of the VCO by

one of the two fixed division ratios to a lower frequency. This division ratio is controlled by a

modulus control signal generated by the programmable counter. Besides the high operating

speed, the dual modulus prescaler also has to be efficient in terms of current consumption, and be

able to work at low supply voltage. In this paper, we present a 16/17 dual-modulus prescaler

operating up to 3.1GHz realized in a 0.35μm CMOS technology with 1.8V supply voltage. By

merging NAND logic with the D flip-flop, we eliminated their propagation delay. Also, merging the

AND logic that counts the asynchronous output into the flip-flop with NAND logic allows

accurate pulse swallowing by eliminating the delay that was found in the AND logic.

2.2 METHODOLOGIES

The key parameters of high-speed digital circuits are the propagation delay and power

consumption. The maximum operating frequency of a digital circuit is calculated as

$$f_{max} = \frac{1}{t_{pLH} + t_{pHL}} \qquad (1)$$

where tpLH and tpHL are the propagation delays of the low-to-high and high-to-low

transitions of the gates, respectively. The total power consumption of the CMOS digital circuits

is determined by the switching and short circuit power. The switching power is linearly

proportional to the operating frequency and is given by the sum of switching power at each

output node as in

$$P_{switching} = \sum_{i=1}^{n} f_{clk} C_{Li} V_{dd}^{2} \qquad (2)$$

where n is the number of switching nodes, fclk is the clock frequency, CLi is the load

capacitance at the output node of the ith stage, and Vdd is the supply voltage. Normally, the

short-circuit power occurs in dynamic circuits when there exist direct paths from the supply to

ground, and is given by

$$P_{sc} = I_{sc} V_{dd} \qquad (3)$$

where Isc is the short-circuit current. The analysis shows that the short-circuit power is much

higher in E-TSPC logic circuits than in TSPC logic circuits. However, TSPC logic circuits exhibit

higher switching power compared to that of E-TSPC logic circuits due to high load capacitance.

For the E-TSPC logic circuit, the short-circuit power is the major problem. The E-TSPC circuit

has the merit of higher operating frequency than that of the TSPC circuit due to the reduction in

load capacitance, but it consumes significantly more power than the TSPC circuit does for a

given transistor size. The following analysis is based on the latest design using the popular and

low-cost 0.18-μm CMOS process.
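
As a rough numerical illustration of equations (1) to (3), the following Python sketch evaluates the maximum operating frequency and the two power components. The function names and all numeric values (delays, load capacitances, short-circuit current) are hypothetical placeholders chosen only for illustration; they are not measurements from this design.

```python
def f_max(t_plh, t_phl):
    """Equation (1): maximum operating frequency from the two propagation delays (s)."""
    return 1.0 / (t_plh + t_phl)


def switching_power(f_clk, load_caps, v_dd):
    """Equation (2): sum of f_clk * C_Li * Vdd^2 over all switching nodes."""
    return sum(f_clk * c_l * v_dd ** 2 for c_l in load_caps)


def short_circuit_power(i_sc, v_dd):
    """Equation (3): short-circuit power from the short-circuit current."""
    return i_sc * v_dd


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(f_max(80e-12, 80e-12))                      # Hz
    print(switching_power(2.4e9, [10e-15] * 3, 1.8))  # W
    print(short_circuit_power(50e-6, 1.8))            # W
```

Because the switching power scales with the clock frequency and with every load capacitance, reducing the number of nodes that switch at the input frequency lowers the total power directly, which is the motivation for the prescaler described next.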

MAIN MODULES:

WIDEBAND 2/3 PRESCALER

MULTIMODULUS 32/33/47/48 PRESCALER

SWALLOW P-COUNTER

SWALLOW S-COUNTER

WIDEBAND 4/5 PRESCALER

MULTIMODULUS 32/33/47/48 PRESCALER

2.4 MODULE DESCRIPTION:

2.4.1 WIDEBAND 2/3 PRESCALER

The E-TSPC 2/3 prescaler reported in the literature consumes large short-circuit power and has a higher

frequency of operation than that of the TSPC 2/3 prescaler. The wideband single-phase clock 2/3 prescaler

used in this design was reported previously and consists of two D flip-flops and two NOR gates

embedded in the flip-flops, as in Fig. 2. The first NOR gate is embedded in the last stage of

DFF1, and the second NOR gate is embedded in the first stage of DFF2. Here, the transistors

M2, M25, M4, and M8 in DFF1 help to eliminate the short-circuit power during the divide-by-2

operation. The switching of division ratios between 2 and 3 is controlled by logic signal MC.

WIDEBAND SINGLE-PHASE CLOCK 2/3 PRESCALER

When MC switches from “0” to “1,” transistors M2, M4 and M8 in DFF1 turn off and nodes S1,

S2 and S3 switch to logic “0.” Since node S3 is “0” and the other input to the NOR gate

embedded in DFF2 is Qb, the wideband prescaler operates at the divide-by-2 mode. During this

mode, nodes S1, S2 and S3 switch to logic “0” and remain at “0” for the entire divide-by-2

operation, thus removing the switching power contribution of DFF1. Since one of the transistors

is always OFF in each stage of DFF1, the short-circuit power in DFF1 and the first stage of

DFF2 is negligible. The total power consumption of the prescaler in the divide-by-2 mode is

equal to the switching power in DFF2 and the short-circuit power in the second and third stages of

DFF2:

$$P_{divide\text{-}by\text{-}2} = \sum_{i \in \mathrm{DFF2}} f_{clk} C_{Li} V_{dd}^{2} + P_{sc1} + P_{sc2}$$

where CLi is the load capacitance at the output node of the ith stage of DFF2, and Psc1 and

Psc2 are the short-circuit power in the second and third stages of DFF2. When logic signal MC

switches from “1” to “0,” the logic value at the input of DFF1 is transferred to the input of DFF2

as one of the inputs of the NOR gate embedded in DFF1 is “0” and the wideband prescaler

operates at the divide-by-3 mode. During the divide-by-2 operation, only DFF2 actively

participates in the operation and contributes to the total power consumption since all the

switching activities are blocked in DFF1. Thus, the wideband 2/3 prescaler has the benefit of saving

more than 50% of power during the divide-by-2 operation.
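
The mode switching described above can be illustrated with a small behavioral model in Python. This is only a sketch: the gate connections used here (DFF1 input = NOR(MC, inverted Q2), DFF2 input = NOR(Q1, Q2)) are one logically consistent assignment assumed for illustration, the node polarities of the actual TSPC flip-flops in Fig. 2 may differ, and the names `simulate_2_3`, `period` and `toggles` are hypothetical.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return int(not (a or b))


def simulate_2_3(mc, n_clocks=12):
    """Clock two D flip-flops with embedded NOR gates; return the Q1 and Q2 traces."""
    q1, q2 = 0, 0
    q1_trace, q2_trace = [], []
    for _ in range(n_clocks):
        # Both flip-flops sample their NOR inputs on the same clock edge
        # (the right-hand side is evaluated with the old q1, q2).
        q1, q2 = nor(mc, 1 - q2), nor(q1, q2)
        q1_trace.append(q1)
        q2_trace.append(q2)
    return q1_trace, q2_trace


def period(trace):
    """Distance in input clocks between the first two rising edges of a trace."""
    rising = [i for i in range(1, len(trace)) if trace[i - 1] == 0 and trace[i] == 1]
    return rising[1] - rising[0]


def toggles(trace):
    """Number of transitions in a trace."""
    return sum(1 for a, b in zip(trace, trace[1:]) if a != b)


if __name__ == "__main__":
    for mc in (1, 0):
        q1, q2 = simulate_2_3(mc)
        print("MC =", mc, "period =", period(q2), "DFF1 toggles =", toggles(q1))
```

In this model the output of DFF1 never toggles while MC = 1, which mirrors the statement that DFF1 contributes no switching power in the divide-by-2 mode.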

2.4.2 MULTIMODULUS 32/33/47/48 PRESCALER

The proposed wideband multimodulus prescaler, which can divide the input frequency by 32, 33,

47, and 48, is shown in Fig. 4. It is similar to the 32/33 prescaler used in earlier work, but with an additional

inverter and a multiplexer. The proposed prescaler performs additional divisions (divide-by-47

and divide-by-48) without any extra flip-flop, thus saving a considerable amount of power and

also reducing the complexity of the multiband divider, which will be discussed in Section V. The

multimodulus prescaler consists of the wideband 2/3 (N1/(N1+1)) prescaler [10], four

asynchronous TSPC divide-by-2 circuits (AD = 16), and combinational logic circuits to achieve

multiple division ratios. Besides the usual MOD signal for controlling the N/(N+1) divisions, the

additional control signal Sel is used to switch the prescaler between the 32/33 and 47/48 modes.

PROPOSED MULTIMODULUS 32/33/47/48 PRESCALER

Case 1: Sel = 0

When Sel = 0, the output from the NAND2 gate is directly transferred to the input of the 2/3

prescaler and the multimodulus prescaler operates as the normal 32/33 prescaler, where the

division ratio is controlled by the logic signal MOD. If MC = 1, the 2/3 prescaler operates in the

divide-by-2 mode, and when MC = 0, the 2/3 prescaler operates in the divide-by-3 mode.

If MOD = 1, the NAND2 gate output switches to logic “1” (MC = 1) and the wideband prescaler

operates in the divide-by-2 mode for the entire operation. The division ratio N performed by the

multimodulus prescaler is

$$N = (AD \times N_{1}) + (0 \times (N_{1} + 1)) = 32 \qquad (4)$$

where N1 = 2 and AD = 16 are fixed for the entire design. If MOD = 0, for 30 input clock cycles MC

remains at logic “1”, where the wideband prescaler operates in the divide-by-2 mode and, for three input

clock cycles, MC remains at logic “0”, where the wideband prescaler operates in the divide-by-3

mode. The division ratio N+1 performed by the multimodulus prescaler is

$$N + 1 = ((AD - 1) \times N_{1}) + (1 \times (N_{1} + 1)) = 33 \qquad (5)$$

Case 2: Sel = 1

When Sel = 1, the inverted output of the NAND2 gate is directly transferred to the input of the 2/3

prescaler and the multimodulus prescaler operates as a 47/48 prescaler, where the division ratio

is controlled by the logic signal MOD. If MC = 1, the 2/3 prescaler operates in the divide-by-3 mode,

and when MC = 0, the 2/3 prescaler operates in the divide-by-2 mode, which is the opposite of the

operation performed when Sel = 0.

If MOD = 1, the division ratio N+1 performed by the multimodulus prescaler is the same as (4),

except that the wideband prescaler operates in the divide-by-3 mode for the entire operation, and is

given by

$$N + 1 = (AD \times (N_{1} + 1)) + (0 \times N_{1}) = 48 \qquad (6)$$

If MOD = 0, the division ratio N performed by the multimodulus prescaler is

$$N = ((AD - 1) \times (N_{1} + 1)) + (1 \times N_{1}) = 47 \qquad (7)$$
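
The four division ratios can be checked with a short Python sketch that counts input clock cycles over one output period instead of evaluating the closed-form expressions. The function name `prescaler_ratio` is hypothetical, and the choice of which of the AD cycles the NAND2 output goes low in (here, the last one) is an assumption made only for illustration; the total does not depend on it.

```python
AD = 16  # division ratio of the four asynchronous divide-by-2 stages
N1 = 2   # lower division ratio of the 2/3 prescaler


def prescaler_ratio(sel, mod):
    """Count input clock cycles in one output period of the multimodulus prescaler."""
    input_clocks = 0
    for cycle in range(AD):  # one output period spans AD cycles of the 2/3 prescaler
        # NAND2 output: stays at 1 when MOD = 1; when MOD = 0 it is assumed to
        # go low during exactly one of the AD cycles (here, the last one).
        nand_out = 1 if (mod == 1 or cycle != AD - 1) else 0
        # Sel = 0 passes the NAND2 output directly, Sel = 1 passes its inverse.
        control = nand_out if sel == 0 else 1 - nand_out
        # The 2/3 prescaler divides by 2 when its control input is 1, else by 3.
        input_clocks += N1 if control == 1 else N1 + 1
    return input_clocks


if __name__ == "__main__":
    for sel in (0, 1):
        for mod in (1, 0):
            print("Sel =", sel, "MOD =", mod, "ratio =", prescaler_ratio(sel, mod))
```

The only difference between the 32/33 and 47/48 modes in this model is the inversion selected by Sel, which is why no extra flip-flop is needed for the additional ratios.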

2.4.3 PROGRAM COUNTER

The program counter is responsible for counting P pulses of SlowCLK before outputting a pulse

to the phase/frequency detector and resetting itself and the swallow counter. The implementation

used in this project, which uses a 7-bit ripple counter, a 7-bit comparator, and a zero-detector, is shown

in Figure 12. The ripple counter is clocked by SlowCLK, and increments its count by one each

clock cycle. At each stage, the 7-bit comparator compares each count bit to the corresponding bit

in the control signal, and outputs a 0 for each equal bit. When the zero-detector detects

equivalence in all of the 7 bits, indicating that the desired count has been reached, Fout is driven

high. On the next clock cycle, the program counter is reset to zero and the count is restarted. In

addition, the output pulse on Fout is used to reset the count of the swallow counter, indicating

the end of one complete cycle of the frequency divider.

BLOCK DIAGRAM OF A 7-BIT PROGRAM COUNTER

The ripple counter is implemented using 7 cascaded D-type flip-flops, each arranged in a toggle

configuration. The output of each flip-flop is used to clock the next flip-flop. Since the output of

each flip-flop inverts on every clock cycle, each flip-flop essentially divides its clock by two,

causing the next stage of the ripple counter to be clocked at half the rate of the previous flip flop.

Each flip-flop was designed to respond to the falling edge of its clock, when the output of the

previous stage changes from a 1 to a 0. In this way, an incrementing binary count is achieved

with the outputs of each flip-flop forming the bits of the count. Since the program counter

contains 7 bits, any count between 0 and 127 can be set by the control signal. It is important to

realize, however, that in order to achieve a division ratio as specified in the equation DIV=NP+S,

the control signal must be set to P-1, since the zero-state is included in the count.
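
The reason for programming P-1 rather than P can be seen in a small bit-level model of the program counter. The Python sketch below is a behavioral approximation under stated assumptions: the function names are hypothetical, the comparator is modeled as a per-bit XOR (0 for equal bits) followed by a zero-detector, and the reset is assumed to take effect on the clock cycle after Fout goes high, as described above.

```python
WIDTH = 7  # number of flip-flops in the ripple counter


def ripple_increment(bits):
    """Advance a ripple counter by one clock; bits[0] is the LSB."""
    out = list(bits)
    for i in range(len(out)):
        out[i] ^= 1      # this stage toggles
        if out[i] == 1:  # 0 -> 1 is not a falling edge, so the ripple stops here
            break
    return out


def to_bits(value, width=WIDTH):
    """Little-endian bit list of an integer."""
    return [(value >> i) & 1 for i in range(width)]


def clocks_per_output_pulse(control):
    """Number of SlowCLK cycles between two Fout pulses for a given control word."""
    control_bits = to_bits(control)
    count = [0] * WIDTH  # state right after reset
    clocks = 0
    while True:
        clocks += 1      # one SlowCLK cycle is spent in the current state
        # Comparator: 0 for each equal bit; zero-detector: all outputs are 0.
        if not any(c ^ k for c, k in zip(count, control_bits)):
            return clocks  # Fout goes high; the next clock resets the counter
        count = ripple_increment(count)


if __name__ == "__main__":
    p = 75  # hypothetical target count
    print(clocks_per_output_pulse(p - 1))
```

Since the zero state occupies one clock cycle of its own, a control word of P-1 yields exactly P cycles of SlowCLK per output pulse in this model.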

2.4.3.1 P-COUNTER IMPLEMENTATION

It is possible to see the three major components of the program counter implemented using

MCML logic gates. At the input of the counter, an array of 7 flip-flops is used as the ripple

counter. The outputs of the ripple counter, taken from the outputs of each of the flip-flops, are

fed into an array of 7 XNOR gates. The XNOR gates compare each bit with the corresponding

bit in the control signal, outputting a logical ‘1’ when the bits are equal. Although this logic is

inverted compared to the description of the comparator in the previous section, the zero-detector

is implemented as a one-detector using a tree of cascaded AND gates. In this way, the overall

logic of the circuit is unchanged, and the output pulse can be generated without any additional

logic.

Another difference seen in Figure 13 is a separate output, SwallowRST, and some simple

circuitry used to generate it. SwallowRST is used internally to reset the flip-flops of the program

counter, and externally to reset the flip-flops of the swallow counter. Since the fan-out of the

reset signal is high (7 flip-flops in the PC, and 6 in the SC), the reset signal is broken into two

paths and driven using separate MCML buffers. In early simulations, these buffers were absent

and the reset signal could not provide enough current to drive the input capacitance associated

with the flip-flops. SwallowRST was generated using an approach that guarantees predictable

timing of the reset signal. Fout is “tapped” and fed to the input of a flip-flop clocked by Fin. On

the clock cycle immediately following Fout going high, the pulse is sampled by the flip-flop,

generating SwallowRST and resetting both the program counter and the swallow counter. To

ensure that the reset signal is removed before the next clock cycle, the reset signal is fed back to

its generating flip-flop through a delay chain comprised of three buffers.

2.4.4 SWALLOW COUNTER

The swallow counter, as indicated in Figure 8, is used to count S pulses of SlowCLK before

asserting the modulus control signal and changing the modulus of the DMP to N. A block

diagram of the swallow counter is provided in Figure 15. By looking at Figure 15, the similarities

between the swallow counter and the program counter are apparent. Once again, the count (6 bits

in this case) is maintained using a ripple counter comprised of cascaded flip-flops clocked with

SlowCLK. In addition, a comparator compares each count bit with its corresponding bit in the

control signal, and a zero-detector asserts modulus control when all bits are equal. However, the

swallow counter does not reset when the count is reached, but masks the input clock using an

AND gate connected to the inverse of modulus control. As a result, the ripple counter stops

counting when the count is reached, and the state of the circuit is maintained until a reset signal

(SwallowRST) is received from the program counter. Since the swallow counter contains 6 bits,

it is capable of any count from 0 to 63. Once again, the control signal must be set to S-1, since

the zero-state is included in the count.

BLOCK DIAGRAM OF A 6-BIT SWALLOW COUNTER

2.4.4.1 S-COUNTER IMPLEMENTATION

The 6-bit ripple counter is implemented as an array of flip-flops, and clocked with the gated clock

provided by the AND of SlowCLK and the inverse of modulus control. In addition, the comparator is

implemented as an array of MCML XNOR gates, while the zero-detector is actually

implemented as a one-detector using a tree of cascaded AND gates. Unlike the program counter,

however, no additional circuitry is necessary to generate the reset, as the reset is received from

the program counter by means of the SwallowRST signal.
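
The interaction of the two counters with the dual-modulus prescaler can be summarized in a short Python sketch that counts input clock cycles over one complete divider cycle. It is a simplified model under stated assumptions: the function name `pulse_swallow_division` is hypothetical, the prescaler is taken to divide by N+1 until the swallow counter has counted S pulses and by N afterwards, and the P-1 and S-1 encoding of the control words is abstracted away.

```python
def pulse_swallow_division(n, p, s):
    """Input clock cycles per output pulse of an N/(N+1) pulse-swallow divider."""
    if not 0 <= s <= p:
        raise ValueError("S must satisfy 0 <= S <= P")
    input_clocks = 0
    swallowed = 0
    for _ in range(p):  # the program counter counts P pulses of SlowCLK
        # Modulus control is asserted once the swallow counter has counted S pulses.
        modulus_control = swallowed >= s
        # Each SlowCLK pulse costs N+1 input clocks before assertion, N after it.
        input_clocks += n if modulus_control else n + 1
        if not modulus_control:
            swallowed += 1  # gated clock: the swallow counter stops once asserted
    # The program counter now emits Fout and resets both counters.
    return input_clocks


if __name__ == "__main__":
    print(pulse_swallow_division(32, 10, 3))
```

The loop reproduces the relation DIV = NP + S: S of the P prescaler cycles contribute one extra input clock each.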

2.4.5 4/5 PRESCALER

The 4/5 prescaler reported in the literature consumes large short-circuit power and has a higher

frequency of operation than that of the conventional 4/5 prescaler. The wideband single-phase clock 4/5 prescaler

used in this design consists of three D flip-flops with two embedded NAND gates.

The 4/5-based multimodulus prescaler consists of four D flip-flops, two NAND gates, two OR gates,

and one NOT gate together with the main 4/5 prescaler circuit. This multimodulus prescaler operates as a

64/65/78/79 prescaler.

MULTIPRESCALER_4BY5:

2.4.6 PROPOSED MULTIBAND FLEXIBLE DIVIDER:

The proposed multiband flexible divider combines the 2/3 prescaler and the 4/5 prescaler in one

multimodulus prescaler. A multiplexer selects either the 2/3 prescaler or the 4/5 prescaler, so the

divider operates with either the 32/33/47/48 or the 64/65/78/79 division ratios.

CHAPTER 3

HARDWARE REQUIREMENTS

3.1 GENERAL

Integrated circuit (IC) technology is the enabling technology for a whole host of

innovative devices and systems that have changed the way we live. Jack Kilby

received the 2000 Nobel Prize in Physics for his part in the invention of the integrated circuit; without the

integrated circuit, neither transistors nor computers would be as important as they are today.

VLSI systems are much smaller and consume less power than the discrete components used to

build electronic systems before the 1960s. Integration allows us to build systems with many

more transistors, allowing much more computing power to be applied to solving a problem.

Integrated circuits are also much easier to design and manufacture and are more reliable than

discrete systems; that makes it possible to develop special-purpose systems that are more

efficient than general-purpose computers for the task at hand.

APPLICATIONS OF VLSI

Electronic systems now perform a wide variety of tasks in daily life. Electronic systems

in some cases have replaced mechanisms that operated mechanically, hydraulically, or by other

means; electronics are usually smaller, more flexible, and easier to service. In other cases

electronic systems have created totally new applications. Electronic systems perform a variety of

tasks, some of them visible, some more hidden:

• Personal entertainment systems such as portable MP3 players and DVD players perform

sophisticated algorithms with remarkably little energy.

• Electronic systems in cars operate stereo systems and displays; they also control fuel

injection systems, adjust suspensions to varying terrain, and perform the control functions

required for anti-lock braking (ABS) systems.

• Digital electronics compress and decompress video, even at high definition data rates, on-the-fly

in consumer electronics.

• Low-cost terminals for Web browsing still require sophisticated electronics, despite their

dedicated function.

• Personal computers and workstations provide word-processing, financial analysis, and

games. Computers include both central processing units (CPUs) and special-purpose

hardware for disk access, faster screen display, etc.

• Medical electronic systems measure bodily functions and perform complex processing

algorithms to warn about unusual conditions.

The availability of these complex systems, far from

overwhelming consumers, only creates demand for even more complex systems. The growing

sophistication of applications continually pushes the design and manufacturing of integrated

circuits and electronic systems to new levels of complexity. And perhaps the most amazing

characteristic of this collection of systems is its variety: as systems become more complex, we

build not a few general-purpose computers but an ever wider range of special-purpose systems.

Our ability to do so is a testament to our growing mastery of both integrated circuit

manufacturing and design, but the increasing demands of customers continue to test the limits of

design and manufacturing.

ADVANTAGES OF VLSI

While we will concentrate on integrated circuits in this book, the properties of integrated circuits

(what we can and cannot efficiently put in an integrated circuit) largely determine the

architecture of the entire system. Integrated circuits improve system characteristics in several

critical ways. ICs have three key advantages over digital circuits built from discrete components:

• Size. Integrated circuits are much smaller—both transistors and wires are shrunk to micrometer

sizes, compared to the millimeter or centimeter scales of discrete components. Small size leads to

advantages in speed and power consumption, since smaller components have smaller parasitic

resistances, capacitances, and inductances.

• Speed. Signals can be switched between logic 0 and logic 1 much quicker within a chip than

they can between chips. Communication within a chip can occur hundreds of times faster than

communication between chips on a printed circuit board. The high speed of circuits on-chip is

due to their small size—smaller components and wires have smaller parasitic capacitances to

slow down the signal.

• Power consumption. Logic operations within a chip also take much less power. Once again,

lower power consumption is largely due to the small size of circuits on the chip—smaller

parasitic capacitances and resistances require less power to drive them.

VLSI AND SYSTEMS

These advantages of integrated circuits translate into advantages at the system level:

• Smaller physical size. Smallness is often an advantage in itself—consider portable televisions

or handheld cellular telephones.

• Lower power consumption. Replacing a handful of standard parts with a single chip reduces

total power consumption. Reducing power consumption has a ripple effect on the rest of the

system: a smaller, cheaper power supply can be used; since less power consumption means less

heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic

shielding may be feasible, too.

• Reduced cost. Reducing the number of components, the power supply requirements, cabinet

costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that

the cost of a system built from custom ICs can be less, even though the individual ICs cost more

than the standard parts they replace. Understanding why integrated circuit technology has such

profound influence on the design of digital systems requires understanding both the technology

of IC manufacturing and the economics of ICs and digital systems.

INTEGRATED CIRCUIT MANUFACTURING

Integrated circuit technology is based on our ability to manufacture huge numbers of very small

devices—today, more transistors are manufactured in California each year than raindrops fall on

the state. In this section, we briefly survey VLSI manufacturing.

TECHNOLOGY

Most manufacturing processes are fairly tightly coupled to the item they are manufacturing. An

assembly line built to produce Buicks, for example, would have to undergo moderate

reorganization to build Chevys—tools like sheet metal molds would have to be replaced, and

even some machines would have to be modified. And either assembly line would be far removed

from what is required to produce electric drills.

MASK-DRIVEN MANUFACTURING

Integrated circuit manufacturing technology, on the other hand, is remarkably versatile. While

there are several manufacturing processes for different circuit types—CMOS, bipolar, etc.—a

manufacturing line can make any circuit of that type simply by changing a few basic tools called

masks. For example, a single CMOS manufacturing plant can make both microprocessors and

microwave oven controllers by changing the masks that form the patterns of wires and transistors

on the chips. Silicon wafers are the raw material of IC manufacturing. The fabrication process

forms patterns on the wafer that create wires and transistors. A series of identical chips are

patterned onto the wafer (with some space reserved for test circuit structures which allow

manufacturing to measure the results of the manufacturing process). The IC manufacturing

process is efficient because we can produce many identical chips by processing a single wafer.

By changing the masks that determine what patterns are laid down on the chip, we determine the

digital circuit that will be created. The IC fabrication line is a generic manufacturing line—we

can quickly retool the line to make large quantities of a new kind of chip, using the same

processing steps used for the line’s previous product.

CIRCUITS AND LAYOUTS

We could build a breadboard circuit out of standard parts. To build it on an IC fabrication line,

we must go one step further and design the layout, or patterns on the masks. The rectangular

shapes in the layout (shown here as a sketch called a stick diagram) form transistors and wires

which conform to the circuit in the schematic. Creating layouts is very time-consuming and very

important—the size of the layout determines the cost to manufacture the circuit, and the shapes

of elements in the layout determine the speed of the circuit as well. During manufacturing, a

photolithographic (photographic printing) process is used to transfer the layout patterns from the

masks to the wafer. The patterns left by the mask are used to selectively change the wafer:

impurities are added at selected locations in the wafer; insulating and conducting materials are

added on top of the wafer as well. These fabrication steps require high temperatures, small

amounts of highly toxic chemicals, and extremely clean environments. At the end of processing,

the wafer is divided into a number of chips.

MANUFACTURING DEFECTS

Because no manufacturing process is perfect, some of the chips on the wafer may not work.

Since at least one defect is almost sure to occur on each wafer, wafers are cut into smaller,

working chips; the largest chip that can be reasonably manufactured today is 1.5 to 2 cm on a

side, while wafer diameters are moving from 30 to 45 cm. Each chip is individually tested; the ones that

pass the test are saved after the wafer is diced into chips. The working chips are placed in the

packages familiar to digital designers. In some packages, tiny wires connect the chip to the

package’s pins while the package body protects the chip from handling and the elements; in

others, solder bumps directly connect the chip to the package. Integrated circuit manufacturing is

a powerful technology for two reasons: all circuits can be made out of a few types of transistors

and wires; and any combination of wires and transistors can be built on a single fabrication line

just by changing the masks that determine the pattern of components on the chip. Integrated

circuits run very fast because the circuits are very small. Just as important, we are not stuck

building a few standard chip types—we can build any function we want. The flexibility given by

IC manufacturing lets us build faster, more complex digital systems in ever greater variety.

ECONOMICS

Because integrated circuit manufacturing has so much leverage—a great number of parts can be

built with a few standard manufacturing procedures—a great deal of effort has gone into

improving IC manufacturing. However, as chips become more complex, the cost of designing a

chip goes up and becomes a major part of the overall cost of the chip.

Moore’s Law

In the 1960s Gordon Moore predicted that the number of transistors that could be manufactured

on a chip would grow exponentially. His prediction, now known as Moore’s Law, was

remarkably prescient. Moore’s ultimate prediction was that transistor count would double every

two years, an estimate that has held up remarkably well. Today, an industry group maintains the

International Technology Roadmap for Semiconductors (ITRS), that maps out strategies to

maintain the pace of Moore’s Law. (The ITRS roadmap can be found at http://www.itrs.net.)

Terminology

The most basic parameter associated with a manufacturing process is the minimum channel

length of a transistor. (In this book, for example, we will use a technology that can

manufacture 180 nm transistors.) A manufacturing technology at a particular channel length is

called a technology node. We often refer to a family of technologies at similar feature sizes:

micron, submicron, deep submicron, and now nanometer technologies. The term nanometer

technology is generally used for technologies below 100 nm.

COST OF MANUFACTURING

IC manufacturing plants are extremely expensive. A single plant costs as much as $4 billion.

Given that a new, state-of-the-art manufacturing process is developed every three years, that is a

sizeable investment. The investment makes sense because a single plant can manufacture so

many chips and can easily be switched to manufacture different types of chips. In the early years

of the integrated circuits business, companies focused on building large quantities of a few

standard parts. These parts are commodities—one 80 ns, 256Mb dynamic RAM is more or less

the same as any other, regardless of the manufacturer. Companies concentrated on commodity

parts in part because manufacturing processes were less well understood and manufacturing

variations are easier to keep track of when the same part is being fabricated day after day.

Standard parts also made sense because designing integrated circuits was hard—not only the

circuit, but the layout had to be designed, and there were few computer programs to help

automate the design process.

COST OF DESIGN

One of the less fortunate consequences of Moore’s Law is that the time and money required to

design a chip goes up steadily. The cost of designing a chip comes from several factors:

• Skilled designers are required to specify, architect, and implement the chip. A design team may

range from a half-dozen people for a very small chip to 500 people for a large, high-performance

microprocessor.

• These designers cannot work without access to a wide range of computer- aided design (CAD)

tools. These tools synthesize logic, create layouts, simulate, and verify designs. CAD tools are

generally licensed and you must pay a yearly fee to maintain the license. A license for a single

copy of one tool, such as logic synthesis, may cost as much as $50,000 US.

• The CAD tools require a large compute farm on which to run. During the most intensive part of

the design process, the design team will keep dozens of computers running continuously for

weeks or months.

A large ASIC, which contains millions of transistors but is not fabricated on the state-of-the-art

process, can easily cost $20 million US and as much as $100 million. Designing a large

microprocessor costs hundreds of millions of dollars.

DESIGN COSTS AND IP

We can spread these design costs over more chips if we can reuse all or part of the design in

other chips. The high cost of design is the primary motivation for the rise of IP-based design,

which creates modules that can be reused in many different designs.

TYPES OF CHIPS

The preponderance of standard parts pushed the problems of building customized systems back

to the board-level designers who used the standard parts. Since a function built from standard

parts usually requires more components than if the function were built with custom designed ICs,

designers tended to build smaller, simpler systems. The industrial trend, however, is to make

available a wider variety of integrated circuits. The greater diversity of chips includes:

• More specialized standard parts. In the 1960s, standard parts were logic gates; in the 1970s

they were LSI components. Today, standard parts include fairly specialized components:

communication network interfaces, graphics accelerators, floating point processors. All these

parts are more specialized than microprocessors but are used in enough volume that designing

special-purpose chips is worth the effort. In fact, putting a complex, high-performance function

on a single chip often makes other applications possible—for example, single-chip floating point

processors make high-speed numeric computation available on even inexpensive personal

computers.

• Application-specific integrated circuits (ASICs).

Rather than build a system out of standard parts, designers can now create a single chip for their

particular application. Because the chip is specialized, the functions of several standard parts can

often be squeezed into a single chip, reducing system size, power, heat, and cost. Application-

specific ICs are possible because of computer tools that help humans design chips much more

quickly.

• Systems-on-chips (SoCs).

Fabrication technology has advanced to the point that we can put a complete system on a single

chip. For example, a single-chip computer can include a CPU, bus, I/O devices, and memory.

SoCs allow systems to be made at much lower cost than the equivalent board-level system. SoCs

can also be higher performance and lower power than board-level equivalents because on-chip

connections are more efficient than chip-to-chip connections. A wider variety of chips is now

available in part because fabrication methods are better understood and more reliable. More

importantly, as the number of transistors per chip grows, it becomes easier and cheaper to design

special-purpose ICs. When only a few transistors could be put on a chip, careful design was

required to ensure that even modest functions could be put on a single chip. Today’s VLSI

manufacturing processes, which can put millions of carefully-designed transistors on a chip, can

also be used to put tens of thousands of less-carefully designed transistors on a chip. Even

though the chip could be made smaller or faster with more design effort, the advantages of

having a single-chip implementation of a function that can be quickly designed often outweigh

the lost potential performance. The problem and the challenge of the ability to manufacture such

large chips is design—the ability to make effective use of the millions of transistors on a chip to

perform a useful function.

CMOS TECHNOLOGY

CMOS is the dominant integrated circuit technology. In this section we will introduce some

basic concepts of CMOS to understand why it is so widespread and some of the challenges

introduced by the inherent characteristics of CMOS.

POWER CONSUMPTION

POWER CONSUMPTION CONSTRAINTS

The huge chips that can be fabricated today are possible only because of the relatively tiny

power consumption of CMOS circuits. Power consumption is critical at the chip level because much of

the power is dissipated as heat, and chips have limited heat dissipation capacity.

Even if the system in which a chip is placed can supply large amounts of power, most chips are

packaged to dissipate fewer than 10 to 15 Watts of power before they suffer permanent damage

(though some chips dissipate well over 50 Watts thanks to special packaging). The power

consumption of a logic circuit can, in the worst case, limit the number of transistors we can

effectively put on a single chip. Limiting the number of transistors per chip changes system

design in several ways. Most obviously, it increases the physical size of a system. Using high-

powered circuits also increases power supply and cooling requirements. A more subtle effect is

caused by the fact that the time required to transmit a signal between chips is much larger than

the time required to send the same signal between two transistors on the same chip; as a result,

some of the advantage of using a higher-speed circuit family is lost. Another subtle effect of

decreasing the level of integration is that the electrical design of multi-chip systems is more

complex: microscopic wires on-chip exhibit parasitic resistance and capacitance, while

macroscopic wires between chips have capacitance and inductance, which can cause a number of

ringing effects that are much harder to analyze. The close relationship between power

consumption and heat makes low-power design

techniques important knowledge for every CMOS designer. Of course, low-energy design is

especially important in battery-operated systems like cellular telephones. Energy, in contrast to power,

must be saved by avoiding unnecessary work. We will see throughout the rest of this book that

minimizing power and energy consumption requires careful attention to detail at every level of

abstraction, from system architecture down to layout. As CMOS features become smaller,

additional power consumption mechanisms come into play. Traditional CMOS consumes power

when signals change but consumes only negligible power when idle. In modern CMOS, leakage

mechanisms start to drain current even when signals are idle. In the smallest geometry processes,

leakage power consumption can be larger than dynamic power consumption. We must introduce

new design techniques to combat leakage power.
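The two mechanisms described above are commonly approximated by a switching term, P_dyn = alpha * C * Vdd^2 * f, and a leakage term, P_leak = Vdd * I_leak. The sketch below evaluates both. The activity factor, switched capacitance, supply voltage, clock frequency, and leakage current are illustrative assumptions, not values measured in this work.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """Switching power in watts: activity factor * capacitance * Vdd^2 * frequency."""
    return alpha * c_load * vdd ** 2 * freq


def leakage_power(vdd, i_leak):
    """Static power in watts drawn by leakage current even when signals are idle."""
    return vdd * i_leak


# Assumed operating point: 10% activity, 1 nF total switched capacitance,
# 1.0 V supply, 1 GHz clock, 50 mA total leakage current.
p_dyn = dynamic_power(alpha=0.1, c_load=1e-9, vdd=1.0, freq=1e9)
p_leak = leakage_power(vdd=1.0, i_leak=0.05)

# Halving the supply voltage cuts switching power to a quarter,
# because the dynamic term depends on Vdd squared.
p_dyn_half = dynamic_power(alpha=0.1, c_load=1e-9, vdd=0.5, freq=1e9)

print(f"dynamic: {p_dyn:.3f} W, leakage: {p_leak:.3f} W, "
      f"dynamic at half Vdd: {p_dyn_half:.3f} W")
```

The quadratic dependence on supply voltage is why voltage scaling is a common low-power technique, while the leakage term stays present even when the activity factor is zero.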

DESIGN AND TESTABILITY

DESIGN VERIFICATION

Our ability to build large chips of unlimited variety introduces the problem of checking whether

those chips have been manufactured correctly. Designers accept the need to verify or validate

their designs to make sure that the circuits perform the specified function. (Some people use the

terms verification and validation interchangeably; a finer distinction reserves verification for

formal proofs of correctness, leaving validation to mean any technique which increases

confidence in correctness, such as simulation.) Chip designs are simulated to ensure that the

chip’s circuits compute the proper functions to a sequence of inputs chosen to exercise the chip.

But each chip that comes off the manufacturing line must also undergo

manufacturing test—the chip must be exercised to demonstrate that no manufacturing defects

rendered the chip useless. Because IC manufacturing tends to introduce certain types of defects

and because we want to minimize the time required to test each chip, we can’t just use the input

sequences created for design verification to perform manufacturing test. Each chip must be

designed to be fully and easily testable. Finding out that a chip is bad only after you have

plugged it into a system is annoying at best and dangerous at worst. Customers are unlikely to

keep using manufacturers who regularly supply bad chips. Defects introduced during

manufacturing range from the catastrophic (contamination that destroys every transistor on the

wafer) to the subtle (a single broken wire or a crystalline defect that kills only one transistor).

While some bad chips can be found very easily, each chip must be thoroughly tested to find

even subtle flaws that produce erroneous results only occasionally. Tests designed to exercise

functionality and expose design bugs don’t always uncover manufacturing defects. We use fault

models to identify potential manufacturing problems and determine how they affect the chip’s

operation. The most common fault model is stuck-at-0/1: the defect causes a logic gate’s output

to be always 0 (or 1), independent of the gate’s input values. We can often determine whether a

logic gate’s output is stuck even if we can’t directly observe its outputs or control its inputs. We

can generate a good set of manufacturing tests for the chip by assuming each logic gate’s output

is stuck at 0 (then 1) and finding an input to the chip which causes different outputs when the

fault is present or absent.

TESTABILITY AS A DESIGN PROCESS

Unfortunately, not all chip designs are equally testable. Some faults may require long input

sequences to expose; other faults may not be testable at all, even though they cause chip

malfunctions that aren’t covered by the fault model. Traditionally, chip designers have ignored

testability problems, leaving them to a separate test engineer who must find a set of inputs to

adequately test the chip. If the test engineer can’t change the chip design to fix testability

problems, his or her job becomes both difficult and unpleasant. The result is often poorly tested

chips whose manufacturing problems are found only after the customer has plugged them into a

system. Companies now recognize that the only way to deliver high-quality chips to customers is

to make the chip designer responsible for testing, just as the designer is responsible for making

the chip run at the required speed. Testability problems can often be fixed easily early in the

design process at relatively little cost in area and performance. But modern designers must

understand testability requirements, analysis techniques which identify hard-to-test sections of

the design, and design techniques which improve testability.

RELIABILITY

RELIABILITY IS A LIFETIME PROBLEM

Earlier generations of VLSI technology were robust enough that testing chips at manufacturing

time was sufficient to identify working parts—a chip either worked or it didn’t. In today’s

nanometer-scale technologies, the problem of determining whether a chip works is more

complex. A number of mechanisms can cause transient failures that cause occasional problems

but are not repeatable. Some other failure mechanisms, like overheating, cause permanent

failures but only after the chip has operated for some time. And more complex manufacturing

problems cause problems that are harder to diagnose and may affect performance rather than

functionality.

DESIGN-FOR-MANUFACTURABILITY

A number of techniques, referred to as design-for-manufacturability or design-for-yield, are in

use today to improve the reliability of chips that come off the manufacturing line. We can make

chips more reliable by designing circuits and architectures that reduce design stresses and check

for problems. For example, heat is one major cause of chip failure. Proper power management

circuitry can reduce the chip’s heat dissipation and reduce the damage caused by overheating.

We also need to change the way we design chips. Some of the convenient levels of abstraction

that served us well in earlier technologies are no longer entirely appropriate in nanometer

technologies. We need to check more thoroughly and be willing to solve reliability problems by

modifying design decisions made earlier.

INTEGRATED CIRCUIT DESIGN TECHNIQUES

To make use of the flood of transistors given to us by Moore’s Law, we must design large,

complex chips quickly. The obstacle to making large chips work correctly is complexity—many

interesting ideas for chips have died in the swamp of details that must be made correct before the

chip actually works. Integrated circuit design is hard because designers must juggle several

different problems:

• Multiple levels of abstraction. IC design requires refining an idea through many levels of

detail. Starting from a specification of what the chip must do, the designer must create an

architecture which performs the required function, expand the architecture into a logic design,

and further expand the logic design into a layout like the one in Figure 1-2. As you will learn by

the end of this book, the specification-to-layout design process is a lot of work.

• Multiple and conflicting costs. In addition to drawing a design through many levels of detail,

the designer must also take into account costs—not dollar costs, but criteria by which the quality

of the design is judged. One critical cost is the speed at which the chip runs. Two architectures

that execute the same function (multiplication, for example) may run at very different speeds.

We will see that chip area is another critical design cost: the cost of manufacturing a chip is

exponentially related to its area, and chips much larger than 1 cm² cannot be manufactured at all.

Furthermore, if multiple cost criteria—such as area and speed requirements—must be satisfied,

many design decisions will improve one cost metric at the expense of the other. Design is

dominated by the process of balancing conflicting constraints.

• Short design time. In an ideal world, a designer would have time to contemplate the effect of a

design decision. We do not, however, live in an ideal world. Chips which appear too late may

make little or no money because competitors have snatched market share. Therefore, designers

are under pressure to design chips as quickly as possible. Design time is especially tight in

application-specific IC design, where only a few weeks may be available to turn a concept into a

working ASIC.
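The claim above that manufacturing cost grows exponentially with chip area can be illustrated with a simple Poisson yield model, Y = exp(-A * D), where A is the die area and D the defect density. This is one common textbook model, used here as an assumption; the wafer area, wafer cost, and defect density are made-up values chosen only to show the trend.

```python
import math


def cost_per_good_die(die_area_cm2, wafer_area_cm2=700.0,
                      wafer_cost=5000.0, defects_per_cm2=0.5):
    """Cost of one working die under a Poisson yield model.

    Edge losses and test cost are ignored; all defaults are assumed values.
    """
    dies_per_wafer = wafer_area_cm2 / die_area_cm2
    yield_fraction = math.exp(-die_area_cm2 * defects_per_cm2)
    return wafer_cost / (dies_per_wafer * yield_fraction)


small = cost_per_good_die(0.5)
large = cost_per_good_die(1.0)
print(f"0.5 cm^2: {small:.2f}  1.0 cm^2: {large:.2f}  ratio: {large / small:.2f}")
```

Doubling the area more than doubles the cost per good die, because fewer dies fit on the wafer and a smaller fraction of them work. This is the area-versus-cost pressure that the designer must balance against speed.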

FIELD-PROGRAMMABLE GATE ARRAYS (FPGA)

A field-programmable gate array (FPGA) is a block of programmable logic that can implement

multi-level logic functions. FPGAs are most commonly used as separate commodity chips that

can be programmed to implement large functions. However, small blocks of FPGA logic can be

useful components on-chip to allow the user of the chip to customize part of the chip’s logical

function. An FPGA block must implement both combinational logic functions and interconnect

to be able to construct multi-level logic functions. There are several different technologies for

programming FPGAs, but most logic processes are unlikely to implement anti-fuses or similar

hard programming technologies, so we will concentrate on SRAM-programmed FPGAs.

LOOKUP TABLES

The basic method used to build a combinational logic block (CLB), also called a logic element,

in an SRAM-based FPGA is the lookup table (LUT). As shown in Fig-5, the lookup table is an

SRAM that is used to implement a truth table. Each address in the SRAM represents a

combination of inputs to the logic element. The value stored at that address represents the value

of the function for that input combination. An n-input function requires an SRAM with 2^n locations.

Fig-5 Lookup Tables

Because a basic SRAM is not clocked, the lookup table logic element operates much as any

other logic gate: as its inputs change, its output changes after some delay.
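A lookup table can be modelled in a few lines of software: the stored bits are the truth table, and the inputs together form the address. The sketch below is a behavioural model only, with no timing. The class name LookupTable and the convention that the first input is the most significant address bit are assumptions of this example.

```python
class LookupTable:
    """Behavioural model of an n-input LUT: a 2^n-entry memory used as a truth table."""

    def __init__(self, n_inputs, bits):
        if len(bits) != 2 ** n_inputs:
            raise ValueError("an n-input LUT needs exactly 2^n stored bits")
        self.n_inputs = n_inputs
        self.bits = list(bits)

    def evaluate(self, *inputs):
        """Form the address from the inputs (first input = MSB) and read the stored bit."""
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.bits[address]


# A 2-input XOR: addresses 00, 01, 10, 11 hold 0, 1, 1, 0.
xor_lut = LookupTable(2, [0, 1, 1, 0])
print([xor_lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```

Evaluating the function is a single memory read regardless of which function is stored, which matches the observation that the logic element behaves like an ordinary unclocked gate.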

PROGRAMMING A LOOKUP TABLE

Unlike a typical logic gate, the function represented by the logic element can be changed by

changing the values of the bits stored in the SRAM. As a result, the n-input logic element can

represent 2^(2^n) functions (though some of these functions are permutations of each other).
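The number of programmable functions follows from counting memory contents: an n-input lookup table stores 2^n bits, and each distinct bit pattern is a distinct truth table, so there are 2^(2^n) of them. The short sketch below checks this by enumeration for n = 2; the function names are chosen for this example only.

```python
from itertools import product


def count_functions(n_inputs):
    """Number of distinct truth tables an n-input LUT can store."""
    return 2 ** (2 ** n_inputs)


def all_truth_tables(n_inputs):
    """Every possible content of the 2^n-entry memory, one tuple per function."""
    return list(product((0, 1), repeat=2 ** n_inputs))


tables = all_truth_tables(2)
print(len(tables), count_functions(2))   # 16 16
print(count_functions(4))                # 65536
```

For the typical four-input logic element this gives 65,536 programmable functions, although many differ only by a permutation of the inputs.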

Fig-6 Programming A Lookup Table

A typical logic element has four inputs. The delay through the lookup table is independent of

the bits stored in the SRAM, so the delay through the logic element is the same for all functions.

This means that, for example, a lookup table-based logic element will exhibit the same delay for

a 4-input XOR and a 4-input NAND. In contrast, a 4-input XOR built with static CMOS logic is

considerably slower than a 4-input NAND. Of course, the static logic gate is generally faster than

the logic element. Logic elements generally contain registers—flip-flops and latches—as well as

combinational logic. A flip-flop or latch is small compared to the combinational logic element

(in sharp contrast to the situation in custom VLSI), so it makes sense to add it to the

combinational logic element. Using a separate cell for the memory element would simply take up

routing resources. The memory element is connected to the output; whether it stores a given

value is controlled by its clock and enable inputs.

COMPLEX LOGIC ELEMENT

Many FPGAs also incorporate specialized adder logic in the logic element. The critical

component of an adder is the carry chain, which can be implemented much more efficiently in

specialized logic than it can using standard lookup table techniques. The wiring channels that

connect to the logic elements’ inputs and outputs also need to be programmable. A wiring

channel has a number of programmable connections such that each input or output generally can

be connected to any one of several different wires in the channel.
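The carry chain mentioned above is easiest to see in a ripple-carry adder, where each bit position must wait for the carry from the position below it. The sketch below is a bit-level software model, not FPGA code; bit lists are assumed to be least significant bit first.

```python
def full_adder(a, b, carry_in):
    """One bit position: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_add(x_bits, y_bits, carry_in=0):
    """Add two equal-length bit lists (LSB first); the carry ripples through every stage."""
    carry = carry_in
    sum_bits = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


# 5 + 3 = 8 in four bits (LSB first): 1010 + 1100 -> 0001, no carry out.
print(ripple_add([1, 0, 1, 0], [1, 1, 0, 0]))  # ([0, 0, 0, 1], 0)
```

The loop-carried variable carry is the critical path: in the 5 + 3 example it propagates through three consecutive stages. Dedicated carry logic shortens exactly this path, which is why it is more efficient than building each stage from general lookup tables.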

PROGRAMMABLE INTERCONNECTION POINTS

The figure shows a simple version of an interconnection point, often known as a connection box.

Fig - Programmable Interconnection Point

A programmable connection between two wires is made by a CMOS transistor (a pass

transistor). The pass transistor’s gate is controlled by a static memory program bit (shown here as

a D register). When the pass transistor’s gate is high, the transistor conducts and connects the

two wires; when the gate is low, the transistor is off and the two wires are not connected.
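A connection box can be modelled as one program bit per pass transistor: a wire in the channel reaches the logic element pin only if its bit is set. The sketch below is a purely logical model that ignores the electrical behaviour of the pass transistor; the class and method names are assumptions of this example.

```python
class ConnectionBox:
    """One pin that can be connected to any of several channel wires."""

    def __init__(self, n_wires):
        # One static memory bit per pass transistor; 0 means the transistor is off.
        self.program_bits = [0] * n_wires

    def program(self, wire_index, bit):
        """Set the program bit that drives one pass transistor's gate."""
        self.program_bits[wire_index] = 1 if bit else 0

    def pin_value(self, wire_values):
        """Value seen at the pin, or None if no pass transistor is on."""
        connected = [v for v, bit in zip(wire_values, self.program_bits) if bit]
        if not connected:
            return None
        if len(set(connected)) > 1:
            raise ValueError("two connected wires carry different values")
        return connected[0]


box = ConnectionBox(3)
box.program(1, 1)                  # turn on the pass transistor for wire 1
print(box.pin_value([0, 1, 0]))    # 1
```

Normally only one bit per pin is set, so the configuration selects which wire in the channel the pin listens to.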

CHAPTER 4

SOFTWARE REQUIREMENTS

Verification Tool

Modelsim 6.4c

Synthesis Tool

Xilinx ISE 9.1

INTRODUCTION TO MODELSIM

ModelSim /VHDL, ModelSim /VLOG, ModelSim /LNL, and ModelSim /PLUS are produced by

Model Technology™ Incorporated.

ModelSim is a useful tool that allows you to stimulate the inputs of your modules and view both

outputs and internal signals. It allows you to do both behavioural and timing simulation;

however, this document will focus on behavioural simulation. Keep in mind that these

simulations are based on models and thus the results are only as accurate as the constituent

models.

Standards Supported

ModelSim VHDL supports both the IEEE 1076-1987 and 1076-1993 VHDL, the 1164-1993

Standard Multivalue Logic System for VHDL Interoperability, and the 1076.2-1996 Standard

VHDL Mathematical Packages standards. Any design developed with ModelSim will be

compatible with any other VHDL system that is compliant with either IEEE Standard 1076-

1987 or 1076-1993. ModelSim Verilog is based on IEEE Std 1364-1995 and a partial

implementation of 1364-2001, Standard Hardware Description Language Based on the Verilog

Hardware Description Language. The Open Verilog International Verilog LRM version 2.0 is

also applicable to a large extent. Both PLI (Programming Language Interface) and VCD (Value

Change Dump) are supported for ModelSim PE and SE users.

MODELSIM

Basic Steps For Simulation

This section provides further detail related to each step in the process of simulating your design

using ModelSim.

Step 1 - Collecting Files and Mapping Libraries

Files needed to run ModelSim on your design:

• design files (VHDL, Verilog, and/or SystemC), including stimulus for the design

• libraries, both working and resource

• modelsim.ini (automatically created by the library mapping command)

Providing stimulus to the design

You can provide stimulus to your design in several ways:

• Language based testbench

• Tcl-based ModelSim interactive command, force

• VCD files / commands (see "Using extended VCD as stimulus" (UM-458))

• 3rd party test bench generation tools

What is a library in ModelSim?

A library is a location where data to be used for simulation is stored. Libraries are ModelSim’s

way of managing the creation of data before it is needed for use in simulation. It also serves as a

way to streamline simulation invocation. Instead of compiling all design data each and every

time you simulate, ModelSim uses binary pre-compiled data from these libraries. So, if you

make a change to a single Verilog module, only that module is recompiled, rather than all

modules in the design.
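The benefit of a library of pre-compiled data can be sketched as follows. This is only an illustration of the idea of recompiling what changed; it does not describe how ModelSim actually tracks changes internally. The Library class, the content-hash check, and the module sources are all hypothetical.

```python
import hashlib


class Library:
    """Toy design library that remembers what has already been compiled."""

    def __init__(self):
        self.compiled = {}  # module name -> hash of the source last compiled

    def compile_changed(self, sources):
        """Compile only modules whose source text differs from the stored version."""
        recompiled = []
        for name, text in sources.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.compiled.get(name) != digest:
                self.compiled[name] = digest   # stand-in for the real compile step
                recompiled.append(name)
        return recompiled


work = Library()
sources = {"adder": "module adder; endmodule", "top": "module top; endmodule"}
first = work.compile_changed(sources)      # everything is new

sources["adder"] = "module adder; /* edited */ endmodule"
second = work.compile_changed(sources)     # only the edited module

print(first, second)  # ['adder', 'top'] ['adder']
```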

Working and resource libraries

Design libraries can be used in two ways: 1) as a local working library that contains the compiled

version of your design; 2) as a resource library. The contents of your working library will change

as you update your design and recompile. A resource library is typically unchanging, and serves

as a parts source for your design. Examples of resource libraries might be: shared information

within your group, vendor libraries, packages, or previously compiled elements of your own

working design. You can create your own resource libraries, or they may be supplied by another

design team or a third party (e.g., a silicon vendor). For more information on resource libraries

and working libraries, see "Working library versus resource libraries", "Managing library

contents", "Working with design libraries", and "Specifying the resource libraries".

Creating The Logical Library – vlib

Before you can compile your source files, you must create a library in which to store the

compilation results. You can create the logical library using the GUI, using File > New > Library

(see "Creating a library"), or you can use the vlib command. For example, the command:

vlib work

creates a library named work. By default, compilation results are stored in the work

library.

Mapping The Logical Work To The Physical Work Directory – vmap

VHDL uses logical library names that can be mapped to ModelSim library directories. If

libraries are not mapped properly, and you invoke your simulation, necessary components will

not be loaded and simulation will fail. Similarly, compilation can also depend on proper library

mapping. By default, ModelSim can find libraries in your current directory (assuming they have

the right name), but for it to find libraries located elsewhere, you need to map a logical library

name to the pathname of the library. You can use the GUI ("Library mappings with the GUI"), a

command ("Library mappings with the GUI"), or a project ("Getting started with projects") to

assign a logical name to a design library.

The format for command line entry is:

vmap <logical_name> <directory_pathname>

This command sets the mapping between a logical library name and a directory.

Step 2 - Compiling the design with vlog/vcom/sccom

Designs are compiled with one of the three language compilers.

Compiling Verilog - vlog

ModelSim’s compiler for the Verilog modules in your design is vlog. Verilog files may be

compiled in any order, as they are not order dependent. See "Compiling Verilog files" for

details.

Verilog portions of the design can be optimized for better simulation performance. See

"Optimizing Verilog designs".

Compiling VHDL - vcom

ModelSim’s compiler for VHDL design units is vcom. VHDL files must be compiled in the order

required by the dependencies of the design. Projects may assist you in determining the compile

order: for more information, see "Auto-generating compile order" (UM-46). See "Compiling

VHDL files" (UM-73) for details on VHDL compilation.

Compiling SystemC - sccom

ModelSim’s compiler for SystemC design units is sccom, and is used only if you have SystemC

components in your design. See "Compiling SystemC files" for details.

Step 3 - Loading the design for simulation

vsim <top>

Your design is ready for simulation after it has been compiled and (optionally) optimized with

vopt. For more information on optimization, see "Optimizing Verilog designs". You may then

invoke vsim with the names of the top-level modules (many designs contain only one top-level

module) or the name you assigned to the optimized version of the design.

For example, if your top-level modules are "testbench" and "globals", then invoke the simulator

as follows:

vsim testbench globals

After the simulator loads the top-level modules, it iteratively loads the instantiated modules and

UDPs in the design hierarchy, linking the design together by connecting the ports and resolving

hierarchical references.

Using SDF

You can incorporate actual delay values into the simulation by applying SDF back annotation

files to the design. For more information on how SDF is used in the design, see "Specifying SDF

files for simulation".

Step 4 - Simulating the design

Once the design has been successfully loaded, the simulation time is set to zero, and you

must enter a run command to begin simulation. For more information, see "Verilog

simulation", "VHDL simulation", and "SystemC simulation".

The basic simulator commands are:

add wave

force

bp

run

step

next
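Steps 1 to 4 can also be driven from a script instead of the GUI. The sketch below only builds the command lines and prints them; it does not run ModelSim. The helper name simulation_commands, the file name FA.vhd, the top-level unit FA, and the 100 ns run time are assumptions for illustration, and the options shown (-work, -c, -do) should be checked against the command reference of the installed ModelSim version.

```python
def simulation_commands(sources, top, library="work", run_time="100 ns"):
    """Build the ModelSim command lines for library setup, compile, load, and run."""
    commands = [
        ["vlib", library],            # Step 1: create the library
        ["vmap", library, library],   # Step 1: map logical name to directory
    ]
    for path in sources:              # Step 2: pick the compiler by file type
        is_vhdl = path.lower().endswith((".vhd", ".vhdl"))
        compiler = "vcom" if is_vhdl else "vlog"
        commands.append([compiler, "-work", library, path])
    # Steps 3 and 4: load the top-level unit in command-line mode and run it.
    commands.append(["vsim", "-c", f"{library}.{top}",
                     "-do", f"run {run_time}; quit -f"])
    return commands


commands = simulation_commands(["FA.vhd"], top="FA")
for command in commands:
    print(" ".join(command))
```

Each inner list could be passed to subprocess.run on a machine where the ModelSim executables are on the search path.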

Step 5 - Debugging the design

Numerous tools and windows useful in debugging your design are available from the ModelSim

GUI. For more information, see "Waveform analysis" (UM-237), "PSL Assertions", and "Tracing

signals with the Dataflow window". In addition, several basic simulation commands are available

from the command line to assist you in debugging your design:

describe

drivers

examine

force

log

checkpoint

restore

show

MODELSIM BASICS

On the left side of the interface, under the project tab is the frame listing of the files that pertain

to the opened project. The Library frame lists the entities of the project (that have been compiled).

To the right is the ModelSim shell frame. It is an extension of MS-DOS, so both ModelSim and

MS-DOS commands can be executed.

A. Creating a New Project

Once ModelSim has been started, create a new project:

File > New > Project…

Name the project FA, set the project location to F:/VHDL, and click OK.

A new window should appear to add new files to the project. Choose Create New File.

Enter F:\VHDL\FA.vhd as the file name and click OK.

Then close the “add new files” window.

Additional files can be added later by choosing from the menu: Project > Add File to Project

B. Editing Source Files

Double click the FA.vhd source found under the Workspace window’s project tab. This will

open up an empty text editor configured to highlight VHDL syntax. Copy the source found at the

end of this document. After writing the code for the entity and its architecture, save and close the

source file.

C. Compiling projects

Select the file from the project files list frame and right click on it. Select compile to compile

just this file, or compile all to compile all the files in the current project. If there are errors within the

code or the project, a red failure message will be displayed. Double click these red errors for

more detailed errors. Otherwise, if all is well then no red warning or error messages will be

displayed.

II. Simulating with ModelSim

To simulate, first the entity design has to be loaded into the simulator. Do this by selecting

from the menu:

Simulate > Simulate

A new window will appear listing all the entities (not filenames) that are in the work library.

Select FA entity for simulation and click OK.

Often it will be necessary to create entities with multiple architectures. In this case the

architecture has to be specified for the simulation. Expand the tree for the entity and select the

architecture to be simulated and then click OK.

Creating test files for the simulator

After the design is loaded, clear up any previous data and restart the timer from the prompt.

Then open the signal list from the menu:

View > Signals

A new window will be displayed listing the design entity’s signals and their initial value (shown

below). Items in waveform and listing are ordered in the same order in which they are declared

in the code. To display the waveform, select the signals for the waveform to display (hold CTRL

and click to select multiple signals) and from the signal list window menu select:

Add > Wave > Selected signals

XILINX ISE

INTRODUCTION

The Spartan®-3 family of Field-Programmable Gate Arrays is specifically designed to meet the

needs of high volume, cost-sensitive consumer electronic applications. The eight-member family

offers densities ranging from 50,000 to five million system gates. The Spartan-3 family builds on

the success of the earlier Spartan-IIE family by increasing the amount of logic resources, the

capacity of internal RAM, the total number of I/Os, and the overall level of performance as well

as by improving clock management functions. Numerous enhancements derive from the

Virtex®-II platform technology. These Spartan-3 FPGA enhancements, combined with advanced

process technology, deliver more functionality and bandwidth per dollar than was previously

possible, setting new standards in the programmable logic industry. Because of their

exceptionally low cost, Spartan-3 FPGAs are ideally suited to a wide range of consumer

electronics applications, including broadband access, home networking, display/projection and

digital television equipment. The Spartan-3 family is a superior alternative to mask programmed

ASICs. FPGAs avoid the high initial cost, the lengthy development cycles, and the inherent

inflexibility of conventional ASICs. Also, FPGA programmability permits design upgrades in

the field with no hardware replacement necessary, an impossibility with ASICs.

Features

• Low-cost, high-performance logic solution for high-volume, consumer-oriented applications

- Densities up to 74,880 logic cells

• SelectIO™ interface signaling

- Up to 633 I/O pins

- 622+ Mb/s data transfer rate per I/O

- 18 single-ended signal standards

- 8 differential I/O standards including LVDS, RSDS

- Termination by Digitally Controlled Impedance

- Signal swing ranging from 1.14V to 3.465V

- Double Data Rate (DDR) support

- DDR, DDR2 SDRAM support up to 333 Mb/s

• Logic resources

- Abundant logic cells with shift register capability

- Wide, fast multiplexers

- Fast look-ahead carry logic

- Dedicated 18 x 18 multipliers

- JTAG logic compatible with IEEE 1149.1/1532

• SelectRAM™ hierarchical memory

- Up to 1,872 Kbits of total block RAM

- Up to 520 Kbits of total distributed RAM

• Digital Clock Manager (up to four DCMs)

- Clock skew elimination

- Frequency synthesis

- High resolution phase shifting

• Eight global clock lines and abundant routing

• Fully supported by Xilinx ISE® and WebPACK™

software development systems

• MicroBlaze™ and PicoBlaze™ processor, PCI®, PCI

Express® PIPE Endpoint, and other IP cores

• Pb-free packaging options

• Automotive Spartan-3 XA Family variant

ARCHITECTURAL OVERVIEW

The Spartan-3 family architecture consists of five fundamental programmable functional

elements:

• Configurable Logic Blocks (CLBs) contain RAM-based Look-Up Tables (LUTs) to implement

logic and storage elements that can be used as flip-flops or latches. CLBs can be programmed to

perform a wide variety of logical functions as well as to store data.

• Input/Output Blocks (IOBs) control the flow of data between the I/O pins and the internal logic

of the device. Each IOB supports bidirectional data flow plus 3-state operation. Twenty-six

different signal standards, including eight high-performance differential standards, are available

as shown in Table 2. Double Data-Rate (DDR) registers are included. The Digitally Controlled

Impedance (DCI) feature provides automatic on-chip terminations, simplifying board designs.

• Block RAM provides data storage in the form of 18-Kbit dual-port blocks.

• Multiplier blocks accept two 18-bit binary numbers as inputs and calculate the product.

• Digital Clock Manager (DCM) blocks provide self-calibrating, fully digital solutions for

distributing, delaying, multiplying, dividing, and phase shifting clock signals.

These elements are organized as shown in the figure below. A ring of IOBs surrounds a regular

array of CLBs. The XC3S50

has a single column of block RAM embedded in the array. Those devices ranging from the

XC3S200 to the XC3S2000 have two columns of block RAM. The XC3S4000 and XC3S5000

devices have four RAM columns. Each column is made up of several 18-Kbit RAM blocks; each

block is associated with a dedicated multiplier. The DCMs are positioned at the ends of the outer

block RAM columns. The Spartan-3 family features a rich network of traces and switches that

interconnect all five functional elements, transmitting signals among them. Each functional

element has an associated switch matrix that permits multiple connections to the routing.

Fig - SPARTAN-3 Family Architecture

CONFIGURATION

Spartan-3 FPGAs are programmed by loading configuration data into robust, reprogrammable,

static CMOS configuration latches (CCLs) that collectively control all functional elements and

routing resources. Before powering on the FPGA, configuration data is stored externally in a

PROM or some other nonvolatile medium either on or off the board. After applying power, the

configuration data is written to the FPGA using any of five different modes: Master Parallel,

Slave Parallel, Master Serial, Slave Serial, and Boundary Scan (JTAG). The Master and Slave

Parallel modes use an 8-bit-wide SelectMAP port.

The recommended memory for storing the configuration data is the low-cost Xilinx

Platform Flash PROM family, which includes the XCF00S PROMs for serial configuration and

the higher density XCF00P PROMs for parallel or serial configuration.

I/O CAPABILITIES

The SelectIO feature of Spartan-3 devices supports 18 single-ended standards and 8 differential

standards. Many standards support the DCI feature, which uses integrated terminations to

eliminate unwanted signal reflections.

Package Marking

Figure 2 shows the top marking for Spartan-3 FPGAs in the quad-flat packages. Figure 3 shows

the top marking for Spartan-3 FPGAs in BGA packages except the 132-ball chip-scale package

(CP132 and CPG132). The markings for the BGA packages are nearly identical to those for the

quad-flat packages, except that the marking is rotated with respect to the ball A1 indicator.

Figure 4 shows the top marking for Spartan-3 FPGAs in the CP132 and CPG132 packages. The

“5C” and “4I” part combinations may be dual marked as “5C/4I”. Devices with the dual mark

can be used as either -5C or -4I devices. Devices with a single mark are only guaranteed for the

marked speed grade and temperature range. Some specifications vary according to mask

revision. Mask revision E devices are errata-free. All shipments since 2006 have been mask

revision E.

Fig- Spartan-3 QFP Package Marking Example for Part Number XC3S400-4PQ208C

Fig Spartan-3 BGA Package Marking Example for Part Number XC3S1000-4FT256C

Fig Spartan-3 CP132 and CPG132 Package Marking Example for XC3S50-4CP132C

Ordering Information

Spartan-3 FPGAs are available in both standard and Pb-free packaging options for all

device/package combinations. The Pb-free packages include a special ‘G’ character in the

ordering code.

Fig Standard Packaging

For additional information on Pb-free packaging, see XAPP427: "Implementation and Solder

Reflow Guidelines for Pb-Free Packages".

Fig-Pb-Free Packaging

CHAPTER 5

SIMULATION IMPLEMENTATION

5.1 GENERAL

Since the early 1980s, when schematic capture was introduced as an efficient way to

design very large-scale integration (VLSI) circuits, it has been the design method of choice for

designers in the world of VLSI design. However, the use of this method reached its limits in the

early 1990s, as more and more logic functionality and features were integrated onto a single

chip. Today, most application-specific integrated circuit (ASIC) chips consist of no fewer than

one million transistors. Designing circuits this large using the method of schematic capture is

time consuming and is no longer efficient. Therefore, a more efficient manner of design was

required. This new method had to increase the designers’ efficiency and allow ease of design,

even when dealing with large circuits. From this requirement arose the wide acceptance of HDL

(hardware description language). HDL allows a designer to describe the functionality of a

required logic circuit in a language that is easy to understand. The description is then simulated

using test benches. After the HDL description is verified for logic functionality, it is synthesized

to logic gates by using synthesis tools. This method helps a designer to design a circuit in a

shorter timeframe. The savings in design time is achieved because the designer need not be

concerned with the intricate complexities that exist in a particular circuit, but instead is focused

on the functionality that is required. This new method of design has been widely adopted today

in the field of ASIC design. It allows designers to design large numbers of logic gates to

implement logic features and functionality that are required on an ASIC chip.

As the size and

complexity of digital systems increase, more computer aided design (CAD) tools are introduced

into the hardware design process. Early simulation and primitive hardware generation tools have

given way to sophisticated design entry, verification, high-level synthesis, formal verification,

and automatic hardware generation and device programming tools. Growth of design automation

tools is largely due to hardware description languages (HDLs) and design methodologies that are

based on these languages. Based on HDLs, new digital system CAD tools have been developed

and are now widely used by hardware designers. At the same time research for finding better and

more abstract hardware languages continues. One of the most widely used HDLs is the Verilog

HDL. Because of its wide acceptance in digital design industry, Verilog has become a must-

know for design engineers and students in computer-hardware-related fields. This chapter

presents tools and environments that are based on Verilog and are available to a hardware

designer for automating his or her design process, and hence improving the final product’s time

to market. We discuss steps involved in taking a hierarchical, high-level design from a Verilog

description of the design to its implementation in hardware. Processes and terminologies are

illustrated here. We discuss available electronic design automation (EDA) tools that are based on

Verilog, and talk about their role in an automated design environment. The last section of this

chapter discusses some of the properties of Verilog that make this language a good choice for

designers and modelers of hardware.

Verilog HDL

The previous section showed steps involved in taking an RT level design from a Verilog

description to hardware implementation. This design process is only possible because Verilog is

a language that can be understood by system designers, RT level designers, test engineers,

simulators, synthesis tools, and machines. Because of this important role in design, Verilog has

become an IEEE standard. The standard is used by users as well as tool developers.

Verilog evolution

Verilog was designed in early 1984 by Gateway Design Automation. Initially the original

language was used as a simulation and verification tool. After the initial acceptance of this

language by the electronics industry, a fault simulator, a timing analyzer, and later, in 1987, a

synthesis tool were developed based on this language. Gateway Design Automation and its

Verilog-based tools were later acquired by Cadence Design Systems. Since then, Cadence has

been a strong force behind popularizing the Verilog hardware description language. In 1987

VHDL became an IEEE standard hardware description language. Because of its Department of

Defense (DoD) support, VHDL was adopted by the U.S. government for related projects and

contracts. In an effort for popularizing Verilog, in 1990, OVI (Open Verilog International) was

formed and Verilog was placed in public domain. This created a new line of interest in Verilog

for the users and EDA vendors. In 1993, efforts for standardization of this language started.

Verilog became the IEEE standard, IEEE Std. 1364-1995, in 1995. Already having simulation

tools, synthesizers, fault simulation programs, timing analyzers, and many of their design tools

developed for Verilog, this standardization helped further acceptance of Verilog in electronic

design communities. A new version of Verilog was approved by IEEE in 2001. This version,

referred to as Verilog-2001, is the present standard used by most users and tool developers.

New features for external file access for read and write, library management, constructs for

design configuration, higher abstraction level constructs, and constructs for specification of

iterative structures, are some of the features added to this version of Verilog. Work on improving

this standard continues in various IEEE sponsored study groups.

Verilog attributes

Verilog is a hardware description language for describing hardware from transistor level to

behavioral. The language supports timing constructs for switch level timing simulation and at the

same time, it has features for describing hardware at the abstract algorithmic level. A Verilog

description may consist of a mix of modules at various abstraction levels with different degrees

of detail.

Switch level

Features of the language that make it ideal for switch level modeling and simulation include

primitive unidirectional and bidirectional switches with parameters for delay and charge storage.

Circuit delays may be modeled as propagation delay, rise and fall delay, and line delays. The

charge storage feature at this level of abstraction in Verilog makes this language capable of

describing dynamic complementary metal oxide semiconductor (CMOS) and metal oxide

semiconductor (MOS) circuits.

Gate level

Gate level primitives with predefined parameters provide a convenient platform for netlist

representation and gate level simulation. For more detailed and special purpose gate simulations,

gate components may be defined at the behavioral level. Verilog also provides utilities for

defining primitives with special functionalities. A simple 4-value logic system is used in Verilog

for signal values. However, for more accurate logic modeling, Verilog signals also include 16

levels of strength in addition to the four values.
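As a small illustration of the 4-value system described above, the following Python sketch models how a two-input AND gate evaluates the values 0, 1, x (unknown), and z (high impedance). The function name and4 is hypothetical and signal strengths are ignored; it is meant only to show the value rules, not how any particular simulator is implemented.

```python
def and4(a: str, b: str) -> str:
    """Evaluate a two-input AND gate over the values '0', '1', 'x', 'z'.

    Assumption for this sketch: strengths are ignored, and a
    high-impedance input is treated as unknown, as a gate input would be.
    """
    valid = {"0", "1", "x", "z"}
    if a not in valid or b not in valid:
        raise ValueError("inputs must be one of '0', '1', 'x', 'z'")

    # A floating (z) gate input carries no defined level, so treat it as x.
    a = "x" if a == "z" else a
    b = "x" if b == "z" else b

    if a == "0" or b == "0":
        return "0"  # a 0 on either input forces the output low
    if a == "1" and b == "1":
        return "1"
    return "x"      # at least one input unknown and none is 0


if __name__ == "__main__":
    for a in "01xz":
        print(a, [and4(a, b) for b in "01xz"])
```

Note that a 0 on either input determines the output even when the other input is unknown, which is why the unknown value does not always propagate.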

Pin-to-pin delay

A utility for timing specification of components at the input/output level is provided in Verilog.

This utility can be used for back annotation of timing information in original predesigned

descriptions. Moreover, the pin-to-pin language facility enables modelers to fine-tune timing

behavior of their models based on physical implementations.

Bussing specifications

Bus and register modeling utilities are provided in Verilog. For various bus structures, Verilog

supports predefined wire and bus resolution functions using its 4-value logic value system.

The combination of bus logic and resolution functions enables modeling of most physical bus types.

For register modeling, high-level clock representation and timing-control constructs can be used

for representation of registers with various clocking and resetting schemes.
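The idea of a wire resolution function can be sketched in Python. The example below models how several drivers on a plain wire combine under the 4-value system: a high-impedance driver yields to any other driver, agreeing drivers keep their value, and conflicting drivers give an unknown. The names resolve_pair and resolve_wire are hypothetical, and all drivers are assumed to have equal strength.

```python
from functools import reduce


def resolve_pair(a: str, b: str) -> str:
    """Resolve two equal-strength drivers on a plain wire ('0', '1', 'x', 'z')."""
    if a == "z":
        return b        # an undriven source yields to the other driver
    if b == "z":
        return a
    if a == b:
        return a        # both drivers agree
    return "x"          # 0 against 1, or anything against x, is unknown


def resolve_wire(drivers) -> str:
    """Resolve any number of drivers; a wire with no drivers floats at 'z'."""
    return reduce(resolve_pair, drivers, "z")


if __name__ == "__main__":
    print(resolve_wire(["z", "1", "z"]))  # one active driver on a shared bus
    print(resolve_wire(["0", "1"]))       # bus contention
```

Modeling a tri-state bus then reduces to making every inactive driver present z, so that exactly one source determines the resolved value.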

Behavioral level

Procedural blocks of Verilog enable algorithmic representations of hardware structures.

Constructs similar to those in software programming languages are provided for describing

hardware at this level.

System utilities

System tasks in Verilog provide designers with tools for testbench generation, file access for

read and write, data handling, data generation, and special hardware modeling. System utilities

for reading memory and programmable logic array (PLA) images provide convenient ways of

modeling these components. Verilog display and I/O tasks can be used to handle all inputs and

outputs for data application and simulation. Verilog allows random access to files for read and

write operations.

Programming Language Interface (PLI)

Programming language interface (PLI) of Verilog provides an environment for accessing Verilog

data structures using a library of C-language functions.

The Verilog language

The Verilog HDL satisfies all requirements for design and synthesis of digital systems. The

language supports hierarchical description of hardware from system to gate or even switch level.

Verilog has strong support at all levels for timing specification and violation detection. Timing

and concurrency required for hardware modeling are specially emphasized. In Verilog a

hardware component is described by the module declaration language construct. Description of a

module specifies a component’s input and output list as well as internal component busses and

registers. Within a module, concurrent assignments, component instantiations, and procedural

blocks can be used to describe a hardware component. Several modules can hierarchically be

instantiated to form other hardware structures. Leaves of a hierarchical design specification may

be modules, primitives, or user defined primitives. For simulating a design, it is expected that all

leaves of the hierarchy are individually compiled. Many Verilog tools and environments exist

that provide simulation, fault simulation, formal verification, and synthesis. Simulation

environments provide graphical front-end programs and waveform editing and display tools.

Synthesis tools are based on a subset of Verilog. For synthesizing a design, target hardware, e.g.,

specific FPGA or ASIC, must be known.


CHAPTER 6

SNAPSHOTS

6.1 GENERAL

A snapshot captures the application at a particular moment while it is running. The

snapshots give a clear, detailed view of the application and help a new user understand the

subsequent steps.

6.2 VARIOUS SNAPSHOTS

1. D FLIP FLOP:

2. MUX:

3. Buff P:

4. Zero Dect P:

5. Comparator P:

6. Counter P:

7. P COUNTER:

8. BUFF S:

9. ZERO S:

10. COUNTER S:

11. COMPARATOR S:

12. S COUNTER:

13. PRESCALER:

14. FREQ DIVIDER 2/3:

15. 4/5 PRESCALER:

16. TOP MODULE:

CHAPTER 7

APPLICATION

The proposed wideband multimodulus prescaler has a maximum operating frequency

of 7.2 GHz (simulation) with a power consumption of 1.52, 1.60, 2.10, and 2.13 mW during the

divide-by-32, divide-by-33, divide-by-47, and divide-by-48 modes, respectively. The proposed multiband

divider has a variable resolution of K MHz for the lower frequency band (2.4–2.484 GHz) and for the

higher frequency band (5–5.825 GHz), where K is an integer from 1 to 5 for the 2.4-GHz band.

CHAPTER 8

CONCLUSION & REFERENCES

8.1 CONCLUSION

In this paper, a wideband 2/3 prescaler is verified in the design of the proposed wideband

multimodulus 32/33/47/48 prescaler. A dynamic logic multiband flexible integer-N divider is

designed which uses the wideband 2/3 prescaler and the multimodulus 32/33/47/48 prescaler, and is

silicon verified using 0.18-µm CMOS technology. Since the multimodulus 32/33/47/48

prescaler has maximum operating frequency of 6.2 GHz, the values of P- and S-counters

can actually be programmed to divide over the whole range of frequencies from 1 to 6.2 GHz

with finest resolution of 1 MHz and variable channel spacing. However, since interest lies in

the 2.4- and 5–5.825-GHz bands of operation, the P- and S-counters are programmed

accordingly. The proposed multiband flexible divider also uses an improved loadable bit-cell

for the swallow S-counter and consumes a power of 0.96 and 2.2 mW in the 2.4- and 5-GHz bands,

respectively, and provides a solution for low-power PLL synthesizers for Bluetooth,

Zigbee, IEEE 802.15.4, and IEEE 802.11a/b/g WLAN applications with variable channel

spacing.

8.2 REFERENCES

1. H. R. Rategh et al., “A CMOS frequency synthesizer with an injection-locked frequency

divider for 5-GHz wireless LAN receiver,” IEEE J. Solid-State Circuits, vol. 35, no. 5, pp.

780–787, May 2000.

2. P. Y. Deng et al., “A 5 GHz frequency synthesizer with an injection-locked frequency

divider and differential switched capacitors,” IEEE Trans. Circuits Syst. I, Reg. Papers,

vol. 56, no. 2, pp. 320–326, Feb. 2009.

3. L. Lai Kan Leung et al., “A 1-V 9.7-mW CMOS frequency synthesizer for IEEE 802.11a

transceivers,” IEEE Trans. Microw. Theory Tech., vol. 56, no. 1, pp. 39–48, Jan. 2008.

4. M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS Current-Mode Logic

Digital Circuits. New York: Springer, 2005.

5. Y. Ji-ren et al., “A true single-phase-clock dynamic CMOS circuit technique,” IEEE J.

Solid-State Circuits, vol. 24, no. 2, pp. 62–70, Feb. 1989.

6. S. Pellerano et al., “A 13.5-mW 5 GHz frequency synthesizer with dynamic-logic

frequency divider,” IEEE J. Solid-State Circuits, vol. 39, no. 2, pp. 378–383, Feb. 2004.

7. V. K. Manthena et al., “A low power fully programmable 1 MHz resolution 2.4 GHz

CMOS PLL frequency synthesizer,” in Proc. IEEE Biomed. Circuits Syst. Conf., Nov.

2007, pp. 187–190.

8. S. Shin et al., “4.2 mW frequency synthesizer for 2.4 GHz ZigBee application with fast

settling time performance,” in IEEE MTT-S Int. Microw. Symp. Dig., Jun. 2006, pp.

411–414.

9. S. Vikas et al., “1 V 7-mW dual-band fast-locked frequency synthesizer,” in Proc. 15th

ACM Symp. VLSI, 2005, pp. 431–435.

10. V. K. Manthena et al., “A 1.8-V 6.5-GHz low power wide band single-phase clock CMOS

2/3 prescaler,” in IEEE 53rd Midwest Symp. Circuits Syst., Aug. 2010, pp. 149–152.

11. J. M. Rabaey et al., “Digital integrated circuits, a design perspective,” in Ser. Electron

and VLSI, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 2003.

12. X. P. Yu et al., “Design and optimization of the extended true single-phase clock-based

prescaler,” IEEE Trans. Microw. Theory Tech., vol. 56, no. 11, pp. 3828–3835, Nov.

2006.

13. X. P. Yu et al., “Design of a low power wideband high resolution programmable

frequency divider,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 9, pp.

1098–1103, Sep. 2005.
