

A Low-power 8-Read 4-Write Register File Design

Hao YAN (1,2), Yan LIU (1), Dong-hui WANG (1), Chao-huan HOU (1)

Digital System Integration Lab

(1) Institute of Acoustics, Chinese Academy of Sciences

(2) Graduate University of Chinese Academy of Sciences

Beijing, China

[email protected]

Abstract: This paper studies leakage in 90nm logic CMOS technology. Based on an analysis of the power breakdown in multi-ported register files and of the leakage of different nMOS transistors at different supply voltages, a low-swing strategy for the bit lines is adopted to save power. An 8-read / 4-write port write-through register file is designed. It dissipates 10.04mW at 500MHz, saving 33.36% of the power in read operations and 24.8% of the energy on average. The leakage power on the bit lines is also reduced by 1.7% by using high-threshold-voltage transistors in the low-swing scheme.

I.  INTRODUCTION

For mobile and handheld microprocessors, lowering power consumption to extend battery life is always a key design issue. Most of the energy in a microprocessor core is consumed in the ALU and the on-chip memories, which also contain the most critical speed paths and therefore limit the microprocessor frequency.

Modern microprocessor architectures use various types of memory, such as caches, register files (RF), and sometimes content addressable memories (CAM). Among these on-chip memories, register files sit at the top of the memory hierarchy, connect directly to the execution units, and therefore require very fast access times.

Multi-ported register files are a basic building block of superscalar microprocessors [1]. They enable concurrent execution of multiple instructions in a single cycle and consume one of the largest shares of power, about 25%. The architecture of a multi-ported register file consists of a decode block and a memory array block. The energy cost of a multi-ported register file is proportional to its number of ports, and the main power contributions come from address decoding and from charging and discharging the bit lines during read operations. The power consumption also depends on the decoder type. Generally, a NAND-structure decoder consumes less power, and pre-decoding further reduces the total switching capacitance. The bit line part of the power consumption becomes even more serious as technology advances. With CMOS technology scaling, aggressively low threshold devices result in an exponential increase in bit line active leakage currents and poor bit line noise immunity [2]. To make the bit line robust, an additional circuit called a keeper is added to maintain the charge on the bit lines. However, besides the die cost, this comes at the cost of increased bit line delay due to keeper contention.

According to reference [3], leakage increases by a factor of 2 to 3 with each technology generation and consumes 30~50% of the power in the core. To reduce the energy lost to leakage, several modified memory core cells have been proposed [4] [5]. Substrate/well bias schemes have also been proposed to enhance bit line leakage tolerance [6] [7]. However, a general N-well CMOS process cannot provide the special substrate bias, which restricts their application. Other leakage-tolerant techniques, such as reference [8], add a special control signal before a read operation occurs, and this signal always requires complex pre-conditioning control.

In this paper, a low-swing strategy is adopted that not only reduces the leakage power on the bit lines but also limits the total voltage swing to save power. Unlike the differential low-swing circuits of reference [9], this paper introduces a low-swing scheme for multi-ported register files. The scheme is very easy to use and gives about 24.8% power saving in active mode. Section II describes the low-swing strategy in detail, Section III presents the implementation and the simulation results, and Section IV concludes the paper.

II.  LOW-SWING STRATEGY IN BIT LINE

This section covers the details of the low-swing method for saving power. Typically, lowering the power supply is a powerful method for reducing both active and leakage power [10] [11]. Because active power goes with the square of VDD and leakage power is proportional to VDD, the total power can be cut by decreasing VDD. However, lowering VDD increases gate delays, which conflicts with the high-speed demands on register files. To keep the gate delays unchanged, low-threshold-voltage transistors are used to compensate for the lowered VDD. Taking 90nm CMOS logic technology as an example, however, VDD has already fallen to 1 volt, so adopting low Vth transistors leaves little room for lowering VDD further. In more advanced processes, lowering VDD becomes even more difficult.
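The scaling argument above can be illustrated with a small numerical sketch. It assumes only what the text states (active power proportional to the square of VDD, leakage power proportional to VDD). The function name and the 80/20 split between active and leakage power at the reference supply are hypothetical values chosen for illustration, not figures from the paper.

```python
def relative_power(vdd, vdd_ref=1.0, active_fraction=0.8):
    """Total power at supply vdd, relative to the power at vdd_ref.

    Model taken from the text: active power scales with VDD**2 and
    leakage power scales with VDD. active_fraction is a hypothetical
    share of active power in the total at vdd_ref (assumption).
    """
    scale = vdd / vdd_ref
    active = active_fraction * scale ** 2
    leakage = (1.0 - active_fraction) * scale
    return active + leakage


if __name__ == "__main__":
    for vdd in (1.0, 0.9, 0.8, 0.7):
        print(f"VDD = {vdd:.1f} V -> relative power {relative_power(vdd):.3f}")
```

The sketch only captures the power side of the trade-off. The gate delay penalty that the text describes is not modeled.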

Even though low-threshold transistors allow VDD to be lowered, the leakage caused by the low threshold becomes prominent. The following part studies leakage in 90nm CMOS technology as a guide for total power reduction in register files.

 A.  The Leakage in 90nm Logic CMOS Process

The leakage power component is the product of the off-current (sub-threshold, gate, etc.) and the supply voltage. Lowering the supply voltage VDD exponentially reduces the leakage power.

In this part, the leakage of various MOS transistors is studied using 90nm logic CMOS technology. There are mainly three types of nMOS transistors in 90nm CMOS technology. Figure 1 shows the simulated leakage currents of the three types of nMOS transistors as VDD changes from 0 to 1 volt. All transistors are minimum size. Of the three, the high-threshold nMOS transistor has the lowest channel leakage current, and the low Vth transistor has an extremely high leakage current, about 52 times that of the high Vth transistor when VDD equals 1 volt. To reduce leakage power, low Vth MOS transistors should therefore be avoided. Figure 1 also shows that VDD must be reduced to about 0.15 volt before the leakage current of the low Vth transistor equals that of the normal nMOS transistor at a VDD of 1 volt. Based on these observations, using low-threshold MOS transistors together with a lowered VDD does not work well for shrinking leakage power consumption.

Figure 1. Simulated channel leakage currents of nMOS transistors with different threshold voltages as the supply voltage changes.

Figure 2. Simulated gate leakage currents of the three types of nMOS transistors with minimum geometry.

Figure 2 depicts the currents that leak through the gates of the different types of nMOS transistors as the input voltage rises from ground to VDD. The results show that these currents are very close to each other and can be neglected compared with the channel leakage currents in Figure 1.

 B.  The Low-swing Strategy in Saving Power 

This part presents a low-swing scheme that reduces the total charging power consumed on the bit lines. To counter the energy lost through leakage, transistors with low channel leakage are used.

Nowadays, multi-ported register files usually use a single-ended implementation to achieve high area density and time-borrowable domino logic to enhance performance. Because the bit lines are segmented, additional power is consumed on the global bit lines, and parasitic effects and leakage degrade the performance of the whole chip as technology scales.

In multi-ported register files, the main power consumption is in the read operation. During a read operation, the bit lines are charged and discharged frequently, so the power consumed on the bit lines makes up a large part of the total power. The traditional precharge step uses a pMOS transistor. The advantage of pMOS precharge is a full swing on the bit lines with high noise immunity.

However, a full swing on the bit lines is not a must in a register file. A low-swing method can therefore be used in register file design if the bit line is robust enough.

There are several ways to realize a low swing on the bit lines, such as precharging the bit line for a carefully calculated time with a special current source, or using feedback circuits to detect the level on the bit line. This paper uses a very simple method: nMOS transistors instead of pMOS transistors charge the bit lines. Because nMOS transistors pass a logic high poorly, the bit line can only be charged to VDD - Vth. Unlike a multi-voltage strategy, this method uses a single supply voltage, and the low-swing scheme mainly concerns the voltage level of the bit lines. By switching the type of nMOS transistor, the voltage level on the bit lines can be adjusted slightly. The power consumed on the bit lines is then proportional to the square of VDD - Vth and is cut directly. Suppose VDD - Vth is 0.5 volt with a VDD of 1 volt. This gives about a 75% improvement in bit line power, which is very attractive for reducing energy.
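The 75% figure can be reproduced with a short calculation. The sketch below follows the model stated in the text, where bit line power scales with the square of the bit line swing. The function name is hypothetical, and the 1 volt supply is taken from the earlier remark about 90nm technology.

```python
def bitline_power_saving(vdd, v_swing):
    """Fractional bit line power saving of a reduced swing versus full swing.

    Uses the text's model that bit line power is proportional to the
    square of the swing, so the ratio to full swing is (v_swing / vdd)**2.
    """
    if not 0.0 < v_swing <= vdd:
        raise ValueError("swing must be positive and no larger than VDD")
    return 1.0 - (v_swing / vdd) ** 2


if __name__ == "__main__":
    # VDD = 1 V and VDD - Vth = 0.5 V, the case discussed in the text.
    saving = bitline_power_saving(vdd=1.0, v_swing=0.5)
    print(f"bit line power saving: {saving:.0%}")
```

Because the saving grows quadratically, even a modest threshold drop on the precharge device removes a large share of the bit line power under this model.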

In full-swing register files, the sense stage may use inverters to pass the results, whose threshold voltage is one half of VDD. After the swing on the bit lines is lowered, the threshold voltage of the sense stage must be lowered as well. Figure 3(a) shows one possible sense logic circuit and Figure 3(b) shows its voltage transfer characteristic. In Figure 3(b), the threshold of this circuit is about 0.25 volt. In actual implementations, the threshold voltage can be designed to meet the exact specification. There are many other ways to detect the bit line voltage. For example, a sense amplifier can be obtained simply by carefully sizing the transistors in an inverter to get the proper threshold voltage. Pre-charged domino logic or a differential sense amplifier can also be used to detect the voltage change on the bit lines. The latter needs a lower reference voltage generated on-chip.
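How transistor sizing moves an inverter's threshold can be sketched with the textbook long-channel (square-law) expression for the switching point. This is a first-order illustration only, not the circuit of Figure 3 and not a 90nm-accurate model. The device thresholds of 0.2 volt and both function names are hypothetical assumptions.

```python
import math


def switching_threshold(vdd, vtn, vtp_abs, kp_over_kn):
    """Switching point of a static CMOS inverter, square-law model.

    kp_over_kn is the pMOS-to-nMOS transconductance ratio set by sizing.
    Derived by equating the saturation currents of both devices.
    """
    r = math.sqrt(kp_over_kn)
    return (vtn + r * (vdd - vtp_abs)) / (1.0 + r)


def ratio_for_threshold(vdd, vtn, vtp_abs, vm_target):
    """Transconductance ratio kp/kn that places the switching point at vm_target."""
    r = (vm_target - vtn) / (vdd - vtp_abs - vm_target)
    if r <= 0.0:
        raise ValueError("target must lie between vtn and vdd - |vtp|")
    return r * r


if __name__ == "__main__":
    # Hypothetical device thresholds (assumptions, not from the paper).
    vdd, vtn, vtp_abs = 1.0, 0.2, 0.2
    ratio = ratio_for_threshold(vdd, vtn, vtp_abs, vm_target=0.25)
    print(f"kp/kn needed for a 0.25 V threshold: {ratio:.4f}")
    print(f"check: {switching_threshold(vdd, vtn, vtp_abs, ratio):.3f} V")
```

Under these assumptions a threshold as low as 0.25 volt needs an nMOS device far stronger than the pMOS device, which suggests why a dedicated sense circuit may be preferable to a plain skewed inverter.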

With the sense logic in Figure 3, the low-swing method of charging the bit lines does not affect the bit line segmentation, so the scheme brings no trade-off in timing. On the contrary, with a reasonable sensing threshold, charging speed also improves. However, too much sensitivity would worsen the register file's noise immunity. For bit line robustness, less sensitive amplifiers are therefore recommended to compensate for the poor noise immunity brought by the low swing on the bit lines.


Figure 3. (a) The sense logic circuit in the low-swing scheme. (b) The voltage transfer characteristic curve of this logic.

Because the off-current in an nMOS transistor falls with the voltage on the bit line, the low-swing scheme cuts leakage power along with the bit line active power. From Figure 1, when the power supply is about half of VDD, the leakage current of low Vth MOS transistors is 4.76 times that of normal nMOS transistors and 34.5 times that of high Vth MOS transistors. This result implies that the energy lost to leakage could be cut by about 34%, which is very valuable where the influence of leakage is severe.

Figure 4 shows a leakage situation on the bit lines in a single-ended application. After the pre-charge step, the voltage on the bit line goes high (VDD with pMOS precharge, VDD - Vth with the nMOS precharge scheme). If the data on the bit line is "0", the charged bit line does not need to be discharged, but the charge stored on the bit line is continuously lost through the leakage of the nMOS transistors.

In Figure 4 the bit line is also connected to the gates of the sense amplifiers, so there is always a leakage current through these gates. Fortunately, the bit line leakage through the sense amplifier gates need not be considered, because these currents are negligible compared with the leakage currents in the read-out logic in 90nm CMOS technology.

As technology advances, the channel leakage currents cannot be ignored. In addition, to make up for the voltage drop caused by leakage, voltage keepers are usually introduced for bit line robustness. But a keeper holds the bit line voltage at the supply voltage at all times, which maximizes the leakage currents of the nMOS transistors.

Replacing the transistors in the read-out logic path with high Vth nMOS transistors therefore controls the leakage currents when these transistors are turned off. From Figure 1, the leakage current of a high Vth nMOS transistor is below 0.3 nA at the smallest geometry. Combined with the low-swing method, the leakage currents can be reduced greatly, and as a result the voltage keeper is no longer needed. This holds in 90nm CMOS technology with smallest-size transistors when the bit line segment is less than 32 bits. Below 65nm, however, leakage becomes extremely serious, and a keeper may have to be added to increase bit line stability. In other words, fine bit line segmentation is needed to alleviate leakage.
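A rough estimate suggests why a keeper may be unnecessary at this leakage level. The sketch below computes the voltage droop of a floating bit line from the charge lost to leakage. The 0.3 nA per-device bound and the 32 cells come from the text. The bit line capacitance of 50 fF and the 1 ns hold time (half of a 500MHz cycle) are assumptions for illustration only, and the function name is hypothetical.

```python
def bitline_droop(i_leak_per_cell, n_cells, t_hold, c_bitline):
    """Voltage lost on a floating bit line: dV = I_total * t / C.

    Treats the leakage current as constant over the hold time, which
    overestimates the droop slightly as the bit line voltage falls.
    """
    return i_leak_per_cell * n_cells * t_hold / c_bitline


if __name__ == "__main__":
    droop = bitline_droop(
        i_leak_per_cell=0.3e-9,  # upper bound per high Vth device (from the text)
        n_cells=32,              # cells sharing one bit line (from the text)
        t_hold=1e-9,             # assumed hold time, half a 500 MHz cycle
        c_bitline=50e-15,        # assumed bit line capacitance
    )
    print(f"estimated droop: {droop * 1e3:.3f} mV")
```

Under these assumptions the droop stays far below a sensing threshold of a few hundred millivolts. The margin shrinks as per-device leakage or the number of cells per bit line grows, which matches the text's remark about finer segmentation in more advanced processes.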

Figure 4. Leakage on the bit line through the read-out logic.

III.  IMPLEMENTATION AND SIMULATION RESULTS

To show the performance of the low-swing strategy in saving power, an 8-read / 4-write write-through register file is designed. This multi-ported register file is organized as 32 words × 32 bits. Figure 5 depicts the timing of this register file and Figure 6 gives its layout. In this register file the bit lines are not segmented, and the bit lines of the 32 cells are charged or discharged simultaneously.

Figure 5. The timing waveform of this register file.

Figure 6. The layout of this register file. (Blocks labeled in the figure: read ports, read control logic, two 16-bit memory arrays, pre-charge logic, decoder block, write ports, address ports and priority encoder.)

In this register file, each port has its own gated clock. When its enable signal is inactive, the port is shut down to save power. The decoder is implemented with Source Coupled Logic circuits.


The power consumption of this register file is measured with all 12 ports working and a switching activity above 1/2 on each port. Table I shows the post-simulation results of this register file at 500MHz in the typical corner. The sense logic in this register file is an inverter with a designed threshold, chosen for comparison with the inverter sensing in the non-low-swing register file. The timing parameters of the register files are almost the same. The worst-case read access time of this register file is 1.4ns.

From Table I, the low-swing scheme plays a great role in reducing power dissipation and saves 24.8% of the energy compared with the design without low swing. When the transistors in the read-out logic of the memory core cell are replaced, the high Vth nMOS transistors consume the least energy, an improvement of about 1.7% in power within the low-swing scheme. These results show that leakage on the bit lines in 90nm CMOS technology is not very severe. However, a lot of leakage takes place in the memory core cells and accounts for a large part of the leakage power. If the transistors in the cross-coupled inverters of the core cell are all changed to high-threshold-voltage transistors, the power is reduced to 10.02mW. This gives only a small additional improvement and shows that leakage in the cross-coupled inverters is not significant. In fact, leakage through the write and read logic in the core cells is the main source of core cell leakage, especially in a multi-ported register file.

In read cycles, the low-swing scheme saves 33.36% of the power. Using high Vth nMOS transistors reduces the energy consumed by leakage by about 2%. This 2% saving is measured within the low-swing scheme and is typically above 2% compared with non-low-swing applications.

In this multi-ported register file, the source coupled logic decoder block also dissipates a large share of the power because of frequent charging and discharging. If the decoder were changed to a NAND structure, the power saving would be even more prominent.

TABLE I. THE POWER SUMMARY.

Power (mW) | Non low-swing | Low-swing, bit lines       | Low-swing, core
           | N10           | Nlvt10 | N10    | Nhvt10   | Nhvt10
Write      | 3.65          | 3.63   | 3.59   | 3.57     | 3.55
Read       | 9.71          | 6.59   | 6.57   | 6.47     | 6.47
Total      | 13.36         | 10.22  | 10.16  | 10.04    | 10.02
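The headline savings can be checked directly against the Table I figures. The sketch below recomputes them and also converts the 10.04mW result at 500MHz into energy per cycle. The dictionary keys reflect one reading of the table's column grouping and are assumptions, as are the function names.

```python
# Power in mW, copied from Table I. The key names are an assumed reading
# of the column headers (transistor type used in bit line or core).
TABLE_I_MW = {
    "non_low_swing_N10": {"write": 3.65, "read": 9.71, "total": 13.36},
    "low_swing_bitline_Nlvt10": {"write": 3.63, "read": 6.59, "total": 10.22},
    "low_swing_bitline_N10": {"write": 3.59, "read": 6.57, "total": 10.16},
    "low_swing_bitline_Nhvt10": {"write": 3.57, "read": 6.47, "total": 10.04},
    "low_swing_core_Nhvt10": {"write": 3.55, "read": 6.47, "total": 10.02},
}


def saving(baseline_mw, new_mw):
    """Fractional power saving of new_mw relative to baseline_mw."""
    return (baseline_mw - new_mw) / baseline_mw


def energy_per_cycle_pj(power_mw, freq_mhz):
    """Average energy per clock cycle in picojoules."""
    return (power_mw * 1e-3) / (freq_mhz * 1e6) * 1e12


if __name__ == "__main__":
    base = TABLE_I_MW["non_low_swing_N10"]
    best = TABLE_I_MW["low_swing_bitline_Nhvt10"]
    print(f"total saving: {saving(base['total'], best['total']):.2%}")
    print(f"read saving:  {saving(base['read'], best['read']):.2%}")
    print(f"energy/cycle: {energy_per_cycle_pj(best['total'], 500):.2f} pJ")
```

The recomputed values agree with the 24.8% and 33.36% quoted in the text to within rounding.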

IV.  CONCLUSION 

This paper proposed a low-swing strategy for the bit line charging step. Based on a study of the leakage of nMOS transistors with different threshold voltages in a 90nm logic CMOS process, a multi-ported register file with leakage-resistant low-swing bit lines is implemented. It is an 8-read / 4-write write-through register file organized as 32 words × 32 bits without bit line segmentation. The post-simulation results show that the low-swing strategy reduces the total power consumption by 24.8% and the read power by 33.36%. The energy saving in leakage is about 1.7% in the low-swing scheme in the active state, and in non-active mode the leakage current can be reduced greatly by the high Vth transistors.

REFERENCES

[1]  Wei Hwang, Rajiv V. Joshi, Walter H. Henkels, "A 500-MHz, 32-Word × 64-Bit, Eight-Port Self-Resetting CMOS Register File," IEEE J. Solid-State Circuits, 1999, 34:56. A. N. Netravali and B. G. Haskell, Digital Pictures, 2nd ed., Plenum Press: New York, 1995, pp. 613-651.

[2]  R. Krishnamurthy, et al., "A 130-nm 6-GHz 256x32b leakage-tolerant register file," IEEE Journal of Solid-State Circuits, vol. 37, pp. 624-632, May 2002.

[3]  De and S. Borkar, "Technology and Design Challenges for Low Power and High Performance," in Proceedings International Symposium Low Power Electronics Design, Aug. 1999, pp. 163-168.

[4]  Shengqi Yang, "Low-leakage robust SRAM cell design for sub-100nm technologies," Proceedings of the ASP-DAC 2005. Asia and South Pacific, vol. 1, pp. 539-544, 2005.

[5]  S. K. Jain, P. Agarwal, "A low leakage and SNM free SRAM cell design in deep sub micron CMOS technology," VLSI Design, 2006.

[6]  H. Kawaguchi, et al., "Dynamic leakage cut-off scheme for low-voltage SRAMs," in Symposium VLSI Circuits Digest Technical Papers, June 1998, pp. 140-141.

[7]  T. Kuroda, et al., "A 0.9V 150MHz 10mW 4mm2 2-D discrete cosine transform core processor with variable threshold voltage scheme," IEEE Journal Solid-State Circuits, vol. 31, pp. 1770-1779, Nov. 1996.

[8]  K. Agawa, et al., "A bitline leakage compensation scheme for low-voltage SRAMs," IEEE Journal Solid-State Circuits, vol. 36, pp. 726-734, May 2001.

[9]  D. Deleganes, et al., "Low-voltage swing logic circuits for a Pentium® 4 processor integer core," IEEE Journal Solid-State Circuits, vol. 40, pp. 36-43, Jan. 2005.

[10]  D. Liu and C. Svensson, "Trading speed for low power by choice of supply and threshold voltage," IEEE Journal Solid-State Circuits, vol. 28, pp. 10-17, January 1993.

[11]  R. Gonzalez, et al., "Supply and threshold voltage scaling for low power CMOS," IEEE Journal Solid-State Circuits, vol. 32, pp. 1210-1216, Aug. 1997.