7/27/2019 A Low-Power 8-Read 4-Write Register File Design
http://slidepdf.com/reader/full/a-low-power-8-read-4-write-register-file-design 1/4
A Low-power 8-Read 4-Write Register File Design
Hao YAN1,2, Yan LIU1, Dong-hui WANG1, Chao-huan HOU1
Digital System Integration Lab
Institute of Acoustics, Chinese Academy of Sciences1
Graduate University of Chinese Academy of Sciences2
Beijing, China
Abstract — This paper gives a study of leakage in 90nm logic
CMOS technology. By analyzing the power breakdown in multi-ported register files and the leakage of different nMOS transistors under different supply voltages, a low-swing strategy for the bit lines is adopted to save power. An 8-read / 4-write port write-through register file is designed; it dissipates 10.04mW at 500MHz, saving 33.36% of the power in read operations and 24.8% of the energy on average. The leakage power on the bit lines is also reduced by 1.7% by using high threshold voltage transistors in the low-swing scheme.
I. INTRODUCTION
For mobile and handheld microprocessors, lowering power consumption for longer battery life is always a key design issue. The main energy consumers in a microprocessor core are the ALU and the on-chip memories, which contain the most critical speed paths and thus limit the microprocessor frequency.
Modern microprocessor architectures use various types of memories, such as caches, register files (RF), and sometimes content addressable memories (CAM). Among these on-chip memories, register files sit at the top of the memory hierarchy, are directly connected to the execution units, and require very fast access times.
Multi-ported register files are a basic building block of superscalar microprocessors [1]. They enable the concurrent execution of multiple instructions in a single cycle and consume one of the largest shares of power, at about 25%. The architecture of a multi-ported register file consists of a decode block and a memory array block. In a multi-ported register file, the energy cost is proportional to the number of ports, and the main contributions to power are the address decoding steps and the charging and discharging of the bit lines during read operations. The power consumption also changes with the decoder type. Generally, a NAND-structured decoder consumes less power, and pre-decoding further reduces the total switching capacitance.
The other part of the power consumption becomes even more serious as technology progresses. With CMOS technology scaling, aggressively low threshold devices result in an exponential increase in bit line active leakage currents and in poor bit line noise immunity [2]. To make the bit line robust, an additional circuit called a keeper is added to maintain the charge on the bit lines. However, apart from the die cost, this comes at the cost of increased bit line delay due to keeper contention.
According to reference [3], leakage increases by 2 to 3 times with each technology generation and consumes 30~50% of the power in the core. To reduce the energy lost to leakage, several modified memory core cells have been proposed [4] [5]. In references [6] [7], substrate/well bias schemes have also been proposed to enhance bit line leakage tolerance. However, a general N-well CMOS process cannot provide the special substrate bias, which restricts their application. Other leakage-tolerant techniques, such as reference [8], add a special control signal before a read operation occurs, and this signal always requires complex pre-conditioning control.
In this paper, a low-swing strategy is adopted not only to reduce the leakage power on the bit lines, but also to limit the total voltage swing in order to save power. Unlike the differential low-swing circuits of reference [9], this paper introduces a low-swing scheme for multi-ported register files. The scheme is very easy to use and gives about 24.8% power saving in active mode. Section II gives a detailed description of the low-swing strategy, section III presents the implementation and the simulation results, and section IV concludes the paper.
II. LOW-SWING STRATEGY IN BIT LINE
This section covers the details of the low-swing method for saving power. Typically, lowering the power supply is a powerful method for reducing both the active and the leakage power [10] [11]. As the former goes with the square of VDD and the latter is proportional to VDD, the whole power can be cut down by decreasing the VDD voltage. However, lowering VDD increases the gate delays, which goes against the high speed demands of register files. To keep the gate delays the same as before, low threshold voltage transistors are used to compensate for the lowered VDD. Taking 90nm CMOS logic technology as an example, however, VDD has already fallen to 1 volt. This means that adopting low Vth transistors leaves little room for lowering VDD further, and in more advanced processes lowering VDD becomes even more difficult.
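The scaling argument above can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `relative_power` and the 60% active-power share are hypothetical values chosen for illustration, not figures from this paper.

```python
def relative_power(vdd_scale, active_fraction):
    """Total power after scaling VDD, relative to the unscaled design.

    vdd_scale       -- new VDD divided by the original VDD
    active_fraction -- share of active power at the original VDD
                       (an assumed input, not a value from the paper)
    """
    leakage_fraction = 1.0 - active_fraction
    active = active_fraction * vdd_scale ** 2   # active power goes with VDD^2
    leakage = leakage_fraction * vdd_scale      # leakage power goes with VDD
    return active + leakage


if __name__ == "__main__":
    # Assumed split: 60% active power, 40% leakage power at nominal VDD.
    for scale in (1.0, 0.9, 0.8):
        print(f"VDD x {scale:.1f}: relative power {relative_power(scale, 0.6):.3f}")
```

The model ignores the gate delay penalty discussed in the same paragraph, which is the reason VDD cannot simply be lowered in a register file.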
Even though low threshold transistors can be used to lower VDD, the leakage caused by the low threshold becomes prominent. In the following part, the leakage in 90nm CMOS technology is studied as a guide for total power reduction in register files.
A. The Leakage in 90nm Logic CMOS Process
The leakage power component is the product of the off-current (sub-threshold, gate, etc.) and the supply voltage. Lowering the supply voltage VDD exponentially reduces the leakage power.
In this part, the leakage of various MOS transistors is studied using 90nm logic CMOS technology. There are mainly three types of nMOS transistors in 90nm CMOS technology. Fig. 1 shows the simulated leakage currents of the three nMOS transistor types as VDD changes from 0 to 1 volt. The transistors are all of minimum size. Compared with each other, the nMOS transistor with high threshold has the lowest channel leakage current, and the low Vth transistor has an extremely high leakage current, about 52 times larger than that of the high Vth transistor when VDD equals 1 volt. So, for the purpose of reducing the leakage power, low Vth MOS transistors should be avoided. Figure 1 also shows that only when VDD is reduced to about 0.15 volt does the leakage current of the low Vth transistor equal that of a normal nMOS transistor whose VDD is 1 volt. Based on these facts, using low threshold MOS transistors together with a lowered VDD does not work well for reducing leakage power consumption.
Figure 1. The simulation results of different threshold voltage nMOS transistors’ channel leakage currents as the supply voltage changes.
Figure 2. The leakage currents through the gates of the three types of nMOS transistors with minimum geometry in simulation.
Figure 2 depicts the currents that leak through the gates of the different nMOS transistor types as the input voltage rises from ground to VDD. The results show that these currents are very close to each other and can be neglected compared with the channel leakage currents in figure 1.
B. The Low-swing Strategy in Saving Power
This part gives a low-swing scheme for the bit lines that reduces the total charging power consumed on the bit lines; to fight the energy lost through leakage, low channel leakage transistors are used.
Nowadays, multi-ported register files always use a single-ended implementation to achieve high area density, and time-borrowable domino logic to enhance performance. Because the bit lines are segmented, additional power is consumed on the global bit lines, and parasitic effects and leakage degrade the whole chip’s performance as the technology scales.
In multi-ported register files, the main power consumption is in the read operation. During a read operation, the bit lines need to be charged and discharged frequently, so the power consumed on the bit lines makes up a large part of the total power. In the traditional precharge step, a pMOS transistor is used. The advantage of pMOS precharge is the full swing on the bit lines, with high noise immunity.
However, a full swing on the bit lines is not a must in a register file. Therefore a low-swing method can be used in register file design if the bit line is robust enough.
There are several ways to realize the low swing on the bit lines, such as precharging the bit line for a carefully calculated time with a special current source, or using feedback circuits to detect the level on the bit line. This paper uses a very simple method: nMOS transistors instead of pMOS transistors charge the bit lines. Because nMOS transistors pass a logic high poorly, the bit line can only be charged to VDD - Vth. Unlike a multi-voltage strategy, this method uses a single voltage supply, and the low-swing scheme focuses mainly on the voltage level of the bit lines. By switching the type of nMOS transistor, the voltage level on the bit lines can be adjusted slightly. The power consumed on the bit lines is then proportional to the square of VDD - Vth and is cut down directly. Suppose VDD - Vth is 0.5 volt with VDD at 1 volt; this gives about a 75% improvement in power on the bit lines, which is very attractive for reducing the energy.
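The 75% figure follows directly from the quadratic model stated above. The sketch below works through that arithmetic. As a point of comparison it also evaluates a more conservative estimate in which the energy drawn from the VDD rail per precharge is taken as C × VDD × swing; that second model is an assumption added here for illustration and is not part of the paper's analysis. The function name is hypothetical.

```python
def bitline_power_saving(vdd, vth):
    """Fractional bit line power saving of nMOS precharge versus full swing.

    Returns (quadratic, supply_charge):
      quadratic     -- the paper's model, power ~ (VDD - Vth)^2
      supply_charge -- assumed alternative, energy ~ C * VDD * (VDD - Vth)
    """
    swing = vdd - vth                       # bit line is charged to VDD - Vth
    quadratic = 1.0 - (swing / vdd) ** 2
    supply_charge = 1.0 - swing / vdd
    return quadratic, supply_charge


if __name__ == "__main__":
    quadratic, supply_charge = bitline_power_saving(vdd=1.0, vth=0.5)
    print(f"quadratic model:     {quadratic:.0%} saving")
    print(f"supply-charge model: {supply_charge:.0%} saving")
```

With VDD = 1 volt and VDD - Vth = 0.5 volt, the quadratic model gives the 75% quoted in the text, while the supply-charge model gives 50%.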
In full-swing bit line register files, the sense parts may use inverters to pass the results, with a threshold voltage of one half of VDD. After the swing on the bit lines is lowered, the threshold voltage of the sense parts must be lowered as well. Figure 3.a is an optional logic for the sense parts and figure 3.b is its voltage transfer characteristic curve. In figure 3.b, the threshold of this circuit is about 0.25 volt, and in actual implementations the threshold voltage can be designed according to the exact specification. There are many other ways to detect the bit line voltage. For example, sense amplifiers can be obtained simply by carefully sizing the transistors in the inverters to get the proper threshold voltage. Pre-charged domino logic or a differential sense amplifier can also be used to detect the voltage change on the bit lines. The latter needs to generate a lower voltage reference on-chip.
With the sense logic in figure 3, the low-swing method of charging the bit lines also does not affect the bit line segmentation. Therefore this scheme brings no trade-off in timing demands. On the contrary, by setting a reasonable sensing threshold, an improvement in charging speed is also obtained. However, too much sensitivity would also worsen the register file’s noise immunity. Thus, for the robustness of the bit lines, less sensitive amplifiers are recommended to compensate for the poor noise immunity brought by the low swing on the bit lines.
Figure 3. (a) The sense logic circuit in low-swing scheme. (b) The voltage transfer characteristic curve of this logic.
As the off-current in an nMOS transistor goes down with the voltage on the bit line, in this low-swing scheme the leakage power consumption is also cut down along with the active power consumption of the bit lines. From figure 1, when the supply is about half of VDD, the leakage current of low Vth MOS transistors is 4.76 times that of normal nMOS transistors and 34.5 times that of high Vth MOS transistors. This result implies that the energy lost to leakage could be cut down by about 34%, which is very powerful in situations where the influence of leakage is severe.
Figure 4 shows a leakage situation on the bit lines in a single-ended application. After the precharge step, the voltage on the bit line goes high (VDD with pMOS precharge, and VDD - Vth in the nMOS precharge scheme). If the data on the bit line is “0”, the charged bit line does not need to be discharged, but the charge stored on the bit line is lost all the time as a result of the leakage of the nMOS transistors.
In figure 4 the bit line is also connected to the gates of the sense amplifiers, and there is always a leakage current through these gates. Fortunately, the bit line leakage due to the sense amplifiers’ gates no longer needs to be considered, because these currents are negligible compared with the leakage currents in the read-out logic in 90nm CMOS technology.
With more advanced technology, the channel leakage currents cannot be ignored. Additionally, to make up for the voltage drops caused by the leakage, voltage keepers are always introduced for bit line robustness. But this keeps the voltage on the bit lines invariably equal to the supply voltage, which maximizes the leakage currents of the nMOS transistors.
Therefore, replacing the transistors in the read-out logic path with high Vth nMOS transistors can control the leakage currents when these transistors are turned off. From figure 1, the leakage current of high Vth nMOS transistors is below 0.3 nA at the smallest geometry. So, combined with the low-swing method, the leakage currents can be reduced greatly. As a result, the voltage keeper is no longer needed. This holds in 90nm CMOS technology using smallest-size transistors when the bit line segment is less than 32 bits; below 65nm, however, the leakage becomes extremely serious, and a keeper may be added to increase the stability of the bit lines. In other words, fine segmentation of the bit lines is needed to alleviate leakage.
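A rough first-order estimate helps show why such small leakage currents can make a keeper unnecessary. The sketch below computes the voltage droop on a floating bit line as total leakage current times hold time divided by bit line capacitance. Only the 0.3 nA bound, the 52 times ratio and the 32-cell bit line come from this paper; the 50 fF bit line capacitance and the 1 ns hold time are hypothetical values assumed purely for illustration.

```python
def bitline_droop(cells, leak_per_cell_a, hold_time_s, bitline_cap_f):
    """First-order droop (volts) of a floating bit line: dV = I * t / C."""
    total_leak_a = cells * leak_per_cell_a      # all cells leak in parallel
    return total_leak_a * hold_time_s / bitline_cap_f


if __name__ == "__main__":
    CELLS = 32             # cells sharing one bit line (from the paper)
    HVT_LEAK_A = 0.3e-9    # upper bound for high Vth nMOS (from the paper)
    LVT_RATIO = 52         # low Vth versus high Vth leakage at VDD = 1 V
    HOLD_TIME_S = 1e-9     # assumed: half a cycle at 500 MHz
    BITLINE_CAP_F = 50e-15 # assumed bit line capacitance

    hvt = bitline_droop(CELLS, HVT_LEAK_A, HOLD_TIME_S, BITLINE_CAP_F)
    lvt = bitline_droop(CELLS, HVT_LEAK_A * LVT_RATIO, HOLD_TIME_S, BITLINE_CAP_F)
    print(f"high Vth droop: {hvt * 1e3:.3f} mV")
    print(f"low Vth droop:  {lvt * 1e3:.3f} mV")
```

Under these assumed values the high Vth droop stays well below a millivolt, which is consistent with the claim that no keeper is needed at 90nm; the estimate scales linearly with the number of cells per segment and with the leakage current per cell.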
Figure 4. The leakages on the bit line through read out logics.
III. IMPLEMENTATION AND SIMULATION RESULTS
To show the performance of the low-swing strategy in saving power, an 8-read / 4-write write-through register file is designed. This multi-ported register file is organized as 32 words × 32 bits. Figure 5 depicts the timing of this register file and figure 6 gives its layout. In this register file, the bit lines are not segmented, and the bit lines of the 32 cells are charged or discharged simultaneously.
Figure 5. The timing waveform of this register file.
Figure 6. The layout of this register file. (Labeled blocks: read ports, read control logic, two 16-bit memory arrays, pre-charge logic, decoder block, write ports, address ports and priority encoder.)
In this register file, each port has its own gated clock, and when the enable signal is invalid, the port is shut down to save power. The decoder is implemented with a Source Coupled Logic circuit.
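The port organization described above can be summarized in a small behavioral model. This is a functional sketch only, not the circuit: the class name, the method signature and the rule that a later write port overrides an earlier one on an address conflict are all assumptions made for illustration (the paper mentions a priority encoder but does not give its ordering).

```python
class RegisterFile:
    """Behavioral model: 32 words x 32 bits, 8 read ports, 4 write ports."""

    WORDS, BITS, READ_PORTS, WRITE_PORTS = 32, 32, 8, 4

    def __init__(self):
        self.cells = [0] * self.WORDS

    def cycle(self, writes, reads):
        """Run one clock cycle.

        writes -- list of (enable, addr, data), at most 4 entries
        reads  -- list of (enable, addr), at most 8 entries
        Returns one value per read port, or None for a disabled port.
        """
        if len(writes) > self.WRITE_PORTS or len(reads) > self.READ_PORTS:
            raise ValueError("too many ports used in one cycle")
        mask = (1 << self.BITS) - 1
        # Writes are applied first, so a read of the same address in the
        # same cycle returns the new data (write-through behavior).
        # Assumption: on an address conflict the later port in the list wins.
        for enable, addr, data in writes:
            if enable:                      # a disabled port is gated off
                self.cells[addr] = data & mask
        return [self.cells[addr] if enable else None for enable, addr in reads]


if __name__ == "__main__":
    rf = RegisterFile()
    out = rf.cycle(writes=[(True, 3, 0xDEADBEEF), (False, 4, 1)],
                   reads=[(True, 3), (True, 4), (False, 3)])
    print(out)
```

In the example call, the first read port sees the value written in the same cycle, the second sees an untouched word, and the third is disabled.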
The power consumption of this register file is measured under the condition that all 12 ports are working and the switching activity of each port is above 1/2. Table I shows the post-simulation results of this register file at 500MHz in the typical corner. The sense logic in this register file is an inverter with a designed threshold, used only for comparison with the inverter sensing in the non-low-swing register file. The timing parameters of the register files are almost the same. The read access time of this register file is 1.4ns in the worst case.
From table I, the low-swing scheme plays a great role in reducing the power dissipation and saves 24.8% energy compared with the design not using low swing. By replacing the transistors in the read-out logic of the memory core cell, the high Vth nMOS transistors consume the lowest energy, about a 1.7% power improvement within the low-swing scheme. These results show that the leakage on the bit lines in 90nm CMOS technology is not very severe. However, a lot of leakage takes place in the memory core cells, and this leakage makes up a large part of the leakage power. If the transistors in the cross-coupled inverters in the core cell are all changed into high threshold voltage transistors, the power is reduced to 10.02mW. This gives a small additional power improvement and shows that the leakage in the cross-coupled inverters is not significant. In fact, the leakage through the write and read logic in the core cells is the main source of core cell leakage, especially in a multi-ported register file.
In the read cycles, the low-swing scheme saves 33.36% power. By using high Vth nMOS transistors, the energy consumed by leakage is reduced by about 2%. This 2% saving is within the low-swing scheme and is typically above 2% compared with non-low-swing applications.
In this multi-ported register file, the source coupled logic decoder block also dissipates a large part of the power because of frequent charging and discharging. If the decoder is changed to the NAND structure, the power saving will be even more prominent.
TABLE I. THE POWER SUMMARY (POWER IN mW).
         Non low-swing   Low-swing
                         Bit lines                      Core
         N10             Nlvt10    N10      Nhvt10      Nhvt10
Write    3.65            3.63      3.59     3.57        3.55
Read     9.71            6.59      6.57     6.47        6.47
Total    13.36           10.22     10.16    10.04       10.02
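The percentages quoted in the text can be checked against Table I with a short script. The power values are taken from the table; the dictionary keys and function names are hypothetical labels introduced here, and the mapping of columns to configurations follows the table header as reconstructed from the extracted text.

```python
# Power in mW from Table I (write and read rows; totals are their sum).
TABLE_I_MW = {
    "non_low_swing_N10":        {"write": 3.65, "read": 9.71},
    "low_swing_bitline_Nlvt10": {"write": 3.63, "read": 6.59},
    "low_swing_bitline_N10":    {"write": 3.59, "read": 6.57},
    "low_swing_bitline_Nhvt10": {"write": 3.57, "read": 6.47},
    "low_swing_core_Nhvt10":    {"write": 3.55, "read": 6.47},
}


def total(config):
    """Total power of one configuration in mW."""
    row = TABLE_I_MW[config]
    return row["write"] + row["read"]


def saving_percent(before, after):
    """Relative saving in percent when going from 'before' to 'after'."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    base, hvt = "non_low_swing_N10", "low_swing_bitline_Nhvt10"
    lvt = "low_swing_bitline_Nlvt10"
    print(f"total saving: {saving_percent(total(base), total(hvt)):.2f}%")
    print(f"read saving:  "
          f"{saving_percent(TABLE_I_MW[base]['read'], TABLE_I_MW[hvt]['read']):.2f}%")
    print(f"Nlvt10 to Nhvt10 saving: {saving_percent(total(lvt), total(hvt)):.2f}%")
```

The three results are approximately 24.85%, 33.37% and 1.76%, in line with the 24.8%, 33.36% and 1.7% stated in the text.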
IV. CONCLUSION
This paper proposed a low-swing strategy for the bit line charging step. Based on a study of the leakage of nMOS transistors with different threshold voltages in a 90nm logic CMOS process, a multi-ported register file with leakage-resistant low-swing bit lines is implemented. It is an 8-read / 4-write write-through register file organized as 32 words × 32 bits without bit line segmentation. The post-simulation results show that the low-swing strategy reduces the total power consumption by 24.8% and the power in read operations by 33.36%. The energy saving from leakage is about 1.7% in the low-swing scheme in the active state, and in the non-active mode the leakage current can be reduced greatly by the high Vth transistors.
REFERENCES
[1] W. Hwang, R. V. Joshi, and W. H. Henkels, “A 500-MHz, 32-Word × 64-Bit, Eight-Port Self-Resetting CMOS Register File,” IEEE J. Solid-State Circuits, vol. 34, p. 56, 1999. A. N. Netravali and B. G. Haskell, Digital Pictures, 2nd ed. New York: Plenum Press, 1995, pp. 613-651.
[2] R. Krishnamurthy, et al., “A 130-nm 6-GHz 256x32b leakage-tolerant register file,” IEEE Journal of Solid-State Circuits, vol. 37, pp. 624-632, May 2002.
[3] V. De and S. Borkar, “Technology and Design Challenges for Low Power and High Performance,” in Proceedings International Symposium Low Power Electronics Design, Aug. 1999, pp. 163-168.
[4] Shengqi Yang, “Low-leakage robust SRAM cell design for sub-100nm technologies,” Proceedings of the ASP-DAC 2005, Asia and South Pacific, vol. 1, pp. 539-544, 2005.
[5] S. K. Jain and P. Agarwal, “A low leakage and SNM free SRAM cell design in deep sub micron CMOS technology,” VLSI Design, 2006.
[6] H. Kawaguchi, et al., “Dynamic leakage cut-off scheme for low-voltage SRAMs,” in Symposium VLSI Circuits Digest Technical Papers, June 1998, pp. 140-141.
[7] T. Kuroda, et al., “A 0.9V 150MHz 10mW 4mm² 2-D discrete cosine transform core processor with variable threshold voltage scheme,” IEEE Journal Solid-State Circuits, vol. 31, pp. 1770-1779, Nov. 1996.
[8] K. Agawa, et al., “A bitline leakage compensation scheme for low-voltage SRAMs,” IEEE Journal Solid-State Circuits, vol. 36, pp. 726-734, May 2001.
[9] D. Deleganes, et al., “Low-voltage swing logic circuits for a Pentium® 4 processor integer core,” IEEE Journal Solid-State Circuits, vol. 40, pp. 36-43, Jan. 2005.
[10] D. Liu and C. Svensson, “Trading speed for low power by choice of supply and threshold voltage,” IEEE Journal Solid-State Circuits, vol. 28, pp. 10-17, January 1993.
[11] R. Gonzalez, et al., “Supply and threshold voltage scaling for low power CMOS,” IEEE Journal Solid-State Circuits, vol. 32, pp. 1210-1216, Aug. 1997.