
MIPS: A Microprocessor Architecture

John Hennessy, Norman Jouppi, Steven Przybylski, Christopher Rowen, Thomas Gross, Forest Baskett, and John Gill

Departments of Electrical Engineering and Computer Science
Stanford University

Abstract

MIPS is a new single chip VLSI microprocessor. It attempts to achieve high performance with the use of a simplified instruction set, similar to those found in microengines. The processor is a fast pipelined engine without pipeline interlocks. Software solutions to several traditional hardware problems, such as providing pipeline interlocks, are used.

Introduction

MIPS (Microprocessor without Interlocked Pipe Stages) is a new general purpose microprocessor architecture designed to be implemented on a single VLSI chip. The main goal of the design is high performance in the execution of compiled code. The architecture is experimental since it is a radical break with the trend of modern computer architectures. The basic philosophy of MIPS is to present an instruction set that is a compiler-driven encoding of the microengine. Thus, little or no decoding is needed and the instructions correspond closely to microcode instructions. The processor is pipelined but provides no pipeline interlock hardware; this function must be provided by software.

The MIPS architecture presents the user with a fast machine with a simple instruction set. This approach has been used by the IBM 801 project [1] and is currently being explored by the RISC project at Berkeley [2]; it is directly opposed to the approach taken by architectures such as the VAX. However, there are significant differences between the RISC approach and the approach used in MIPS:

1. The RISC architecture is simple both in the instruction set and the hardware needed to implement that instruction set. Although the MIPS instruction set has a simple hardware implementation (i.e. it requires a minimal amount of hardware control), the user level instruction set is not as straightforward, and the simplicity of the user level instruction set is secondary to the performance goals.

2. The thrust of the RISC design is towards efficient implementation of a straightforward instruction set. In the MIPS design, high performance from the hardware engine is a primary goal, and the microengine is presented to the end user with a minimal amount of interpretation. This makes most of the microengine's parallelism available at the instruction set level.

3. The RISC project relies on a straightforward instruction set and straightforward compiler technology. MIPS will require more sophisticated compiler technology and will gain significant performance benefits from that technology. The compiler technology allows a microcode-level instruction set to appear like a normal instruction set to both code generators and assembly language programmers.

The MIPS architecture is closer to the 801 architecture in many aspects. In both machines the macroinstruction set maps very directly to the micro operations of the processor. Both processors may be thought of as architectures with micro-level user instruction sets. Microcode is created by compilers and code generators as it is needed to implement complex operations. The primary differences lie in various architectural choices about pipeline design, registers, opcodes and in the attempt in the MIPS instruction set to make all the microengine parallelism available at the user instruction set level. These attempts are most visible within MIPS in the following ways: the two-part memory/ALU and ALU/ALU instructions, the explicit pipeline interlocks, and the conditional jump instructions.

MIPS is designed for high performance. To allow the user to get maximum performance, the complexity of individual instructions is minimized. This allows the execution of these instructions at significantly higher speeds. To take advantage of simpler hardware and an instruction set that easily maps to the microinstruction set, additional compiler-type translation is needed. This compiler technology makes a compact and time-efficient mapping between higher level constructs and the simplified instruction set. The shifting of the complexity from the hardware to the software has several major advantages:

- The complexity is paid for only once during compilation. When a user runs his program on a complex architecture, he pays the cost of the architectural overhead each time he runs his program.

- It allows the concentration of energies on the software, rather than constructing a complex hardware engine, which is hard to design, debug, and efficiently utilize. Software is not necessarily easier to construct, but the VLSI environment makes hardware simplicity important.

The design of a high performance VLSI processor is dramatically affected by the technology. Among the most important design considerations are: the effect of pin limitations, available silicon area, and size/speed tradeoffs. Pin limitations force the careful design of a scheme for multiplexing the available pins, especially when data and instruction fetches are overlapped. Area limitations and the speed of off-chip intercommunication require choices between on- and off-chip functions as well as limiting the complete on-chip design. With current state-of-the-art technology either some vital component of the processor (such as memory management) must be off-chip, or the size of the chip will make both its performance and yields unacceptably low. Choosing what functions are migrated off-chip must be done carefully so that the performance effects of the partitioning are minimized. In some cases, through careful design, the effects may be eliminated at some extra cost for high speed off-chip functions.

Speed/complexity/area tradeoffs are perhaps the most important and difficult phenomena to deal with. Additional on-chip functionality requires more area, which also slows down the performance of every other function. This occurs for two equally important reasons: additional control and decoding logic increases the length of the critical path (by increasing the number of active elements in the path) and each additional function increases the length of internal wire delays. In the processor's datapath these wire delays can be substantial, since they accumulate both from bus delays, which occur when the datapath is lengthened, and control delays, which occur when the decoding and control is expanded or when the datapath is widened. In the MIPS architecture we have attempted to control these delays; however, they remain a dominant factor in determining the speed of the processor.

0194-1895/82/0000/0017$00.75 © 1982 IEEE

The microarchitecture

Design philosophy

The fastest execution of a task on a microengine would be one in which all resources of the microengine were used at a 100% duty cycle performing a nonredundant and algorithmically efficient encoding of the task. The MIPS microengine attempts to achieve this goal. The user instruction set is an encoding of the microengine that makes a maximum amount of the microengine available. This goal motivated many of the design decisions found in the architecture.

MIPS is a load/store architecture, i.e. data may be operated on only when it is in a register and only load/store instructions access memory. If data operands are used repeatedly in a basic block of code, having them in registers will prevent redundant load/stores and redundant addressing calculations; this allows higher throughput since more operations directly related to the computation can be performed. The only addressing modes supported are immediate, based with offset, indexed, or base shifted. These addressing modes may require fields from the instruction itself, general registers, and one ALU or shifter operation. Another ALU operation available in the fourth stage of every instruction can be used for a (possibly unrelated) computation. Another major benefit derived from the load/store architecture is simplicity of the pipeline structure. The simplified structure has a fixed number of pipestages, each of the same length. Because the stages can be used in varying (but related) ways, pipeline utilization improves. Also, the absence of synchronization between stages of the pipe increases the performance of the pipeline and simplifies the hardware. The simplified pipeline eases the handling of both interrupts and page faults.

Although MIPS is a pipelined processor it does not have hardware pipeline interlocks. This approach is often seen in low and medium performance microengines. The MIPS five stage pipeline contains three active instructions at any time; either the odd or even pipestages are active. The major pipestages and their tasks are shown in Table 1.

Table 1: Major pipestages and their functions

Stage                      Mnemonic  Task
Instruction Fetch          IF        Send out the PC, increment it
Instruction Decode         ID        Decode instruction
Operand Decode             OD        Compute effective address and send to memory if load or store; use ALU
Operand Store / Execution  OS / EX   Store: write operand / Execution: use ALU
Operand Fetch              OF        Load: read operand

Interlocks that are required because of dependencies brought out by pipelining are not provided by the hardware. Instead, these interlocks must be statically provided where they are needed by a pipeline reorganizer. This has two benefits:

1. A more regular and faster hardware implementation is possible since it does not have the usual complexity associated with a pipelined machine. Hardware interlocks cause small delays for all instructions, regardless of their relationship to other instructions. Also, interlock hardware tends to be very complex and nonregular [3,4]. The lack of such hardware is especially important for VLSI implementations, where regularity and simplicity are important.

2. Rearranging operations at compile time is better than delaying them at run time. With a good pipeline reorganizer, most cases where interlocks are avoidable should be found and taken advantage of. This results in performance better than a comparable machine with hardware interlocks, since usage of resources will not be delayed. In cases where this is not detected or is not possible, no-ops must be inserted into the code. This does not slow down execution compared to a similar machine with hardware interlocks, but does increase code size. The shifting of work to a reorganizer would be a disadvantage if it took excessive amounts of computation. It appears this is not a problem for our first reorganizer.
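The fallback case in item 2, where no-ops must be inserted, can be illustrated with a minimal Python sketch. The instruction record (`Instr`), the function name `insert_noops`, and the one-instruction load delay are all assumptions made for illustration; none of them is taken from the MIPS encoding or its actual interlock rules.

```python
from typing import List, NamedTuple, Optional


class Instr(NamedTuple):
    """Hypothetical instruction record, not the MIPS encoding."""
    op: str                  # e.g. "load", "add", "nop"
    dest: Optional[str]      # register written, if any
    srcs: tuple = ()         # registers read


NOP = Instr("nop", None)


def insert_noops(code: List[Instr], load_delay: int = 1) -> List[Instr]:
    """Pad the sequence so that no instruction reads a loaded register too early.

    load_delay (assumed, must be >= 1) is the number of instructions that
    have to separate a load from the first use of its result.
    """
    out: List[Instr] = []
    for ins in code:
        # Scan the most recent load_delay slots, nearest first.
        for distance, prev in enumerate(reversed(out[-load_delay:]), start=1):
            if prev.op == "load" and prev.dest in ins.srcs:
                # The nearest conflicting load dictates the padding needed.
                out.extend([NOP] * (load_delay - distance + 1))
                break
        out.append(ins)
    return out


program = [
    Instr("load", "r1", ("r2",)),
    Instr("add", "r3", ("r1", "r4")),   # uses r1 directly after the load
    Instr("add", "r5", ("r6", "r7")),
]
padded = insert_noops(program)
print([i.op for i in padded])   # ['load', 'nop', 'add', 'add']
```

Padding is only the last resort: in this example the second `add` is independent of the load, so a reorganizer that reorders code could place it in the gap and avoid the no-op entirely.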

In the MIPS pipeline resource usage is permanently allocated to various pipestages. Rather than having pipeline stages compete for the use of resources through queues or priority schemes, the machine's resources are dedicated to specific stages so that they are 100% utilized. In Figure 1, the allocation of resources to individual pipestages is shown. When concurrently executing pipestages are overlayed, all available resources can be used.

Figure 1: Resource allocation by pipestage

(Timing diagram over time steps 1 to 10, showing the instruction memory, data memory, and ALU as used by the IF, ID, OD, OS/EX, and OF pipestages of overlapping instructions. The ALU is reserved for use by OD and EX.)
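The overlay of concurrently executing pipestages can be sketched in a few lines of Python. The stage order follows Table 1; the issue interval of two stage times is an assumption inferred from the statement that either the odd or the even pipestages are active, and the names `active_stages` and `timeline` are hypothetical.

```python
STAGES = ["IF", "ID", "OD", "OS/EX", "OF"]   # stage order taken from Table 1
ISSUE_INTERVAL = 2   # assumption: a new instruction starts every two stage times


def active_stages(t: int, n_instructions: int) -> dict:
    """Map instruction index -> stage name for each instruction in flight at time t."""
    active = {}
    for i in range(n_instructions):
        stage = t - i * ISSUE_INTERVAL
        if 0 <= stage < len(STAGES):
            active[i] = STAGES[stage]
    return active


def timeline(n_instructions: int) -> list:
    """One entry per time step until the last instruction leaves the pipe."""
    total = (n_instructions - 1) * ISSUE_INTERVAL + len(STAGES)
    return [active_stages(t, n_instructions) for t in range(total)]


for t, row in enumerate(timeline(4)):
    print(t, row)
```

Under this assumption the stages in use alternate between the set IF, OD, OF and the set ID, OS/EX, so no two instructions ever occupy the same stage, which is what lets each resource be dedicated to a stage without queues or priority schemes.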

To achieve 100% utilization, primitive operations in the microengine (e.g., load/store, ALU operations) must be completely packed into macroinstructions. This is not possible for three reasons:

1. Dependencies can prevent full usage of the microengine, for example when a sequence of register loads must be done before an ALU operation or when no-ops must be inserted.

2. An encoding that preserved all the parallelism (i.e., the microcontrol word itself) would be too large. This is not a serious problem since many of the possible microinstructions are not useful.

3. The encoding of the microengine presented in the instruction set sacrifices some functional specification for immediate data. In the worst case, space in the instruction word used for loading large immediate values takes up the space normally used for a base register, displacement, and ALU operation specification. In this case the memory interface and ALU cannot be used during the pipestage for which they are dedicated.

Nevertheless, first results on microengine utilization are encouraging. Many instructions fully utilize the major resources of the machine. Other instructions, such as load immediate, which use few of the resources of the machine, would mandate greatly increased control complexity if overlap with surrounding instructions was attempted in an irregular fashion.

MIPS has one instruction size, and all instructions execute in the same amount of time (one data memory cycle). This choice simplifies the construction of code generators for the architecture (by eliminating many nonobvious code sequences for different functions) and makes the construction of a synchronous regular pipeline much easier. Additionally, the fact that each macroinstruction is a single microinstruction of fixed length and execution time means that a minimum amount of internal state is needed in the processor. The absence of this internal state leads to a faster processor and minimizes the difficulty of supporting interrupts and page faults.

Resources of the microengine

The major functional components of the microengine include:

- ALU resources: A high speed, 32-bit carry lookahead ALU with hardware support for multiply and divide; and a barrel shifter with byte insert and extract capabilities. Only one of the ALU resources is usable at a time. Thus within the class of ALU resources, functional units cannot be fully used even when the class itself is used 100%.

- Internal bus resources: Two 32-bit bidirectional busses, each connecting almost all functional components.

- On chip storage: Sixteen 32-bit general purpose registers.

- Memory resources: Two memory interfaces, one for instructions and one for data. Each of the parts of the memory resource can be 100% utilized (subject to packing and instruction space usage) because one store or load from data memory and one instruction fetch can occur simultaneously.

- A multistage PC unit: An incrementable current PC with storage of one branch target as well as four previous PC values. These are required by the pipelining of instructions and by interrupt and exception handling.

The instruction set

All MIPS instructions are 32 bits. The user instruction set is a compiler-based encoding of the micromachine. Static and dynamic instruction set efficiency, as determined by a code generator, is used to decide what micromachine features to encode into macroinstructions in the architecture. Multiple simple (and possibly unrelated) instruction pieces are packed together into an instruction word. The basic instruction pieces are:

1. ALU pieces - these instructions are all register/register (2 and 3 operand formats). They all use less than 1/2 of an instruction word. Included in this category are byte insert/extract, two bit Booth's multiply step, and one bit nonrestoring divide step, as well as standard ALU and logical operations.

2. Load/store pieces - these instructions load and store memory operands. They use between 16 and 32 bits of an instruction word. When a load instruction is less than 32 bits, it may be packaged with an ALU instruction, which is executed during the Execution stage of the pipeline.

3. Control flow pieces - these include direct jumps and compare instructions with relative jumps. MIPS does not have condition codes, but includes a rich collection of set conditionally and compare and jump instructions. The set conditional instructions provide a powerful implementation for conditional expressions. They set a register to all 1's or 0's based on one of 16 possible comparisons done during the operand decode stage. During the Execution stage an ALU operation is available for logical operations with other booleans. The compare and jump instructions are direct encodings of the micromachine: the operand decode stage computes the address of the branch target and the Execution cycle does the comparison. All branch instructions have a delay in their effect of one instruction; i.e., the next sequential instruction is always executed.

4. Other instructions - include procedure and interrupt linkage. The procedure linkage instructions also fit easily into the micromachine format of effective address calculation and register-register computation instructions.
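The set conditional idea in item 3, a register set to all 1's or all 0's that is then combined with ordinary logical operations, can be modeled as follows. This is a sketch under stated assumptions: the helper names `set_if`, `in_range`, and `select` are hypothetical, and the Python comparison merely stands in for whichever of the 16 hardware comparisons applies.

```python
MASK32 = 0xFFFFFFFF   # 32-bit register width


def set_if(condition: bool) -> int:
    """Model a set-conditional result: all ones if true, all zeros if false."""
    return MASK32 if condition else 0


def in_range(x: int, lo: int, hi: int) -> int:
    """Evaluate (lo <= x) and (x <= hi) as a 32-bit mask, with no jump on the result."""
    t1 = set_if(lo <= x)    # first comparison produces a mask
    t2 = set_if(x <= hi)    # second comparison produces a mask
    return t1 & t2          # a logical ALU operation combines the booleans


def select(mask: int, a: int, b: int) -> int:
    """Return a where mask is all ones and b where it is all zeros (32-bit operands)."""
    return (a & mask) | (b & ~mask & MASK32)


print(select(in_range(5, 1, 10), 7, 9))    # 7
print(select(in_range(50, 1, 10), 7, 9))   # 9
```

Because the booleans are full-width masks, a compound condition costs only logical operations, which fits the absence of condition codes described above.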

MIPS is a word-addressed machine. This provides several major performance advantages over a byte addressed architecture. First, the use of word addressing simplifies the memory interface since extraction and insertion hardware is not needed. This is particularly important, since instruction and data fetch/store are in a critical path. Second, when byte data (characters) can be handled in word blocks, the computation is much more efficient. Last, the effectiveness of short offsets from a base register is multiplied by a factor of four.

MIPS does not directly support floating point arithmetic. For applications where such computations are infrequent, floating point operations implemented with integer operations and field insertion/extraction sequences should be sufficient. For more intensive applications a numeric co-processor similar to the Intel 8087 would be appropriate.
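On a word-addressed machine, character access becomes a word load followed by a byte extract in a register. The sketch below shows the idea in Python. The byte numbering (index 0 is the least significant byte) and the function names are assumptions for illustration, not the behavior of the MIPS byte insert/extract instructions.

```python
MASK32 = 0xFFFFFFFF


def extract_byte(word: int, index: int) -> int:
    """Return byte `index` of a 32-bit word (0 = least significant, an assumed order)."""
    return (word >> (8 * index)) & 0xFF


def insert_byte(word: int, index: int, value: int) -> int:
    """Return `word` with byte `index` replaced by the low 8 bits of `value`."""
    shift = 8 * index
    cleared = word & ~(0xFF << shift) & MASK32
    return cleared | ((value & 0xFF) << shift)


def load_char(memory: list, byte_address: int) -> int:
    """Fetch one character from a memory modeled as a list of 32-bit words."""
    word = memory[byte_address // 4]              # word address
    return extract_byte(word, byte_address % 4)   # byte position within the word


memory = [0x44434241]                  # the bytes 'A', 'B', 'C', 'D'
print(chr(load_char(memory, 2)))       # C
print(hex(insert_byte(memory[0], 1, 0x7A)))   # 0x44437a41
```

The factor of four for short offsets follows from the same arithmetic: an 8-bit offset (a width chosen only for illustration) counts words, so it spans 256 words or 1024 bytes, where a byte offset of the same width would span 256 bytes.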

Systems issues

The key systems issues are the memory system, and internal traps and external interrupt support.

The memory system

The use of memory mapping hardware (off chip in the current design) is needed to support virtual memory. Modern microprocessors (Motorola 68000) are already faced with the problem that the sum of the memory access time and the memory mapping time is too long to allow the processor to run at full speed. This problem is compounded in MIPS; the effect of pipelining is that a single instruction/data memory must provide access at approximately twice the normal rate (for 64K RAMs).

The solution we have chosen to this problem is to separate the data and instruction memory systems. Separation of program and data is a regular practice on many machines; in the MIPS system it allows us to significantly increase performance. Another benefit of the separation is that it allows the use of a cache only for instructions. Because the instruction memory can be treated as read-only memory (except when a program is being loaded), the cache control is simple. The use of an instruction cache allows increased performance by providing more time during the critical instruction decode pipestage.

Faults and interrupts

The MIPS architecture will support page faults, externally generated interrupts, and internally generated traps (arithmetic overflow). The necessary hardware to handle such things in a pipelined architecture is usually large and complex [3,4]. Furthermore, this is an area where the lack of sufficient hardware support makes the construction of systems software impossible. However, because the MIPS instruction set is not interpreted by a microengine (with its own state), hardware support for page faults and interrupts is significantly simplified.

To handle interrupts and page faults correctly, two important properties are required. First, the architecture must ensure correct shutdown of the pipe, without executing any faulted instructions (such as the instruction which page faulted). Most present microprocessors cannot perform this function correctly (e.g. Motorola 68000, Zilog Z8000, and the Intel 8086). Second, the processor must be able to correctly restore the pipe, and continue execution as if the interrupt or fault had not occurred.

These problems are significantly eased in MIPS because of the location of writes within the pipestages. In MIPS, no instruction which can page fault writes to any storage, either registers or memory, before the fault is detected. The occurrence of a page fault need only turn off writes generated by this and any instructions following it which are already in the pipe. These following instructions also have not written to any storage before the fault occurs. The instruction preceding the faulting instruction is guaranteed to be executable or to fault in a restartable manner even after the instruction following it faults. The pipeline is drained and control is transferred to a general purpose exception handler. To correctly restart execution three instructions need to be reexecuted. A multistage PC tracks these instructions and aids in correctly executing them.

Software issues

The two major components of the MIPS software system are compilers and pipeline reorganizers. The input to a pipeline reorganizer is a sequence of simple MIPS instructions or instruction pieces generated without taking the pipeline interlocks and instruction packing features into account. This relieves the compiler from the task of dealing with the restrictions that are imposed by the pipeline constraints on legal code sequences. The reorganizer reorders the instructions to make maximum use of the pipeline while enforcing the pipeline interlocks in the code. It also packs the instruction pieces to maximize use of each instruction word. Lastly, the pipeline reorganizer handles the effect of branch delays. This software is an important part of the MIPS architecture. It is responsible for making the low-level microarchitecture into a usable and comprehensible instruction set. Since the exact details of pipeline interlocks and branch delays may change between implementations, the architecture is actually defined by the input to the pipeline reorganizer.

Since all instructions execute in the same time, and most instructions generated by a code generator will not be full MIPS instructions, the instruction packing can be very effective in reducing execution time. In fully packed instructions, e.g. a load combined with an ALU instruction, all the major processor resources (both memory interfaces, the ALU, busses and control logic) are used 100% of the time.

The basic optimization techniques applied to the code sequences are

1. reorder instruction sequences to remove pipeline interlocks,
2. pack together instruction pieces into a single MIPS instruction,
3. remove the effects of delayed branches.

In some cases it may be necessary to insert no-ops to prevent illegal pipeline interactions or to accommodate delayed branches. Also, pieces of instructions may be left blank whenever no piece is available to pack with the instruction.

The reorganization problem is discussed in detail in another paper [5]; the problem is shown to be NP-complete and a set of heuristic solutions is proposed. The reorganization algorithm is essentially an instruction scheduling algorithm. The basic algorithm is

1. Read in the program in assembly language and create a dag indicating precedence scheduling relationships among the instructions.

2. Determine which groups of instructions can be scheduled for execution next and eliminate the others.

3. Heuristically choose an instruction to schedule from the executable instructions. Attempt to choose an instruction that can be packed with the last instruction executed and that will allow the rest of the code to be scheduled with a minimum number of no-ops.
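The three steps above can be sketched as a small list scheduler in Python. This is an illustration under stated assumptions, not the reorganizer described in the paper: the instruction record, the register-only dependence model (memory dependences, branches, and packing are ignored), the one-instruction load delay, and the "first ready instruction in program order" heuristic are all hypothetical stand-ins.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Instr(NamedTuple):
    """Hypothetical instruction record, not the MIPS encoding."""
    op: str
    dest: Optional[str]
    srcs: tuple = ()


NOP = Instr("nop", None)


def build_dag(code: List[Instr]) -> Dict[int, Set[int]]:
    """Step 1: preds[j] holds the earlier instructions that must precede j."""
    preds: Dict[int, Set[int]] = {j: set() for j in range(len(code))}
    for j, later in enumerate(code):
        for i in range(j):
            earlier = code[i]
            raw = earlier.dest is not None and earlier.dest in later.srcs
            war = later.dest is not None and later.dest in earlier.srcs
            waw = later.dest is not None and later.dest == earlier.dest
            if raw or war or waw:
                preds[j].add(i)
    return preds


def needs_delay(prev: Optional[Instr], ins: Instr) -> bool:
    """Assumed hazard: a loaded value cannot be used by the very next instruction."""
    return prev is not None and prev.op == "load" and prev.dest in ins.srcs


def reorganize(code: List[Instr]) -> List[Instr]:
    preds = build_dag(code)
    done: Set[int] = set()
    out: List[Instr] = []
    while len(done) < len(code):
        # Step 2: instructions whose predecessors have all been scheduled.
        ready = [j for j in range(len(code)) if j not in done and preds[j] <= done]
        prev = out[-1] if out else None
        # Step 3 (simplified heuristic): first ready instruction needing no padding.
        choice = next((j for j in ready if not needs_delay(prev, code[j])), None)
        if choice is None:
            out.append(NOP)            # nothing safe to issue, so pad
        else:
            out.append(code[choice])
            done.add(choice)
    return out


program = [
    Instr("load", "r1", ("r2",)),
    Instr("add", "r3", ("r1", "r4")),   # depends on the load
    Instr("add", "r5", ("r6", "r7")),   # independent
]
print([i.dest for i in reorganize(program)])   # ['r1', 'r5', 'r3']
```

In the example the independent `add` is moved between the load and its use, so no no-op is needed. A no-op appears only when every ready instruction would read a value that is still being loaded.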

    The reorganization problem is made difficult by the potential presence of overlapping resource utilization in parallel code streams. This overlap must be detected before scheduling of either stream occurs; once it is detected, a deadlock state where neither stream can be scheduled for execution is avoidable. These reorganization techniques (without the instruction packing) can obtain performance improvements of 5-10% over code that must wait for completion of a previously dependent instruction. The use of instruction packing increases the relative effectiveness of this reorganization.

    The optimization of delayed branches is the control-flow counterpart of code reorganization. Our algorithm for branch delay optimization examines the targets of the branch in an attempt to obtain useful instructions to execute during the delay time. The branch delay algorithm [6] can obtain space and time improvements in the range of 10-20% for the MIPS branch instructions.

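    One simple way to obtain a useful instruction from a branch target can be sketched in Python as follows. This is not the algorithm of [6]: it handles only unconditional jumps, assumes exactly one delay slot that initially holds a no-op, and uses a hypothetical tuple encoding of the program (`("op", text)`, `("jmp", target_index)`, `("nop",)`).

```python
def fill_jump_slots(prog):
    """For each unconditional jump followed by a no-op, copy the first
    instruction of the jump target into the delay slot and retarget the
    jump to the instruction after it."""
    out = list(prog)
    for i, ins in enumerate(prog):
        if ins[0] != "jmp" or i + 1 >= len(prog) or prog[i + 1][0] != "nop":
            continue
        t = ins[1]
        # Only an ordinary operation is copied; jumps and no-ops at the target are left alone.
        if t + 1 < len(prog) and prog[t][0] == "op":
            out[i + 1] = prog[t]       # the target's first instruction now runs in the slot
            out[i] = ("jmp", t + 1)    # execution resumes just after the copied instruction
    return out

loop = [
    ("op", "r1 = r1 + 1"),    # index 0: loop head
    ("op", "r2 = r2 + r1"),   # index 1
    ("jmp", 0),               # index 2: back edge
    ("nop",),                 # index 3: unfilled delay slot
]
filled = fill_jump_slots(loop)
```

    In this loop the no-op disappears from the steady state: each iteration after the first executes three instructions instead of four, while the sequence of useful operations is unchanged.
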
    Present status and conclusions

    The entire MIPS processor has been laid out and partitioned into a set of six test chips that cover all the data path and control functions on the chip. Four test chips have been sent out for fabrication as of August 1982; we expect to send the remainder to fabrication during August 1982.

    In the software area, code generators have been written for both C and Pascal. These code generators produce simple instructions, relying on a pipeline reorganizer. A complete version of the pipeline reorganizer is running. An instruction level simulator is being used to obtain performance estimates.

    Figure 2 shows the floorplan of the chip. The dimensions of the chip are approximately 6.9 by 7.2 mm with a minimum feature size of 4 µ (i.e. λ = 2 µ). The chip area is heavily dedicated to the data path as opposed to control structure, but not as radically as in the RISC implementation. Early estimates of performance seem to indicate that we should achieve approximately 2 MIPS (using the Puzzle program [7] as a benchmark) compared to other architectures executing compiler generated code. We expect to have more accurate and complete benchmarks available in the near future.

    Figure 2: MIPS Floorplan

    The following chart compares the MIPS processor to the Motorola 68000 running the Puzzle benchmark written in C with no optimization or register allocation. The Portable C Compiler (with different target machine descriptions) generated code for both processors. The MIPS numbers are a close approximation of our expected performance.

                                 Motorola 68000    MIPS
    Transistor Count             65,000            25,000
    Clock speed                  8 MHz             8 MHz¹
    Data path width              16 bits           32 bits²
    Static Instruction Count     1300              647
    Static Instruction Bytes     5360              2588
    Execution Time (sec)         26.5              0.6
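
    To make the comparison easier to read, the ratios implied by the chart can be computed directly. The short Python sketch below only restates the chart's figures, taken at face value; the dictionary keys are names chosen for this example.

```python
# Figures copied from the chart above.
m68000 = {"instructions": 1300, "code_bytes": 5360, "time_s": 26.5}
mips = {"instructions": 647, "code_bytes": 2588, "time_s": 0.6}

# Ratios of the 68000 figure to the MIPS figure (larger means MIPS is ahead).
ratios = {key: m68000[key] / mips[key] for key in m68000}

# Average static instruction size in bytes for each machine.
bytes_per_instr = {
    "68000": m68000["code_bytes"] / m68000["instructions"],
    "MIPS": mips["code_bytes"] / mips["instructions"],
}

for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
```

    On these numbers the MIPS code is roughly half the size of the 68000 code in both instructions and bytes, averages exactly 4 bytes per static instruction, and runs about 44 times faster on this one benchmark.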

    Acknowledgments

    The MIPS project has been supported by the Defense Advanced Research Projects Agency under contract # MDA903-79-C-0680. Thomas Gross is supported by an IBM Graduate Fellowship. Many other people have contributed to the success of the MIPS project; these include Judson Leonard, Alex Strong, K. Gopinath, and John Burnett.

    An earlier version of this report appears in the Proceedings of the CMU Conference on VLSI Systems and Computations, 1981.

    References

    1. Radin, G., "The 801 Minicomputer," Proc. SIGARCH/SIGPLAN Symposium on Architectural Support for Programming Languages and Operating Systems, ACM, Palo Alto, March 1982, pp. 39-47.

    2. Patterson, D.A. and Sequin, C.H., "RISC-I: A Reduced Instruction Set VLSI Computer," Proc. of the Eighth Annual Symposium on Computer Architecture, Minneapolis, Minn., May 1981.

    3. Lampson, B.W., McDaniel, G.A. and Ornstein, S.M., "An Instruction Fetch Unit for a High Performance Personal Computer," Tech. report CSL-81-1, Xerox PARC, January 1981.

    4. Widdoes, L.C., "The S-1 Project: Developing high performance digital computers," Proc. Compcon, IEEE, San Francisco, February 1980.

    5. Hennessy, J.L. and Gross, T.R., "Code Generation and Reorganization in the Presence of Pipeline Constraints," Proc. Ninth POPL Conference, ACM, January 1982.

    6. Gross, T.R. and Hennessy, J.L., "Optimizing Delayed Branches," Proceedings of Micro-15, IEEE, October 1982.

    7. Baskett, F., "Puzzle: an informal compute bound benchmark," Widely circulated and run.

    ¹ The 68000 IC technology is much better, and the 68000 performs across a wide range of environmental situations. We do not expect to achieve this clock speed across the same range of environmental factors.

    ² This advantage is not used in the benchmark, i.e. the 68000 version deals with 16 bit objects while MIPS uses 32 bit objects.
