the arithmetic t anslator compiler of the ibm · translator in eonverting fortran formulas into 704...

13
The Arithmetic T anslator Compiler of the ~"l .. r~ "1 ® [ O..I.aA:N Automatic Coding System IBM P:~:TH~ It. Sm~mn):~N, Intert~ational Business Machines Corporation, New York, N. Y. 1. I n t rod action The present paper describes, in formal terms, the steps in translation employed by the Fom't~Ax arithmetic translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually taken during translation. All, hough sections 2 and 3 give a formal description of the FOaTaAN source language, insofar as arithmetic type stal~ements are concerned, the reader is expected to be familiar with FORTRAN II, as well as with S.~vPII and the programming logic of the 704 computer. The first major phase of translation, described in sec- tions 4-7, :is concerned with converting an arithmetic formula, regarded as the statement of an algorithm, into a set of triples (Ci, @i, NO which also d~seribe the al- gorithm, but in a manner which lends itsdf more easily to conversion into conventional computer code (not neces- sarily that of the 704). The three elements of a triple have !~essentially the following meanings: ~ : operation to be performed perand time" at which the operation must be performed are as many distinct "times" of operation as there are subexpressions in the entire expression to be evalu- ed. Ior example, eonstder the artthmette expressmn (((A + B) -- C)/((D • (E + F)/G) - H + J)). (1) *subexpressions as there are distinct pairs of parentheses ~;shown--namcly six. To evaluate the entire expression :'all six subexpressions must be:~ evaluated. While there is some latitude in the order in which these subexpressions are to be evaluated, not all orders will work (since "higher i r 1" me subexpressions depend on "lower level" ones for their values). A possible order of evaluation for the ex- ample shown is as follows: :~. (d + B) i 2. ((A + B) - C) a. (u + F) 4. (D, (E + F)/G) 5. (D, (E + F)/G) - H + J (~. (((A + B) - C)/((D • (E + F)/(;) -- H + J)). rhe triples representing this coniputation would /)reak tp into six subsets, called segments according to their C- mmber (first member of a triple), in each segment (cor- esponding to a sut)expression) there are as many triples i~: t In the FORTRAN system * is used for multiplication, ** for xpon(mlsiation. as there are terms in the subexpression. (The number o: triples in a segment may be called the lengthof a seg ment.) Using the numbers written alongside the sub expressions as names for these subexpressions, we gel the following representation of the computation in triple~, notation: (1., +, A)(:I, +,//)(2, +, 1)(2, -, C) (3, +, E)(3, +, e)(4, ,, D) (2: (4, ,, 3)(4, /, G)(5, +, 4)(5, -, H) (5, +, J)(6, ,, 2)(6,/, 5) If the divide operation is regarded as multiplication by th( inverse, then it is clear that the triples belonging to an) given segment of (2) could be reshuffled into some differ. ent order without disturbing the algorithm. Although not all possible segments which the FORTRAN translator pro. duces allow reshuffling, the latitude, whenever presertt is later used to obtain economies in the computer cod( eventually produced. It should also be observed that since each triple bears its own segment number, extensiv( rearrangements of the triples could be tolerated without making it impossible to restore them into proper order The FORTmkN translator takes advantage of this fact it: that it does not necessarily produce the triples segmen| by segment with segments in proper order. However, the deviations from this ultimately required order which th( process of triples generation introduces are correetibl( by reordering. The segment numbers developed by the FORTRAN translator are in reverse numerical order, i.e. the largest segment number represents the subexpressio~ first to be evaluated, while the smallest segment number represents the last (which always coincides with the en- tire expression). The first major objective of the FORTRAN translator is accomplished in a number of steps and con- sists in the translation of form (1) into form (2). The first step, described in section 4, is to replace con- stants and subscripted variables by simple variables: thus ensuring that all of the arguments entering into the computation are of a uniform nature. The next step, described in section 5, is aimed at ensur- ing that. everT subexpression in the expression to be evalu- ated is provided with an explicit pair of parentheses. In writing FowraAN fornmlas one need not indicate all parentheses, the usual order of precedence among arith- metic operations being assumed: exponentiation, multi- plication and division, addition and subtraction. The method by which these precedence relations are made explicit by the FORTRAN translator is' roughly the follow- ing: Art arithmetic connective of high precedence may be Conununications of the ACM 9

Upload: others

Post on 13-Jun-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

The Arithmetic T anslator Compiler of the ~"l . . r ~ "1 ® [ O..I.aA:N Automatic Coding System

IBM

P:~:TH~ It. Sm~mn):~N, Intert~ational Business Machines Corporation, New York, N. Y.

1. I n t rod ac t ion

:. The present paper describes, in formal terms, the :; steps in translation employed by the Fom't~Ax arithmetic

translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually taken during translation. All, hough sections 2 and 3 give a formal description of the FOaTaAN source language, insofar as arithmetic type stal~ements are concerned, the reader is expected to be familiar with FORTRAN II, as well as with S.~vP II and the programming logic of the 704 computer.

The first major phase of translation, described in sec- tions 4-7, :is concerned with converting an arithmetic formula, regarded as the statement of an algorithm, into a set of triples (C i , @i, NO which also d~seribe the al-

; . gorithm, but in a manner which lends itsdf more easily to conversion into conventional computer code (not neces- sarily that of the 704). The three elements of a triple have

:!~essentially the following meanings:

~ : operation to be performed perand time" at which the operation must be performed

are as many distinct "times" of operation as there ia re subexpressions in the entire expression to be evalu-

a t e d . Ior example, eonstder the artthmette expressmn

(((A + B) -- C)/((D • (E + F)/G) - H + J)). (1)

i*subexpressions as there are distinct pairs of parentheses _~;shown--namcly six. To evaluate the entire expression

:'all six subexpressions must be:~ evaluated. While there is some latitude in the order in which these subexpressions are to be evaluated, not all orders will work (since "higher i r 1" me subexpressions depend on "lower level" ones for their values). A possible order of evaluation for the ex- ample shown is as follows:

:~. (d + B) i 2. ((A + B) - C) : a. (u + F)

i 4. ( D , (E + F)/G) 5. ( D , (E + F)/G) - H + J

- (~. ( ( ( A + B ) - C ) / ( ( D • (E + F ) / ( ; ) - - H + J ) ) .

rhe triples representing this coniputation would /)reak tp into six subsets, called segments according to their C- mmber (first member of a triple), in each segment (cor- esponding to a sut)expression) there are as many triples

i~: t In the FORTRAN system * is used for multiplication, ** for xpon(mlsiation.

as there are terms in the subexpression. (The number o: triples in a segment may be called the lengthof a seg ment.) Using the numbers written alongside the sub expressions as names for these subexpressions, we gel the following representation of the computation in triple~, notation:

(1., + , A)(:I, + , / / ) (2 , + , 1)(2, - , C)

(3, + , E)(3, + , e)(4, ,, D) (2:

(4, ,, 3)(4, / , G)(5, + , 4)(5, - , H)

(5, + , J)(6, ,, 2)(6, / , 5)

If the divide operation is regarded as multiplication by th( inverse, then it is clear that the triples belonging to an) given segment of (2) could be reshuffled into some differ. ent order without disturbing the algorithm. Although not all possible segments which the FORTRAN translator pro. duces allow reshuffling, the latitude, whenever presertt is later used to obtain economies in the computer cod( eventually produced. It should also be observed that since each triple bears its own segment number, extensiv( rearrangements of the triples could be tolerated without making it impossible to restore them into proper order The FORTmkN translator takes advantage of this fact it: that it does not necessarily produce the triples segmen| by segment with segments in proper order. However, the deviations from this ultimately required order which th( process of triples generation introduces are correetibl( by reordering. The segment numbers developed by the F O R T R A N translator are in reverse numerical order, i.e. the largest segment number represents the subexpressio~ first to be evaluated, while the smallest segment number represents the last (which always coincides with the en- tire expression). The first major objective of the FORTRAN translator is accomplished in a number of steps and con- sists in the translation of form (1) into form (2).

The first step, described in section 4, is to replace con- stants and subscripted variables by simple variables: thus ensuring that all of the arguments entering into the computation are of a uniform nature.

The next step, described in section 5, is aimed at ensur- ing that. ever T subexpression in the expression to be evalu- ated is provided with an explicit pair of parentheses. In writing FowraAN fornmlas one need not indicate all parentheses, the usual order of precedence among arith- metic operations being assumed: exponentiation, multi- plication and division, addition and subtraction. The method by which these precedence relations are made explicit by the FORTRAN translator is' roughly the follow- ing: Art arithmetic connective of high precedence may be

Conununications of the ACM 9

Page 2: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

Normal form being scarified:

Operation~ Operand:

Triples

Sub-expression to which this triple belongs;

~{ecord of" sub- expressions not yet comelete

Algorithm for Computing Triples fr6m Normal Forms (Example)

(Read from left to right)

(Column in which the arrowhead of an operatiom symbol appears is the "time" at which the operation is executed; more than one ooeration may be performed at a time.)

Operation Codes: ] x ~ x = T r a n s f e r x i ----+.-~,--x : Generate x]

] ~..._~ Erase x

I I I ' * ~ i ~ I I

i / I ' *~D~ I I I

lq g.

thought of as weakly separating the arguments to either side of it, while a connective of low priority strongly separates the arguments to either side of it. Since there are three degrees of precedence to be considered, we may represent three degrees of separation power by placing one, two, or three pairs of parentheses to either side of a connective a(:eording to its order of precedence, thus:

A ** B A) ** (B (exponentiation)

convert A • B into A)) * ((B (multiplication)

A + 1~ A))) + (((B (addition)

This introduction of parentheses is balanced in the sense that as many left pm'entheses are introduced as right ones. If, in addition to inserting parentheses as shown above, one also prefixes the entire expression by three left, parentheses and closes it by three right parentheses, then---as the reader may convince hirnself-.a correct p~trenthesization of the expression is accomplished even though many unnecessary (though harmless) pMrs of parentheses arc also inserted. ]for example:

A + B ** C /D (a)

beeomes

(((A))) + (((B) ** (C))/((D)))

In the FORTreSS translation four rather than three levels of precedence are recognized because the character "," which eonventionMly separates successive arguments of a function--as in f(a, b, c)- -is also treated as an arithmetic connective. Also, for reasons having to do with the sub- sequent germration of triples, each inserted left paren- thesis is preceded by an operation sign which essentially tags the p~trenthesis with its strength:

+(*(**(A))) + (,(**(B) ** (C))/(**(D))) (4)

Arithmetic expressions expanded by the insertion of parentheses and operation signs as indicated above (and

10 Communications of the ACM

1. i

em'efully (leseribed in seed(m 5) are called arithmetic expressions ill normal .lbrm.

The next step in translation, described in section 5, is the generation of a list of triples which (except for lhe in- ('lusion of some extraneous list rnemt)ers) constitutes a~t uns(n'ted representation of the desired algorithm. A single semi f()rward on expressions of ~ form equivalent t() that shown in (3) above produces the desired triples, l<xery left pare|lthesis marks the beginning of a subexpression; the o/)eration sign preceding it shows in what sort of arithmeti(. (q)erati(m that subexpression is to participate as a term. A rmming index, N~, (N for n<ct s'ube:cpres,~i(m), whieh ,ste. 4)s up o , e every iime another left parenthesis is e r o s s e d ~ s e r v e s {o g e l l e r a t e a s e t o f S H t ) e x p F e s s i o I I l l t l l l J ) e l ' 8

which become the indices N.~ of the generated triples. The numbers C,: (C for c~rre~t sube:cpression) may be generated from AT; pr(wided that one keeps a re(',oM of wh:tt subexpression nmnbers were generated at previous I left parenthesis (.rossings, and steps back t(/ the proper { earlier sltbexpressh)n after having er(ssed one or more right parentheses.

This procedure is illustrated in figure l, using expression (4) as a s(/uree of triples production.

Seetion 7 describes how to eliminate extraneous triples from lists deseribed as abovc. To begin ~ ith, one ean (with only one exception) eliminate all triples which fornl a single segment by themseh, es. S(.eon(lly, after sorting the triples into order, one can eliminate duplication bet weel~ entire segInents of triples \vhieh can be recognized to be klentieal sub-expressions.

Section 8 deserit)es some intra-segment rearrangenmnts of triples which lead to an ultimate order of computer operation which is not randomly wasteful of storage-to- register and register-to-storage transfers in the eourse of expression evaluation.

When once an expression has been converted into a string of triples, properly rearranged and made as short as possible by elimination of duplication, it is ready for

Page 3: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

conversio~ in to one-address SAP assembly code. In mos t tr iples d*e bust e lement )V~ will have to be in terpre ted as a,n SAP address of a mtmt)er to be operated upon by an operal~ioh correst)onding to O,; •

S(:(:tio~s 9 and 10 show how the FO1YI'I.tAN t rans la tor assigns sl;ora, ge lo(::d.ions an(t (~omputes addresses for' the var ious classes ()[' opertm(ts ('ll(}Olllltcred in these triples. l i e followb~g clas~.~es o[ operand s torage are considered in or(ler :

Integer and floating point constant storage Input and oldput arrays ( l lepresented by source lan-

guage refereu(.es to std)scripted variables) WorMng storage for each expresdon (Represented in

triples code by references in one s(:gment to the result p roduced by ano the r segment of the same expression)

Argun~.ent storage for functions teeturn jump address storage required when functions are

nested and control mus t pnss down a chain and then back up agMn.

Sect ion 11 ac tuMly shows, ahnos t in f low-chart form, how SAP a s sembly ins t ruc t ions are produced f rom the str ings of t r iples in to which F o w r m t x expressions have been reeoded.

The Appendix shows how FowraAx s t a t e m e n t com- pi la t ion can be ex tended to the case, in which the operat ions consist in Boolean and, or and not.

The au tho r is indeb ted to Messrs J. W. Baekus, R. A. Nelson, and [. Ziller of I B M for initiMly suggest ing to him the basic ideas underlyir~g sectior, 6 below, and he is mos t especial ly gra tefu l to Mr. A. W. Ho l t of R e m i n g t o n R a n d for the ex tended in t roduct ion to this paper , wi thout which its r eadab i l i ty would htrve suffered considerably.

2. D e f i n i t i o n s

rF1 (a) ne alphabet of FOICTRAX comprises the alpha- merle cha rac te r s A , . . . , Z , 0, 1 , . . . , 9 and the set of

~ "speeiM" c h a r a c t e r s : . , ( , ) , ,, + , - , * , / , = , and ft. The last is not a cha rac t e r expl ici t ly indicated in a n y IPORTRAN s t a t emen t , serving solely as a s t a t emen t erMmark on the execut ive level, wi th which we shall here be pr imar i ly concerned. Lower ease Greek le t ters will be used through- out to denote a r b i t r a r y charac te rs of the FOWrRAN al-

Jl phabe t . By the terul string we shall mean a sequence of juxta-

posed a lphabe t characters , of finite length. Upper' case Greek le t ters will be usedAo denote a rb i t r a ry strings.

The length L(2) of a s t r ing ~ is the n u m b e r of literM occurrences of cha rac te r tokens in its construct ion. The symbol A will be used to denote the null string, of length 0; of course, for a n y str ing E, EA = AB = Z.

Two str ings (P = ~2t " ' ~ .... kO = ~ . . . ¢,, are said [ to be identical (and we write q) = ~ ) if and only if m = n e and for each i, ¢,: = ~x.

A str ing q) is said to be includcd in (or to be a substring o/) a s t r i | |g q~ (trod we write: iP ~ ~') if q* has tire represen- ta t ion q* = A<bB where, A, B axe strings. As usual, A is said to be a head of q:, mtd B a tail of ~ .

)! I f q, is an a r b i t r a r y string, then the nota t ion H,,(,I~), !~ T,,(qe), where n is mr a r b i t r a r y non-neg~ttive integer, is

defined as follows: If ,I, = ~X, then

(/l, if n N L('10 and L@) = n,

H.,~(q.) = t or n > L ( ~ ) and ,1, = ~1, t

~A if n = 0

I X if n =< L(,P) and L(X) = n,

T,~('IO = or n > L ( ~ ) and ,Is = X

IA if n = 0

Thus, if • = A B C + D / E * * F then

II4(gz) = A B C +

7;('I,) = / E ** F

I i ,7('P ) = "t,

(b) Names are strings classified as follows:

integer cons tan ts

f loat ing-point cons tants

integer wa ' iables

f loat ing-point var iables

function names

An integer constant is a string K = K1 " " K. such t ha t each K~ is an integer be tween 0 and 9 (inclusively) and the relation K < f 5 holds.

A floating-point constant is a str ing F = " r ~ ' " 7 , ~ where 'Yi = • for some j and each "y~, for' i # j , is be- tween 0 and 9 (inclusively), provided F = 0 or 10 as < P < 10 as.

A n integer variable is a s tr ing I = t~ . . . L,, where each ~j is a lphameric , bu t t~ is one of the charac ters ~, z, ~, L, M or x, and L( I ) ~ 6.

A floating-point variable is a string P = ' y ~ . . . % , where each .y~ is Mphamerie and 'yl is a lphabet ic but not one of the charac te rs 1, a, K, ~, M or N. Again, L(F) =< 6.

A function name is a s t r ing q) = ~,,, "-- , ~,~, where each ~ is a lphamerie , ~1 is Mphabet ic and such t h a t either

2 (i) L@) =< 6, • does not appea r in a DIMENSION

s ta tement , and L@) < 4 or ~n # F

or ( i i ) 4 =< L @ ) N 7 and ¢,~ = r'.

F u n c t i o n s d e n o t e d b y n a m e s in the first categol3 , are

referred to as Fs- type f lmctions, and are defined by FORTRAN I I FUNCTION o r SUBI-~OUTINE subprograms. Those denoted by names in tile second ca tegory are referred to as FN-type functions, and are defined by machine-coded l ibrary t ape subprograms, buil t- in open subroutines, or by a single a r i thmet ic s t a t e m e n t (see function definition below). An Fs- type funct ion is integer- valued if and only if ~ is one of the charac te rs I, J, K, L, M or X. An Fx- type funct ion is integer-vMued if and

.... ~-i~(~Tdcfinition, see, e.g., the FORTRAX Programmer's Refer- enee Manual.

C o n n n u n i c a t i o n s o f t h e ACM 11

Page 4: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

only if ~, = x . (This d i sc repancy in nota t ional convent ion is, in principle, a nonessent ia l one and arose only th rough historical accident . )

(c) A subscript is a str ing of one of the following forms:

K

K , E

E - [ - K

~ - - K

K * E + P

K * Z - - P

where K, P are integer cons tan ts and E is an integer var i - able. In addi t ion, the magn i tude of a subscr ip t cannot exceed 2 is - 1.

(d) A subscripted variable is a s tr ing of the fo rm

T(Z, , . - - , Ze)

where I ' is an integer or f loat ing-point var iable and such t h a t if T = r~ . . . r~ ,

(i) T appea r s in a DIMENSION s t a t e m e n t

and (it) L(T) < 4 or r~ ~ F

and the 2a are subscr ipts with k ~ 3.

3. Rules of Format ion

(t) The set of expressions is recursively character ized as follows:

E l . A cons tan t or var iable n a m e ~ is an expression of the same mode as ~. If il, = ,#~ . . . ,~, and L(~) < 4t or ~,, # ~', and a) appea r s in a DIMENSION s t a t emen t , and if, fu r thermore , E l , . . . , 2e , (1 =< k =< 3), are subscripts , then ~l,(E~, . - - , 2e) is an expression of the same mode as el,.

E2. I f ~ is an expression not. of the fo rm +,I~ or - % then q-~ and -~I, are expressions of the same mode as ~.

E3. i f ~I, is an expression, then @) is an expression of the same mode as ~.

E4. I f • is an n-adie funct ion name and A~, . . . , A~ are expressions, then ~(A~, . . . , A~) is an expression of the same mode as ~.

E5. I f ,I,, ,I~ are expressions of the same mode and ,I, is not of the fo rm + X or - X , then • + % ,I~ - ~ , q, * q(, q , / ~ are expressions of the same mode as ~.

E6. If ~o, qe are expressions and ,I~ is f loat ing-point mode only when ~I, is, then unless ei ther q, or • is of the fo rm A ** B (A, B variables, constants , or funct ion expres- sions), • ** "I-' is an expression of the same mode as el,.

No te t h a t E l - E 6 prohibi t , b y implicat ion, the wri t ing of mixed-mode expressions except under cer tain special formal conditions. T o wit, the only allowable expressions in our language are those we shall t e rm integer and float- i'ng-point expressions. An integer expression is one in which any f loat ing-point mode name occurs, if a t all, wi thin the " a r g u m e n t vec to r " of some funct ion expres-

12 C o m m u n i c a t i o n s o f t h e ACM

sion,-- for example :

i + :xst~'~,"(A • :~/c)

A float ing-point expression, on the other hand, is one in which any in teger -mode nnme occurs, if a t all, within the a r g u m e n t vec tor of a functior~ expression, within the ex- ponen t of an exponential , or within a subscript , ~-for example: A(0 ** a + srxr(K • L/~0

Any other s tr ing is referred to as mixed, and c a n n o t belong to the set of allowed expressions.

(H) An arithmetic expression is a string of the fo rm

= ~ - ~

where q, is an expression in the sense of ~ , - ~o. (m) A pure arithmetic statement is a s tr ing of the f o r m

= ¢ ~

where T is a subscr ip ted or nonsubser ip ted variable, a n d = ee -t is an a r i thmet ic expression.

0 v ) A quasi-arithmetic statement is a s tr ing of one of t h e following forms:

(a) IF@), where • is an expression. (b) CaLL ~(A~, . ." , A,~), where • is a funct ion n a m e

such tha t (i) i f ~ = ~ t . . . ~ , t h e n L @ ) < 4 or,#~ ~ F , L @ ) =<

6 and q~ does not appea r in a DIMENSION s t a t e m e n t (it) each A, is an expression, in the sense of E1-£'6

or a Hol ler i th field (q.v., FoaTm~X P r o g r a m m e r ' s R e f e r - enee M~mual). T h e reason for referring to the above as quasi-arithmetic s t a t e m e n t s will be made clear later. [

(V) A .h~nction definition is a s tr ing of the fo rm

,~(A~, . . . , A,~) ~- ~'[A~, . - . , A4

where ¢I~ is a funct ion n a m e such t h a t (i) if • = ~, - . . ~,,, then 4 N L(~) =< 7 and ,~,~ = F, (ii) each d~ is a nonsubser ip ted w~riable name, (iii) E [ A i , . . . , An] i8 ant expression (in the sense o f

El---E6) in the (free) var iables A 1 , ' . . , A,~. We n o t e in passing, t h a t any Ai m a y occur vacuous ly in E l A n , • . . , A,,]. Thus , E[Ai , . . . , A,] is a funct ion form in t h e free var iables A~, . . . , d,~.

4. R e d u c t i o n of Expressions

An expression • is reduced in the following m a n n e r : (I) Each integer or f loat ing-point cons t an t occur r ing

in ~, bu t not conta ined in a subscript , is replaced by a n integer or f loat ing-point var iable name , respec t ive ly , p rovided the r ep lacement is made a t all occurrences of t h a t cons t an t mid the r ep l acemen t name does no t a l r e a d y occur i,},,I~.

(H) Each iuteger or f loat ing-point subscr ip ted v a r i a b l e is replaced by a nonsubscr ip ted integer or f l oa t ing -po in t var iable name , respect ively, in such a m a n n e r t h a t t h e r ep lacement n a m e does not a l ready occur in ,l~ and s u c h t h a t the r ep l acemen t of each subscr ip ted var iab le n a m e is m a d e a t all occurrences of the la t te r in q,. Also, no s u b - scr ipted var iab le can have the same n a m e as a n o n s u b -

Page 5: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

~<ripted variable, nor can any two n-dimensional sub- s( ripted wtriables have the same names if their first n - 1 dimensions are respectively iden t i ca l .

The result, then, of applying procedures (i), (H), above, to the expression ¢, will be termed its reduced form and denoted by ~ze.

Now, if d) is an expression, Chen we shall denote by -_~>~,' the set of all non-null c(mnected name strings occurring in qb. [htL!;, for example, if

4, = : ~ B C , ( - - x v )

~ = {A, ~, C, aN, Be, ,tBC, X, Y, XY}. Clearly, for pression qP, there exists at least one element ~" @ ~,r ~hat if ~, ~ ~3~ and E C 'If, then 2: = ~, i.e., E this sense, :~ maximal element. We shall denote

the subset of ~,r consisting of its maximal con- non-null name strings. In the last example, ~B,~ =

KY} °

rmal F o r m of an A r i t h m e t i c Express ion

his section we shall define a set of transformations beginning with any reduced arithmetic expression =

reeursively generate what we shall call partial nor- ,ms A,X where T~(X) = { and ~0 is the character ~e recursion on the head-strings hi is as fffllows:

k i ~ X -+ k i ~ ' X ,

Ai+I = A i ~ ' .

tall assume throughout that E (~ g!3.t, U {(}.

eX --) A~ + (,(**(@EX if 5I'1(A~) ~7 {=, (} (:~ ± ( , ( * * ( e z x if r , ( = , ) ~ { - - , ( l ) ± EX - -4

~A~ ) ) ) ~ (,(**(~:~X, otherwise •/zx - , ~, ) ) •/(**(®zx • * EX --~ a~ ) ** (@:~X ' ,X-- , A~ @ (X if 7'~(kO ~ 93¢ } x + ~ ) ) ) ) x

, x-~ ~)))) o ( x -~ ~..)))

that the recursion is terminated by 77'8, sittce at this step. Note, further, that stratification

~els) of the arithmetic expression = q, ~ via the schema proceeds from the assumption that the

or degree) of "binding" of operations appearing in .,ides with that of conventional mathematical usage, order of increasing "strength"): ± ; . / ; **; and @.

rse, the last noted "operation" is associated solely partial normal form with functions and has only a

¢ieal purpose in that context. Titus, 77'1-T8 reflect ader more explicit normally assumed usage regard- :atifieation of algebraic expressions. e denote by Nd¢') the ith partial normal form of by N@) the last of these (which we may refer to as the normal form of ¢,), then the following ex-

should suffice to illustrate the above schema. qD = --XYZF(A, B * e ** ( - - I ) ) ) /E "Jl-" F. Then,

;V0@) ~v~(®)

N,@)

N~(+)

N~@)

Nxo(([ ~)

= A0q~ ~, with ~o standing :for = = ~ ( ~ , B • c ** ( - - D ) ) / i , : + ~. -~,

with At = ~0 - (*(**(@XYZF = ~x,., :~, • c ** ( - - D ) ) / E + r ~,

with 52 = A1 ® ( = ~ , ~ • c ** ( - - D ) ) / , ~ : + F q ,

with Aa = k~ ÷ ( , (**(®, = a,~l~ • c ** ( - - D ) ) / i , : + F q ,

w i t h & = 5 a ) ) ) ) ® ( = ~ • c ** (-D))/i~ + e ~,

with 5~ = ,54 + (*(**(@B = ~ 6 " * ( - - D ) ) / E - t - F ~,

with & = & ) )*(**(@c = . 5 7 - D) ) /~ . :+F~ , withAr = zX6)**(~( = &))/~: + ~' ~, with & = e ,~ - (,(**(@D = Ag)/E + F q , withAo = & s ) ) ) ) = & o / E + F-~, wi th&o = & ~ ) ) ) ) = & ~ + F ~ , with&u = &o ) ) / (**(®E = ~x,, ~, w i t h zx,, = ~. ) ) ) + (.(**(eF

and, finally,

N@) = Nla@) = Ala, with At3 = ~12 ) ) ).

More explicitly, then, N@) is tile expression

= - ( , ( * * ( e x ~ - z t . e ( + ( , ( * * ( e ~ ) ) ) ) • ( + (.(**

(er~) ),(**(ec )**(e(- (,(**(e,) ) ) ) ) ) ) ) ) )/

(**(e:~:) ) ) + (,(**(eF) )

I t may be seen quite easily that the result N(q,) of app l i cation of T1-T8 to a reduced arithmetic express im = cP ~¢ ~ is such that the balance of left and right p a r e n theses is not disturbed (closure condition). To wit: t he threq additional left parentheses generated by TI are close( by the three additional" right parentheses genera ted b.! 7'8, if TI(A~) is = , or 77'6, if TI(A0 is (; the same is truq for the first line of .7'2, and the second line of T2 is se l f dosing; .7'3, 77'4 and 7'5 are self-closing; 7'6 in t roduee l three additional right parentheses which are d o s e d b3 the three additional left parentheses generated b y T: or the first line of 77'2; identical assertions hold for T; and 77'8.

If N@) is = % then we shall define ~N as ~, i.e., th ( string N@) minus the FORTRAN = sign. Thus, ¢x is t. string of elements of the form @A'~ where, for each i

~ i = A a n d ~ i = )

or ~ , ~ { + , - , , , * * , ®} and q~ ~ ~B,U {(}.

6. L e v e l A n a l y s i s

The level analysis of an arithmetic expression = ,ll consists in the reeursive generation of what we shall e a l partial productions I t , each partial production a s t r in{ of triples (of entities to be described below) f o r m e d irl the following manner.

We define three integer sequences {Nd, {C~}, {A,:} and a sequence {Kd of integer sequences such t h a t : initially, N1 = 1, C1 = A1 = 0 and K1 = A. B y K / we shall mean the last teml of the sequence K~, a n d it K~ = (0, K / ) then R.~ = (0) (possibly null). We: se t t he

C o m m u n i c a t i o n s o f t h e A C ) [ 1 3

Page 6: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

T A B L E 1 . . . . . . . . . . . . . . . . . " ~ . . . . . . 7 2 7 . . . . . . . . . . . . 7 - . . . . . . . . . . . . . . . . . . . . . . . . . . ; - - - r . . . . . . . . . . = " . . . . . . " 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

I(: i )_' ( i i .,< . . . . . . . . . * ( I . . . . . ** ( , ~) 'X ) ) --k( * ( * * ( , £ ~ { .. , ) ) ]~ ~ ( l @ C ) ** (, @ (" 0-- (

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . - 4 1 . . . . . . } ....... ~ ....................... , . . . . . . . . . . . . . . . i ..... i .....................................

Ne • 7~ 8 8 8 8 8, 9 10 11 i 12 12 ~2 i 1~ I ]3] ~, 14 14 ~5 16 (":, 0 6 7 7 6 ] ;3 8 9 10 ' 11 11 1.0 i 9 [ 12 ! I3 13 12 [4 15 A~ 0 6 7 7 6 i .3 ~ 4 5 6 7 7 6 [ 1 5 i 6 , 7 7 6 7 8 i 1 8 9 10 11 1 4 15 16 t7 18 19 2() i 21 { 22 i 23 24 25 26 27

I i '

N , ~17 18 19 19 19 19 19 19 19 20 2l 1 4 ' 12 / 9 ] 8 1 19 2() : 2 I t 21 2l 21 22 23 24 24 2,t 24 24 c, 17 It8 [8 20 19 1 0 21 22 23 23 22 ! 21 0 Ai I 9 10 lI 11 7 6 5 4 1 2 3 [ 3 2 , 1 0 1 2 i 3 3 2 i It 0 i 28 29 30 31 '35i 36 137 38 41 42 43:44- 45146 47 48 49 i 50 51 52 i 53 5i

i

)

i ~ L

: j

d

inithd partial production II1 = A, and if [[i ~ ILE, then IL+t = II.iE.

Now if ~ ' i ~ denotes the /tit element of q;v, the pro- duetion schema is as follows:

I ( then [L -~ IIi(C~, <~ , NO N,iH = N i + 1

I Ci+t = N.~ Ai+~ = A~:+ 1 Ki+~ = (K~, C d

Z, where Z C !~33¢, then [Ii ~ II,:(Ci , ~ i , X)

Ni+l = N i If ~ i = C,,+~ = C~

Ai+~ = A:; K g+~ = Ki

then [Ii --~ H~ Ni+l = N~ Ci+l = K,:' A,i+, = A ~ - l K~+t = R.:

The effect of applying these transformations com- pletely to the normal form ~pv is to produce a string 11@)- which we shall refer to simply as the production of ~--- which is a representation of the computation required to evaluate the original arithmetic expression = ~-~. The computation proceeds from the innermost to the outermost levels (in the sense of T1-TS) in a systematic manner, as we shall later see.

Note that, since A~ = 0 and

{ A1-4- 1 when'-P; = (

Ai+t = A i when Ti ~ ~3¢

A,: - - l whenT~ = ),

then the last term of the sequence {Ai} is 0 if and only if (I, itself is closed with respect to parenthesization.

Now, each element C (respectively N) of the set of terms of the sequence {C.i} (respectively {Nil) m~y be interpreted as the index of a "currently defined" (respec- tively, "next-to-be-defined") expression S~ (respectively SN) embedded in the production 1I@). Each such expres- sion is defined as the set of all triples of the form (C, ~,~J, T~), for arbitrary, fixed C, where the index j ranges over the number of triples having the "current" index C.

14 C o m m u n i c a t i o n s o f t h e ACM

li'rom the above "partial production" schema, it, is easily seen that, for any i, j, C,: = Cj entails A., = d j (but not conversely). • s l 'hu:, sinee each element of the se- qtmnce {di} is a level index (rising by 1 for each left parenthesis, dropping by it for each right parenthesis, and otherwise stationary), it is meaningful to state that each of the above mentioned triples belongs to the same level. We shall call such a set of triples a segmerd of the

i level to which any of its members belong. Of two seg- ments Sc.~, Sc: , moreover, we may say that they belong to (or are segments qf) the same level if' and only if A~ = A~. Since this is at, equivalence relation between seg- ments, we see that each level is completely defined by the set of all segments bdonging to it, in the above defined sense. Furthermore, since all triples comprising a segment behmg to the same level, their operation elements are of the same type, i.e., all • and/or / ; a[l + and/or - ; all • , ; o r all @.

Before proeeeding further, let us pause to illustrate the partial production schema with respect to the example

~, = - x Y z r ( a , n • c ** ( - D ) ) / E + r

of sect, i(m 5. To this end, it is convenient to arrange all information in the form of table I, wherein the ith column contains at its heard the element ~,ivg,i of ~I ;v and, imme- diately below, the respective values of N~, C~, A.:, and i. Table 2 is a parallel display ef the generated terms of the sequence {Ki} (which we name the C-sequence) and the partial productions 1L.

7 . O p t i m i z a t i o n ( G e n e r a l )

The first stage of optimization of II@) consists in the elimination of redundant parentheses arising out of the transformations T1-T8. Having achieved the desired stratification of ~, they shall now disappear from the scene, and in the following manner.

The production lI@) is scanned, "back-to-front", one triple at a time. If and only if a triple (C, ~ , T) belongs to a segment of length 1. and ~ # - is this triple elimi- nated from 1I, and • replaces the third member of its immediate predecessor. This "telescoping" procedure is based on the fact that a segment is of length 1 only if its "name" (i.e., current index C) is "addressed" by its imme- diate predecessor. This last assertion follows, in t u rn , from the f i rs t partial production rule. i

Page 7: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

T A B L E 2

C-Sequence

K~ = A K~ = (0) Ka = (0, 1) K4 = (0, l , 2) K.~ = (0, t , 2) K~ = (0, 1, 2, 3) K7 = (0, 1, 2, 3, 4) Ks = ( 0 , 1 , 2 , 3 , 4 , 5 ) K~ = ( 0 , 1 , 2 , 3 , 4 , 5 , 6 ) K~0 = (0, 1, 2, 3, 4, 5, 6) Kn = (0, 1, 2, 3, 4, 5) K~2 = (0. 1 ,2 , 3, 4) K~a = (0, 1, 2, 3) Kt4 = (0, l , 2) K~a = (0, 1 ,2 , 3) K~ = (0, 1, 2, 3, 8) K17 = (0, 1, 2, 3, 8, 9) KlS = (0, 1, 2, 3, 8, 9, 10) KI9 = (0, 1, 2, 3, 8, 9, 10) K~o = (0, 1, 2, 3, 8, 9) K~ = ( 0 , 1 , 2 , 3 , 8 ) Keg = (0, 1, 2, 3, 8, 9) K,,,a = (0, 1 , 2 , 3 , 8 , 9 , 12) K~,~ = (0, 1, 2, 3, 8, 9, 12) K ~ = (0, 1, 2, 3, 8, 9) K ~ = (0, 1 , 2 , 3 , 8 , 9 , 12) K~7 = (0, 1 , 2 , 3 , 8, 9, 12, 14) K~s = (0, 1 , 2 , 3 , 8, 9, 1 2 , 1 4 , 1 5 ) Ke,~ = ( 0 , 1 , 2 , 3 , 8 , 9 , 12, 14, 15, !.6) Kao = (0, 1 , 2 , 3 , 8 , 9 , 12, 14,15, 16, 17) Ka~ = Kao Ka~ = (0, l , 2, 3, 8, 9, 12, 14, 15, 16) Kaa = (0, 1, 2, 3 , 8, 9, 12, 14, 15) Ka.t = (0, 1 , 2 , 3 , 8 , 9 , 12, 14) Kay, = (0, 1 , 2 , 3 , 8 , 9 , 12) Ka~ = (0, 1 , 2 , 3 , 8 , 9 ) K87 = (0, 1, 2, 3, 8) Kas = (0. 1, 2, 3) Ka9 = (0, 1, 2) K40 = (0, 1) K~ = (0) K~e = (0, 1) K4a = (0, 1, 19) K44 = K4a K.~5 = (0, 1) K46 = (0) K~ = A S.*s = (0) K~.~ = (0, 21) Kr~0 = (0, 21, 22) K ~ = K~0 K~2 = (0, 21) Kaa = (0) K~4 = A

Par tia[ Productions

[I~ = A lie = IIl(0, --, 1) IIa = IIe(l, * ,2 ) H4 = Ha(2, **, 3) II., = I14(3, Q , xYZt,') II~ = n d 3 , ® , 4) 117 = r id4 , + , 5) Ils = II7(5, *, 6) II~ = IIs(6, **, 7) Illo = r[9(7, @ , A) I lu = H10 II12 = IIil~ 11ta = I I p , II14 = II,a 11,8 = II1114(3, @, 8) 1116 = It:5(8, + , 9) Ill7 = I h d 9 , *, 10) 111s = IltT(10, *% 11) IIl~ = 111s(11, @, B) II20 = IIi9 112t = I1,~0 1122 = 1121(9, *, 12) H~,a = lice(12, *% 13) I[~ = llea(13, @, C) II~8 = II~4 Iie~ = II-2~(12, *% 14) II27 = II~(14, @, 15) l[I:s = II~(15, -- , 16) 11~ = II~a(16, *, 17) 118o = 11~9(17, **, 18) Hat = IIa0(18,(f), D) IIa~ = Ilat Ilaa = [Ia~ I[a4 = Ilaa Ilaa = IIa4 I11a,~ = lIIaa 11at = Ilaa l-[as = t137 Has = lls8 II4o = Ha9 II4~ = II~o I I~ = n ~ ( 1 , / , tg) II~a = Ilv.,(19, **, 20) 11,~4 = 114a(20, @, I~:) 11~8 = II44 H4B = 1148 II47 = [[46 [I~s = [147(0, -- , 21) I Io = II~s(21, *, 22) II~0 = II~9(22, **, 23) IIbl = II~0(23, @, F) II52 = 11.~1

I I ~ = II~a

Next, the set of so-condensed segments fI@) is ordered according to current indices, so that, if

,vhere

fi@) = & . . . &

& = (c, 0o', ~o') . . . (c, O ) ~, ~)o)

;hen S~, "precedes" S¢. if and only if c' =< c". The next stage of optimization involves the "elimina-

t ion" of comnlon subexpressions, so as to avoid redundant computation. This is accomplished in two steps:

1) Beginning with &:, the last segment in fI(~), and for each i =< L, the set, of all $5 with j < i is examined for the occurrence of an S~ = S~. As soon as some Si = Sx, S~ is eliminated from [](qs), and all references to Sj replaced by references to S~, i.e., if some % = j, then j is set equal to i.

2) Having elinfinated, by 1), common segments, we now eliminate eonmmn sul)expressions. Beginning with S£., and for each i =< L, the set of all Si with j < i is examined for the occurrence of more than one reference to S~, i.e., the occurrence of ~,,,, ',P,~, with m # n and ~Ig,,~ = ~,~ = i. If and only if this is the case is S~ tagged as a eomnmn subexpression (what we call a cs-type seg- ment).

Procedures 1) and 2) together assure the elimination of outermost common subexpressions. Thus, if

(D = A * (U * C) + SINF(A * (U * C)) ,

t h e n

[ f I @ ) = ( 0 , + , 1 ) ( 0 , + , 1 4 ) ( 1 , , , x ) ( 1 , , , 7 )

(7 , . , B)(7, *, C)(14, @, SlNF)(14, @, 16)(16, *, A)

(16, ,, 22)(22, ,, n)(22, ,, c)]

Procedures 1) and 2) reduce fI to

(0, + , 16)(0, + , 14)(14, @, S~NF)(14, @, 16)

(16, *, A)(16, *, 22)(22, ,, n)(22, ,, c),

with S~ tagged as a cs-type segnlent, since q,o ~ = ~t~ ~ = 16.

W e s h a l l d e n o t e t h e r e s u l t of c o m m o n s u b e x p r e s s i o n

e l i m i n a t i o n b y ( ~ ) .

8 . O p t i m i z a t i o n ( S p e c i a l )

O w i n g t o t h e f a c t t h a t t h e FORTRAN S y s t e m w a s o r ig i -

n a l l y d e s i g n e d t o c, o m p i l e " o b j e c t " ( r u n n i n g ) p r o g r a m s

i n 7 0 4 l a n g u a g e , c e r t a i n f u r t h e r s p e c i e s of o p t i n f i z a t i o n

r e g a r d i n g t h e c o m p i l a t i o n of a r i t h m e t i c s t a t e n l e n t s a p -

p e a r t o b e n e c e s s a r y if a d v a n t a g e i s t o b e t a k e n of t h e

m a c h i n e ' s o w n s p e c i a l c h a r a c t e r i s t i c s . W e l i s t t h e s e in

t h e o r d e r i n w h i c h t h e y a r e c o n s i d e r e d b y t h e e x e c u t i v e

p r o g r a m .

1) E a c h s e g m e n t , S~ w i t h @ i ~ = * is s c a n n e d f o r p o s -

s ib le p e r m u t a t i o n of i t s m e m b e r s , so a s t o m i n i m i z e t h e

o c c u r r e n c e of c o m p i l e d m e m o r y a c c e s s e s . S p e c i f i c a l l y ,

e a c h s e g m e n t S i of t h e f o r m

(i, *, ~, ') . . - (i, ~ ? ' , ~ 7 9 where

~ ] = . o r / , 1 < j =< X~

undergoes permutation of its elements so as to yield a (possibly null) maximal subsegment of the form

(i, ,, ~ / ' ) ( / , / , ~ P ) . . . (i, , , ~ / ~ - ' ) ( i , / , "~ /9

i.e., a maximal subsegment whose operator structure is • , / , , / , . . . , , / .

Since the only remaining elements (if any) are of the form (i, *, ,t,) or ( i , / , ~), consider the following eases:

(i) The number of . 's in S~ is one more than the num-

C o m m u n i c a t i o n s o f t h e A C M 115

Page 8: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

ber of / 's . It: this case, tile operator structure of St is , / , / . . . ,/,.

(it) The number of , ' s in S,: is at least two more than the number o f / ' s . In this case, the operator structure of S< is * * / , / . . . , / * . . - , ,

(iii) The number of / ' s in N~. exceeds the mm~ber of • 's. In this case, the structure is . / . . . , / . . . / .

2) A segment S~ is said to be type Mq it! its last opera- lion is / ; otherwise, it is said to be type AC.

A fm'ther species of optimization of H(~), which we term linkage, designed to minimize memory accesses, is performed in the following manner. Beginning with the last segment, SL, each segment S~ is examined as to type and affected in the following ways.

(I) S~ is type he. Then & is tagged as Ae-linkable and Si_t as At-linked, if and only if one of the following con- ditions obtains:

(i) @,:_1 + o r - and for some j, .i-1 = i. In th i s ease, in addition to tagging ,~.q and S,:_~ , interchange the first and j th elements of &_~.

(it). @.i_1 t *, @i._: 2 / , and for some j , ~_~ = i, <-:~ = *. Again, in addition to tagging Si and S~ 1,

interchange the first and j th elements of & - l . (iii) @ ~_~: @, i-: = i and ,Iq_~: is the name of a

closed subroutine (see below), and ~N-type function, or art open univariate function.

(iv) @ i-t' = ** and ~.,_** = i. (I0 S., is type MQ. Then S~ is tagged as ~*Q-linkable

and S~_, as Mq-lirdced, if and only if one of the following conditions obtains:

1 (i) ~.i_: = *, @~-,2 = * and for some j, T.i_:i= i,

J @~_~ = *. In this ease, in addition to tagging & and S~_,, interchange the first and j th elenlents of S,_~.

~i f 8 1 (it) @ i_, 1 @, ~-t = i and T~_~ is the name of a dosed subroutine (see below), an ~N-type function or an open univariate function.

(iii) @i_** = **. There are two cases: (a) ~_, is the name of art integer constant less than

7 (in which ease S~_~ is compiled as an open subroutine), and .i-t = i.

(b) 'Iq_l ~ = i. [n all other cases, i.e. eases which do not fall either under (0 or (tI) above, S< is unlinkable and Si-~ unlinked.

9. Funct ion Types

With each library or FN-type function appearing in a FORTRAN program is associated a type number according to the following scheme:

I) Each library function is of type 0. H) If • is an FN-function name, where

q~(At, . . . , A,~) = td[A~ , . . . , Anl q ,

then

(a) if E contains no library or FN-fmlction name, ~ is of type 1.

(b) if E contains a library or FN-functk)n name, and r r , - . . , r~: are the type numbers already associated with these functions, then the type number of ,I, is simply

max (r~ , . . . , r~) + 1.

16 C o m m u n i c a t i o n s o f t h e A C M

10. Address Compi /a f io l l

Each member of an element (triple) occurring in lI@) is represented during compilation by the cont, ents of a full word of 704 storage. These three words are re- ferred to, respectively, as the htg word, operator word, and symbol word. The precise bit-structure of each of these words is a function of the role plnyed by the element in question.

The tag word of each element of a. segment S~ contains not only the current index i, but in addition, if this ele- ment refers to a subscripted variable in the original (un- reduced) expression (I, a set of three tags identifying the dimension, subscript and addend combinations.

The operator word of each element contains the ope ra - tion code and, if the first clemcnt of t~ segment S~., a set of bits containing information as to certain properties of this segment, viz., whether the segment is linkable or linked (and through which arithmetic register); whether arithmetic for this segment is floating or integer mode; whether, if 1 @i = @, St defines a library, open sub- routine, FS- or F~'-function (and, if the latter, what its type); whether or not S~ defines a common subexpres - sion; whether the result of computation of S~ appears the accumulator or multiplier-quotient register.

The symbol word of each element contains the nan of the eperand, which may be an integer or floating mode w~riable, an integer or floating mode constant, a func- tion, or some other segment S i .

We shall denote the compiled tagged-address asso- ciated with the j th symbol of the ith segment by ~ / .

Actual SAP-form address compilation proceeds as follows:

(i) An address reference to an integer (respectively floating-point) constant is compiled into a symbolic address 2) (respectively 3)) and relative address v (re spectively u), where ~ (respectively u) is associated with the vth (respectively uth) distinct integer (respectively floating-point) constant occurring in a given source program.

(it) An address reference to a subscripted variable is compiled as follows:

(a) If K • ~ ~ P is the canonical form of the subseripl associated with a one-dimensional variable ,I,/, then the symbolic address is compiled as "I,'i ~ and the relative address as 1 ~ P.

(b) If K : . 2 h =t: P1, K s . Z2 ± P2 are the canonical subscripts associated with a two-dimensional variable 'It .j~ , then the symbolic address is compiled as ~ : ' and the relative address as

e- (=I=Pi- El) -- r1(±P2- e~),

where

f ~ if ~1 = Z2 = A = otherwise,

if Zk # A ek = "~lfo otherwise,

and Pl is the first dimension of ,F/.

Page 9: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

i;

ie

re) Finally, if K, • E:t ± P l , K.e * Ee ± Pe,

Ka * Ea ± Pa

are the canonical subscripts associated with the three- dimensional w~riable q,,/, then the symbolic address is compiled as ~F/and the relative address as

e,) - C , ( ± e ~ - ee) - r , P e ( ~ P a - e:~) e - - r i P , -

where

e = otherwise,

I(~ if ~ ~ A ~k = otherwise,

mid I'~, I'2 are the first two dimensions of q>/. (iii) An address reference to another segment Sj is

compiled into a symbolic address l) r, where r is the type number associated with the arithmetic expression =

-~, and a relative address ~i. The ~ ' s are erasable storage relative addresses deternfined in the following way: Beginning with the last segment SL of I~[(4S), each S~ is examined to determine whether it is nonlinkable or is tagged as a common subexpression, hi either case (and only then) an erasable storage relative address

~ = ~ + 1

is associated with & , where S~ (i =< j) is the last, seg- ment nonlinkable or tagged as a common subexpression, mld, initially, ~ = - 1 or 0, depending upon whether I i@) does or does riot define an Fx-type function.

(iv) With each tape library or ~\,¢-type function is associated a class of erasable storage cells set aside as a i)uffer for the transmission of its arguments. The type r of any given class is determined by the type assigned to the function in question. Thus, tape library fimctions are ~dways of type 0, and an Fi-function is of type 1 greater than the highest type occurring in its definition.

In the case of a tape library function, art address refer- ence to its kth argument is compiled into a symbolic address 4)0 and a relative address - ( k - 1 ) .

In the case of all FN-function, on the other hand, an address reference to its kth argument is compiled into a symbolic address 4)r, where r is the type number, arid relative address k-1.

I t should be rioted, at this point, that the necessity *'or typing tape library and FN-functions arises from the fact that either may occur within the definition of an Fx-funetion. Unrestricted nesting of these functions within such a context is possible, therefore, only if their ~rgument buffer regions are non-overlapping.

(v) For the reasons cited at the end of (iv), the class of calling-index saving cells is also typed, type 0 as- signed for tape library functions and type r > 0 for a given FN-function of type r. The relative address in both eases is 0, the symbolic address 7)0 for tape library func- tions and 7)r for Fx-functions.

(vi) Finally, when intrasegment erasable storage is required, a single cell is set aside having symbolic address 7)0 and relative address 0.

11. A r i t h m e t i c S t a t e m e n t C o m p i l a t i o n

Beginning with the last segment SL of H@), each seg- ment S~ is "forward scanned" and compiled according to the following schenm.

(I) Initial Compilation of a Segment

(A) ~ i ~ = + . There are two cases: (i) S:i is linked. Proceed to (g), unless Si is of length

1, in which case proceed to (hi). (ii) S~ is unlinked. Compile CLA ~ and proceed t(~

(~) .

(B) ~ i 1 = - . Again, two cases: (i) Si is linked. Compile cns and proceed to (H),

unless S~ is of length 1, in which case proceed to (Iii).~ (ii) S~ is unlinked. Compile C:LS q~, then proceed as

per B(i). (C) C , ) = *. T w o cases: (i) & is linked. Proceed to (~). (ii) & is unlinked. Two subcases: (a) ~ i ~ = / . Compile cI~:t ~,1. (b) ~,i 2 = *. Compile LDO ~i I. In either case, proceed next to (H). (D) <~1 = ®. There are several cases: (i) qct is the name of a tape library subroutine. Three

s u b c a s e s : (a) S; is At-linked. Compile the sequence

I [ S IA)Q ~i4 TQ 4)0 -- 2

|1LI Q ~,: ~(STq 4)0 -- ( M - 2)

LDq ~a

followed by the sequence

SXD 7)0,4

TSX q2il,4

LXl/ 7)0,4.

Either subsequence in braces is vacuous (not compiled) in the event S,: defines a mfivariate or bivariate func- tion only. Note, further, that both the SXD and Lxo in- structions surrounding the TSX may be eliminated by a later section of the FORTRAN executive system in the event that the flow of indexing information obviates saving the contents of register 4 at this point.

(b) S i is Eta-linked. Compile the sequence

CLA ~-~ i 4 STO 4)0 -- 2

t . . . . . . . . . . . . CLA ~f i hi

~STO 4 )0 - - (M-- 2)

followed by the sequence

CLA ~d~i 2

SXD 7)0,4

TSX x~il,4

LXD 7)0,4.

Communicat ions of the ACM 17

Page 10: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

The 8tt[)se(ti.l(;iK:e it, braee, s is v:~.cuotts ira the event ,% de,- fines a bivariate function only,

(c) S~ is unlit/ked. Compile (:lea q,,~, then lhu sequence

((L,)Q ~,4

4)o - - e) ( 'I,I)Q @i:; "

follmved t}y the se(tu(mee

sxD 7)0,4

'['sx q, il,4

l ,xo 7)0, l,

Either subse(luence in |)races is vacuous in the event; £',. defines a univariate or biv:tri:tte function only.

(ii) qC ~ is the name of :m Fx-funetion of type r. Three subct ts (}S :

(a) S, is At-linked. Compile STo 4)r, then the sequence

[A)O x[t/3

STq 4) r + I 'CLA x[t,: '~

STO l)r + 2

CI,A @i xi

~ST() 4)7" + (Xl -- 2)

folh~wed by the sequence

sxl) 7)r,4

!l'SX qr;t, [

[,XO 7)r , I.

li',ither subse<luence in braces is vaetlotls lit the event S,. defines a univariale or I)iwa'iate ftmetion only.

(t>) S~ is ~tq-linke(:l. (]<)mpile the se<luelme

{'.LA '~[ti '~

s'ro 4) r

STQ [ ) r + 1

then the se(luenee

CLA ~ti 4

STO 4)r + 2

CLA x~r i '\ i

STo ;t)~ + ( X , - 2)

R~llowed by the sequence

sxD 7)r ,4

TSK qril,4

I, XI) 7) r,4.

The subseqttenee in braces is vacuous in the event S, defines a bivariate funetkm only.

18 C o m m u n i c a t i o n s of the ACM

(c) S~ i,~5, u~lhiked. Compile the :-equence - - 9

sTo '-IOT,

~}112[] t~l(} S('(]llfHI(::C

I,bQ CP; ;~

! i . . . . . . . . . . .

l + (x,:-

followed, agai~, by the sequeuce

s x J) 7) r,4

'rsx q,,~,4

~,xl) 7)7,4

I[!;ither subseque~lce in b r a c e s is v~t(tl.l()lb5 ill l:}]e e V e l / t ~S', ;!

defines a tmivariate or bivariate fm~etioH o n l y (I l l ) XPi lS th(~ / t a l l l e of ~iAi FS-{l l l l0 t lOl l . [ w e s l l l ) ( ' ases :

(a) No subscripted va.riatfle contaiaing a variable subscript itMex occurs as aH argumenl of xP, 1. (:olnpi/e

{{ t h e 8e(tl leIlc(} [;

sxl~ 7)0,4

' rsx ~I,,:~, t

........................ ;L"'

I,xK) 7)0,-t.

(b) Some sui)scripted x.ariable oeetlrrillg its ttl/ ai'gu- meltt of ,It, t contains a, variable stl])se'ipt index. If .,[,a

i! , . . , ,,p,./k ()Oll tpFise S/l( h tI s ( t t | l ( ! l l compile the sequence

PXI) @i g

A ItS 18

A H ) * - - ' 2

S T . . + j~

PXD ~[Q a'k,

At.~S 18

ADD * - - '2

ST, c¢ + j:,,

SXD 7)(), ~

TSX ~ .z ,~

• 7)0+[.

The symbolic address c~ denotes the relative location 0f the TSX instruction within the body of' the program, and each entry in the sequence between the TSX a.nd LXD in-

Page 11: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

iii~ii!; structions is of the form ~ , if m ~ any j,, , or R,a,,~ if obherwise. We recall, in passit,g, that in this cont.ext ~4J" iS our symbol for the composite symbolie-rela.tive ~ddress arid tag of ;F ~'~ if the latter denotes a subscripted w~riable. Tha t is, if p~s" is the algebraic (signed) relative address mid r~ ~'~ the tag associated with t, his variable, then

~,;'~ = ,It/'~ + p,/" r]'* , .

Thus, the effect of the sequence PXD, At{S, aO[), STA is to compute the effective address of tiffs variable and store same as an actual address it, the calling sequence, which address is then available to the subroutine de- fitting the value of ,It** via the calling index register 4. r a e [h , symbolic address • denotes the contents of the 704 program counter at the time an instruction having this symbolic address is interpreted by the machine.

(ix') ,P? is the name of a built-in open subroutine. Owing to the fact that compilation of an open subroutine into the main body of an object program is an essentially ad hoe procedure--depending, as it, does, on the particu- lar nature of the function in question, the number of its arguments, and upon the particular context within which the function arises--and since, further, actual compilation of open subroutines is deferred to a section of the executive system la, ter than that with which the present paper is eormerned, we shall emit a detailed de- scription of this subject for the present.

( E ) ~ ? = **. (it S,: defines an open subroutine, via., ~(' is the name

of an integer constant less than 7. The same remarks apply here as for (D) (iv) (q.v.).

(if) S~ defines a closed subroutine. (at S~ is ate-linked. Compile LDQ ~.2,~, then proceed

to (C). - - 1 (b) S~ is not At-linked. Compile cb.,~ ~ i .

1) S¢ is ~aq-linked. Proceed to (C). 2) S~ is not Mq-linked. Proceed to (at. (e) Compile SXD 7)0,4; then proceed to (d). (d) Three distinct built-in tape library subroutines

compute the exponential according as the exponent is integer or floating valued, or the base is fixed valued.

1 2 g, ~ are both integer wtlued. Compile TSX

? is floating valued, T~ integer valued. Compile (2,4. 1 ~tlf2 ~, ~ are both floating valued. Compile TSX

~ integer valued, 'Icz ~ floating valued. Disallowed. nally, compile LXD 7)0,4 and proceed to (m).

~dra-Segment Compilation

>i ~ = + . T w o e a s e s : is in floating-point mode. Compile FAn q,j. is in integer mode. Compile ADD ~¢~.

>i / -~- --. TWO eases: is in floating-point mode. Compile FSB ~ / . - j

(if) S, is in integer mode. Compile SVB 'I'~. (C) ~ i ~ = *. Two cases:

(it Predecessor in MO. I.e., Q/ -* = / or ~ ? = * and j = 2. Two subeases:

a) S.i floating-point. Compile FMP ~ / b) ~5g,: integer. Com!oile ~pY ~,/, ALS 17. (if) Predecessor in ace. I.e., ~ i >* = • and j # 2

Compile STO 7)0, LDQ 7)0, and proceed to (i) (at or (it (b) above, depending upon mode of S.i.

(D) ~ / = /. Two cases: (it Predecessor in MQ. I.E. ~,/-~ = / . Two subcases (at S~ floating-point. Compile STQ 7)0, CLa 7)0:

F D P ~gi j. (b) S~ integer. Compile Dye ~J, eLM, LLS 18. (it) Predecessor in ACe. I.e. ~ j - t = , . Two subcases (at Si floating-point. Compile FDP ~J . (b) S~ integer. Compile Lt~S 35 and proceed to (it (b)

above.

(m) Final Compilation of a Segment

(At Last segment compiled was So. (it ~@) is an iF-type production, i.e., is the productior

of an expression ~I, contained in a Foln'm~x seurce hm- guage statement of the form

IF@)n, , n2 , na

where the n~'s are source program statement names. (at So is type Ac. Finis. (b) So is_' type MQ. Compile LLS 37 and finis. (it) i i @ ) is a CALL-type production, i.e., is the produc-

tion of an expression ¢ contained in a ti'm¢T~tAn sourc( hmguage stagement of the form cAr, t, ~, where ,I, is ar Fs-function. Finis.

(iii) ~ is neither an IF- nor CiH~-type production Then the source language statement containing • is el the form 'I* = ~I, -~, where T is a w~riable or FIv-functior name. Consider the cases:

(at • is integer-valued. (1) So is in floating mode and is type Ac. Compile th{

(fixing) sequence UFA 6), LRS 0, AiR 6) q- 1, LLS 0, ALS 18 We note, here, that two constants having SaP identifl. cations 6), 6)+1, - . . are compiled into the object pro. grant constant-region. These constants are, in 704 octal word-format, 233000000000 and 000000077777, respec. tively. Thus, the above sequence has the effect of placin~ ~he point of the floating-point number, whose intege] form is desired, to the extreme right of the accumulator preserving its sign in the MQ register, extracting th( mantissa (now positioned in the last 15 bits of the ac. cumulator), restoring the sign and shifting the mantiss~ into the decrement field.

(a) If • is an r~-funetion name, compile a'mt 1,4 Finis.

(5) If not, compile STO ~, and finis. (2) So is in floating mode and is type uq. Compih

STQ 7)0, eLA 7)0, then proceed as in (at (1), above. (3) So is integer valued. (at 'I~ is an FN-funetion name. a) So is type he. Compile TR.< 1,4. b) So is type ~Q. Compile STQ 7)0, eLa_ 7)0, "rna 1,4.

C o m m u n i c a t i o n s of the ACM 1 {.

Page 12: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

(¢3) ,P is a var iab le . a) So is t y p e AC. Compi le STO ~ . b) So is t y p e ~q . Compi le STQ ~. (b) qz is f loat ing valued. (1) So is f loat ing valued. Proceed as in (a) (3), above. (2) So is in teger valued. (a) So is t y p e he. Compi le LRS 18, Om~ 6), FAD 6). a) qP is an ~'N-funetion name . Compi le T:m~ 1,4. b) Otherwise , compile STO '~. (B) So is t y p e MQ. Compi le swQ 7)0, CL:~ 7)0, and pro-

ceed as in (2) (a) above. B) Las t s egmen t (St) compiled was not S0.

(i) S~ l inkable arid no t a common subexpressio~>

Proceed to compi la t ion of S~_~.

(it) S,: no t l inkable or is a common subexpressiol~. (a) S.z is t y p e ac . Compi le STO I ) r + ~ i , where T is

the t ype n u m b e r associa ted with q,, and equals 0 if ~, is

not an Fs-funet ion name; otherwise, T is the t y p e of the

funct ion cur ren t ly being defined (see section 9). The relat ive address ~z is defined as in section 10 (iii). Pro-

ceed to compi la t ion of S~ ~. (b) S.~ is type M(~. Compi le STQ l ) r + ~.~, and proceed

to compi la t ion of S i - - 1 •

APPENDIX

A1. I m p l i e i t M u l t l p l i e a t l o n

A certain conciseness of no ta t ion in the wri t ing of expressions is a l lowed of b y the fac t than an • sign need not occur in an expression 4, if q~ is no t of the fo rm ,I, • X, where T~(,P), H~(X) belong to ~ . Thus, if • ~ ,It (read ,%5 equ iva len t to ' I ,") is t aken to mean t h a t the eorre- sponding a r i t hme t i c expressions -- q5 ~, = ,I¢ ~ yield identic,,fl object (machine- language) programs, then

(-A)t~(~) ~ (--A),B(i)

X01.(I)B ~-~ X01(I)*B

s INF(x)cos~(x) ~ S~Nr(X)*COS~(X)

A(B + c) ~ ~, (B + e)

(A/B)~o~r (x) ~ (*/B)*~O~r(X)

(:~ + ~0(x + Y) ~ ' (~ + B) , (x + v)

~r~Nr(x) (A -- B) ~ ~.~N,~(X)*(~ - ~)

A2. B o o l e a n S t a t e m e n t s

An i m m e d i a t e extension of the mechan i sms of ar i th- met ic s t a t e m e n t compila t ion, exploit ing the AND-OR logic of the 704, is readi ly a t hand. I f tile opera t ion signs + , . , and - are in te rpre ted to denote urfion, intersect ion and complemen ta t i on , respect ively, then a cer tain subset of the set of expressions defined b y E l - E 6 (see section 3) is sufficient for the fo rmula t ion of any Boolean funct ion on the set of all 36-bit b ina ry strings. We shall call the e lements of this subset sentences and reeurs ively charac- terize t h e m as follows:

S1. E v e r y f loat ing-point var iab le name q5 is a sentenee. If, f u r the rmore , • = ~ , • " ~ , and L@) < 4 or,f,, ¢ F, and ,I, appea r s in a. D~MENS~ON s ta tement , and if Z~, . . • , ~ (1 _<_ /c N :3) are subscripts , then ~(Z~, . . - , Z~) is a sentence.

$2. If ,I~ is a sentence, so is @). $3. If • is a sentence such t h a t H ~ @ ) ¢ - , and ¢, is

not of the fo rm • + X, where % X are sentences, t hen - ~ is a sentence.

$4. If ~I, is a sentence of the fo rm ,i, + X, where ~ , X are sentences, then - @ ) is a sentence.

$5. If • is an n-adie funct ion n a m e with H~(~I,) ¢ x ,

and A~, . . . , An are sentences, then ,I,(A~, . . • , A,,) is a sentence.

$6. I f ~, '.P are sentences, and Hi@) ~ - , H:('.P) ¢ - , then ,I, + % ¢ , , ,I, are sentences.

(Note : T h e same rule regarding implici t inultiplic~t- t ion of exp re s s ions - -men t ioned in A l a b o v e - - a p p l i e s as } well to the cons t ruc t ion of sentences.)

Rules S 1 $6 prohibi t , by implicat ion, the writ ing of expressions of the form + - ~ , where q~, q¢ are sen ten(es Thus , w h a t in convent iona l logical no ta t ion is wr i t t en

Nxp ' or ~A ~-~ ,It cannot be abb rev i a t ed to ,1~ - ~ , b u t m u s t be rendered by ~ * ( - ~ ) .

We define a Boolean expression as a s tr ing of the f o r m

where • is a sentence. Simihtrly, a pure Boolean statement is a s tr ing of t h e

fo rm

where ,Is is a subscr ip ted or nonsubscr ip ted f l oa t i ng -po in t variable, and = • ~ is a Boolean expression.

A quasi-Boolean statement is a s tr ing of one of t h e following forms

(a) IF @), where • is a sentence. (b) CALT~ ,I,(A~, - - - , A,~), where • is a funct ion narc le

such t h a t (i) i f ~ = ~ . . . ~ , ~ , t h e n L @ ) < 4or,p,~ ~ F,~I~ d o e s

not appea r in a I)IMENSION s ta t emen t , and L@) ~ G. (it) each A~ is a sentence or a I to l ler i th field. A (Boolean) f l tnction definition is a s tr ing of the f o r m

ii ,~(A, , . . . , An) = E[A, , . . . , A,~]

where • is a funct ion name, and such t h a t (i) if cp = ¢t " " , ~ , then H1(~)¢ x, 4 =< L(~) _~ 7

and ~ = F. (it) each A~ is a nonsubscr ip ted t loat ing-point v a r i a b l e

name.

(iii) E[A1 , . . . . , .4,,] is a sentence in the (free) v a r i a b l e s : A~, . - • , A , , wherein each A~ m a y occur vacuously .

E x a c t l y the same reduct ion, level analysis and gene ra~ op t imiza t ion procedures are appl ied to Boolean as t o

20 Communicat ions of the ACM

Page 13: The Arithmetic T anslator Compiler of the IBM · translator in eonverting FORTRAN formulas into 704 as- semMy code. The steps are described in about the order in which they are actually

6-

arithmetic expressions. In addition, the special optimiza- tio3~ procedures (section 8) apply with one minor modifi- (:ad(m. Since each segment S; is type ac (the / sign ean- t~ot occttr in a, sent;ence), then linkage can only occur, if ever, through the machine's accumulator regisger. f{ellce, if 4~ i--I 1 = * , and for some j, %+-i'; = i, interchat~ge the first and j t h elelnents of S,_~, tagging S; as ~c- lin#able and S~--t as ~c-linlced.

Compih+tion of Boolean statements proceeds in a man- tier ana logo t l s to that of arithmetic statements except for the following operation code transformations:

CLA --+ CAL

LDQ --} CAL

STO - + SLW

CLS --+ CA, L~ COM

FMP --~ ANA

FAD --+ ORA

CHS --> COM

STQ ~ SLW

[~XAMI[ I !;.

. , v ( x , Y) = ( - x ) + v

EQU[VF(X, Y) = [MPF(X, Y)[MPF(Y, X)

z = (((**~)c) + D) + Im, K((**t,)c) + D),

r:qt~wF(- ( - x ) , y))

We shall e~ssume tha t the above statements appear in a possiMy more extensive program, and that each is tagged as a Boolean-type statement (the letter u in cohunn 1 of an IBM Fen'emiR, card).

Note that the IMPF function is type 1 and translates into the following instruction sequence:

CAb 4)1 cou o m t 4 )1 + 1

T R i 1, 4

The (free) variables x, Y are associated with 4)1, 4) 1 + 1, respectively, in this ease.

The I.:Qu,vF function is type 2, and translates into

ea~, 4)2 + 1 SLW 4) 1

eAL 4)2 SLW 4)1 + 1 SXO 7)1, 4 TSX IMP,~

Lxo 7)1,4 SLW 1)2 CAL 4)2 SLW 4)1 CAL 4 )2 + 1

SLW 4)1 + 1

SXD 7)1,4 TSX IMPs4

Lx D 7) 1,4 ANA 1)2 TRA 1,4

Finally, the third of the above statements translates into the sequence

CAL X

CO~/I

COM

SLW 4)2 CAL Y

SLW 4)2 + 1 TSX EQUIV,4

s l , w 1)1 + 1

CAB A

ANA B

ANA C

ORA D

s~w 1) + 2 sI~w 4)1 eAL 1) + 1 s ~ w 4)1 + i

TSX 1MP,4

o 3 ~ 1) + 2 SLW Z

C o m m u n i c a t i o n s o f t h e A C M 21