CH10.1

CSE4100

Chap 10: Optimization

Prof. Steven A. Demurjian
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Way, Unit 2155
Storrs, CT 06269-3155
[email protected]
http://www.engr.uconn.edu/~steve
(860) 486 - 4818

Material for course thanks to: Laurent Michel, Aggelos Kiayias, Robert LeBarre

CH10.2

Overview
  Motivation and Background
  Code Level Optimization
    Common Sub-expression elimination
    Copy Propagation
    Dead-code elimination
  Peephole optimization
    Load/Store elimination
    Unreachable code
    Flow of Control Optimization
    Algebraic simplification
    Strength Reduction
  Concluding Remarks/Looking Ahead

CH10.3

Motivation
  What we achieved
    We have working machine code
  What is missing
    Code generation does not see the “big” picture
    We can generate poor instruction sequences
  What we need
    A simple way to locally improve the code quality
  Goal:
    Transition from “Lousy” Intermediate Code to More Effective and Efficient Code
      Response Time, Performance (Algorithms), Memory Usage
    Measured in terms of Number of Variables Saved, Operands Saved, Memory Accesses, etc.

CH10.4

Where can Optimization Occur?
  [Diagram: Source Program -> Front End (LA, Parse, Int. Code) -> Int. Code -> Code Generator -> Target Program]
  Software Engineer can:
    Profile Program
    Change Algorithm Data
    Transform/Improve Loops
  Compiler Can:
    Improve Loops/Proc Calls
    Calculate Addresses
    Use Registers
    Select Instructions
    Perform Peephole Opt.
  All are Optimizations
    1st is User Controlled and Defined
    At Intermediate Code Level by Compiler
    At Assembly Level for Target Architecture (to take advantage of different machine features)

CH10.5

Code Level Optimization
  First Look at Optimization
    Section 9.4 in 1st Edition
    Introduce and Discuss Basic Blocks
  Requirements for Optimization
    Section 10.1 in 1st Edition
    Basic Blocks, Flow Graphs
  In-depth Examination of Optimization
    Section 10.2 in 1st Edition
    Function Preserving Transformations
    Loop Optimizations

CH10.6

First Look at Optimization
  Optimization Applied to 3 Address Coding (3AC) Version of Source Program - Examples:
    a + b[i] * c
      t1 = b[i]
      t2 = t1 * c
      t3 = a + t2

CH10.7

First Look at Optimization
  Once Code has been Generated in 3AC, an Algorithm can be Applied to:
    Identify each Basic Block, which Represents a set of Three Address Statements where Execution Enters at Top and Leaves at Bottom
      No Branches within Code
    Represent the Control Flow Dependencies Among and Between Basic Blocks
      Defines what is Termed a “Flow Graph”
  Let’s see an Example
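As a rough illustration of the block-identification step on this slide, the sketch below splits a list of three-address instructions into basic blocks using the usual "leader" rule: the first instruction, every label, and every instruction that follows a jump each start a new block. The tuple encoding of instructions, the opcode names, and the small `program` are assumptions made only for this example; they are not taken from the course material.

```python
def split_into_basic_blocks(code):
    """Split a list of 3AC tuples into basic blocks (lists of tuples)."""
    if not code:
        return []
    leaders = {0}                      # the first instruction starts a block
    for k, ins in enumerate(code):
        if ins[0] == "label":          # a label may be a jump target
            leaders.add(k)
        elif ins[0] in ("goto", "if_goto") and k + 1 < len(code):
            leaders.add(k + 1)         # instruction after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(code)]
    return [code[s:e] for s, e in zip(starts, ends)]


# Hypothetical program: i := 1; L1: t1 := 4 * i; i := i + 1; if i <= 20 goto L1; done := 1
program = [
    ("assign", "i", 1),
    ("label", "L1"),
    ("mul", "t1", 4, "i"),
    ("add", "i", "i", 1),
    ("if_goto", "i <= 20", "L1"),
    ("assign", "done", 1),
]
blocks = split_into_basic_blocks(program)
for number, block in enumerate(blocks, start=1):
    print("B%d" % number, block)
```

Treating every label as a leader is a simplification: a fuller version would only split at labels that some jump actually targets, and would also record the edges between blocks to form the flow graph.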

CH10.8

First Look at Optimization
  Steps 1 to 12 from two Slides Back Represented as:
  Optimization Works with Basic Blocks and Flow Graph to Perform Transformations that:
    Generate Equivalent Flow Graph w/Improved Perf.

CH10.9

First Look at Optimization
  Optimization will Perform Transformations on Basic Blocks/Flow Graph
  Resulting Graph(s) Passed Through to Final Code Generation to Obtain More Optimal Code
  Two Fold Goal of Optimization
    Reduce Time
    Reduce Space
  Optimization Used to Come at a Cost:
    In “Old Days” Turning on Optimizer Could Double the Compilation Time
    From 2 hours to 4 hours
  Is this an Issue Today?

CH10.10

First Look at Optimization
  Two Types of Transformations
    Structure Preserving
      Inherent Structure and Implicit Functionality of Basic Blocks is Unchanged
    Algebraic
      Elimination of Useless Expressions
        x = x + 0 or y = y * 1
      Replace Expensive Operators
        Change x = y ** 2 to x = y * y
        Why?
  We’ll Focus on Both …

CH10.11

Structure Preserving Transformations
  Common Sub-Expression Elimination
    How can Following Code be Improved?
      a = b + c
      b = a - d
      c = b + c
      d = a - d
    Improved: the last statement becomes
      d = b
    What Must We Make Sure Doesn’t Happen?
  Dead-Code Elimination
    If x is not Used in Block, Can it be Removed?
      x = y + z
    What are the Possible Ramifications if so?

CH10.12

Structure Preserving Transformations
  Renaming Temporary Variables
    Consider the code t = b + c
    Can be Changed to u = b + c
    May Reduce the Number of temporaries
    Make Change from all t’s to all u’s
  Interchange of Statements
    Consider:
      t1 = b + c
      t2 = x + y
    and Change to:
      t2 = x + y
      t1 = b + c
    This can Occur as Long as:
      x and y not t1
      b and c not t2
    What Do you have to Check?

CH10.13

Requirements for Optimization
  Identify Frequently Executed Portions of Code and Make them Perform Better
  Rule-of-Thumb - Most Programs spend 80% of their Time in 20% of Code – Is this True?
  We Focus on Loops since Every Gain in Space or Time is Multiplied by Loop Iterations
    Reduce Loop’s Code and Improve Performance
  What Other Programming Technique Should be a Major Concern for Optimization?

CH10.14

Requirements for Optimization
  Criteria for Transformations
    Preserve Meaning of Code
      Don’t Change Output, Introduce Errors, etc.
    Speed up Programs by Measurable Amount (on Average for Entire Code)
    Must be Worth the Effort
      Stick to Meaningful, Useful Transformations
  Provide Different Versions of Compiler
    Non-Optimizing
    Optimizing
    Extra Optimization on Demand

CH10.15

Requirements for Optimization
  Beware that Some Optimization Directives are Ignored!
    In C, Define variable as “register int I;”
    While a Feature of Language, cc States that these Instructions are Ignored and Compiler Controls Use of Registers

CH10.16

The Overall Optimization Process
  Advantages
    Intermediate Code has Explicit Operations and Their Identification Promotes Optimization
    Intermediate Code is Relatively Machine Independent
    Therefore, Optimization Doesn’t Impact Final Code Generation

CH10.17

Example Source Code

CH10.18

Generated Three Address Coding

CH10.19

Flow Graph of Basic Blocks

CH10.20

In-depth Examination of Optimization
  Code-Transformation Techniques:
    Local – within a “Basic Block”
    Global – between “Basic Blocks”
  Data Flow Dependencies Determined by Inspection
    what do i, a, and v refer to?
    Dependent in Another Basic Block
    Scoping is Very Critical

CH10.21

In-depth Examination of Optimization
  Function Preserving Transformations
    Common Subexpressions
    Copy Propagation
    Dead Code Elimination
  Loop Optimizations
    Code Motion
    Induction Variables
    Strength Reduction

CH10.22

Common Sub-Expressions
  E is a Common Sub-Expression if
    E was Previously Computed
    Value of E Unchanged since Previous Computation
  What Can be Saved in B5?
    t6 and t7 same computation
    t8 and t10 same computation
    Save:
      Remove 2 temp variables
      Remove 2 multiplications
      Remove 4 variable accesses
      Remove 2 assignments
  B5 before:
      t6 := 4 * i
      x := a[t6]
      t7 := 4 * i
      t8 := 4 * j
      t9 := a[t8]
      a[t7] := t9
      t10 := 4 * j
      a[t10] := x
      goto B2
  B5 after:
      t6 := 4 * i
      x := a[t6]
      t8 := 4 * j
      t9 := a[t8]
      a[t6] := t9
      a[t8] := x
      goto B2
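The B5 rewrite on this slide can be reproduced with a small local pass. The sketch below is one possible way to do it, not the course's implementation: it remembers which temporary already holds each `4 * i`-style product and redirects later uses to it. The tuple encoding (`mul`, `load`, `store`, `goto`) is an assumption for this example, and the pass assumes every eliminated destination is a block-local temporary that nothing outside the block reads.

```python
def local_cse(block):
    """Drop repeated multiplications inside one basic block."""
    available = {}   # (op, arg1, arg2) -> temporary that already holds the value
    alias = {}       # eliminated temporary -> temporary that replaces it
    out = []

    def sub(v):
        return alias.get(v, v)

    def kill(name):
        # A new definition of `name` invalidates every fact that mentions it.
        alias.pop(name, None)
        for k in [k for k, v in alias.items() if v == name]:
            del alias[k]
        for k in [k for k, v in available.items() if name in k[1:] or v == name]:
            del available[k]

    for ins in block:
        op = ins[0]
        if op == "mul":                       # dst := a * b
            _, dst, a, b = ins
            key = (op, sub(a), sub(b))
            hit = available.get(key)
            kill(dst)
            if hit is not None and hit != dst:
                alias[dst] = hit              # reuse the earlier result
                continue
            available[key] = dst
            out.append((op, dst, key[1], key[2]))
        elif op == "load":                    # dst := arr[idx]
            _, dst, arr, idx = ins
            idx = sub(idx)
            kill(dst)
            out.append((op, dst, arr, idx))
        elif op == "store":                   # arr[idx] := src
            _, arr, idx, src = ins
            out.append((op, arr, sub(idx), sub(src)))
        else:
            out.append(ins)
    return out


b5 = [
    ("mul", "t6", 4, "i"), ("load", "x", "a", "t6"),
    ("mul", "t7", 4, "i"), ("mul", "t8", 4, "j"),
    ("load", "t9", "a", "t8"), ("store", "a", "t7", "t9"),
    ("mul", "t10", 4, "j"), ("store", "a", "t10", "x"),
    ("goto", "B2"),
]
b5_after = local_cse(b5)
for ins in b5_after:
    print(ins)
```

Array loads are deliberately not treated as reusable here, since a store to `a` in between could change the loaded value.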

CH10.23

Common Sub-Expressions
  What about B6?
    t11 and t12
    t13 and t15
  Similar Savings as in B5
  B6 before:
      t11 := 4 * i
      x := a[t11]
      t12 := 4 * i
      t13 := 4 * n
      t14 := a[t13]
      a[t12] := t14
      t15 := 4 * n
      a[t15] := x
  B6 after:
      t11 := 4 * i
      x := a[t11]
      t13 := 4 * n
      t14 := a[t13]
      a[t11] := t14
      a[t13] := x

CH10.24

Common Sub-Expressions
  What else Can be Accomplished?
  Where is Variable j Determined?
    In B3 – and when drop through B3 to B4 and into B5, no change occurs to j!
  What Does B5 Become?
  Are we done? No t9 same as t5!
  Again savings in access, variables, operations, etc.
  B3:
      j := j - 1
      t4 := 4 * j
      t5 := a[t4]
      if t5>4 goto B3
  B4
  B5 so far:
      t6 := 4 * i
      x := a[t6]
      t8 := 4 * j
      t9 := a[t8]
      a[t6] := t9
      a[t8] := x
      goto B2
  B5 with t8 replaced by t4:
      t6 := 4 * i
      x := a[t6]
      t9 := a[t4]
      a[t6] := t9
      a[t4] := x
      goto B2
  B5 with t9 replaced by t5:
      t6 := 4 * i
      x := a[t6]
      a[t6] := t5
      a[t4] := x
      goto B2

CH10.25

Common Sub-Expressions
  Are we done yet?
    Where is “i” defined?
    Any Values we can Leverage?
  Yes!
    t2 = 4*i Defined in B2 and is unchanged as it arrives at B5
    t3 = a[t2] in B3 and B2 and also unchanged as it arrives
  Result Saves:
    From 9 statements down to 4
    4 Multiplications are Gone
    4 addr/array offsets are only 2
  B5 before:
      t6 := 4 * i
      x := a[t6]
      a[t6] := t5
      a[t4] := x
      goto B2
  B5 result:
      x := t3
      a[t2] := t5
      a[t4] := x
      goto B2

CH10.26

Common Sub-Expressions
  B6 is Similarly Changed ….
  B6 before:
      t11 := 4 * i
      x := a[t11]
      t13 := 4 * n
      t14 := a[t13]
      a[t11] := t14
      a[t13] := x
  B6 result:
      x := t3
      t14 := a[t1]
      a[t2] := t14
      a[t1] := x

CH10.27

Resulting Flow Diagram

CH10.28

Copy Propagation
  Introduce a Common Copy Statement to Replace an Arithmetic Calculation with Assignment
  Regardless of the Path Chosen, the use of an Assignment Saves Time and Space
  Before (two predecessor blocks, then their common successor):
      a := d + e
      b := d + e
      c := d + e
  After:
      t := d + e
      a := t
      t := d + e
      b := t
      c := t

CH10.29

Copy Propagation
  In our Example for B5 and B6 Below:
  Since x is t3, we can replace the use of x on right hand side as below:
  We’ll come back to this shortly!
  B5 before:
      x := t3
      a[t2] := t5
      a[t4] := x
      goto B2
  B6 before:
      x := t3
      t14 := a[t1]
      a[t2] := t14
      a[t1] := x
  B5 after:
      x := t3
      a[t2] := t5
      a[t4] := t3
      goto B2
  B6 after:
      x := t3
      t14 := a[t1]
      a[t2] := t14
      a[t1] := t3
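A minimal sketch of the substitution on this slide: after a copy `x := t3`, later uses of `x` in the same block are rewritten to `t3` until either name is redefined. The tuple encoding (`copy`, `load`, `store`, `goto`) is an assumption made for this example and only covers the statement forms that appear in B5 and B6.

```python
def copy_propagate(block):
    """Replace uses of x by y after a copy x := y, within one basic block."""
    copies = {}   # x -> y for every copy statement that is still valid
    out = []

    def sub(v):
        return copies.get(v, v)

    def kill(name):
        # Redefining `name` invalidates copies to it and copies from it.
        copies.pop(name, None)
        for k in [k for k, v in copies.items() if v == name]:
            del copies[k]

    for ins in block:
        op = ins[0]
        if op == "copy":                      # dst := src
            _, dst, src = ins
            src = sub(src)
            kill(dst)
            if src != dst:
                copies[dst] = src
            out.append((op, dst, src))
        elif op == "load":                    # dst := arr[idx]
            _, dst, arr, idx = ins
            idx = sub(idx)
            kill(dst)
            out.append((op, dst, arr, idx))
        elif op == "store":                   # arr[idx] := src
            _, arr, idx, src = ins
            out.append((op, arr, sub(idx), sub(src)))
        else:
            out.append(ins)
    return out


b5 = [("copy", "x", "t3"), ("store", "a", "t2", "t5"),
      ("store", "a", "t4", "x"), ("goto", "B2")]
b6 = [("copy", "x", "t3"), ("load", "t14", "a", "t1"),
      ("store", "a", "t2", "t14"), ("store", "a", "t1", "x")]
b5_after = copy_propagate(b5)
b6_after = copy_propagate(b6)
print(b5_after)
print(b6_after)
```

The copy statement itself is left in place; whether it can be deleted is a separate liveness question.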

CH10.30

Dead Code Elimination
  Variable is “Dead” if its Value will never be Utilized Again Subsequently
  Otherwise, Variable is “Live”
  What’s True about B5 and B6?
  Can Any Statements be Eliminated? Which Ones? Why?
  B5 and B6 are Now Optimized with
    B5 has 9 Statements Reduced to 3
    B6 has 8 Statements Reduced to 3
  B5:
      x := t3
      a[t2] := t5
      a[t4] := t3
      goto B2
  B6:
      x := t3
      t14 := a[t1]
      a[t2] := t14
      a[t1] := t3
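One way to answer "which statements can be eliminated" is a backward liveness scan over the block. The sketch below is an illustration under stated assumptions: the tuple encoding is invented for this example, array stores are always kept because they change memory, and `x` is assumed not to be live on exit from B5 or B6 (which is what the slide's reduction to 3 statements implies).

```python
def eliminate_dead_code(block, live_out):
    """Remove assignments whose target is never used afterwards."""
    live = set(live_out)          # variables needed after the current point
    kept = []
    for ins in reversed(block):
        op = ins[0]
        if op in ("copy", "load"):            # dst := ...
            dst = ins[1]
            if dst not in live:
                continue                      # value is never read: drop it
            live.discard(dst)
            live.update(v for v in ins[2:] if isinstance(v, str))
        elif op == "store":                   # arr[idx] := src, always kept
            live.update(v for v in ins[1:] if isinstance(v, str))
        kept.append(ins)
    kept.reverse()
    return kept


b5 = [("copy", "x", "t3"), ("store", "a", "t2", "t5"),
      ("store", "a", "t4", "t3"), ("goto", "B2")]
b6 = [("copy", "x", "t3"), ("load", "t14", "a", "t1"),
      ("store", "a", "t2", "t14"), ("store", "a", "t1", "t3")]

# Assumption: no variable defined in these blocks is read after them.
b5_after = eliminate_dead_code(b5, live_out=set())
b6_after = eliminate_dead_code(b6, live_out=set())
print(b5_after)
print(b6_after)
```

If `x` were live on exit, passing `live_out={"x"}` would keep the copy, which is why the liveness information has to come from the whole flow graph and not from the block alone.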

CH10.31

Loop Optimizations
  Three Types: Code Motion, Induction Variables, and Strength Reduction
  Code Motion
    Remove Invariant Operations from Loop
      while (limit * 2 > i) do
    Replaced by:
      t = limit * 2
      while (t > i) do
  Induction Variables
    Identify Which Variables are Interdependent or in Step
      j = j - 1
      t4 = 4 * j
    Replaced by below with an initialization of t4
      t4 = t4 - 4
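To see why the induction-variable rewrite is safe, the two loop bodies can be run side by side. This is only a sketch: the function names and the starting value of `j` are made up for the illustration, and the second version places the initialization `t4 = 4 * j` before the loop as the slide describes.

```python
def offsets_by_multiplication(j, steps):
    """Original loop body: j = j - 1; t4 = 4 * j."""
    seen = []
    for _ in range(steps):
        j = j - 1
        t4 = 4 * j
        seen.append(t4)
    return seen


def offsets_by_induction(j, steps):
    """Rewritten loop body: t4 = t4 - 4, with t4 initialized before the loop."""
    t4 = 4 * j                  # one multiplication, outside the loop
    seen = []
    for _ in range(steps):
        t4 = t4 - 4             # stays in step with j decreasing by 1
        seen.append(t4)
    return seen


print(offsets_by_multiplication(10, 5))
print(offsets_by_induction(10, 5))
```

Because `t4` always equals `4 * j`, each decrement of `j` by 1 corresponds to a decrement of `t4` by 4, so the multiplication inside the loop becomes a subtraction.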

CH10.32

Loop Optimizations
  Strength Reduction
    Replace an Expensive Operation (Such as Multiply) with a Cheaper Operation (Such as Add)
    In B4, i and j can be replaced with t2 and t4
    This Eliminates the need for Variables i and j

CH10.33

Final Optimized Flow Graph – Done?

CH10.34

Turn to Prof. Michel’s Slides …
  Motivation
    Rewrite the basic block to eliminate sub-expressions
  Technique
    Change the representation
    Move to a tree!

CH10.35

Example
  L1: t1 := 4 * i;
      t2 := a[t1];
      t3 := 4 * i;
      t4 := b[t3];
      t5 := t2 * t4;
      t6 := prod + t5;
      prod := t6;
      t7 := i + 1;
      i := t7;
      if i <= 20 then goto L1

CH10.45

Example
  What we have
    Common sub-expressions are known
    Used variables are known (leaves)
    Live on exit are known
  L1: t1 := 4 * i;
      t2 := a[t1];
      t3 := 4 * i;
      t4 := b[t3];
      t5 := t2 * t4;
      t6 := prod + t5;
      prod := t6;
      t7 := i + 1;
      i := t7;
      if i <= 20 then goto L1

CH10.46

Peephole Optimization
  Simple Idea
    Slide a window over the code
    Optimize code in the window only.
  Optimizations are
    Local [still no big picture]
    Semantic preserving
    Cheap to implement
  Usually
    One can repeat the peephole several times!
    Each pass can create new opportunities for more

CH10.47

Peephole Optimizer
  block_3:
      mov [esp-4],ebp
      mov ebp,esp
      mov [ebp-8],esp
      sub esp,28
      mov eax,[ebp+8]
      cmp eax,0
      mov eax,0
      sete ah
      cmp eax,0
      jz block_5
  block_4:
      mov eax,1
      jmp block_6
  block_5:
      mov eax,[ebp+8]
      sub eax,1
      push eax
      mov eax,[ebp+4]
      push eax
      mov eax,[eax]
      mov eax,[eax]
      call eax
      add esp,8
      mov ebx,[ebp+8]
      imul ebx,eax
      mov eax,ebx
  block_6:
      mov esp,[ebp-8]
      mov ebp,[ebp-4]
      ret

CH10.48

Peephole Optimizations
  A Few Simple Techniques [in a nutshell]
    Load/Store elimination
      Get rid of redundant operations
    Unreachable code
      Get rid of code guaranteed to never execute
    Flow of Control Optimization
      Simplify jump sequences.
    Algebraic simplification
      Use rules of algebra to rewrite some basic operation
    Strength Reduction
      Replace expensive instructions by equivalent ones (yet cheaper)
    Machine Idioms
      Replace expensive instructions by equivalent ones (for a given machine)

CH10.49

Load / Store Sequences
  Imagine the following sequence
      mov a,eax
      mov eax,a
    “a” is a label for a memory location
      e.g. a variable in memory or on the stack
      If “a” is on the stack, it would look like ebp(k) [k == constant]
  What is guaranteed to be true after the first instruction ?
  Corollary....
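After `mov a,eax` the register and the memory location hold the same value, so an immediately following `mov eax,a` does no work. A two-instruction peephole window is enough to catch this. The sketch below is an illustration only: the tuple encoding `("mov", dst, src)` and the `label` pseudo-instruction are assumptions for the example, and the rule is applied only when the two instructions are adjacent, so a label between them (a possible jump target) blocks the rewrite.

```python
def drop_redundant_loads(code):
    """Remove `mov r,a` when it directly follows `mov a,r`."""
    out = []
    for ins in code:
        if out and ins[0] == "mov" and out[-1][0] == "mov":
            _, dst, src = ins
            _, prev_dst, prev_src = out[-1]
            if dst == prev_src and src == prev_dst:
                continue        # the register already holds this value
        out.append(ins)
    return out


window = [("mov", "a", "eax"), ("mov", "eax", "a")]
print(drop_redundant_loads(window))

# With a label in between, control may arrive from elsewhere: keep both.
guarded = [("mov", "a", "eax"), ("label", "L1"), ("mov", "eax", "a")]
print(drop_redundant_loads(guarded))
```

A real optimizer would also have to leave the pair alone when `a` refers to memory that can change on its own, such as a memory-mapped device register.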

CH10.50

Unreachable Code
  What is it?
    A situation that arises because...
      Conditional compilation
      Previous optimizations “created/exposed” dead code
  Example
      #define debug 0
      ....
      if (debug) {
          printf("This is a trace message\n");
      }
      ....

CH10.51

Example
  The Generated code looks like....
      ....
      if (debug == 0) goto L2
      printf("This is a trace message\n");
      L2: ....
  If we know that...
    debug == 0
  Then
      ....
      if (0 == 0) goto L2
      printf("This is a trace message\n");
      L2: ....

CH10.52

Example
  Final transformation
      ....
      goto L2
      printf("This is a trace message\n");
      L2: ....
  Given this code
    There is no way to branch “into” the blue block (the printf call)
    The last instruction (goto L2) jumps over the blue block
    The blue block is never used. Get rid of it!

CH10.53

Unreachable Code Example
  Bottom Line
      ....
      goto L2
      L2: ....
  Now L2 is instruction after goto...
    So get rid of goto altogether!
      ....
      L2: ....
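The last two steps of this example (drop the code that follows an unconditional goto, then drop a goto whose target is the very next instruction) can be sketched as a single pass. The tuple encoding and the `stmt` and `call` opcodes are assumptions made for this illustration; the pass relies on labels being the only way control can enter the instruction stream.

```python
def remove_unreachable(code):
    """Drop code after an unconditional goto, and gotos to the next label."""
    out = []
    skipping = False
    for ins in code:
        if ins[0] == "label":
            skipping = False                    # a label can be reached again
            if out and out[-1] == ("goto", ins[1]):
                out.pop()                       # jump to the next instruction
        if skipping:
            continue                            # nothing can branch in here
        out.append(ins)
        if ins[0] == "goto":
            skipping = True
    return out


code = [
    ("stmt", "...."),
    ("goto", "L2"),
    ("call", "printf", "This is a trace message\n"),
    ("label", "L2"),
    ("stmt", "...."),
]
cleaned = remove_unreachable(code)
for ins in cleaned:
    print(ins)
```

The label itself is kept, since other jumps elsewhere in the program may still target it.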

CH10.54

Flow of Control Optimization
  Situation
    We can have chains of jumps
    Direct to conditional or vice-versa
  Objective
    Avoid extra jumps.
  Why? [a.k.a. motivation....]
  Example
      if (x relop y) goto L2
      ....
      L2: goto L4
      L3: ....
      L4:
          L4_BLOCK

CH10.55

Flow of Control
  What can be done
    Collapse the chain
      if (x relop y) goto L4
      ....
      L2: goto L4
      L3: ....
      L4:
          L4_BLOCK
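Collapsing the chain amounts to finding labels whose first instruction is an unconditional goto and retargeting every jump to such a label. The sketch below shows one way to do that; the tuple encoding (`if_goto`, `goto`, `label`, `stmt`) is an assumption for this example, and a visited set guards against a label that jumps to itself.

```python
def collapse_jump_chains(code):
    """Retarget jumps that land on a label followed directly by a goto."""
    forward = {}                      # label -> target of the goto right after it
    for cur, nxt in zip(code, code[1:]):
        if cur[0] == "label" and nxt[0] == "goto":
            forward[cur[1]] = nxt[1]

    def final_target(label):
        seen = set()
        while label in forward and label not in seen:
            seen.add(label)           # stop on cycles such as L1: goto L1
            label = forward[label]
        return label

    out = []
    for ins in code:
        if ins[0] == "goto":
            out.append(("goto", final_target(ins[1])))
        elif ins[0] == "if_goto":
            out.append(("if_goto", ins[1], final_target(ins[2])))
        else:
            out.append(ins)
    return out


code = [
    ("if_goto", "x relop y", "L2"),
    ("stmt", "...."),
    ("label", "L2"),
    ("goto", "L4"),
    ("label", "L3"),
    ("stmt", "...."),
    ("label", "L4"),
    ("stmt", "L4_BLOCK"),
]
collapsed = collapse_jump_chains(code)
for ins in collapsed:
    print(ins)
```

As on the slide, `L2: goto L4` is left in place; if nothing else jumps to L2 and nothing falls through into it, a later unreachable-code pass can remove it.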

CH10.56

Algebraic Simplification
  Simple Idea
    Use algebraic rules to rewrite some code
  Examples
    x := y + 0    becomes    x := y
    x := y * 1    becomes    x := y
    x := y * 0    becomes    x := 0
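The three rewrites on this slide fit in a small rule table applied to one instruction at a time. This is a sketch under assumptions: instructions are encoded as `(op, dst, arg1, arg2)` tuples invented for this example, and operands are integers, since `y * 0 = 0` does not hold for every floating-point value (NaN, infinities).

```python
def simplify(ins):
    """Apply x+0, x*1 and x*0 identities to a single 3AC instruction."""
    op, dst, a, b = ins
    if op == "add":
        if b == 0:
            return ("copy", dst, a)       # x := y + 0  ->  x := y
        if a == 0:
            return ("copy", dst, b)       # x := 0 + y  ->  x := y
    if op == "mul":
        if a == 0 or b == 0:
            return ("copy", dst, 0)       # x := y * 0  ->  x := 0
        if b == 1:
            return ("copy", dst, a)       # x := y * 1  ->  x := y
        if a == 1:
            return ("copy", dst, b)       # x := 1 * y  ->  x := y
    return ins                            # no rule applies


print(simplify(("add", "x", "y", 0)))
print(simplify(("mul", "x", "y", 1)))
print(simplify(("mul", "x", "y", 0)))
```

The resulting copy statements are then candidates for copy propagation and dead-code elimination, which is one reason peephole passes are usually repeated.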

CH10.57

Strength Reduction
  Idea
    Replace expensive operations by semantically equivalent cheaper ones.
  Examples
    Multiplication by 2 is equivalent to a left shift
    Left shift is much faster
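As a small illustration of the multiply-to-shift example, the sketch below rewrites a multiplication by a constant power of two into a left shift. The tuple encoding and the `shl` opcode name are assumptions for this example, and the equivalence is shown for integers only.

```python
def reduce_strength(ins):
    """Rewrite dst := a * 2**k as dst := a << k; leave other code alone."""
    op, dst, a, b = ins
    if op == "mul":
        if isinstance(a, int) and not isinstance(b, int):
            a, b = b, a                    # put the constant on the right
        if isinstance(b, int) and b > 1 and b & (b - 1) == 0:
            return ("shl", dst, a, b.bit_length() - 1)
    return ins


print(reduce_strength(("mul", "t1", 4, "i")))   # 4 = 2**2, shift by 2
print(reduce_strength(("mul", "t", "y", 2)))    # shift by 1
print(reduce_strength(("mul", "t", "y", 3)))    # not a power of two: unchanged

# The two operations agree on integers.
print(all(v * 2 == v << 1 for v in range(-8, 9)))
```

How much is gained depends on the target machine; the rewrite is only worthwhile where the shift really is cheaper than the multiply.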

CH10.58

Hardware Idiom
  Idea
    Replace expensive instructions by...
    Equivalent instructions that are optimized for the platform
  Example
    add eax,1    becomes    inc eax

CH10.59

Concluding Remarks/Looking Ahead
  Optimization Techniques/Concepts are Not Only Relevant to Programming Languages
  Database Systems do Optimization to Reduce Access to Secondary Storage
    Concern when Asking for too Much Data
    Joining Three or More Tables at Once
    Doing a Cartesian Product Instead of a Join
    Doing Selections before Joins
    Termed Query Optimization
  Looking Ahead
    Review Machine Code Generation (if time)
    Final Exam Review