
Post on 18-Dec-2015


Daddy! -- Where do instructions come from?

Program Sequencer controls program flow and provides the next instruction to be executed

Straight line code, jumps and loops


04/18/23 Program sequencer , Copyright M. Smith, ECE, University of Calgary, Canada


Tackled today

Program sequencer
Linear flow of instruction
Why not discuss idle instruction here?
Jumps
Software loops – normal and more efficient “down-counting” loops
Special Motorola MC68XXX software loop instructions
Loops – hardware loops
Subroutines – next lecture
Interrupts and Exceptions – next lecture
Idle – next lecture


Example code

Look at moving elements from array fooHere[ ] to farAway[ ] using various instruction modes

Straight line coding
In a loop – please make sure that you understand the terminology – exam question
    Software loop
    Hardware loop
In a subroutine
Via an interrupt


Linear program flow

Program flow on the chip is mainly linear

The processor fetches and executes program instructions sequentially

Non-sequential structures (instructions and supporting registers) direct the processor to execute an instruction that is not at the next sequential address


Array movement

.extern _fooHere, _farAway;          // extern long fooHere[5], farAway[5];

P0.H = _fooHere; P0.L = _fooHere;
P1.H = _farAway; P1.L = _farAway;

R0 = [P0]; [P1] = R0;                // farAway[0] = fooHere[0];
R0 = [P0 + ?]; [P1 + ?] = R0;        // farAway[1] = fooHere[1];
R0 = [P0 + ??]; [P1 + ??] = R0;      // farAway[2] = fooHere[2];
                                     // farAway[3] = fooHere[3];
                                     // farAway[4] = fooHere[4];

Question – What goes in the place of the ? and ?? when doing a loop, or when doing

[P1 + ?] = R0; W[P1 + ?] = R1; B[P1 + ?] = R2;

ANSWER: -- Find out the correct answer – and make sure you do it correctly all the time

ANSWER: -- Why worry? DO THE CODE a different way and don’t worry


Better solution – let the processor worry about getting the indexing correct!

.extern _fooHere; .extern _farAway;  // extern long fooHere[5], farAway[5];

P0.H = _fooHere; P0.L = _fooHere;
P1.H = _farAway; P1.L = _farAway;

R0 = [P0++]; [P1++] = R0;            // farAway[0] = fooHere[0];    was R0 = [P0]; [P1] = R0;
R0 = [P0++]; [P1++] = R0;            // farAway[1] = fooHere[1];    was R0 = [P0 + ?]; [P1 + ?] = R0;
R0 = [P0++]; [P1++] = R0;            // farAway[2] = fooHere[2];    was R0 = [P0 + ??]; [P1 + ??] = R0;
                                     // farAway[3] = fooHere[3];
                                     // farAway[4] = fooHere[4];

Remember -- P0 will end up pointing PAST the end of the array


The C++ code we actually developed

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;   (Actually pt0 = &fooHere[0];)
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;   (Actually pt1 = &farAway[0];)

R0 = [P0++]; [P1++] = R0;            // *pt1++ = *pt0++;    instead of farAway[0] = fooHere[0];
R0 = [P0++]; [P1++] = R0;            // *pt1++ = *pt0++;    instead of farAway[1] = fooHere[1];
R0 = [P0++]; [P1++] = R0;            // *pt1++ = *pt0++;    instead of farAway[2] = fooHere[2];
                                     // and so on for farAway[3] = fooHere[3];

Remember -- P0 will end up pointing PAST the end of the array


IDLE – Seems the next simplest!

The IDLE instruction is part of a sequence of instructions that places the processor in a quiescent state so that something can happen
    An external system can change clock frequencies – power saving – a high clock frequency can mean high power consumption

An SSYNC instruction MUST immediately follow the IDLE instruction

Getting out of the idle instruction sequence needs an understanding of interrupts
    Will discuss more about idle later

More info in instruction ref. manual p 11.3


Jump instruction

Both JUMP and CALL instructions transfer program flow to another memory location

The difference between JUMP and CALL is that the CALL automatically loads the return address into the RETS register. The return address is the next sequential address after the CALL instruction.

JUMPs can be conditional (depending on the CC bit in the ASTAT register).

Conditional JUMP instructions use static branch prediction to reduce the branch latency caused by the length of the Blackfin instruction pipeline.
    What does “static” branch prediction mean?
    What is “dynamic” branch prediction?

When possible the assembler will use the short relative jump. The target instruction must be within -4096 to +4094 bytes of the current instruction.


Array movement

.extern _fooHere, _farAway;          // extern long fooHere[5], farAway[5];

P0.H = _fooHere; P0.L = _fooHere;
P1.H = _farAway; P1.L = _farAway;

                                     // for (int num = 0; num < 5; num++) {
R0 = [P0]; [P1] = R0;                //     farAway[num] = fooHere[num];
R0 = [P0 + ?]; [P1 + ?] = R0;        // }
R0 = [P0 + ??]; [P1 + ??] = R0;
…… and so on ….

Linear code – Straight line coding is STILL a viable solution for solving a loop.

You don’t waste any time in incrementing a loop counter
You don’t waste time in checking a loop counter
You don’t waste time upsetting the processor instruction pipeline by jumping back and throwing away all prefetched instructions.


Standard software Loop
The C++ code we actually developed

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;

R1 = 0; R2 = 5;                      // int num = 0;
LOOP:
    CC = R2 <= R1;                   // for ( /* empty */; num < 5; num++) {
    IF CC JUMP LOOP_END;             //     PREDICTED NOT TAKEN
    R0 = [P0++]; [P1++] = R0;        //     *pt1++ = *pt0++;
    R1 += 1;
    JUMP LOOP;                       // }
LOOP_END: outside loop

Array-indexed version of the same loop:

for (int num = 0; num < 5; num++) {
    farAway[num] = fooHere[num];
}


Program Loops

Most programs have 1 or 2 loops embedded inside each other, occasionally 3 or more

For all images in a list
    For each row in each image
        For each column (pixel) in each row
            For each colour in each pixel

Important to get the maximum efficiency of the instructions that are executed the most often!


Efficiency of Standard software Loop

Suppose we go round the loop N times

2 loop control instructions outside of loop + 4 * N loop control instructions inside the loop

2 * N “useful instructions” inside loop + 4 useful set up instructions

$$\text{Loop efficiency} = \frac{4 + 2N}{(4 + 2N) + (2 + 4N)} \times 100\%$$

If N is large:

$$\frac{2N}{6N} \times 100\% \approx 33\%$$

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;

R1 = 0; R2 = 5;                      // int num = 0;
LOOP:
    CC = R2 <= R1;                   // for ( /* empty */; num < 5; num++) {
    IF CC JUMP LOOP_END;
    R0 = [P0++]; [P1++] = R0;        //     *pt1++ = *pt0++;
    R1 += 1;
    JUMP LOOP;                       // }
LOOP_END: outside loop


Down-counting software loop

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;

R1 = 5;                              // int num = 5;
CC = R1 <= 0;                        // if (num > 0)    Test needed if exact value of num not known
IF CC JUMP DO_WHILE_END;
DO_WHILE:                            // do {
    R0 = [P0++]; [P1++] = R0;        //     *pt1++ = *pt0++;
    R1 += -1;
    CC = R1 <= 0;
    IF !CC JUMP DO_WHILE (BP);       // } while ( (--num) > 0);
DO_WHILE_END: outside loop

Array-indexed version of the same loop:

for (int num = 0; num < 5; num++) {
    farAway[num] = fooHere[num];
}


Efficiency of Down-counting software Loop

Suppose we go round the loop N times

3 loop control instructions outside of loop + 3 * N loop control instructions inside the loop

2 * N “useful instructions” inside loop + 4 useful set up instructions

$$\text{Loop efficiency} = \frac{4 + 2N}{(4 + 2N) + (3 + 3N)} \times 100\%$$

If N is large:

$$\frac{2N}{5N} \times 100\% = 40\%$$

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;

R1 = 5;                              // int num = 5;
CC = R1 <= 0;                        // if (num > 0)    Test needed if exact value of num not known
IF CC JUMP DO_WHILE_END;
DO_WHILE:                            // do {
    R0 = [P0++]; [P1++] = R0;        //     *pt1++ = *pt0++;
    R1 += -1;
    CC = R1 <= 0;
    IF !CC JUMP DO_WHILE (BP);       // } while ( (--num) > 0);
DO_WHILE_END: outside loop


Efficient loops

Motorola MC68XXX has a specialized loop instruction – essentially
    Decrement the counter (data register) and start the jump occurring
    While the decrement is occurring, test if OLD COUNTER WAS LESS THAN ZERO.
    If old counter less than zero then stop the jump

Motorola has specialized memory operations WHICH TAKE MANY PROCESSOR CYCLES

Motorola has instruction [P1++] = [P0++] which has all the following steps – each taking 4 clock cycles
    Fetch instruction
    internReg.L = W[P0];
    internReg.H = W[P0+2];
    W[P1] = internReg.L;
    W[P1+2] = internReg.H;
    P0 += 4; P1 += 4;

TOTAL OF 24 cycles at 8 MHz


Efficiency of “Motorola-style” Down-counting software Loop with specialized branch instructions

Suppose we go round the loop N times

3 loop control instructions outside of loop + 1 * N loop control instructions inside the loop

1 * N “useful instructions” inside loop + 2 useful set up instructions

$$\text{Loop efficiency} = \frac{6 + 5N}{(6 + 5N) + (4 + 1N)} \times 100\%$$

If N is large:

$$\frac{5N}{6N} \times 100\% \approx 83\%$$

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0 = _fooHere; P1 = _farAway;        // long *pt0; pt0 = fooHere; long *pt1; pt1 = farAway;

R1 = (5 - 1);                        // int num = 5;
CC = R1 < 0;                         // if (num > 0)    Test needed if exact value of num not known
IF CC JUMP DO_WHILE_END;
DO_WHILE:                            // do {
    [P1++] = [P0++];                 //     *pt1++ = *pt0++;
    IF (R1 < 0) THEN CONTINUE OTHERWISE (R1 += -1) AND JUMP DO_WHILE (BP);    // } while ( (--num) > 0);
DO_WHILE_END: outside loop

NOTE: NOT AVAILABLE ON BLACKFIN


Blackfin Hardware Loops

Blackfin supports a mechanism for zero-overhead looping

Common design decision – the two inner-most loops are the most often executed – so make those the most efficient

The program sequencer contains TWO loop units, each containing three registers
    Loop Top registers – LT0, LT1
    Loop Bottom registers – LB0, LB1
    Loop Count registers – LC0, LC1


Blackfin Hardware Loops

The program sequencer contains TWO loop units, each containing three registers
    Loop Top registers – LT0, LT1
    Loop Bottom registers – LB0, LB1
    Loop Count registers – LC0, LC1

When an instruction at address X is executed (meaning PC == X), and the address X matches the contents of LBn (meaning PC == LBn), and the counter register is greater than or equal to 2 (LCn >= 2), THEN the next instruction will be taken from address LTn

Note that if two loops end on the same instruction then loop 1 has the highest priority


Pseudo code example

Set LT0 = first instruction in loop;    -- LOOP_START
Set LB0 = last instruction in loop;     -- LOOP_END
Set LC0 = 5;
LOOP_START: R0 = [P0++];
LOOP_END:   [P1++] = R0;

Manual (P4-16) says: Each loop register can be loaded individually with a register transfer, but this incurs a significant overhead if the loop count is non-zero (the loop is active) at the time of the transfer.

That sounds unpleasant – so let's find an easier way

Manual (P4-16) says: The LSETUP instruction can be used to load all three registers of a loop unit at the same time


Efficiency of Standard software Loop

Suppose we go round the loop N times

2 loop control instructions outside of loop + 4 * N loop control instructions inside the loop

2 * N “useful instructions” inside loop + 4 useful set up instructions

$$\text{Loop efficiency} = \frac{4 + 2N}{(4 + 2N) + (2 + 4N)} \times 100\%$$

If N is large:

$$\frac{2N}{6N} \times 100\% \approx 33\%$$

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;

R1 = 0; R2 = 5;                      // int num = 0;
LOOP:
    CC = R2 <= R1;                   // for ( /* empty */; num < 5; num++) {
    IF CC JUMP LOOP_END;
    R0 = [P0++]; [P1++] = R0;        //     *pt1++ = *pt0++;
    R1 += 1;
    JUMP LOOP;                       // }
LOOP_END: outside loop

WARNING: LOOP_END is an instruction that IS NOT EXECUTED INSIDE THE SOFTWARE LOOP


Efficiency of Hardware Loop

Suppose we go round the loop N times

2 loop control instructions outside of loop + 0 loop control instructions inside the loop – There are some pipeline overhead issues on leaving loop

2 * N “useful instructions” inside loop + 4 useful set up instructions

$$\text{Loop efficiency} = \frac{4 + 2N}{(4 + 2N) + 2} \times 100\%$$

If N is large:

$$\frac{2N}{2N} \times 100\% = 100\%$$

.extern _fooHere; .extern _farAway;  // extern long fooHere[5]; extern long farAway[5];

P0.H = _fooHere; P0.L = _fooHere;    // long *pt0; pt0 = fooHere;
P1.H = _farAway; P1.L = _farAway;    // long *pt1; pt1 = farAway;

P2 = 5;                              // int num = 0;
LSETUP( LOOP_START, LOOP_END) LC1 = P2;    // for ( /* empty */; num < 5; num++) {
LOOP_START:
    R0 = [P0++];                     //     *pt1++ = *pt0++;
LOOP_END:
    [P1++] = R0;                     // }
OUTSIDE_LOOP:

WARNING: LOOP_END is an instruction that IS EXECUTED INSIDE THE HARDWARE LOOP


Big warning

SOFTWARE LOOP

R1 = 0; R2 = 5;
LOOP:
    CC = R2 <= R1;
    IF CC JUMP LOOP_END;
    R0 = [P0++]; [P1++] = R0;
    R1 += 1;
    JUMP LOOP;
LOOP_END: outside loop

HARDWARE LOOP

LOOP_START:
    R0 = [P0++];
LOOP_END:
    [P1++] = R0;
OUTSIDE_LOOP:

LOOP_END Always executed in hardware loop


Warning and speed issues

The distance between the LSETUP instruction and the LOOP_START instruction MUST NOT BE MORE THAN 30 bytes (otherwise the offset description will not fit into the instruction).
    There is a 4 clock cycle advantage if LSETUP is the instruction immediately before the LOOP_START instruction

The distance between the LSETUP instruction and the LOOP_END instruction MUST NOT BE MORE THAN 2046 bytes (otherwise the offset description will not fit into the instruction)

The processor supports a four-location instruction loop buffer. If the loop code contains four or fewer instructions, then no fetches from instruction memory are necessary for any number of loop iterations because the instructions are stored locally. This eliminates instruction fetch time (especially important when accessing external memory)
    Really efficient loops are no more than 4 instructions long.
    Have requested information on whether that means 4 instructions, or 4 instructions which can be highly parallel (like 16 instructions in a non-parallel mode)


Tackled today

Program sequencer
Linear flow of instruction
Why not discuss idle instruction here?
Jumps
Software loops – normal and more efficient “down-counting” loops
Special Motorola MC68XXX software loop instructions
Loops – hardware loops
Subroutines – next lecture
Interrupts and Exceptions – next lecture
Idle – next lecture


Information taken from Analog Devices On-line Manuals with permission http://www.analog.com/processors/resources/technicalLibrary/manuals/

Information furnished by Analog Devices is believed to be accurate and reliable. However, Analog Devices assumes no responsibility for its use or for any infringement of any patent or other rights of any third party which may result from its use. No license is granted by implication or otherwise under any patent or patent right of Analog Devices. Copyright Analog Devices, Inc. All rights reserved.