Generating a software loop with memory accesses
TigerSHARC assembly syntax
04/19/23 TigerSHARC assemble code 2, M. Smith, ECE, University of Calgary, Canada
Concepts
Learning just enough TigerSHARC assembly code to make a software loop “work”
Comparing the timings for rectification of integer and floating point arrays, using debug C++ code, release C++ code and our FIRST_ASM code
Looking in “MIXED mode” at the code generated by the compiler
Test Driven Development
Describe Requirements
Design Solution
Build Solution
Test Solution
Write Acceptance Tests
Write Unit Tests
CUSTOMER
DEVELOPER
Work with customer to check that the tests properly express what the customer wants done. Iterative process with customer “heavily involved” – “Agile” methodology.
Note
Special marker
Compiler optimization
FLOATS 927 304 -- THREE-FOLD
INTS 960 150 -- SIX-FOLD
Why the difference? Can we do better, and do we want to?
Note the failures -- what are they?
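The fold figures above follow from dividing the first timing by the second. A minimal sketch of that arithmetic, assuming the two numbers on each row are the timings before and after compiler optimization (the slide labels neither its columns nor its units, and the helper name `SpeedUp` is hypothetical):

```cpp
#include <cassert>

// Hypothetical helper: ratio of the slower timing to the faster timing.
// Reading the columns as "before / after optimization" is an assumption.
double SpeedUp(double before, double after) {
    return before / after;
}

// Checks that the slide's rounded labels agree with the raw numbers.
void CheckSpeedUpLabels() {
    assert(SpeedUp(927.0, 304.0) > 3.0 && SpeedUp(927.0, 304.0) < 3.1);  // THREE-FOLD
    assert(SpeedUp(960.0, 150.0) > 6.3 && SpeedUp(960.0, 150.0) < 6.5);  // SIX-FOLD
}
```

Under that reading the float code improves by a little over 3 times and the integer code by 6.4 times.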
Write tests about passing values back from an assembly code routine
More detailed look at the code
Single semi-colons
Double semi-colons
Start function label
End function label
Used for "profiling code"
Label format similar to 68K. Needs leading underscore and final colon
As with 68K and Blackfin, needs a .section, but name and format different
As with 68K, need .align statement. Is the "4" in bytes (8 bits) or words (32 bits)?
As with 68K, need .global to tell other code that this function exists
Return registers
There are many, depending on what you need to return. Here we need to use J8
Many registers available - need ability to control usage
J0 to J31 - registers (integers and pointers) (SISD mode)
XR0 to XR31 - registers (integers) (SISD mode)
XFR0 to XFR31 - registers (floats) (SISD mode)
Did I also mention
I0 to I31 - registers (integers and pointers) (SISD mode)
YR0 to YR31, YFR0 to YFR31 (SIMD mode)
XYR, YXR and R registers (SIMD mode)
And also the MIMD modes
And the double registers and the quad registers ...
#define return_pt_J8 J8 // J8 is a VOLATILE, NON-PRESERVED register
Parameter passing
Spaces for first four parameters ARE ALWAYS present on the stack (as with 68K)
But the first four parameters are passed in registers (J4, J5, J6 and J7 most of the time) (as with MIPS)
The parameters passed in registers are often stored into the spaces on the stack (like the MIPS) when assembly code functions call assembly code functions
J4, J5, J6 and J7 are volatile, non-preserved registers
Can we pass back the start of the final array?
Still passing tests by accident, and this needs to be a conditional return value
What we need to know based on experiences from other processors
Can we return from an assembly language routine without crashing the processor?
Return a parameter from assembly language routine (Is it same for ints and floats?)
Pass parameters into assembly language (Is it same for ints and floats?)
Do IF THEN ELSE statements
Read and write values to memory
Read and write values in a loop
Do some mathematics on the values fetched from memory
All this stuff is demonstrated by coding HalfWaveRectifyASM( )
Why is ELSE a keyword
FOUR PART ELSE INSTRUCTION IS LEGAL
IF JLT; ELSE, J1 = J2 + J3;    // Conditional execution -- if true
        ELSE, XR1 = XR2 + XR3; // Conditional -- if true
        YFR1 = YFR2 + YFR3;;   // Unconditional -- always

IF JLT; DO, J1 = J2 + J3;      // Conditional execution -- if true
        DO, XR1 = XR2 + XR3;   // Conditional -- if true
        YFR1 = YFR2 + YFR3;;   // Unconditional -- always
Having this sort of format means that the instruction pipeline is not disrupted when we do IF statements
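The DO form above can be read as predication: the parts tagged DO take effect only when the condition holds, the untagged part always runs, and no jump is involved. A rough C++ model of that reading, assuming a hypothetical `Regs` struct and a boolean standing in for the JLT condition (this sketches the semantics only, not the hardware or its timing):

```cpp
#include <cassert>

// Hypothetical register snapshot; field names mirror the slide's registers.
struct Regs {
    int j1, j2, j3;
    int xr1, xr2, xr3;
    float yfr1, yfr2, yfr3;
};

// Models: IF JLT; DO, J1 = J2 + J3; DO, XR1 = XR2 + XR3; YFR1 = YFR2 + YFR3;;
// 'jlt' stands in for the JLT condition (an assumption for illustration).
Regs ExecuteDoLine(Regs r, bool jlt) {
    if (jlt) {                    // the two DO-tagged parts are conditional
        r.j1 = r.j2 + r.j3;
        r.xr1 = r.xr2 + r.xr3;
    }
    r.yfr1 = r.yfr2 + r.yfr3;     // the untagged part always executes
    return r;
}

// Checks that only the untagged part runs when the condition is false.
void CheckExecuteDoLine() {
    Regs start = {0, 1, 2, 0, 3, 4, 0.0f, 1.5f, 2.5f};
    Regs skipped = ExecuteDoLine(start, false);
    assert(skipped.j1 == 0 && skipped.xr1 == 0 && skipped.yfr1 == 4.0f);
}
```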
Label name is not the problem
NOTE: This is "C-like" syntax, but it is not "C"
Statement must end in ;; not ;
Add dual-semicolons everywhere
Worry about "multiple issues" later
This dual semi-colon is so important that you MUST code review for it all the time or else you waste so much time in the lab. Key in exams / quizzes
At last an error I know how to fix
Well I thought I understood it !!!
Speed issue – JUMPS can’t be too close together.
Not normally a problem when “if” is larger
Add a single instruction of 4 NOPs
nop; nop; nop; nop;;
Fix the last error as part of Assignment 1
Fix the remaining error in handling the IF THEN ELSE as part of Assignment 1
Worry about code efficiency later (refactor) when all code working
Target for this week. Changing this code into assembly (more speed)
Code we generated yesterday was similar to parts of this, but not equivalent. Refactor to make equivalent
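The C++ target code itself is not reproduced in this transcript. A minimal sketch of what a half-wave rectifier with the properties discussed so far might look like, assuming a hypothetical name `HalfWaveRectifyRef` and signature; returning a null pointer for a non-positive count is an assumption based on the `J8 = 0` discussion elsewhere in these slides:

```cpp
#include <cassert>

// Hypothetical reference version; NOT the code from the original slide.
// Copies initial_array into final_array, replacing negative values with 0.
// Returns the start of final_array, or nullptr when count <= 0 (assumed).
int *HalfWaveRectifyRef(const int initial_array[], int final_array[], int count) {
    if (count <= 0) {
        return nullptr;
    }
    for (int i = 0; i < count; i++) {
        int value = initial_array[i];   // read value from memory
        if (value < 0) {
            value = 0;                  // rectify: negative becomes zero
        }
        final_array[i] = value;         // write value to memory
    }
    return final_array;                 // pass back the start of the final array
}

// A unit test in the spirit of the Test Driven Development slide.
void CheckHalfWaveRectifyRef() {
    int input[3] = {-2, 0, 7};
    int output[3] = {9, 9, 9};
    assert(HalfWaveRectifyRef(input, output, 3) == output);
    assert(output[0] == 0 && output[1] == 0 && output[2] == 7);
}
```

The sketch touches every item on the "what we need to know" list: parameters in, a return value out, an IF, a loop, and memory reads and writes.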
The code was not exactly what we designed (C++ equivalent) – refactor and retest after the refactoring
NEXT STEP
Refactored C++ code
I THINK I UNDERSTAND ENOUGH TO CHANGE THE FORMAT OF THE IF-THEN-ELSE IN THIS CASE
Avoiding JUMPS in the main flow of the code will speed the flow of the code
Almost right. Look in the manual to find the correct syntax
IF NJLE; DO, J8 = 0
No syntax errors (No ERRORS). Code does not work (DEFECTS)
Run “forensic tests” to find out where DEFECT is being introduced
Add another line to the code
Can now spot the error
New format of IF-THEN-ELSE is doing exactly the opposite of what we want
Need JLE not NJLE
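A small C++ model shows why the inverted condition gives a defect rather than an error. This sketch assumes the condition is testing the element count, that JLE means "less than or equal to zero" and NJLE is its negation, and it uses hypothetical function names; the returned pointer stands in for J8:

```cpp
#include <cassert>

// Model of: IF NJLE; DO, J8 = 0   (the defective version)
const int *ReturnWithNJLE(const int *array_start, int count) {
    const int *j8 = array_start;
    if (!(count <= 0)) {    // NJLE: true for every valid count
        j8 = nullptr;       // so the valid pointer is wiped out
    }
    return j8;
}

// Model of: IF JLE; DO, J8 = 0    (the corrected version)
const int *ReturnWithJLE(const int *array_start, int count) {
    const int *j8 = array_start;
    if (count <= 0) {       // JLE: true only for an invalid count
        j8 = nullptr;
    }
    return j8;
}

// A "forensic test": only the JLE version hands back the array start.
void CheckConditionalReturn() {
    int data[2] = {1, 2};
    assert(ReturnWithJLE(data, 2) == data);
    assert(ReturnWithNJLE(data, 2) == nullptr);
}
```

Both versions are legal code, so only a test that checks the returned value exposes the difference.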
Assignment 1 – code the following as a software loop – follow MIPS approach
int CalculateSum(void) {
    int sum = 0;
    for (int count = 0; count < 6; count++) {
        sum = sum + count;
    }
    return sum;
}
Reminder - software for-loop becomes "while loop" with initial test
int CalculateSum(void) {
    int sum = 0;
    int count = 0;
    while (count < 6) {
        sum = sum + count;
        count++;
    }
    return sum;
}
Do line by line translation
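One way to prepare for a line-by-line translation is to rewrite the while loop with explicit labels and jumps, so that each C++ statement corresponds to roughly one assembly step. A sketch of that intermediate form, with hypothetical label and function names:

```cpp
#include <cassert>

// Same computation as CalculateSum(), restructured so the loop control is
// explicit. The label names are hypothetical, chosen for illustration.
int CalculateSumLineByLine(void) {
    int sum = 0;
    int count = 0;                      // easy to forget, and then the loop misbehaves
LOOP_TEST:
    if (!(count < 6)) goto LOOP_END;    // initial test: leave when count >= 6
    sum = sum + count;
    count++;
    goto LOOP_TEST;                     // jump back to the loop test
LOOP_END:
    return sum;
}

// 0 + 1 + 2 + 3 + 4 + 5 = 15
void CheckCalculateSumLineByLine() {
    assert(CalculateSumLineByLine() == 15);
}
```

This form still contains two jumps per iteration, which is where the "jumps too close together" issue shows up once it is translated.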
USE SOFTWARE LOOP HERE
Do loop control first
Have some jumps too close together
Run the tests with 4 nop padding to check that we get out of the loop as expected
Accessing memory
Basic mode
Special register J31 - acts as zero when used in additions
Pt_J5 is a pointer register into an array
Value_J1 is being used as a data register
J registers are like MIPS registers (used as pointer and data), NOT like 68K or Blackfin registers, which are either data or address but not both
1. Value_J1 = [Pt_J5];; read value from memory location pointed to by J5 -- compare to Blackfin Value_R0 = [Pt_P0];;
2. Value_J1 = [Pt_J5 + J31];; read value from memory location pointed to by J5 -- but read somewhere that this CAN be faster than just Value_J1 = [Pt_J5];; -- NEED TO CONFIRM
Accessing memory – step 2
Basic mode
Pt_J5 is a pointer register into an array
Offset_J4 is used as an offset
Value_J1 is being used as a data register
1. Read_J1 = [Pt_J5 + Offset_J4];; read value from memory location pointed to by (J5 + J4)
PRE-MODIFY – address used J5 + J4, no change in J5
2. Read_J1 = [Pt_J5 += Offset_J4];; read value from memory location pointed to by J5, and then perform add
POST-MODIFY – address used J5, then perform J5 = J5 + J4
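The difference between the two addressing forms can be mimicked with C++ pointers. This is only an analogy, assuming the offset counts whole array elements and using hypothetical function names; the pointer reference plays the role of Pt_J5:

```cpp
#include <cassert>
#include <cstdint>

// PRE-MODIFY model of   Read_J1 = [Pt_J5 + Offset_J4];;
// The address used is pt + offset; pt itself is not changed.
int32_t ReadPreModify(const int32_t *&pt, int offset) {
    return *(pt + offset);
}

// POST-MODIFY model of  Read_J1 = [Pt_J5 += Offset_J4];;
// The address used is pt; afterwards pt = pt + offset.
int32_t ReadPostModify(const int32_t *&pt, int offset) {
    int32_t value = *pt;
    pt += offset;
    return value;
}

// Checks which address is read and whether the pointer moves.
void CheckModifyModes() {
    int32_t data[4] = {10, 20, 30, 40};
    const int32_t *pt = data;
    assert(ReadPreModify(pt, 2) == 30 && pt == data);       // pointer unchanged
    assert(ReadPostModify(pt, 2) == 10 && pt == data + 2);  // pointer advanced
}
```

Post-modify with an offset of 1 is the natural fit for stepping through an array inside a loop, since the read and the pointer update happen together.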
Add in the memory accesses
FORGET TigerSHARC = RISC PROCESSOR
LOAD/STORE ONLY, like MIPS
Must place value into register, and then copy register to memory
Understand the error message
Too many J resource usage = missing ;;
Note: Missing label is not an assembler error, it’s a linker error
Now that the assembler knows where "CONTINUE" is, it can tell you that you have two JUMPs too close together
Fix with the magic 4 nops, and lose one cycle
Not getting expected test results
Something is logically wrong (DEFECT)
Obvious question - are we even getting into the loop? Add BREAKPOINT to test (not to code follow)
NEVER GOT TO BREAKPOINT means never entered loop
Forgot to do count = 0
So not even getting into loop as there is a garbage value in Count_J0
Not bad for a first effort
Faster than compiler in debug mode
Where did the float ASM code suddenly appear from?
Integer 0 has bit pattern 0x0000 0000
Float 0.0 has bit pattern 0x0000 0000
Integer +6 has format b 0??? ???? ???? ???? ???? ???? ???? ????
Float +6.0 has format b 0??? ???? ???? ???? ???? ???? ???? ????
Integer -6 has format b 1??? ???? ???? ???? ???? ???? ???? ????
Float -6.0 has format b 1??? ???? ???? ???? ???? ???? ???? ????
Formats are very different, but the sign bit is in the same place
[Float format diagram: sign bit S, EXPONENT]
Float algorithm - if S == 1 (negative) set to zero, otherwise leave unchanged - same as integer algorithm
Just re-use integer algorithm with a change of name
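The sign-bit argument can be checked in C++ by applying the integer rule to the raw bits of a float. A minimal sketch, assuming `float` is a 32-bit IEEE 754 single and using hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Integer rule from the slides: negative becomes zero, otherwise unchanged.
int32_t RectifyInt(int32_t value) {
    return (value < 0) ? 0 : value;
}

// Float version that reuses the integer rule on the raw 32-bit pattern.
// Assumes float is a 32-bit IEEE 754 single, with the sign bit in bit 31.
float RectifyFloatViaInt(float value) {
    static_assert(sizeof(float) == sizeof(int32_t), "expects a 32-bit float");
    int32_t bits;
    std::memcpy(&bits, &value, sizeof bits);    // copy out the float's bit pattern
    bits = RectifyInt(bits);                    // sign bit set gives all-zero bits
    std::memcpy(&value, &bits, sizeof value);   // copy the pattern back into a float
    return value;                               // all-zero bits read back as 0.0
}

// Negative floats become 0.0; positive floats pass through untouched.
void CheckRectifyFloatViaInt() {
    assert(RectifyFloatViaInt(-6.0f) == 0.0f);
    assert(RectifyFloatViaInt(6.0f) == 6.0f);
}
```

In C++ the copy through `memcpy` is needed to view the bits legally; in assembly the same register contents are simply treated as an integer, which is why the float routine can be the integer routine under a different name.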
Final code – Float rectify code just has a different name
What we NOW KNOW
Can we return from an assembly language routine without crashing the processor?
Return a parameter from assembly language routine (Is it same for ints and floats?)
Pass parameters into assembly language (Is it same for ints and floats?)
Do IF THEN ELSE statements
Read and write values to memory
Read and write values in a loop
Do some mathematics on the values fetched from memory
All this stuff is demonstrated by coding HalfWaveRectifyASM( )