Assembly Strings and Arrays


Upload: omar-sallam

Post on 22-Oct-2014


4345 Assembly Language

Dr. Esam Al_Qaralleh
CE Department

Princess Sumaya University for Technology

Strings and Arrays

String Instructions

• A string is a collection of bytes, words, or long-words that can be up to 64 KB in length.
• String instructions can have at most two operands. One is referred to as the source string and the other as the destination string.
∗ The source string must be located in the Data Segment, and the SI register points to the current element of the source string.
∗ The destination string must be located in the Extra Segment, and the DI register points to the current element of the destination string.

Source String (DS : SI)

Address    Byte  Char
0510:0000  53    S
0510:0001  48    H
0510:0002  4F    O
0510:0003  50    P
0510:0004  50    P
0510:0005  45    E
0510:0006  52    R

Destination String (ES : DI)

Address    Byte  Char
02A8:2000  53    S
02A8:2001  48    H
02A8:2002  4F    O
02A8:2003  50    P
02A8:2004  50    P
02A8:2005  49    I
02A8:2006  4E    N


String Primitive Instructions

• MOVSB, MOVSW, and MOVSD
• CMPSB, CMPSW, and CMPSD
• SCASB, SCASW, and SCASD
• STOSB, STOSW, and STOSD
• LODSB, LODSW, and LODSD

MOVSB, MOVSW, and MOVSD (1 of 2)

• The MOVSB, MOVSW, and MOVSD instructions copy data from the memory location pointed to by DS:SI to the memory location pointed to by ES:DI.

    .data
    source DWORD 0FFFFFFFFh
    target DWORD ?
    .code
    mov si,OFFSET source
    mov di,OFFSET target
    movsd


MOVSB, MOVSW, and MOVSD (2 of 2)

• SI and DI are automatically incremented or decremented:
∗ MOVSB increments/decrements by 1
∗ MOVSW increments/decrements by 2
∗ MOVSD increments/decrements by 4

Direction Flag

• The Direction Flag (DF) controls the way SI and DI are adjusted during the execution of a string instruction
∗ DF=0: SI and DI auto-increment during the execution; otherwise (DF=1), SI and DI auto-decrement
∗ Instruction to set DF: STD; instruction to clear DF: CLD
∗ Example:

    CLD
    MOV CX, 5
    REP MOVSB

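At the beginning of execution, DS = 0510H and SI = 0000H. SI advances by one byte each time MOVSB executes, while CX counts down:

Source String (DS : SI)

Address    Byte  Char  SI position
0510:0000  53    S     SI when CX = 5
0510:0001  48    H     SI when CX = 4
0510:0002  4F    O     SI when CX = 3
0510:0003  50    P     SI when CX = 2
0510:0004  50    P     SI when CX = 1
0510:0005  45    E     SI when CX = 0
0510:0006  52    R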


Repeat Prefix Instructions

REP String Instruction

• The prefix makes the microprocessor repeatedly execute the string instruction until CX decrements to 0 (CX is decreased by one each time the string instruction is executed).
• For example:

    MOV CX, 5
    REP MOVSB

With the above two instructions, the microprocessor will execute MOVSB 5 times.

• Execution flow of REP MOVSB:

    while (CX != 0)
    {
        CX = CX - 1;
        MOVSB;
    }

OR

    Check_CX:
        if CX != 0 then
            CX = CX - 1;
            MOVSB;
            goto Check_CX;
        end if
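The execution flow of REP MOVSB can also be traced in a high-level language. The following is a minimal C sketch, not a definitive model of the processor: the `StringRegs` structure, the flat `mem` array (segments are ignored), and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register model; segments are ignored and mem[] is flat. */
typedef struct {
    size_t si;    /* index of the current source byte */
    size_t di;    /* index of the current destination byte */
    uint16_t cx;  /* repeat counter */
    int df;       /* direction flag: 0 = auto-increment, 1 = auto-decrement */
} StringRegs;

/* One MOVSB: copy one byte, then step SI and DI according to DF. */
static void movsb(uint8_t *mem, StringRegs *r) {
    mem[r->di] = mem[r->si];
    if (r->df) { r->si--; r->di--; }
    else       { r->si++; r->di++; }
}

/* REP MOVSB: repeat the string instruction until CX reaches 0. */
static void rep_movsb(uint8_t *mem, StringRegs *r) {
    assert(mem != NULL && r != NULL);
    while (r->cx != 0) {
        r->cx--;
        movsb(mem, r);
    }
}
```

Starting with CX = 5 and DF = 0, the loop body runs five times and leaves SI and DI five bytes past their starting positions, which matches the CLD / MOV CX, 5 / REP MOVSB example.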

Using a Repeat Prefix

• Example: Copy 20 doublewords from source to target

    .data
    source DWORD 20 DUP(?)
    target DWORD 20 DUP(?)
    .code
    cld                      ; direction = forward
    mov cx,LENGTHOF source   ; set REP counter
    mov si,OFFSET source
    mov di,OFFSET target
    rep movsd


String Instructions: MOVSB (MOVSW) Example

Source String (DS : SI): "SHOPPER" at 0510:0000 through 0510:0006 (bytes 53 48 4F 50 50 45 52)
Destination String (ES : DI): starts at 0300:0100

    MOV AX, 0510H
    MOV DS, AX
    MOV SI, 0
    MOV AX, 0300H
    MOV ES, AX
    MOV DI, 100H
    CLD
    MOV CX, 5
    REP MOVSB

Repeat Prefix Instructions

• REPZ String Instruction: repeat the execution of the string instruction until CX=0 or the zero flag is clear
• REPNZ String Instruction: repeat the execution of the string instruction until CX=0 or the zero flag is set
• REPE String Instruction: repeat the execution of the string instruction until CX=0 or the zero flag is clear
• REPNE String Instruction: repeat the execution of the string instruction until CX=0 or the zero flag is set


Your turn . . .

• Use MOVSD to delete the first element of the following double-word array. All subsequent array values must be moved one position forward, toward the beginning of the array:

    array DWORD 1,1,2,3,4,5,6,7,8,9,10

    .data
    array DWORD 1,1,2,3,4,5,6,7,8,9,10
    .code
    cld
    mov cx,(LENGTHOF array) - 1
    mov si,OFFSET array+4
    mov di,OFFSET array
    rep movsd

CMPSB, CMPSW, and CMPSD

• The CMPSB, CMPSW, and CMPSD instructions each compare a memory operand pointed to by DS:SI to a memory operand pointed to by ES:DI.
∗ CMPSB compares bytes
∗ CMPSW compares words
∗ CMPSD compares doublewords

• Repeat prefix often used
∗ REPE (REPZ)
∗ REPNE (REPNZ)


String Instructions: CMPSB (CMPSW) Example

Assume:
    ES = 02A8H
    DI = 2000H
    DS = 0510H
    SI = 0000H

    CLD
    MOV CX, 9
    REPZ CMPSB

What is the value of CX after the execution?

Source String (DS : SI)      Destination String (ES : DI)
Address    Byte  Char        Address    Byte  Char
0510:0000  53    S           02A8:2000  53    S
0510:0001  48    H           02A8:2001  48    H
0510:0002  4F    O           02A8:2002  4F    O
0510:0003  50    P           02A8:2003  50    P
0510:0004  50    P           02A8:2004  50    P
0510:0005  45    E           02A8:2005  49    I
0510:0006  52    R           02A8:2006  4E    N
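One way to check an answer to the question above is to trace REPZ CMPSB in a high-level language. The C sketch below is an illustration under stated assumptions: the `CmpRegs` structure and `repe_cmpsb` function are hypothetical names, segments are ignored, and DF is assumed to be 0 (as after CLD).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register model for a repeated byte comparison. */
typedef struct {
    size_t si;    /* offset of the current source byte */
    size_t di;    /* offset of the current destination byte */
    uint16_t cx;  /* repeat counter */
    int zf;       /* zero flag: 1 if the last pair compared equal */
} CmpRegs;

/* REPZ/REPE CMPSB with DF = 0: stop when CX reaches 0 or a pair differs. */
static void repe_cmpsb(const uint8_t *src, const uint8_t *dst, CmpRegs *r) {
    assert(src != NULL && dst != NULL && r != NULL);
    while (r->cx != 0) {
        r->cx--;                              /* CX counts every comparison */
        r->zf = (src[r->si] == dst[r->di]);   /* CMPSB sets the flags */
        r->si++;                              /* SI and DI advance either way */
        r->di++;
        if (!r->zf) break;                    /* zero flag clear: stop */
    }
}
```

With "SHOPPER" as the source, "SHOPPIN" as the destination, and CX = 9, this model performs six comparisons (the sixth pair, 'E' and 'I', differs) and stops with CX = 3 and both SI and DI six bytes past their starting offsets.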

Comparing Arrays

    .data
    source DWORD COUNT DUP(?)
    target DWORD COUNT DUP(?)
    .code
    mov ecx,COUNT            ; repetition count
    mov esi,OFFSET source
    mov edi,OFFSET target
    cld                      ; direction = forward
    repe cmpsd               ; repeat while equal

Use a REPE (repeat while equal) prefix to compare corresponding elements of two arrays.


Example: Comparing Two Strings (1 of 3)

This program compares two strings (source and destination). It displays a message indicating whether the lexical value of the source string is less than the destination string.

    .data
    source BYTE "MARTIN "
    dest BYTE "MARTINEZ"
    str1 BYTE "Source is smaller",0dh,0ah,0
    str2 BYTE "Source is not smaller",0dh,0ah,0

Screen output:

    Source is smaller

Example: Comparing Two Strings (2 of 3)

    .code
    main PROC
        cld                      ; direction = forward
        mov si,OFFSET source
        mov di,OFFSET dest
        mov cx,LENGTHOF source
        repe cmpsb
        jb source_smaller
        mov dx,OFFSET str2       ; "source is not smaller"
        jmp done
    source_smaller:
        mov dx,OFFSET str1       ; "source is smaller"
    done:
        call WriteString
        exit
    main ENDP
    END main


Example: Comparing Two Strings (3 of 3)

• The following diagram shows the final values of SI and DI after comparing the strings:

[Diagram: Source "MARTIN " and Dest "MARTINEZ" drawn twice, labeled Before and After, with the positions of the source pointer (ESI) and the destination pointer (EDI) marked in each.]

Your turn . . .

• Modify the String Comparison program from the previous two slides. Prompt the user for both the source and destination strings.

• Sample output:

    Input first string: ABCDEFG
    Input second string: ABCDDG

The first string is not smaller.


SCASB, SCASW, and SCASD

• The SCASB, SCASW, and SCASD instructions compare a value in AL/AX/EAX to a byte, word, or doubleword, respectively, addressed by ES:DI.

• Useful types of searches:
∗ Search for a specific element in a long string or array.
∗ Search for the first element that does not match a given value.

SCASB Example

Search for the letter 'F' in a string named alpha:

    .data
    alpha BYTE "ABCDEFGH",0
    .code
    mov di,OFFSET alpha
    mov al,'F'               ; search for 'F'
    mov cx,LENGTHOF alpha
    cld
    repne scasb              ; repeat while not equal
    jnz quit
    dec di                   ; DI points to 'F'

What is the purpose of the JNZ instruction?
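Tracing REPNE SCASB in a high-level language suggests an answer. The C sketch below is a simplified model under stated assumptions: `ScanRegs` and `repne_scasb` are hypothetical names, segments are ignored, and DF is assumed to be 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register model for a repeated scan. */
typedef struct {
    size_t di;    /* offset of the current byte in the scanned string */
    uint16_t cx;  /* repeat counter */
    int zf;       /* zero flag: 1 if the last byte compared equal to AL */
} ScanRegs;

/* REPNE SCASB with DF = 0: stop when CX reaches 0 or a byte equals al. */
static void repne_scasb(const uint8_t *str, uint8_t al, ScanRegs *r) {
    assert(str != NULL && r != NULL);
    while (r->cx != 0) {
        r->cx--;
        r->zf = (str[r->di] == al);  /* SCASB compares AL with the byte at DI */
        r->di++;                     /* DI advances even when a match is found */
        if (r->zf) break;            /* zero flag set: stop */
    }
}
```

In this model the scan can end for two reasons: a match (zero flag set, DI one byte past the match, so DEC DI is needed) or CX running out (zero flag clear). The zero flag tells the two cases apart, which is what JNZ tests before DEC DI is executed.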


STOSB, STOSW, and STOSD

• The STOSB, STOSW, and STOSD instructions store the contents of AL/AX/EAX, respectively, in memory at the offset pointed to by ES:DI.

• Example: fill an array with 0FFh

    .data
    Count = 100
    string1 BYTE Count DUP(?)
    .code
    mov al,0FFh              ; value to be stored
    mov di,OFFSET string1    ; ES:DI points to target
    mov cx,Count             ; character count
    cld                      ; direction = forward
    rep stosb                ; fill with contents of AL

LODSB, LODSW, and LODSD

• LODSB, LODSW, and LODSD load a byte, word, or doubleword from memory at DS:SI into AL/AX/EAX, respectively.

• Example:

    .data
    array BYTE 1,2,3,4,5,6,7,8,9
    .code
    mov si,OFFSET array
    mov cx,LENGTHOF array
    cld
L1: lodsb                    ; load byte into AL
    or al,30h                ; convert to ASCII
    call WriteChar           ; display it
    loop L1


Array Multiplication Example

Multiply each element of a doubleword array by a constant value.

    .data
    array DWORD 1,2,3,4,5,6,7,8,9,10
    multiplier DWORD 10
    .code
    cld                      ; direction = up
    mov si,OFFSET array      ; source index
    mov di,si                ; destination index
    mov cx,LENGTHOF array    ; loop counter
L1: lodsd                    ; copy [SI] into EAX
    mul multiplier           ; multiply by a value
    stosd                    ; store EAX at [DI]
    loop L1

Arrays


One-Dimensional Arrays

• Array declaration in HLL (such as C)

    int test_marks[10];

specifies a lot of information about the array:
» Name of the array (test_marks)
» Number of elements (10)
» Element size (2 bytes, assuming a 16-bit int)
» Interpretation of each element (int, i.e., signed integer)
» Index range (0 to 9 in C)

• You get very little help in assembly language!

Arrays (cont’d)

• In assembly language, a declaration such as

    test_marks DW 10 DUP (?)

only assigns a name and allocates storage space.
∗ You, as the assembly language programmer, have to "properly" access the array elements by taking the element size and the range of subscripts into account.

• Accessing an array element requires its displacement or offset relative to the start of the array in bytes


Arrays (cont’d)

• To compute displacement, we need to know how the array is laid out
» Simple for 1-D arrays

• Assuming C style subscripts

    displacement = subscript * element size in bytes

Multidimensional Arrays

• We focus on two-dimensional arrays
» Our discussion can be generalized to higher dimensions

• A 5×3 array can be declared in C as

    int class_marks[5][3];

• Two-dimensional arrays can be stored in one of two ways:
∗ Row-major order
» Array is stored row by row
» Most HLLs, including C and Pascal, use this method
∗ Column-major order
» Array is stored column by column
» FORTRAN uses this method


Multidimensional Arrays (cont’d)

• Why do we need to know the underlying storage representation?
» In a HLL, we really don't need to know
» In assembly language, we need this information as we have to calculate the displacement of the element to be accessed

• In assembly language,

    class_marks DW 5*3 DUP (?)

allocates 30 bytes of storage

• There is no support for using row and column subscripts
» Need to translate these subscripts into a displacement value


Multidimensional Arrays (cont’d)

• Assuming C language subscript convention (row-major order), we can express the displacement of an element in a 2-D array at row i and column j as

    displacement = (i * COLUMNS + j) * ELEMENT_SIZE

where
    COLUMNS = number of columns in the array
    ELEMENT_SIZE = element size in bytes

Example: Displacement of the class_marks[3][1] element is (3*3 + 1) * 2 = 20
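The displacement formula can be checked against a C compiler's own array layout. The following is a small sketch; the function names are hypothetical, and the column-major variant is an assumption added for comparison with the row-major formula above, not something taken from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row-major displacement in bytes of element [i][j] (C-style subscripts). */
static size_t displacement(size_t i, size_t j, size_t columns,
                           size_t element_size) {
    assert(j < columns);
    return (i * columns + j) * element_size;
}

/* Column-major counterpart (assumed form): whole columns are stored
   one after another, so the column subscript is scaled by the row count. */
static size_t displacement_col_major(size_t i, size_t j, size_t rows,
                                     size_t element_size) {
    assert(i < rows);
    return (j * rows + i) * element_size;
}
```

For a 5×3 array of 2-byte elements, `displacement(3, 1, 3, 2)` gives 20, which is also the byte distance C reports between `&class_marks[3][1]` and the start of an `int16_t class_marks[5][3]` array. The same element in column-major order would sit at `displacement_col_major(3, 1, 5, 2)`, which is 16.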

Examples


Reverse an array

• Reverse an array of N byte elements (N ≥ 2). BX has the number of elements N, SI points to the array

    MOV DI, SI           ; DI = SI
    DEC BX               ; BX = N - 1
    ADD DI, BX           ; DI points to the last element
    INC BX               ; BX = N
    SHR BX, 1            ; BX = N/2
REVERSE:
    MOV AL, [SI]         ; exchange the bytes at [SI] and [DI]
    XCHG AL, [DI]
    MOV [SI], AL
    INC SI
    DEC DI
    DEC BX
    JNZ REVERSE

TWO-DIMENSIONAL ARRAY EXAMPLE

• Suppose A is a 5x7 word array: 1) clear row 3. 2) clear column 4

1) The starting address of row 3 is

    A + [(3-1) x 7] x 2 = A + 28

    MOV BX, 28           ; BX indexes row 3
    XOR SI, SI           ; SI will index the columns
    MOV CX, 7            ; number of elements
CLEAR:
    MOV A[BX][SI], 0
    ADD SI, 2
    DEC CX
    JNZ CLEAR


TWO-DIMENSIONAL ARRAY EXAMPLE

• Suppose A is a 5x7 word array: 1) clear row 3. 2) clear column 4

2) The starting address of column 4 is

    A + (4-1) x 2 = A + 6

    XOR BX, BX           ; BX indexes rows
    MOV SI, 6            ; SI will index column 4
    MOV CX, 5            ; number of elements
CLEAR:
    MOV A[BX][SI], 0
    ADD BX, 14           ; go to next row (7 elements * 2 bytes)
    DEC CX
    JNZ CLEAR
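The two loops differ only in their starting displacement and their stride. The C sketch below shows that pattern on a 5x7 array of 16-bit words; `clear_words` is a hypothetical helper, and the byte-level addressing is deliberate so that the numbers match the assembly (28 and 2 for row 3, 6 and 14 for column 4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ELEMENT_SIZE = 2 };  /* a word is 2 bytes */

/* Zero `count` words, beginning `start` bytes from `base` and moving
   `stride` bytes between consecutive words. */
static void clear_words(uint8_t *base, size_t start, size_t stride,
                        size_t count) {
    assert(base != NULL);
    for (size_t k = 0; k < count; k++) {
        memset(base + start + k * stride, 0, ELEMENT_SIZE);
    }
}
```

A call such as `clear_words((uint8_t *)A, 28, 2, 7)` would clear row 3 (C index 2), and `clear_words((uint8_t *)A, 6, 14, 5)` would clear column 4 (C index 3), assuming `A` is declared as `int16_t A[5][7]`.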

String Copy Example

• Copy string1 into string2

    STRING1 DB 'HELLO'
    STRING2 DB 5 DUP(0)

    LEA SI, STRING1
    LEA DI, STRING2
    MOV CX, 5
    CLD
    REP MOVSB


String Copy Example

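• Copy string1 into string2 in reverse order

    STRING1 DB 'HELLO'
    STRING2 DB 5 DUP(0)

    LEA SI, STRING1+4    ; SI points to the last character of STRING1
    LEA DI, STRING2
    STD                  ; DF = 1: MOVSB decrements SI and DI
    MOV CX, 5
MOVE:
    MOVSB
    ADD DI, 2            ; undo the decrement and advance DI by one
    DEC CX
    JNZ MOVE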
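For the copy to come out reversed, the source pointer has to walk backward while the destination pointer walks forward. The C sketch below traces that idea step by step, assuming MOVSB runs with the direction flag set (so it decrements both pointers) and ADD DI, 2 then corrects DI. The function name and the use of array indices instead of registers are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes from src to dst in reverse order, one "MOVSB" at a time. */
static void copy_reversed(const char *src, char *dst, size_t n) {
    assert(src != NULL && dst != NULL);
    ptrdiff_t si = (ptrdiff_t)n - 1;  /* SI starts at the last source byte */
    ptrdiff_t di = 0;                 /* DI starts at the first destination byte */
    for (size_t cx = n; cx != 0; cx--) {
        dst[di] = src[si];            /* MOVSB copies one byte ... */
        si--;                         /* ... then, with DF = 1, decrements SI */
        di--;                         /* ... and DI */
        di += 2;                      /* ADD DI, 2: net effect is DI + 1 */
    }
}
```

Applied to the five bytes of 'HELLO', this produces 'OLLEH' in the destination.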

Two-Dimensional Table Example

Imagine a table with three rows and five columns. The data can be arranged in any format on the page:

    table BYTE 10h, 20h, 30h, 40h, 50h
          BYTE 60h, 70h, 80h, 90h, 0A0h
          BYTE 0B0h, 0C0h, 0D0h, 0E0h, 0F0h
    NumCols = 5

Alternative format:

    table BYTE 10h,20h,30h,40h,50h,60h,70h,80h,90h,0A0h,0B0h,0C0h,0D0h,0E0h,0F0h
    NumCols = 5


Two-Dimensional Table Example

The following code loads the table element stored in row 1, column 2:

    RowNumber = 1
    ColumnNumber = 2

    mov bx,NumCols * RowNumber
    mov si,ColumnNumber
    mov al,table[bx + si]

[Diagram: the table bytes 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0 laid out in memory, with table at offset 150, table[ebx] at offset 155, and table[ebx + esi] at offset 157.]

Searching and Sorting Integer Arrays

• Bubble Sort
∗ A simple sorting algorithm that works well for small arrays

• Binary Search
∗ A simple searching algorithm that works well for large arrays of values that have been placed in either ascending or descending order


Bubble Sort

One Pass (Bubble Sort)

Each pair of adjacent values is compared, and exchanged if the values are not ordered correctly:

    3 1 7 5 2 9 4 3    (start)
    1 3 7 5 2 9 4 3    (3 and 1 exchanged)
    1 3 7 5 2 9 4 3
    1 3 5 7 2 9 4 3    (7 and 5 exchanged)
    1 3 5 2 7 9 4 3    (7 and 2 exchanged)
    1 3 5 2 7 9 4 3
    1 3 5 2 7 4 9 3    (9 and 4 exchanged)
    1 3 5 2 7 4 3 9    (9 and 3 exchanged)

Bubble Sort Pseudocode

N = array size, cx1 = outer loop counter, cx2 = inner loop counter:

    cx1 = N - 1
    while( cx1 > 0 )
    {
        si = addr(array)
        cx2 = cx1
        while( cx2 > 0 )
        {
            if( array[si] > array[si+4] )
                exchange( array[si], array[si+4] )
            add si,4
            dec cx2
        }
        dec cx1
    }
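The same loop structure can be written as a small C sketch that sorts into ascending order, which is the order the one-pass example shows. The function names are hypothetical, and array indices stand in for the byte offsets that `si` holds in the pseudocode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pass: compare `pairs` adjacent pairs, starting at the front,
   and exchange any pair that is out of order. */
static void bubble_pass(int32_t *array, size_t pairs) {
    for (size_t i = 0; i < pairs; i++) {
        if (array[i] > array[i + 1]) {
            int32_t tmp = array[i];
            array[i] = array[i + 1];
            array[i + 1] = tmp;
        }
    }
}

/* Outer loop: cx1 runs from N - 1 down to 1, one pass per value. */
static void bubble_sort(int32_t *array, size_t n) {
    assert(array != NULL);
    if (n < 2) return;  /* nothing to compare */
    for (size_t cx1 = n - 1; cx1 > 0; cx1--) {
        bubble_pass(array, cx1);
    }
}
```

Each pass moves the largest remaining value to the end of the unsorted part, so the next pass can compare one pair fewer. That is why the inner counter starts at `cx1` rather than at N - 1 every time.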


Bubble Sort Implementation

BubbleSort PROC USES eax ecx esi,
    pArray:PTR DWORD,
    Count:DWORD

    mov ecx,Count
    dec ecx                  ; decrement count by 1
L1: push ecx                 ; save outer loop count
    mov esi,pArray           ; point to first value
L2: mov eax,[esi]            ; get array value
    cmp [esi+4],eax          ; compare a pair of values
    jge L3                   ; if [esi] <= [esi+4], skip
    xchg eax,[esi+4]         ; else exchange the pair
    mov [esi],eax
L3: add esi,4                ; move both pointers forward
    loop L2                  ; inner loop
    pop ecx                  ; retrieve outer loop count
    loop L1                  ; else repeat outer loop
L4: ret
BubbleSort ENDP

Binary Search

• Searching algorithm, well-suited to large ordered data sets

• Divide and conquer strategy
• Each "guess" divides the list in half
• Classified as an O(log n) algorithm:
∗ As the number of array elements increases by a factor of n, the average search time increases by a factor of log n.


Binary Search Estimates

Binary Search Pseudocode

    int BinSearch( int values[], const int searchVal, int count )
    {
        int first = 0;
        int last = count - 1;
        while( first <= last )
        {
            int mid = (last + first) / 2;
            if( values[mid] < searchVal )
                first = mid + 1;
            else if( values[mid] > searchVal )
                last = mid - 1;
            else
                return mid;    // success
        }
        return -1;    // not found
    }
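To see the O(log n) behavior in practice, the search can be instrumented to count how many times the loop body runs. The sketch below is a hypothetical variant of the pseudocode, not part of the original slides: the `probes` output parameter is an added assumption, and the midpoint is computed as `first + (last - first) / 2`, which gives the same result without the intermediate sum `last + first`.

```c
#include <assert.h>
#include <stddef.h>

/* Binary search over an ascending array that also reports, through
   *probes, how many elements were examined. Returns the index of
   search_val, or -1 if it is not present. */
static int bin_search_counted(const int *values, int count,
                              int search_val, int *probes) {
    assert(values != NULL && probes != NULL);
    int first = 0;
    int last = count - 1;
    *probes = 0;
    while (first <= last) {
        int mid = first + (last - first) / 2;
        (*probes)++;
        if (values[mid] < search_val)
            first = mid + 1;
        else if (values[mid] > search_val)
            last = mid - 1;
        else
            return mid;  /* success */
    }
    return -1;           /* not found */
}
```

On a 1,024-element ascending array, no search in this sketch examines more than 11 elements, because each probe halves the range that is left.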


Binary Search Implementation (1 of 3)

BinarySearch PROC uses ebx edx esi edi,
    pArray:PTR DWORD,        ; pointer to array
    Count:DWORD,             ; array size
    searchVal:DWORD          ; search value
    LOCAL first:DWORD,       ; first position
    last:DWORD,              ; last position
    mid:DWORD                ; midpoint

    mov first,0              ; first = 0
    mov eax,Count            ; last = (count - 1)
    dec eax
    mov last,eax
    mov edi,searchVal        ; EDI = searchVal
    mov ebx,pArray           ; EBX points to the array

L1: ; while first <= last
    mov eax,first
    cmp eax,last
    jg L5                    ; exit search

Binary Search Implementation (2 of 3)

    ; mid = (last + first) / 2
    mov eax,last
    add eax,first
    shr eax,1
    mov mid,eax

    ; EDX = values[mid]
    mov esi,mid
    shl esi,2                ; scale mid value by 4
    mov edx,[ebx+esi]        ; EDX = values[mid] (base-index addressing)

    ; if ( EDX < searchval(EDI) )
    ;     first = mid + 1;
    cmp edx,edi
    jge L2
    mov eax,mid              ; first = mid + 1
    inc eax
    mov first,eax
    jmp L4                   ; continue the loop


Binary Search Implementation (3 of 3)

    ; else if( EDX > searchVal(EDI) )
    ;     last = mid - 1;
L2: cmp edx,edi              ; (could be removed)
    jle L3
    mov eax,mid              ; last = mid - 1
    dec eax
    mov last,eax
    jmp L4                   ; continue the loop

    ; else return mid
L3: mov eax,mid              ; value found
    jmp L9                   ; return (mid)

L4: jmp L1                   ; continue the loop
L5: mov eax,-1               ; search failed
L9: ret
BinarySearch ENDP

Summary

• String primitives are optimized for efficiency
• Strings and arrays are essentially the same
• Keep code inside loops simple
• Use base-index operands with two-dimensional arrays
• Avoid the bubble sort for large arrays
• Use binary search for large sequentially ordered arrays