An assembly (or assembler) language, often abbreviated asm, is a low-level programming language for a computer, or other programmable device, in which there is a very strong (generally one-to-one) correspondence between the language and the architecture's machine code instructions.
2. Starting with an Example
2
TITLE Add and Subtract (AddSub.asm)
; Adds and subtracts three 32-bit integers
; (10000h + 40000h + 20000h)
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h ; EAX = 10000h
add eax,40000h ; EAX = 50000h
sub eax,20000h ; EAX = 30000h
call DumpRegs ; display registers
exit
main ENDP
END main
Title/header
Include file
Code section
3. 3
Meanings of the Code
Assembly code Machine code
MOV EAX, 10000h B8 00010000
(Move 10000h into EAX)
ADD EAX, 40000h 05 00040000
(Add 40000h to EAX)
SUB EAX, 20000h 2D 00020000
(SUB 20000h from EAX)
3
Operand in instruction
4. 4
Fetched MOV EAX, 10000h
4
…
ALU
MemoryRegister
EAX
EBX
address
00
…
MOV EAX, 10000h
SUB EAX, 20000h
data
PC
0000011B8 00010000
IR
B8
00
01
00
00
ADD EAX, 40000h
05
00
04
00
5. 5
Execute MOV EAX, 10000h
5
…
ALU
Register
EAX
EBX
data
PC
0000011B8 00010000
IR
00010000
00
…
MOV EAX, 10000h
SUB EAX, 20000h
B8
00
01
00
00
ADD EAX, 40000h
05
00
04
00
Memory
address
6. 6
Fetched ADD EAX, 40000h
6
…
ALU
Register
EAX
EBX
data
PC
000100005 00040000
IR
00010000
Memory
address
00
…
MOV EAX, 10000h
SUB EAX, 20000h
B8
00
01
00
00
ADD EAX, 40000h
05
00
04
00
7. 7
Execute ADD EAX, 40000h
7
…
ALU
Register
EAX
EBX
data
PC
000100005 00040000
IR
0001000000050000
Memory
address
00
…
MOV EAX, 10000h
SUB EAX, 20000h
B8
00
01
00
00
ADD EAX, 40000h
05
00
04
00
8. Chapter Overview
• Basic Elements of Assembly Language
• Integer constants and expressions
• Character and string constants
• Reserved words and identifiers
• Directives and instructions
• Labels
• Mnemonics and Operands
• Comments
• Example: Adding and Subtracting Integers
• Assembling, Linking, and Running Programs
• Defining Data
• Symbolic Constants
• Real-Address Mode Programming
8
9. Reserved Words, Directives
• TITLE:
• Define program listing title
• Reserved word of directive
• Reserved words
• Instruction mnemonics,
directives, type attributes,
operators, predefined
symbols
• See MASM reference in
Appendix A
• Directives:
• Commands for assembler
9
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
10. Directive vs Instruction
• Directives: tell assembler what to do
• Commands that are recognized and acted upon by the assembler, e.g. declare
code, data areas, select memory model, declare procedures, etc.
• Not part of the Intel instruction set
• Different assemblers have different directives
• Instructions: tell CPU what to do
• Assembled into machine code by assembler
• Executed at runtime by the CPU
• Member of the Intel IA-32 instruction set
• Format:
LABEL (option), Mnemonic, Operands, Comment
10
11. Comments
•Single-line comments
• begin with semicolon (;)
•Multi-line comments
• begin with COMMENT
directive and a
programmer-chosen
character, end with the
same character, e.g.
COMMENT !
Comment line 1
Comment line 2
!
11
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
12. Include Files
• INCLUDE directive:
• Copies necessary
definitions and setup
information from a text file
named Irvine32.inc, located
in the assembler’s INCLUDE
directory (see Chapt 5)
12
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
13. Code Segment
• .code directive:
• Marks the beginning of the
code segment, where all
executable statements in a
program are located
13
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
14. Procedure Definition
• Procedure defined by:
• [label] PROC
• [label] ENDP
• Label:
• Place markers: marks the
address (offset) of code and
data
• Assigned a numeric address by
assembler
• Follow identifier rules
• Data label: must be unique,
e.g. myArray
• Code label: target of jump and
loop instructions, e.g. L1:
14
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
15. Identifiers
• Identifiers:
• A programmer-chosen
name to identify a variable,
a constant, a procedure, or
a code label
• 1-247 characters, including
digits
• not case sensitive
• first character must be a
letter, _, @, ?, or $
15
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
16. Instructions
[label:] mnemonic operand(s)
[;comment]
• Instruction mnemonics:
• help to memorize
• examples: MOV, ADD, SUB,
MUL, INC, DEC
• Operands:
• constant
• constant expression
• register
• memory (data label,
register)
16
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
immediate valuesDestination
operand
Source
operand
17. I/O
• Not easy, if program by
ourselves
• Will use the library
provided by the author
• Two steps:
• Include the library
(Irvine32.inc) in your code
• Call the subroutines
• call DumpRegs:
• Calls the procedure to
displays current values of
processor registers
17
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
18. Remaining
• exit:
• Halts the program
• Not a MSAM keyword, but
a command defined in
Irvine32.inc
• END main:
• Marks the last line of the
program to be assembled
• Identifies the name of the
program’s startup
procedure
18
TITLE Add and …
; Adds and subtracts
; (10000h + …
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
19. Example Program Output
• Program output, showing registers and flags
19
EAX=00030000 EBX=7FFDF000 ECX=00000101 EDX=FFFFFFFF
ESI=00000000 EDI=00000000 EBP=0012FFF0 ESP=0012FFC4
EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0
20. Alternative Version of AddSub
20
TITLE Add and Subtract (AddSubAlt.asm)
; adds and subtracts 32-bit integers
.386
.MODEL flat,stdcall
.STACK 4096
ExitProcess PROTO, dwExitCode:DWORD
DumpRegs PROTO
.code
main PROC
mov eax,10000h ; EAX = 10000h
add eax,40000h ; EAX = 50000h
sub eax,20000h ; EAX = 30000h
call DumpRegs
INVOKE ExitProcess,0
main ENDP
END main
21. Explanations
• .386 directive:
• Minimum processor required for this code
• .MODEL directive:
• Generate code for protected mode program
• Stdcall: enable calling of Windows functions
• PROTO directives:
• Prototypes for procedures
• ExitProcess: Windows function to halt process
• INVOKE directive:
• Calls a procedure or function
• Calls ExitProcess and passes it with a return code of zero
21
22. Suggested Program Template
22
TITLE Program Template (Template.asm)
; Program Description:
; Author:
; Creation Date:
; Revisions:
; Date: Modified by:
INCLUDE Irvine32.inc
.data
; (insert variables here)
.code
main PROC
; (insert executable instructions here)
exit
main ENDP
; (insert additional procedures here)
END main
23. What's Next
• Basic Elements of Assembly Language
• Example: Adding and Subtracting Integers
• Assembling, Linking, and Running Programs
• Defining Data
• Symbolic Constants
• Real-Address Mode Programming
23
24. Assemble-Link-Execute Cycle
• Steps from creating a source program through executing the
compiled program
http://kipirvine.com/asm/gettingStarted/index.htm
24
Source
File
Object
File
Listing
File
Link
Library
Executable
File
Map
File
Output
Step 1: text editor
Step 2:
assembler
Step 3:
linker
Step 4:
OS loader
25. Listing File
• Use it to see how your program is compiled
• Contains
• source code
• addresses
• object code (machine language)
• segment names
• symbols (variables, procedures, and constants)
• Example: addSub.lst
25
26. Listing File
00000000 .code
00000000 main PROC
00000000 B8 00010000 mov eax,10000h
00000005 05 00040000 add eax,40000h
0000000A 2D 00020000 sub eax,20000h
0000000F E8 00000000E call DumpRegs
exit
00000014 6A 00 * push +000000000h
00000016 E8 00000000E * call ExitProcess
0000001B main ENDP
END main
26
address memory
content
27. Data Types
BYTE 8-bit unsigned integer
SBYTE8-bit signed integer
WORD 16-bit unsigned integer
SWORD 16-bit signed integer
DWORD 32-bit unsigned integer
SDWORD 32-bit signed integer
FWORD 48-bit integer (Far pointer in protected mode)
QWORD 64-bit integer
TBYTE 80-bit (10-byte) integer
REAL4 32-bit (4-byte) short real
REAL8 64-bit (8-byte) long real
REAL10 80-bit (10-byte) extended real
27
28. Defining BYTE and SBYTE Data
• Each of following defines a single byte of storage:
28
value1 BYTE 'A' ; character constant
value2 BYTE 0 ; smallest unsigned byte
value3 BYTE 255 ; largest unsigned byte
value4 SBYTE -128 ; smallest signed byte
value5 SBYTE +127 ; largest signed byte
value6 BYTE ? ; uninitialized byte
• MASM does not prevent you from initializing a BYTE with a
negative value, but it is considered poor style
• If you declare a SBYTE variable, the Microsoft debugger will
automatically display its value in decimal with a leading sign
30. Defining Strings
• An array of characters
• Usually enclosed in quotation marks
• Will often be null-terminated
• To continue a single string across multiple lines, end each line with a comma
30
str1 BYTE "Enter your name",0
str2 BYTE 'Error: halting program',0
str3 BYTE 'A','E','I','O','U'
greeting BYTE "Welcome to the Encryption Demo program "
BYTE "created by Kip Irvine.",0
menu BYTE "Checking Account",0dh,0ah,0dh,0ah,
"1. Create a new account",0dh,0ah,
"2. Open an existing account",0dh,0ah,
"Choice> ",0
Is str1 an array? End-of-line sequence:
•0Dh = carriage return
•0Ah = line feed
31. Using the DUP Operator
• Use DUP to allocate (create space for) an array or string
• Syntax:
counter DUP (argument)
• Counter and argument must be constants or constant expressions
31
var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero
var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
var3 BYTE 4 DUP("STACK"); 20 bytes,
; "STACKSTACKSTACKSTACK”
32. Defining WORD and SWORD
• Define storage for 16-bit integers
• or double characters
• single value or multiple values
32
word1 WORD 65535 ; largest unsigned value
word2 SWORD –32768 ; smallest signed value
word3 WORD ? ; uninitialized, unsigned
word4 WORD "AB" ; double characters
myList WORD 1,2,3,4,5 ; array of words
array WORD 5 DUP(?) ; uninitialized array
33. Defining Other Types of Data
• Storage definitions for 32-bit integers, quadwords, tenbyte values,
and real numbers:
33
val1 DWORD 12345678h ; unsigned
val2 SDWORD –2147483648 ; signed
val3 DWORD 20 DUP(?) ; unsigned array
val4 SDWORD –3,–2,–1,0,1 ; signed array
quad1 QWORD 1234567812345678h
val1 TBYTE 1000000000123456789Ah
rVal1 REAL4 -2.1
rVal2 REAL8 3.2E-260
rVal3 REAL10 4.6E+4096
ShortArray REAL4 20 DUP(0.0)
34. Adding Variables to AddSub
34
TITLE Add and Subtract, Version 2 (AddSub2.asm)
; This program adds and subtracts 32-bit unsigned
; integers and stores the sum in a variable.
INCLUDE Irvine32.inc
.data
val1 DWORD 10000h
val2 DWORD 40000h
val3 DWORD 20000h
finalVal DWORD ?
.code
main PROC
mov eax,val1 ; start with 10000h
add eax,val2 ; add 40000h
sub eax,val3 ; subtract 20000h
mov finalVal,eax ; store the result (30000h)
call DumpRegs ; display the registers
exit
main ENDP
END main
35. Listing File
00000000 .data
00000000 00010000 val1 DWORD 10000h
00000004 00040000 val2 DWORD 40000h
00000008 00020000 val3 DWORD 20000h
0000000C 00000000 finalVal DWORD ?
00000000 .code
00000000 main PROC
00000000 A1 00000000 R mov eax,val1 ; start with 10000h
00000005 03 05 00000004 R add eax,val2 ; add 40000h
0000000B 2B 05 00000008 R sub eax,val3 ; subtract 20000h
00000011 A3 0000000C R mov finalVal,eax; store result
00000016 E8 00000000 E call DumpRegs ; display registers
exit
00000022 main ENDP
35
36. C vs Assembly
36
.data
val1 DWORD 10000h
val2 DWORD 40000h
val3 DWORD 20000h
finalVal DWORD ?
.code
main PROC
mov eax,val1
add eax,val2
sub eax,val3
mov finalVal,eax
call DumpRegs
exit
main ENDP
main()
{
int val1=10000h;
int val2=40000h;
int val3=20000h;
int finalVal;
finalVal = val1
+ val2 - val3;
}
37. What's Next
• Basic Elements of Assembly Language
• Example: Adding and Subtracting Integers
• Assembling, Linking, and Running Programs
• Defining Data
• Symbolic Constants
• Equal-Sign Directive
• Calculating the Sizes of Arrays and Strings
• EQU Directive
• TEXTEQU Directive
• Real-Address Mode Programming
37
38. Equal-Sign Directive
• name = expression
• expression is a 32-bit integer (expression or constant)
• may be redefined
• name is called a symbolic constant
• Also OK to use EQU
• good programming style to use symbols
38
COUNT = 500
…
mov al,COUNT
39. Calculating the Size of Arrays
• Current location counter: $
• Size of a byte array
• Subtract address of list and difference is the number of bytes
• Size of a word array
• Divide total number of bytes by 2 (size of a word)
39
list BYTE 10,20,30,40
ListSize = ($ - list)
list WORD 1000h,2000h,3000h,4000h
ListSize = ($ - list) / 2
40. EQU Directive
• Define a symbol as either an integer or text expression
• Cannot be redefined
• OK to use expressions in EQU:
• Matrix1 EQU 10 * 10
• Matrix1 EQU <10 * 10>
• No expression evaluation if within < >
• EQU accepts texts too
40
PI EQU <3.1416>
pressKey EQU <"Press any key to continue",0>
.data
prompt BYTE pressKey
41. TEXTEQU Directive
• Define a symbol as either an integer or text expression
• Called a text macro
• Can be redefined
41
continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
rowSize = 5
.data
prompt1 BYTE continueMsg
count TEXTEQU %(rowSize * 2) ; evaluates expression
setupAL TEXTEQU <mov al,count>
.code
setupAL ; generates: "mov al,10"
42. Summary
• Integer expression, character constant
• Directive – interpreted by the assembler
• Instruction – executes at runtime
• Code, data, and stack segments
• Source, listing, object, map, executable files
• Data definition directives:
• BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, QWORD, TBYTE, REAL4,
REAL8, and REAL10
• DUP operator, location counter ($)
• Symbolic constant
• EQU and TEXTEQU
42