In the code examples so far, we have separated out the coded instructions from the data. Modern processors like the 8088 have separate registers which deal with each section of a program.
    CS and IP     = instructions
    DS, BX, SI    = data
    ES, BX, DI    = extra data
    SS, SP, BP    = stack
In writing programs for modern processors like the 8088, the program is structured with a minimum of three sections, called SEGMENTS. The three segments represent the CODE, DATA and STACK areas of the program. Information within each segment is accessed differently depending upon the segment type. Accessing data in the stack segment requires the use of the SS, SP and/or BP registers. The following diagrams illustrate how information in the stack and data segments is accessed.
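As a minimal sketch of stack access (the value 1234h and the choice of registers are illustrative assumptions, and the .MODEL directive and the exit sequence are covered later in this section), the program below pushes a word and reads it back through BP. Memory operands that use BP are addressed relative to SS unless overridden.

```asm
        .MODEL SMALL
        .STACK 100H
        .CODE
start:  mov  ax, 1234h      ; arbitrary test value (assumption)
        push ax             ; word is stored at SS:SP
        mov  bp, sp         ; BP now points at the pushed word
        mov  bx, [bp]       ; [bp] uses SS by default, so BX = 1234h
        pop  ax             ; remove the word, restoring SP
        mov  ax, 4c00h      ; return to PCDOS
        int  21h
        END  start
```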
Special assembler directives are used to specify the different segments.
The following directives illustrate how to define the three basic
segments for an 8088 assembly language program.
    .STACK 100H
    .DATA
    .CODE
The value following the stack directive specifies the size of
the stack segment.
The programmer is responsible for initializing the segment
registers DS and ES to the correct segments of the
program. Failure to do so will result in a program which will not
access the data and extra data segments properly. The operating
system will only initialize the CS, SS, SP and IP registers.
The following code portion illustrates how to set up the data
segment register. This is performed at the beginning of the code
segment.
    .STACK 100H
    .DATA
    .CODE
            MOV  AX, @DATA     ; initialize DS
            MOV  DS, AX
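The text above states that ES must also be initialized by the programmer. A minimal sketch follows, assuming the program has no separate extra data segment, so ES is simply pointed at the same segment as DS. The value goes through AX because a segment register cannot be loaded with an immediate value.

```asm
        .MODEL SMALL
        .STACK 100H
        .DATA
        .CODE
start:  mov  ax, @DATA      ; segment address of the data segment
        mov  ds, ax         ; DS now addresses .DATA
        mov  es, ax         ; ES addresses the same segment (assumption:
                            ; no separate extra data segment is used)
        mov  ax, 4c00h      ; return to PCDOS
        int  21h
        END  start
```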
DIFFERENT SIZED MEMORY MODELS
The assembler supports several different memory models for 8088
programs. We shall look at the most common types.
- SMALL memory model
The small memory model uses one code segment of up to 64k
bytes, and one segment of up to 64k bytes which combines
the data and the stack. The assembler
directive used to specify a small memory model is,
    .MODEL SMALL
- LARGE memory model
The large memory model supports multiple code segments and
multiple data segments, each segment limited to 64k
bytes. The assembler directive used
to specify a large memory model is,
    .MODEL LARGE
Use this memory model for all your programs.
SUPPORT FOR DIFFERENT CPU TYPES
The following directives are used to specify the processor type.
    .186
    .286
    .386
    .8087
    .8086
RETURNING TO PCDOS
When an assembly language program running under PCDOS terminates,
it must return to the operating system so that the user shell
program can be re-loaded. The correct format is to use the
following code sequence,
    mov  ax, 4c00h
    int  21h
ASSEMBLER DIRECTIVES FOR IBM-PC PROGRAMS
The following is a discussion of the assembler directives
applicable to packages like Microsoft MASM and Turbo Assembler.
These packages are used to write machine code programs which run
on the IBM-PC.
The EQU directive creates absolute symbols and
aliases by assigning an expression or value to the
declared variable name. Its format is,
name EQU expression
An absolute symbol represents a 16-bit value; an alias
is a name that represents another symbol. The declared
name must be unique, one that has not been previously
defined.
    pi       EQU 3.14159
    clearax  EQU xor ax,ax
The first example directs the assembler to replace
every occurrence of the name pi with the value 3.14159,
whilst the second example instructs the assembler to
replace every occurrence of clearax with the
instruction xor ax,ax.
- BYTE STORAGE
The DB directive allocates and initializes a byte
(8 bits) of storage for each argument. Its format is,
name DB initialvalue,,,
The name portion is optional.
    value1  DB 16
    form    DB 6*2
    text    DB "Enter your name:"
In the first example, value1 is assigned a
byte and is initialized to 16. The second example assigns form
a byte and initializes it to 12. In the last
example, text is defined as a sequence of bytes,
each of which contains a character from the specified string.
The first byte will be initialized to ‘E’, whilst the
last byte will be initialized to ‘:’.
- WORD STORAGE
The DW directive allocates a word (2 bytes) of
storage for each initialized value. Its format is,
name DW initialvalue,,,
The name portion is optional.
            DW ?
    mess    DW 'ab'
The first example allocates one word of storage, but
does not define its initial value (?). The second example
defines mess as a word initialized with the
character string ‘ab’.
Strings used with the DW directive must not contain
more than two characters. The ‘b’ will be placed in the
low-order byte, and the ‘a’ will be placed in the
high-order byte. If only one character is specified, the
high-order byte will contain 00H. The low-order byte
appears FIRST in memory for Intel processors.
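The byte ordering described above can be sketched by declaring the same two characters with DB and with DW. The names bytes, word1 and word2 are hypothetical and chosen only for this illustration; the comments give the memory contents that follow from the rules stated in the text.

```asm
        .MODEL SMALL
        .STACK 100H
        .DATA
bytes   DB 'ab'             ; memory holds 61h ('a') then 62h ('b')
word1   DW 'ab'             ; memory holds 62h ('b') then 61h ('a')
word2   DW 'a'              ; memory holds 61h ('a') then 00h
        .CODE
start:  mov  ax, @DATA
        mov  ds, ax
        mov  ax, word1              ; AX = 6162h: AH = 'a', AL = 'b'
        mov  bl, byte ptr word1     ; BL = 62h, the low-order byte stored first
        mov  ax, 4c00h              ; return to PCDOS
        int  21h
        END  start
```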
The title directive specifies the program listing title.
This appears at the top of each page in the assembler
list file, after the source file name.
The name directive is used to set the name of the current
module. The module name is used by the linker when
displaying error messages. If no module name is used, the
linker will use the name specified using the title
directive.
- PAGE CONTROL
The PAGE directive can be used to designate the
page length and width for the program listing; it is also
used to generate a page break in the assembler listing
file. When assembly is taking place and the PAGE
directive is encountered, the assembler generates a
form-feed character to start a new page, and continues the
listing on the new page. In this way, the programmer can
organize a printout of modules on a per page basis, so
that the printout of more than one module per page does
not occur.
    PAGE 66,132     ; 66 lines per page
                    ; 132 characters wide
    PAGE            ; go to new page in list file
The PROC and ENDP directives are used to implement small procedures.
    name    PROC codetype
            ....
            ret
    name    ENDP
The last instruction in a procedure is a RETurn
instruction. The codetype is FAR for large memory
models, NEAR for small memory models. A procedure must be
entered using the appropriate CALL instruction.
- DEFINE DOUBLE WORD, DEFINE QUAD WORD and DEFINE TEN BYTES
The DD directive defines a double word [4 bytes] of
storage. This is used to reserve storage for 32-bit
integers, floating point numbers, or far pointers to code
or data [segment:offset pair]. The DQ directive
defines a quad word [8 bytes] of storage for double
precision floating point numbers.
The DT directive defines 10 bytes of storage.
This is normally used for packed BCD numbers and the 10
byte temporary real floating point value, as this storage
format is also used by the 80x87 arithmetic co-processor.
The OFFSET operator returns the position of a variable, in
bytes, relative to the start of the segment
it is in. This is necessary when calling PCDOS routines.
            .DATA
    temp    db 10
    mess    db 'Hi there','$'
            .CODE
    start:  mov  ax, @data
            mov  ds, ax
            mov  ah, 9h
            mov  dx, OFFSET mess   ; mess is at offset 1 in the .DATA segment
            int  21h               ; print message
            mov  ax, 4c00h         ; return to PCDOS
            int  21h
            END  start
SAMPLE PROGRAM FOR IBM-PC
    TITLE Doscall
    ;Doscall.asm source file
            .MODEL SMALL
    CR      equ 0dh              ; carriage return
    LF      equ 0ah              ; line feed
    EOSTR   equ '$'
            .stack 200h
            .data
    message db 'Hello and welcome.'
            db CR, LF, EOSTR
            .code
    print   proc near
            mov  ah,9h           ;PCDOS print function
            int  21h
            ret
    print   endp
    start:  mov  ax, @data
            mov  ds, ax
            mov  dx, offset message
            call print
            mov  ax, 4c00h
            int  21h
            end  start
The program is assembled by typing
    $ TASM DOSCALL
    Turbo Assembler V1.0 Copyright(c)1988 by Borland International
    Assembling file:   DOSCALL.ASM
    Error messages:    None
    Warning messages:  None
    Remaining memory:  257k
    $
This produces an object file named DOSCALL.OBJ which must be
linked to create an executable file which can run under PCDOS.
    $ TLINK DOSCALL
    Turbo Link V2.0 Copyright (c) 1987, 1988 Borland International
    $
The program, when run, produces the following output.
    $ DOSCALL
    Hello and welcome.
    $
The macro directive allows the programmer to write a named block
of source statements, then use that name in the source file to
represent the group of statements. During the assembly phase, the
assembler automatically replaces each occurrence of the macro
name with the statements in the macro definition.
Macros are expanded on every occurrence of the macro name, so
they can increase the length of the executable file if used
repeatedly. Procedures or subroutines take up less space, but the
increased overhead of saving and restoring addresses and
parameters can make them slower. In summary, the advantages and
disadvantages of macros are,
Advantages
- Repeated small groups of instructions replaced by one macro
- Errors in macros are fixed only once, in the definition
- Duplication of effort is reduced
- In effect, new higher level instructions can be created
- Programming is made easier, less error prone
- Generally quicker in execution than subroutines
Disadvantages
- In large programs, produce greater code size than procedures
When to use Macros
- To replace small groups of instructions not worthy of
- To create a higher instruction set for specific
- To create compatibility with other computers
- To replace code portions which are repeated often
throughout the program
Macros are defined as follows,
    name    MACRO [optional arguments]
            statements
            statements
            ENDM
Consider the following macro to return to PCDOS from an
assembly language program.
    exittodos MACRO
            mov  ax,4C00h
            int  21h
            ENDM
Macros are expanded when the program is assembled. This means
that every occurrence of the macro name (apart from the
definition) is replaced by the statements in the macro
definition. An example will demonstrate this.
    TITLE dosmacro
            .MODEL small
    exittodos MACRO
            mov  ax,4C00h
            int  21h
            ENDM
            .STACK 100h
            .DATA
    message DB 'Hello and Welcome', '$'
            .CODE
    start:  mov  ax, @data
            mov  ds, ax
            mov  ah, 9h
            mov  dx, OFFSET message
            int  21h
            exittodos
            END  start
When assembled, the macro is replaced and the internal
representation of the file looks like,
    TITLE dosmacro
            .MODEL small
    exittodos MACRO
            mov  ax,4C00h
            int  21h
            ENDM
            .STACK 100h
            .DATA
    message DB 'Hello and Welcome', '$'
            .CODE
    start:  mov  ax, @data
            mov  ds, ax
            mov  ah, 9h
            mov  dx, OFFSET message
            int  21h
            mov  ax,4C00h
            int  21h
            END  start
Macros can also accept values (parameters).
    addup   MACRO ad1, ad2, ad3
            mov  ax, ad1
            mov  dx, ad2
            mov  cx, ad3
            ENDM
In this example a macro named addup is created. It
accepts three parameters, ad1, ad2 and ad3.
The code which follows, consisting of the mov statements,
will be used to replace every occurrence of the macro name addup
in the source file. The macro is terminated with the ENDM
statement. Calling a macro with arguments is done as follows,
    addup bx, 2, count
This has the effect of loading the ax register with the
contents of the bx register, the dx register with the value 2,
and the cx register with the value of count.
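To make the substitution concrete, here is a small sketch of a complete program that calls addup. The variable count, its value 5 and the value loaded into bx are assumptions made for this illustration; the comments beside the call show the three statements the assembler generates in its place.

```asm
        .MODEL SMALL
addup   MACRO ad1, ad2, ad3
        mov  ax, ad1
        mov  dx, ad2
        mov  cx, ad3
        ENDM
        .STACK 100h
        .DATA
count   DW 5                ; hypothetical variable for the example
        .CODE
start:  mov  ax, @data
        mov  ds, ax
        mov  bx, 7          ; arbitrary value (assumption)
        addup bx, 2, count  ; expands to:
                            ;   mov  ax, bx
                            ;   mov  dx, 2
                            ;   mov  cx, count
        mov  ax, 4c00h      ; return to PCDOS
        int  21h
        END  start
```

After the expanded statements run, ax holds 7, dx holds 2 and cx holds 5.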
Macro definitions may include other macro names, and macros
may also be recursive: they can call themselves, eg,
    pushall MACRO reg1, reg2, reg3, reg4, reg5, reg6
            IFNB <reg1>         ;; If parameter not blank
            push reg1           ;; push one register and
                                ;; repeat
            pushall reg2, reg3, reg4, reg5, reg6
            ENDIF
            ENDM
            pushall ax, bx, si, ds
            pushall cs, es
This shows a recursive macro called pushall that
continues to call itself until it encounters a blank argument. In
effect, it pushes the registers specified in the macro call onto
the stack.
The ;; indicates that the comment field of the macro should
not be expanded with the macro statements.
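The recursion can be followed by writing out what one call generates. The sketch below repeats pushall and adds a companion macro, popall, which is a hypothetical addition and not part of the original text. Because popall recurses before it pops, it restores the registers in the reverse order, which is what a stack requires.

```asm
        .MODEL SMALL
pushall MACRO reg1, reg2, reg3, reg4, reg5, reg6
        IFNB <reg1>             ;; if parameter not blank
        push reg1               ;; push it, then handle the rest
        pushall reg2, reg3, reg4, reg5, reg6
        ENDIF
        ENDM
popall  MACRO reg1, reg2, reg3, reg4, reg5, reg6   ;; hypothetical companion
        IFNB <reg1>             ;; if parameter not blank
        popall reg2, reg3, reg4, reg5, reg6        ;; handle the rest first
        pop  reg1               ;; so this register is popped last
        ENDIF
        ENDM
        .STACK 100h
        .CODE
start:  pushall ax, bx, si      ; expands to: push ax / push bx / push si
        popall  ax, bx, si      ; expands to: pop si / pop bx / pop ax
        mov  ax, 4c00h          ; return to PCDOS
        int  21h
        END  start
```

Note that popall should not be given cs as an argument, since the 8088 has no usable pop cs instruction.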
IMPLEMENTING FP NUMBERS, ARRAYS, RECORDS AND JUMP TABLES
Floating Point Numbers
The following example shows the declaration of a single precision
floating point decimal number (stored in IEEE 754 format).
FPnum1 DD 1.32740
The following example declares a packed BCD constant.
BCDval DT 123456
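Ten bytes are allocated. Nine of them hold 18 packed decimal
digits (two per byte) and the tenth holds the sign, giving a
magnitude range of 0 to 999,999,999,999,999,999.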
Arrays and array elements are dealt with using pointers. This
involves either based or indexed addressing.
- Manipulating an Array Element
1: Load a base/index register with the address of the first element
2: Calculate the offset position of the required element
   (1 byte for characters, 2 bytes for integers etc)
3: Perform the operation by either
   a) incrementing the base/index register by the required amount
   b) using based indexed addressing
eg, X := IntArray[4];
    mov  bx, offset IntArray   ; base address
    mov  si, 4                 ; element number
    shl  si, 1                 ; calculate offset (2 bytes per element)
    mov  ax, [bx + si]         ; fetch the element
    mov  X, ax                 ; memory to memory moves are not allowed,
                               ; so the value goes via ax
- Cycling through an Array using a Loop count variable
The principles are the same, but the offset is the loop
count variable adjusted by the number of bytes per
element.
    FOR Loop := 1 to 10 do
    BEGIN
        sum := sum + IntArray[Loop]
    END;
    initfor: mov  ax, 1                 ; Loop := 1
             mov  Loop, ax
             mov  bx, offset IntArray   ; set up base register
    fortest: mov  ax, Loop
             cmp  ax, 10
             ja   forexit               ; finished when Loop > 10
             shl  ax, 1                 ; calculate offset (Loop * 2)
             mov  si, ax
             mov  ax, [bx + si]         ; ax = IntArray[Loop]
             add  sum, ax               ; update sum
             inc  Loop                  ; next value of Loop
             jmp  fortest
    forexit:
Integer arrays occupy two bytes per element. A typical operation
is to sum the contents of an integer array. The following code
for an 8086 shows this.
    TITLE IntArray
            .MODEL Large
            .STACK 200h
            .DATA
    mess    db 'The total is ','$'
    result  dw ?
    IntArry dw 10, 34, 76, 25, 14, 9, 3, 22
    IntAlen dw ($ - IntArry) / 2
    buff    db 6 dup( 20h )
            db '$'
            .CODE
    binasc  proc far                ; convert result to ascii string
            mov  ax, [result]       ; get number to convert
            push ax                 ; save it
            mov  si, offset buff + 5 ; point to last character of string area
            mov  cx, 10             ; divide base factor
            or   ax, ax             ; negative number?
            jns  do1
            neg  ax                 ; then convert its magnitude
    do1:    cmp  ax, 10             ; compare with base factor
            jb   exit1
            mov  dx, 0              ; clear upper numerator
            div  cx                 ; divide by base factor
            add  dl, 30h            ; convert remainder to ASCII
            mov  [si], dl           ; and store it
            dec  si                 ; next character (to the left)
            jmp  do1
    exit1:  add  al, 30h            ; convert last character
            mov  [si], al           ; and store it
            pop  ax                 ; recover original number
            or   ax, ax             ; and test for sign bit
            jns  exit2
            dec  si                 ; store '-' sign
            mov  bl, 2dh
            mov  [si], bl
    exit2:  ret
    binasc  endp
    start:  mov  ax, @data
            mov  ds, ax
            mov  [result], 0000h    ; clear result
            mov  cx, IntAlen        ; count of elements
            mov  bx, offset IntArry ; point to IntArry
            mov  si, 0000h          ; first element
            xor  ax, ax             ; clear total
    lp1:    add  ax, [bx + si]      ; add value to total
            inc  si                 ; next element
            inc  si
            dec  cx
            jne  lp1
            mov  [result], ax       ; store total
            mov  dx, offset mess    ; print message
            mov  ah, 9h
            int  21h
            call binasc             ; convert result to ASCII
            mov  dx, offset buff
            mov  ah, 9h
            int  21h
            mov  ax, 4c00h          ; exit to DOS
            int  21h
            END  start
Other typical operations involve the determination of the
minimum and maximum values.
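As a sketch of one such operation, the program below finds the maximum of the same integer array. The variable maxval and the labels are hypothetical names chosen for this illustration, and the comparison treats the elements as signed values. Replacing jge with jle would find the minimum instead.

```asm
        .MODEL SMALL
        .STACK 100h
        .DATA
IntArry dw 10, 34, 76, 25, 14, 9, 3, 22
IntAlen dw ($ - IntArry) / 2
maxval  dw ?                    ; hypothetical result variable
        .CODE
start:  mov  ax, @data
        mov  ds, ax
        mov  bx, offset IntArry ; point to first element
        mov  cx, IntAlen        ; count of elements
        mov  ax, [bx]           ; first element is the maximum so far
        dec  cx                 ; elements still to examine
        jz   done               ; only one element in the array
nxtel:  add  bx, 2              ; advance to next word
        cmp  ax, [bx]           ; signed compare with current maximum
        jge  keep               ; current maximum still stands
        mov  ax, [bx]           ; otherwise take the larger value
keep:   dec  cx
        jnz  nxtel
done:   mov  maxval, ax         ; 76 for the data above
        mov  ax, 4c00h          ; exit to DOS
        int  21h
        END  start
```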
Records in Pascal support the use of different sized field items.
Consider the storage of the following record.
    Var example_record : RECORD
            int_number : integer;
            fp_number  : real;
            letter     : char;
        END;
The same record is implemented in assembly language by first
defining its composition.
    ex_rec  STRUC
    int_num dw ?
    fp_num  dd ?
    lett    db "          "      ; 10 bytes reserved for the string
    ex_rec  ENDS
The next step creates a record which has the composition of
the previous record definition.
    my_rec  ex_rec <22, 3.2, 'Hi there.$'>
Each field of the record is accessed in a similar way to
that of Pascal, eg,
    my_rec.lett
accesses the lett field of the record my_rec.
The following program shows an implementation for the 8088.
    TITLE Records
            .MODEL Large
    ex_rec  STRUC
    int_num dw ?
    fp_num  dd ?
    mess    db "             "   ; 13 bytes reserved for the string
    ex_rec  ENDS
            .STACK 200h
            .DATA
    myrec   ex_rec <22,1.30, "Hello there.$">
            .CODE
    start:  mov  ax, @data
            mov  ds, ax
            mov  dx, offset myrec.mess
            mov  ah, 9h
            int  21h
            mov  ax,4c00h
            int  21h
            END  start
Jump tables are an efficient method of implementing switch/case
type statements. A jump table consists of an array of addresses.
Using an offset into the array selects the address of the routine
which handles that particular value.
Jump tables are efficient because it always takes the same
time to select any routine from the table. The order may be
re-arranged, or new routines added, simply by increasing the size
of the table.
The following program implements a jump table.
    TITLE Jump.asm
            .MODEL Large
            .STACK 200h
            .DATA
    help    db 'This program exits when a function key is pressed.'
            db 10, 13, 'Ctrl A generates underline.', 10, 13
            db 'Ctrl B generates bold.', 10, 13
            db 'Ctrl C generates blinking.', 10, 13
            db 'All other control codes return to normal text.', 10, 13
            db 10, 13, 'Start typing characters.', 10, 13, '$'
    attrib  db 07h                  ; screen attribute byte
    ; a table of addresses used to decipher received control codes
    ; each entry is the address of the appropriate routine
    ; (the comments give each code in hex, then in octal)
    ctl_tbl label word
            dw ctrl_null            ; 0
            dw ctrla                ; 1
            dw ctrlb                ; 2
            dw ctrlc                ; 3
            dw ctrld                ; 4
            dw ctrle                ; 5
            dw ctrlf                ; 6
            dw ctrlg                ; 7
            dw ctrlh                ; 8  10
            dw ctrli                ; 9  11
            dw ctrlj                ; a  12
            dw ctrlk                ; b  13
            dw ctrll                ; c  14
            dw ctrlm                ; d  15
            dw ctrln                ; e  16
            dw ctrlo                ; f  17
            dw ctrlp                ; 10 20
            dw ctrlq                ; 11 21
            dw ctrlr                ; 12 22
            dw ctrls                ; 13 23
            dw ctrlt                ; 14 24
            dw ctrlu                ; 15 25
            dw ctrlv                ; 16 26
            dw ctrlw                ; 17 27
            dw ctrlx                ; 18 30
            dw ctrly                ; 19 31
            dw ctrlz                ; 1a 32
            dw ctrl_lbkt            ; 1b 33
            dw ctrl_bslash          ; 1c 34
            dw ctrl_rbkt            ; 1d 35
            dw ctrl_carat           ; 1e 36
            dw ctrl_ul              ; 1f 37
            .CODE
    bumpcur proc far                ; move cursor right one character
            mov  ah, 3
            xor  bh, bh
            int  10h                ; read cursor position into dh, dl
            inc  dl                 ; next column
            cmp  dl, 80             ; end of line? (columns are 0 to 79)
            jl   short bpcur1
            xor  dl, dl             ; go to start of next line
            inc  dh
            cmp  dh, 24             ; end of screen? (rows are 0 to 24)
            jle  short bpcur1
            mov  ax, 0601h          ; then scroll up one line
            xor  cx, cx             ; upper left corner 0,0
            push dx
            mov  dh, 24             ; lower right corner 24,79
            mov  dl, 79
            mov  bh, [attrib]
            int  10h
            pop  dx
            mov  dh, 24             ; position on bottom line
    bpcur1: xor  bh, bh             ; set cursor position
            mov  ah, 2
            int  10h
            ret
    bumpcur endp
    ctrl_code proc far              ; process control codes
            push bx
            cbw                     ; convert AL to AX
            mov  bx, ax             ; use bx as an index into
            shl  bx, 1              ; the ctl_tbl
            jmp  ctl_tbl[bx]        ; jump to key routine
    ctrla:  and  byte ptr [attrib], 0f9h    ; underline
            jmp  ctrl_exit
    ctrlb:  or   byte ptr [attrib], 08h     ; bold
            jmp  ctrl_exit
    ctrlc:  or   byte ptr [attrib], 80h     ; blink on
            jmp  ctrl_exit
    ctrld:                          ; all others normal
    ctrl_null:
    ctrle:
    ctrlf:
    ctrlg:
    ctrlh:
    ctrli:
    ctrlj:
    ctrlk:
    ctrll:
    ctrlm:
    ctrln:
    ctrlo:
    ctrlp:
    ctrlq:
    ctrlr:
    ctrls:
    ctrlt:
    ctrlu:
    ctrlv:
    ctrlw:
    ctrlx:
    ctrly:
    ctrlz:
    ctrl_lbkt:
    ctrl_bslash:
    ctrl_rbkt:
    ctrl_carat:
    ctrl_ul:
            mov  byte ptr [attrib], 07h     ; normal attribute
    ctrl_exit:
            pop  bx
            ret
    ctrl_code endp
    start:  mov  ax, @data
            mov  ds, ax
            mov  ah, 9h             ; print help message
            mov  dx, offset help
            int  21h
    lp1:    mov  ah, 06h            ; read character from keyboard
            mov  dl, 0ffh
            int  21h
            jz   lp1                ; repeat if character not ready
            cmp  al, 00h            ; if function key then exit
            je   exit
            cmp  al, 32             ; else if control code
            jae  disp1
            call ctrl_code          ; then process control code
            jmp  lp1
    disp1:  xor  bx, bx             ; page zero on video memory
            mov  bl, [attrib]       ; get character attribute
            mov  cx, 1              ; one character to write
            mov  ah, 9              ; write char + attribute
            int  10h                ; use BIOS call
            call bumpcur            ; next cursor position
            jmp  lp1                ; repeat
    exit:   mov  ax, 4c00h
            int  21h
            END  start