Assembly Language Programming – 16 bit processing

In the code examples so far, we have kept the coded instructions separate from the data. Processors like the 8088 have separate registers which deal with each section of a program.

	CS, IP     = instructions
	DS, BX, SI = data
	ES, BX, DI = extra data
	SS, SP, BP = stack

When writing programs for processors like the 8088, the program is structured with a minimum of three sections, called SEGMENTS. The three segments represent the CODE, DATA and STACK areas of the program. Information within each segment is accessed differently depending upon the segment type. For example, accessing data in the stack segment requires the use of the SS, SP and/or BP registers. The following diagrams illustrate how information in the stack and data segments is accessed.
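
In instruction terms, the pointer register used in an operand selects the segment register that is applied by default. The fragment below is a minimal sketch, assuming the pointer registers have already been loaded with valid offsets; the displacement 4 is an arbitrary example.

```asm
	mov	al, [si]	; data byte at DS:SI (SI and BX default to DS)
	mov	ax, [bp+4]	; stack word at SS:BP+4 (BP defaults to SS)
	mov	al, es:[di]	; extra data byte at ES:DI, using a segment override
```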


Special assembler directives are used to specify the different segments.


The following directives illustrate how to define the three basic
segments for an 8088 assembly language program.

	.STACK	100H
	.DATA
	.CODE

The value following the stack directive specifies the size of
the stack segment in bytes.

The programmer is responsible for initializing the segment
registers DS and ES to the correct segments of the
program. Failure to do so will result in a program which will not
access the data and extra data segments properly. The operating
system will only initialize the CS, SS, SP and IP registers.

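The following code portion illustrates how to set up the data
segment register. This is performed at the beginning of the code.

	.STACK	100H
	.DATA
	.CODE
	MOV 	AX, @DATA	; initialize DS via AX, because a segment
	MOV	DS, AX		; register cannot be loaded with an immediate value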
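
When the program also uses the extra segment, for example with string instructions, ES can be loaded in the same way. This is a minimal sketch, assuming that DS and ES are meant to share one data segment.

```asm
	MOV	AX, @DATA	; segment address of the data segment
	MOV	DS, AX		; DS now addresses the program's data
	MOV	ES, AX		; assumption: ES shares the same segment
```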


Assemblers for the 8088 support several different memory models. We
shall look at the most common types.

  • SMALL memory model

    The small memory model provides one code segment and one
    data segment, each limited to 64k bytes. The stack is
    grouped with the data segment. The assembler directive
    used to specify a small memory model is,

    	.MODEL SMALL

  • LARGE memory model

    The large memory model supports multiple code segments and
    multiple data segments, each segment limited to 64k bytes.
    Code and data are both reached through far (segment:offset)
    addresses. The assembler directive used to specify a large
    memory model is,

    	.MODEL LARGE


    Use this memory model for all your programs.


The following directives are used to specify the processor type.
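
As an illustration, MASM and Turbo Assembler accept processor directives such as those sketched below. This is not an exhaustive list, and which directives are available depends on the assembler version.

```asm
	.8086		; accept only 8086/8088 instructions (the usual default)
	.286		; also accept 80286 real mode instructions
	.386		; also accept 80386 instructions
```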



When an assembly language program running under PCDOS terminates,
it must return to the operating system so that the user shell
program can be re-loaded. The correct format is to use the
following code sequence,

	mov ax, 4c00h	; AH = 4Ch (terminate program), AL = 00h (return code)
	int 21h
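
The low byte of AX is the return code handed back to PCDOS, so a program can report failure by loading a non-zero value into AL. A minimal sketch, where the value 1 is an arbitrary example:

```asm
	mov	ah, 4ch		; PCDOS function 4Ch: terminate program
	mov	al, 1		; return code, 1 chosen only for illustration
	int	21h
```

A batch file can then test the value with IF ERRORLEVEL.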


The following is a discussion of the assembler directives
applicable to packages like Microsoft MASM and Turbo Assembler.
These packages are used to write machine code programs which run
under PCDOS.

  • EQU

    The EQU directive creates absolute symbols and
    aliases by assigning an expression or value to the
    declared variable name. Its format is,

    	name EQU expression

    An absolute symbol represents a 16-bit value; an alias
    is a name that represents another symbol. The declared
    name must be unique, one that has not been previously
    defined.

    		pi	EQU	3.14159
    		clearax	EQU	xor ax,ax

    The first example directs the assembler to replace
    every occurrence of the name pi with the value 3.14159,
    whilst the second example instructs the assembler to
    replace every occurrence of clearax with the
    instruction xor ax,ax.
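
    A common use of an absolute symbol is to give a size or limit a single name. The sketch below uses hypothetical names (maxlen, buffer) that do not appear elsewhere in this text.

```asm
	maxlen	EQU	80		; absolute symbol, no storage is allocated
	buffer	DB	maxlen DUP (?)	; reserves 80 uninitialized bytes
		mov	cx, maxlen	; assembles as mov cx, 80
```

    Changing the EQU line then updates every use of the name the next time the program is assembled.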


  • DB

    The DB directive allocates and initializes a byte
    (8 bits) of storage for each argument. Its format is,

    	name DB initialvalue, ...

    The name portion is optional.

    		value1	DB	16
    		form	DB	6*2
    		text	DB	"Enter your name:"

    In the first example, value1 is assigned a
    byte and is initialized to 16. The second example sets form
    equal to 12 and assigns it a byte. In the last
    example, text is defined as a sequence of bytes
    which each contain a character from the specified string.
    The first byte will be initialized to ‘E’, whilst the
    last byte will be initialized to ‘:’.


  • DW

    The DW directive allocates a word (2 bytes) of
    storage for each initialized value. Its format is,

    	name DW initialvalue, ...

    The name portion is optional.

    			DW	?	
    		mess 	DW 	'ab'

    The first example allocates one word of storage, but
    does not define its initial value (?). The second example
    defines mess as a word initialized with the
    character string ‘ab’.

    Strings used with the DW directive must not contain
    more than two characters. The ‘b’ will be placed in the
    low-order byte, and the ‘a’ will be placed in the
    high-order byte. If only one character is specified, the
    high-order byte will contain 00H. The low-order byte
    is stored FIRST in memory on Intel processors.
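
    The byte order can be seen by writing the same storage with DB. A minimal sketch, in which the names same and one are hypothetical:

```asm
	mess	DW	'ab'		; value 6162h, stored as bytes 62h, 61h
	same	DB	'b', 'a'	; the identical two bytes in memory
	one	DW	'a'		; value 0061h, stored as bytes 61h, 00h
```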


  • TITLE

    The title directive specifies the program listing title.

    	TITLE Graphics

    This appears at the top of each page in the assembler
    list file, after the source file name.

  • NAME

    The name directive is used to set the name of the current
    module. The module name is used by the linker when
    displaying error messages. If no module name is used, the
    linker will use the name specified using the title
    directive.

    	NAME Calculate_Gross

  • PAGE

    The PAGE directive can be used to designate the
    page length and line width for the program listing; it is
    normally used to generate a page break in the assembler
    listing file. When assembly is taking place and the page
    directive is encountered, the assembler generates a
    form-feed character to start a new page, and continues the
    listing on the new page. In this way, the programmer can
    organize a printout of modules on a per page basis, so
    that the printout of more than one module per page does
    not occur.

    		PAGE 66,132	; 66 lines per page 
    				; 132 characters wide 
    		PAGE 		; go to new page in list file

  • PROC and ENDP

    These directives are used to implement small procedures.

    	name PROC codetype
    		....
    		ret
    	name ENDP

    The last instruction in a procedure is a RETurn
    instruction. The codetype is FAR for large memory
    models, NEAR for small memory models. A procedure must be
    entered using the appropriate CALL instruction.
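
    A short sketch of a NEAR procedure and its call follows. The procedure name clrscr is hypothetical, and the body assumes a standard PC BIOS, where setting video mode 3 also clears the screen.

```asm
clrscr	PROC	NEAR
	mov	ax, 0003h	; BIOS video function 0: set 80x25 text mode
	int	10h
	ret			; near return pops only the return offset
clrscr	ENDP

	call	clrscr		; near call pushes only the return offset
```

    Because the procedure is declared NEAR, the assembler generates a near CALL and a near RET, so the two match.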


  • DD, DQ and DT

    The DD directive defines a double word (4 bytes) of
    storage. This is used to reserve storage for 32 bit
    integers, floating point numbers, or far pointers to code
    or data (segment:offset pair). The DQ directive
    defines a quad word (8 bytes) of storage for double
    precision floating point numbers.

    The DT directive defines 10 bytes of storage.
    This is normally used for Packed BCD numbers and a 10
    byte temporary real floating point value, as this storage
    format is also used by the 80x87 arithmetic co-processor.
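
    Some declarations using these directives are sketched below. All of the names and values are hypothetical and chosen only for illustration.

```asm
	bigval	DD	12345678h	; 32 bit integer, low word stored first
	vector	DD	?		; room for a far pointer (offset, then segment)
	dblval	DQ	2.5		; double precision floating point number
	bcdval	DT	1234		; packed BCD, two decimal digits per byte
```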


  • OFFSET

    The OFFSET operator returns the number of bytes between
    the start of a variable and the start of the segment
    it is in. This is necessary when calling PCDOS routines.

    			.MODEL	small
    			.STACK	100h
    			.DATA
    		temp	db	10
    		mess 	db 	'Hi there','$'
    			.CODE
    		start: 	mov 	ax, @data
    			mov 	ds, ax
    			mov 	ah, 9h
    			mov 	dx, OFFSET mess 	;mess is 1 byte into the .DATA segment
    			int 	21h 			;print message
    			mov 	ax, 4c00h		;return to PCDOS
    			int 	21h
    			END 	start


	TITLE	Doscall		;Doscall.asm source file
	.model	small
	CR	equ	0dh		;carriage return
	LF	equ 	0ah		;line feed
	EOSTR 	equ 	'$'		;PCDOS end of string marker

	.stack 200h
	.data
	message 	db 	'Hello and welcome.'
			db 	CR, LF, EOSTR

	.code
	print 	proc 	near
		mov 	ah,9h		;PCDOS print function
		int 	21h
		ret			;return to the caller
	print endp

	start:	mov 	ax, @data
		mov 	ds, ax
		mov 	dx, offset message
		call 	print
		mov 	ax, 4c00h
		int	21h
		end 	start

Assembling the program with Turbo Assembler produces the following output.

		Turbo Assembler V1.0 Copyright(c)1988 by Borland International 
		Assembling file: DOSCALL.ASM 
		Error messages: None 
		Warning messages: None
		Remaining memory: 257k 

This produces an object file named DOSCALL.OBJ which must be
linked to create an executable file which can run under PCDOS.

		Turbo LinkV2.0 Copyright (c) 1987, 1988 Borland International 

The program when run, produces the following output.

		Hello and welcome. 


The macro directive allows the programmer to write a named block
of source statements, then use that name in the source file to
represent the group of statements. During the assembly phase, the
assembler automatically replaces each occurrence of the macro
name with the statements in the macro definition.

Macros are expanded on every occurrence of the macro name, so
they can increase the length of the executable file if used
repeatedly. Procedures or subroutines take up less space, but the
increased overhead of saving and restoring addresses and
parameters can make them slower. In summary, the advantages and
disadvantages of macros are,

Advantages

  • Repeated small groups of instructions replaced by one macro
  • Errors in macros are fixed only once, in the definition
  • Duplication of effort is reduced
  • In effect, new higher level instructions can be created
  • Programming is made easier, less error prone
  • Generally quicker in execution than subroutines

Disadvantages

  • In large programs, produce greater code size than procedures

When to use Macros

  • To replace small groups of instructions not worthy of
    subroutines
  • To create a higher instruction set for specific
    applications
  • To create compatibility with other computers
  • To replace code portions which are repeated often
    throughout the program


Defining Macros is done as follows,

	name MACRO [optional arguments]
		statements
	ENDM

Consider the following macro to return to PCDOS from an
assembly language program.

	exittodos	MACRO
			mov	ax,4C00h
			int	21h
			ENDM

Macros are expanded when the program is assembled. This means
that every occurrence of the macro name (apart from the
definition) is replaced by the statements in the macro
definition. An example will demonstrate this.

		TITLE	dosmacro
		.MODEL	small
	exittodos	MACRO
			mov	ax,4C00h
			int 	21h
			ENDM

		.STACK	100h
		.DATA
		message	DB	'Hello and Welcome', '$'
		.CODE
	start:	mov	ax, @data
		mov 	ds, ax
		mov 	ah, 9h
		mov 	dx, OFFSET message
		int 	21h
		exittodos		; macro call, expanded by the assembler
		END 	start

When assembled, the macro is replaced and the internal
representation of the file looks like,

		TITLE	dosmacro
		.MODEL	small
	exittodos	MACRO
			mov	ax,4C00h
			int 	21h
			ENDM

		.STACK	100h
		.DATA
		message	DB	'Hello and Welcome', '$'
		.CODE
	start:	mov	ax, @data
		mov 	ds, ax
		mov 	ah, 9h
		mov 	dx, OFFSET message
		int 	21h
		mov 	ax,4C00h	; expansion of exittodos
		int 	21h
		END 	start

Macros can also accept values (parameters).

		addup	MACRO	ad1, ad2, ad3
			mov	ax, ad1
			mov	dx, ad2
			mov 	cx, ad3
			ENDM

In this example a macro named addup is created. It
accepts three parameters, ad1, ad2 and ad3.
The code which follows, consisting of the mov statements,
will be used to replace every occurrence of the macro name addup
in the source file. The macro is terminated with the ENDM
statement. Calling a macro with arguments is done as follows,

	addup	bx, 2, count

This has the effect of loading the ax register with the
contents of the bx register, the dx register with the value 2,
and the cx register with the value of count.
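
Written out, the call addup bx, 2, count is replaced by the three statements sketched below, assuming count is a word variable defined in the data segment.

```asm
	mov	ax, bx		; ad1 replaced by bx
	mov	dx, 2		; ad2 replaced by 2
	mov	cx, count	; ad3 replaced by count
```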

Macro definitions may include other macro names, and macros
may also be recursive: they can call themselves, eg,

	pushall	MACRO	reg1, reg2, reg3, reg4, reg5, reg6
		IFNB 	<reg1>		;; If parameter not blank
			push	reg1	;; push one register and
					;; repeat
			pushall 	reg2, reg3, reg4, reg5, reg6
		ENDIF
		ENDM

		pushall 	ax, bx, si, ds
		pushall 	cs, es

This shows a recursive macro called pushall that
continues to call itself until it encounters a blank argument. In
effect, it pushes the registers specified in the macro call onto
the stack.

The ;; indicates that the comment field of the macro should
not be expanded with the macro statements.


Floating Point Numbers

The following example shows the declaration of a single precision
floating point number (stored in IEEE 754 format).

	FPnum1	DD	1.32740

BCD strings

The following example declares a packed BCD constant.

	BCDval	DT	123456

Ten bytes are allocated, giving a number range of 0 to


Arrays and array elements are dealt with using pointers. This
involves either based or indexed addressing.

  • Manipulating an Array Element
    		1: Load a base/index register with the address of the first element
    		2: Calculate the offset position of the required element (1 byte per element for characters, 2 bytes for integers etc)
    		3: Perform the operation by either
    			a) incrementing the base/index register by the required amount
    			b) using based indexed addressing eg,
    				X := IntArray[4];
    				mov bx, offset IntArray ; base address
    				mov ax, 4 		; calculate offset: index * 2 bytes,
    				shl ax, 1 		; taking index 0 as the first element
    				mov si, ax
    				mov ax, [bx + si]	; load the element
    				mov X, ax		; memory to memory moves are not allowed
  • Cycling through an Array using a Loop count variable

    The principles are the same, but the offset is the loop
    count variable adjusted by the number of bytes per element,

    			FOR Loop := 1 to 10 do
    				sum := sum + IntArray[Loop]
    			; the Pascal variable Loop is named LoopVar here,
    			; because LOOP is an instruction mnemonic
    			initfor:mov	ax, 1 	; Loop := 1
    				mov 	LoopVar, ax
    				mov 	bx, offset IntArray ; set up base register
    			fortop:	mov 	ax, LoopVar
    				cmp 	ax, 10
    				ja 	forexit 	; finished when Loop > 10
    				mov 	ax, LoopVar 	; calculate offset: Loop * 2 bytes
    				shl 	ax, 1
    				mov 	si, ax
    				mov 	ax, [bx + si]
    				mov 	cx, sum 	; add sum and IntArray[Loop]
    				add 	ax, cx
    				mov 	sum, ax 	; update sum
    				inc 	LoopVar 	; next value of Loop
    				jmp 	fortop
    			forexit:

Integer Arrays

Integer arrays occupy two bytes per element. A typical operation
is to sum the contents of an integer array. The following code
for an 8086 shows this.

	TITLE	IntArray
	.MODEL 	Large
	.STACK 	200h
	.DATA
	mess	db	'The total is ','$'
	result 	dw 	?
	IntArry	dw	10, 34, 76, 25, 14, 9, 3, 22
	IntAlen	dw	($ - IntArry) / 2
	buff 	db 	6 dup( 20h )
		db 	'$'

	.CODE

	binasc 	proc	far 			; convert result to ascii string 
		mov 	ax, 0 
		mov 	ax, [result]		; get number to convert 
		push 	ax 			; save it 
		mov 	si, offset buff[5]	; point to string area 
		mov 	cx, 10 			; divide base factor 
		shl 	ax, 1 			; clear sign bit 
		shr 	ax, 1
	do1: 	cmp 	ax, 10 			; compare with base fact 
		jb 	exit1 
		mov 	dx, 0 			; clear upper numerator 
		div 	cx 			; divide by base factor 
		add 	dl, 30h 		; convert to ASCII 
		mov 	[si], dl 		; and store it 
		dec 	si 			; next character 
		jmp 	do1
	exit1:	add 	al, 30h 		; convert last character 
		mov 	[si], al 		; and store it 
		pop 	ax 			; recover 
		or 	ax, ax 			; and test for sign bit 
		jns	exit2
		dec 	si 			; store '-' sign
		mov 	bl, 2dh 
		mov 	[si], bl
	exit2:	ret
	binasc	endp

	start:	mov	ax, @data
		mov	ds, ax 
		mov 	[result], 0000h		; clear result 
		mov 	cx, IntAlen		; count of elements 
		mov 	bx, offset IntArry 	; point to IntArry
		mov 	si, 0000h 		; first element 
		xor 	ax, ax 			; clear total
	lp1:	add	ax, [bx + si] 		; add value to total 
		inc 	si 			; next element 
		inc 	si
		dec 	cx 
		jne 	lp1 
		mov 	[result], ax 		; store total 
		mov 	dx, offset mess		; print message 
		mov 	ah, 9h 
		int 	21h 
		call 	binasc 			; convert result to ASCII 
		mov 	dx, offset buff 
		mov 	ah, 9h 
		int 	21h 
		mov 	ax, 4c00h		; exit to DOS 
		int 	21h 
		END 	start

Other typical operations involve the determination of the
minimum and maximum values.
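
As an illustration, the maximum of the same array could be found as sketched below. The fragment assumes the IntArry and IntAlen definitions from the program above, that DS has already been set up, and that the array holds at least one element; the labels maxlp, maxnxt and maxdone are hypothetical.

```asm
	mov	cx, IntAlen		; count of elements
	mov	bx, offset IntArry	; point to IntArry
	mov	ax, [bx]		; take the first element as the maximum so far
	dec	cx			; one element already examined
	jz	maxdone			; nothing more to compare
maxlp:	add	bx, 2			; next element (two bytes each)
	cmp	ax, [bx]
	jge	maxnxt			; signed compare: keep the current maximum
	mov	ax, [bx]		; otherwise record the new maximum
maxnxt:	dec	cx
	jnz	maxlp
maxdone:				; AX now holds the largest value
```

Replacing jge with jle gives the minimum instead.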

Records (Structures)

Records in Pascal support the use of different sized field items.
Consider the storage of the following record.

	Var example_record : RECORD
		int_number : integer;
		fp_number : real;
		letter : char;
	END;

The same record is implemented in assembly language by first
defining its composition.

	ex_rec	STRUC
		int_num	dw	?
		fp_num	dd	?
		lett	db	'          '	; ten spaces, room for a ten character string
	ex_rec	ENDS

The next step creates a record which has the composition of
the previous record definition.

	my_rec	ex_rec	<22, 3.2, 'Hi there.$'>

Each field of the record is accessed in a similar way to
Pascal, eg,

	my_rec.lett

accesses the lett field of the record my_rec.
The following program shows an implementation for the 8088.

		TITLE	Records
		.MODEL 	Large

		ex_rec	STRUC
			int_num	dw	?
			fp_num 	dd 	?
			mess 	db	"             "	; 13 spaces, room for the string below
		ex_rec	ENDS

		.STACK 200h
		.DATA
		myrec	ex_rec <22,1.30, "Hello there.$">

		.CODE

	start:	mov	ax, @data
		mov	ds, ax 
		mov 	dx, offset myrec.mess 
		mov 	ah, 9h 
		int 	21h 
		mov 	ax,4c00h 
		int 	21h 
		END 	start

Jump Tables

Jump tables are an efficient method of implementing switch/case
type statements. A jump table consists of an array of addresses.
Using an offset into the array selects the address of the routine
which handles that particular value.

Jump tables are efficient because it always takes the same
time to select any routine from the table. The order may be
re-arranged, or new routines added, simply by changing the
entries or increasing the size of the table.
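
In outline, the dispatch step looks like the sketch below. The table optbl and the labels are hypothetical, and the range check is an addition that guards against selector values falling outside the table.

```asm
	.MODEL	small
	.DATA
optbl	dw	opt0, opt1, opt2	; hypothetical table, one offset per routine
	.CODE
dispatch:				; on entry AL = selector value
	cmp	al, 2			; highest valid selector
	ja	optdone			; ignore values outside the table
	xor	ah, ah			; extend AL to a word in AX
	mov	bx, ax
	shl	bx, 1			; two bytes per table entry
	jmp	optbl[bx]		; indirect jump through the table
opt0:	mov	dl, 'A'			; each routine does its own work
	jmp	optdone
opt1:	mov	dl, 'B'
	jmp	optdone
opt2:	mov	dl, 'C'
optdone:				; all routines rejoin the common path here
```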

The following program implements a jump table.

	TITLE	Jump.asm
	.MODEL 	Large
	.STACK 	200h
	.DATA
	help	db	'This program exits when a function key is pressed.'
		db 10, 13, 'Ctrl A generates underline.', 10, 13
		db 'Ctrl B generates bold.', 10, 13
		db 'Ctrl C generates blinking.', 10, 13
		db 'All other control codes return to normal text.', 10, 13
		db 10, 13, 'Start typing characters.', 10, 13, '$'
	attrib	db 07h 	; screen attribute byte
		; a table of addresses used to decipher received control codes
		; each entry is the address of the appropriate routine
	ctl_tbl	label	word
		dw ctrl_null 	; 0 
		dw ctrla 	; 1 
		dw ctrlb 	; 2 
		dw ctrlc 	; 3 
		dw ctrld	; 4 
		dw ctrle 	; 5 
		dw ctrlf 	; 6 
		dw ctrlg 	; 7 
		dw ctrlh 	; 8 10 
		dw ctrli 	; 9 11 
		dw ctrlj 	; a 12 
		dw ctrlk 	; b 13 
		dw ctrll 	; c 14 
		dw ctrlm 	; d 15 
		dw ctrln 	; e 16 
		dw ctrlo 	; f 17 
		dw ctrlp 	; 10 20
		dw ctrlq 	; 11 21 
		dw ctrlr 	; 12 22 
		dw ctrls 	; 13 23 
		dw ctrlt 	; 14 24 
		dw ctrlu 	; 15 25 
		dw ctrlv 	; 16 26 
		dw ctrlw 	; 17 27 
		dw ctrlx	; 18 30 
		dw ctrly 	; 19 31 
		dw ctrlz 	; 1a 32 
		dw ctrl_lbkt 	; 1b 33
		dw ctrl_bslash 	; 1c 34 
		dw ctrl_rbkt 	; 1d 35 
		dw ctrl_carat 	; 1e 36 
		dw ctrl_ul 	; 1f 37 

	.CODE
	bumpcur	proc	far 	; move cursor right one character
		mov ah, 3 	; read cursor position into dh, dl
		xor bh, bh
		int 10h
		inc dl 		; next column
		cmp dl, 80 	; end of line? (columns run 0 to 79)
		jl short bpcur1
		xor dl, dl 	; go to start of next line
		inc dh
		cmp dh, 24 	; end of screen? (rows run 0 to 24)
		jle short bpcur1
		mov ax, 0601h 	; then scroll up one line
		xor cx, cx 	; window from row 0, column 0
		push dx
		mov dh, 24 	; to row 24, column 79
		mov dl, 79
		mov bh, [attrib]
		int 10h
		pop dx
		mov dh, 24 	; position bottom
	bpcur1:	xor bh, bh 	; set cursor position
		mov ah, 2
		int 10h
		ret
	bumpcur	endp

	ctrl_code	proc	far	; process Control CODES
		push bx
		cbw 		; convert AL to AX
		mov bx,ax 	; use bx as an index into
		shl bx,1 	; the ctl_tbl (two bytes per entry)
		jmp ctl_tbl[bx] ; jump to key routine
	ctrla:	and byte ptr [attrib], 0f9h	; underline
		jmp ctrl_exit
	ctrlb:	or byte ptr [attrib], 08h 	; bold
		jmp ctrl_exit
	ctrlc:	or byte ptr [attrib], 80h 	; blink on
		jmp ctrl_exit
	ctrld: 					; all others normal
		; the remaining labels named in ctl_tbl (ctrl_null and
		; ctrle to ctrl_ul) must also be defined here
		mov byte ptr [attrib], 07h	; normal attribute
	ctrl_exit: pop bx
		ret
	ctrl_code	endp

	start:	mov ax, @data
		mov ds, ax
		mov ah, 9h	 	;print help message
		mov dx, offset help
		int 21h
	lp1: 	mov ah, 06h		; read character from keyboard
		mov dl, 0ffh
		int 21h
		jz lp1 			; repeat if character not ready
		cmp al, 00h 		; if function key then exit
		je exit
		cmp al, 32 		; else if control code
		jae disp1
		call ctrl_code		; then process control code
		jmp lp1
	disp1:	push bx
		xor bx, bx 		; page zero on video memory
		mov bl, [attrib] 	; get character attribute
		mov cx, 1 		; one character to write
		mov ah, 9 		; write char + attribute
		int 10h 		; use BIOS call
		call bumpcur 		; next cursor position
		pop bx 			; balance the earlier push
		jmp lp1 		; repeat
	exit: 	mov ax, 4c00h
		int 21h
	END 	start


