Introduction To Assembly Language Programming
Early computer systems were basically programmed by hand. The front panel switches were
used to enter instructions and data. These switches represented the address, data and
control lines of the computer system. To enter data into memory, the address switches
were toggled to the correct address, the data switches were toggled next, and finally the
control line switch was toggled. This wrote the binary value on the front panel data
switches to the address specified. Once all the data and instructions were entered, the run
switch was toggled to run the program.
The programmer also needed to know the instruction set of the processor. Each instruction
needed to be manually converted into bit patterns by the programmer so that the front
panel switches could be set correctly. This led to errors in translation as the programmer
could easily misread 8 as the value B. It became obvious that such methods were slow and
error prone.
With the arrival of better hardware which could address larger memory, and the increase in
memory size (due to better production techniques and lower cost), programs were written
to perform some of this manual entry. Small monitor programs became popular, which
allowed entry of instructions and data via hex keypads or terminals. Additional devices such
as paper tape and punched cards became popular as storage methods for programs.
The programs were still assembled by hand, in that the conversion from mnemonics to
instructions was still performed manually. To increase programmer productivity, the idea of
writing a program to translate another was a major breakthrough. This program would be run
by the computer and translate the mnemonics into instructions.
As programmers were already writing the source code in mnemonics, this seemed the logical
next step. The source file was fed as input into the program, which translated the mnemonics
into instructions, then wrote the output to the desired place (e.g. paper tape). This
sequence is now accepted as commonplace. The main advance since then has been the increasing
use of high level languages to increase programmer productivity.
Introduction
Assembly language represents each of the many operations that the computer can do with a
mnemonic, a short series of letters, and uses an assembler to convert these mnemonics into
actual processor instructions and associated data.
Assemblers are programs which generate the machine code instructions from a source code
program written in assembly language. Some of the features provided by an assembler are:
Typically, an assembler reads a file of assembly language and translates it one line at a
time, outputting a file of machine language. The input file is called the 'source file' and the
output file is called the 'object file'. The machine language patterns produced are called the
'object code'.
Also produced during the assembly process is a 'listing', which summarizes the results of the
assembly process. In the event that the assembler is unable to understand any of the
source lines, it inserts an error message in the listing, pointing out the problem.
This project constructs a program that counts the number of sentences and
paragraphs in a text file. The input text file must be stored in the same
location as the assembly source file and the exe file. The program starts without prompting
the user to key in any value. When it is run, the program reports the number of sentences
and the number of paragraphs as integers.
FUNDAMENTAL STUDY
To successfully carry out the project, some fundamental knowledge of program design is
required. The tasks that need to be included in constructing a program are elaborated
below.
2.1. Design Stage
The design stage involves expressing the solution to the problem in a standard format.
There are several design tools (such as pseudo-languages) that can be used by the programmer.
The most common design tool used by assembly language programmers is the flowchart, while
pseudocode is the common design tool used by third and fourth generation language programmers.
These common design tools are briefly described below:
2.1.1. Flowchart
A flowchart is a graphical representation of the program logic. It is one of the oldest methods
of program design. A flowchart uses graphical shapes to represent the different actions that
the computer will perform. Arrows that indicate the flow of control connect these shapes.
2.1.2. Coding
A written computer instruction is called code and the process of writing code is called
coding. The code written by programmers is called source code. The programmer then uses
a compiler (or, for assembly language, an assembler) to translate the source code into what
is called object code. The object code is then linked into a program that is ready to run,
which is called executable code or machine code.
2.1.3. Debugging
1. .Model Small
.MODEL only needs to be used with simplified segment directives. Code and data have separate
segments, but each must be less than 64K. Both code and data are NEAR. For most
applications, this will suffice.
2. .Stack 200h
Tells the assembler to set up a 200h-byte stack upon execution of the program. NOTE: the
size you choose for the stack does not change the size of the file on disk. You can see what
I mean by changing the 200h to, say, 400h and then reassembling. The file sizes are
identical. This could be replaced with:
: MyStack ENDS
BUT, doing it this way makes your executable 512 bytes bigger. If you were to double it to
400h, the executable would be another 512 bytes bigger. I think it's pretty obvious why the
simplified version is preferred.
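For comparison, a full segment declaration of a stack might look like the sketch below. It is only an illustration and not the elided original: the segment names MyStack and MyCode are placeholders, and the `db 200h dup (0)` body is an assumption about how the 200h bytes would be reserved.

```asm
; Sketch: explicit stack segment instead of ".stack 200h".
; Segment names and the initialised 'db' body are assumptions.
MyStack SEGMENT PARA STACK 'STACK'
        db 200h dup (0)         ; 512 initialised bytes stored in the exe
MyStack ENDS

MyCode  SEGMENT PARA PUBLIC 'CODE'
        ASSUME cs:MyCode, ss:MyStack
Start:
        mov ax, 4C00h           ; DOS function 4Ch: terminate, exit code 0
        int 21h
MyCode  ENDS
        END Start
```

Because the 200h bytes are initialised data, they are written into the executable, which is consistent with the size increase described above.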
3. .Data
This is the place to declare all the variables required by the program. For normal use,
WORD or BYTE variables are sufficient.
START:
The label that marks the start of the program. Most of the program is written under Start.
4. End Start
This tells the assembler that we are all done with our program and that it can stop
assembling now.
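Put together, the directives described above form a program skeleton along the following lines. This is a minimal sketch: the variable name counter is hypothetical, and the `mov ax, @data` / `mov ds, ax` pair is the usual way of making the data segment addressable under the small model.

```asm
.model small                ; separate code and data segments, both NEAR
.stack 200h                 ; 200h-byte stack, not stored in the exe file
.data
counter dw 0                ; hypothetical WORD variable

.code
start:
    mov ax, @data           ; load the address of the data segment
    mov ds, ax              ; DS cannot be loaded with an immediate value
    inc counter             ; the body of the program goes here
    mov ax, 4C00h           ; DOS function 4Ch: terminate, exit code 0
    int 21h
end start                   ; end of source, execution begins at 'start'
```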
5. The INT, INTO, BOUND, and IRET Instructions
The int (for software interrupt) instruction is a very special form of a call instruction.
Whereas the call instruction calls subroutines within your program, the int instruction calls
system routines and other special subroutines. The major difference between interrupt
service routines and standard procedures is that you can have any number of different
procedures in an assembly language program, while the system supports a maximum of 256
different interrupt service routines. A program calls a subroutine by specifying the address of
that subroutine; it calls an interrupt service routine by specifying the interrupt number for
that particular interrupt service routine. This chapter will only describe how to call an
interrupt service routine using the int, into, and bound instructions, and how to return from
an interrupt service routine using the iret instruction.
7. Unconditional Jumps
The jmp (jump) instruction unconditionally transfers control to another point in the program.
There are six forms of this instruction: an intersegment/direct jump, two intrasegment/
direct jumps, an intersegment/indirect jump, and two intrasegment/indirect jumps.
Intrasegment jumps are always between statements in the same code segment.
Intersegment jumps can transfer control to a statement in a different code segment. These
instructions generally use the same syntax, jmp target; the assembler differentiates
them by their operands.
Although the jmp, call, and ret instructions provide transfer of control, they do not allow you
to make any serious decisions. The 80x86's conditional jump instructions handle this task.
The conditional jump instructions are the basic tool for creating loops and other conditionally
executable statements like the if..then statement. The conditional jumps test one or more
flags in the flags register to see if they match some particular pattern (just like the setcc
instructions). If the pattern matches, control transfers to the target location. If the match
fails, the CPU ignores the conditional jump and execution continues with the next
instruction.
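As an illustration of how a conditional jump builds a loop, the sketch below prints the digits 0 to 9. The label printDigit is hypothetical. The cmp instruction sets the flags and jbe ("jump if below or equal") transfers control back to the top of the loop while the pattern matches.

```asm
.model small
.stack 200h
.code
start:
    mov bl, '0'             ; BL holds the current digit character
printDigit:
    mov dl, bl              ; DOS function 02h prints the character in DL
    mov ah, 02h
    int 21h
    inc bl                  ; next digit
    cmp bl, '9'             ; set the flags from BL - '9'
    jbe printDigit          ; loop while BL <= '9', else fall through
    mov ax, 4C00h           ; DOS function 4Ch: terminate, exit code 0
    int 21h
end start
```

When the comparison fails (BL has passed '9'), the jump is ignored and execution continues with the instruction after jbe.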
PROBLEM STATEMENT
Create a program for counting the number of sentences and paragraphs.
PROGRAM OBJECTIVES
With the given problem statement, several main objectives or functions of the
program are outlined before attempting to write the source code. The basic objectives of
this program are listed as follows:
Read a text file (.txt) and load the text into a data buffer.
Read the data character by character to count the occurrences of periods.
Read the data character by character to count the occurrences of 'carriage return', or in ASCII code, 0Dh.
Print and display the results.
DESIGN STAGE
3.1 Design Considerations
The first step in producing the program is to identify all the functions that
the program should perform in order to solve the stated problem. Hence, the program's
design considerations are initially listed as shown below:
Read a separate text file for data and store it in a data buffer as a string for further operations.
Since every sentence must end with a period or full stop '.', the program should detect the occurrences of periods in the text file.
A 'sentence counter' is created to count and store the number of occurrences of periods, which equals the number of sentences.
Generally, a new paragraph always begins on a 'new line', hence the program should be able to detect the occurrences of the 'new line' input made through the keyboard's Enter key, ASCII code 0Dh.
Another counter is created to count the occurrences of 'new line' inputs in the text file.
The results of both counters are then printed and displayed in the command prompt window.
3.2 Flowcharts
Figure 1 shows the program process flow for counting the number of sentences in the
input file. When the program is executed, it begins by opening the designated input text file.
If the file is opened successfully, the program continues to read the contents of the text file
and loads them into a data buffer as a string. In the event of an error reading the file, the
program displays an error message and terminates immediately.
The program obtains the number of bytes of data read from the input file, then reads the
first character and decides whether the character is a dot, question mark or exclamation
mark. If not, the loop continues with the next character. If so,
the counter is increased by 1 and the loop continues with the next character. The
process repeats until no more characters remain in the string. Subsequently,
the total count is printed so that the user can see it in the command
prompt window.
Figure 2 shows another individual program that counts the occurrences of 'new line'
inputs. The only difference is that it checks the data for the ASCII code 0Dh.
Figure 3 is another flowchart that integrates both counting procedures, for sentences
and paragraphs, into one single program. The desired program should run in accordance
with the flow shown in figure 3, where each character is tested for a dot,
exclamation mark or question mark to count sentences while it is also tested for
ASCII code 0Dh, which is a new line, to count paragraphs. Both results are printed
and displayed after all characters have been read. The program ends after printing the
results.
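The integrated flow described for figure 3 can be sketched as a single scanning loop. This is a simplified illustration, not the project's listing: the text is a hypothetical in-memory sample rather than a file, all names (sampleText, sampleLen, sentences, newLines, scanLoop and so on) are made up for the example, and every 0Dh is counted without any further filtering.

```asm
.model small
.stack 200h
.data
sampleText  db "One. Two? Three!", 0Dh, 0Ah, "Four.", 0Dh, 0Ah
sampleLen   = $ - sampleText    ; number of bytes in the sample
sentences   dw 0
newLines    dw 0

.code
start:
    mov ax, @data
    mov ds, ax
    mov bx, 0                   ; BX = index of the current character
scanLoop:
    cmp bx, sampleLen           ; all characters read?
    jae scanDone
    mov al, sampleText[bx]
    cmp al, '.'                 ; dot, question mark or exclamation mark
    je  foundSentence
    cmp al, '?'
    je  foundSentence
    cmp al, '!'
    je  foundSentence
    cmp al, 0Dh                 ; carriage return marks a new line
    jne nextChar
    inc newLines
    jmp nextChar
foundSentence:
    inc sentences
nextChar:
    inc bx                      ; move on to the next character
    jmp scanLoop
scanDone:                       ; for this sample: sentences = 4, newLines = 2
    mov ax, 4C00h               ; DOS function 4Ch: terminate, exit code 0
    int 21h
end start
```

Testing each character once against all four codes keeps the two counters in step with a single pass over the data, which is the point of combining the two flowcharts.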
In order to increase the reliability of the program, an initial procedure is created to
open and read the input text file. This procedure returns an error message such as
'UNABLE TO OPEN FILE', 'UNABLE TO READ FILE' or 'UNABLE TO CLOSE FILE'. On execution,
the program locates the designated text file, whose name is written in the source code.
The file will fail to open if the stated file name is too long or the file name
has been wrongly typed. Upon successfully opening the file, the program proceeds to get
the file handle and then reads the data in the file. If the program fails to read the data, it
prints and displays an error message and terminates. If successful, the
program proceeds to obtain the number of bytes of data read from the input file and
then activates the counting procedures in sequence. After the counting is done, the program
attempts to close the input text file and checks for an error in the process. If an
error occurred, the error message is displayed and the program is terminated. If no error is
detected, the program ends normally.
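The open, read and close sequence with its three error exits can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: it uses the classic DOS open function 3Dh with a zero-terminated file name, a 512-byte buffer, and hypothetical names (fileName, fileHandle, msgOpen, showError and so on). DOS reports failure of each call by setting the carry flag, which jc tests.

```asm
.model small
.stack 200h
.data
fileName    db "test.txt", 0        ; zero-terminated name (assumption)
fileHandle  dw ?
bytesRead   dw ?
buffer      db 512 dup (?)
msgOpen     db "UNABLE TO OPEN FILE$"
msgRead     db "UNABLE TO READ FILE$"
msgClose    db "UNABLE TO CLOSE FILE$"

.code
start:
    mov ax, @data
    mov ds, ax

    mov ax, 3D00h                   ; function 3Dh: open file, AL = 0 read only
    mov dx, offset fileName
    int 21h
    jc  openFailed                  ; carry set means the open failed
    mov fileHandle, ax              ; AX returns the file handle

    mov ah, 3Fh                     ; function 3Fh: read from file
    mov bx, fileHandle
    mov cx, 512                     ; maximum number of bytes to read
    mov dx, offset buffer
    int 21h
    jc  readFailed
    mov bytesRead, ax               ; AX returns the number of bytes read

    mov ah, 3Eh                     ; function 3Eh: close file
    mov bx, fileHandle
    int 21h
    jc  closeFailed
    jmp finish

openFailed:
    mov dx, offset msgOpen
    jmp showError
readFailed:
    mov dx, offset msgRead
    jmp showError
closeFailed:
    mov dx, offset msgClose
showError:
    mov ah, 09h                     ; function 09h: print '$'-terminated string
    int 21h
finish:
    mov ax, 4C00h                   ; DOS function 4Ch: terminate, exit code 0
    int 21h
end start
```

In a complete program the counting procedures would run between the read and the close, using bytesRead as the loop limit.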
3.3 Coding
With the design considerations in mind and using the flowcharts as guidelines, a set
of computer instructions is written using the Notepad software available in any
Microsoft Windows operating system. After the source code is complete, an assembler has
to be used to translate the source code into machine code instructions in an object file that
can be executed by the microprocessor. In this project, the BORLAND TURBO ASSEMBLER is
used.
The procedure for using the Turbo Assembler to create an executable file from
source code is briefly discussed here. The Turbo Assembler used for
this project is a ready-to-use folder package named tasm. The package includes an
assembler, a linker and a debugger. The steps are as follows:
The source code has to be saved as a .asm file, for example program.asm, and this file is stored in the 'BIN' folder inside the 'tasm' folder, which is usually kept on the C drive for ease of access.
The text file to be read is saved as a .txt file in the BIN folder as well. In this demonstration, a text file named test.txt is used and its contents are shown below.
comes together in the Turbo Assembler package and the command line to link an
object file is:
TLINK filename.obj
PROGRAM CODE
.model small
.stack 200h
.data
bufferSize = 5120
bytesRead dw ?
inputFile db "test.txt$"
inputHandle dw ?
countParagraph dw 0
countSentences dw 0
.code
start:
mov ds, ax
fileOperationErrorStart:
fileOperationErrorOpen:
int 21h
fileOperationErrorRead:
int 21h
fileOperationErrorClose:
int 21h
fileOperationErrorEnd:
mov dx, offset buffer ; POINTER TO BUFFER VARIABLE
jmp Outputs
bytesReadNotZero:
; SENTENCE COUNTER
countSentencesLoop:
; LOOP IS STOPPED
cmp buffer[bx], 3Fh ; COMPARE CURRENT CHARACTER
incrementCountSentences:
jmp incrementCountSentencesBX
incrementCountSentencesBX:
countSentencesEOS:
; PARAGRAPH COUNTER
countParagraphsLoop:
; LOOP IS STOPPED
jmp incrementCountParagraphsBX
incrementCountParagraphs:
jmp skipCRLFAndSpaceLoop
incrementCountParagraphsBX:
skipCRLFAndSpaceLoop:
; LOOP IS STOPPED
; JUMP TO incrementCountParagraphsBX
skipCRLFAndSpaceLoopAddBX:
; JUMP TO skipCRLFAndSpaceLoop
; TO RECHECK FOR CR LF SP
countParagraphsEOS2:
inc countParagraph
countParagraphsEOS:
mov bx, 0
Outputs:
; OUTPUT OF outputFileName
int 21h ; INTERRUPT
; OUTPUT OF inputFile
int 21h
int 21h
; OUTPUT countSentences
int 21h
; OUTPUT countParagraph
; END PROCESS
int 21h
printNumbers PROC
push bx
push cx
push dx
pushToStackLoop:
mov dx, 0 ; DX = 0
div bx ; DIVIDE DX:AX BY BX (QUOTIENT IN AX, REMAINDER IN DX)
printStackLoop:
int 21h
cmp cx, 0
pop dx
pop cx
pop bx
ret
printNumbers ENDP
exit:
.exit
end start
ANALYSIS
RESULT OF PROGRAM
To execute the program, type the name of the exe file. In the example shown in
figure 1, pns was keyed in at the MS-DOS prompt. After pressing Enter, the program was
executed and the result shows that in the bom.txt file, 3 sentences and 1 paragraph were found.
The error messages are the messages displayed if any error occurs. There are
three types of error message in this program, each indicating a specific error in
MS-DOS file handling. The program shows the message outputErrorO when the file cannot
be opened, outputErrorR when the file cannot be read and outputErrorC when the
file cannot be closed. Meanwhile, the output messages outputFileName,
outputParagraph and outputSentences are shown in the result in MS-DOS.
Figure 2 - variable in data segment
In addition, the '$' sign marks the end of each output message, since the DOS print
string function (INT 21h, function 09h) displays characters up to the '$'. Thus, the name
of the file is displayed after the reading-file message, while the number
of sentences and the number of paragraphs are displayed after the other two
messages.
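Displaying a '$'-terminated message followed by a counter value can be sketched as below. The message text and the sample value 3 are placeholders, and printNumber is a hypothetical routine written for this illustration: it divides by 10 repeatedly, pushes each remainder on the stack, then pops the digits so that they are printed in the correct order.

```asm
.model small
.stack 200h
.data
outputSentences db "Number of sentences: $"   ; placeholder message text
countSentences  dw 3                          ; sample value for the demo

.code
start:
    mov ax, @data
    mov ds, ax
    mov ah, 09h                 ; function 09h: print string up to '$'
    mov dx, offset outputSentences
    int 21h
    mov ax, countSentences
    call printNumber            ; prints AX as an unsigned decimal number
    mov ax, 4C00h               ; DOS function 4Ch: terminate, exit code 0
    int 21h

printNumber PROC
    push bx
    push cx
    push dx
    mov bx, 10                  ; divisor for decimal digits
    mov cx, 0                   ; CX counts the digits pushed
pushDigit:
    mov dx, 0                   ; clear the high word of the dividend DX:AX
    div bx                      ; AX = quotient, DX = remainder (0 to 9)
    push dx                     ; save the digit, least significant first
    inc cx
    cmp ax, 0
    jne pushDigit               ; repeat until the quotient is zero
popDigit:
    pop dx                      ; most significant digit comes off first
    add dl, '0'                 ; convert 0..9 to its ASCII character
    mov ah, 02h                 ; function 02h: print the character in DL
    int 21h
    loop popDigit               ; decrement CX, repeat while CX is not zero
    pop dx
    pop cx
    pop bx
    ret
printNumber ENDP
end start
```

The stack is used because division produces the digits in reverse order, from least to most significant.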
In order to begin the counting sequence from zero, 0 is assigned to the variables
countParagraph and countSentences as the initial value of the counters. These two counters
are used in counting the number of paragraphs and sentences and are explained in the
sentence counting and paragraph counting sections.
To open the txt file and import its data, the MS-DOS file handling functions are
required. This method consists of three major sections: open file/create file, read file and
close file.
Function 716Ch is a dual-purpose function which can be used to create a file or to open a
file. In this program, it is used to open a file. The name of the file is assigned to
the source index so that the file named in
the inputFile variable is opened. If an error is found while opening the file, the program
proceeds to fileOperationErrorOpen, which prints the
message "Unable to open file". Figure 4 shows the message printing code for
fileOperationErrorOpen.
Meanwhile, to read a file, DOS function 3Fh is required. In this program, the input
handle defines which file to read while bufferSize defines the number of bytes to read.
The offset of buffer is the pointer to the buffer variable. The program jumps to
fileOperationErrorRead if an error occurs during reading. Figure 6 illustrates the
code that displays the read error message. In this program, the message assigned to the
variable outputErrorR is displayed when an error occurs during file reading.
Finally, the MS-DOS file handling ends by closing the file. As
shown in figure 7, DOS function 3Eh is used in the program to close the
file. The inputHandle variable selects the file to be closed. If an
error occurs, execution moves straight to fileOperationErrorClose as shown in
figure 8.
SENTENCE COUNTING
In contrast, if none of the comparisons is equal, the program jumps to
incrementCountSentencesBX to increase the value of bx and then loops back to
countSentencesLoop. In this case, the value of countSentences is not increased and
no sentence is counted during this pass.
PARAGRAPH COUNTING
In the paragraph counting section, the program starts by comparing bx with
bytesRead. After the comparison, it jumps straight to countParagraphsEOS2 and
increases countParagraph if the value of bytesRead is equal to zero. If the value of
bytesRead is not equal to zero, it proceeds to the comparison with CR or LF.
During this comparison, if either of them matches, the program jumps
to incrementCountParagraphs and increases countParagraph. However, after increasing
countParagraph, instead of jumping back to the beginning of the paragraph counting
code, it jumps to skipCRLFAndSpaceLoop to check whether any further
CR, LF or space characters follow the one detected in incrementCountParagraphs.
This rechecking eliminates the possibility of miscounting blank space
on a new line or larger line spacing. It skips all the following CR, LF and space characters
until it detects another character and then jumps back to incrementCountParagraphsBX to
continue counting. Figure 9 below shows the code for counting paragraphs.
After going through this paragraph counting code, the total number of paragraphs is
stored in countParagraph, ready to be displayed in the output of the program.
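The skip-ahead idea can be sketched as follows. This is a simplified reading of the description above, not the original listing: the text is a hypothetical in-memory sample, and the names (sampleText, sampleLen, paragraphs, skipLoop and so on) are made up for the example. A paragraph is counted at the first CR or LF of a break, and any CR, LF or space characters that follow are skipped so that blank lines and indentation are not counted again.

```asm
.model small
.stack 200h
.data
sampleText  db "First paragraph.", 0Dh, 0Ah, 0Dh, 0Ah
            db "  Second paragraph."
sampleLen   = $ - sampleText    ; number of bytes in the sample
paragraphs  dw 0

.code
start:
    mov ax, @data
    mov ds, ax
    mov bx, 0                   ; BX = index of the current character
countLoop:
    cmp bx, sampleLen
    jae endOfText               ; text ended inside a paragraph
    mov al, sampleText[bx]
    cmp al, 0Dh                 ; CR?
    je  paragraphBreak
    cmp al, 0Ah                 ; LF?
    je  paragraphBreak
    inc bx
    jmp countLoop
paragraphBreak:
    inc paragraphs              ; count the paragraph that just ended
skipLoop:
    inc bx                      ; skip the break character itself
    cmp bx, sampleLen
    jae done                    ; text ended with a break, nothing left
    mov al, sampleText[bx]
    cmp al, 0Dh
    je  skipLoop                ; skip further CR,
    cmp al, 0Ah
    je  skipLoop                ; LF
    cmp al, ' '
    je  skipLoop                ; and space characters
    jmp countLoop               ; an ordinary character starts new text
endOfText:
    inc paragraphs              ; last paragraph has no trailing break
done:                           ; for this sample: paragraphs = 2
    mov ax, 4C00h               ; DOS function 4Ch: terminate, exit code 0
    int 21h
end start
```

One limitation of this sketch is that an empty text still yields a count of 1, because the end-of-text branch always counts a final paragraph.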
CONCLUSION
The program was successfully designed, and the numbers of sentences and paragraphs were
calculated. Although there are some limitations in this program, it can be further
improved in the future.
REFERENCE
1. 80386/80486 Registers. Retrieved on December 19, 2012, from
www.tu-ilmenau.de/fileadmin/media/ra/template/.../webster_teil.pdf
2. Intel 80386 Programmer's Reference Manual, 1986. Retrieved on December 19, 2012,
from pdos.csail.mit.edu/6.828/2012/readings/i386.pdf
3. Ismail Saad, 2010, Introduction to computer, KM40203 Mikropemproses Dan
Elektronik lecture note.
Contents
INTRODUCTION TO ASSEMBLY LANGUAGE PROGRAMMING
FUNDAMENTAL STUDY
1. .Model Small
PROBLEM STATEMENT
PROGRAM OBJECTIVES
DESIGN STAGE
ANALYSIS
CONCLUSION
REFERENCE