MASM Notes

4. Familiarity with MASM, Introduction to Memory Segments


Part I: Background
The Microsoft Assembler package, MASM, is a programming environment that contains four major tools: the assembler and linker; QuickHelp, a complete on-line help system; the Programmer's WorkBench (PWB); and CodeView. PWB serves as an editor that has a series of options to guide you through assembler language program development. QuickHelp allows the developer to access detailed online help about assembler language instructions, DOS INT 21H function calls, and BIOS function calls. CodeView is an enhanced version of DEBUG with a graphical interface that also handles 32-bit instructions.

Processors in the 80x86 family divide the memory into segments. In real mode, each segment is at most 64 kB long. There are six defined segments: the code, stack, data, extra, F, and G. The instructions are stored in the code segment; the stack segment is reserved for the stack; and the data is stored in the data segment. The physical locations are determined by the values in the registers CS, SS, DS, ES, FS, and GS. Segments may be separate or may overlap fully or partially.

DOS programs for the 80x86 family come in two types of executable files, COM and EXE. Each makes use of the segments differently. The simplest form of an x86 program is the COM file. The COM file uses only a single real mode memory segment. Thus, COM programs are limited to 64 kB in length. When we write COM files, we should ensure that the code, data, and stack information are all stored in the same memory segment. This can be accomplished in MASM by including the following directives in the assembler language source file.

    cseg    segment 'code'
            assume  cs:cseg, ds:cseg, ss:cseg, es:cseg
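For reference, a complete COM source file built around such directives might look like the sketch below. It is an illustration only: the labels START, BEGIN, and MSG and the message text are made up for this example, and the ORG 100H line reflects the fact that DOS loads a COM image at offset 100H, directly after the 256-byte program segment prefix.

```asm
; Minimal COM program sketch (labels and message are illustrative)
cseg    segment 'code'
        assume  cs:cseg, ds:cseg, ss:cseg, es:cseg
        org     100h                ; COM image starts at offset 100h
start:  jmp     begin               ; jump over the data that follows
msg     db      'Hello from a COM file$'
begin:  mov     dx, offset msg      ; DS:DX -> '$'-terminated string
        mov     ah, 09h             ; DOS function 09h: display string
        int     21h
        mov     ax, 4C00h           ; DOS function 4Ch: exit, return code 0
        int     21h
cseg    ends
        end     start               ; entry point must be at offset 100h
```

Code and data share the one segment, so all four segment registers can keep the value DOS gives them at load time. With the MASM 6.11 linker, a COM file is normally requested with the /TINY option; check the linker help on your installation if that option is not accepted.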

The EXE program has no file size restrictions and may contain several segments. Modular programs are often written using several different segments. However, multiple segments must be aligned on 16 byte boundaries since all segments begin at addresses that end with 4 binary 0s. The linker will ensure that the multiple segments are grouped on paragraph boundaries. The figure below is an example of 4 different segments and the addresses where they are stored.

[Figure: memory map of four segments, CSEG1, CSEG2, DSEG, and SSEG, with the address labels 10000, 20000, 20030, 20040, and 20060]
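In real mode the processor forms a 20-bit physical address as (segment x 16) + offset. The sketch below carries out that calculation with 8086 instructions for the made-up pair 2003H:0012H, leaving the result in DX:AX. The values and the use of DX:AX are assumptions chosen for illustration; the STACK combine type on the stack segment is also an assumption.

```asm
; Sketch: physical address = segment * 16 + offset, result in DX:AX
SSEG    SEGMENT STACK
        DB 64 DUP (?)
SSEG    ENDS

CSEG    SEGMENT
MAIN    PROC FAR
        ASSUME CS:CSEG, SS:SSEG
        MOV AX, 2003H           ; example segment value
        MOV BX, 0012H           ; example offset
        MOV DX, AX              ; copy the segment value
        MOV CL, 4
        SHL AX, CL              ; AX = low 16 bits of segment * 16 (0030H)
        MOV CL, 12
        SHR DX, CL              ; DX = top 4 bits of segment (0002H)
        ADD AX, BX              ; add the offset (AX = 0042H)
        ADC DX, 0               ; carry into the upper bits
        ; stop here in CodeView: DX:AX = 0002:0042, i.e. 20042H
        MOV AX, 4C00H           ; return to DOS
        INT 21H
MAIN    ENDP
CSEG    ENDS
        END MAIN
```

Multiplying by 16 is the same as appending one hexadecimal 0 to the segment value, which is why 2003H becomes the base address 20030H.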

Objectives:
Learn to:
A. Use the Programmer's WorkBench to create, assemble, and link a program.
B. Use CodeView to debug and execute an assembler language program.
C. Use QuickHelp to access help on instructions and the assembler.
D. Understand the meaning of the code and data segments.

Pre-Lab
Read sections 3.2, 6.1, 6.2, and 6.3 in the Uffenbeck text.

What is the physical address corresponding to DS:BX if DS=103FH and BX=94D0H?

Explain why segments must be located on paragraph boundaries when they are loaded into memory. (Hint: think about how logical addresses are converted to physical addresses.)

Lab A.1 The Assembly Language Process Using the Command line
The following section explains how to assemble and link a file using the command line from a DOS window. The steps are:

1. Create or edit the source code (.asm file) using any ASCII text editor. Warning: the file must be saved in an ASCII format; some editors, like 'winword' or 'word', store the file by default in a binary format. To save in an ASCII format in some of the Microsoft editors, select the output type as *.TXT but specify the full file name as myfile.asm (the .asm extension should be used for assembly language files).
2. Invoke the masm program to assemble the file and produce a .obj file and, optionally, a .lst file.
3. Invoke the link program to produce a .exe program (or a .com program via a command line argument).

Assume we have an assembly language file called test.asm that has been saved in ASCII format. Open a DOS window. To assemble the file, change to the directory where the file is via the 'cd' command, and type:

    C:\> masm test

If assembly is successful, this will produce a file called test.obj. If errors are present, you will be given the line numbers where the syntax errors occurred. You can also produce a listing file (.lst), which shows opcodes for all instructions, via:

    C:\> masm test,test,test

It is a good idea to always create a .lst file output. A .exe file must be created from the .obj file via the link program. Type:

    C:\> link test

You will be prompted for file names for the Run file, List file, libraries, and Definitions file. Just hitting <enter> for each choice will use the defaults. This will produce a test.exe file which can then be executed. You can also produce the .exe file with no prompting from the link program via:

    C:\> link test,,,,,

Use 5 commas after the filename (test) to provide defaults for all other choices.

Using the command line for masm/link is probably the easiest thing to do if you are only assembling/linking one source file. If your program is composed of multiple source files, the PWB program (next section) is probably a better choice. Most of your labs will only consist of one source file.

Section B discusses how to use a debugger called CodeView. In order to view the source code of your program within the CodeView debugger, you need to use some command line switches with the masm and link programs in order to include debugging information. The switches are "/zi" for masm and "/co" for link, as shown below:

    C:\> masm /zi test,test,test
    C:\> link /co test,,,,,

A.2 The Assembly Language Process Using PWB


The following section discusses a program called Programmer's WorkBench (PWB) for editing your assembly language file and invoking the assembler program called MASM. You DO NOT HAVE TO USE the PWB program if you do not wish to. An alternative is to edit your file with any ASCII text editor and invoke MASM via the DOS command line to produce .obj files. You can then invoke the link program to produce either an .exe or .com program. PWB offers a somewhat pushbutton approach to assembling your program, and will allow you to create a project that assembles and links multiple files with one pushbutton.

PWB can be used for the development of the assembler language program. The development procedure follows a four-step process.

1. Create or edit the source code. (.asm)
2. Assemble the program to create the object code (.obj) and, optionally, a listing. (.lst)
3. Link the program to create the executable code. (.exe or .com)
4. Test and debug the program.

MASM is located in c:\Program Files\MASM611. Activate the PWB program by typing PWB at the MS-DOS command line (or Start -> Run -> pwb). Use the pulldown menu to create a new file. Type Example 4.1 into the new file using the editor. Save the program by selecting the File pulldown menu or ALT F3. Use the string .asm as the extension for the desired filename. Once the filename and the path are selected, choose OK to accept the new filename.

Example 4.1 uses the basic shell of an assembler language program. The shell includes the stack segment, the data segment, and the code segment. We will discuss the meanings of the various segments and definitions later in this lab. The section identified as the code segment will be used in CodeView, which displays it in symbolic form. The data segment is likewise identified and displayed as data in symbolic form.

Example 4.1

            TITLE   EX 4-1 (EXE) Purpose: Adds 4 bytes of data
    STSEG   SEGMENT
            DB 32 DUP (?)
    STSEG   ENDS
    ;--------------------------
    DTSEG   SEGMENT
    FOUR_NO DB 12H,0B5H,6CH,78H
    SUM     DB ?
    DTSEG   ENDS
    ;----------------------------
    CDSEG   SEGMENT
    MAIN    PROC FAR
            ASSUME CS:CDSEG,DS:DTSEG,SS:STSEG
            MOV AX,DTSEG
            MOV DS,AX
            MOV BX,OFFSET FOUR_NO   ;set up BX as data ptr
            MOV AL,0                ;initialize AL
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item (BX=BX+1)
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item
            MOV SUM,AL              ;store result in SUM
            MOV AH,4CH              ;set up return
            INT 21H                 ;invoke interrupt
            INT 20H                 ;breakpoint, exit
    MAIN    ENDP
    CDSEG   ENDS
            END MAIN

Now let's configure the program for the desired assembly format. Use Options -> Project Templates -> Set Project Template from the pulldown menu. In this window, set the Runtime Support choice to NONE, because most programs don't require runtime support from a separate library such as C, C++, etc.

Select the DOS.exe entry to generate a DOS .exe (executable) file as the target for the assembler and linker. Once NONE and DOS.exe have been selected, choose OK at the bottom of the dialog box. Next, use Project -> Edit Project, select your .asm file from the list, and use the Add choice to add it to your project file list.

Now that the project is defined, select Options -> Build. This determines the type of program produced by the assembler and linker. Choose the DEBUG option in the build dialog. After debugging is complete, choose the release option for the final program.

Next, use Options -> Language Options -> MASM Options from the pulldown menu. In the popup window, deselect Warnings Treated as Errors. Then select <Set Debug Options...>. In the new popup window, select Generate Listing File from the Listing section if it is not already selected. With this option set, the build produces both the .exe file and the .lst file. The listing file shows the source and object code in one file.

Now that PWB has been configured, the project can be built. Select Project -> Build. You may ignore warnings about the stack unless the program uses more than 128 bytes of stack space. The final product is now in the form of an .exe file and a .lst file. Choose CANCEL to return to PWB. Choose File -> Open and view the .lst file.

How are the opcodes displayed in the .lst file? Do the source code and/or the comments display in the file? Are the other segments displayed, such as stack or code?

Run this program if it is free from errors. If not, debug it using instructions from the following section.

B. Debugging Assembly Language Programs Using Code View


Codeview (cv.exe) is an external debugger that offers many more features than the 'debug.exe' program. You can debug programs simply by using debug.exe, but Codeview allows you to easily track both memory and register changes. It is recommended that you use Codeview for debugging your programs. The program typed in PWB should be error-free; however, we will use it to demonstrate the CodeView program.

Codeview (cv.exe) can be executed from the DOS command line, or from within PWB from the Run menu. To execute Codeview from the DOS command line for a .exe file, type:

    C:\> cv myfile.exe

This will bring up Codeview for the file myfile.exe.

Codeview can also be run from within PWB. If Codeview is not available within PWB from the Run menu, select Run -> Customize Run Menu from the pulldown menu. In the popup window, select <Add...>. In the new popup window, enter CodeView in the Menu Text field. In the second field, Path Name, enter the full path of cv.exe. It should be:

    c:\masm611\binr\cv.exe

Let's configure CodeView. Choose Options -> CodeView. The configuration may vary depending on your monitor. Select a 50 line display and the default CodeView configuration.

Now that CodeView is configured, select Run -> CodeView. Codeview dynamically displays the contents of all registers and the various memory locations. What is the logical address for the code segment? What are the contents of registers CS:IP? What is the logical address of the data? Step through the program with F10. Restart the program. Execute the entire program using F5.

What is the result of the addition? In which register(s) is the result stored?

Segments
Segment definitions follow a standard form:

    label   SEGMENT [options]
            ;statements belonging to the segment
    label   ENDS

The options field can be used to give information to the assembler for organizing the segment, but is not required. The label for ENDS must be the same as the label for SEGMENT.
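As an illustration of the options field, the sketch below fills in the three options most often seen in MASM source: an align type, a combine type, and a class name. The segment name DSEG2 and the variable COUNT are made up for this example.

```asm
; Sketch of a SEGMENT line with options (names are illustrative)
DSEG2   SEGMENT PARA PUBLIC 'DATA'  ; align type, combine type, class name
COUNT   DB 0                        ; one byte of data inside the segment
DSEG2   ENDS                        ; same label as on the SEGMENT line
        END                         ; no entry point: this module holds data only
```

PARA asks for the segment to start on a paragraph (16 byte) boundary, PUBLIC lets the linker join this segment with segments of the same name in other modules, and the class name 'DATA' tells the linker to place segments of the same class next to each other.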

C. Data Segments
The data segment is the portion of the memory used to store static data. The data is accessed in the code segment by the labels given in the data directives in the data segment definition portion of the assembler language source file. The x86 supports various data types and directives. MASM assembler directives are used to allocate space and assign names to data values and/or locations.

ORG is a MASM directive that is used to indicate the origin of an offset address. (A directive is an instruction to the assembler program; it is NOT an x86 instruction.) The number must end in H to indicate hexadecimal; otherwise the assembler will assume decimal and convert the number to hexadecimal.

DB is the define byte directive, which is used to allocate memory in byte-size chunks. The assembler default is decimal; for hexadecimal, the number must end with an H, and for binary, the number must end with a B. DB is also the only directive used to define ASCII strings longer than 2 characters.

            ORG 0020H
    DATA1   DB 37               ;decimal
    DATA2   DB 37H              ;hexadecimal
    DATA3   DB 100101B          ;binary
    DATA4   DB 0110111B         ;binary
    DATA5   DB 'My name is Amy$'

Assemble the data above. Dump the contents of memory at the respective address. Observe that the data storage is at the offset, 0020H.
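The data definitions can only be assembled as part of a complete source file. A minimal sketch of such a file is given here, assuming the segment names SSEG, DSEG, and CSEG used elsewhere in this lab and the STACK combine type on the stack segment (an assumption, added so that the linker records the stack). The code only loads DS and returns to DOS, which is enough to inspect the data segment in CodeView.

```asm
; Sketch: shell program for inspecting data definitions in CodeView
SSEG    SEGMENT STACK
        DB 64 DUP (?)
SSEG    ENDS

DSEG    SEGMENT
        ORG 0020H
DATA1   DB 37               ;decimal
DATA2   DB 37H              ;hexadecimal
DATA3   DB 100101B          ;binary
DATA4   DB 0110111B         ;binary
DATA5   DB 'My name is Amy$'
DSEG    ENDS

CSEG    SEGMENT
MAIN    PROC FAR
        ASSUME CS:CSEG, DS:DSEG, SS:SSEG
        MOV AX, DSEG        ; bring in segment for data
        MOV DS, AX          ; DS now addresses DSEG
        ; step to here, then dump memory starting at DS:0020
        MOV AX, 4C00H       ; return to DOS
        INT 21H
MAIN    ENDP
CSEG    ENDS
        END MAIN
```

Other data definitions from this section can be dropped into DSEG in the same way, one ORG block at a time.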

What is the logical address (and offset) for the values equivalent to those listed above in Example 3.1? How are the numbers represented, decimal, hex, binary? What is stored in memory that corresponds to the string above?

DUP is a MASM operator that repeats a given value a given number of times. Assemble the instruction

    DATA6   DB 6 DUP (0FH)

at origin 0030H. What are the memory contents at that offset? What is an alternate way to duplicate 0FH?

DUP is also used to set aside or reserve space for variables. For example,

    DATA7   DB 32 DUP (?)       ;set aside 32 bytes
    DATA8   DW 32 DUP (?)       ;set aside 32 words

DW is used to define words or allocate memory 2 bytes at a time.

            ORG 0070H
    DATA9   DW 253FH            ;store 2 bytes
    DATA10  DW 7,6,5,4,3,2,1    ;store various data words
    DATA11  DW 8 DUP (?)        ;set aside 8 words

If we use DW to store DATA9, then use DB as stated below

            ORG 0090H
    DATA12  DB 25H
    DATA13  DB 3FH

to store 253FH, will the memory appear the same? Why or why not?

EQU is used to define a constant but does not reserve memory storage for the value. As an example, consider the following segment definition directives.

            ORG 0060H
    VALUE   EQU 25              ;sets a constant 25
            MOV CX, VALUE

Assemble the above example using EQU, then assemble the following. Check the value of the internal registers. Does CX appear differently in the two examples?

            ORG 0080H
    VALUE2  DB 25
            MOV CX, VALUE2

Equate also makes changing constants throughout the program easier. The value can be changed in the equate line, rather than at each instance in the program.

DD (define doubleword) is used to allocate memory for a double word (4 bytes). The data is converted to hex, then placed in the memory location. The low byte goes to the low address and the high byte goes to the high address (the x86 is a little-endian architecture).

DQ (define quadword) is used to allocate memory 8 bytes (4 words) in size. This directive will store up to 64 bits of data at a time.

            ORG 0080H
    DATA14  DQ 'HI'
    DATA15  DQ 7,6,5,4
    DATA16  DQ 65534H

What is the hexadecimal equivalent stored for 'HI'? How many bytes are allocated for each character? Does that differ from the numbers in DATA15?

DT (define ten bytes) is useful for allocating memory for packed BCD.

            ORG 0090H
    DATA17  DT 36768
    DATA18  DT 36768H

Do the values differ in memory? If so, explain why.
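The byte ordering described earlier for DD can be observed with a short sketch like the one below. The variable name DVAL and the value 12345678H are made up for illustration, and the STACK combine type is an assumption; the comments give the register contents that follow from little-endian storage.

```asm
; Sketch: reading the individual bytes of a doubleword
SSEG    SEGMENT STACK
        DB 64 DUP (?)
SSEG    ENDS

DSEG    SEGMENT
DVAL    DD 12345678H            ; stored in memory as 78 56 34 12
DSEG    ENDS

CSEG    SEGMENT
MAIN    PROC FAR
        ASSUME CS:CSEG, DS:DSEG, SS:SSEG
        MOV AX, DSEG
        MOV DS, AX
        MOV BX, OFFSET DVAL
        MOV AL, [BX]            ; lowest address holds the low byte: AL = 78H
        MOV AH, [BX+3]          ; highest address holds the high byte: AH = 12H
        MOV DX, WORD PTR DVAL+2 ; upper word of the value: DX = 1234H
        MOV AX, 4C00H           ; return to DOS (overwrites AX)
        INT 21H
MAIN    ENDP
CSEG    ENDS
        END MAIN
```

Single-stepping in CodeView and watching AL, AH, and DX shows the ordering without having to read the memory dump.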

D. The Stack Segment


The stack is an area of memory reserved for temporary storage of program data and subroutine return addresses. We will talk more about the stack segment in a future lab. For now, in your programs include a stack segment declaration as shown below (allocates 64 bytes of memory for stack storage):

    SSEG    SEGMENT
            DB 64 DUP (?)
    SSEG    ENDS

For now, make sure that any program you write has a stack segment.
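A variation worth knowing, offered here as a sketch rather than as the form this lab requires, adds the STACK combine type to the declaration. With it, the linker records the stack in the .exe header, so DOS loads SS and SP before the program starts and the linker no longer reports a missing stack segment. The PUSH/POP pair is only there to show the stack being used.

```asm
; Sketch: stack segment declared with the STACK combine type
SSEG    SEGMENT STACK           ; linker sets initial SS:SP from this segment
        DB 64 DUP (?)           ; 64 bytes, so SP starts at 0040H
SSEG    ENDS

CSEG    SEGMENT
MAIN    PROC FAR
        ASSUME CS:CSEG, SS:SSEG
        MOV BX, 1234H
        PUSH BX                 ; SP decreases by 2, value saved on the stack
        MOV BX, 0
        POP BX                  ; BX = 1234H again, SP back to its old value
        MOV AX, 4C00H           ; return to DOS
        INT 21H
MAIN    ENDP
CSEG    ENDS
        END MAIN
```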

E. Code Segments
The code segment contains the x86 instructions that make up your program. Example 4.2 shows the shell of a program (repeated here for convenience).

Example 4.2 The form of an assembly language Program

    SSEG    SEGMENT
            DB 64 DUP (?)
    SSEG    ENDS
    ;
    DSEG    SEGMENT
            ; all data goes here
    DSEG    ENDS
    ;
    CSEG    SEGMENT 'CODE'
    MAIN    PROC FAR                ; program entry pt
            ASSUME CS:CSEG, DS:DSEG, SS:SSEG
            MOV AX, DSEG            ; bring in segment for data
            MOV DS, AX              ; assign the DS value
            ; place code here
            ;
            MOV AH, 4CH             ; set up to
            INT 21H                 ; return to DOS
            INT 20H
    MAIN    ENDP
    CSEG    ENDS
            END MAIN                ; argument for 'end' directive specifies
                                    ; program ENTRY point

The segment directive precedes the program entry point, which defines a procedure labeled MAIN. A procedure is a group of instructions designed to accomplish a specific function. A code segment is usually organized into several small procedures. Each procedure must begin with a PROC directive and is closed by an ENDP directive. The PROC directive may carry the option FAR or NEAR. The procedure at the program entry point must be designated FAR. NEAR refers to procedures that lie within the current code segment. The next line contains an ASSUME statement. The ASSUME statement associates segment registers with specific memory segments.
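A sketch of a code segment organized into two procedures is shown here. The procedure name DOUBLE_AL and the value 5 are made up for illustration, and the STACK combine type on the stack segment is an assumption (CALL and RET need a working stack).

```asm
; Sketch: a FAR entry procedure calling a NEAR procedure in the same segment
SSEG    SEGMENT STACK
        DB 64 DUP (?)
SSEG    ENDS

CSEG    SEGMENT
        ASSUME CS:CSEG, SS:SSEG
MAIN    PROC FAR                ; program entry point
        MOV AL, 5
        CALL DOUBLE_AL          ; near call: only the return offset is pushed
        MOV AH, 4CH             ; return to DOS, AL (now 10) is the exit code
        INT 21H
MAIN    ENDP

DOUBLE_AL PROC NEAR             ; lives in the same code segment as MAIN
        ADD AL, AL              ; AL = AL * 2
        RET                     ; near return: pops the offset only
DOUBLE_AL ENDP
CSEG    ENDS
        END MAIN
```

Because DOUBLE_AL is NEAR, the assembler generates a call that saves only IP; a FAR procedure would be called with both CS and IP saved.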

Write an assembly language program that includes a code segment named Cod_Seg, a data segment named Dat_Seg, and a stack segment named Sta_Seg. The data segment should have data items named BIG_DAT, SMAL_DAT, and SUM. The program should add the two values in BIG_DAT and SMAL_DAT and then store the result in SUM. What is actually assigned to the CS, DS, and SS registers? What is the value of SUM? Where did you find it, that is, what is its address? You MUST use MASM to assemble your code and produce a listing file.

Lab Report

A. Describing What You Learned


Answer all of the Think About It questions above.

B. Applying What You Learned


Discuss your experiences using PWB, Codeview, and MASM. Discuss the small program that you wrote; be sure to include the listing file of your program in your report.

Appendix (Short method of specifying segments)


There is a shorthand method for specifying segments using the .MODEL directive.

Example 4.3

            .model small
            .stack 100h
            .data
    four_no db 12h,0b5h,6ch,78h
    sum     db ?
            .code
    main    proc near
            mov ax,@data
            mov ds,ax
            ;; your statements here
            mov ah,4ch
            int 21h
            int 20h
    main    endp
            end main

The program above uses the .MODEL directive to specify a small memory model. A memory model causes the assembler to make assumptions about the size and number of the program segments. The small memory model allows one code segment (<64K), one data segment (<64K), and one stack segment (<64K). There are other models, such as medium (data <= 64K, multiple code segments, any size), compact (code <= 64K, multiple data segments, any size), large (multiple code, multiple data segments, any size), and flat (no segments, protected mode only, all 32-bit addresses). The different memory models determine whether code/data addresses require only offset information, or both segment and offset information. The small memory model will be sufficient for EE 3724 programs.
