MASM Notes
The EXE program has no file size restrictions and may contain several segments. Modular programs are often written using several different segments. However, multiple segments must be aligned on 16-byte (paragraph) boundaries, since all segments begin at addresses that end with four binary 0s. The linker will ensure that the multiple segments are placed on paragraph boundaries. The figure below is an example of 4 different segments and the addresses where they are stored.
[Figure: memory map of the four segments; surviving labels: addresses 20000 and 10000, segment CSEG1]
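The paragraph-boundary rule can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the MASM toolchain; the function names and the sample segment values 1000H and 2000H are assumptions chosen to match the addresses in the figure. It models the real-mode conversion of a segment:offset pair into a 20-bit physical address.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode physical address: segment shifted left 4 bits, plus offset.

    The result is masked to 20 bits, assuming 8086-style address wraparound.
    """
    return ((segment << 4) + offset) & 0xFFFFF


def is_paragraph_aligned(address: int) -> bool:
    """True when the address ends in four binary 0s (a multiple of 16)."""
    return address % 16 == 0


# Hypothetical segment register values for illustration only.
for seg in (0x1000, 0x2000):
    base = physical_address(seg, 0x0000)
    print(f"{seg:04X}:0000 -> {base:05X}, aligned: {is_paragraph_aligned(base)}")
```

Because the segment value is always multiplied by 16, the start of any segment (offset 0) necessarily falls on a paragraph boundary.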
Objectives:
Learn to:
A. Use the Programmer's WorkBench (PWB) to create, assemble, and link a program.
B. Use CodeView to debug and execute an assembler language program.
C. Use QuickHelp to access help on instructions and the assembler.
D. Understand the meaning of the code and data segments.
Pre-Lab
Read sections 3.2, 6.1, 6.2, and 6.3 in the Uffenbeck text.
What is the physical address corresponding to DS:BX if DS=103FH and BX=94D0H?
Explain why segments must be located on paragraph boundaries when they are loaded into memory (Hint: think about how logical addresses are converted to physical addresses).
Lab
A.1 The Assembly Language Process Using the Command Line
The following section explains how to assemble and link a file using the command line from a DOS window. The steps are:
1. Create or edit the source code (.asm file) using any ASCII text editor. Warning: the file must be saved in ASCII format. Some editors, such as 'winword' or 'word', store the file in a binary format by default. To save in ASCII format in some of the Microsoft editors, select the output type *.TXT but specify the full file name as myfile.asm (the .asm extension should be used for assembly language files).
2. Invoke the masm program to assemble the file and produce a .obj file and, optionally, a .lst file.
3. Invoke the link program to produce a .exe program (or a .com program via a command line argument).
Assume we have an assembly language file called test.asm that has been saved in ASCII format. Open a DOS window. To assemble the file, change to the directory where the file is located via the 'cd' command, and type:
    C:\> masm test
If assembly is successful, this will produce a file called test.obj. If errors are present, you will be given the line numbers where the syntax errors occurred. You can also produce a listing file (.lst), which shows the opcodes for all instructions, via:
    C:\> masm test,test,test
It is a good idea to always create a .lst file. A .exe file must then be created from the .obj file via the link program. Type:
    C:\> link test
You will be prompted for file names for the Run file, List file, Libraries, and Definitions file. Just hitting <enter> for each choice will use the defaults. This will produce a test.exe file, which can then be executed. You can also produce the .exe file with no prompting from the link program via:
    C:\> link test,,,,,
Use 5 commas after the filename (test) to accept the defaults for all other choices.
Using the command line for masm/link is probably the easiest approach if you are only assembling and linking one source file. If your program is composed of multiple source files, the PWB program (next section) is probably a better choice. Most of your labs will consist of only one source file.
Section B discusses how to use a debugger called CodeView. In order to view the source code of your program within the CodeView debugger, you need to pass command line switches to the masm and link programs so that debugging information is included. The switches are /zi for masm and /co for link, as shown below:
    C:\> masm /zi test,test,test
    C:\> link /co test,,,,,
MASM is located in c:\Program Files\MASM611. Activate the PWB program by typing PWB at the MS-DOS command line (or Start -> Run -> pwb). Use the pulldown menu to create a new file. Type Ex. 2.1 into the new file using the editor. Save the program by selecting the File pulldown menu or pressing ALT F3. Use .asm as the extension for the desired filename. Once the filename and the path are selected, choose OK to accept the new filename.
Example 4.1 uses the basic shell of an assembler language program. The shell includes the stack segment, the data segment, and the code segment. We will discuss the meanings of the various segments and definitions later in this lab. The section identified as the code segment will be used in CodeView, which displays the code segment in symbolic form. The data segment is likewise identified and displayed as data in symbolic form.
Example 4.1
    TITLE   EX 4-1 (EXE)
    ;Purpose: Adds 4 bytes of data
    STSEG   SEGMENT
            DB 32 DUP (?)
    STSEG   ENDS
    ;--------------------------
    DTSEG   SEGMENT
    FOUR_NO DB 12H,0B5H,6CH,78H
    SUM     DB ?
    DTSEG   ENDS
    ;----------------------------
    CDSEG   SEGMENT
    MAIN    PROC FAR
            ASSUME CS:CDSEG,DS:DTSEG,SS:STSEG
            MOV AX,DTSEG
            MOV DS,AX
            MOV BX,OFFSET FOUR_NO   ;set up BX as data ptr
            MOV AL,0                ;initialize AL
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item (BX=BX+1)
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item
            ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
            INC BX                  ;point to next item
            MOV SUM,AL              ;store result in SUM
            MOV AH,4CH              ;set up return
            INT 21H                 ;invoke interrupt
            INT 20H                 ;breakpoint, exit
    MAIN    ENDP
    CDSEG   ENDS
            END MAIN
Now let's configure the program for the desired assembly format. Use Options -> Projects Template -> Set Project Template from the pulldown menu. In this window, set the Runtime Support choice to NONE, because most programs don't require runtime support from a separate library such as C, C++, etc.
Select the DOS.exe entry to generate a DOS .exe (executable) file as the target for the assembler and linker. Once NONE and DOS.exe have been selected, choose OK at the bottom of the dialog box. Next, use Project -> Edit Project, select your .asm file from the list, and use the Add choice to add it to your project file list.
Now that the project is defined, select Options -> Build. This determines the type of program produced by the assembler and linker. Choose the DEBUG option in the build dialog. After debugging is complete, choose the release option for the final program.
Next, use Options -> Language Options -> MASM Options from the pulldown menu. In the popup window, deselect Warnings Treated as Errors. Then select <Set Debug Options...>. In the new popup window, select Generate Listing File from the Listing section if it is not already selected. With this option set, the build produces a .lst file in addition to the .exe file. The listing file shows the source and object code in one file.
Now that PWB has been configured, the project can be built. Select Project -> Build. You may ignore warnings about the stack unless the program uses more than 128 bytes of stack space. The final product is now in the form of a .exe file and a .lst file. Choose CANCEL to return to PWB. Choose File -> Open and view the .lst file.
How are the opcodes displayed in the .lst file? Do the source code and/or the comments appear in the file? Are the other segments, such as the stack or code segment, displayed?
Run this program if it is free from errors. If not, debug it using instructions from the following section.
Now that CodeView is configured, select Run -> CodeView. CodeView dynamically displays the contents of all registers and of the various memory locations. What is the logical address of the code segment? What are the contents of CS:IP? What is the logical address of the data? Step through the program with F10. Restart the program. Execute the entire program using F5.
What is the result of the addition? In which register(s) is the result stored?
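When checking a result like this by hand, remember that AL is only 8 bits wide, so a running sum wraps around at 100H. The following is a minimal Python sketch of that behaviour, not part of the lab toolchain. The function name and the sample values are hypothetical and deliberately differ from the data in Example 4.1. The returned carry models the carry flag after the final ADD only, since each ADD overwrites the flag.

```python
def add_bytes_8bit(values):
    """Accumulate byte values the way repeated ADD AL,[BX] would.

    Returns (al, carry), where carry reflects only the last addition.
    """
    al = 0
    carry = False
    for value in values:
        total = al + value
        carry = total > 0xFF   # carry out of bit 7 on this ADD
        al = total & 0xFF      # AL keeps only the low 8 bits
    return al, carry


# Hypothetical data: the second ADD overflows, but that carry is later overwritten.
al, carry = add_bytes_8bit([0x80, 0x90, 0x10, 0x20])
print(f"AL = {al:02X}H, carry after last ADD = {carry}")
```

A sum that exceeds FFH therefore cannot be read from AL alone; a wider accumulator or explicit carry handling would be needed to keep the full result.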
Segments
A segment definition follows a standard form:
    label   SEGMENT [options]
            ;statements belonging to the segment
    label   ENDS
The options field can be used to give the assembler information for organizing the segment, but it is not required. The label for ENDS must be the same as the label for SEGMENT.
C. Data Segments
The data segment is the portion of memory used to store static data. The code segment accesses the data through the labels given in the data directives of the data segment definition in the assembler language source file. The x86 supports various data types, and MASM assembler directives are used to allocate space and assign names to data values and/or locations.
ORG is a MASM directive that sets the offset address at which the following data begins (a directive is an instruction to the assembler program; it is NOT an x86 instruction). The number must end in H to indicate hexadecimal; otherwise the assembler will assume the number is decimal.
DB is the define byte directive, which is used to allocate memory in byte-size chunks. The assembler default is decimal; for hexadecimal the number must end with an H, and for binary the number must end with a B. DB is also the only directive used to define ASCII strings longer than 2 characters.
            ORG 0020H
    DATA1   DB 37                   ;decimal
    DATA2   DB 37H                  ;hexadecimal
    DATA3   DB 100101B              ;binary
    DATA4   DB 0110111B             ;binary
    DATA5   DB 'My name is Amy$'
Assemble the data above. Dump the contents of memory at the respective address. Observe that the data is stored starting at offset 0020H.
What is the logical address (and offset) for the values equivalent to those listed above in Example 3.1? How are the numbers represented: decimal, hex, or binary? What is stored in memory that corresponds to the string above?
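The suffix rules for numeric constants described above can be modelled in a few lines. This is a minimal Python sketch, assuming the default radix of 10 and handling only the H, B, and D suffixes. The function name is hypothetical, the sample constants are not the ones from the lab data, and the sketch is not how MASM itself is implemented.

```python
def parse_masm_literal(text: str) -> int:
    """Convert a MASM-style numeric constant to an integer.

    Only the H (hex), B (binary), and D (decimal) suffixes are handled;
    a constant with no suffix is treated as decimal.
    """
    token = text.strip().upper()
    if token.endswith("H"):
        return int(token[:-1], 16)
    if token.endswith("B"):
        return int(token[:-1], 2)
    if token.endswith("D"):
        return int(token[:-1], 10)
    return int(token, 10)


# Hypothetical constants for illustration.
for literal in ("10", "10H", "1010B", "0B5H"):
    print(f"{literal:>6} -> {parse_masm_literal(literal):02X}H")
```

Whatever radix is used in the source file, the value ends up as the same binary pattern in memory; a memory dump simply shows that pattern in hex.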
DUP is a MASM operator that repeats a given value a given number of times. Assemble the directive
    DATA6   DB 6 DUP (0FH)
at origin 0030H. What are the memory contents at that offset? What is an alternate way to duplicate 0FH?
DUP is also used to set aside or reserve space for variables. For example,
    DATA7   DB 32 DUP (?)           ;set aside 32 bytes
    DATA8   DW 32 DUP (?)           ;set aside 32 words
DW is used to define words, that is, to allocate memory 2 bytes at a time.
            ORG 0070H
    DATA9   DW 253FH                ;store 2 bytes
    DATA10  DW 7,6,5,4,3,2,1        ;store various data words
    DATA11  DW 8 DUP (?)            ;set aside 8 words
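The offsets that the assembler assigns to DATA9 through DATA11 follow from the ORG value and the size of each definition. The following is a minimal Python sketch of that bookkeeping, with a hypothetical function name and data layout; it only models a simple location counter and is not a description of MASM internals.

```python
# Bytes allocated per item by each data directive.
ITEM_SIZE = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}


def assign_offsets(origin, definitions):
    """Assign an offset to each label, starting at the ORG value.

    definitions is a list of (label, directive, item_count) tuples.
    Returns (offsets_by_label, next_free_offset).
    """
    offsets = {}
    location = origin
    for label, directive, item_count in definitions:
        offsets[label] = location
        location += ITEM_SIZE[directive] * item_count
    return offsets, location


# The DW block from the text: 1 word, 7 words, and 8 reserved words.
offsets, next_free = assign_offsets(
    0x0070,
    [("DATA9", "DW", 1), ("DATA10", "DW", 7), ("DATA11", "DW", 8)],
)
for label, offset in offsets.items():
    print(f"{label}: {offset:04X}H")
print(f"next free offset: {next_free:04X}H")
```

Comparing these computed offsets with the addresses shown in the .lst file is a quick way to confirm that a data segment is laid out as intended.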
If we use DW to store DATA9, then use DB as shown below
            ORG 0090H
    DATA12  DB 25H
    DATA13  DB 3FH
to store 253FH, will the memory appear the same? Why or why not?
EQU is used to define a constant but does not reserve memory storage for the value. As an example, consider the following directives.
            ORG 0060H
    VALUE   EQU 25                  ;sets a constant 25
            MOV CX, VALUE
Assemble the above example using EQU, then assemble the following. Check the value of the internal registers. Does CX appear differently in the two examples?
            ORG 0080H
    VALUE2  DB 25
            MOV CX, VALUE2
Equates also make changing constants throughout the program easier. The value can be changed in the EQU line, rather than at each instance in the program.
DD (define doubleword) is used to allocate memory for a doubleword (4 bytes). The data is converted to hex, then placed in the memory location. The low byte goes to the low address and the high byte goes to the high address (the x86 is a little-endian architecture).
DQ (define quadword) is used to allocate memory 8 bytes (4 words) in size. This directive stores up to 64 bits of data at a time.
            ORG 0080H
    DATA14  DQ 'HI'
    DATA15  DQ 7,6,5,4
    DATA16  DQ 65534H
What is the hexadecimal equivalent stored for HI? How many bytes are allocated for each character? Does that differ from the numbers in DATA15?
DT (define ten bytes) is useful for allocating memory for packed BCD numbers.
            ORG 0090H
    DATA17  DT 36768
    DATA18  DT 36768H
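The little-endian ordering described above for multi-byte data can be previewed before looking at a memory dump. This is a minimal Python sketch with a hypothetical helper name and sample values that are not taken from the lab data; it relies only on the standard int.to_bytes method.

```python
def little_endian_bytes(value: int, size: int) -> list:
    """Return the bytes of value in memory order, low address first."""
    return list(value.to_bytes(size, "little"))


def dump(value: int, size: int) -> str:
    """Format the stored bytes the way a memory dump would show them."""
    return " ".join(f"{b:02X}" for b in little_endian_bytes(value, size))


# Hypothetical values: a word (DW), a doubleword (DD), and a quadword (DQ).
print("DW 1234H      ->", dump(0x1234, 2))
print("DD 12345678H  ->", dump(0x12345678, 4))
print("DQ 1H         ->", dump(0x1, 8))
```

The low-order byte always appears first in the dump, which is why a multi-byte value can look "reversed" when read left to right.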
    SSEG    SEGMENT
            ...
    SSEG    ENDS
For now, make sure that any program you write has a stack segment.
E. Code Segments
The code segment contains the x86 instructions that make up your program. Example 4.2 shows the shell of a program (repeated here for convenience).
    SSEG    SEGMENT
            ...
    SSEG    ENDS
    ;
    DSEG    SEGMENT
            ; all data goes here
    DSEG    ENDS
    ;
    CSEG    SEGMENT 'CODE'
    MAIN    PROC FAR                ; program entry pt
            ASSUME CS:CSEG, DS:DSEG, SS:SSEG
            MOV AX, DSEG            ; bring in segment for data
            MOV DS, AX              ; assign the DS value
            ; place code here
            ;
            MOV AH, 4CH             ; set up to
            INT 21H                 ; return to DOS
            INT 20H
    MAIN    ENDP
    CSEG    ENDS
            END MAIN                ; argument for 'END' directive specifies
                                    ; program ENTRY point
The SEGMENT directive precedes the program entry point, which is defined by a procedure labeled MAIN. A procedure is a group of instructions designed to accomplish a specific function. A code segment is usually organized into several small procedures. Each procedure must begin with a PROC directive and is closed by an ENDP directive. The PROC directive may carry the option FAR or NEAR. The program entry point must be declared FAR. NEAR is used for procedures that lie within the current code segment. The next line contains an ASSUME statement, which associates segment registers with specific memory segments.
Write an assembly language program that includes a code segment named Cod_Seg, a data segment named Dat_Seg, and a stack segment named Sta_Seg. The data segment should have data items named BIG_DAT, SMAL_DAT, and SUM. The program should add the two values in BIG_DAT and SMAL_DAT and then store the result in SUM. What is actually assigned to the CS, DS, and SS registers? What is the value of SUM? Where did you find it, that is, what is its address? You MUST use MASM to assemble your code and produce a listing file.
The program above uses the .MODEL directive that specifies a small memory model. A memory model causes the assembler to make assumptions about the size and number of the program segments. The small memory model allows one code segment (<64K), one data segment (<64K), and one stack segment (<64K). There are other models such as medium (data <= 64K, multiple code segments, any size), compact (code <= 64K, multiple data segments, any size), large (multiple code, multiple data segments, any size), and flat (no segments, protected mode only, all 32-bit addresses). The memory model determines whether code and data addresses require only offset information, or both segment and offset information. The small memory model will be sufficient for EE 3724 programs.
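The difference between the 16-bit memory models comes down to whether a code or data address needs only an offset (a near address, 2 bytes) or a segment plus an offset (a far address, 4 bytes). The following is a minimal Python sketch that tabulates this for the four segmented models named above; the table layout and function name are hypothetical, and the flat model is left out because it does not use segment:offset pairs.

```python
NEAR = 2  # bytes: offset only
FAR = 4   # bytes: segment and offset

# (code address size, data address size) for each 16-bit memory model.
MEMORY_MODELS = {
    "small":   (NEAR, NEAR),
    "medium":  (FAR, NEAR),
    "compact": (NEAR, FAR),
    "large":   (FAR, FAR),
}


def needs_segment(model: str, kind: str) -> bool:
    """True when addresses of the given kind ('code' or 'data') are far."""
    if kind not in ("code", "data"):
        raise ValueError("kind must be 'code' or 'data'")
    code_size, data_size = MEMORY_MODELS[model]
    return (code_size if kind == "code" else data_size) == FAR


for model in MEMORY_MODELS:
    print(f"{model:8} code far: {needs_segment(model, 'code')!s:5} "
          f"data far: {needs_segment(model, 'data')}")
```

In the small model both entries are near, which is why a single code segment and a single data segment, each under 64K, are enough.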