Digital Design and Computer Organization
Module -3
Machine Instructions and Programs
Chapter 2: Machine Instructions and Programs
Memory Locations and Addresses
Memory Operations
Instructions and Instruction Sequencing
Addressing Modes
Topics: 2.2, 2.3, 2.4, 2.5
Introduction
In this chapter we discuss how instructions are composed and study the ways in which sequences of instructions are brought from the memory into the processor and executed to perform a given task. The addressing methods that are commonly used for accessing operands in memory locations and processor registers are also presented. The programs in this chapter are specified at the assembly-language level, where machine instructions and operand addressing information are represented by symbolic names. A complete instruction set, including operand addressing methods, is often referred to as the instruction set architecture (ISA) of a processor. We will present enough examples to illustrate the capabilities of a typical instruction set.
2.1 Memory Locations and Addresses
The memory consists of many millions of storage cells, each of which can store a bit of information having the value 0 or 1. Because a single bit represents a very small amount of information, bits are seldom handled individually. The usual approach is to deal with them in groups of fixed size. For this purpose, the memory is organized so that a group of n bits can be stored or retrieved in a single, basic operation. Each group of n bits is referred to as a word of information, and n is called the word length. The memory of a computer can be schematically represented as a collection of words, as shown in Figure 2.1.
Data is usually accessed in n-bit groups, where n is the word length. Modern computers have word lengths that typically range from 16 to 64 bits. If the word length of a computer is 32 bits, a single word can store a 32-bit signed number or four ASCII-encoded characters, each occupying 8 bits, as shown in Figure 2.2. A unit of 8 bits is called a byte. Machine instructions may require one or more words for their representation.
Figure 2.2: Examples of encoded information in a 32-bit word
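The two encodings mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper names pack_signed and pack_chars are hypothetical, and the sketch assumes a two's-complement representation for the signed number and a most-significant-byte-first layout, neither of which is fixed by the text at this point.

```python
import struct

def pack_signed(value: int) -> bytes:
    """Encode a signed integer as one 32-bit word (two's complement assumed)."""
    return struct.pack(">i", value)

def pack_chars(text: str) -> bytes:
    """Encode exactly four ASCII characters as one 32-bit word, 8 bits each."""
    data = text.encode("ascii")
    if len(data) != 4:
        raise ValueError("a 32-bit word holds exactly four 8-bit characters")
    return data

word = pack_signed(-2)
print(word.hex())                       # fffffffe
print(struct.unpack(">i", word)[0])     # -2
print(pack_chars("ABCD").hex())         # 41424344
```

Both results occupy the same 32 bits; only the interpretation of the bit pattern differs.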
Accessing the memory to store or retrieve a single item of information, either a word or a byte, requires distinct names or addresses for each location. It is customary to use numbers from 0 to 2^k − 1, for some suitable value of k, as the addresses of successive locations in the memory. Thus, the memory can have up to 2^k addressable locations. The 2^k addresses constitute the address space of the computer. For example, a 24-bit address generates an address space of 2^24 (16,777,216) locations.
2.1.1 Byte Addressability
We have three basic information quantities to deal with: bit, byte, and word. A byte is always 8 bits, but the word length typically ranges from 16 to 64 bits. It is impractical to assign distinct addresses to individual bit locations in the memory. The most practical assignment is to have successive addresses refer to successive byte locations in the memory. This is the assignment used in most modern computers. The term byte-addressable memory is used for this assignment. Byte locations have addresses 0, 1, 2, .... Thus, if the word length of the machine is 32 bits, successive words are located at addresses 0, 4, 8, ..., with each word consisting of four bytes.
2.1.2 Big-Endian and Little-Endian Assignments
There are two ways that byte addresses can be assigned across words, as shown in Figure 2.3. The name big-endian is used when lower byte addresses are used for the more significant bytes (the leftmost bytes) of the word. The name little-endian is used for the opposite ordering, where the lower byte addresses are used for the less significant bytes (the rightmost bytes) of the word. The words "more significant" and "less significant" are used in relation to the weights (powers of 2) assigned to bits when the word represents a number. Both little-endian and big-endian assignments are used in commercial machines.
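The difference between the two assignments can be made concrete with a small sketch. It models a byte-addressable memory as a Python bytearray and stores the same 32-bit value at word address 4 under each convention. The helper names store_word and load_word and the value 0x12345678 are chosen only for illustration.

```python
def store_word(memory: bytearray, address: int, value: int, byteorder: str) -> None:
    """Write a 32-bit unsigned value into four consecutive byte locations."""
    memory[address:address + 4] = value.to_bytes(4, byteorder)

def load_word(memory: bytearray, address: int, byteorder: str) -> int:
    """Read four consecutive bytes back as one 32-bit unsigned value."""
    return int.from_bytes(memory[address:address + 4], byteorder)

value = 0x12345678
big = bytearray(8)       # two 32-bit words, byte addresses 0..7
little = bytearray(8)
store_word(big, 4, value, "big")
store_word(little, 4, value, "little")

for offset in range(4):
    print(f"address {4 + offset}: big-endian {big[4 + offset]:02x}, "
          f"little-endian {little[4 + offset]:02x}")
```

In the big-endian memory the most significant byte (0x12) sits at address 4, while in the little-endian memory address 4 holds the least significant byte (0x78). Reading the word back with the matching convention returns the original value in both cases.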
In both cases, byte addresses 0, 4, 8, ..., are taken as the addresses of successive words in the memory of a computer with a 32-bit word length.
Figure 2.3: Byte and word addressing
In addition to specifying the address ordering of bytes within a word, it is also necessary to specify the labeling of bits within a byte or a word. The most common convention is shown in Figure 2.2a. It is the most natural ordering for the encoding of numerical data. The same ordering is also used for labeling bits within a byte, that is, b7, b6, ..., b0, from left to right.
2.1.3 Word Alignment
Words are said to be aligned in memory if they begin at a byte address that is a multiple of the number of bytes in a word:
16-bit word: 0, 2, 4, ...
32-bit word: 0, 4, 8, ...
64-bit word: 0, 8, 16, ...
2.1.4 Accessing Numbers and Characters
• A number usually occupies one word, and can be accessed in the memory by specifying its word address.
• Similarly, individual characters can be accessed by their byte address.
• For programming convenience it is useful to have different ways of specifying addresses in program instructions.
2.2 Memory Operations
• Both program instructions and data operands are stored in the memory.
• To execute an instruction, the processor control circuits must cause the word (or words) containing the instruction to be transferred from the memory to the processor.
• Operands and results must also be moved between the memory and the processor.
• Two basic operations involving the memory are needed, namely, Read and Write.
• The Read operation transfers a copy of the contents of a specific memory location to the processor. The memory contents remain unchanged.
• To start a Read operation, the processor sends the address of the desired location to the memory and requests that its contents be read.
• The Write operation transfers an item of information from the processor to a specific memory location, overwriting the former contents of that location.
• To initiate a Write operation, the processor sends the address of the desired location to the memory, together with the data to be written into that location. The memory then uses the address and data to perform the write.
2.3 Instructions and Instruction Sequencing
The tasks carried out by a computer program consist of a sequence of small steps, such as adding two numbers, testing for a particular condition, reading a character from the keyboard, or sending a character to be displayed on a display screen. A computer must have instructions capable of performing four types of operations:
• Data transfers between the memory and the processor registers
• Arithmetic and logic operations on data
• Program sequencing and control
• I/O transfers
2.3.1 Register Transfer Notation
We need to describe the transfer of information from one location in a computer to another. Possible locations that may be involved in such transfers are memory locations, processor registers, or registers in the I/O subsystem. Most of the time, we identify such locations symbolically with convenient names. For example, names that represent the addresses of memory locations may be LOC, PLACE, A, or VAR2. Predefined names for the processor registers may be R0 or R5. Registers in the I/O subsystem may be identified by names such as DATAIN or OUTSTATUS. To describe the transfer of information, the contents of any location are denoted by placing square brackets around its name. Thus, the expression R2 ← [LOC] means that the contents of memory location LOC are transferred into processor register R2.
Example: Consider the operation that adds the contents of registers R2 and R3, and places their sum into register R4. This action is indicated as R4 ← [R2] + [R3]. This type of notation is known as Register Transfer Notation (RTN). Note that the right-hand side of an RTN expression always denotes a value, and the left-hand side is the name of a location where the value is to be placed, overwriting the old contents of that location.
In computer jargon, the words "transfer" and "move" are commonly used to mean "copy." Transferring data from a source location A to a destination location B means that the contents of location A are read and then written into location B.
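The two RTN expressions above can be mimicked with ordinary dictionaries, which makes the "copy" semantics visible. This is only a sketch: the initial values 25 and 17 are arbitrary, and the dictionaries stand in for the register file and the memory.

```python
registers = {"R2": 0, "R3": 0, "R4": 0}
memory = {"LOC": 25}          # arbitrary contents for location LOC

# R2 <- [LOC]: the right-hand side is a value, the left-hand side a location.
registers["R2"] = memory["LOC"]

registers["R3"] = 17          # arbitrary value so that the sum is visible

# R4 <- [R2] + [R3]
registers["R4"] = registers["R2"] + registers["R3"]

print(registers)   # {'R2': 25, 'R3': 17, 'R4': 42}
print(memory)      # {'LOC': 25}
```

The source locations LOC, R2, and R3 keep their contents; only the destinations R2 and then R4 are overwritten.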
In such a transfer, only the contents of the destination change.
2.3.2 Assembly-Language Notation
We need another type of notation to represent machine instructions and programs. For this, we use assembly language. For example, a generic instruction that causes the transfer described above, from memory location LOC to processor register R2, is specified by the statement Load R2, LOC. The contents of LOC are unchanged by the execution of this instruction, but the old contents of register R2 are overwritten. The name Load is appropriate for this instruction, because the contents read from a memory location are loaded into a processor register.
The second example of adding two numbers contained in processor registers R2 and R3 and placing their sum in R4 can be specified by the assembly-language statement Add R4, R2, R3. In this case, registers R2 and R3 hold the source operands, while R4 is the destination.
An instruction specifies an operation to be performed and the operands involved. In the above examples, we used the English words Load and Add to denote the required operations. In the assembly-language instructions of actual (commercial) processors, such operations are defined by using mnemonics, which are typically abbreviations of the words describing the operations. For example, the operation Load may be written as LD, while the operation Store, which transfers a word from a processor register to the memory, may be written as STR or ST. Assembly languages for different processors often use different mnemonics for a given operation. To avoid the need for details of a particular assembly language at this early stage, we will continue the presentation in this chapter by using English words rather than processor-specific mnemonics.
2.3.3 RISC and CISC Instruction Sets
One of the most important characteristics that distinguish different computers is the nature of their instructions. There are two fundamentally different approaches in the design of instruction sets for modern computers.
One popular approach is based on the premise that higher performance can be achieved if each instruction occupies exactly one word in memory, and all operands needed to execute a given arithmetic or logic operation specified by an instruction are already in processor registers. This approach is conducive to an implementation of the processing unit in which the various operations needed to process a sequence of instructions are performed in "pipelined" fashion to overlap activity and reduce total execution time of a program. The restriction that each instruction must fit into a single word reduces the complexity and the number of different types of instructions that may be included in the instruction set of a computer. Such computers are called Reduced Instruction Set Computers (RISC).
• An alternative to the RISC approach is to make use of more complex instructions which may span more than one word of memory, and which may specify more complicated operations.
• This approach was prevalent prior to the introduction of the RISC approach in the 1970s.
• Although the use of complex instructions was not originally identified by any particular label, computers based on this idea have been subsequently called Complex Instruction Set Computers (CISC).
2.3.4 Introduction to RISC Instruction Sets
Two key characteristics of RISC instruction sets are:
• Each instruction fits in a single word.
• A load/store architecture is used, in which memory operands are accessed only using Load and Store instructions, and all operands involved in an arithmetic or logic operation must either be in processor registers, or one of the operands may be given explicitly within the instruction word.
At the start of execution of a program, all instructions and data used in the program are stored in the memory of a computer. Processor registers do not contain valid operands at that time.
If operands are expected to be in processor registers before they can be used by an instruction, then it is necessary to first bring these operands into the registers. This task is done by Load instructions, which copy the contents of a memory location into a processor register. Load instructions are of the form
Load destination, source
or more specifically
Load processor_register, memory_location
The memory location can be specified in several ways. The term addressing modes is used to refer to the different ways in which this may be accomplished.
We say that Add is a three-operand, or a three-address, instruction of the form
Add destination, source1, source2
A Store instruction is of the form
Store source, destination
where the source is a processor register and the destination is a memory location. Observe that in the Store instruction the source and destination are specified in the reverse order from the Load instruction; this is a commonly used convention. The example program ends with the instructions
Add R3, R2, R3
Store R3, C
2.3.5 Instruction Execution and Straight-Line Sequencing
How is this program executed? The processor contains a register called the program counter (PC), which holds the address of the next instruction to be executed. To begin executing a program, the address of its first instruction (i in our example) must be placed into the PC. Then, the processor control circuits use the information in the PC to fetch and execute instructions, one at a time, in the order of increasing addresses. This is called straight-line sequencing. During the execution of each instruction, the PC is incremented by 4 to point to the next instruction. Thus, after the Store instruction at location i + 12 is executed, the PC contains the value i + 16, which is the address of the first instruction of the next program segment.
Executing a given instruction is a two-phase procedure.
• In the first phase, called instruction fetch, the instruction is fetched from the memory location whose address is in the PC and is placed in the instruction register (IR) in the processor.
• At the start of the second phase, called instruction execute, the instruction in IR is examined to determine which operation is to be performed.
• The specified operation is then performed by the processor.
• This involves a small number of steps such as fetching operands from the memory or from processor registers, performing an arithmetic or logic operation, and storing the result in the destination location.
• At some point during this two-phase procedure, the contents of the PC are advanced to point to the next instruction.
• When the execute phase of an instruction is completed, the PC contains the address of the next instruction, and a new instruction fetch phase can begin.
2.3.6 Branching
Consider the task of adding a list of n numbers. The program outlined in Figure 2.5 is a generalization of the program in Figure 2.4. The addresses of the memory locations containing the n numbers are symbolically given as NUM1, NUM2, ..., NUMn, and separate Load and Add instructions are used to add each number to the contents of register R2. After all the numbers have been added, the result is placed in memory location SUM.
Instead of using a long list of Load and Add instructions, as in Figure 2.5, it is possible to implement a program loop in which the instructions read the next number in the list and add it to the current sum. To add all numbers, the loop has to be executed as many times as there are numbers in the list.
The body of the loop is a straight-line sequence of instructions executed repeatedly. It starts at location LOOP and ends at the instruction Branch_if_[R2]>0. During each pass through this loop, the address of the next list entry is determined, and that entry is loaded into R5 and added to R3. For now, we concentrate on how to create and control a program loop. Assume that the number of entries in the list, n, is stored in memory location N, as shown.
Register R2 is used as a counter to determine the number of times the loop is executed. Hence, the contents of location N are loaded into register R2 at the beginning of the program. Then, within the body of the loop, the Subtract instruction reduces the contents of R2 by 1 on each pass through the loop.
We now introduce branch instructions. This type of instruction loads a new address into the program counter. As a result, the processor fetches and executes the instruction at this new address, called the branch target, instead of the instruction at the location that follows the branch instruction in sequential address order. A conditional branch instruction causes a branch only if a specified condition is satisfied. If the condition is not satisfied, the PC is incremented in the normal way, and the next instruction in sequential address order is fetched and executed.
In the program in Figure 2.6, the instruction Branch_if_[R2]>0 LOOP is a conditional branch instruction that causes a branch to location LOOP if the contents of register R2 are greater than zero. This means that the loop is repeated as long as there are entries in the list that are yet to be added to R3. At the end of the nth pass through the loop, the Subtract instruction produces a value of zero in R2, and, hence, branching does not occur. Instead, the Store instruction is fetched and executed. It moves the final result from R3 into memory location SUM.
A conditional branch can also be based on a comparison of two registers. For example, the instruction that implements the action Branch_if_[R4]>[R5] LOOP may be written in generic assembly language as Branch_greater_than R4, R5, LOOP or using an actual mnemonic as BGT R4, R5, LOOP. It compares the contents of registers R4 and R5, without changing the contents of either register. Then, it causes a branch to LOOP if the contents of R4 are greater than the contents of R5.
2.3.7 Generating Memory Addresses
Let us return to Figure 2.6. The purpose of the instruction block starting at LOOP is to add successive numbers from the list during each pass through the loop.
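The interplay between the program counter, the counter register R2, and the conditional branch can be traced with a small interpreter. This is a sketch under several assumptions, not the program of Figure 2.6 itself: the instruction addresses start at 0, the tuple encoding of instructions is invented for the illustration, a Clear instruction is assumed to zero R3 before the loop, and a hypothetical LoadNext pseudo-instruction stands in for "determine the address of the next number and load it", since address generation has not been explained yet.

```python
def run(program, memory, numbers, start=0):
    """Fetch and execute instructions until the PC leaves the program."""
    regs = {"R2": 0, "R3": 0, "R5": 0}
    pc = start
    cursor = 0      # stand-in for address generation (hypothetical)
    trace = []      # addresses of the instructions fetched, in order
    while pc in program:
        op, *args = program[pc]      # instruction fetch
        trace.append(pc)
        pc += 4                      # PC now points to the next instruction
        if op == "Load":
            regs[args[0]] = memory[args[1]]
        elif op == "Clear":
            regs[args[0]] = 0
        elif op == "LoadNext":
            regs[args[0]] = numbers[cursor]
            cursor += 1
        elif op == "Add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "Subtract":
            regs[args[0]] = regs[args[1]] - args[2]
        elif op == "Branch_if_positive":
            if regs[args[0]] > 0:
                pc = args[1]         # branch taken: load the target into the PC
        elif op == "Store":
            memory[args[1]] = regs[args[0]]
        else:
            raise ValueError(f"unknown operation {op}")
    return regs, trace

LOOP = 8
program = {
    0:  ("Load", "R2", "N"),               # counter <- [N]
    4:  ("Clear", "R3"),                   # running sum
    8:  ("LoadNext", "R5"),                # LOOP: next list entry into R5
    12: ("Add", "R3", "R3", "R5"),
    16: ("Subtract", "R2", "R2", 1),
    20: ("Branch_if_positive", "R2", LOOP),
    24: ("Store", "R3", "SUM"),
}
memory = {"N": 3, "SUM": 0}
regs, trace = run(program, memory, [10, 20, 30])
print(memory["SUM"])   # 60
print(trace)
```

The trace shows addresses 8 through 20 repeated three times. On the third pass the Subtract instruction leaves zero in R2, the branch is not taken, and the PC simply advances to the Store instruction at address 24.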
The Load instruction in that block must therefore refer to a different address during each pass. How are the addresses specified? The memory operand address cannot be given directly in a single Load instruction in the loop. Otherwise, it would need to be modified on each pass through the loop. As one possibility, suppose that a processor register, Ri, is used to hold the memory address of an operand. If it is initially loaded with the address NUM1 before the loop is entered and is then incremented by 4 on each pass through the loop, it can provide the needed capability. This situation, and many others like it, give rise to the need for flexible ways to specify the address of an operand. The instruction set of a computer typically provides a number of such methods, called addressing modes.
2.4 Addressing Modes
Programs are normally written in a high-level language, which enables the programmer to conveniently describe the operations to be performed on various data structures. When translating a high-level language program into assembly language, the compiler generates appropriate sequences of low-level instructions that implement the desired operations. The different ways for specifying the locations of instruction operands are known as addressing modes. In this section we present the basic addressing modes found in RISC-style processors. A summary is provided in Table 2.1, which also includes the assembler syntax we will use for each mode. The assembler syntax defines the way in which instructions and the addressing modes of their operands are specified.
2.4.1 Implementation of Variables and Constants
Variables are found in almost every computer program. In assembly language, a variable is represented by allocating a register or a memory location to hold its value. This value can be changed as needed using appropriate instructions.
The program in Figure 2.5 uses only two addressing modes to access variables. We access an operand by specifying the name of the register or the address of the memory location where the operand is located. The precise definitions of these two modes are:
Register mode: The operand is the contents of a processor register; the name of the register is given in the instruction.
Absolute mode: The operand is in a memory location; the address of this location is given explicitly in the instruction.
The instruction Add R4, R2, R3 uses the Register mode for all three operands. Registers R2 and R3 hold the two source operands, while R4 is the destination.
The Absolute mode can represent global variables in a program. A declaration such as
Integer NUM1, NUM2, SUM;
in a high-level language program will cause the compiler to allocate a memory location to each of the variables NUM1, NUM2, and SUM. Whenever they are referenced later in the program, the compiler can generate assembly-language instructions that use the Absolute mode to access these variables. The Absolute mode is used in the instruction Load R2, NUM1, which loads the value in the memory location NUM1 into register R2.
Constants representing data or addresses are also found in almost every computer program. Such constants can be represented in assembly language using the Immediate addressing mode.
Immediate mode: The operand is given explicitly in the instruction.
For example, the instruction Add R4, R6, 200_immediate adds the value 200 to the contents of register R6, and places the result into register R4. Using a subscript to denote the Immediate mode is not appropriate in assembly languages. A common convention is to use the number sign (#) in front of the value to indicate that this value is to be used as an immediate operand.
Hence, we write the instruction above in the form Add R4, R6, #200.
In the addressing modes that follow, the instruction does not give the operand or its address explicitly. Instead, it provides information from which the processor can derive an effective address (EA) for the operand when the instruction is executed.
2.4.2 Indirection and Pointers
The program in Figure 2.6 requires a capability for modifying the address of the memory operand during each pass through the loop. A good way to provide this capability is to use a processor register to hold the address of the operand. The contents of the register are then changed (incremented) during each pass to provide the address of the next number in the list that has to be accessed. The register acts as a pointer to the list, and we say that an item in the list is accessed indirectly by using the address in the register. The desired capability is provided by the indirect addressing mode.
Indirect mode: The effective address of the operand is the contents of a register that is specified in the instruction.
We denote indirection by placing the name of the register given in the instruction in parentheses, as illustrated in Figure 2.7 and Table 2.1. To execute the Load instruction in Figure 2.7, the processor uses the value B, which is in register R5, as the effective address of the operand. It requests a Read operation to fetch the contents of location B in the memory. The value from the memory is the desired operand, which the processor loads into register R2. Indirect addressing through a memory location is also possible, but it is found only in CISC-style processors.
As another example of pointers, consider the C-language statement A = *B; where B is a pointer variable and the '*' symbol is the operator for indirect accesses. This statement causes the contents of the memory location pointed to by B to be loaded into memory location A.
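The register-indirect Load described above, with R5 holding the address B, can be sketched as follows. The address 1000 for B, the memory contents, and the helper name load_indirect are all assumptions made for the illustration. The second half of the sketch shows the register acting as a pointer that is incremented by 4 to step through a list of 32-bit words.

```python
B = 1000                                    # hypothetical address of the operand
memory = {1000: 7, 1004: 11, 1008: 13}      # three 32-bit words, byte-addressed
registers = {"R2": 0, "R5": B}              # R5 holds the address B

def load_indirect(dest: str, pointer: str) -> None:
    """Load dest, (pointer): the effective address is the contents of pointer."""
    effective_address = registers[pointer]
    registers[dest] = memory[effective_address]   # Read operation at the EA

load_indirect("R2", "R5")
first = registers["R2"]
print(first)    # 7

# Using R5 as a pointer: increment it by 4 after each access.
total = 0
registers["R5"] = B
for _ in range(3):
    load_indirect("R2", "R5")
    total += registers["R2"]
    registers["R5"] += 4
print(total)    # 31
```

The Load instruction itself never changes; only the contents of R5 change between passes, which is what makes the mode suitable for loops.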
The C statement may be compiled into
Load R2, B
Load R3, (R2)
Store R3, A
Figure 2.7: Indirect addressing through registers
2.4.3 Indexing and Arrays
Index mode: The effective address of the operand is generated by adding a constant value X to the contents of a register, that is, EA = [Ri] + X.
In an assembly-language program, whenever a constant such as the value X is needed, it may be given either as an explicit number or as a symbolic name representing a numerical value. The assembler is responsible for associating each symbolic name with a specific numerical value. When the instruction is translated into machine code, the constant X is given as a part of the instruction and is restricted to fewer bits than the word length of the computer. Since X is a signed integer, it must be sign-extended to the register length before being added to the contents of the register.
Figure 2.9 illustrates two ways of using the Index mode. In Figure 2.9a, the index register, R5, contains the address of a memory location, and the value X defines an offset (also called a displacement) from this address to the location where the operand is found. An alternative use is illustrated in Figure 2.9b. Here, the constant X corresponds to a memory address, and the contents of the index register define the offset to the operand.
To see the usefulness of indexed addressing, consider a simple example involving a list of test scores for students taking a given course. Assume that the list of scores, beginning at location LIST, is structured as shown in Figure 2.10. A four-word memory block comprises a record that stores the relevant information for each student. Each record consists of the student's identification number (ID), followed by the scores the student earned on three tests. There are n students in the class, and the value n is stored in location N immediately in front of the list. The addresses given in the figure for the student IDs and test scores assume that the memory is byte addressable and that the word length is 32 bits. We should note that the list in Figure 2.10 represents a two-dimensional array having n rows and four columns.
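The record layout and the indexed access can be sketched as follows. Several details are assumptions made only for the illustration: the address 1000 for LIST, the sample IDs and scores, the helper names, and a 16-bit width for the constant field X (the text says only that X has fewer bits than the word length).

```python
WORD = 4                 # bytes per 32-bit word
RECORD = 4 * WORD        # one record: ID plus three test scores
MASK32 = 0xFFFFFFFF

def sign_extend(value: int, bits: int = 16) -> int:
    """Widen a signed `bits`-bit field to a 32-bit two's-complement pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK32

def build_memory(list_address: int, records) -> dict:
    """Place n at location N (just before LIST), followed by the records."""
    memory = {list_address - WORD: len(records)}
    address = list_address
    for record in records:
        for value in record:
            memory[address] = value
            address += WORD
    return memory

def load_indexed(memory: dict, x_field: int, base: int) -> int:
    """Index mode: EA = [Ri] + X, with X sign-extended before the addition."""
    effective_address = (base + sign_extend(x_field)) & MASK32
    return memory[effective_address]

LIST = 1000                                   # hypothetical address
records = [(101, 70, 80, 90), (102, 60, 75, 85)]
memory = build_memory(LIST, records)

r2 = LIST + 1 * RECORD                 # register pointing at the second record
print(load_indexed(memory, 0, r2))     # 102 (student ID)
print(load_indexed(memory, 8, r2))     # 75  (score on the second test)
print(load_indexed(memory, 0xFFFC, LIST))   # 2 (n, at offset -4 from LIST)
```

With the register holding the start of a record, the offsets 0, 4, 8, and 12 select the ID and the three test scores, and advancing the register by 16 moves to the next student.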
Each row of this array contains the entries for one student, and the columns give the IDs and test scores.
Suppose that we wish to compute the sum of all scores obtained on each of the tests and store these three sums in memory locations SUM1, SUM2, and SUM3. A possible program for this task is given in Figure 2.11.
THANK YOU