Lec06 1
Assembly vs. machine language
So far we’ve been using assembly language.
— We assign names to operations (e.g., add) and operands (e.g., $t0).
— Branches and jumps use labels instead of actual addresses.
— Assemblers support many pseudo-instructions.
Programs must eventually be translated into machine language, a binary
format that can be stored in memory and decoded by the CPU.
MIPS machine language is designed to be easy to decode.
— Each MIPS instruction is the same length, 32 bits.
— There are only three different instruction formats, which are very
similar to each other.
Studying MIPS machine language will also reveal some restrictions in the
instruction set architecture, and how they can be overcome.
R-type format
Register-to-register arithmetic instructions use the R-type format.
op      rs      rt      rd      shamt   func
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
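As a rough illustration of how the six fields share one 32-bit word, here is a minimal C sketch. The function name encode_r_type is hypothetical, and the sketch assumes the usual MIPS layout with op in the most significant bits, following the left-to-right order of the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the six R-type fields into one 32-bit instruction word.
   Field widths follow the table: 6, 5, 5, 5, 5 and 6 bits. */
uint32_t encode_r_type(uint32_t op, uint32_t rs, uint32_t rt,
                       uint32_t rd, uint32_t shamt, uint32_t func)
{
    /* Each value must fit in its field. */
    assert(op < 64 && func < 64);
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32);

    return (op << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (shamt << 6) | func;
}
```

For instance, encode_r_type(0, 2, 4, 2, 0, 0x20) should give 0x00441020, the word for add $2,$2,$4 that appears in the decoding example later in these notes.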
About the registers
We have to encode register names as 5-bit numbers from 00000 to 11111.
— For example, $t8 is register $24, which is represented as 11000.
— The complete mapping is given on page A-23 in the book.
The number of registers available affects the instruction length.
— Each R-type instruction references 3 registers, which requires a total
of 15 bits in the instruction word.
— We can’t add more registers without either making instructions longer
than 32 bits, or shortening other fields like op and possibly reducing
the number of available operations.
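The name-to-number mapping can be sketched as a simple table lookup. The array and function names below are hypothetical; the register names and numbers are the conventional MIPS ones (for example, $t8 is register 24).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Conventional MIPS register names, indexed by register number (0..31). */
static const char *const REG_NAMES[32] = {
    "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
    "t0",   "t1", "t2", "t3", "t4", "t5", "t6", "t7",
    "s0",   "s1", "s2", "s3", "s4", "s5", "s6", "s7",
    "t8",   "t9", "k0", "k1", "gp", "sp", "fp", "ra"
};

/* Return the 5-bit register number for a name such as "t8" (written
   without the leading '$'), or -1 if the name is not recognized. */
int reg_number(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < 32; i++) {
        if (strcmp(name, REG_NAMES[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

Because the table has exactly 32 entries, every valid result fits in the 5-bit register fields; a 64-entry table would need 6 bits per field, or 18 bits for the three registers of an R-type instruction.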
I-type format
Load, store, branch and immediate instructions all use the I-type format.
op      rs      rt      address
6 bits  5 bits  5 bits  16 bits
For uniformity, op, rs and rt are in the same positions as in the R-format.
The meaning of the register fields depends on the exact instruction.
— rs is a source register: the base address for loads and stores, or an
operand for branch and immediate arithmetic instructions.
— rt is a source register for branches and stores, but a destination
register for loads and immediate arithmetic instructions.
The address is a 16-bit signed two’s-complement value.
— It can range from -32,768 to +32,767.
— But that’s not always enough!
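To make the 16-bit signed field concrete, the following C sketch packs an I-type word and recovers the signed value from the low 16 bits. The function names are hypothetical, and the bit positions assume the standard MIPS layout (op in the top 6 bits, the constant in the low 16).

```c
#include <assert.h>
#include <stdint.h>

/* Pack an I-type instruction. The constant must fit in 16 signed bits. */
uint32_t encode_i_type(uint32_t op, uint32_t rs, uint32_t rt, int32_t imm)
{
    assert(op < 64 && rs < 32 && rt < 32);
    assert(imm >= -32768 && imm <= 32767);

    /* Keep only the low 16 bits of the two's-complement constant. */
    return (op << 26) | (rs << 21) | (rt << 16) | ((uint32_t)imm & 0xFFFFu);
}

/* Interpret a 16-bit field as a signed two's-complement value. */
int32_t sign_extend16(uint32_t field)
{
    /* XOR with the sign bit, then subtract it: a portable sign extension. */
    return (int32_t)((field & 0xFFFFu) ^ 0x8000u) - 0x8000;
}
```

With these definitions, encode_i_type(8, 5, 5, -1) should produce 0x20A5FFFF, which is addi $5,$5,-1 from the decoding example later in these notes, and sign_extend16(0xFFFF) gives back -1.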
Larger constants
Larger constants can be loaded into a register 16 bits at a time.
— The load upper immediate instruction lui loads the highest 16 bits of a
register with a constant, and clears the lowest 16 bits to 0s.
— An immediate logical OR, ori, then sets the lower 16 bits.
To load the 32-bit value 0000 0000 0011 1101 0000 1001 0000 0000:
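The binary value above is 0x003D0900 in hexadecimal, so its upper half is 0x003D and its lower half is 0x0900. The C sketch below models the two-step load; the helper names are hypothetical, and the register $s0 mentioned in the comments is only an assumed choice for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of lui: put a 16-bit constant in the upper half, clear the lower half. */
uint32_t lui(uint32_t imm16)
{
    assert(imm16 <= 0xFFFFu);
    return imm16 << 16;
}

/* Model of ori: OR a zero-extended 16-bit constant into a register value. */
uint32_t ori(uint32_t reg, uint32_t imm16)
{
    assert(imm16 <= 0xFFFFu);
    return reg | imm16;
}

/* Build any 32-bit constant 16 bits at a time. */
uint32_t load_constant32(uint32_t value)
{
    uint32_t upper = value >> 16;       /* highest 16 bits */
    uint32_t lower = value & 0xFFFFu;   /* lowest 16 bits  */

    uint32_t reg = lui(upper);          /* e.g. lui $s0, 0x003D      */
    reg = ori(reg, lower);              /* e.g. ori $s0, $s0, 0x0900 */
    return reg;
}
```

Because lui clears the lower 16 bits, the OR in the second step simply drops the lower half into place.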
Loads and stores
The limited 16-bit constant can present problems for accesses to global
data.
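One common way around this limit, offered here as an assumption rather than something the surviving slide text spells out, is to load the upper part of the address with lui and let the 16-bit field of the load or store supply the rest. Since that field is sign-extended, the upper part has to be adjusted when the low half has its top bit set. The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t upper;   /* 16-bit constant for lui                  */
    int32_t  offset;  /* signed 16-bit offset for the load/store  */
} SplitAddress;

/* Split a 32-bit address so that (upper << 16) + offset == addr. */
SplitAddress split_address(uint32_t addr)
{
    SplitAddress s;

    /* Low 16 bits, interpreted as a signed value. */
    s.offset = (int32_t)((addr & 0xFFFFu) ^ 0x8000u) - 0x8000;

    /* Adding 0x8000 first bumps the upper half when the offset is negative. */
    s.upper = ((addr + 0x8000u) >> 16) & 0xFFFFu;

    /* Self-check: the two parts recombine to the original address. */
    assert((uint32_t)((s.upper << 16) + (uint32_t)s.offset) == addr);
    return s;
}
```

For the address 0x1000A000, this gives upper = 0x1001 and offset = -24576, because 0x10010000 - 0x6000 = 0x1000A000.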
Branches
For branch instructions, the constant field is not an address, but an offset
from the current program counter (PC) to the target address, counted in
instructions rather than bytes.
Since the branch target L is three instructions past the beq, the address
field would contain 3. The whole beq instruction would be stored as:
For some reason SPIM is off by one, so the code it produces would contain
an address of 4. (But SPIM branches still execute correctly.)
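The target computation can be sketched in C as follows. This assumes the standard MIPS convention that the offset is counted in words from the instruction after the branch (PC + 4); that assumption agrees with the decoding example later in these notes, where beq $8,$0,3 at 0x00400008 reaches the instruction at 0x00400018. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Signed word offset stored in the low 16 bits of a branch instruction. */
int32_t branch_offset(uint32_t instr)
{
    return (int32_t)((instr & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Address the branch goes to when it is taken. */
uint32_t branch_target(uint32_t branch_addr, uint32_t instr)
{
    assert(branch_addr % 4 == 0);   /* instructions are word-aligned */

    /* Convert the word offset to bytes; unsigned arithmetic wraps safely
       for negative offsets. */
    uint32_t byte_offset = (uint32_t)branch_offset(instr) << 2;
    return branch_addr + 4u + byte_offset;
}
```

A backward branch works the same way: an offset field of 0xFFFC (that is, -4) in a branch at 0x00400010 leads to 0x00400004.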
Larger branch constants
Empirical studies of real programs show that most branches go to targets
less than 32,767 instructions away—branches are mostly used in loops and
conditionals, and programmers are taught to make code bodies short.
If you do need to branch further, you can use a jump with a branch. For
example, if “Far” is very far away, then the effect of:
Again, the MIPS designers have taken care of the common case first.
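Whether a single branch can reach a given target comes down to a range check on the word offset. The sketch below uses a hypothetical function name and assumes, as is standard for MIPS, that the offset is measured from the instruction after the branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if target_addr can be encoded in the 16-bit signed offset field
   of a branch located at branch_addr. */
bool branch_reaches(uint32_t branch_addr, uint32_t target_addr)
{
    assert(branch_addr % 4 == 0 && target_addr % 4 == 0);

    /* Distance in instructions from the instruction after the branch. */
    int64_t words = ((int64_t)target_addr - ((int64_t)branch_addr + 4)) / 4;
    return words >= -32768 && words <= 32767;
}
```

When this check fails, a single branch cannot encode the target, which is the case the branch-plus-jump combination is meant for.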
J-type format
Finally, the jump instruction uses the J-type instruction format.
op      address
6 bits  26 bits
For even longer jumps, the jump register, or jr, instruction can be used.
jr $ra # Jump to 32-bit address in register $ra
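For the j instruction itself, the 26-bit field has to be expanded to a full 32-bit address. The sketch below follows the standard MIPS rule, which is not spelled out in the text above and is stated here as an assumption: shift the field left by two bits and take the top four bits from the address of the following instruction. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Full 32-bit target of a J-type instruction located at jump_addr. */
uint32_t jump_target(uint32_t jump_addr, uint32_t instr)
{
    assert(jump_addr % 4 == 0);     /* instructions are word-aligned */

    uint32_t field = instr & 0x03FFFFFFu;              /* low 26 bits        */
    uint32_t upper = (jump_addr + 4u) & 0xF0000000u;   /* top 4 bits of PC+4 */
    return upper | (field << 2);
}
```

In the decoding example later in these notes, j 0x100001 at address 0x00400014 lands on 0x00400004. Targets that differ in the top four address bits are out of reach for j, which is where jr comes in.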
Summary of Machine Language
Machine language is the binary representation of instructions:
— The format in which the machine actually executes them
MIPS machine language is designed to simplify processor
implementation
— Fixed length instructions
— 3 instruction encodings: R-type, I-type, and J-type
— Common operations fit in 1 instruction
• Uncommon (e.g., long immediates) require more than one
Decoding Machine Language
How do we convert 1s and 0s to assembly language and to C
code?
Machine language --> assembly --> C?
Decoding (1/7)
Decoding (2/7)
Decoding (3/7)
Select the opcode (first 6 bits) to determine the format:
000000 00000 00000 00010 00000 100101
000000 00000 00101 01000 00000 101010
000100 01000 00000 00000 00000 000011
000000 00010 00100 00010 00000 100000
001000 00101 00101 11111 11111 111111
000010 00000 10000 00000 00000 000001
Look at opcode: 0 means R-Format, 2 or 3 mean J-Format,
otherwise I-Format.
Formats of the six instructions, in order: R, R, I, R, I, J.
Next step: separation of fields.
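The opcode rule can be written directly as a small C function. The enum and function names are hypothetical; the rule itself is the one stated above.

```c
#include <stdint.h>

typedef enum { FORMAT_R, FORMAT_I, FORMAT_J } Format;

/* Classify an instruction word by its opcode (the top 6 bits). */
Format instruction_format(uint32_t instr)
{
    uint32_t opcode = instr >> 26;

    if (opcode == 0) {
        return FORMAT_R;            /* opcode 0: R-format      */
    }
    if (opcode == 2 || opcode == 3) {
        return FORMAT_J;            /* opcode 2 or 3: J-format */
    }
    return FORMAT_I;                /* everything else         */
}
```

Applied to the six words above (0x00001025, 0x0005402A, 0x11000003, 0x00441020, 0x20A5FFFF, 0x08100001 in hexadecimal), this yields R, R, I, R, I, J.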
Decoding (4/7)
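Fields separated based on format/opcode (values in decimal):
Format  op  rs  rt  rd  shamt  func
R       0   0   0   2   0      37
R       0   0   5   8   0      42
I       4   8   0   immediate = +3
R       0   2   4   2   0      32
I       8   5   5   immediate = -1
J       2   address = 1,048,577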
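Field separation is a matter of shifting and masking. The C sketch below extracts every field at once and leaves it to the caller to use only those that apply to the instruction's format; the struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t op;       /* bits 31..26, all formats                 */
    uint32_t rs;       /* bits 25..21, R and I formats             */
    uint32_t rt;       /* bits 20..16, R and I formats             */
    uint32_t rd;       /* bits 15..11, R format only               */
    uint32_t shamt;    /* bits 10..6,  R format only               */
    uint32_t func;     /* bits 5..0,   R format only               */
    int32_t  imm;      /* bits 15..0, sign-extended, I format only */
    uint32_t address;  /* bits 25..0,  J format only               */
} Fields;

/* Split a 32-bit instruction word into all possible fields. */
Fields split_fields(uint32_t instr)
{
    Fields f;
    f.op      = instr >> 26;
    f.rs      = (instr >> 21) & 0x1Fu;
    f.rt      = (instr >> 16) & 0x1Fu;
    f.rd      = (instr >> 11) & 0x1Fu;
    f.shamt   = (instr >> 6) & 0x1Fu;
    f.func    = instr & 0x3Fu;
    f.imm     = (int32_t)((instr & 0xFFFFu) ^ 0x8000u) - 0x8000;
    f.address = instr & 0x03FFFFFFu;
    return f;
}
```

For the second instruction, 0x0005402A, this gives rs = 0, rt = 5, rd = 8 and func = 42; for the fifth, 0x20A5FFFF, the immediate comes out as -1.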
Decoding (5/7)
MIPS Assembly (Part 1):
Address: Assembly instructions:
0x00400000 or $2,$0,$0
0x00400004 slt $8,$0,$5
0x00400008 beq $8,$0,3
0x0040000c add $2,$2,$4
0x00400010 addi $5,$5,-1
0x00400014 j 0x100001
Better solution: translate to more meaningful MIPS
instructions (fix the branch/jump and add labels, registers)
Decoding (6/7)
MIPS Assembly (Part 2):
      or   $v0,$0,$0
Loop: slt  $t0,$0,$a1
      beq  $t0,$0,Exit
      add  $v0,$v0,$a0
      addi $a1,$a1,-1
      j    Loop
Exit:
Next step: translate to C code (must be creative!)
Decoding (7/7)
Possible C code:
$v0: var1
$a0: var2
$a1: var3
var1 = 0;
while (var3 > 0) {
    var1 += var2;
    var3 -= 1;
}
Corresponding assembly, from Part 2:
      or   $v0,$0,$0
Loop: slt  $t0,$0,$a1
      beq  $t0,$0,Exit
      add  $v0,$v0,$a0
      addi $a1,$a1,-1
      j    Loop
Exit:
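To try the translation out, the loop can be wrapped in a function. The function name and the choice of int are assumptions; the variable names follow the mapping given above, with var2 and var3 as arguments (as $a0 and $a1 suggest) and var1 as the return value (as $v0 suggests).

```c
/* Same loop as the decoded program: adds var2 to var1, var3 times. */
int repeated_add(int var2, int var3)
{
    int var1 = 0;            /* or   $v0,$0,$0                   */
    while (var3 > 0) {       /* slt  $t0,$0,$a1 / beq ...,Exit   */
        var1 += var2;        /* add  $v0,$v0,$a0                 */
        var3 -= 1;           /* addi $a1,$a1,-1                  */
    }                        /* j    Loop                        */
    return var1;
}
```

For a non-negative var3 the result is var2 * var3, so repeated_add(3, 4) returns 12; for var3 of zero or less the loop body never runs and the result is 0.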