Computer Architecture: connected by bus
- Word: fixed-size chunk of bytes that a machine processes in one operation; the
word size also sets the max size of the virtual address space (w bits => 2^w bytes)
- Data type size: pg.69
- CPU:
+ registers
+ ALU
+ caches
- Memory (managed with virtual memory by the OS)
=======================================================
Data Representation: a data type larger than 1 byte is usually stored
contiguously, but the BYTE order differs between machines (the bit order inside
each byte is preserved)
254 as a 16-bit value = 0000 0000 1111 1110
- Little-Endian (most prevalent): least significant byte first -> 1111 1110 0000 0000
- Big-Endian: most significant byte first -> 0000 0000 1111 1110
- Integer: byte order depends on the system (LE or BE)
- Pointer: byte order and the address values themselves depend on the system
- Strings: byte-order-independent (one byte per char), end with \0
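NOTE: a minimal C sketch for checking the byte order on the current machine.
The helper names (is_little_endian, bytes_of_u16, show_bytes) are made up for
illustration; the idea is just to look at the bytes in address order.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void) {
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);      /* byte stored at the lowest address */
    return first == 1;
}

/* Copies the in-memory bytes of a 16-bit value into out[0..1]. */
void bytes_of_u16(uint16_t x, unsigned char out[2]) {
    memcpy(out, &x, sizeof x);
}

/* Prints len bytes starting at p, lowest address first. */
void show_bytes(const void *p, size_t len) {
    const unsigned char *b = p;
    for (size_t i = 0; i < len; i++)
        printf("%02x ", b[i]);
    printf("\n");
}
```

Calling show_bytes on a uint16_t holding 254 should print "fe 00" on a
little-endian machine and "00 fe" on a big-endian one.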
=======================================================
Boolean Algebra
- and (&), or (|), not (~), exclusive or - xor(^)
NOTE: operations are applied bitwise: 1100 & 0110 = 0100
- shifting:
x << y: shift x to the left by y positions (vacated bits filled w/ 0)
x >> y: shift x to the right by y positions
+ logical shift: filled w/ 0
+ arithmetic shift: filled w/ most significant bit
NOTE: shifting is independent of LE/BE
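NOTE: a small C sketch of the two right shifts on 8-bit patterns. shr_logical
and shr_arith are hypothetical helpers; the arithmetic version is built by hand
from the logical one and assumes 0 <= y <= 7.

```c
#include <stdint.h>

/* Logical right shift: x is unsigned, so vacated bits are filled with 0. */
uint8_t shr_logical(uint8_t x, unsigned y) {
    return (uint8_t)(x >> y);
}

/* Arithmetic right shift of an 8-bit pattern (assumes 0 <= y <= 7):
   vacated bits are filled with copies of the most significant bit. */
uint8_t shr_arith(uint8_t x, unsigned y) {
    uint8_t shifted = (uint8_t)(x >> y);
    if (x & 0x80)                               /* MSB set: fill with 1s */
        shifted |= (uint8_t)(0xFF << (8 - y));
    return shifted;
}
```

For x = 1001 0110 and y = 2, the logical shift gives 0010 0101 and the
arithmetic shift gives 1110 0101.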
Integer representation
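- Unsigned: w bits encode 0 .. 2^w - 1 (32 bits => 2^32 values)
- Signed:
+ sign-magnitude: most significant bit as sign bit (1 if negative) => ordinary
binary addition doesn't work
+ two's complement: most significant bit is still the sign bit, but it carries
weight -2^(w-1)
x = 1001 = -2^3 + 0 + 0 + 2^0 = -7 => ordinary addition works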
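NOTE: a C sketch that evaluates a w-bit pattern both ways. as_unsigned and
as_twos_complement are made-up names; w is assumed to be between 1 and 32.

```c
#include <stdint.h>

/* Value of the low w bits of `bits` read as unsigned (1 <= w <= 32). */
int64_t as_unsigned(uint32_t bits, int w) {
    uint64_t mask = (1ULL << w) - 1;
    return (int64_t)(bits & mask);
}

/* Same bits read as two's complement: the MSB has weight -2^(w-1),
   which is the unsigned value minus 2^w whenever the MSB is set. */
int64_t as_twos_complement(uint32_t bits, int w) {
    int64_t value = as_unsigned(bits, w);
    if (value & (1LL << (w - 1)))
        value -= (1LL << w);
    return value;
}
```

With w = 4, the pattern 1001 reads as 9 unsigned and as -7 in two's complement.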
Casting
- signed <-> unsigned: bit pattern is maintained, only interpreted differently
NOTE: mixing signed and unsigned in one expression implicitly casts the signed
operand to unsigned
- sign extension: pad all new left bits with the old most significant bit
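NOTE: a C sketch of the signed/unsigned pitfall. The function names are
hypothetical, and the comparison assumes int is 32 bits wide (true on common
platforms), so that int32_t and uint32_t are int and unsigned int.

```c
#include <stdint.h>

/* Same bit pattern, different interpretation: -1 becomes the largest
   unsigned value. This conversion is well defined in C. */
uint32_t reinterpret_as_unsigned(int32_t x) {
    return (uint32_t)x;
}

/* Mixing signed and unsigned: x is implicitly converted to unsigned
   before the comparison, so a negative x looks like a huge number. */
int mixed_less_than(int32_t x, uint32_t y) {
    return x < y;
}
```

So mixed_less_than(-1, 0u) is false even though -1 < 0 mathematically.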
Expanding and Truncating
- Expanding: cast to higher data type
+ signed: sign extension
+ unsigned: pad 0
- Truncating: cut the k most significant bits, then reinterpret the result
+ unsigned: mod operation (x mod 2^m, where m = number of bits kept)
+ signed: similar (may have different sign)
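NOTE: a C sketch of expanding and truncating with fixed-width types. The
helper names are made up. The signed truncation is done by hand because an
out-of-range cast to a signed type is implementation-defined in C.

```c
#include <stdint.h>

/* Expanding signed: the sign bit is copied into the new high bits. */
int32_t expand_signed(int16_t x) { return x; }

/* Expanding unsigned: the new high bits are 0. */
uint32_t expand_unsigned(uint16_t x) { return x; }

/* Truncating unsigned to 8 bits keeps x mod 2^8. */
uint8_t truncate_unsigned(uint32_t x) { return (uint8_t)x; }

/* Truncating to 8 bits, then reading the result as two's complement. */
int truncate_signed(int32_t x) {
    int low = (int)((uint32_t)x & 0xFFu);   /* keep the low 8 bits */
    return low >= 128 ? low - 256 : low;    /* MSB set => negative  */
}
```

For example, truncate_signed(200) gives -56: the value changes sign because
bit 7 becomes the new sign bit.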
=======================================================
Integer Arithmetic
ADDITION
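- if overflow (carry-out), discard the leftmost bit (both unsigned and two's
complement)
- unsigned behavior: wraps around, result = (x + y) mod 2^w
- signed behavior:
+ positive overflow: result = x + y - 2^w (turns negative)
+ negative overflow: result = x + y + 2^w (turns positive)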
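NOTE: a C sketch of the overflow cases for w = 32. All function names are
hypothetical. Signed overflow is undefined behavior in C, so the signed
versions avoid ever computing an overflowing int sum; tadd_wrapped models the
two's complement result with 64-bit arithmetic instead.

```c
#include <stdint.h>

/* Unsigned add: C defines the result as (x + y) mod 2^32. */
uint32_t uadd(uint32_t x, uint32_t y) { return x + y; }

/* Unsigned overflow happened iff the wrapped sum is below an operand. */
int uadd_overflows(uint32_t x, uint32_t y) { return uadd(x, y) < x; }

/* Signed overflow test that never performs the overflowing addition. */
int tadd_overflows(int32_t x, int32_t y) {
    if (y > 0) return x > INT32_MAX - y;    /* positive overflow */
    if (y < 0) return x < INT32_MIN - y;    /* negative overflow */
    return 0;
}

/* Two's complement result of x + y, modeled in 64 bits. */
int32_t tadd_wrapped(int32_t x, int32_t y) {
    int64_t sum = (int64_t)x + y;
    if (sum > INT32_MAX) sum -= (1LL << 32);    /* positive overflow */
    if (sum < INT32_MIN) sum += (1LL << 32);    /* negative overflow */
    return (int32_t)sum;
}
```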
MULTIPLICATION
- shift + addition, discard overflow; the low-order w bits of the product are
the same for unsigned and two's complement
123 x 12 = 123 x 10 + 123 x 2 = 1230 + 246
=> depends on constant multiplication
- constant multiplication: shift (by powers of 2) and add
x * 14 = x * 8 + x * 4 + x * 2
= (x<<3) + (x<<2) + (x<<1)
NOTE: C compilers often generate shift/add code when multiplying by a constant
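NOTE: the x * 14 decomposition written out in C. The function names are made
up; unsigned is used so that overflow simply wraps. The second variant uses
14 = 16 - 2, which needs one shift fewer.

```c
#include <stdint.h>

/* x * 14 = x*8 + x*4 + x*2 */
uint32_t times14_shift_add(uint32_t x) {
    return (x << 3) + (x << 2) + (x << 1);
}

/* x * 14 = x*16 - x*2 */
uint32_t times14_shift_sub(uint32_t x) {
    return (x << 4) - (x << 1);
}
```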
DIVISION
- subtraction + shift
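- divide by power of 2 (2^k): shift right by k
+ unsigned: logical shift, gives floor = round towards 0
+ signed: arithmetic shift, gives floor, so negative results round towards
-infty (unlike C's / operator)
=> want round towards 0 for negative x: add a bias before shifting
(x + (1<<k) - 1) >> k => adds 1 to the result unless x is a multiple of 2^k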
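NOTE: a C sketch of dividing by 2^k with shifts. udiv_pow2 and sdiv_pow2 are
hypothetical names. The signed version ASSUMES that >> on a negative int is an
arithmetic shift; C leaves this implementation-defined, though common compilers
such as gcc and clang behave this way. k is assumed to be in 0..30.

```c
#include <stdint.h>

/* Unsigned: logical shift right gives floor(x / 2^k). */
uint32_t udiv_pow2(uint32_t x, unsigned k) {
    return x >> k;
}

/* Signed, rounding toward 0 like C's `/`.
   ASSUMPTION: >> on a negative value is an arithmetic shift. */
int32_t sdiv_pow2(int32_t x, unsigned k) {
    /* Bias only negative x, so the floor from the shift ends up at 0-ward. */
    int32_t bias = (x < 0) ? (int32_t)((1u << k) - 1) : 0;
    return (x + bias) >> k;
}
```

For example, -13 / 4 is -3 in C; without the bias, -13 >> 2 would give -4.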
=======================================================
Floats
REPRESENTATION
- x = 4 + 2 + 1 + 1/2 + 1/4 + 1/8 +... (fixed-point) => inefficient for very
small/large numbers
- IEEE representation: contains 3 parts
sign bit (s) / exponent (e) / fraction (f, encodes the significand M; M in
[1.0, 2.0) when normalized)
x = (-1)^s * M * 2^E
CASE 1: NORMALIZED (most common)
M = 1 + f (implied leading 1 => one extra bit of precision for free)
E = e - Bias (easy to switch between cases)
f: 1/2 + 1/4 + ... (unsigned)
e: unsigned int (1 -> 2^k - 2), k = number of exponent bits
Bias: 2^(k-1) - 1 (127 for float, 1023 for double)
CASE 2: DENORMALIZED (0 - epsilon)
When exp is all 0
- Use case:
+ represent 0: e = f = 0
+ represent very small and evenly spaced numbers:
M = f
E = 1 - Bias
CASE 3: OTHER (infty/NaN)
When exp is all 1
- if frac == 0: infty
- else: NaN
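NOTE: a C sketch that splits a float into the three fields and picks the case.
It assumes float is the 32-bit IEEE single-precision format (1 sign bit, k = 8
exponent bits, 23 fraction bits). All names here are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t sign, exp, frac; } float_fields;

enum { FK_NORMALIZED, FK_DENORMALIZED, FK_INFINITY, FK_NAN };

/* Builds a float from a raw 32-bit pattern. */
float float_from_bits(uint32_t bits) {
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Splits a float into sign (1 bit), exp (8 bits) and frac (23 bits). */
float_fields split_float(float x) {
    uint32_t bits;
    float_fields f;
    memcpy(&bits, &x, sizeof bits);     /* copy the bytes, no conversion */
    f.sign = bits >> 31;
    f.exp  = (bits >> 23) & 0xFF;
    f.frac = bits & 0x7FFFFF;
    return f;
}

/* Picks the case from the exponent field, as in CASE 1-3 above. */
int classify(float x) {
    float_fields f = split_float(x);
    if (f.exp == 0)    return FK_DENORMALIZED;          /* includes 0 */
    if (f.exp == 0xFF) return f.frac == 0 ? FK_INFINITY : FK_NAN;
    return FK_NORMALIZED;
}
```

For 1.0f the fields are s = 0, e = 127, f = 0, so E = 127 - Bias = 0 and M = 1.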
=======================================================
MACHINE-LEVEL PROGRAMMING
=======================================================
DEFINITIONS
- Instruction set architecture (ISA): specification of the instructions, how to
use them,...
+ e.g.: ARM, Intel x86
- Microarchitecture: how an ISA is implemented: cache size, clock frequency,...
BASIC ANATOMY of CPU
- Program counter (PC): holds address of next instruction (if nothing changes,
increment linearly)
- Registers: hold active data/instruction
- Condition Codes/Flags: special state of recent calculation
- Fetch - Decode - Execute cycle
CODE TO ASSEMBLY
.c code -> .s (assembly) -> .o (object code) -> binary/executable
- Assembly: text level
- Object code: byte encoding of instructions, to be linked with other object
files, libraries,... by the linker
- Machine code: complete binary code that can be run by the CPU
gcc -Og (optimized for debugging) -S main.c => produces main.s
OPERATION TYPES
- move/copy to register/memory
- arithmetic/logical operator on data
- conditional branches/jump
GETTING STARTED
- assembler: translate assembly -> object
- disassembler: object code -> assembly, see all address + byte commands
+ in GDB
gdb <name>
disassemble <name>
x/14xb <name> //examine first 14 bytes
ASSEMBLY BASICS: registers, operands, move
Registers:
- Naming convention: %eax (32-bit), %rax (64-bit); named by number or by
historical purpose (counter, accumulator,...)
+ can access the lower parts of a register (%al, %ax, %eax are the low
8/16/32 bits of %rax)
Moving data:
movq <source>, <dest>
Operands: can't do mem to mem directly
- constant (immediate) to dest: $0x4 or $-69 (add $ before const)
- reg to reg: %rax, %rdx
- reg to mem: %rax, (%rdx) (parentheses dereference like a pointer *p, when
the reg is holding the address of sth)
NOTE: example: see swap()
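NOTE: a hypothetical reconstruction of a swap() in C (not necessarily the one
the notes refer to). The comments show the movq a compiler might emit for each
statement, assuming x86-64 with the System V calling convention (first argument
in %rdi, second in %rsi); actual output depends on compiler and flags.

```c
/* Swaps the two long values that xp and yp point to. */
void swap(long *xp, long *yp) {
    long t0 = *xp;      /* movq (%rdi), %rax    mem -> reg */
    long t1 = *yp;      /* movq (%rsi), %rdx    mem -> reg */
    *xp = t1;           /* movq %rdx, (%rdi)    reg -> mem */
    *yp = t0;           /* movq %rax, (%rsi)    reg -> mem */
}
```

Every statement goes through a register because movq can't copy mem to mem
directly.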