Introduction To The x86 Microprocessor
Prof. V. Kamakoti Digital Circuits And VLSI Laboratory Indian Institute of Technology, Madras Chennai - 600 036. http://vlsi.cs.iitm.ernet.in
Protected Mode
Memory Segmentation and Privilege Levels
Definition of a segment
Segment selectors
Local Descriptor Tables
Segment Aliasing, Overlapping
Privilege protection
Defining Privilege Levels
Changing Privilege levels
Organization
Basic Introduction: Structured Computer Organization
Memory Management: Architectural Support to Operating Systems and users
Process Management: Architectural Support to Operating Systems
Task Switching and Interrupt/Exception Handling
Legacy Management: Instruction set compatibility across evolving processor Architectures
Evolution of Instruction Sets: MMX Instructions
Compilers ask for features from the Architecture to induce more sophistication in the Programming Languages
Compiled code / Assembly code: Advanced Addressing modes, Sophisticated Instruction set
Support for Memory Management and Task Management
Multiuser OS: Protection, Virtual Memory, Context Switching
Computer Architecture
Microprogramming level
Intel
Memory Management
Multi User Operating Systems
Ease of Programming
Process Mobility in the Address Space
Multiprocess Context switching
Protection across Processes
Inter process protection
Intra process protection: Separation of Code, Data and Stack
Virtual Memory
Ensured by: Paging
Main Memory
if (j>k) max = j else max = k
Code_Segment:
mov EAX, [0]
mov EBX, [4]
cmp EAX,EBX
jle 0x7 //Label_1
mov [8], EAX
jmp 0x5 //Label_2
Label_1: mov [8], EBX
Label_2: .
By default, every memory data access adds the value stored in the Data Segment register to the address it uses.
Code and Data segments are separate and both assumed to start from 0
0000
Data Segment:
0: // Allocated for j
4: // Allocated for k
8: // Allocated for max
With the data segment based at 2100: Address of j: 2100, Address of k: 2104, Address of max: 2108
Vacant Space
2100
Ease Of Programming
2500
A new process needs a segment of size 260. The space is available but not contiguous.
0000
Data Segment:
0: // Allocated for j
After the segment is moved from 2100 to 2300:
Address of j: 2100 becomes 2300
Address of k: 2104 becomes 2304
Address of max: 2108 becomes 2308
2100
2160
Vacant
2300
Process Mobility
2500
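Process mobility can be illustrated with a minimal sketch. It assumes a toy memory modelled as a Python dict; the helper names `run_max` and `relocate` are hypothetical and do not come from the slides. The same code, using only the offsets 0, 4 and 8, keeps working after the data segment is moved from 2100 to 2300, because only the segment base changes.

```python
# Toy model: memory is a dict from absolute address to value.
# run_max and relocate are illustrative names, not part of the slides.

def run_max(memory, ds_base):
    """Emulate: if (j > k) max = j else max = k, using DS-relative offsets."""
    eax = memory[ds_base + 0]          # mov EAX, [0]  -> j
    ebx = memory[ds_base + 4]          # mov EBX, [4]  -> k
    if eax > ebx:                      # cmp EAX, EBX / jle Label_1
        memory[ds_base + 8] = eax      # mov [8], EAX
    else:
        memory[ds_base + 8] = ebx      # Label_1: mov [8], EBX
    return memory[ds_base + 8]

def relocate(memory, old_base, new_base, size):
    """Move a segment to a new base; the code itself is left untouched."""
    for offset in range(size):
        if old_base + offset in memory:
            memory[new_base + offset] = memory.pop(old_base + offset)
    return new_base

memory = {2100: 7, 2104: 3}            # j = 7, k = 3, data segment at 2100
before = run_max(memory, ds_base=2100)
new_ds = relocate(memory, 2100, 2300, size=12)
after = run_max(memory, ds_base=new_ds)
print(before, after)
```

Only the value loaded into the data segment register differs between the two runs; no address inside the code had to be patched.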
R8D-R15D (32-bit counterparts), R8W-R15W (16-bit counterparts)
ST0-ST7: 80-bit floating point
MM0-MM7: 64-bit multimedia (MMX)
XMM0-XMM7: 128-bit registers used for floating point and packed integer arithmetic
Segment Registers
Multiple Segments
The segment register can change its value to point to different segments at different times.
The x86 architecture provides additional segment registers to access multiple data segments at the same time: DS, ES, FS and GS.
In addition, x86 supports a separate Stack Segment register (SS) and a Code Segment register (CS).
By default, a segment register is fixed for every instruction, for all the memory accesses performed by it. For example, all data accessed by the MOV instruction take DS as the default segment register.
A segment override prefix is attached to an instruction to change the segment register it uses for memory data access.
0000
mov [10], eax : moves the contents of the eax register to memory location 0510. Opcode: 0x89 0x05 0x10
mov [ES:10], eax : moves the contents of the eax register to memory location 3510. Opcode: 0x26 0x89 0x05 0x10
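The effect of the override prefix can be sketched as follows. This is a minimal model, assuming the segment bases DS = 0500 and ES = 3500 from the example and reading all numbers as hexadecimal; `effective_address` is a hypothetical helper name.

```python
# Segment bases as used in the example; values are read as hexadecimal.
SEGMENT_BASE = {"DS": 0x0500, "ES": 0x3500}

def effective_address(offset, override=None):
    """Return base + offset; data accesses use DS unless a prefix overrides it."""
    segment = override or "DS"
    return SEGMENT_BASE[segment] + offset

print(hex(effective_address(0x10)))         # mov [10], eax
print(hex(effective_address(0x10, "ES")))   # mov [ES:10], eax
```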
[Figure: main memory map showing the segments pointed to by DS, CS, SS and ES, at bases 0500, 1500, 2500 and 3500]
Multiple Segments
[Figure: main memory holding the CS, DS and SS segments of Process 1 and Process 2; the segment registers point to the segments of the process in execution]
Other Registers
EFLAGS 32 Bit Register
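Bit layout (bit number in parentheses): VM (17), RF (16), NT (14), IOPL (13-12), OF (11), DF (10), IF (9), TF (8), SF (7), ZF (6), AF (4), PF (2), CF (0)
AC (18), VIF (19), VIP (20), ID (21)
Bits 1, 3, 5, 15 and 22-31 are RESERVED.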
CF Carry Flag
Set by arithmetic instructions that generate a carry or borrow. It can also be set, cleared and inverted with the STC, CLC and CMC instructions respectively.
PF Parity Flag
Set by most instructions if the least significant eight bits of the destination operand contain an even number of 1 bits.
AF Auxiliary Carry Flag
Set on a carry out of, or a borrow into, the least significant nibble of the least significant byte. Aids BCD arithmetic.
ZF Zero Flag
Set by most instructions if the result of the arithmetic operation is zero.
TF Trace Flag
When set, it allows single-stepping through programs: the processor executes exactly one instruction and then generates an internal exception 1 (debug exception).
IF Interrupt Flag
When set, the processor recognizes external hardware interrupts on the INTR pin. Clearing it has no effect on the NMI (external non-maskable interrupt) pin or on internally generated faults, exceptions, traps etc. This flag is set and cleared using the STI and CLI instructions respectively.
DF Direction Flag
Specifically for string instructions. DF = 1 decrements ESI and EDI, while DF = 0 increments them. Set and cleared by the STD and CLD instructions respectively.
OF Overflow Flag
Most arithmetic instructions set this flag to indicate that the result was at least 1 bit too large to fit in the destination.
IOPL I/O Privilege Level
For protected mode operations, indicates the privilege level, 0 to 3, at which your code must be running in order to execute any I/O-related instructions.
NT Nested Task Flag
When set, it indicates that one system task has invoked another through a CALL instruction as opposed to a JMP. For multitasking this can be manipulated to our advantage.
RF Resume Flag
It is related to the debug registers DR6 and DR7. By setting this, you can selectively mask some exceptions while you are debugging code.
VM Virtual 8086 Mode Flag
When it is set, the x86 processor is basically converted into a high-speed 8086 processor.
AC (bit 18) Alignment check flag: Set this flag and the AM bit in the CR0 register to enable alignment checking of memory references; clear the AC flag and/or the AM bit to disable alignment checking.
VIF (bit 19) Virtual interrupt flag: Virtual image of the IF flag. Used in conjunction with the VIP flag. (To use this flag and the VIP flag, the virtual mode extensions are enabled by setting the VME flag in control register CR4.)
VIP (bit 20) Virtual interrupt pending flag: Set to indicate that an interrupt is pending; clear when no interrupt is pending. (Software sets and clears this flag; the processor only reads it.) Used in conjunction with the VIF flag.
ID (bit 21) Identification flag: The ability of a program to set or clear this flag indicates support for the CPUID instruction.
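As an illustration of the EFLAGS layout, the following sketch decodes a raw 32-bit value into flag names. The bit positions are the standard x86 ones; the function name `decode_eflags` and the sample value are assumptions made for the example.

```python
# Bit positions of the single-bit EFLAGS flags (IOPL is a 2-bit field, bits 12-13).
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8, "IF": 9,
    "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17,
    "AC": 18, "VIF": 19, "VIP": 20, "ID": 21,
}

def decode_eflags(value):
    """Return (set of flag names that are 1, IOPL as an integer 0..3)."""
    flags = {name for name, bit in EFLAGS_BITS.items() if (value >> bit) & 1}
    iopl = (value >> 12) & 0b11
    return flags, iopl

# Sample value chosen for illustration: PF, ZF and IF set, IOPL = 3.
flags, iopl = decode_eflags(0x00003246)
print(sorted(flags), iopl)
```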
Control registers
CR2: Read-only register that holds the last 32-bit linear address that caused a page fault.
CR3: Stores the physical address of the page directory (Page Directory Base Register). The paging tables must be 4 KB aligned, so the 12 least significant bits are not stored and are ignored.
Debug registers: DR0, DR1, DR2, DR3, DR6, DR7
DR0-DR3: Can hold four linear address breakpoints, so that if the processor generates one of these addresses a debug exception (Interrupt 1) is raised.
DR6: Debug status register indicating the circumstances that may have caused the last debug fault.
DR7: Debug control register. By filling in the various fields of this register, you can control the operation of the four linear address breakpoints.
Test registers: Used to perform confidence checking on the paging MMU's Translation Lookaside Buffer (TLB).
Answers
1. Eight GPRs
2. Segmentation
3. Three Features:
Code Mobility
Logically every segment can start with zero
Inter and Intra process protection ensuring data integrity
Learnt so far
Intel Memory Management fundamentals: motivation from a Computer Organization standpoint
Intel Register set: General Purpose Registers, Segment registers and system registers
x86 modes of operations
The mov will store the content of EAX in 0x10040 + 0x1000 = 0x11040
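[Figure: logical-to-linear address translation. The SELECTOR picks a Segment Descriptor from the Descriptor Table; the descriptor's Base Address is added to the OFFSET to give the Linear Address.]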
A process always executes from Code segment. It should not execute by accessing from adjoining Data or stack area or any other code area too. A stack should not overgrow into adjoining segments
[Figure: adjoining CS, ES and SS segments in memory, with boundaries marked at 500, 1000, 1500 and 2000]
Every segment is given a start address and a limit. The architecture checks that the limit is not exceeded.
jmp CS:501 //This is a violation as the limit is 500
jmp CS:250 //This is fine
mov [ES:498], EAX //This is a violation!!!
mov [ES:498], AX //This is fine
PUSH EAX //Let SP be 2, it is a violation
PUSH AX //Let SP be 2, it is fine
POP EAX //Let SP be 498, it is a violation
POP AX //Let SP be 498, it is fine
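The limit check can be sketched as below. This is a simplified model, assuming the limit is the last valid offset (so offset 500 is legal and 501 is not) and that an access of `size` bytes touches every byte from `offset` to `offset + size - 1`; the helper names are hypothetical.

```python
def within_limit(offset, size, limit):
    """True if every byte offset .. offset+size-1 lies at or below the limit."""
    return offset >= 0 and offset + size - 1 <= limit

def push_ok(sp, size):
    """A push decrements SP by size first; SP must not go below 0."""
    return sp - size >= 0

LIMIT = 500
print(within_limit(501, 1, LIMIT))   # jmp CS:501        -> violation
print(within_limit(250, 1, LIMIT))   # jmp CS:250        -> fine
print(within_limit(498, 4, LIMIT))   # mov [ES:498], EAX -> violation
print(within_limit(498, 2, LIMIT))   # mov [ES:498], AX  -> fine
print(push_ok(2, 4))                 # PUSH EAX, SP = 2  -> violation
print(push_ok(2, 2))                 # PUSH AX,  SP = 2  -> fine
```

A 4-byte access at offset 498 reaches byte 501 and fails, while a 2-byte access at the same offset stays inside the segment.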
Process 1 should be prevented from loading CS in such a way that it can access the code of Process 2. The same holds for DS, SS, ES, FS and GS.
[Figure: the CS, DS and SS registers and the CS, DS and SS segments of Process 1 and Process 2 in memory]
Privilege levels: 0 is the highest privilege, 3 is the lowest privilege
Interprocess Protection
Protection Implementation
Every segment is associated with a descriptor stored in a descriptor table. The privilege level of any segment is stored in its descriptor. The descriptor table is maintained in memory and the starting location of the table is pointed to by a Descriptor Table Register (DTR). The segment register stores an offset into this table.
Structure of a Descriptor
A segment register load such as mov DS, 0x10 is successful if and only if the descriptor stored at offset 0x10 in the descriptor table has a privilege level numerically greater than or equal to the CPL. A process with CPL = 3 cannot load a segment descriptor with privilege level <= 2, and hence cannot access those segments.
jmp 0x20:0x1000 loads CS with 0x20, provided the descriptor stored at offset 0x20 has a privilege level numerically greater than or equal to the CPL.
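The rule for loading a data segment register can be sketched as follows. This models only the simplified rule stated above (descriptor privilege level numerically >= CPL); real processors also take the selector's RPL into account. The table contents, `ProtectionFault` and `load_data_segment` are illustrative assumptions.

```python
class ProtectionFault(Exception):
    """Raised when a segment load violates the privilege rule."""

def load_data_segment(descriptor_table, table_offset, cpl):
    """Return the descriptor at table_offset if its DPL >= CPL, else fault."""
    descriptor = descriptor_table[table_offset]
    if descriptor["dpl"] < cpl:
        raise ProtectionFault(
            "CPL %d may not load a descriptor with DPL %d" % (cpl, descriptor["dpl"])
        )
    return descriptor

# Hypothetical descriptor table keyed by byte offset.
table = {
    0x10: {"base": 0x1000, "dpl": 3},   # user-level data segment
    0x18: {"base": 0x8000, "dpl": 0},   # most privileged data segment
}

print(load_data_segment(table, 0x10, cpl=3)["base"])   # allowed
try:
    load_data_segment(table, 0x18, cpl=3)              # refused
except ProtectionFault as fault:
    print("fault:", fault)
```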
Numerically higher to lower privilege levels: using CALL gates, useful for system calls.
Any privilege level to any other privilege level: using a task switch.
Descriptor Tables
There are two descriptor tables: the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT).
The global descriptor table's base address is stored in GDTR.
The local descriptor table's base address is stored in LDTR.
The two privileged instructions LGDT and LLDT load the GDTR and LDTR.
Structure of a Selector
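Bits 15-3: descriptor index; bit 2: TI (table indicator); bits 1-0: remaining low bits
Since segment descriptors are 8 bytes each, the last three bits of a descriptor's offset in the table are always zero. The selector reuses these three bits, and one of them (TI) selects between the GDT and the LDT.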
Two processes, each with PL = 3, should be allotted segments such that neither can access the segments of the other.
[Figure: GDTR points to the GDT; all descriptors in the GDT have PL = 0, 1 or 2. Each process has its own LDTR value pointing to a per-process LDT.]
If a process is to access memory at all, it has to use the descriptors in its own LDT only, and it cannot change the LDTR/LDT/GDTR/GDT contents, as they are maintained in a higher privileged memory area.
[Figure: each segment register (CS, SS, DS, ES, FS, GS) has a visible part, which holds the segment selector, and a hidden part]
Be Careful
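Logical Address: add [DS:20],eax with DS = 0x10 and offset 20
[Figure: the selector 0x10 picks a descriptor from the Descriptor Table; its Base Address plus the offset 20 gives the Linear Address 120]
Changing Base
If the base address in the descriptor table is changed after DS was loaded, the linear address will still be 120. mov DS,0x10 has to be executed again to get the answer as 220, as this would update the hidden part.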
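The pitfall can be reproduced with a small model of a segment register whose hidden part caches the base at load time. The base values 100 and 200 are inferred from the linear addresses 120 and 220 in the example; the class and method names are hypothetical.

```python
class SegmentRegister:
    """Toy segment register: visible selector plus hidden cached base."""

    def __init__(self):
        self.selector = None      # visible part
        self.cached_base = None   # hidden part, filled only on load

    def load(self, selector, descriptor_table):
        """Model of mov DS, selector: copy the base into the hidden part."""
        self.selector = selector
        self.cached_base = descriptor_table[selector]["base"]

    def linear(self, offset):
        """Address translation uses the hidden part, not the table."""
        return self.cached_base + offset

table = {0x10: {"base": 100}}
ds = SegmentRegister()
ds.load(0x10, table)
first = ds.linear(20)          # base 100 + offset 20

table[0x10]["base"] = 200      # descriptor changed in memory
stale = ds.linear(20)          # hidden part still holds the old base

ds.load(0x10, table)           # reload DS to refresh the hidden part
fresh = ds.linear(20)
print(first, stale, fresh)
```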
Paging fundamentals
Each page is 4096 bytes.
Physical RAM has page frames (like photo frames), each of which is also 4096 bytes.
A page is copied into a page frame when needed, and removed to accommodate some other page.
By this, 4 GB of code can run on 128 MB of physical memory.
This is also called demand paging.
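[Figure: the 32-bit linear address is split into a 10-bit DIRECTORY index, a 10-bit TABLE index and a 12-bit OFFSET. The CR3 register locates the PAGE DIRECTORY; the directory entry locates a PAGE TABLE (1024 entries of 4 bytes each, 4 KB in total); the page table entry locates the PAGE FRAME; the frame address plus the offset gives the physical address.]
If all 20 bits were used for single-level paging, the page table alone would be 4 MB, which is inefficient. Hence two-level paging.
Page tables are developed on demand.
TLBs are used to improve performance.
A dirty bit is accommodated in each page entry.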
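The two-level lookup can be sketched as follows. It assumes the 10/10/12 split of the linear address and models the page directory and page tables as nested Python dicts; the frame address 0x0007F000 and the function names are made up for illustration.

```python
PAGE_SIZE = 4096

def split_linear(addr):
    """Split a 32-bit linear address into (directory, table, offset)."""
    directory = (addr >> 22) & 0x3FF   # top 10 bits
    table = (addr >> 12) & 0x3FF       # middle 10 bits
    offset = addr & 0xFFF              # low 12 bits
    return directory, table, offset

def translate(addr, page_directory):
    """Walk directory -> page table -> frame; raise on a missing mapping."""
    d, t, off = split_linear(addr)
    page_table = page_directory.get(d)
    if page_table is None or t not in page_table:
        raise LookupError("page fault at linear address %#x" % addr)
    return page_table[t] + off

# Only one page table exists; the others would be built on demand.
page_directory = {1: {2: 0x0007F000}}
linear = (1 << 22) | (2 << 12) | 0x345
print(hex(linear), hex(translate(linear, page_directory)))

# Size of a hypothetical single-level table: 2**20 entries of 4 bytes.
print((1 << 20) * 4)
```

The single-level figure printed at the end is 4194304 bytes, the 4 MB quoted in the text, whereas the two-level scheme only needs the 4 KB directory plus the page tables actually in use.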
Task Switching
There are different types of descriptors in a descriptor table. One of them is the task state segment (TSS) descriptor. If jmp 0x10:<dontcare> is executed and 0x10 points to a TSS descriptor, then the current process context is saved and the new process pointed to by the TSS descriptor is loaded: a perfect context switch. A TSS descriptor can reside only in the GDT.
Task Switching
Every process has an associated Task State Segment, whose starting point is stored in the Task register. A task switch happens due to a jmp or call instruction whose segment selector points to a Task state segment descriptor, which in turn points to the base of a new task state segment
Interrupt Handling
The processor generates interrupts that index into an Interrupt Descriptor Table (IDT), whose base is stored in IDTR and loaded using the privileged instruction LIDT. The descriptors in the IDT can be:
Interrupt gate: the ISR is handled like a normal subroutine call; it uses the interrupted process's stack to save EIP and CS (and SS and ESP in case of a stack switch, where the new stack is obtained from the TSS).
Task gate: the ISR is handled as a task switch. Needed for stack faults at CPL = 0 and for double faults.
Interrupt Handling
The processor handles a total of 256 interrupts (0-255)
0-31 are used by the machine or reserved
32-255 are user definable
0: Divide error, goes to the first descriptor in the IDT
1: Debug
8: Double Fault
12: Stack Segment Fault
13: General Protection Fault
14: Page Fault
Legacy Issues
16-bit code in a 32-bit architecture
Address override prefix: selects 16-bit or 32-bit addresses in a 32-bit or 16-bit code segment.
Operand override prefix: selects 16-bit or 32-bit operands in a 32-bit or 16-bit code segment. The opcode is the same for, say, add EAX,EBX and add AX,BX; the two are distinguished by the operand override prefix.
The D flag in the code segment descriptor gives the default size of the code segment, which the prefixes above override.
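How the D flag and the operand override prefix combine can be sketched as below. The prefix byte 0x66 and the encoding 01 D8 for the register-to-register add are standard x86 encodings; the function name `operand_size` is hypothetical and the model ignores every other prefix.

```python
OPERAND_SIZE_PREFIX = 0x66   # operand override prefix byte

def operand_size(code_bytes, d_flag):
    """Return the operand size in bits for an instruction.

    d_flag is the D bit of the code segment descriptor (1 = 32-bit default).
    A leading 0x66 prefix toggles the default size for this instruction.
    """
    default = 32 if d_flag else 16
    if code_bytes and code_bytes[0] == OPERAND_SIZE_PREFIX:
        return 16 if default == 32 else 32
    return default

ADD_REGS = bytes([0x01, 0xD8])                  # same opcode for both forms
ADD_REGS_PREFIXED = bytes([0x66, 0x01, 0xD8])

print(operand_size(ADD_REGS, d_flag=1))           # add EAX,EBX in a 32-bit segment
print(operand_size(ADD_REGS_PREFIXED, d_flag=1))  # add AX,BX in a 32-bit segment
print(operand_size(ADD_REGS, d_flag=0))           # add AX,BX in a 16-bit segment
print(operand_size(ADD_REGS_PREFIXED, d_flag=0))  # add EAX,EBX in a 16-bit segment
```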
Legacy Issues
mod r/m: says whether the operand is a memory or a register access.
sib: if it is a memory access, says which addressing mode is used for the effective address calculation.
Memory Segmentation
Segment Descriptors
8086 to 80386+
In the 8086, the program is not expected to generate a non-existent memory address. If it does, the processor will try to access it and either read bogus data or crash.
In the 80386 and above, the segment attributes (base, limit, privilege etc.) are programmable and, no matter how privileged the code may be, it cannot access an area of memory unless that area is described to it.
Segments are:
Areas of memory
Defined by the programmer
Used for different purposes, such as code, data and stack
Unlike 8086 segments, 80386+ segments are not:
All the same size
Necessarily paragraph aligned
Limited to 64 KB
Segment Descriptors
Describes a segment using 64 bits (0-63)
Must be created for every segment
Is created by the programmer
Determines a segment's base address (32 bits): bits 16-39 and 56-63
Determines a segment's size (20 bits): bits 0-15 and 48-51
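The bit positions listed above can be exercised with a short sketch that packs a base and a limit into a 64-bit descriptor and extracts them again. All other descriptor fields are left at zero, and the function names are hypothetical.

```python
def make_descriptor(base, limit):
    """Pack a 32-bit base and a 20-bit limit into the 64-bit layout."""
    d = limit & 0xFFFF                    # limit 15-0   -> bits 0-15
    d |= (base & 0xFFFFFF) << 16          # base 23-0    -> bits 16-39
    d |= ((limit >> 16) & 0xF) << 48      # limit 19-16  -> bits 48-51
    d |= ((base >> 24) & 0xFF) << 56      # base 31-24   -> bits 56-63
    return d

def descriptor_base(d):
    """Reassemble the base from bits 16-39 and 56-63."""
    return ((d >> 16) & 0xFFFFFF) | (((d >> 56) & 0xFF) << 24)

def descriptor_limit(d):
    """Reassemble the limit from bits 0-15 and 48-51."""
    return (d & 0xFFFF) | (((d >> 48) & 0xF) << 16)

d = make_descriptor(base=0x12345678, limit=0xABCDE)
print(hex(d), hex(descriptor_base(d)), hex(descriptor_limit(d)))
```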
000: Data, Read only
001: Data, Read/Write
010: Data, expand down, Read only
011: Data, expand down, Read/Write
100: Code, Execute only
101: Code, Execute/Read
110: Conforming Code, Execute only
111: Conforming Code, Execute/Read
Code Segment
D = 0: 16-bit 80286 code
D = 1: 32-bit 80386+ code
Stack Segment
D = 0: stack operations are 16 bits wide, SP is used as the stack pointer, maximum stack size is FFFF (64 KB)
D = 1: stack operations are 32 bits wide, ESP is used as the stack pointer, maximum stack size is FFFFFFFF (4 GB)
G = 0: a limit field of value p is counted in bytes; offsets 0 through p from the base can be accessed.
G = 1: a limit field of value p is counted in 4 KB units; offsets 0 through (p * 4096) + 4095 from the base can be accessed.
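The effect of the G flag can be sketched as a function from the 20-bit limit field to the highest valid offset. This follows the Intel convention that the limit names the last valid offset and that, with G = 1, the low 12 bits of an offset are not checked; the function name is hypothetical.

```python
def effective_limit(limit_field, g_flag):
    """Return the highest valid offset for a 20-bit limit field."""
    if g_flag:
        # 4 KB granularity: the low 12 bits are treated as all ones.
        return (limit_field << 12) | 0xFFF
    return limit_field

print(hex(effective_limit(0xFFFFF, g_flag=0)))   # byte granularity, 1 MB segment
print(hex(effective_limit(0xFFFFF, g_flag=1)))   # page granularity, 4 GB segment
```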
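[Figure: for a non-stack segment the valid range runs from Base up to the Limit; for a stack/expand-down segment the valid range runs from just above the Limit up to FFFF]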
Descriptor Tables
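Descriptors are stored in three tables: the Global Descriptor Table (GDT), the Local Descriptor Table (LDT) and the Interrupt Descriptor Table (IDT).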
Segment Selectors
Of the several segments described in your GDT and LDT, those currently in use are pointed to by the 16-bit CS, DS, ES, FS, GS and SS registers. Each stores a selector.
Since descriptors are at 8-byte boundaries, the most significant 13 bits of the 16-bit selector point to the corresponding descriptor.
Bit 2 is the TI (table indicator) bit: 0 means the selector points to a descriptor in the GDT, 1 means the LDT.
Bits 0-1 are the Requested Privilege Level (RPL) bits, used for privilege assignments.
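A selector can be taken apart with a few shifts and masks, as in this sketch. The field layout is the one described above; `decode_selector`, `descriptor_offset` and the sample values are assumptions made for the example.

```python
def decode_selector(selector):
    """Split a 16-bit selector into (index, table name, RPL)."""
    index = selector >> 3                          # bits 15-3
    table = "LDT" if (selector >> 2) & 1 else "GDT"  # bit 2
    rpl = selector & 0b11                          # bits 1-0
    return index, table, rpl

def descriptor_offset(selector):
    """Byte offset of the descriptor inside its table (index * 8)."""
    return selector & 0xFFF8

print(decode_selector(0x10))      # third GDT entry, RPL 0
print(decode_selector(0x27))      # fifth LDT entry, RPL 3
print(hex(descriptor_offset(0x27)))
```

Masking off the low three bits gives the byte offset directly, which is why 8-byte descriptors leave those bits free for TI and RPL.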
Limit 19-16
0000010
Limit 15-0
Privilege levels
The need is to prevent
Users from interfering with one another
Users from examining secure data
Program bugs from damaging other programs
Program bugs from damaging data
Malicious attempts to compromise system integrity
Accidental damage to data
Privilege Protection
Continuous checking by the processor on whether the application is privileged enough to:
Type 1: Execute certain instructions
Type 2: Reference data other than its own
Type 3: Transfer control to code other than its own
To manage this every segment has a privilege level called the DPL (Descriptor Privilege Level) Bits 45,46
Privileged Instructions
1. Segmentation and Protection based (HLT, CLTS, LGDT, LIDT, LLDT, LTR, moving data to Control, Debug and Test registers)
2. Interrupt flag based (CLI, STI, IN, INS, OUT, OUTS)
3. Peripheral I/O based
First two types of privileged instructions can be executed only when CPL = 0, that is, these instructions can be in code segment with DPL = 0.
I/O instructions
The I/O based privileged instructions are executed only if CPL <= IOPL in the EFLAGS register. To add to the security, the POPF/POPFD instructions, which load values into EFLAGS, do not modify the IOPL field unless CPL = 0, and do not modify the IF bit unless CPL <= IOPL.
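The IOPL rules can be sketched as follows. This is a simplified model of protected-mode behaviour: I/O is allowed when CPL <= IOPL, POPF leaves IOPL untouched when CPL > 0, and leaves IF untouched when CPL > IOPL (with IOPL = 0 this coincides with CPL > 0). The function names are hypothetical.

```python
IF_MASK = 1 << 9          # interrupt flag, bit 9
IOPL_MASK = 0b11 << 12    # I/O privilege level, bits 12-13

def io_allowed(cpl, iopl):
    """I/O instructions run only if CPL is numerically <= IOPL."""
    return cpl <= iopl

def popf(current, popped, cpl):
    """Return the new EFLAGS after POPF, keeping protected bits unchanged."""
    iopl = (current >> 12) & 0b11
    protected = 0
    if cpl > 0:
        protected |= IOPL_MASK     # only CPL 0 may change IOPL
    if cpl > iopl:
        protected |= IF_MASK       # IF changes only when CPL <= IOPL
    return (popped & ~protected) | (current & protected)

current = IF_MASK                  # IF = 1, IOPL = 0
attempt = IOPL_MASK                # tries to set IOPL = 3 and clear IF
print(hex(popf(current, attempt, cpl=3)))   # user code: nothing changes
print(hex(popf(current, attempt, cpl=0)))   # kernel code: both change
```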
jmp <selector>:<offset of instruction from start of the new segment>
call <selector>:<offset of instruction from start of the new segment>
The target segment must be:
A code segment (executable permission)
Defined with the same privilege level
Marked present
If not, you JMP back or RET to the source code segment after executing the conforming code segment. This should permit return from a numerically low privilege code to a numerically high privilege code, without check.
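[Figure: call gate descriptor format, showing the type field 01100, three bits fixed at 000, the word count (WC) and Destination Offset 15-0]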
Not only the selector for the target code segment, but also the offset in the code segment from which you should start executing is specified. The source code segment can only use it like a black-box
[Figure: two CALL SEG:OFFSET transfers into a code segment described by a code descriptor, one marked Correct and one marked Incorrect]
Call Gates
Are defined like segment descriptors
Occupy a slot in the descriptor tables
Provide the only means to alter the current privilege level
Define entry points to other privilege levels
Must be invoked using a CALL instruction