CHAPTER 3
MAIN MEMORY
OUTLINE
Background
Contiguous Memory Allocation
Non-contiguous Memory Allocation
Segmentation
Paging
Virtual Memory
OBJECTIVES
BACKGROUND
Program must be brought (from disk) into memory and placed within a process for it
to be run
Main memory and registers are the only storage the CPU can access directly
If the data are not in memory, they must be moved there before the CPU can operate
on them.
Register access in one CPU clock (or less)
Main memory can take many cycles, causing a stall
Cache sits between main memory and CPU registers
Protection of memory required to ensure correct operation
LINKERS AND LOADERS
Source code compiled into object files designed to be loaded into any physical memory location –
relocatable object file
Linker combines these into single binary executable file
Also brings in libraries
THE ROLE OF THE LINKER AND LOADER
WHY APPLICATIONS ARE OPERATING SYSTEM SPECIFIC
Apps compiled on one system usually not executable on other operating systems
Each operating system provides its own unique system calls
Own file formats, etc.
Application Binary Interface (ABI) is architecture equivalent of API, defines how different
components of binary code can interface for a given operating system on a given architecture,
CPU, etc.
MULTISTEP PROCESSING OF A USER PROGRAM
MEMORY HIERARCHY
APPLICATION MEMORY
MEMORY ALLOCATION TO A PROCESS
Stacks:
Allocations and deallocations are performed in a LIFO manner.
Only the last entry of the stack is accessible at any time
A contiguous area of memory is reserved for the stack
ADDRESS BINDING
Programs on disk, ready to be brought into memory to execute, form an input queue
Without support, a program must be loaded starting at address 0000
Inconvenient to have the first user process's physical address always be 0000
How can it not be?
Further, addresses represented in different ways at different stages of a program’s life
Source code addresses usually symbolic
Compiled code addresses bind to relocatable addresses
i.e. “14 bytes from beginning of this module”
Linker or loader will bind relocatable addresses to absolute addresses
i.e. 74014
Each binding maps one address space to another
BINDING OF INSTRUCTIONS AND DATA TO MEMORY
Address binding of instructions and data to memory addresses can happen at three
different stages
Compile time: If memory location known a priori, absolute code can be generated; must
recompile code if starting location changes
Load time: Must generate relocatable code if memory location is not known at compile time
Execution time: Binding delayed until run time if the process can be moved during its execution
from one memory segment to another
Need hardware support for address maps (e.g., base and limit registers)
ADDRESS BINDING EXAMPLE
ADDRESS BINDING EXAMPLE – COMPILE TIME
ADDRESS BINDING EXAMPLE – LOAD TIME
LOGICAL VS. PHYSICAL ADDRESS SPACE
The user program deals with logical addresses; it never sees the real physical addresses
Execution-time binding occurs when reference is made to location in memory
Logical addresses must be mapped to physical addresses before they are used
MEMORY-MANAGEMENT UNIT (MMU)
MEMORY-MANAGEMENT UNIT (CONT.)
STATIC AND DYNAMIC LINKING
Static linking – system libraries and program code combined by the loader into the binary program image
Dynamic linking – linking postponed until execution time
Small piece of code, stub, used to locate the appropriate memory-resident library routine
Stub replaces itself with the address of the routine, and executes the routine
Operating system checks if the routine is in the process's memory address space
If not in the address space, add it to the address space
SWAPPING
A process can be swapped temporarily out of memory to a backing store, and then brought back into
memory for continued execution
Total physical memory space of processes can exceed physical memory
Backing store – fast disk large enough to accommodate copies of all memory images for all users;
must provide direct access to these memory images
Swap out, Swap in – swapping variant used for priority-based scheduling algorithms; lower-priority
process is swapped out so higher-priority process can be loaded and executed
The system maintains a ready queue consisting of all processes whose memory images are on the
backing store or in memory
Major part of swap time is transfer time; total transfer time is directly proportional to the amount of
memory swapped
CONTEXT SWITCH TIME INCLUDING SWAPPING
If the next process to be put on the CPU is not in memory, need to swap out a process and swap in the
target process
Context switch time can then be very high
Can reduce it by reducing the size of memory swapped – by knowing how much memory is really being used
System calls to inform OS of memory use via request_memory() and release_memory()
SWAPPING ON MOBILE SYSTEMS
FRAGMENTATION
External fragmentation: the total amount of free memory is sufficient to satisfy a request,
but the request cannot be fulfilled because the free memory is not contiguous
Internal fragmentation: when memory is split into fixed-sized blocks and a block is allocated
to a process, the memory allotted may be slightly larger than the memory requested; the
difference between the allocated and the requested memory is internal fragmentation
REQUIREMENTS FOR MEMORY MANAGEMENT
Relocation
Protection
Sharing
Logical organization
Physical organization
PROTECTION (BASE & LIMIT)
A pair of base and limit registers define the logical address space
CPU must check every memory access generated in user mode to be sure it is
between base and limit for that user
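This check can be modeled in a few lines of Python (a sketch only; in hardware the comparison happens on every user-mode access and a failure traps to the operating system; the register values below are illustrative):

```python
def legal_access(addr, base, limit):
    """Model the hardware base/limit check: a user-mode access to
    physical address addr is legal only inside [base, base + limit)."""
    return base <= addr < base + limit

# Illustrative values: base register = 300040, limit register = 120900,
# so the legal range is [300040, 420940); anything else would trap.
```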
HARDWARE ADDRESS PROTECTION
OUTLINE
Background
Contiguous Memory Allocation
Non-contiguous Memory Allocation
Segmentation
Paging
Virtual Memory
CONTIGUOUS ALLOCATION
CONTIGUOUS ALLOCATION (CONT.)
Relocation registers used to protect user processes from each other, and from changing operating-
system code and data
Base register contains value of smallest physical address
Limit register contains range of logical addresses – each logical address must be less than the limit register
MMU maps logical address dynamically
If a device driver (or other operating-system service) is not commonly used, we do not want to keep the code
and data in memory; we might be able to use that space for other purposes. Such code is sometimes called
transient operating-system code
MEMORY ALLOCATION
Multiple-partition allocation
Degree of multiprogramming limited by number of partitions
MULTIPLE-PARTITION ALLOCATION
TWO WAYS TO TRACK MEMORY USAGE
DYNAMIC STORAGE-ALLOCATION
How to satisfy a request of size n from a list of free holes?
First-fit: allocate the first hole that is big enough
Best-fit: allocate the smallest hole that is big enough
Worst-fit: allocate the largest hole
EXAMPLE 1
Assume main memory is divided into partitions of sizes 600K, 500K, 200K, and 300K (in that
order), and processes with sizes 212K, 417K, 112K, and 426K arrive (in that order).
How is memory allocated, if using:
First-fit
Best-fit
Worst-fit
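The three strategies can be sketched with a small Python simulator (a model under the usual assumption that each request is carved out of the chosen hole, shrinking it; a request that fits no hole must wait; the returned hole indices are just bookkeeping for the sketch):

```python
def allocate(holes, requests, strategy):
    """Place each request into a hole; return the chosen hole index per
    request, or None if the request must wait."""
    holes = holes[:]                          # remaining free space per hole
    placements = []
    for req in requests:
        candidates = [i for i, h in enumerate(holes) if h >= req]
        if not candidates:
            placements.append(None)           # no hole big enough: wait
            continue
        if strategy == 'first':
            i = candidates[0]                 # first hole that fits
        elif strategy == 'best':
            i = min(candidates, key=lambda i: holes[i])   # smallest fit
        else:
            i = max(candidates, key=lambda i: holes[i])   # largest hole
        holes[i] -= req                       # carve the request out
        placements.append(i)
    return placements
```

On this example only best-fit places all four processes; under first-fit and worst-fit the 426K process must wait.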
EXAMPLE 2
A dynamic partitioning scheme is being used, and the following is the memory configuration at a given
point in time. The shaded areas are allocated blocks and the white areas are free blocks. Assume the next
four memory requests are for 250K, 419K, 205K, and 330K. Indicate the starting address (position on
the given memory configuration) for each of the four blocks using the following placement algorithms.
First-fit
Best-fit
Worst-fit
FIRST-FIT
[Figure: memory maps after each placement; the traces showed the 250K, 419K, and 205K
allocations and the free blocks left behind (e.g., a 181K hole after the 419K placement)]
COMPACTION
[Figure: memory before and after compaction; scattered free blocks are consolidated into one large hole]
OUTLINE
Background
Contiguous Memory Allocation
Non-contiguous Memory Allocation
Segmentation
Paging
Virtual Memory
SEGMENTATION
LOGICAL VIEW OF SEGMENTATION
[Figure: numbered segments of a program in user space mapped to noncontiguous regions of physical memory]
SEGMENTATION ARCHITECTURE
Segment-table base register (STBR) points to the segment table’s location in memory
Segment-table length register (STLR) indicates number of segments used by a program;
segment number s is legal if s < STLR
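Translation through the segment table can be sketched as follows (a Python model; in hardware the STLR and limit checks happen on every reference and a violation traps; the example table values in the test are illustrative):

```python
def seg_translate(s, d, segment_table, stlr):
    """Translate logical address (segment s, offset d) to a physical address.
    segment_table: list of (base, limit) pairs."""
    if s >= stlr:                      # segment number must satisfy s < STLR
        raise IndexError("trap: segment number out of range")
    base, limit = segment_table[s]
    if d >= limit:                     # offset must fall within the segment
        raise IndexError("trap: offset beyond segment limit")
    return base + d
```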
SEGMENTATION HARDWARE
LOGICAL-PHYSICAL ADDRESS TRANSLATION IN SEGMENTATION
EXAMPLE OF SEGMENTATION
EXAMPLE OF SEGMENTATION (CONT.)
OUTLINE
Background
Contiguous Memory Allocation
Non-contiguous Memory Allocation
Segmentation
Paging
Virtual Memory
PAGING
Physical address space of a process can be noncontiguous; process is allocated physical memory whenever the
latter is available
Avoids external fragmentation
Avoids problem of varying sized memory chunks
Divide physical memory into fixed-sized blocks called frames
Size is a power of 2, typically between 512 bytes and 1 GB
Divide logical memory into blocks of same size called pages
Keep track of all free frames
To run a program of size N pages, need to find N free frames and load program
Set up a page table to translate logical to physical addresses
Backing store likewise split into pages
Still have internal fragmentation
ADDRESS TRANSLATION SCHEME
LOGICAL-PHYSICAL ADDRESS TRANSLATION IN PAGING
Example: 16-bit logical addresses with 1 KB pages (10-bit offset)
Logical address 0000010111011110 = (page 1, offset 478)
Page table: page 0 → frame 5, page 1 → frame 6, page 2 → frame 7
Page 1 maps to frame 6, so the physical address is (frame 6, offset 478) = 0001100111011110
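The translation above can be sketched in Python (a minimal model; in reality the MMU does this in hardware, and page-fault handling is omitted here):

```python
OFFSET_BITS = 10                       # 1 KB pages, as in the example

def page_translate(logical, page_table, offset_bits=OFFSET_BITS):
    """Map a logical address to a physical address via the page table."""
    page   = logical >> offset_bits                 # high bits: page number
    offset = logical & ((1 << offset_bits) - 1)     # low bits: page offset
    frame  = page_table[page]                       # page-table lookup
    return (frame << offset_bits) | offset          # frame number + offset
```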
PAGING MODEL OF LOGICAL AND PHYSICAL MEMORY
PAGING EXAMPLE
PAGING
OBTAINING THE PAGE SIZE ON UNIX SYSTEMS
The page size varies according to architecture, and there are several ways of obtaining
the page size. One approach is to use the getpagesize() system call.
Another strategy is to enter the following command on the command line:
getconf PAGESIZE
Each of these techniques returns the page size as a number of bytes.
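The same value can also be read from Python's standard library on a POSIX system (a sketch; these helpers are Unix-specific):

```python
import mmap
import os
import resource

# Three ways to obtain the system page size, all in bytes.
size_a = resource.getpagesize()        # Unix-only stdlib helper
size_b = mmap.PAGESIZE                 # constant exposed by the mmap module
size_c = os.sysconf("SC_PAGE_SIZE")    # POSIX sysconf query
```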
IMPLEMENTATION OF PAGE TABLE
Page tables are too large to be kept on the chip in registers, so the page table is kept in main
memory
Page-table base register (PTBR): points to the beginning of the page table for this
process
Page-table length register (PTLR): indicates the size of the page table
In this scheme every data/instruction access requires two memory accesses
One for the page table and one for the data / instruction
The two memory access problem can be solved by the use of a special fast-lookup
hardware cache called associative memory or translation look-aside buffers
(TLBs)
IMPLEMENTATION OF PAGE TABLE (CONT.)
[Figure: TLB entries holding page-number/frame-number pairs]
PAGING HARDWARE WITH TLB
EFFECTIVE ACCESS TIME
Hit ratio – percentage of times that a page number is found in the TLB.
An 80% hit ratio means that we find the desired page number in the TLB 80% of the time.
Suppose it takes 10 nanoseconds to access memory:
If we find the desired page in the TLB, a mapped-memory access takes 10 ns
Otherwise we need two memory accesses, so it takes 20 ns
Effective Access Time (EAT) = P x hit memory time + (1-P) x miss memory time
EAT = 0.80 x 10 + 0.20 x 20 = 12 nanoseconds
implying 20% slowdown in access time
Consider a more realistic hit ratio of 99%:
EAT = 0.99 x 10 + 0.01 x 20 = 10.1 ns
implying only a 1% slowdown in access time
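The formula is a one-liner (a sketch that, like the slide, ignores the TLB lookup time itself):

```python
def effective_access_time(hit_ratio, mem_ns=10):
    """EAT = P x (one memory access) + (1 - P) x (two memory accesses)."""
    return hit_ratio * mem_ns + (1 - hit_ratio) * (2 * mem_ns)
```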
MULTIPLE LEVELS OF TLBS
CPUs today may provide multiple levels of TLBs. Calculating memory access times in
modern CPUs is therefore much more complicated than shown in the example above.
For instance, the Intel Core i7 CPU:
Has a 128-entry L1 instruction TLB and a 64-entry L1 data TLB. In the case of a miss in L1, it takes
the CPU six cycles to check for the entry in the 512-entry L2 TLB.
A miss in L2 means that the CPU must walk through the page-table entries in memory to find
the associated frame address, which can take hundreds of cycles
MEMORY PROTECTION
SHARED PAGES
Shared code
One copy of read-only (reentrant) code shared among processes (i.e., text editors, compilers,
window systems)
Similar to multiple threads sharing the same process space
Also useful for interprocess communication if sharing of read-write pages is allowed
SHARED PAGES EXAMPLE
STRUCTURE OF THE PAGE TABLE
Memory structures for paging can get huge using straight-forward methods
Consider a 32-bit logical address space as on modern computers with 4K page size: the page
table would need 2^20 (about one million) entries; at 4 bytes per entry, that is 4 MB of
memory for each process's page table alone
TWO-LEVEL PAGING EXAMPLE
A logical address (on 32-bit machine with 1K page size) is divided into:
a page number consisting of 22 bits
a page offset consisting of 10 bits
Since the page table is paged, the page number is further divided into:
a 12-bit page number
a 10-bit page offset
Thus, a logical address is as follows:
Where p1 is an index into the outer page table, and p2 is the displacement within the page of the inner
page table
Known as forward-mapped page table
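The split described above is a matter of shifts and masks (a Python sketch of the bit layout for this 32-bit, 1 KB-page example):

```python
def split_address(addr):
    """Split a 32-bit logical address (1 KB pages) into (p1, p2, d)."""
    d  = addr & 0x3FF                  # low 10 bits: page offset
    p2 = (addr >> 10) & 0x3FF          # next 10 bits: index into inner table
    p1 = (addr >> 20) & 0xFFF          # top 12 bits: index into outer table
    return p1, p2, d
```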
TWO-LEVEL PAGE-TABLE SCHEME
ADDRESS-TRANSLATION SCHEME
THREE-LEVEL PAGING SCHEME
EXAMPLE: THE INTEL 32 AND 64-BIT ARCHITECTURES
EXAMPLE: THE INTEL IA-32 ARCHITECTURE
EXAMPLE: THE INTEL IA-32 ARCHITECTURE (CONT.)
LOGICAL TO PHYSICAL ADDRESS TRANSLATION IN IA-32
INTEL IA-32 SEGMENTATION
INTEL IA-32 PAGING ARCHITECTURE
INTEL IA-32 PAGE ADDRESS EXTENSIONS
32-bit address limits led Intel to create page address extension (PAE), allowing 32-bit
apps access to more than 4GB of memory space
• Paging went to a 3-level scheme
• Top two bits refer to a page directory pointer table
• Page-directory and page-table entries moved to 64-bits in size
• Net effect is increasing address space to 36 bits – 64GB of physical memory
INTEL X86-64
EXAMPLE: ARM ARCHITECTURE
INVERTED PAGE TABLE ARCHITECTURE
HASHED PAGE TABLE
Each element contains (1) the virtual page number (2) the value of the mapped page
frame (3) a pointer to the next element
Virtual page numbers are compared in this chain searching for a match
If a match is found, the corresponding physical frame is extracted
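The chained lookup can be sketched as a small class (a Python model of the structure described above; bucket count and the sample mappings in the test are illustrative):

```python
class HashedPageTable:
    """Hashed page table: buckets of (virtual page number, frame) chains."""
    def __init__(self, nbuckets):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, vpn, frame):
        self.buckets[vpn % len(self.buckets)].append((vpn, frame))

    def lookup(self, vpn):
        # walk the chain, comparing virtual page numbers until a match
        for v, frame in self.buckets[vpn % len(self.buckets)]:
            if v == vpn:
                return frame
        return None                     # no match: would raise a page fault
```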
OUTLINE
Background
Contiguous Memory Allocation
Non-contiguous Memory Allocation
Segmentation
Paging
Virtual Memory
VIRTUAL MEMORY
Virtual memory is a technique that allows the execution of processes that are not
completely in memory
One major advantage of this scheme is that programs can be larger than physical
memory.
Further, virtual memory abstracts main memory into an extremely large, uniform array
of storage, separating logical memory as viewed by the user from physical memory
BACKGROUND
DEMAND PAGING
During MMU address translation, if the valid–invalid bit in the page table entry is i, a page fault occurs
BASIC CONCEPTS
1. If there is a reference to a page, first reference to that page will trap to operating system
Page fault
2. Operating system looks at another table to decide:
Invalid reference: abort
Page just not in memory: continue
3. Find free frame
4. Swap page into frame via scheduled disk operation
5. Reset tables to indicate page now in memory
Set validation bit = v
6. Restart the instruction that caused the page fault
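The six steps can be sketched as a toy handler (a Python model; the free-frame list, backing-store dictionary, and page values are illustrative, and replacement is not modeled here):

```python
def access(page, page_table, free_frames, backing_store, memory):
    """Model the six steps above: return the frame holding `page`."""
    if page not in backing_store:        # step 2: invalid reference
        raise MemoryError("invalid reference: abort")
    if page in page_table:               # valid bit already v: no fault
        return page_table[page]
    frame = free_frames.pop()            # step 3: find a free frame
    memory[frame] = backing_store[page]  # step 4: page in from the disk
    page_table[page] = frame             # step 5: set validation bit = v
    return frame                         # step 6: restart the instruction

# Illustrative state: two free frames, two pages on the backing store.
page_table, memory = {}, {}
free_frames = [2, 1]
backing_store = {7: "A", 8: "B"}
```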
STEPS IN HANDLING A PAGE FAULT
PERFORMANCE OF DEMAND PAGING
PAGE REPLACEMENT
PAGE AND FRAME REPLACEMENT ALGORITHMS
Page-replacement algorithm
Want lowest page-fault rate on both first access and re-access
FIFO PAGE REPLACEMENT
A FIFO replacement algorithm associates with each page the time when that page was
brought into memory.
Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
3 frames (3 pages can be in memory at a time per process)
Page fault=?
Can vary by reference string: consider 1,2,3,2,4,1,2,3,0,3,1,5,7,1,3,2,1,2,4,5
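FIFO can be simulated in a few lines (a Python sketch; on the classic string 1,2,3,4,1,2,5,1,2,3,4,5 it also exhibits Belady's anomaly, where adding a frame increases the fault count):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement."""
    frames, order, faults = set(), deque(), 0
    for p in refs:
        if p in frames:
            continue                          # resident: no fault
        faults += 1
        if len(frames) == nframes:
            frames.discard(order.popleft())   # evict the oldest page
        frames.add(p)
        order.append(p)
    return faults
```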
FIFO ILLUSTRATING BELADY’S ANOMALY
LRU PAGE REPLACEMENT
Least Recently Used (LRU): replace the page that has not been used for the longest period
of time
Most Frequently Used (MFU): based on the argument that the page with the smallest count
was probably just brought in and has yet to be used
OPT(OPTIMAL) PAGE REPLACEMENT
Replace the page that will not be used for the longest period of time.
9 page faults is optimal for the example
EXERCISES
Reference string 1 2 3 4 2 1 5 6 2 1 2 3 6 7 3 2 1 2 3 6, with five frames: how many page
faults occur under FIFO, LRU, and OPT replacement?
SOLUTIONS (FIFO)
Reference string: 1 2 3 4 2 1 5 6 2 1 2 3 6 7 3 2 1 2 3 6 (five frames)
[Flattened trace table: frame contents at each fault]
Page faults = 10
SOLUTIONS (LRU)
Reference string: 1 2 3 4 2 1 5 6 2 1 2 3 6 7 3 2 1 2 3 6 (five frames)
[Flattened trace table: frame contents at each fault]
Page faults = 8
SOLUTIONS (OPT)
Reference string: 1 2 3 4 2 1 5 6 2 1 2 3 6 7 3 2 1 2 3 6 (five frames)
[Flattened trace table: frame contents at each fault]
Page faults = 7
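The LRU and OPT answers can be checked with small simulators (Python sketches: LRU tracks recency with an ordered list, OPT looks ahead in the reference string to find the victim):

```python
def lru_faults(refs, nframes):
    """LRU: evict the page unused for the longest time."""
    frames, faults = [], 0                 # ordered least- to most-recent
    for p in refs:
        if p in frames:
            frames.remove(p)               # refresh recency
        else:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)              # evict least recently used
        frames.append(p)
    return faults

def opt_faults(refs, nframes):
    """OPT: evict the page whose next use lies farthest in the future."""
    frames, faults = set(), 0
    for i, p in enumerate(refs):
        if p in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            def next_use(q):
                try:
                    return refs.index(q, i + 1)
                except ValueError:
                    return float("inf")    # never used again: ideal victim
            frames.discard(max(frames, key=next_use))
        frames.add(p)
    return faults
```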
THRASHING
If a process does not have “enough” pages, the page-fault rate is very high
Page fault to get page
Replace existing frame
But quickly need replaced frame back
This leads to:
Low CPU utilization
Operating system thinking that it needs to increase the degree of multiprogramming
Another process added to the system
THRASHING (CONT.)