Memory Management
Figure: Hardware address protection with base and limit registers
Protection of memory space is accomplished by having the CPU
hardware compare every address generated in user mode with the
base and limit registers.
Any attempt by a program executing in user mode to access operating-
system memory or other users’ memory results in a trap to the
operating system, which treats the attempt as a fatal error.
This scheme prevents a user program from (accidentally or
deliberately) modifying the code or data structures of either the
operating system or other users.
The base and limit registers can be loaded only by the operating
system, which uses a special privileged instruction. Since privileged
instructions can be executed only in kernel mode, and since only the
operating system executes in kernel mode, only the operating system
can load the base and limit registers.
This scheme allows the operating system to change the value of the
registers but prevents user programs from changing the registers’
contents.
Address Binding
Usually, a program resides on a disk as a binary executable file. To
be executed, the program must be brought into memory and placed
within a process. Depending on the memory management in use,
the process may be moved between disk and memory during its
execution. The processes on the disk that are waiting to be brought
into memory for execution form the input queue.
User programs typically refer to memory addresses with symbolic
names (such as the variables "i" and "count"). A compiler typically binds
these symbolic addresses to relocatable addresses (such as “14
bytes from the beginning of this module”). The linkage editor or
loader in turn binds the relocatable addresses to absolute addresses
(such as 74014). Each binding is a mapping from one address space
to another.
The binding of instructions and data to memory addresses can be done at any step along the way:
• Compile Time - If it is known at compile time where
a program will reside in physical memory, then
absolute code can be generated by the compiler,
containing actual physical addresses. However if the
load address changes at some later time, then the
program will have to be recompiled. MS-DOS .COM
programs use compile time binding.
• Load Time - If the location at which a program will
be loaded is not known at compile time, then the
compiler must generate relocatable code, which
references addresses relative to the start of the
program. If that starting address changes, then the
program must be reloaded but not recompiled.
• Execution Time - If a program can be moved around
in memory during the course of its execution, then
binding must be delayed until execution time. This
requires special hardware, and is the method
implemented by most modern operating systems.
Logical Versus Physical Address Space
An address generated by the CPU is commonly referred to as a
logical address, whereas an address seen by the memory unit—that
is, the one loaded into the memory-address register of the memory
—is commonly referred to as a physical address.
The compile-time and load-time address-binding methods generate
identical logical and physical addresses.
The execution-time address binding scheme results in differing
logical and physical addresses.
In this case the logical address is also known as a virtual
address, and the two terms are used interchangeably.
The set of all logical addresses used by a program composes the
logical address space, and the set of all corresponding physical
addresses composes the physical address space.
Each physical memory location has an address. Physical addresses start at zero.
For example, a 32-bit computer with 16 MB of memory has physical addresses
ranging from 0 to 2^24 − 1 = 16,777,215.
Access to memory is protected using the base and limit registers
(hardware protection): an address is legal only if base ≤ address < base + limit.
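As a rough sketch of this check (not any particular CPU's register interface; the names base_reg, limit_reg, and address_is_legal are illustrative), the comparison amounts to:

#include <stdbool.h>
#include <stdint.h>

/* Conceptual model of the base/limit hardware check.
   The register values below are placeholders, not real ones. */
static uint32_t base_reg  = 0x00300000;  /* base of the process's memory  */
static uint32_t limit_reg = 0x00120000;  /* size of the process's memory  */

/* Returns true if a user-mode address is legal: base <= addr < base + limit. */
bool address_is_legal(uint32_t addr)
{
    return addr >= base_reg && addr < base_reg + limit_reg;
}

If address_is_legal() returns false, the hardware would trap to the operating system, as described above.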
Figure: Paging model of logical and physical memory
The page size (like the frame size) is defined by the hardware. The size of
a page is a power of 2, varying between 512 bytes and 1 GB per page,
depending on the computer architecture.
The selection of a power of 2 as a page size makes the translation of a
logical address into a page number and page offset particularly easy. If the
size of the logical address space is 2^m, and a page size is 2^n bytes, then the
high-order m − n bits of a logical address designate the page number, and
the n low-order bits designate the page offset.
Thus, the logical address is divided into a page number p in the high-order m − n bits
and a page offset d in the low-order n bits, where p is an index into the page table
and d is the displacement within the page.
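A minimal sketch of this split, assuming a page size of 2^n bytes (the function name is ours):

#include <stdint.h>

/* Split a logical address into page number p and page offset d,
   for a page size of 2^n bytes. Illustrative only. */
void split_logical_address(uint32_t addr, unsigned n, uint32_t *p, uint32_t *d)
{
    *p = addr >> n;               /* high-order m - n bits: page number */
    *d = addr & ((1u << n) - 1);  /* low-order n bits: page offset      */
}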
The number of bits in the page number and the number of bits in the
frame number do not have to be identical. The former determines the
address range of the logical address space, and the latter relates to the
physical address space.
• Consider an example in which a process has 16 bytes of logical
memory, mapped in 4-byte pages into 32 bytes of physical memory
(8 frames).
• In the logical address of this example, n = 2 and m = 4: we use a
page size of 4 bytes and a physical memory of 32 bytes.
• Logical address 0 is page 0, offset 0. Indexing into the page table,
we find that page 0 is in frame 5. Thus, logical address 0 maps to
physical address 20 [= (5 × 4) + 0].
• Logical address 3 (page 0, offset 3) maps to physical address 23
[= (5 × 4) + 3].
• Logical address 4 is page 1, offset 0; according to the page table,
page 1 is mapped to frame 6. Thus, logical address 4 maps to
physical address 24 [= (6 × 4) + 0].
• Logical address 13 maps to physical address 9.
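These translations can be reproduced with a short sketch. The mappings for pages 0, 1, and 3 come from the text above; the entry for page 2 (frame 1) follows the usual textbook figure and is an assumption here.

#include <stdio.h>
#include <stdint.h>

/* Page table for the example: 16-byte logical space, 4-byte pages.
   Pages 0, 1, and 3 map to frames 5, 6, and 2 as stated in the text;
   the entry for page 2 (frame 1) is assumed from the textbook figure. */
static const uint32_t page_table[4] = { 5, 6, 1, 2 };

#define PAGE_SIZE 4u   /* 2^n with n = 2 */

static uint32_t translate(uint32_t logical)
{
    uint32_t p = logical / PAGE_SIZE;   /* page number */
    uint32_t d = logical % PAGE_SIZE;   /* page offset */
    return page_table[p] * PAGE_SIZE + d;
}

int main(void)
{
    uint32_t addrs[] = { 0, 3, 4, 13 };
    for (int i = 0; i < 4; i++)
        printf("logical %2u -> physical %2u\n",
               (unsigned)addrs[i], (unsigned)translate(addrs[i]));
    return 0;   /* prints 20, 23, 24, and 9 */
}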
Paging itself is a form of dynamic relocation. Every logical address
is bound by the paging hardware to some physical address.
When a paging scheme is used, no external fragmentation exists:
any free frame can be allocated to a process that needs it. However,
we may have some internal fragmentation.
In the worst case, a process would need n pages plus 1 byte. It
would be allocated n + 1 frames, resulting in internal fragmentation
of almost an entire frame.
Larger page sizes waste more memory, but are more efficient in
terms of overhead. Modern trends have been to increase page sizes,
and some systems even support multiple page sizes to try to get
the best of both worlds.
Page table entries (frame numbers) are typically 32-bit numbers,
allowing access to 2^32 physical page frames. If those frames are 4 KB
each, that translates to 16 TB of addressable physical memory.
(32 + 12 = 44 bits of physical address space.)
Paging Examples
Figure: Free frames (a) before allocation and (b) after allocation
• Divide physical memory into fixed-sized blocks called frames
(size is a power of 2, between 512 bytes and 8192 bytes).
• Divide logical memory into
blocks of same size called
pages.
• Keep track of all free frames.
• To run a program of size n
pages, need to find n free
frames and load program.
• Set up a Page Table to
translate logical to physical
addresses.
An important aspect of paging is the clear separation between the
programmer’s view of memory and the actual physical memory.
Processes are blocked from accessing anyone else's memory because all of
their memory requests are mapped through their page table. There is no
way for them to generate an address that maps into any other process's
memory space.
The OS must be aware of the allocation details of physical memory—
which frames are allocated, which frames are available, how many total
frames there are, and so on.
This information is generally kept in a data structure called a frame table.
The frame table has one entry for each physical page frame, indicating
whether the latter is free or allocated and, if it is allocated, to which page
of which process or processes.
The operating system must keep track of each individual process's page
table, updating it whenever the process's pages get moved in and out of
memory, and applying the correct page table when processing system calls
for a particular process. This all increases the overhead involved when
swapping processes in and out of the CPU.
Hardware Support
• Page lookups must be done for every memory reference, and
whenever a process gets swapped in or out of the CPU, its page
table must be swapped in and out too, along with the instruction
registers, etc. It is therefore appropriate to provide hardware
support for this operation, in order to make it as fast as possible
and to make process switches as fast as possible also.
• The hardware implementation of the page table can be done in
several ways. In the simplest case, the page table is implemented
as a set of dedicated registers. These registers should be built with
very high-speed logic to make the paging-address translation
efficient.
• For example, the DEC PDP-11 uses 16-bit addressing and 8 KB pages,
resulting in only 8 pages per process. (It takes 13 bits to address 8
KB of offset, leaving only 3 bits to define a page number.)
An alternate option is to store the page table in main memory, and
to use a single register (called the page-table base register,
PTBR) to record where in memory the page table is located.
Process switching is fast, because only the single register
needs to be changed.
However, memory access is now twice as slow, because every
memory reference requires two memory accesses: one to fetch the
frame number from the page table and another to access the
desired memory location.
The solution to this problem is to use a very special, small,
fast lookup hardware cache called the translation look-aside
buffer, TLB.
The benefit of the TLB is that it can search an entire table
for a key value in parallel, and if it is found anywhere in
the table, then the corresponding lookup value is returned.
The TLB is associative, high-speed memory. Each entry in the TLB
consists of two parts: a key (or tag) and a value. When the associative
memory is presented with an item, the item is compared with all keys
simultaneously. If the item is found, the corresponding value field is
returned.
The search is fast; a TLB lookup in modern hardware is part of the
instruction pipeline, essentially adding no performance penalty. To be able
to execute the search within a pipeline step, however, the TLB must be
kept small. It is typically between 32 and 1,024 entries in size.
The TLB is used with page tables in the following way. The TLB contains
only a few of the page-table entries. When a logical address is generated
by the CPU, its page number is presented to the TLB. If the page number is
found, its frame number is immediately available and is used to access
memory.
If the page number is not in the TLB (known as a TLB miss), a memory
reference to the page table must be made. Depending on the CPU, this may
be done automatically in hardware or via an interrupt to the operating
system.
In addition, we add the page number and frame number to the TLB, so that
they will be found quickly on the next reference. If the TLB is already full
of entries, an existing entry must be selected for replacement.
Replacement policies range from least recently used (LRU) through round-
robin to random.
Some TLBs allow certain entries to be wired down, meaning that they
cannot be removed from the TLB. Typically, TLB entries for key kernel
code are wired down.
Some TLBs store address-space identifiers (ASIDs) in each TLB entry. An
ASID uniquely identifies each process and is used to provide address-
space protection for that process.
When the TLB attempts to resolve virtual page numbers, it ensures that the
ASID for the currently running process matches the ASID associated with
the virtual page. If the ASIDs do not match, the attempt is treated as a TLB
miss.
In addition to providing address-space protection, an ASID allows the TLB
to contain entries for several different processes simultaneously. Without
this feature the TLB has to be flushed clean with every process switch.
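A simplified software model of a TLB lookup with ASIDs is sketched below. The entry layout, field widths, and table size are illustrative assumptions, and real hardware compares all entries in parallel rather than looping.

#include <stdbool.h>
#include <stdint.h>

#define TLB_SIZE 64   /* small, fixed number of entries (illustrative) */

struct tlb_entry {
    bool     valid;
    uint16_t asid;    /* address-space identifier of the owning process */
    uint32_t vpn;     /* virtual page number (the key)                  */
    uint32_t frame;   /* frame number (the value)                       */
};

static struct tlb_entry tlb[TLB_SIZE];

/* Returns true on a TLB hit for this ASID and fills in *frame.
   A false return corresponds to a TLB miss: the page table must be
   walked and the resulting mapping added to the TLB. */
bool tlb_lookup(uint16_t asid, uint32_t vpn, uint32_t *frame)
{
    for (int i = 0; i < TLB_SIZE; i++) {
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn) {
            *frame = tlb[i].frame;
            return true;
        }
    }
    return false;
}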
The percentage of times that the page number of interest is found in the
TLB is called the hit ratio.
For example, suppose that it takes 100 nanoseconds to access main
memory. So a TLB hit takes 100 nanoseconds total (100 to go get the
data), and a TLB miss takes 200 (100 to go get the frame number, and then
another 100 to go get the data.) So with an 80% TLB hit ratio, the average
memory access time would be:
effective access time = 0.80 × 100 + 0.20 × 200 = 120 nanoseconds
We suffer a 20% slowdown in average memory access time.
A 99% hit rate would yield 101 nanoseconds average access time for a 1%
slowdown.
effective access time = 0.99 × 100 + 0.01 × 200 = 101 nanoseconds
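These figures can be reproduced with a small helper, assuming the same simple model as above (a hit costs one memory access, a miss costs two, and the TLB search time itself is negligible; the function name is ours):

/* Effective access time for the simple model used above. */
double effective_access_time(double hit_ratio, double mem_ns)
{
    return hit_ratio * mem_ns + (1.0 - hit_ratio) * 2.0 * mem_ns;
}

/* effective_access_time(0.80, 100.0) == 120.0 nanoseconds
   effective_access_time(0.99, 100.0) == 101.0 nanoseconds */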
TLBs are a hardware feature and therefore would
seem to be of little concern to operating systems and
their designers. But the designer needs to understand
the function and features of TLBs, which vary by
hardware platform.
Consider a system with an 80% hit ratio, a 50-nanosecond time to
search the associative registers, and a 750-nanosecond time to access
memory. Find the time to access a page
a) when the page number is in associative memory;
b) when the page number is not in associative memory;
c) and find the effective memory access time.
a) The time required is 50 nanoseconds to get the page number from
associative memory and 750 nanoseconds to read the desired word
from memory. Time = 50 + 750 = 800 nanoseconds.
b) When the page number is not in associative memory, Time =
50 + 750 + 750 = 1550 nanoseconds. One extra memory access is
required to read the page table from memory.
c) Effective access time = (hit ratio × time when the page number is in
associative memory) + (miss ratio × time when it is not)
= 0.8 × 800 + 0.2 × 1550 = 950 nanoseconds.
Protection
• Memory protection in a paged environment is accomplished by
protection bits associated with each frame.
• A bit or bits can be added to the page table to classify a page as
read-write, read-only, or read-write-execute. Each memory reference
can then be checked to ensure it is accessing the memory in the
appropriate mode.
• One additional bit, the valid-invalid bit, is generally attached to each
entry in the page table.
• When this bit is set to valid, the associated page is in the process’s
logical address space and is thus a legal (or valid) page. When the
bit is set to invalid, the page is not in the process’s logical address
space. Illegal addresses are trapped by use of the valid–invalid bit.
The operating system sets this bit for each page to allow or
disallow access to the page.
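One way to picture such a page-table entry is as a bit-field struct; the field widths and layout below are an assumption for illustration, not any real architecture's format.

#include <stdint.h>

/* Illustrative page-table entry with protection and valid bits. */
struct pte {
    uint32_t frame   : 20;  /* frame number                             */
    uint32_t valid   : 1;   /* 1 = page is in the logical address space */
    uint32_t read    : 1;   /* read access permitted                    */
    uint32_t write   : 1;   /* write access permitted                   */
    uint32_t execute : 1;   /* execute access permitted                 */
    uint32_t unused  : 8;
};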
In a system with a 14-bit address space (0 to 16383), we have a program that
should use only addresses 0 to 10468. Given a page size of 2 KB, addresses in
pages 0, 1, 2, 3, 4, and 5 are mapped normally through the page table. Any
attempt to generate an address in pages 6 or 7, however, will find that the
valid–invalid bit is set to invalid, and the computer will trap to the operating
system (invalid page reference).
Hierarchical Paging
Because address translation in a two-level page table works from the outer page
table inward, this scheme is also known as a forward-mapped page table.
• The VAX minicomputer from Digital Equipment Corporation (DEC) was
the most popular minicomputer of its time and was sold from 1977 through
2000. The VAX architecture supported a variation of two-level paging. The
VAX is a 32-bit machine with a page size of 512 bytes.
• The logical address space of a process is divided into four equal sections,
each of which consists of 2^30 bytes. Each section represents a different part
of the logical address space of a process. The first 2 high-order bits of the
logical address designate the appropriate section. The next 21 bits represent
the logical page number of that section, and the final 9 bits represent an
offset in the desired page.
• By partitioning the page table in this manner, the operating system can
leave partitions unused until a process needs them. Entire sections of virtual
address space are frequently unused, and multilevel page tables have no
entries for these spaces, greatly decreasing the amount of memory needed to
store virtual memory data structures.
With a 64-bit logical address space and 4K pages, there are 52 bits worth of page
numbers, which is still too many even for two-level paging. One could increase the
paging level, but with 10-bit page tables it would take 7 levels of indirection,
which would be prohibitively slow memory access. So some other approach must
be used.
• With 64-bit addresses, a two-tiered scheme leaves 42 bits in the outer page
table. We can page the outer page table, giving us a three-level paging scheme;
going to a third level still leaves 32 bits in the outermost table.
• That outer page table is still 2^34 bytes (16 GB) in size. The next step would be
a four-level paging scheme, where the second-level outer page table itself is
also paged, and so forth.
• The 64-bit UltraSPARC would require seven levels of paging—a prohibitive
number of memory accesses— to translate each logical address.
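The 16 GB figure above follows from the bit counts, assuming 4-byte page-table entries:
outer page table size = 2^32 entries × 4 bytes per entry = 2^34 bytes = 16 GB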
Hashed Page Tables
• A common approach for handling address spaces larger
than 32 bits is to use a hashed page table, with the hash
value being the virtual page number.
• Each entry in the hash table contains a linked list of
elements that hash to the same location (to handle
collisions).
• Each element consists of three fields:
(1) the virtual page number,
(2) the value of the mapped page frame,
(3) a pointer to the next element in the linked list.
The virtual page number in the virtual address is hashed into the hash table.
The virtual page number is compared with field 1 in the first element in the
linked list. If there is a match, the corresponding page frame (field 2) is used to
form the desired physical address. If there is no match, subsequent entries in
the linked list are searched for a matching virtual page number.
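A minimal sketch of this lookup, with an illustrative bucket count and a simple modulo hash (both assumptions, not a prescribed design):

#include <stdint.h>
#include <stddef.h>

#define HASH_SIZE 1024   /* number of hash buckets (illustrative) */

/* One element of a bucket's chain: the three fields described above. */
struct hpt_element {
    uint64_t vpn;               /* (1) virtual page number       */
    uint64_t frame;             /* (2) mapped page frame         */
    struct hpt_element *next;   /* (3) next element in the chain */
};

static struct hpt_element *hash_table[HASH_SIZE];

/* Returns 1 and fills *frame if vpn is found in its chain, 0 otherwise. */
int hpt_lookup(uint64_t vpn, uint64_t *frame)
{
    struct hpt_element *e = hash_table[vpn % HASH_SIZE];
    for (; e != NULL; e = e->next) {
        if (e->vpn == vpn) {    /* compare field (1)           */
            *frame = e->frame;  /* use field (2)               */
            return 1;
        }
    }
    return 0;                   /* no match anywhere in chain  */
}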
Consider a paging system with the page table stored in memory.
a) If a memory reference takes 200 nanoseconds, how long does a
paged memory reference take?
b) If we add associative registers, and 75 percent of all page-table
references are found in the associative registers, what is the
effective memory reference time? (Assume that finding a page-
table entry in the associative registers takes zero time, if the entry
is there.)
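One way to work this exercise, using the same model as the earlier effective-access-time examples:
a) A paged memory reference needs one access for the page-table entry and one
for the data: 200 + 200 = 400 nanoseconds.
b) effective access time = 0.75 × 200 + 0.25 × 400 = 250 nanoseconds.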
Using the page table shown below, translate physical address 25
to a virtual address. The address length is 16 bits and the page size is
2048 words, while the size of the physical memory is four frames.
Page   Present (1 = in, 0 = out)   Frame
0      1                           3
1      1                           2
2      1                           0
3      0                           –
A. 25 B. 6125 C. 2073 D. 4121
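A short sketch of the reverse lookup, scanning the table above for the page that owns the frame containing the physical address (the helper name physical_to_virtual is ours):

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 2048u
#define NUM_PAGES 4

/* Page table from the exercise: -1 marks a page that is not present. */
static const int frame_of_page[NUM_PAGES] = { 3, 2, 0, -1 };

/* Translate a physical address back to a virtual address by finding
   which page maps to the frame containing it; returns -1 if none does. */
static long physical_to_virtual(uint32_t phys)
{
    uint32_t frame  = phys / PAGE_SIZE;
    uint32_t offset = phys % PAGE_SIZE;
    for (int page = 0; page < NUM_PAGES; page++)
        if (frame_of_page[page] == (int)frame)
            return (long)page * PAGE_SIZE + offset;
    return -1;
}

int main(void)
{
    /* frame 0 holds page 2, so 2 * 2048 + 25 = 4121 (option D) */
    printf("%ld\n", physical_to_virtual(25));
    return 0;
}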