Lecture 15: March 31
Lecturer: Mark Corner    Scribes: Bruno Silva, Jim Partan

15.1.1 Single-Level Page Tables
At the end of the last lecture, we introduced page tables, which are lookup tables mapping a process’ virtual
pages to physical pages in RAM. How would one implement these page tables?
The most straightforward approach would be a single linear array of page-table entries (PTEs).
Each PTE contains information about the page, such as its physical page number (“frame” number) as well
as status bits, such as whether or not the page is valid, and other bits to be discussed later.
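To make this concrete, here is a minimal sketch in C of a single-level table and its lookup. The field widths and names are illustrative only, not any particular architecture's layout:

    #include <stdint.h>

    /* One page-table entry: a frame number plus status bits.
       The exact bit layout here is illustrative, not a real architecture's. */
    typedef struct {
        uint32_t frame : 20;  /* physical frame number */
        uint32_t valid : 1;   /* is this virtual page mapped? */
        uint32_t other : 11;  /* protection, dirty, referenced, ... */
    } pte_t;

    #define PAGE_SHIFT 12                 /* 4 kbyte pages */
    #define NUM_PAGES  (1u << 20)         /* 2^20 entries on a 32-bit machine */

    /* Single-level table: one big linear array indexed by virtual page number. */
    pte_t page_table[NUM_PAGES];

    /* Translate a virtual address; returns 0 on an invalid page (sketch only). */
    int translate(uint32_t vaddr, uint32_t *paddr) {
        uint32_t vpn    = vaddr >> PAGE_SHIFT;               /* virtual page number */
        uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);  /* offset within page  */
        if (!page_table[vpn].valid)
            return 0;                                        /* would trigger a fault */
        *paddr = (page_table[vpn].frame << PAGE_SHIFT) | offset;
        return 1;
    }

Note that the statically declared array already illustrates the problem discussed next: with 2^20 entries of 4 bytes each, the table itself occupies 4 Mbytes.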
If we have a 32-bit architecture with 4k pages, then we have 2^20 pages, as discussed in the last lecture. If
each PTE is 4 bytes, then each page table requires 4 Mbytes of memory. And remember that each process
needs its own page table, and there may be on the order of 100 processes running on a typical personal
computer. This would require on the order of 400 Mbytes of RAM just to hold the page tables on a typical
desktop!
Furthermore, many programs have a very sparse virtual address space. The vast majority of their PTEs
would simply be marked invalid.
Clearly, we need a better solution than single-level page tables.
15.1.2 Multi-Level Page Tables

Multi-level page tables are tree-like structures for holding page tables. As an example, consider a two-level page table, again on a 32-bit architecture with 2^12 = 4 kbyte pages. Now we can divide the virtual address into three parts: say 10 bits for the level-0 index, 10 bits for the level-1 index, and again 12 bits for the offset within a page.
The entries of the level-0 page table are pointers to a level-1 page table, and the entries of the level-1 page
table are PTEs as described above in the single-level page table section. Note that on a 32-bit architecture,
pointers are 4 bytes (32 bits), and PTEs are typically 4 bytes.
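As a sketch of the two-level lookup in C (again with illustrative names and layout; a real MMU performs this walk in hardware):

    #include <stdint.h>
    #include <stddef.h>

    #define PAGE_SHIFT 12   /* 12-bit offset within a 4 kbyte page */
    #define L1_BITS    10   /* 10-bit level-1 index */
    #define L0_BITS    10   /* 10-bit level-0 index */

    typedef struct {
        uint32_t frame : 20;
        uint32_t valid : 1;
        uint32_t other : 11;
    } pte_t;

    /* Level-0 table: 2^10 pointers to level-1 tables (NULL if not allocated). */
    pte_t *level0[1u << L0_BITS];

    /* Walk the two-level table; returns 0 if any level is missing or invalid. */
    int translate2(uint32_t vaddr, uint32_t *paddr) {
        uint32_t i0     = vaddr >> (PAGE_SHIFT + L1_BITS);              /* top 10 bits  */
        uint32_t i1     = (vaddr >> PAGE_SHIFT) & ((1u << L1_BITS) - 1);/* next 10 bits */
        uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);             /* low 12 bits  */

        pte_t *level1 = level0[i0];
        if (level1 == NULL || !level1[i1].valid)
            return 0;                                  /* unmapped: page fault */
        *paddr = (level1[i1].frame << PAGE_SHIFT) | offset;
        return 1;
    }

The savings come from the NULL entries in level0: a level-1 table is only allocated for the parts of the address space the process actually uses.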
So, if we have one valid page in our process, now our two-level page table only consumes
(2^10 level-0 entries) · (2^2 bytes/entry) + 1 · (2^10 level-1 entries) · (2^2 bytes/entry) = 2 · 2^12 bytes = 8 kbytes.
For processes with sparse virtual memory maps, this is clearly a huge savings, made possible by the additional
layer of indirection.
Note that for a process which uses its full memory map, this two-level page table would use slightly more memory than the single-level page table (4k + 4M versus 4M). The worst case, in terms of memory efficiency, is when all 2^10 level-1 page tables are required, but each one has only a single valid entry.
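Concretely, in that worst case the tables consume 2^12 + 2^10 · 2^12 bytes (one level-0 table plus 2^10 level-1 tables), which is just over 4 Mbytes, while mapping only 2^10 pages, i.e. 4 Mbytes, of actual data.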
In practice, most page tables are 3-level or 4-level tables. The sizes of the indices for the different levels are optimized empirically by the hardware designers, and these sizes are then permanently set in hardware for a given architecture.
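For instance, x86-64 with 4 kbyte pages splits a 48-bit virtual address into four 9-bit indices plus a 12-bit offset. A small sketch of that split (level names follow Intel's terminology):

    #include <stdint.h>

    /* x86-64 4-level paging with 4 kbyte pages: four 9-bit indices plus a
       12-bit offset cover the 48-bit virtual address space (4*9 + 12 = 48). */
    void split_x86_64(uint64_t vaddr, uint32_t idx[4], uint32_t *offset) {
        *offset = vaddr & 0xFFF;            /* bits 0-11  : offset within page */
        idx[3]  = (vaddr >> 12) & 0x1FF;    /* bits 12-20 : page table index   */
        idx[2]  = (vaddr >> 21) & 0x1FF;    /* bits 21-29 : page directory     */
        idx[1]  = (vaddr >> 30) & 0x1FF;    /* bits 30-38 : PDPT index         */
        idx[0]  = (vaddr >> 39) & 0x1FF;    /* bits 39-47 : PML4 index         */
    }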
We will briefly introduce paging to finish off this lecture. When a process is loaded, not all of its pages are immediately loaded, since it is possible that they will not all be needed, and memory is a scarce resource. The process' virtual memory address space is broken up into pages, and each valid page is either loaded into RAM or not. When the MMU encounters a virtual address which is valid but not loaded (not "resident") in memory, the MMU issues a pagefault to the operating system. The OS then loads the page from disk into RAM and lets the MMU continue with its address translation. Since access times for RAM are measured in nanoseconds while access times for disks are measured in milliseconds, excessive pagefaults will clearly hurt performance a lot.
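As a rough illustration of the idea (not real kernel code), here is a toy user-space simulation in which a page is marked resident only after its first access triggers a simulated pagefault:

    #include <stdio.h>
    #include <stddef.h>

    /* Toy demand-paging simulation: pages start out non-resident and are
       "loaded from disk" the first time they are accessed. */

    #define NUM_PAGES 8

    typedef struct {
        int resident;   /* is the page currently in RAM?              */
        int frame;      /* which physical frame holds it, if resident */
    } page_t;

    page_t pages[NUM_PAGES];
    int next_free_frame = 0;
    int pagefaults = 0;

    /* Access a virtual page; fault and "load" it if it is not resident. */
    int access_page(int vpn) {
        if (!pages[vpn].resident) {
            pagefaults++;
            /* In a real OS this is a millisecond-scale disk read during
               which the process sleeps; here we just grab the next frame. */
            pages[vpn].frame = next_free_frame++;
            pages[vpn].resident = 1;
        }
        return pages[vpn].frame;
    }

    int main(void) {
        int trace[] = {0, 1, 0, 2, 1, 3};   /* a short reference string */
        for (size_t i = 0; i < sizeof trace / sizeof trace[0]; i++)
            printf("page %d -> frame %d\n", trace[i], access_page(trace[i]));
        printf("pagefaults: %d\n", pagefaults);
        return 0;
    }

Running this trace produces four pagefaults (for pages 0, 1, 2, and 3) and two fast re-accesses, mirroring why keeping the fault rate low matters so much for performance.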
¹ TLB stands for "translation lookaside buffer", which is not a very enlightening name for this subsystem.