
Operating System

Chapter 3
Memory Management
3.1 Introduction
Memory is central to the operation of a modern computer system. Memory is a large array of
words or bytes, each with its own address. The CPU fetches instructions from memory
according to the value of the program counter. These instructions may cause additional
loading from and storing to specific memory addresses.

A typical instruction execution cycle, for example, will first fetch an instruction from
memory. The instruction is then decoded and may cause operands to be fetched from
memory. After the instruction has been executed on the operands, results may be stored back
in memory.

The purpose of memory management is to ensure fair, secure, orderly, and efficient use of
memory. The task of memory management includes keeping track of used and free memory space, as well as deciding when, where, and how much memory to allocate and deallocate. It is also responsible for swapping processes in and out of main memory.

The memory hierarchy includes:


 Very small, extremely fast, extremely expensive, and volatile CPU registers
 Small, very fast, expensive, and volatile cache
 Hundreds of megabytes of medium-speed, medium-price, volatile main memory
 Hundreds of gigabytes of slow, cheap, and non-volatile secondary storage

3.1.1 Address Binding


Usually, a program resides on a disk as a binary executable file. The program must be
brought into memory and placed within a process for it to be executed. Depending on the
memory management in use, the process may be moved between disk and memory during its
execution. The collection of processes on the disk that are waiting to be brought into memory
for execution forms the input queue.

The normal procedure is to select one of the processes in the input queue and to load that
process into memory. As the process is executed, it accesses instructions and data from
memory. Eventually, the process terminates, and its memory space is declared available.

In most cases, a user program will go through several steps (some of which may be optional)
before being executed. Addresses may be represented in different ways during these steps.
Addresses in the source program are generally symbolic (such as COUNT). A compiler will
typically bind these symbolic addresses to relocatable addresses (such as "14 bytes from the
beginning of this module"). The linkage editor or loader will in turn bind these relocatable
addresses to absolute addresses (such as 74014). Each binding is a mapping from one
address space to another.

Classically, the binding of instructions and data to memory addresses can be done at any step
along the way:
 Compile time: If it is known at compile time where the process will reside in
memory, then absolute code can be generated by the compiler.

 Load time: If it is not known at compile time where the process will reside in
memory, then the compiler must generate relocatable code. In this case, final binding
is delayed until load time.
 Execution time: If the process can be moved during its execution from one memory
segment to another, then binding must be delayed until run time.

We shall see in this chapter how these various bindings can be implemented effectively in a
computer system.

3.1.2 Dynamic Loading


To obtain better memory-space utilization, we can use dynamic loading. With dynamic
loading, a routine is not loaded until it is called. All routines are kept on disk in a relocatable
load format. The main program is loaded into memory and is executed. When a routine needs
to call another routine, the calling routine first checks to see whether the other routine has
been loaded. If it has not been, the relocatable linking loader is called to load the desired
routine into memory. The advantage of dynamic loading is that an unused routine is never
loaded.

Dynamic loading does not require special support from the operating system. It is the
responsibility of the users to design their programs to take advantage of such a scheme.
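
The idea can be sketched in ordinary application code. The following Python fragment is only an illustration of the scheme described above, not an operating-system facility: the _loaded table and the lazy_call helper are invented for this example, and Python's importlib stands in for the relocatable linking loader.

import importlib

_loaded = {}  # routines (modules) that have already been brought into memory

def lazy_call(module_name, routine_name, *args):
    """Load module_name on first use, then invoke the named routine."""
    if module_name not in _loaded:
        # The "relocatable linking loader" step: bring the routine in now.
        _loaded[module_name] = importlib.import_module(module_name)
    return getattr(_loaded[module_name], routine_name)(*args)

# A routine that is never requested is never loaded. For example,
# lazy_call("statistics", "mean", [1, 2, 3]) loads the statistics module
# only when this call actually runs.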

3.1.3 Dynamic Linking


Most operating systems support only static linking, in which system language libraries are
treated like any other object module and are combined by the loader into the binary program
image. The concept of dynamic linking is similar to that of dynamic loading. Rather than
loading being postponed until execution time, linking is postponed. This feature is usually
used with system libraries, such as language subroutine libraries. Without this facility, all
programs on a system need to have a copy of their language library included in the executable
image.

3.1.4 Overlays
In our discussion so far, the entire program and data of a process must be in physical memory
for the process to execute. The size of a process is limited to the size of physical memory. So
that a process can be larger than the amount of memory allocated to it, a technique called
overlays is sometimes used. The idea of overlays is to keep in memory only those instructions
and data that are needed at any given time. When other instructions are needed, they are
loaded into space that was occupied previously by instructions that are no longer needed.

The use of overlays is currently limited to microcomputers and other systems that have limited amounts of physical memory and that lack hardware support for more advanced techniques.

3.2 Logical versus Physical Address Space


An address generated by the CPU is commonly referred to as a logical address, whereas an address seen by the memory unit (that is, the one loaded into the memory-address register of the memory) is commonly referred to as a physical address.

The compile-time and load-time address-binding schemes result in an environment where the
logical and physical addresses are the same. However, the execution-time address-binding
scheme results in an environment where the logical and physical addresses differ. In this case, we usually refer to the logical address as a virtual address. We use logical address and
virtual address interchangeably. The set of all logical addresses generated by a program is
referred to as a logical address space; the set of all physical addresses corresponding to these
logical addresses is referred to as a physical address space. Thus, in the execution-time
address-binding scheme, the logical and physical address spaces differ.

The run-time mapping from virtual to physical addresses is done by the memory-
management unit (MMU), which is a hardware device. There are a number of different
schemes for accomplishing such a mapping, as will be discussed later. For the time being, we
shall illustrate this mapping with a simple MMU scheme.

In this case, the base register is called a relocation register. The value in the relocation register is added to every address generated by a user process at the time the address is sent to memory. For example, if the base is at 14,000, then an attempt by the user to address location 0 is dynamically relocated to location 14,000; an access to location 346 is mapped to location 14,346, as shown in Figure 3.1.

Figure 3.1: Dynamic relocation using a relocation register.

Notice that the user program never sees the real physical addresses. The user program deals
with logical addresses. The memory-mapping hardware converts logical addresses into
physical addresses.
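
As a quick illustration of the arithmetic, the following Python sketch models what the MMU does with a relocation register. The base value is the one used in the text (14,000); the function name is invented for the example.

RELOCATION_REGISTER = 14_000   # value loaded by the operating system

def mmu_map(logical_address):
    """Add the relocation register to every logical address."""
    return logical_address + RELOCATION_REGISTER

print(mmu_map(0))    # 14000
print(mmu_map(346))  # 14346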

3.3 Swapping
A process needs to be in memory to be executed. A process, however, can be swapped
temporarily out of memory to a backing store, and then brought back into memory for
continued execution. For example, assume a multiprogramming environment with a round-
robin CPU-scheduling algorithm. When a quantum expires, the memory manager will start to
swap out the process that just finished, and to swap in another process to the memory space
that has been freed (Figure 3.2). In the meantime, the CPU scheduler will allocate a time slice
to some other process in memory. When each process finishes its quantum, it will be
swapped with another process.


A variant of this swapping policy is used for priority-based scheduling algorithms. If a higher-priority process arrives and wants service, the memory manager can swap out the
lower-priority process so that it can load and execute the higher-priority process. When the
higher-priority process finishes, the lower-priority process can be swapped back in and
continued. This variant of swapping is sometimes called roll out, roll in.

Figure 3.2: Swapping of two processes using a disk as a backing store.

Swapping requires a backing store. The backing store is commonly a fast disk. It must be
large enough to accommodate copies of all memory images for all users, and must provide
direct access to these memory images. The system maintains a ready queue consisting of all
processes whose memory images are on the backing store or in memory and are ready to run.
Whenever the CPU scheduler decides to execute a process, it calls the dispatcher. The
dispatcher checks to see whether the next process in the queue is in memory. If the process is
not, and there is no free memory region, the dispatcher swaps out a process currently in
memory and swaps in the desired process. It should be clear that the context-switch time in
such a swapping system is fairly high.

3.4 Contiguous Allocation


The main memory must accommodate both the operating system and the various user
processes. The memory is usually divided into two partitions, one for the resident operating
system, and one for the user processes. It is possible to place the operating system in either
low memory or high memory. In our discussion we shall assume that the operating system
resides in low memory (Figure 3.3).

Figure 3.2: Memory partition.



It is desirable to have several user processes residing in the memory at the same time. In
contiguous memory allocation, each process is contained in a single contiguous section of
memory. The base (relocation) and limit registers hold the smallest memory address of a process and its size, respectively.

3.4.1 Single-Partition Allocation


If the operating system is residing in low memory and the user processes are executing in
high memory, we need to protect the operating-system code and data from changes
(accidental or malicious) by the user processes. We also need to protect the user processes
from one another. We can provide this protection by using a relocation register with a limit
register. The relocation register contains the value of the smallest physical address; the limit
register contains the range of logical addresses (for example, relocation = 100,040 and limit =
74,600). With relocation and limit registers, each logical address must be less than the limit
register; the MMU maps the logical address dynamically by adding the value in the relocation
register. This mapped address is sent to memory (Figure 3.3).

Figure 3.3: The program can access memory between the base and the limit.
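
The protection check can be sketched the same way. This is only an illustrative model of the hardware described above, using the register values from the example (relocation = 100,040 and limit = 74,600); the trap is represented here by a raised exception.

RELOCATION = 100_040   # smallest physical address of the process
LIMIT = 74_600         # range of legal logical addresses

def mmu_map(logical_address):
    """Check the address against the limit register, then relocate it."""
    if not (0 <= logical_address < LIMIT):
        # In hardware, this condition causes a trap to the operating system.
        raise MemoryError("addressing error: trap to operating system")
    return logical_address + RELOCATION

print(mmu_map(0))       # 100040
print(mmu_map(74_599))  # 174639
# mmu_map(74_600) would trap: it lies outside the process's logical range.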

When the CPU scheduler selects a process for execution, the dispatcher loads the relocation
and limit registers with the correct values as part of the context switch. Because every address
generated by the CPU is checked against these registers, we can protect both the operating
system and the other users’ programs and data from being modified by this running process.
The base register makes it impossible for a program to reference any part of memory below
itself. Furthermore, the limit register makes it impossible to reference any part of memory
above itself.


Note that the relocation-register scheme provides an effective way to allow the operating-
system size to change dynamically. This flexibility is desirable in many situations. For
example, the operating system contains code and buffer space for device drivers. If a device
driver (or other operating-system service) is not commonly used, it is undesirable to keep the
code and data in memory, as we might be able to use that space for other purposes. Such code
is sometimes called transient operating-system code; it comes and goes as needed. Thus,
using this code changes the size of the operating system during program execution.

3.4.2 Multiple-Partition Allocation


Because it is desirable, in general, that there be several user processes residing in memory at
the same time, we need to consider the problem of how to allocate available memory to the
various processes that are in the input queue waiting to be brought into memory. One of the
simplest schemes for memory allocation is to divide memory into a number of fixed-sized
partitions. Each partition may contain exactly one process. Thus, the degree of
multiprogramming is bound by the number of partitions. When a partition is free, a process is
selected from the input queue and is loaded into the free partition. When the process
terminates, the partition becomes available for another process. This scheme was originally
used by the IBM OS/360 operating system; it is no longer in use. The scheme described next, called MVT (Multiprogramming with a Variable number of Tasks), is a generalization of the fixed-partition scheme and is used primarily in a batch environment.

The operating system keeps a table indicating which parts of memory are available and which
are occupied. Initially, all memory is available for user processes, and is considered as one
large block of available memory, a hole. When a process arrives and needs memory, we
search for a hole large enough for this process. If we find one, we allocate only as much
memory as is needed, keeping the rest available to satisfy future requests.

When a process is allocated space, it is loaded into memory and it can then compete for the
CPU. When a process terminates, it releases its memory, which the operating system may
then fill with another process from the input queue.

At any given time, we have a list of available block sizes and the input queue. Memory is
allocated to processes until, finally, the memory requirements of the next process cannot be
satisfied; no available block of memory (hole) is large enough to hold that process. The
operating system can then wait until a large enough block is available, or it can skip down the
input queue to see whether the smaller memory requirements of some other process can be
met.

In general, there is at any time a set of holes, of various sizes, scattered throughout memory.
When a process arrives and needs memory, we search this set for a hole that is large enough
for this process. If the hole is too large, it is split into two: One part is allocated to the arriving
process; the other is returned to the set of holes. When a process terminates, it releases its
block of memory, which is then placed back in the set of holes. If the new hole is adjacent to
other holes, we merge these adjacent holes to form one larger hole. At this point, we may
need to check whether there are processes waiting for memory and whether this newly freed
and recombined memory could satisfy the demands of any of these waiting processes. This
procedure is a particular instance of the general dynamic storage allocation problem, which is
how to satisfy a request of size n from a list of free holes. There are many solutions to this
problem. The set of holes is searched to determine which hole is best to allocate. First-fit, best-fit, and worst-fit are the most common strategies used to select a free hole from the set of available holes; a short code sketch of the three strategies follows the list below.

 First-fit: Allocate the first hole that is big enough. We can stop searching as soon as
we find a free hole that is large enough.
 Best-fit: Allocate the smallest hole that is big enough. We must search the entire list,
unless the list is kept ordered by size. This strategy produces the smallest leftover
hole.
 Worst-fit: Allocate the largest hole. Again, we must search the entire list, unless it is
sorted by size. This strategy produces the largest leftover hole, which may be more
useful than the smaller leftover hole from a best-fit approach.
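
The following Python sketch shows the three strategies side by side. The hole sizes and the request size are made-up illustrative values; a real allocator would track holes in a more efficient structure than a plain list.

def find_hole(holes, request, strategy):
    """Return the index of the hole chosen for the request, or None."""
    candidates = [i for i, size in enumerate(holes) if size >= request]
    if not candidates:
        return None                                     # no hole is large enough
    if strategy == "first-fit":
        return candidates[0]                            # first hole that is big enough
    if strategy == "best-fit":
        return min(candidates, key=lambda i: holes[i])  # smallest hole that fits
    if strategy == "worst-fit":
        return max(candidates, key=lambda i: holes[i])  # largest hole
    raise ValueError(strategy)

def allocate(holes, request, strategy):
    """Split the chosen hole and return the resulting list of hole sizes."""
    i = find_hole(holes, request, strategy)
    if i is None:
        return None
    leftover = holes[i] - request
    return holes[:i] + ([leftover] if leftover else []) + holes[i + 1:]

holes = [100, 500, 200, 300, 600]          # free-hole sizes (illustrative)
print(allocate(holes, 212, "first-fit"))   # [100, 288, 200, 300, 600]
print(allocate(holes, 212, "best-fit"))    # [100, 500, 200, 88, 600]
print(allocate(holes, 212, "worst-fit"))   # [100, 500, 200, 300, 388]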

Simulations have shown that both first-fit and best-fit are better than worst-fit in terms of
decreasing both time and storage utilization. Neither first-fit nor best-fit is clearly better in
terms of storage utilization, but first-fit is generally faster.

3.4.3 External and Internal Fragmentation


The algorithms described above suffer from external fragmentation. As processes are loaded
and removed from memory, the free memory space is broken into little pieces. External
fragmentation exists when enough total memory space exists to satisfy a request, but it is not
contiguous; storage is fragmented into a large number of small holes. One solution to the
problem of external fragmentation is compaction. The goal is to shuffle the memory contents
to place all free memory together in one large block.

Another problem that arises with the multiple partition allocation scheme is internal
fragmentation. Consider the hole of 18,464 bytes. Suppose that the next process requests
18,462 bytes. If we allocate exactly the requested block, we are left with a hole of 2 bytes.
The overhead to keep track of this hole will be substantially larger than the hole itself. The
general approach is to allocate very small holes as part of the larger request. Thus, the
allocated memory may be slightly larger than the requested memory. The difference between
these two numbers is internal fragmentation - memory that is internal to a partition, but is not
being used.

3.5 Paging
Another possible solution to the external fragmentation problem is to permit the logical
address space of a process to be noncontiguous, thus allowing a process to be allocated
physical memory wherever the latter is available. One way of implementing this solution is
through the use of a paging scheme.

Physical memory is broken into fixed-sized blocks called frames. Logical memory is also
broken into blocks of the same size called pages. When a process is to be executed, its pages
are loaded into any available memory frames from the backing store. The backing store is
divided into fixed-sized blocks that are of the same size as the memory frames.

Every address generated by the CPU is divided into two parts: a page number (p) and a page
offset (d). The page number is used as an index into a page table. The page table contains the
base address of each page in physical memory. This base address is combined with the page
offset to define the physical memory address that is sent to the memory unit. The paging
model of memory is shown in Figure 3.4.


Figure 3.4: Paging model of logical and physical memory.

The page size (like the frame size) is defined by the hardware. The size of a page is typically
a power of 2 varying between 512 bytes and 8192 bytes per page, depending on the computer
architecture. If the size of the logical address space is 2^m and the page size is 2^n addressing units (bytes or words), then the high-order m - n bits of a logical address designate the page number, and the n low-order bits designate the page offset. Thus, the logical address is as follows:

| page number p (m - n bits) | page offset d (n bits) |

where p is an index into the page table and d is the displacement within the page.

For example, consider the memory of Figure 3.5. Using a page size of 4 bytes and a physical
memory of 32 bytes (8 pages), we show an example of how the user's view of memory can be
mapped into physical memory. Logical address 0 is page 0, offset 0. Indexing into the page
table, we find that page 0 is in frame 5. Thus, logical address 0 maps to physical address 20
(= (5 x 4) + 0). Logical address 3 (page 0, offset 3) maps to physical address 23 (= (5 x 4) +
3). Logical address 4 is page 1, offset 0; according to the page table, page 1 is mapped to
frame 6. Thus, logical address 4 maps to physical address 24 (= (6 x 4) + 0). Logical address
13 maps to physical address 9.
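
To make the arithmetic concrete, here is a small Python sketch of the translation, using the 4-byte page size and the frame assignments stated in the example; the names are invented for the illustration, and only the pages mentioned in the text appear in the table.

PAGE_SIZE = 4       # bytes per page (and per frame)
OFFSET_BITS = 2     # because 2**2 == PAGE_SIZE

# Frame numbers taken from the worked example: page 0 -> frame 5,
# page 1 -> frame 6, page 3 -> frame 2.
page_table = {0: 5, 1: 6, 3: 2}

def translate(logical_address):
    """Split the logical address into page number and offset, then map it."""
    page = logical_address >> OFFSET_BITS        # high-order bits
    offset = logical_address & (PAGE_SIZE - 1)   # low-order bits
    return page_table[page] * PAGE_SIZE + offset

for addr in (0, 3, 4, 13):
    print(addr, "->", translate(addr))   # 0 -> 20, 3 -> 23, 4 -> 24, 13 -> 9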

Notice that paging itself is a form of dynamic relocation. When we use a paging scheme, we
have no external fragmentation: Any free frame can be allocated to a process that needs it.
However, we may have some internal fragmentation.


Figure 3.5: Paging example for a 32-byte memory with 4-byte pages.

3.6 Segmentation
The user's view of memory is not the same as the actual physical memory. The user’s view is
mapped onto physical memory. What is the user’s view of memory? Does the user think of
memory as a linear array of bytes, some containing instructions and others containing data, or
is there some other preferred memory view? There is general agreement that the user or
programmer of a system does not think of memory as a linear array of bytes. Rather, the user
prefers to view memory as a collection of variable-sized segments, with no necessary
ordering among segments.

Segmentation is a memory-management scheme that supports this user view of memory. A logical address space is a collection of segments. A user program can be subdivided using
segmentation, in which the program and its associated data are divided into a number of
segments. It is not required that all segments of all programs be of the same length. As with
paging, a logical address using segmentation consists of two parts: a segment number, s, and
an offset into that segment, d.


The segment number is used as an index into the segment table. The offset d of the logical
address must be between 0 and the segment limit. If this offset is legal, it is added to the
segment base to produce the address in physical memory of the desired byte. The segment
table is thus essentially an array of base-limit register pairs.

Figure 3.6: Example of segmentation.

As an example, consider the situation shown in Figure 3.6. We have five segments numbered
from 0 through 4. The segments are stored in physical memory as shown. The segment table
has a separate entry for each segment, giving the beginning address of the segment in
physical memory (the base) and the length of that segment (the limit). For example, segment
2 is 400 bytes long, and begins at location 4300. Thus, a reference to byte 53 of segment 2 is
mapped onto location 4300 + 53 = 4353. A reference to segment 3, byte 852, is mapped to
3200 (the base of segment 3) + 852 = 4052. A reference to byte 1222 of segment 0 would
result in a trap to the operating system, as this segment is only 1000 bytes long.
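
The same check-and-add logic can be sketched per segment. The entries for segments 2 and 3 below use the base values given in the example; the base of segment 0 and the limit of segment 3 are not stated in the text and are illustrative assumptions, and the trap is modelled as a raised exception.

# Segment table: segment number -> (base, limit).
segment_table = {
    0: (1400, 1000),   # limit 1000 is from the text; base 1400 is assumed
    2: (4300, 400),    # both values are from the text
    3: (3200, 1100),   # base 3200 is from the text; limit 1100 is assumed
}

def translate(s, d):
    """Map (segment number s, offset d) to a physical address, or trap."""
    base, limit = segment_table[s]
    if not (0 <= d < limit):
        raise MemoryError("segment offset out of range: trap to the OS")
    return base + d

print(translate(2, 53))    # 4353
print(translate(3, 852))   # 4052
# translate(0, 1222) raises: segment 0 is only 1000 bytes long.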

3.7 Virtual Memory


Virtual memory is a technique that allows the execution of processes that may not be completely in memory. The main visible advantage of this scheme is that programs can be larger than physical memory. Virtual memory is the separation of user logical memory from physical memory. This separation allows an extremely large virtual memory to be provided for programmers when only a smaller physical memory is available (Figure 3.7).


Figure 3.7: Diagram showing virtual memory that is larger than physical memory.

The following are situations in which the entire program is not required to be fully loaded:
 User written error handling routines are used only when an error occurs in the data or
computation.
 Certain options and features of a program may be used rarely.
 Many tables are assigned a fixed amount of address space even though only a small
amount of the table is actually used.

The ability to execute a program that is only partially in memory would confer many benefits:
• Fewer I/O operations would be needed to load or swap each user program into memory.
• A program would no longer be constrained by the amount of physical memory that is available.
• Because each user program could take less physical memory, more programs could be run at the same time, with a corresponding increase in CPU utilization and throughput.

Virtual memory is commonly implemented by demand paging.

3.8 Demand Paging


Consider how an executable program might be loaded from disk into memory. One option is
to load the entire program in physical memory at program execution time. However, a
problem with this approach is that we may not initially need the entire program in memory.
Consider a program that starts with a list of available options from which the user is to select.
Loading the entire program into memory results in loading the executable code for all
options, regardless of whether an option is ultimately selected by the user or not. An
alternative strategy is to initially load pages only as they are needed. This technique is known
as demand paging and is commonly used in virtual memory systems.

With demand-paged virtual memory, pages are only loaded when they are demanded during
program execution; pages that are never accessed are thus never loaded into physical
memory. A demand-paging system is similar to a paging system with swapping (Figure 3.8)
where processes reside in secondary memory (usually a disk).


Figure 3.8: Transfer of a paged memory to contiguous disk space.

When we want to execute a process, we swap it into memory. Rather than swapping the
entire process into memory, however, we use a lazy swapper. A lazy swapper never swaps a
page into memory unless that page will be needed. Since we are now viewing a process as a
sequence of pages, rather than as one large contiguous address space, use of the term swapper
is technically incorrect. A swapper manipulates entire processes, whereas a pager is
concerned with the individual pages of a process. We thus use pager, rather than swapper, in
connection with demand paging.

When a process is to be swapped in, the pager guesses which pages will be used before the
process is swapped out again. Instead of swapping in a whole process, the pager brings only
those necessary pages into memory. Thus, it avoids reading into memory pages that will not
be used anyway, decreasing the swap time and the amount of physical memory needed.

If a process tries to access a page that was not brought into memory, that access causes a page fault. When a page fault occurs, the operating system fetches the missing page from disk. The process is blocked while the necessary page is being located and read in.
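
The mechanism can be sketched in a few lines of Python. This is a toy model of the behaviour just described, not a real pager: the page table holds only the pages that are resident, and a dictionary stands in for the backing store on disk.

backing_store = {0: "code page", 1: "data page", 2: "stack page"}  # the "disk"
page_table = {}      # page number -> contents; an absent entry means not in memory
page_faults = 0

def access(page):
    """Return the contents of the page, faulting it in from disk if needed."""
    global page_faults
    if page not in page_table:                  # invalid/absent entry: page fault
        page_faults += 1
        page_table[page] = backing_store[page]  # the pager reads the page from disk
    return page_table[page]

for p in (0, 1, 0, 2, 1):
    access(p)
print(page_faults)   # 3 -- only the first touch of each page faults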

Advantages of Demand Paging:


 Large virtual memory.
 More efficient use of memory.
 Unconstrained multiprogramming. There is no limit on degree of
multiprogramming.


3.9 Page Replacement Algorithm

• There are many different page-replacement algorithms; every operating system probably has its own replacement scheme. How do we select a particular replacement algorithm? In general, we want the one with the lowest page-fault rate. We evaluate an algorithm by running it on a particular string of memory references and computing the number of page faults. The string of memory references is called a reference string. Reference strings are generated artificially or by tracing a given system and recording the address of each memory reference; the latter choice produces a large volume of data (a small example of this reduction appears after the list).
• For a given page size, we need to consider only the page number, not the entire address. If we have a reference to a page p, then any immediately following references to page p will never cause a page fault: page p will be in memory after the first reference, so the immediately following references will not fault.
• To determine the number of page faults for a particular reference string and page-replacement algorithm, we also need to know the number of page frames available. As the number of available frames increases, the number of page faults decreases.
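
As a small illustration of how an address trace is reduced to a reference string, the following snippet uses a made-up list of byte addresses and a page size of 100 bytes; both are invented for the example.

PAGE_SIZE = 100
trace = [100, 432, 101, 612, 102, 103, 104, 101, 611, 602]   # illustrative addresses

reference_string = []
for address in trace:
    page = address // PAGE_SIZE
    # Immediately repeated references to the same page cannot fault again,
    # so consecutive duplicates are dropped.
    if not reference_string or reference_string[-1] != page:
        reference_string.append(page)

print(reference_string)   # [1, 4, 1, 6, 1, 6]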

3.9.1 FIFO Page Replacement


The simplest page-replacement algorithm is a first-in, first-out (FIFO) algorithm. A FIFO
replacement algorithm associates with each page the time when that page was brought into
memory. When a page must be replaced, the oldest page is chosen.

For example, consider the following reference string

7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
for a memory with three frames (Buffer size = 3).

For our example reference string, our three frames are initially empty. The first three
references (7, 0, 1) cause page faults and are brought into these empty frames. The next
reference (2) replaces page 7, because page 7 was brought in first. Since 0 is the next
reference and 0 is already in memory, we have no fault for this reference. The first reference
to 3 results in replacement of page 0, since it is now first in line. Because of this replacement,
the next reference, to 0, will fault. Page 1 is then replaced by page 0. This process continues
as shown in Figure 3.9. Every time a fault occurs, we show which pages are in our three
frames. There are 15 page faults altogether.

Figure 3.9: FIFO page-replacement algorithm.


Total page faults: 15.

The FIFO page-replacement algorithm is easy to understand and program. However, its
performance is not always good.
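
A direct way to check the count of 15 is to simulate FIFO replacement. The short Python sketch below uses the reference string and frame count from the example.

from collections import deque

def fifo_faults(reference_string, frame_count):
    """Count page faults under FIFO replacement."""
    frames = deque()                    # oldest page at the left end
    faults = 0
    for page in reference_string:
        if page not in frames:          # page fault
            faults += 1
            if len(frames) == frame_count:
                frames.popleft()        # replace the page brought in first
            frames.append(page)
        # a hit leaves the FIFO order unchanged
    return faults

ref = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(ref, 3))   # 15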


3.9.2 Optimal Page Replacement


An optimal page-replacement algorithm has the lowest page-fault rate of all algorithms. It is also called OPT. It simply replaces the page that will not be used for the longest period of time. The optimal page-replacement algorithm is difficult to implement, because it requires future knowledge of the reference string. Use of this page-replacement algorithm guarantees the lowest possible page-fault rate for a fixed number of frames.

For example, on our sample reference string,


7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
for a memory with three frames (Buffer size = 3)

the optimal page-replacement algorithm would yield nine page faults, as shown in Figure
3.10.

Figure 3.10: Optimal page-replacement algorithm.

The first three references cause faults that fill the three empty frames. The reference to page
2 replaces page 7, because 7 will not be used until reference 18, whereas page 0 will be used
at 5, and page 1 at 14. The reference to page 3 replaces page 1, as page 1 will be the last of
the three pages in memory to be referenced again. With only nine page faults, optimal
replacement is much better than a FIFO algorithm, which resulted in fifteen faults. (If we
ignore the first three, which all algorithms must suffer, then optimal replacement is twice as
good as FIFO replacement.) In fact, no replacement algorithm can process this reference
string in three frames with fewer than nine faults.
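
The same kind of check works for OPT. The sketch below looks ahead in the reference string to pick the victim, which is exactly why the algorithm cannot be implemented online.

def opt_faults(reference_string, frame_count):
    """Count page faults under optimal (OPT) replacement."""
    frames = []
    faults = 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue
        faults += 1
        if len(frames) < frame_count:
            frames.append(page)
            continue
        # Evict the resident page whose next use is furthest in the future
        # (or that is never used again).
        def next_use(p):
            future = reference_string[i + 1:]
            return future.index(p) if p in future else float("inf")
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults

ref = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(opt_faults(ref, 3))   # 9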


3.9.3 Least Recently Used (LRU) Algorithm


The FIFO algorithm uses the time when a page was brought into memory; the OPT algorithm uses the time when a page is to be used. LRU replaces the page that has not been used for the longest period of time. LRU replacement associates with each page the time of that page's last use; when a page must be replaced, LRU chooses the page whose last use occurred furthest in the past.

The result of applying LRU replacement to our example reference string is shown in Figure
3.11. The LRU algorithm produces 12 faults. Notice that the first five faults are the same as
the optimal replacement. When the reference to page 4 occurs, however, LRU replacement
sees that, of the three frames in memory, page 2 was used least recently. The most recently
used page is page 0, and just before that page 3 was used. Thus, the LRU algorithm replaces
page 2, not knowing that page 2 is about to be used. When it then faults for page 2, the LRU
algorithm replaces page 3 since, of the three pages in memory {0, 3, 4}, page 3 is the least
recently used. LRU replacement with 12 faults is still much better than FIFO replacement
with 15. The LRU policy is often used as a page-replacement algorithm and is considered to
be quite good.

Figure 3.11: LRU page-replacement algorithm.
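
For completeness, the LRU count can be reproduced the same way; the sketch below records the time of each page's last use, mirroring the description above.

def lru_faults(reference_string, frame_count):
    """Count page faults under least-recently-used (LRU) replacement."""
    frames = []
    last_used = {}                   # page -> index of its most recent use
    faults = 0
    for i, page in enumerate(reference_string):
        if page not in frames:       # page fault
            faults += 1
            if len(frames) == frame_count:
                victim = min(frames, key=lambda p: last_used[p])
                frames.remove(victim)    # evict the least recently used page
            frames.append(page)
        last_used[page] = i          # every reference refreshes the page
    return faults

ref = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(ref, 3))   # 12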
