Notes-4up-Forprinting-2012 OS PDF
- Memory (1) – physical memory, early paging and segmentation techniques
- Memory (2) – modern virtual memory concepts and techniques
- Memory (3) – paging policies
- I/O (1) – low-level I/O functions
- I/O (2) – high-level I/O functions and filesystems
- Case studies: one or both of the Windows NT family and IBM's System/390 family – N.B. you will be expected to study Linux during the practical exercise and in self-study.
- Other topics to be determined, e.g. security.

Textbooks

There are many very good operating systems textbooks, most of which cover the material of the course (and much more). I shall be (very loosely) following

W. Stallings, Operating Systems: Internals and Design Principles, Prentice-Hall/Pearson.

Another suitable book is

A. Silberschatz and P. Galvin, Operating Systems Concepts (5th or later edition), Addison-Wesley.

Most of the other major OS texts are also suitable. You are expected to read around the subject in some textbook, but there is no specific requirement to buy Stallings 7th edition. References to Stallings change from edition to edition, so are mainly by keyword.
Assessment

The course is assessed by a written examination (75%), one practical exercise (15%) and an essay (10%).

The practical exercise will run through weeks 3–8, and will involve understanding and modifying the Linux kernel. The final assessed outcome is a relatively small part of the work, and will not be too hard; most of the work will be in understanding C, Makefiles, the structure of a real OS kernel, etc. This is essential for real systems work!

The essay will be due at the end of week 10, and will be from a list of topics: either a more extensive investigation of something covered briefly in lectures, or a study of something not covered. (Ideas welcome.)

Acknowledgement

I should like to thank Dr Steven Hand of the University of Cambridge, who has provided me with many useful figures for use in my slides, and allowed me to use some of his slides as a basis for some of mine.
A brief and selective history of computing . . .

Computing machines have been increasing in complexity for many centuries, but only recently have they become complex enough to require something recognizable as an operating system. Here, mostly for fun, is a quick review of the development of computers.

The abacus – some millennia BP. [Association pour le musée international du calcul de l'informatique et de l'automatique de Valbonne Sophia Antipolis (AMISA)]

The Difference Engine, [The Analytical Engine] – 1812, 1832, Babbage / Lovelace. [Science Museum] The Analytical Engine (never built) anticipated many modern aspects of computers. See http://www.fourmilab.ch/babbage/.
Logarithms (Napier): the slide rule – 1622 Bissaker
First mechanical digital calculator – 1642 Pascal
Electro-mechanical punched card – 1890 Hollerith (→ IBM)
Vacuum tube – 1905 De Forest
Relay-based IBM 610 hits 1 multiplication/s – 1935
ABC, 1st electronic digital computer – 1939 Atanasoff / Berry
Z3, 1st programmable computer – 1941 Zuse
Colossus, Bletchley Park – 1943
[University of Pennsylvania]

[Figure: the von Neumann architecture – Input and Output connected to Memory, with a Control Unit and an Arithmetic Logical Unit containing the Accumulator.]

In 1945, John von Neumann drafted the EDVAC report, which set out the architecture now taken as standard.
- 30 tons, 1000 sq feet, 140 kW
- 18k vacuum tubes, 20 10-digit accumulators
- 100 kHz, around 300 M(ult)PS
- in 1946 added blinking lights for the Press!

Programmed by a plugboard, so very slow to change program.

The transistor – 1947 (Shockley, Bardeen, Brattain)

EDSAC, 1st stored program computer – 1949 (Wilkes)
- 3k vacuum tubes, 300 sq ft, 12 kW
- 500 kHz, ca 650 IPS
- 1K 17-bit words of memory (Hg ultrasonic delay lines)
- operating system of 31 words
- see http://www.dcs.warwick.ac.uk/~edsac/ for a simulator

TRADIC, 1st valve-free computer – 1954 (Bell Labs)
First IC – 1959 (Kilby & Noyce, TI)
IBM System/360 – 1964. Direct ancestor of today's zSeries, with continually evolved operating system.
Intel 4004, 1st µ-processor – 1971 (Ted Hoff)
Intel 8086, IBM PC – 1978
VLSI (> 100k transistors) – 1980
Levels of (Programming) Languages

Level 5: ML/Java Bytecode
  (interpret)
Level 4: C/C++ Source
  (compile)
Level 3: ASM Source
  (assemble)
Level 2: Object File, with other object files ("Libraries")
  (link)
Level 1: Executable File ("Machine Code")
  (execute)

(Modern) computers can be programmed at several levels. A level relates to the one below via either translation/compilation or interpretation. Think of a virtual machine in each layer built on the lower VM; the machine at one level understands the language of that level. This course considers mainly levels 1 and 2.

Exercise: Operating systems are often written in assembly language or C or higher. What does it mean to say level 2 is below levels 3 and 4?

Quick Review of Computer Architecture

[Figure: a processor (register file including PC, control unit, execution unit) connected via address, data and control buses to memory (e.g. 64 MByte: 2^26 × 8 = 536,870,912 bits), hard disk, framebuffer, Super I/O (mouse, keyboard, serial) and sound card.]
Intel Pentium has:
- eight 32-bit general purpose registers
- six 16-bit segment registers (for address space management)
- two 32-bit control registers, including Program Counter (called EIP by Intel)

IBM z/Architecture has:
- sixteen 64-bit general registers
- sixteen 64-bit floating point registers
- one 32-bit floating point control register
- sixteen 64-bit control registers
- sixteen 32-bit access registers (for address space management)
- one Program Status Word (PC)

The cache is fast, expensive memory sitting between CPU and main memory – cache ↔ CPU via special bus. May have several levels of cache – current IBM mainframes have four. The OS has to be aware of the cache and control it, e.g. when switching address spaces.
[Figure: inside the CPU – the Control Unit decodes instructions from the instruction buffer (IB), fed by an instruction cache; the Execution Unit operates on the register file (including the PC) via a data cache; address, data and control buses connect to 64MB DRAM and 32K ROM.]

PC initialized to fixed value on CPU reset. Then repeat (until halt):
1. instruction is fetched from memory address in PC into instruction buffer
2. Control Unit decodes instruction
3. Execution Unit executes it
4. PC is updated: explicitly by jumps, implicitly otherwise
Input/Output Devices

We'll consider these later in the course. For now, note that:
- I/O devices are typically connected to the CPU via a bus (or via a chain of buses and bridges)
- there is a wide range of devices, e.g.: hard disk, CD, graphics card, sound card, ethernet card, modem

Most computers have many different buses, with different functions and characteristics.

Bus Hierarchy

[Figure: processor and caches on a processor bus; a 100 MHz memory bus carrying 64 MByte DIMMs; a bridge leading to a slower bus with a SCSI controller and a sound card.]
Buses

[Figure: processor, memory and other devices sharing ADDRESS, DATA and CONTROL lines.]

A bus is a group of 'wires' shared by several devices (e.g. CPU, memory, I/O). Buses are cheap and versatile, but can be a severe performance bottleneck (e.g. PC-card hard disks). A bus typically has address lines, data lines and control lines.

Operated in master–slave protocol: e.g. to read data from memory, CPU (master) puts address on bus and asserts 'read'; memory (slave) retrieves data, puts data on bus; CPU reads from bus. In some cases, may need an initialization protocol to decide which device is the bus master; in others, it's pre-determined.

Interrupts

Devices are much slower than the CPU; can't have the CPU wait for a device. Also, external events may occur. Interrupts provide a suitable mechanism. An interrupt is (logically) a signal line into the CPU. When asserted, the CPU jumps to a particular location (e.g. on x86, on interrupt (IRQ) n, the CPU jumps to the address stored in the nth entry of the table pointed to by the IDTR control register). The jump saves state; when the interrupt handler finishes, it uses a special return instruction to restore control to the original program.

Thus, I/O operation is: instruct device and continue with other tasks; when the device finishes, it raises an interrupt; the handler gets info from the device etc. and schedules the requesting task. In practice (e.g. x86), there may be one or two interrupt pins on the chip, with an interrupt controller to encode external interrupts onto the bus for the CPU.
Direct Memory Access (DMA)

DMA means allowing devices to write directly (i.e. via the bus) into main memory. E.g., CPU tells device 'write next block of data into address x'; gets an interrupt when done. PCs have basic DMA; IBM mainframes' 'I/O channels' are a sophisticated extension of DMA (the CPU can construct complex programs for the device to execute).

In the beginning. . .

Earliest 'OS' simply transferred programs from punched card reader to memory. Everything else done by lights and switches on the front panel. Job scheduling done by sign-up sheets. User ( = programmer = operator) had to set up the entire job (e.g.: load compiler, load source code, invoke compiler, etc.) programmatically. I/O directly programmed.
Early batch systems

Late 1950s–early 1960s saw the introduction of batch systems (General Motors, IBM; standard on IBM 7090/7094).

[Figure: memory layout – the resident monitor (interrupt processing, device drivers, job sequencing, control language interpreter) above the boundary; the user program area below.]

- monitor is a simple resident OS: reads jobs, transfers control to program, receives control back from program at end of task
- batches of jobs can be put onto one tape and read in turn by the monitor – reduces human intervention
- monitor permanently resident: user programs must be loaded into a different area of memory

Making good use of resource – multiprogramming

Even in the 60s, I/O was very slow compared to CPU. So jobs would waste most (typically > 75%) of the CPU cycles waiting for I/O. Multiprogramming introduced: monitor loads several user programs; when one is waiting for I/O, run another.

Multiprogramming means the monitor must:
- manage memory among the various tasks
- schedule execution of the tasks

Multiprogramming OSes introduced early 60s – Burroughs MCP (1963) was an early (and advanced) example.

In 1964, IBM introduced the System/360 hardware architecture. Family of architectures, still going strong (S/360 → S/370 → S/370-XA → ESA/370 → ESA/390 → z/Architecture). Simulated/emulated previous IBM computers.

Early S/360 OSes not very advanced: DOS single batch; MFT ran a fixed number of tasks. In 1967 MVT ran up to 15 tasks.
Protecting the monitor from the users

Having the monitor co-resident with user programs is asking for trouble. Desirable features, needing hardware support, include:
- memory protection: user programs should not be able to . . . write to monitor memory,
- timer control: . . . or run for ever,
- privileged instructions: . . . or directly access I/O (e.g. might read next job by mistake) or certain other machine functions,
- interrupts: . . . or delay the monitor's response to external events.

Using batch systems was (and is) pretty painful. E.g. on MVS, to assemble, link and run a program:

//USUAL JOB A2317P,'MAE BIRDSALL'
//ASM EXEC PGM=IEV90,REGION=256K,            EXECUTES ASSEMBLER
// PARM=(OBJECT,NODECK,'LINECOUNT=50')
//SYSPRINT DD SYSOUT=*,DCB=BLKSIZE=3509      PRINT THE ASSEMBLY LISTING
//SYSPUNCH DD SYSOUT=B                       PUNCH THE ASSEMBLY LISTING
//SYSLIB DD DSNAME=SYS1.MACLIB,DISP=SHR      THE MACRO LIBRARY
//SYSUT1 DD DSNAME=&&SYSUT1,UNIT=SYSDA,      A WORK DATA SET
// SPACE=(CYL,(10,1))
//SYSLIN DD DSNAME=&&OBJECT,UNIT=SYSDA,      THE OUTPUT OBJECT MODULE
// SPACE=(TRK,(10,2)),DCB=BLKSIZE=3120,DISP=(,PASS)
//SYSIN DD *                                 IN-STREAM SOURCE CODE
  .
  code
  .
/*
//LKED EXEC PGM=HEWL,                        EXECUTES LINKAGE EDITOR
// PARM='XREF,LIST,LET',COND=(8,LE,ASM)
//SYSPRINT DD SYSOUT=*                       LINKEDIT MAP PRINTOUT
//SYSLIN DD DSNAME=&&OBJECT,DISP=(OLD,DELETE) INPUT OBJECT MODULE
//SYSUT1 DD DSNAME=&&SYSUT1,UNIT=SYSDA,      A WORK DATA SET
// SPACE=(CYL,(10,1))
//SYSLMOD DD DSNAME=&&LOADMOD,UNIT=SYSDA,    THE OUTPUT LOAD MODULE
// DISP=(MOD,PASS),SPACE=(1024,(50,20,1))
//GO EXEC PGM=*.LKED.SYSLMOD,TIME=(,30),     EXECUTES THE PROGRAM
// COND=((8,LE,ASM),(8,LE,LKED))
//SYSUDUMP DD SYSOUT=*                       IF FAILS, DUMP LISTING
//SYSPRINT DD SYSOUT=*,                      OUTPUT LISTING
// DCB=(RECFM=FBA,LRECL=121)
//OUTPUT DD SYSOUT=A,                        PROGRAM DATA OUTPUT
// DCB=(LRECL=100,BLKSIZE=3000,RECFM=FBA)
//INPUT DD *                                 PROGRAM DATA INPUT
  .
  data
  .
/*
//

Virtual Memory

Multitasking, and time-sharing in particular, is much easier if all tasks are resident, rather than being swapped in and out of memory. But not enough memory! Virtual memory decouples memory as seen by the user task from physical memory. The task sees virtual memory, which may be anywhere in real memory, and can be paged out to disk. Hardware support is required: all memory references by user tasks must be translated to real addresses – and if the virtual page is on disk, the monitor is called to load it back into real memory. In 1963, Burroughs had virtual memory. IBM only introduced it to the mainframe line with S/370 in 1972.
Time-sharing

Allow interactive terminal access to the computer, with many users sharing. An early system (CTSS, Cambridge, Mass.) gave each user 0.2s of CPU time; the monitor then saved user program state, and loaded the state of the next scheduled user.

IBM's TSS for S/360 was similar – and a software engineering disaster. Major motivation for the development of SE!

[Figure: the processor issues virtual addresses to a memory management unit, which yields real addresses into main memory, or disk addresses into secondary memory for paged-out data.]
The Process Concept

With virtual memory, it becomes natural to give different tasks their own independent address space or view of memory. The monitor then schedules processes appropriately, and does all context-switching (loading of virtual memory control info, etc.) transparently to the user process.

Note on terminology. It's common to use 'process' for a task with independent address space, espec. in the Unix setting, but this is not a universal definition. Tasks sharing the same address space are called 'tasks' (IBM) or 'threads' (Unix). But some older OSes without virtual memory called their tasks 'processes'.

Communication between processes becomes a major issue (studied later); as does control of resources.

Memory Protection

Virtual memory itself allows the user's memory to be isolated from kernel memory and other users' memory. Both for historical reasons and to allow user/kernel memory to be appropriately shared, many architectures have separate protection mechanisms as well:
- a frame or page may be read or write accessible only to a processor in a high privilege level;
- in S/370, each frame of memory has a 4-bit storage key, and each task runs with a particular key;
- the virtual memory mechanism may be extended with permission bits; frames can then be shared;
- a combination of all the above may be used.
All OS functionality sits in the kernel. Some modern kernels are very large – tens of MLoC. A bug in any function can crash the system. . .
OS structure – microkernels

[Figure: applications and servers run unprivileged; servers and device drivers sit above a small privileged kernel containing the scheduler, which sits on the hardware.]

Processes – what are they?

Recall that a process is 'a program in execution'; it may have its own view of memory; it sees one processor, although it's sharing it with other processes – running on a virtual processor.

To switch between processes, we need to track:
- its memory, including stack and heap
- the contents of registers
- program counter
- its state
[Figure: process state diagram – New (admit) → Ready; Ready (dispatch) → Running; Running (release) → Exit; Running → Ready on timeout or yield; Running → Blocked on event-wait; Blocked → Ready on event.]

Context Switching

The PCB allows the OS to switch process contexts:

[Figure: process A executing; the OS saves state into PCB A and restores state from PCB B; process B executes; later the reverse. Each process is idle while the other runs.]
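The PCB and the save/restore step can be sketched as a C struct. This is a minimal sketch with invented field names – a real kernel's PCB (e.g. Linux's task_struct) holds far more, and the actual register save/restore happens in assembler:

```c
#include <stdint.h>

/* Hypothetical saved CPU context: registers plus program counter. */
struct cpu_context {
    uint64_t regs[16];   /* general-purpose registers */
    uint64_t pc;         /* program counter */
    uint64_t sp;         /* stack pointer */
};

enum proc_state { NEW, READY, RUNNING, BLOCKED, EXITED };

/* Minimal process control block (PCB). */
struct pcb {
    int pid;
    enum proc_state state;
    struct cpu_context ctx;   /* saved registers for context switch */
    void *page_table;         /* memory-management info */
};

/* A context switch in miniature: save the running context into the
   outgoing PCB, mark the incoming process as running. */
void context_switch(struct pcb *from, struct pcb *to,
                    const struct cpu_context *current) {
    from->ctx = *current;     /* save state into PCB A */
    from->state = READY;
    to->state = RUNNING;      /* ...then restore registers from to->ctx */
}
```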
Scheduling

When do processes move from Ready to Running? This is the job of the scheduler. We will look at this in detail later.

Creating Processes (2)

When a process is created, the OS must
- assign a unique identifier
- allocate memory space: both kernel memory for control structures, and user memory
- initialize the PCB and (maybe) memory management tables
- link the PCB into OS data structures
- initialize remaining control structures
- for WinNT, OS/390: load the program
- for Unix: make the child process a copy of the parent

Modern Unices don't actually copy; they share and do copy-on-write.
Processes and Threads

Processes
- own resources such as address space, I/O devices, files
- are units of scheduling and execution

These are logically distinct. Some old OSes (MVS) and most modern OSes (Unix, Windows) allow many threads (or lightweight processes [some Unices] or tasks [IBM]) to execute concurrently in one process (or address space [IBM]).

Everything previously said about scheduling applies to threads; but process-level context is shared by the thread contexts. All threads in one process share system resources. Hence
- creating threads is quick (ca. 10 times quicker than processes)
- ending threads is quick
- switching threads within one process is quick
- inter-thread communication is quick and easy (have shared memory)

Real Threads vs Thread Libraries

Threads can be implemented as part of the OS; e.g. Linux, OS/390, Windows. If the OS does not do this (or in any case), threads can be implemented by user-space libraries:
- the thread library implements a mini-process scheduler (entirely in user space)
- the context of a thread is PC, registers, stacks etc., saved in a thread control block (stored in the user process's memory)
- switching between threads can happen voluntarily, or on timeout (user-level timer, rather than kernel timer)
MultiProcessing

There is always a desire for faster computers. One solution is to use several processors connected together. The following taxonomy is widely used:
- Single Instruction Single Data stream (SISD): normal setup, one processor, one instruction stream, one memory.
- Single Instruction Multiple Data stream (SIMD): a single program executes in lockstep on several processors. E.g. vector processors (used for large scientific applications).
- Multiple Instruction Single Data stream (MISD): not used.
- Multiple Instruction Multiple Data stream (MIMD): many processors each executing different programs on different data.

Within MIMD systems, processors may be loosely coupled, for example, a network of separate computers with communication links; or tightly coupled, for example processors connected via a single bus to shared memory.

SMP OS design considerations

- cache coherence: several CPUs, one shared memory. Each CPU has its own cache. What happens when CPU 1 writes to memory that CPU 2 has cached? This problem is usually solved by hardware designers, not OS designers.
- re-entrancy: several CPUs may call the kernel simultaneously. Kernel code must be written to allow this.
- scheduling: genuine concurrency between threads. Also between kernel threads.
- memory: must maintain virtual memory consistency between processors (since each CPU has VM hardware support).
- fault tolerance: single CPU failure should not be catastrophic.
Scheduling Criteria

To schedule effectively, need to decide criteria for success! For example,
- good utilization: minimize the amount of CPU idle time
- good utilization: job throughput
- fairness: jobs should all get a 'fair' share of CPU . . .
- priority: . . . unless they're high priority
- response time: fast (in human terms) response to interactive input
- real-time: hard deadlines, e.g. chemical plant control
- predictability: avoid wild variations in user-visible performance

Balance is very system-dependent: on PCs, response time is important, utilization irrelevant; in a large financial data centre, throughput is vital.

Preemptive Policies

Here we interrupt processes after some time (the quantum).
- round-robin: when the quantum expires, the running process is sent to the back of the ready queue. Good for general purposes. Tends to favour CPU-bound processes – can be refined to avoid this. How big should the quantum be? 'Slightly greater than the typical interaction time.' (How fast do you type?) Recent Linux kernels have a base quantum of around 50ms.
- shortest remaining time (SRT): preemptive version of SPN. On quantum expiry, dispatch the process with shortest expected running time. Tends to starve long CPU-bound processes. Estimation problem as for SPN.
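A round-robin dispatcher is easy to simulate. The sketch below is illustrative only: the burst lengths and quantum are invented numbers, and a circular scan of the process table stands in for a real FIFO ready queue. It records the order in which three CPU bursts complete:

```c
#define NPROC 3

/* Simulate round-robin with the given quantum. Returns the number of
   dispatches; records completion order of processes in 'order'. */
int round_robin(const int burst[NPROC], int quantum, int order[NPROC]) {
    int remaining[NPROC], done = 0, dispatches = 0;
    for (int i = 0; i < NPROC; i++)
        remaining[i] = burst[i];
    while (done < NPROC) {
        /* circular scan approximates the FIFO ready queue */
        for (int i = 0; i < NPROC; i++) {
            if (remaining[i] == 0)
                continue;                 /* already finished */
            int run = remaining[i] < quantum ? remaining[i] : quantum;
            remaining[i] -= run;          /* run until quantum or completion */
            dispatches++;
            if (remaining[i] == 0)
                order[done++] = i;        /* this burst has completed */
        }
    }
    return dispatches;
}
```

With bursts {4, 2, 6} and quantum 2, the short job (P1) finishes first even though it arrived second in the queue – the effect round-robin is designed to give.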
Scheduling evaluation: Suggested Reading

In your favourite OS textbook, read the chapter on basic scheduling. Study the section(s) on evaluation of scheduling algorithms. Aim to understand the principles of queueing analysis and simulation modelling for evaluating scheduler algorithms. (E.g. Stallings 7/e chap 9 and online chap 20.)

SMP scheduling: Dispatching

For process scheduling, performance analysis and simulation indicate that the differences between scheduling algorithms are much reduced in a multi-processor system. There may be no need to use complex systems: FCFS, or a slight variant, may suffice.

For thread scheduling, the situation is more complex. SMP allows many threads within a process to run concurrently; but because these threads are typically interacting frequently (unlike different user processes), it turns out that performance is sensitive to scheduling. Four main approaches:
- load sharing: an idle processor selects a ready thread from the whole pool
- gang scheduling: a gang of related threads are simultaneously dispatched to a set of CPUs
- dedicated CPUs: static assignment of threads (within a program) to CPUs
- dynamic scheduling: involve the application in changing the number of threads; the OS shares CPUs among applications 'fairly'.
Multiprocessor Scheduling

Scheduling for SMP systems involves:
- assigning processes to processors
- deciding on multiprogramming on each processor
- actually dispatching processes

Processes to CPUs: do we assign processes to processors statically (on creation), or dynamically? If statically, may have idle CPUs; if dynamically, the complexity of scheduling is increased – esp. in SMP, where the kernel may be executing concurrently on several CPUs.

Multiprogramming: do we need to multiprogram on each CPU? 'Obviously, yes.' But if there are many CPUs, and the application is parallel at the thread level, it may be better (for response time) not to.

Load sharing is simplest and most like the uniprocessing environment. As for process scheduling, FCFS works well. But it has disadvantages:
- the single pool of TCBs must be accessed with mutual exclusion – may be a bottleneck, esp. on large systems
- preempted threads are unlikely to be rescheduled to the same CPU; loses benefits of CPU cache (hence Linux, e.g., refines the algorithm to try to keep threads on the same CPU)
- a program wanting all its threads running together is unlikely to get it – if threads are tightly coupled, this could severely impact performance.

Most systems use load sharing, but with refinements or user-specifiable parameters to address some of the disadvantages. Gang scheduling or dedicated assignment may be used in special purpose (e.g. parallel numerical and scientific computation) systems.
Real-Time Scheduling

Real-time systems have deadlines. These may be hard: necessary for success of the task; or soft: if not met, it's still worth running the task.

Deadlines give RT systems particular requirements in:
- determinism: need to acknowledge events (e.g. interrupt) within a predetermined time
- responsiveness: and take appropriate action quickly enough
- user control: hardness of deadlines and relative priorities is (almost always) a matter for the user, not the system
- reliability: systems must 'fail soft'. panic() is not an option! Better still, they shouldn't fail.

Concurrency

When multiprogramming on a uniprocessor, processes are interleaved in execution, but concurrent in the abstract. On multiprocessor systems, processes are really concurrent. This gives rise to many problems:
- resource control: if one resource, e.g. a global variable, is accessed by two processes, what happens? Depends on the order of executions.
- resource allocation: processes can acquire resources and block, stopping other processes.
- debugging: execution becomes non-deterministic (for all practical purposes).
RTOSes typically do not handle deadlines as such. Instead, they try to respond quickly to tasks' demands. This may mean allowing preemption almost everywhere, even in small kernel routines.

Suggested reading: read the section on real-time scheduling in Stallings (section 10.2).

Exercise: how does Linux handle real-time scheduling?

Concurrency – example problem

Suppose a server, which spawns a thread for each request, keeps count of the number of bytes written in some global variable bytecount.

If two requests are served in parallel, they look like

    serve request1                       serve request2
    tmp1 = bytecount + thiscount1;       tmp2 = bytecount + thiscount2;
    bytecount = tmp1;                    bytecount = tmp2;

Depending on the way in which the threads are scheduled, bytecount may be increased by thiscount1, thiscount2, or (correctly) thiscount1 + thiscount2.

Solution: control access to the shared variable: protect each read–write sequence by a lock which ensures mutual exclusion. (Remember Java synchronized.)
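A minimal sketch of the locked version using POSIX threads: bytecount follows the slide, while the helper names and the two byte counts are invented for illustration. The lock is held across the whole read–modify–write sequence, so the interleaving above can no longer lose an update:

```c
#include <pthread.h>
#include <stddef.h>

static long bytecount = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Request handler: the read-write sequence is one critical section. */
static void *serve(void *arg) {
    long thiscount = *(long *)arg;
    pthread_mutex_lock(&lock);
    long tmp = bytecount + thiscount;   /* read shared variable */
    bytecount = tmp;                    /* write, still holding the lock */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Serve two 'requests' in parallel; return the final bytecount. */
long run_two(long c1, long c2) {
    pthread_t t1, t2;
    bytecount = 0;
    pthread_create(&t1, NULL, serve, &c1);
    pthread_create(&t2, NULL, serve, &c2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return bytecount;
}
```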
Mutual Exclusion

Allow processes to identify critical sections where they have exclusive access to a resource. The following are requirements:
- mutual exclusion must be enforced!
- processes blocking in their noncritical section must not interfere with others
- processes wishing to enter their critical section must eventually be allowed to do so
- entry to the critical section should not be delayed without cause
- there can be no assumptions about speed or number of processors

A requirement on clients, which may or may not be enforced, is:
- processes remain in their critical section for finite time

Mutex – first attempt

Suppose we have a global variable turn. We could say that when Pi wishes to enter its critical section, it loops checking turn, and can proceed iff turn = i. When done, it flips turn. In pseudocode (writing î for the other process's index):

while ( turn != i ) { }
/* critical section */
turn = î;

This has obvious problems:
- processes busy-wait
- the processes must take strict turns

although it does enforce mutex.
Mutex – third attempt

Maybe set one's own flag before checking the other's?

flag[i] = true;
while ( flag[î] ) { }
/* critical section */
flag[i] = false;

This does enforce mutex. (Exercise: prove it.) But now both processes can set flag to true, then loop for ever waiting for the other! This is deadlock.

Mutex – Dekker's algorithm

Ensure that one process has priority, so will not defer; and give the other process priority after performing one's own critical section.

flag[i] = true;
while ( flag[î] ) {
    if ( turn == î ) {
        flag[i] = false;
        while ( turn == î ) { }
        flag[i] = true;
    }
}
/* critical section */
turn = î;
flag[i] = false;

Optional Exercise: show this works. (If you have lots of time.)
OK, but now it is possible for the processes to run in exact synchrony and
keep deferring to each other – livelock.
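Dekker's algorithm can be exercised as real code. The sketch below uses C11 sequentially consistent atomics – an addition not in the original pseudocode, but necessary on modern hardware, where plain variable accesses would be reordered. Two threads each increment a shared counter 10,000 times; with mutual exclusion enforced, no increment is lost:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool flag[2];
static atomic_int turn;
static long counter;                      /* protected by Dekker's algorithm */

static void *worker(void *arg) {
    int i = *(int *)arg, other = 1 - i;
    for (int n = 0; n < 10000; n++) {
        atomic_store(&flag[i], true);
        while (atomic_load(&flag[other])) {
            if (atomic_load(&turn) == other) {
                atomic_store(&flag[i], false);           /* defer */
                while (atomic_load(&turn) == other) { }  /* busy-wait */
                atomic_store(&flag[i], true);
            }
        }
        counter++;                        /* critical section */
        atomic_store(&turn, other);       /* hand priority to the other */
        atomic_store(&flag[i], false);
    }
    return NULL;
}

/* Run two contending threads; with mutual exclusion the count is exact. */
long run_dekker(void) {
    pthread_t t[2];
    int id[2] = { 0, 1 };
    counter = 0;
    pthread_create(&t[0], NULL, worker, &id[0]);
    pthread_create(&t[1], NULL, worker, &id[1]);
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return counter;
}
```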
Mutual Exclusion: Using Hardware Support

On a uniprocessor, mutual exclusion can be achieved by preventing processes from being interrupted. So just disable interrupts! Technique used extensively inside many OSes. Forbidden to user programs for obvious reasons. Can't be used in long critical sections, or may lose interrupts.

This doesn't work in SMP systems. A number of SMP architectures provide special instructions. E.g. S/390 provides TEST AND SET, which reads a bit in memory and then sets it to 1, atomically as seen by other processors. This allows easy mutual exclusion: have a shared variable token, then a process grabs the token using test-and-set:

while ( test-and-set(token) == 1 ) { }
/* critical section */
token = 0;
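The grab-the-token loop above maps directly onto C11's atomic_flag, whose test-and-set is atomic as seen by other processors; a minimal spinlock sketch:

```c
#include <stdatomic.h>

static atomic_flag token = ATOMIC_FLAG_INIT;

/* Spin until test-and-set observes the flag clear (returns false). */
void spin_lock(void) {
    while (atomic_flag_test_and_set(&token)) { /* busy-wait */ }
}

void spin_unlock(void) {
    atomic_flag_clear(&token);   /* token = 0 */
}
```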
Types of semaphore

A semaphore is called strong if waiting processes are released FIFO; it is weak if no guarantee is made about the order of release. Strong semaphores are more useful and generally provided; henceforth, all semaphores are strong.

A binary or boolean semaphore takes only the values 0 and 1: wait decrements from 1 to 0, or blocks if already 0; signal unblocks, or increments from 0 to 1 if there are no blocked processes.

Recommended Exercise: show how to use a private integer variable and two binary semaphores in order to implement a general semaphore. (Please think about this before looking up the answer!)

Using Semaphores

A semaphore gives an easy solution to user-level mutual exclusion, for any number of processes. Let s be a semaphore initialized to 1. Then each process just does:

wait(s);
/* critical section */
signal(s);

Exercise: what happens if s is initialized to m rather than 1?

Monitors

Because solutions using semaphores have wait and signal separated in the code, they are hard to understand and check.

A monitor is an 'object' which provides some methods, all protected by a blocking mutex lock, so only one process can be 'in the monitor' at a time. Monitor local variables are only accessible from monitor methods. Monitor methods may call:
- cwait(c), where c is a condition variable confined to the monitor: the process is suspended, and the monitor released for another process;
- csignal(c): some process suspended on c is released and takes the monitor.

Unlike semaphores, csignal does nothing if no process is waiting.

What's the point? The monitor enforces mutex; and all the synchronization is inside the monitor methods, where it's easier to find and check.

This version of monitors has some drawbacks; there are refinements which work better.
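The wait/signal pattern corresponds to POSIX sem_wait/sem_post. The sketch below uses sem_trywait (a wait that fails instead of blocking) to count how many processes could enter a critical section for a given initial value – a hint at the exercise about initializing s to m:

```c
#include <semaphore.h>

/* Count how many times the semaphore can be acquired without blocking,
   starting from an initial count of 'init'. */
int drain(unsigned init) {
    sem_t s;
    int n = 0;
    sem_init(&s, 0, init);        /* 0 = shared between threads only */
    while (sem_trywait(&s) == 0)  /* wait(s) that fails instead of blocking */
        n++;
    sem_destroy(&s);
    return n;
}
```

With init = m, exactly m waiters get through before the semaphore blocks – i.e. up to m processes may be in the 'critical section' at once.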
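POSIX condition variables give the same shape as the monitor's cwait/csignal: a sketch of a one-slot buffer 'monitor' (all names invented), in which the mutex plays the monitor lock and pthread_cond_wait releases the monitor while suspended, exactly as cwait does:

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mon = PTHREAD_MUTEX_INITIALIZER;   /* monitor lock */
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t nonfull  = PTHREAD_COND_INITIALIZER;
static int slot;
static bool full = false;         /* monitor-local state */

void put(int v) {
    pthread_mutex_lock(&mon);                  /* enter the monitor */
    while (full)
        pthread_cond_wait(&nonfull, &mon);     /* cwait: releases monitor */
    slot = v;
    full = true;
    pthread_cond_signal(&nonempty);            /* csignal */
    pthread_mutex_unlock(&mon);                /* leave the monitor */
}

int get(void) {
    pthread_mutex_lock(&mon);
    while (!full)
        pthread_cond_wait(&nonempty, &mon);
    int v = slot;
    full = false;
    pthread_cond_signal(&nonfull);
    pthread_mutex_unlock(&mon);
    return v;
}
```

Note that, as with csignal, pthread_cond_signal does nothing if no thread is waiting.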
Memory

[. . . ] within the Linux kernel. (I.e. memory for use by the kernel.)

[Figure: the CPU's logical address is checked against a limit; if the check succeeds, a base is added to give the physical address; if not, an address fault is raised.]
[Figure: paging – the logical address is split into page number p and offset o; page table entry p gives frame f; the physical address is (f, o).]

[Figure: TLB – page number p is first looked up in the TLB (p1→f1, p2→f2, p3→f3, p4→f4); on a hit the frame number comes straight from the TLB, on a miss from the page table in memory.]
Combined Paging and Segmentation: S/390

The concepts of paging and segmentation can be combined. In S/390, they are intertwined, and can be seen as a 2-level paging system.

- Logical address is 31 bits:
  - first 11 bits index into the current segment table
  - next 8 bits index into a page table
  - remaining bits are the offset.

Page tables can be paged out, by marking their entries invalid in the segment table. For normal programming, there is only one segment table per process. Other segment tables (up to 16) are used by special-purpose instructions for moving data between address spaces.

On the Pentium, completely independently, the linear address goes through a two-level paging system.
- Segment-related info (e.g. segment tables) can be paged out; so can second-level page tables.
- There is no link between pages and segments: segments need not lie on page boundaries.
- Pages can be 4KB, or 4MB.
- The page table register is part of the task context, stored in the task segment (!).
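The S/390 31-bit split (11-bit segment index, 8-bit page index, leaving a 12-bit offset, i.e. 4KB pages) can be sketched in C; the struct and function names are invented for illustration:

```c
#include <stdint.h>

/* Split a 31-bit S/390 logical address into segment table index (11 bits),
   page table index (8 bits) and byte offset (12 bits). */
struct s390_addr { unsigned seg, page, off; };

struct s390_addr split(uint32_t logical) {
    struct s390_addr a;
    a.off  = logical & 0xFFF;          /* low 12 bits: offset in 4KB page */
    a.page = (logical >> 12) & 0xFF;   /* next 8 bits: page table index */
    a.seg  = (logical >> 20) & 0x7FF;  /* top 11 bits: segment table index */
    return a;
}
```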
So far as possible, confine device-specific code to a small, low layer, and write higher-level code in terms of abstract device classes.

Consequently, to access data in a given sector, need to:
- move the head assembly to the right cylinder (around 4 ms on modern disks)
- wait for the right sector to rotate beneath the head (around 5 ms on modern disks)

Disk scheduling is the art of minimizing these delays.
- Windows 2000 (NT 5.0): adds features for distributed processing
- Windows Vista: still NT, but many components extensively re-worked. Interesting techniques include machine-learning based [. . . ]

[Figure: NT user-mode components – file system drivers, cache manager, security manager, LPC facility.]

- System Queue Area: page tables etc.
- Pageable Link Pack Area: shared libraries permanently resident [. . . ]

[. . . ] operating system for interactive work; or they may load a S/390 operating system: MVS, Linux/390, or [. . . ]
Many attacks are obvious: cut power lines, intercept phone lines, etc. I User identification
Some are less obvious: traffic analysis may reveal critical information. Data I Passwords
aggregation may extract sensitive information from several apparently
I One-time passwords
I Biometrics
innocent sources.
I Confidentiality
Social engineering attacks often work.
I OS facilities
I Encryption
I Authenticity and non-repudiation
I Cryptographic signing
In autumn 2003, somebody attempted to insert a trapdoor into the master copy of the Linux kernel. They added what looked like an obscure test with a typo to a system call, which would actually allow anybody to become root by passing appropriate arguments. Fortunately,
- it was caught by clash detection in version control;
- the source tree they modified was not actually the master (although it is a source used by many people).

The Internet Worm of 3 November 1988

Robert Morris, Jr., released the first worm that seriously disrupted the Internet. It used many techniques. For attack:
- buffer overflow problem in the Unix finger service
- exploiting an intentional 'trapdoor' in Unix mail servers compiled in debugging mode – very many production servers were!
- password guessing
- Unix remote execution allowed easy spread

It also had defensive capabilities:
- changes its name to sh
- fork()s to change process id frequently
- avoids leaving files around
- obfuscates data in memory to hinder analysis
Intrusion detection, concealment and rootkits

Modern systems do a lot of monitoring to try to detect suspicious activity: changed files, unusual processes. Therefore, after a successful crack, the attacker needs to avoid detection. Often installs modified copies of ls, ps etc.

Modern Linux rootkits try even harder: they install kernel modules which modify the kernel code of system calls so that certain processes and files are ignored – even a clean ls or ps will not show them. They also modify the load average to ignore your password cracker, etc. etc.

The Bell–LaPadula Security Policy Model

A security policy model describes concisely and accurately what constraints security places on information flow. The BLP model is two basic principles:
- No Read Up (simple security property): a process running at one level may not read data at a higher level;
- No Write Down (*-property): a process running at one level may not write data at a lower level.

The second policy prevents viruses etc. from copying sensitive information down to a lower security level, as well as stopping humans accidentally doing so. (How is declassification achieved?)
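The two BLP properties reduce to two comparisons; a sketch with security levels as integers (higher number = more classified; the function names are invented):

```c
#include <stdbool.h>

/* Security levels: a higher number means more highly classified. */
typedef int level;

/* No Read Up: a subject may read an object only at its level or below. */
bool may_read(level subject, level object)  { return object <= subject; }

/* No Write Down: a subject may write an object only at its level or above. */
bool may_write(level subject, level object) { return object >= subject; }
```

Note the asymmetry: a low-level subject may write upwards (it just can't read back), which is what blocks a virus from copying secrets downwards.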
Single Level Security – the Easy Way

Many workers with security clearances want to use 'standard' computers (i.e. Windows) in a simple (non-MLS) way. How is the data protected? Usually by a software add-on that maintains the hard drive in an encrypted state. The encryption key is held in a USB dongle. Several commercial devices are approved to reduce the classification level of a laptop by two levels when it is switched off (e.g. Top Secret to Confidential).

(Question: Top Secret material may not be removed from a government site save in very unusual circumstances. Confidential material can be taken for home working subject to certain precautions. What should be the rule for such a laptop?)

Stegfs: a Steganographic File System for Linux

Theory by Anderson, Needham, Shamir; implementation by McDonald and Kuhn.

Stegfs provides a crypto-based secure filesystem with multiple levels of security. If you have only the level 3 password (say), you can't tell that there is a level 4, let alone that there is any level 4 data.

If the machine is running at level 3, the OS doesn't know about level 4 data, so it may write over it. . . . . . so StegFS maintains several copies of data in dispersed blocks, in the hope that one of them will survive until you next enter the relevant level – every so often, you should enter the highest security level and run a maintenance procedure to regenerate the multiple copies.