Lab 2 - Scheduling Simulation
Lab 2 - Processes
In this lab, you will work with a process scheduler simulator and understand scheduling decisions in a
multiprogramming system. You will also be introduced to the notion of a system's policy, and how it is
distinguished from its mechanism.
Create a new directory called L2 in your CMPT 360 Git repository. In that directory, you will add a file
called answers.txt, which will contain the solutions to questions in the lab.
Your answers are to be written in short-answer form, with relevant program output as appropriate. Write
your answers in a manner befitting an upper-level university course: submit complete sentences with
proper punctuation, please. Poor writing style will be penalized. If you are looking for an appropriate tonal
voice for your technical writing, I suggest imitating that of the Linux manual: precise and clear, but not
overly stuffy. If you need assistance with your writing skills, I would encourage you to get in touch with
MacEwan Writing and Learning Services.
Before coming to lab, make sure you have read the OStep chapter entitled The Abstraction: The Process.
As you will have seen in lecture, a program's behaviour can broadly be grouped into two kinds of
operations: compute operations, such as computing a numerical algorithm, and IO operations, such as reading
or writing a file on disk. Of course, a compute operation needs to be actively running on the CPU for it to
complete. However, once an IO operation like a network operation has begun, the process cannot proceed
until the operation is completed by the relevant piece of hardware, so it can be descheduled and replaced
with another process while its IO operation completes in the background.
The authors of the OStep textbook have provided a simple program called process-run.py. It simulates
scheduling one or more processes in a multiprogramming system. While it makes many simplifying
assumptions, it is a useful tool to help build your intuition about what an operating system scheduler has
to do when faced with multiple processes to run concurrently.
In the prelab, we will learn how to run the script and have it simulate a particular scheduling workload.
The simulator can be downloaded to your current working directory by the following command:
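(The command below assumes the copy of the script hosted in the textbook authors' ostep-homework repository on GitHub; if the course page links to a different location, download it from there instead.)

wget https://raw.githubusercontent.com/remzi-arpacidusseau/ostep-homework/master/cpu-intro/process-run.py
chmod +x process-run.py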
Please do not check this script into your repository, as you will simply run it and never make any changes
to it for us to grade.
Once you have downloaded process-run.py, run it with the -h flag to see a description of how to run
the program.
macdonellc4@students:~> cd /tmp/
macdonellc4@students:/tmp/> ls
process-run.py
macdonellc4@students:/tmp/> ./process-run.py -h
Usage: process-run.py [options]

Options:
  -h, --help            show this help message and exit
  -s SEED, --seed=SEED  the random seed
  -l PROCESS_LIST, --processlist=PROCESS_LIST
                        a comma-separated list of processes to run, in the
                        form X1:Y1,X2:Y2,... where X is the number of
                        instructions that process should run, and Y the
                        chances (from 0 to 100) that an instruction will use
                        the CPU or issue an IO
  -L IO_LENGTH, --iolength=IO_LENGTH
                        how long an IO takes
  -S PROCESS_SWITCH_BEHAVIOR, --switch=PROCESS_SWITCH_BEHAVIOR
                        when to switch between processes: SWITCH_ON_IO,
                        SWITCH_ON_END
  -I IO_DONE_BEHAVIOR, --iodone=IO_DONE_BEHAVIOR
                        type of behavior when IO ends: IO_RUN_LATER,
                        IO_RUN_IMMEDIATE
  -c                    compute answers for me
  -p, --printstats      print statistics at end; only useful with -c flag
                        (otherwise stats are not printed)
macdonellc4@students:/tmp/>
This help output describes how to specify the workload that the simulator will run. We will use most of these flags in one form or another in this lab, even if not all of their meanings are clear to you just yet.
On the command line, one or more process pair values are specified with the -l flag. A process pair is two numbers, separated by a colon, that represent a particular process that the simulator will run. The first number represents how many operations (CPU or IO) the process will perform, and the second represents the percentage chance (from 0 to 100) that each operation will be a CPU operation. The exact operation sequence is determined pseudorandomly.
Here is a process with five operations, all of which are CPU operations:
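For example, an invocation of the following form requests a single process with five operations, each of which has a 100% chance of being a CPU operation:

./process-run.py -l 5:100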
Important behaviors:
  System will switch when the current process is FINISHED or ISSUES AN IO
  After IOs, the process issuing the IO will run LATER (when it is its turn)
Here is a process with five operations, approximately 50% of which are IO operations.
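Such a process can be requested with an invocation of the form below; because the operation sequence is chosen pseudorandomly, the -s (seed) flag can be used to pin down one particular sequence:

./process-run.py -l 5:50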
Important behaviors:
  System will switch when the current process is FINISHED or ISSUES AN IO
  After IOs, the process issuing the IO will run LATER (when it is its turn)
io refers to the operation that kicks off an IO operation, and io_done refers to the operation that handles the completion of that IO operation. Recall the IO operations that you have performed yourself, such as calling fread() in L1, and convince yourself that a process that has issued an io cannot perform any other operation until the corresponding io_done occurs.
The final section of the output, under "Important behaviors", describes aspects of the scheduler policy. We say that a policy is a set of design choices that is enacted by some mechanism. In this case, our mechanism is the Python simulator, but we could hypothetically take the same policies we are experimenting with in this lab and have the Linux kernel scheduler act as the mechanism instead. Discussing policy separately from mechanism allows us to delineate "what should the behaviour of a system be" from "how does the system go about doing it".
Here, there is a policy for when to stop executing a particular process and switch to another, and a policy
for when a process that has completed an IO operation should begin executing again.
Executing a simulation
Part of the goal of this lab is for you to build intuition for how a particular workload should behave. But with the -c flag, we can have the program tell us which processes get scheduled and when.
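For example, the five-operation, CPU-only process from the examples above can be traced with:

./process-run.py -l 5:100 -c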
A CPU operation, in this very simplified scheduler simulator, takes one abstract "unit of time". Therefore,
this simulation ran for five "time units", each of which was spent executing PID 0 on the CPU.
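Similarly, a single process whose one operation is an IO (that is, a process pair with a 0% chance of CPU operations) can be traced with an invocation of the form:

./process-run.py -l 1:0 -c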
We see that a process performing an IO operation requires essentially seven time steps: one time step to
begin the io operation, five to mimic waiting for the IO operation to complete, and another one to process
the completed IO operation. However, unlike a compute operation, those five timesteps do not require that
the process is scheduled to run on the CPU, since it is in the WAITING state, simulating waiting for some
hypothetical piece of hardware to complete an operation. A key goal of the operating system scheduler is to take advantage of this "downtime" by scheduling another process while the first is waiting for its IO to complete. In this lab, you will perform experiments that involve multiple processes running concurrently.
We can also add the -p flag to print the number of time units the CPU spent doing useful work, and the number of units during which the CPU had no work to do because all processes were tied up waiting for IO to complete. Ideally, we would be able to have our CPU doing useful work 100% of the time, but life is rarely so straightforward!
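As an illustration (this two-process workload is an arbitrary example, not one required by the lab), the following invocation traces a CPU-bound process alongside an IO-issuing one and prints the busy/idle statistics:

./process-run.py -l 5:100,1:0 -c -p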
Lab questions
Some of the following questions are adapted from the homework questions from the OStep chapter entitled
The Abstraction: The Process.
Question 1 (3 points)
Consider the following command-line invocation of process-run.py (but don't run it just yet):
Part a): Describe in your own words what this will simulate.
Part b): Explain what the CPU utilization (i.e., the percentage of time the CPU is running a process) should be, and justify it. (You can run the simulation with the -p flag added to check your work.)
Part c): Explain how the number of timesteps that the whole simulation takes to complete in this particular
run depends on the number of timesteps of each of the processes. Will this hold for every run, irrespective
of CPU/IO ratios?
Question 2 (3 points)
In the prelab, we discussed the idea of scheduler policies. The -S flag allows us to change what the scheduler
will do when it is executing some process and an IO operation occurs. The possible policies for -S are:
- -S SWITCH_ON_IO: The system WILL switch to another ready process, if one exists, when an IO
operation begins
- -S SWITCH_ON_END: The system will NOT switch to another process when an IO operation begins
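As an illustration of the syntax only (the workload here is an arbitrary placeholder, not the one asked for in part a), the -S flag is supplied like any other option:

./process-run.py -l 4:100,2:0 -S SWITCH_ON_END -c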
Part a): Create a workload with two processes, where both processes' five operations are equally likely to be CPU operations and IO operations.
Part b): Run your simulation from part a). From the output of the simulation, determine what the default switch policy is, and justify your answer with evidence from the output.
Part c): Re-run this simulation with the SWITCH_ON_END scheduler policy. Explain the difference in
behaviour.
Question 3 (2 points)
Another important scheduler policy concerns what the scheduler should do when a process is running on
the CPU, and another process' IO operation finishes. Should we switch back to the other process or
continue executing the current process? We can change this policy with the -I flag:
./process-run.py -l 3:0,5:100,5:100,5:100
Make sure you have a rough understanding of what this workload represents before continuing.
Part a): Execute this workload. Which of the two policies described above is the default? Explain.
Part b): Repeat the same workload, with the policy that switches to the process that just completed its IO
operation. Explain the difference in behaviour.
Question 4 (3 points)
In answering this question, you should make use of your discoveries about the scheduler policies from
questions 2 and 3.
Part a): Write a workload that can be passed to process-run.py with the -l flag that simulates two
processes: one that runs a CPU-only workload for eight timesteps as PID 0, and one that runs two IO
operations, one right after the other, as PID 1. Predict how many timesteps it will take to execute this
workload with the default scheduler policies, explaining each scheduler decision as necessary.
Part b): What happens if you flip the order of the processes in your -l parameter? Explain the scheduler's
behaviour here.
Part c): In the workload you considered in part a), the two IO operations were serialized; that is, the second one
could only begin executing after the first one completed. Modify your simulation such that there are now
three processes, with the two IO operations handled concurrently (one by PID 1 and one by PID 2) and PID 0
responsible for the CPU operations. What is the workload that simulates these processes? As before,
explain the execution trace.
Question 5 (2 points)
Create an example workload with at least 3 jobs (each job should have at least 10 instructions) that has a
significant difference in total execution time depending on the IO_RUN_LATER vs IO_RUN_IMMEDIATE
policy. The relative difference must be greater than 10%. For example, if running with IO_RUN_LATER
results in an execution of 73 time units and IO_RUN_IMMEDIATE results in 59 time units, the relative
difference is (73 - 59)/73 = 19%. Provide an explanation of why the performance difference for your
workload is so noticeable.
(Hint: You might find it helpful to begin by coming up with a sample workload where IO_RUN_LATER vs
IO_RUN_IMMEDIATE makes no difference.)
Question 6 (1 point)
Leaving all other parameters the same as in the previous question, modify only the CPU/IO probabilities of the jobs and see if you can increase or decrease the difference seen in the previous question. In your answers file, include the complete modified execution commands and clearly explain what change you made to the CPU/IO probabilities and what impact it had on the difference between the executions with IO_RUN_LATER and IO_RUN_IMMEDIATE.
Question 7 (1 point)
The title of this chapter is The Abstraction: The Process. Explain how a process is an abstraction.
OSes maintain a data structure for each process that is used to track and manage all the information for that process. The Linux kernel is open source, so its code is openly available on the internet. The macOS kernel (only the kernel) is also open source; it is called Darwin XNU. For one of these kernels, find the process control block (PCB). Include a URL for the PCB you found. Make a non-trivial observation comparing it to the xv6 proc struct in the textbook.