Programming Assignment: EE382C, Spring 2020
Submit: Generate a report that contains all necessary information to answer the stated questions.
Questions are given in a section labelled as such in each of the two parts of this assignment description.
Briefly describe your steps along with your results and conclusions. You may encounter concepts you
have not yet seen in detail in class; you are not required to understand those yet. Email your
report, together with the files containing the new code you wrote for the second part of this
assignment, to both the instructor and the TA. This assignment allows teams of two. Clearly state the
names of both team members and submit one report per team. Your partner cannot be the same person you
worked with on the research paper assignment. This assignment is worth 12% of your final grade.
Summary
The focus of this assignment is to familiarize yourself with Booksim, which is a cycle-accurate network
simulator we will use in this class. Booksim was created to accompany the book we use in this class,
hence the name.
This assignment has two parts. In the first part, you will conduct simulations on a
Dragonfly topology, modify various parameters, collect results, and reach conclusions. At the end you
will be asked to improve on a metric, such as performance, by choosing another topology from the ones
supported. The goal is to familiarize yourself with using the simulator and efficiently extracting results,
in order to prepare you for further assignments for which you will modify code.
In the second part, you are asked to re-write the fat tree (folded Clos) model inside Booksim. The
current model Booksim has is restrictive and has limited support for some parameters. For this new fat
tree, you will write an oblivious load-balancing routing function.
Other Simulators
If you are familiar with another simulator and would rather use that instead, you are welcome to. In that
case, make sure you answer the questions this assignment states, but you can ignore all the step-by-step
instructions. Make sure that simulator has the capability to satisfy this assignment. However, for the
sake of being able to choose the most suitable simulator for the rest of the class, it is worth giving
Booksim a try. The instructor is intimately familiar with Booksim but with few other simulators, so
you may be on your own if you try another one.
Booksim URL
https://github.com/booksim/booksim2
Contains source code and brief documentation.
Part 1: Dragonfly Topology
In Booksim, parameter “n” is the dimension of the intra-group network (restricted to 1) and “k” is the
radix of each switch. From those parameters the rest of the configuration is derived. Please refer to
“dragonfly.cpp” in directory “networks” for more details. You’ll have to get into the habit of reading
source code since documentation in academic simulators is lacking.
Installing Booksim
Luckily, Booksim has very few dependencies and in most environments simply compiles by running the
included makefile in the “src” directory. The “src” directory contains the source code for all of Booksim.
That directory also has subdirectories for classes of a specific type, such as network topologies. Refer to
the class slides for an overview of the internal hierarchy. Note that many classes have child classes. For
instance, the class “trafficmanager” generates, injects, and ejects traffic. Some simulation types use
synthetic traffic such as uniform random, while others use trace files. Each of these is a different child
“trafficmanager” class. Here is a broad overview of each important source file:
A list of directories:
Running Booksim
For this we will use “dragonflyconfig” in directory “examples”. Before you run booksim, add the line
“stats_out = <filename of your choice>.m” to that configuration file. This will generate a MATLAB file
with helpful statistics after each simulation, such as latency histograms (for example plat, the packet
latency histogram) and others. To run booksim, simply execute:
➢ ./booksim dragonflyconfig
The simulator will then generate an output to stdout (it is a good idea to redirect stdout to a file). That
will contain a printout of the configuration, statistics report at regular intervals (parameter
“sample_period”), and a final statistics report. Booksim can report statistics separately for traffic classes,
but by default there is only one class which is why you see a “class 0:” printout. These statistics are not
detailed, which is why “stats_out” is important. In that file, “sent_packets” is the rate at which each
source generates packets while “accepted_packets” is the rate at which those packets are admitted into
the network. “plat” is a histogram of packet latencies: if bin I holds the value A, then A
packets had a latency of I. Similarly, there are two more histograms: flat for flit latencies and nlat for
packet network latencies (latencies without the time spent waiting in injection queues).
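The statistics asked for later (median, standard deviation, 99th percentile) can be derived from such a histogram. Below is a minimal Python sketch, assuming you have already copied the histogram from the stats_out file into a plain list where index I holds the number of packets with latency I. The function name histogram_stats is hypothetical, the percentile uses the nearest-rank definition, and the standard deviation is the population one; other definitions give slightly different numbers.

```python
import math


def histogram_stats(hist):
    """Summarize a latency histogram where hist[i] = number of packets with latency i."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")

    mean = sum(lat * count for lat, count in enumerate(hist)) / total
    variance = sum(count * (lat - mean) ** 2 for lat, count in enumerate(hist)) / total

    def percentile(p):
        # Nearest-rank percentile: smallest latency such that at least
        # p percent of all packets have a latency at or below it.
        threshold = p / 100 * total
        running = 0
        for lat, count in enumerate(hist):
            running += count
            if running >= threshold:
                return lat
        return len(hist) - 1

    return {
        "count": total,
        "mean": mean,
        "std": math.sqrt(variance),
        "median": percentile(50),
        "p99": percentile(99),
    }


# Two packets with latency 1 and two with latency 3.
print(histogram_stats([0, 2, 0, 2]))
```

The same function works for flat and nlat, since they share the bin layout described above.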
Each simulation begins with a warmup phase, in which the network is filled with non-recorded
packets (packets that do not affect the statistics) for the purpose of creating a realistic state for the recorded packets
that will follow in the main phase. Also, when a simulation is about to terminate because the pre-
defined number of cycles was reached, booksim continues to generate non-recorded packets. The
reason is that if booksim simply stopped generating packets, the last recorded packets would experience
an unrealistically empty network. This only occurs for simulations that use an injection rate. If traffic is read
from a trace file booksim does not guess what packets could come before or after the ones in the trace
file.
It is important to understand when a simulation is considered stable. If the average latency keeps
increasing and does not stabilize after the warmup period (the duration of which is configurable),
booksim declares the simulation unstable and exits. This is meant to detect when the network is
saturated because it cannot satisfy the load it is receiving. If booksim does not report the simulation as
unstable, the network can handle the load and the main (recorded) phase begins.
While debugging you may wish to track individual packets or flits. The easiest way to do that is to add
“watch_file = <filename>” to the configuration file and then create that file. The file lists the flit IDs
to be watched, one per line. You can also specify a packet ID by prefixing the ID with a p, e.g., “p34” is
packet ID 34. When a flit or packet is watched, booksim will report any action that is relevant to that flit
or packet.
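As an illustration of the watch file format described above, here is a small Python sketch that writes such a file. The helper name write_watch_file, the file name, and the IDs are all hypothetical; only the format (one ID per line, packet IDs prefixed with p) comes from the description above.

```python
def write_watch_file(path, flit_ids=(), packet_ids=()):
    """Write one ID per line: bare numbers for flits, 'p<number>' for packets."""
    lines = [str(fid) for fid in flit_ids]
    lines += [f"p{pid}" for pid in packet_ids]
    with open(path, "w") as handle:
        for line in lines:
            handle.write(line + "\n")
    return lines


# Watch flits 12 and 57 and packet 34 (example IDs only).
write_watch_file("watch_list", flit_ids=[12, 57], packet_ids=[34])
```

The configuration file would then need “watch_file = watch_list” for booksim to pick it up.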
Some configuration options that may be of interest are “num_vcs”, which specifies the number of virtual
channels (VCs), and “vc_buf_size”, which specifies the buffer depth in flits per VC and per input.
The goal of part 1 is to have you use booksim, read parts of the source code to get acquainted, analyze
results, and change network configurations.
1. Read the Dragonfly source code and report the topology connectivity within each group
and across groups. This means you should figure out and describe how routers are connected to each other,
not just their radices. Also, how many routing functions does the Dragonfly have in the source
code and how do they work?
2. Now it’s time to run your first simulation! You can use the example configuration file but modify
the injection rate to 2% packet injection rate. You’ll have to be careful how to properly define
this so your injection rate is actually what this question asks for. For the requested injection
rate, is the network saturated? What is the average packet and flit latency? What are the median
and the standard deviation? What is the 99th percentile flit and packet latency? How many
measured packets were sent?
3. Now let's sweep the injection rate, starting from a 1% flit injection rate (not packet), then moving to
5% and increasing in steps of 5% (1%, 5%, 10%, etc.) until you find the point where the network saturates. What is
that injection rate? What is the average and maximum latency at an injection rate right before
the network saturates? Plot the injection rate – average latency curve. Compare the offered
traffic versus ejected traffic at a point before network saturation and after. Are they equal? This
question asks you to remember the injection rate that saturated the network and the one
before it, and compare "sent_packets" and "accepted_packets" for each of those injection rates
from the matlab output file.
4. Now we will start modifying the network to figure out how its performance changes. For the
injection rate that you identified above, let's use adaptive routing. Does the network still
saturate? What is the average hop count with and without adaptive routing? If the network
saturates, reduce the injection rate until you find the new point of saturation. If it does not
saturate, increase the injection rate until you find that point. Does the new saturation point make sense in
relation to the old one?
5. Finally, repeat question 3 but now for a 2D mesh of the same size (same number of terminals). Is
the mesh better than the Dragonfly for this configuration? Why do you think that is? Since the mesh
has to be square it may not have the exact same number of terminals. In that case use a mesh
with the closest possible number of terminals.
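The sweep in question 3 and the packet versus flit distinction in question 2 can be organized with a few small helpers. The Python sketch below is only one way to do the bookkeeping and does not call booksim itself. All function names are hypothetical, the packet size of 4 flits in the usage line is an example value rather than a property of the example configuration, and a latency of None is used here as a stand-in for a run that booksim reported as unstable.

```python
def sweep_rates(first=0.01, step=0.05, last=1.0):
    """Return the sweep points: the first rate, then multiples of step up to last."""
    rates = [first]
    n = 1
    while round(n * step, 6) <= last:
        rates.append(round(n * step, 6))
        n += 1
    return rates


def flit_rate(packet_rate, packet_size):
    """Convert packets per cycle per node into flits per cycle per node."""
    return packet_rate * packet_size


def saturation_point(results):
    """results: (rate, average latency) pairs sorted by rate, latency None if unstable.

    Returns (last stable rate, first unstable rate); either may be None.
    """
    last_stable = None
    for rate, latency in results:
        if latency is None:
            return last_stable, rate
        last_stable = rate
    return last_stable, None


print(sweep_rates()[:4])           # first sweep points: 0.01, 0.05, 0.1, 0.15
print(flit_rate(0.02, 4))          # flit rate for 2% packet rate, 4-flit packets
print(saturation_point([(0.01, 20.5), (0.05, 22.0), (0.10, None)]))
```

Once the coarse sweep brackets the saturation point between two rates, a finer sweep between them narrows it down.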
Part 2: Fat Tree Topology (Folded Clos)
A picture of an example fat tree is shown above. There are two parameters of interest here: the number
of levels (only three are shown), and the connectivity radix in each level (let's call it “k”). As shown, k = 2
because each router connects to two other routers going up and two more going down. Note that this
topology has a hierarchy. Also, sources and destinations of traffic are only connected at the leaf routers
(i.e., this is an indirect topology). The number of sources and destinations that are connected to each
leaf router is k.
You are asked to implement a fat tree topology of any number of levels and any value of “k”. Some
combinations of parameters will be invalid and you should check for that. For instance, for convenience
you can check and return an error if a value of k and number of levels would create a network where not
all routers have the same number of inputs and outputs. Hint: calculate the number of inputs and
outputs for each router as a function of k and the number of levels.
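To make the hint concrete, here is a rough Python sketch of the size arithmetic. It assumes one particular construction, the k-ary n-tree, in which every level has k^(levels-1) routers, every router has k downward connections, and every router except those at the top level has k upward connections. That construction is an assumption on my part, not something the assignment prescribes, and the function name fat_tree_shape is hypothetical. Your own design may count ports differently, especially at the top level.

```python
def fat_tree_shape(k, levels):
    """Size of a k-ary n-tree style fat tree (assumed construction, see text)."""
    if k < 2 or levels < 1:
        raise ValueError("need k >= 2 and at least one level")

    routers_per_level = k ** (levels - 1)
    ports = []
    for level in range(levels):  # level 0 holds the leaf routers
        is_top = level == levels - 1
        ports.append({
            "level": level,
            "down": k,                  # terminals at level 0, routers otherwise
            "up": 0 if is_top else k,   # top-level routers have nothing above them
        })

    return {
        "terminals": k ** levels,
        "routers_per_level": routers_per_level,
        "routers": levels * routers_per_level,
        "ports": ports,
    }


shape = fat_tree_shape(2, 3)
print(shape["terminals"], shape["routers_per_level"], shape["routers"])
```

Under this assumption, k = 2 with three levels gives 8 terminals and 4 routers per level. Printing the per-level port counts is a quick way to see which routers differ from the rest.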
For the topology you create, you will also create a routing function that is oblivious and load balancing.
That is, for packets traversing up, the routing function will choose at random and with equal probability
among all channels going up. Thankfully, you don’t need to check if the channels you choose among
provide a final path to your destination because if you construct your topology right, packets can go to
any destination once they start traversing in the down direction. Your routing function should not take
unnecessary hops. That is, if your source and destination share a router that is not at the top level,
packets should only go as high as that common router level (i.e., not go any higher than necessary). Note
that once packets start moving in the down direction, there is no path diversity anymore and packets
have only one choice. Once packets start moving in the down direction, they cannot switch to the up
direction.
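The routing behavior described above can be prototyped outside the simulator before writing any C++. The Python sketch below is a standalone model, not Booksim code, and it assumes a k-ary n-tree numbering that I chose for illustration: terminal t attaches to leaf router t // k, a router is identified by (level, word) where word has n-1 base-k digits, and moving between level l and level l+1 may only change digit l of the word. All names are hypothetical. Under these assumptions a packet climbs, picking a uniformly random up port, only until the digits at positions l and above match the destination, and then descends deterministically.

```python
import random


def digit(word, i, k):
    """Return base-k digit i of word (digit 0 is least significant)."""
    return (word // k ** i) % k


def set_digit(word, i, k, value):
    """Return word with base-k digit i replaced by value."""
    return word + (value - digit(word, i, k)) * k ** i


def route(src, dest, k, n, rng=random):
    """Return the list of (level, word) routers visited from terminal src to dest."""
    word, level = src // k, 0
    dest_word = dest // k
    path = [(level, word)]

    # Up phase: climb while the destination is not reachable going down,
    # i.e. while the digits at positions >= level differ from the destination's.
    while word // k ** level != dest_word // k ** level:
        up_port = rng.randrange(k)       # oblivious, uniform choice
        word = set_digit(word, level, k, up_port)
        level += 1
        path.append((level, word))

    # Down phase: exactly one choice per hop, dictated by the destination.
    while level > 0:
        level -= 1
        word = set_digit(word, level, k, digit(dest_word, level, k))
        path.append((level, word))

    return path


print(route(0, 7, 2, 3))   # climbs to the top level, then descends
print(route(0, 1, 2, 3))   # same leaf router, no hops up
```

Running route many times for the same source and destination and counting how often each intermediate router appears gives a quick sanity check that the up choices are roughly uniform.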
Following how booksim is internally organized, almost all your code edits will be confined to a
topology file. It is ok if you want to refer to or overwrite booksim’s existing fat tree model. As you will
see, there are really two functions that you need to edit. One is “BuildNet” which instantiates routers,
channels, and connects them as well as to sources and destinations appropriately. Also, the routing
function you will write will be its own standalone function in the same .cpp file. If you write a new
routing function, you will need to register it so that booksim recognizes its name if it’s given in the
configuration file. We strongly advise you to read and understand an existing topology file and
ask the instructor or TA questions to help you understand what is going on before you start writing
code.
(25 points) Part 2: Questions and Deliverables
Submit a report answering the questions below. Also submit the code that you wrote for this
assignment (the topology file .h and .cpp). The primary metric for Part 2 is correctness.
1. (10 points) Your first task is to make sure that your code works correctly. Show simulation
results for three, five, and seven levels with k = 4 and k = 8. Use different injection rates. Does
your simulation return any errors? Do your results make sense based on your knowledge from
class?
2. (10 points) Sadly, bugs do not always trigger assertions. Therefore, prove that your load-
balancing routing algorithm works correctly. Remember that one expected result is that
channels in the up and down directions have comparable loads. You may have to insert statistics
and printouts.
3. (5 points) As we mentioned in class, a fully-sized fat tree should be able to provide 100% (full)
throughput for any traffic pattern. Following this expectation, run the uniform random, transpose, and
bitcomp traffic patterns and find the saturation rate of the topology like you did in part 1. Is it 100%? If not, why
do you think it is not? Hint: it may be a bug, but also consider differences between theory (what
we talked about in class) and reality (imperfections of an actual network).
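For question 2 above, one way to quantify "comparable loads" is to count the flits that cross each channel and summarize the spread. The Python sketch below assumes you have added your own counters to the simulator and exported them as a mapping from channel name to flit count. The channel names, the counts, and the function name load_imbalance are hypothetical.

```python
import math


def load_imbalance(channel_counts):
    """Return (mean load, max/mean ratio, coefficient of variation) for a set of channels."""
    counts = list(channel_counts.values())
    if not counts:
        raise ValueError("no channels given")
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("no traffic recorded")
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return mean, max(counts) / mean, math.sqrt(variance) / mean


# Example counts for four up channels of one router (made-up numbers).
up_channels = {"up0": 1010, "up1": 990, "up2": 1005, "up3": 995}
print(load_imbalance(up_channels))
```

A max/mean ratio close to 1 and a small coefficient of variation suggest balanced channels. Computing the summary separately for up and down channels, and per level, makes an imbalance easier to localize.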