Lab 2
Part 1 due: Sept. 21st at 11:59pm
Part 2 due: Sept. 25th at 11:59pm
Overview
The goal of this lab is to familiarize yourself with network packet headers by parsing some yourself. This
assignment must be written in C (not C++) and will require a fairly deep understanding of pointer manipulation,
memory layouts of basic and structured data types, and formatted printing. To ensure you can
complete these tasks, you will first complete a sequence of smaller programs (Part 1). You will then
use the skills you developed in Part 1 to complete the packet parser in Part 2.
Deliverables
Your submission should include at least the following files for each part of the lab.
• Part 1
– data_sizes.c
– bit_wise.c
– Makefile
– README
• Part 2
– parse_headers.c
– Makefile
– README
Your README file should contain a short header containing your name, username, and the assignment
title. The README should additionally contain a short description of your code, tasks accomplished, and
how to compile, execute, and interpret the output of your programs. Additionally, if there are any short
answer questions in this lab write-up, you should provide well marked answers in the README, as well as
indicate that you’ve completed any of the extra credit (so that I don’t forget to grade it).
Submission Instructions
Submission will occur via handin43. You can retrieve relevant files via update43.
1 C Programming Primer
In Part 1 of this lab, you will complete a sequence of short programs to review some basic C programming
skills. Note that C is largely a subset of the C++ language; however, there are subtle differences that can trip you up.
In this lab, I expect you to include only C’s stdlib.h, stdio.h, and string.h, and no more unless
otherwise specified in the sample code provided. A quick note about strings in C: there are none. Strings are
represented as char arrays, where the last character is the NUL terminator ('\0').
char first_name[20];
char last_name[20];
char middle_initial;
long account_num;
short account_type;
double account_balance;
};
Have your code print the size of this data type when data_sizes.c is executed. Additionally,
declare a variable of this structure type on the stack and populate it with your name and (made-up)
numbers for the account (you’ll find strcpy()-like functions useful for this task). Print out the
record information as well when the program executes. (See below for an example of a formatted
print.)
Your task is to extract that data and print out the information below:
Name: Adam J. Aviv
Acc#: 424242
Acc Type: 10
Balance: 2048.256000
4. Extra Credit (4 pts): Submit your own data-embedded string and a formatted print in your code.
Questions: Answer the following questions in your README.
1. What is the bit-width of your CPU: 32-bit or 64-bit? How can you tell from the output of this program?
2. What are the largest signed short, int, and long values? What are the largest unsigned values?
3. Explain casting in C. Does it affect the underlying data or just the interpretation of that data?
4. Hexadecimal notation (or hex) is an incredibly useful tool for programmers, and you’ve seen an example
of that above where I embedded data into a string using hex. Why do you think we use hex so readily?
How much information can two hex digits store? How much information can one byte store? What is
1024 in hex? What is 1025 in hex?
• a << b, left shift: Shift the bits of a to the left by b bits and pad the vacated bits on the right with 0. For
example, 1 << 3 is 8: 1 = 1b in binary, and shifting left by 3 gives 8 = 1000b.
• a >> b, right shift: Shift the bits of a to the right by b bits and pad the vacated bits on the left with 0 (for unsigned values). For
example, 16 >> 2 is 4: 16 = 10000b in binary, and shifting right by 2 gives 4 = 100b.
Program solutions to the following problems and put them in a new file named bit_wise.c. In the
main() function of your program, you should provide code to call and test these functions, verifying
your solutions. Feel free to include all test code you used when programming.
1. A bit-mask is used to extract information embedded at the bit level. For example, consider the
embedding of bit-wise flags within a byte. There are 8 flags that can be set, one for each bit, and suppose
we are only interested in the flag in the least significant bit. We can describe a mask of 0x01 in hex
(or 00000001 in binary) to extract that bit in the following way.
Now consider the TCP flags embedded within a TCP header (we’ll discuss the meanings of the flags
later in the semester). There are 9 TCP header flags, and you can consider them stored within the
data-width of a short. Write a function tcp_flags() that, given a short representing the flags,
will print out which flags are set. The TCP flag layout is as follows:
bit-offset  0 <--------> 6   7   8   9  10  11  12  13  14  15
           +---------------+---+---+---+---+---+---+---+---+---+
           |               | N | C | E | U | A | P | R | S | F |
values     |    unused     | S | W | C | R | C | S | S | Y | I |
           |               |   | R | E | G | K | H | T | N | N |
           +---------------+---+---+---+---+---+---+---+---+---+
Do not worry about what each flag means. Your program can just print the shorthand for each
flag that is set. For example, a call to your function such as tcp_flags(0x01A1) should produce the
following formatted output:
Flags: NS, CWR, URG, FIN
2. Write a function print_bits() that takes a char and prints a sequence of 1’s and 0’s representing
the char in terms of its bits. Below is a function outline to work from:
void print_bits(char c){
int i;
for(i = 0; i < sizeof(char)*8; i++){ // 8 bits per byte
3. Extra Credit (3 pts): Change your function above such that it can take in arbitrary data and print the
bits of that data. This function should have the following definition:
void print_bits_EC(unsigned char * ptr, size_t len);
4. Consider a short which contains two different values in the first (most significant) byte and the
second (least significant) byte. Write a program that will print the (unsigned) values of each part of
the short. Below is a function outline to work from:
void split_short(unsigned short s){
unsigned short first,second;
5. Most modern computers store multi-byte values in Little Endian format, where the least significant byte
is stored first (at the lowest memory address). The alternative is Big Endian format, where the most significant
byte is stored first. For example, the two-byte value 0x1234 is laid out in memory as the bytes 34 12 on a
Little Endian machine and as 12 34 on a Big Endian machine.
Due to legacy issues, all network packets are encoded in byte-wise Big Endian (also called Network
Order). The bits within each byte are not reordered; only the order of the bytes in a multi-byte field differs,
with the most significant byte sent first.
This causes many headaches for network programmers, but the bytes can easily be reordered using
bit-wise operators. Add a function to your program that, given a short, will swap the first and second
bytes of the short, printing it in both formats. Here is a function outline to work from:
void swap_bytes(unsigned short s){
printf("Lil’Endian: %u", s);
6. Extra Credit (3pts): Write a function that will reverse the byte order of an arbitrary data type. Here
is a function definition to start from:
unsigned char * reverse_bytes(unsigned char *, size_t len);
2 Packet Parsing
In this part of the lab, you will be provided with a network packet capture, and you must parse the Ethernet,
Internet, and TCP header information from the packets and print out requisite information as described
below. The packet capture was done using the tcpdump program and is accessed through libpcap. The point
of this lab is not to learn libpcap, but rather to parse the packet header information, so I have provided you
with some skeleton code to get started.
This code makes use of libpcap, the packet capturing library, and you will need to tell the compiler
to link against the library during compilation. If you are using gcc (which I highly recommend), then
this is easily accomplished using the -l flag, placed after the source file so that the linker can resolve
the library's symbols. For example, here is a standard compilation command for your program:
bash> gcc -g parse_headers.c -o parse_headers -lpcap
If libpcap is installed, then it will be loaded and linked with your compiled binary at execution time
by the dynamic linker.
In addition to the skeleton code, I have also provided two traces for you to use to test your program:
capture1.pcap and capture2.pcap. As an example of the expected output, here is some sample
output from parsing capture2.pcap:
(...)
-------- Packet 16 ------------
Size: 66 bytes
MAC src: 00:25:90:26:be:da
MAC dest: 7c:c3:a1:89:31:e8
IP src: 130.58.68.137
IP dest: 130.58.68.200
Src port: 80
Dst port: 63770
-------- Packet 17 ------------
Size: 1514 bytes
MAC src: 00:25:90:26:be:da
MAC dest: 7c:c3:a1:89:31:e8
IP src: 130.58.68.137
IP dest: 130.58.68.200
Src port: 80
Dst port: 63770
-------- Packet 18 ------------
Size: 1514 bytes
MAC src: 00:25:90:26:be:da
MAC dest: 7c:c3:a1:89:31:e8
IP src: 130.58.68.137
IP dest: 130.58.68.200
Src port: 80
Dst port: 63770
(...)
You are not required to match this format precisely, but all this information must be present.
Ethernet Header
Note that the link-layer addresses are MAC addresses, not IP addresses, which occur in the Internet packet
(i.e., in the network layer).
octets   0  1  2  3  4  5  6  7  8  9 10 11 12 13
       +-----------------+-----------------+-----+
       |  Dest. Address  |  Src. Address   |Type |
       +-----------------+-----------------+-----+
Internet Header
Below is the IPv4 header. Note that the length of the header is defined by IHL, which gives the number of
32-bit words in the header. Each word is 4 bytes long, so if IHL is 5, the header is
160 bits (or 20 bytes) long. In that case, the “Options” and “Padding” fields are absent; both are optional.
octets         0               1               2               3
bits    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    0  |Version|  IHL  |Type of Service|          Total Length         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   32  |         Identification        |Flags|      Fragment Offset    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   64  |  Time to Live |    Protocol   |         Header Checksum       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   96  |                       Source Address                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  128  |                    Destination Address                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  160  |                    Options                    |    Padding    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
TCP Header
Similar to the Internet header, the TCP header can be variable length. The length of the header is defined in
the data offset field, and again is given in units of 4-byte words. Normally, the data
offset is set to 5, meaning the payload begins at the 160th bit (byte 20). Again, anything in the header following the 160th bit is optional.
octets         0               1               2               3
bits    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    0  |          Source Port          |       Destination Port        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   32  |                        Sequence Number                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   64  |                    Acknowledgment Number                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Data |     |N|C|E|U|A|P|R|S|F|                               |
   96  | Offset|Resv.|S|W|C|R|C|S|S|Y|I|            Window             |
       |       |     | |R|E|G|K|H|T|N|N|                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  128  |           Checksum            |         Urgent Pointer        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  160  |                    Options                    |    Padding    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         .... data ....                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2.2 Extra Credit (6 pts): Verify the Internet and TCP checksums
Review the Internet checksum in Kurose and Ross, and implement a routine that will verify both the TCP
and Internet header checksums.