

FIT1047-Exam Revision

Introduction to Computers, Networks and Security (Monash University)




FIT1047 – Exam Revision Notes

Computer Architecture
The Von Neumann architecture consists of a Central Processing Unit (CPU), the memory, and
the input/output devices. Furthermore, the CPU can be subdivided into the Arithmetic/Logic
Unit (ALU), a number of registers, and the Control Unit (CU).

Central Processing Unit (CPU)
A Central Processing Unit, or CPU, is the component of a computer that does most of the actual
“computing”. Almost all modern computers are based on the same architecture, called the Von
Neumann architecture, where the instructions that make up the programs we want to run, as well as
the data for those programs, are stored in the memory, and the CPU is connected to the memory by
a set of wires called the bus. The bus also connects the CPU to external devices (such as the screen, a
network interface, a printer, or input devices like touch screens or keyboards).

Compilers
A compiler takes a program in a language like Java or C++ and translates it into a lower level
language.

Interpreters
An interpreter executes, or interprets, the instructions written in an interpreted programming
language such as Python, without first translating them into machine code. The interpreter is itself a
program, usually written in a compiled programming language (so it can run directly on the CPU).

Machine Code
Machine code is a very low-level programming language. A program in machine code is a sequence
of individual instructions, each of which is just a sequence of bits. Usually, an instruction consists of
one or more words.


Registers
A register is a very fast memory location inside the CPU. Each register can typically only store a
single word, but it is many orders of magnitude faster to read from or change the value in a register
compared to accessing the main memory.

Arithmetic Logic Unit


The arithmetic logic unit (ALU) is responsible for performing basic computations such as addition
and multiplication, as well as Boolean logic operations, for example comparisons or AND and OR
operations.

Control Unit
The control unit (CU) performs the fetch, decode and execute cycle / is the actual “machine” inside
the CPU. It coordinates the other components. For example, it can switch the memory into “read” or
“write” mode.

Fetch
The PC register contains the memory address where the next instruction to be executed is stored. In
the fetch cycle, the CU transfers the instruction from memory into the IR (instruction register). It
then increments the PC by one, so that it points to the next instruction again. This concludes the
fetch cycle.

Decode
In the decode cycle, the CU looks at the instruction in the IR and decodes what it “means”. For
example, it will get any data ready that is required for executing the instruction.

Execute
In the execute cycle, the actual operation encoded by the instruction needs to be performed. For
example, the CU may load a word of data from memory into a register, switch the ALU into “addition
mode”, and store the result of the addition back into a register.

When the execute cycle concludes, the control unit starts again with the next fetch.
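
To make the cycle concrete, here is a minimal Python sketch of a fetch-decode-execute loop. The tiny instruction set, memory layout and variable names are made up purely for illustration; they are not part of the notes above.

# Minimal sketch of the fetch-decode-execute cycle for a toy machine:
# instructions are (mnemonic, operand) tuples stored in memory.
memory = [
    ("LOAD", 5),   # address 0: load the word at address 5 into AC
    ("ADD", 6),    # address 1: add the word at address 6 to AC
    ("STORE", 4),  # address 2: store AC into address 4
    ("HALT", 0),   # address 3: stop execution
    0,             # address 4: result goes here
    7,             # address 5: first operand
    3,             # address 6: second operand
]

pc, ac = 0, 0                 # program counter and accumulator
while True:
    ir = memory[pc]           # fetch: copy the next instruction into the IR
    pc += 1                   # ...and point the PC at the following instruction
    opcode, operand = ir      # decode: split the instruction into its parts
    if opcode == "LOAD":      # execute: perform the operation
        ac = memory[operand]
    elif opcode == "ADD":
        ac += memory[operand]
    elif opcode == "STORE":
        memory[operand] = ac
    elif opcode == "HALT":
        break

print(memory[4])              # prints 10 (= 7 + 3)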

MARIE CODING
AC (accumulator): This is the only general-purpose register.

MAR (Memory Address Register): Holds a memory address of a word that needs to be read from or
written to memory.

MBR (Memory Buffer Register): Holds the data read from or written to memory.

IR (Instruction Register): Contains the instruction that is currently being executed.

PC (Program Counter): Contains the address of the next instruction.

OPCODE  MNEMONIC     EXPLANATION
0001    Load X       Load value from location X into AC
0010    Store X      Store value from AC into location X
0011    Add X        Add value stored at location X to current value in AC
0100    Subt X       Subtract value stored at location X from current value in AC
0101    Input        Read user input into AC
0110    Output       Output current value of AC
0111    Halt         Stop execution
1000    SkipCond X   Skip next instruction under certain condition (depends on X)
1001    Jump X       Continue execution at location X
1010    Clear        Set AC to 0
1011    AddI X       Add value pointed to by X to AC
1100    JumpI X      Continue execution at location pointed to by X
1101    LoadI X      Load from address pointed to by X into AC
1110    StoreI X     Store AC into address pointed to by X
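
As a hedged sketch of how such an instruction is decoded: in the standard MARIE architecture an instruction is one 16-bit word, with the top 4 bits holding the opcode from the table and the lower 12 bits holding the address X. That word format is assumed here; the table above only lists the opcodes.

# Decode a 16-bit MARIE instruction word into opcode and address X,
# assuming the usual 4-bit opcode / 12-bit address layout.
def decode(word):
    opcode = (word >> 12) & 0xF   # top 4 bits
    address = word & 0x0FFF       # lower 12 bits
    return opcode, address

# 0x1010 = 0001 0000 0001 0000 -> opcode 0001 (Load), address 0x010
print(decode(0x1010))             # (1, 16)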

Sequential Circuits
Feedback
The main mechanism that allows a circuit to remember the past is to pass its output back into its
input. That way, we can establish a feedback loop.


The diagram shows a sequence of three steps. In the first step, both input and output are 0. If we
now set the input to 1 (the second step), the output of course also becomes 1. But if we now reset
the input to 0, the 1 from the output is still feeding into the other input of the OR gate, which means
that the output stays 1. This circuit therefore remembers whether the input has ever been set to 1.

D Flip-Flop

The input D is the data bit that is supposed to be stored, and the output Q is the data bit that is
currently stored in the flip-flop. The output Q’ is not strictly required; it simply outputs the negation
of the stored bit. The interesting input is the clock: the state of the flip-flop can only change on the
“positive edge” of the clock, i.e., when the clock input changes from 0 to 1.
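
A small Python sketch of this behaviour, assuming a positive-edge-triggered D flip-flop as described above (the class and method names are invented for illustration):

# Toy model of a D flip-flop: the stored bit Q only changes when the clock
# input changes from 0 to 1 (the positive edge).
class DFlipFlop:
    def __init__(self):
        self.q = 0          # currently stored bit (output Q)
        self._last_clk = 0  # previous clock level, used to detect the edge

    def tick(self, d, clk):
        if self._last_clk == 0 and clk == 1:  # positive edge detected
            self.q = d                        # capture the data input D
        self._last_clk = clk
        return self.q, 1 - self.q             # outputs Q and Q'

ff = DFlipFlop()
print(ff.tick(d=1, clk=0))  # (0, 1)  no edge yet, nothing stored
print(ff.tick(d=1, clk=1))  # (1, 0)  0 -> 1 edge, D is captured
print(ff.tick(d=0, clk=1))  # (1, 0)  clock stays high, Q unchanged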

Booting The System


We already know the CPU and clock. It should be noted that modern CPUs contain a small amount of memory
known as cache. The cache holds working copies of parts of main memory. The advantage is that the
cache is much faster, so data can be delivered to the CPU more quickly.

The actual memory of the system is the RAM (random access memory). It is usually writable and
volatile (i.e. stored data disappears if power is switched off). RAM is much faster than non-volatile
storage on hard-disks, USB drives etc. Furthermore, access to data happens at almost the same
speed for all locations.

RAM is connected to the system via the Memory Bus and a set of electrical elements / chips, called
the Northbridge (or memory controller hub).

Other components that require higher speed are also connected to the northbridge. For example,
graphics cards are connected via high-speed buses such as PCI Express or AGP.

I/O and slower components are connected via the southbridge. North- and southbridge together
are what is commonly called the chipset.

Southbridge is also called I/O controller hub. All wires leading off the board are connected to the
southbridge.

Examples include buses to connect hard disks to the board (IDE, SATA), networking (Ethernet),
audio, the Universal Serial Bus (USB), FireWire, etc.

The PCI bus can be used to connect other extensions that do not need the high speed of the PCI
Express bus.

Finally, the rather outdated and slow low pin-count bus (LPC) is used to connect the security chip
(the Trusted Platform Module, TPM) and the ROM (read-only memory) containing the first code to be
executed after switching on the computer. In PCs, this code is called the BIOS (Basic Input/Output
System), which has now been largely replaced by UEFI, the Unified Extensible Firmware Interface.

Boot Process
Step 1: Turn on power
Power supply starts and provides energy to the motherboard and other components in the
computer. Components should only really start to work after a stable power level is established.

A power good signal can be sent to the motherboard which triggers the timer chip to reset the
processor and start clock ticks.

Step 2: Initial software


BIOS (Basic Input Output System or first steps of UEFI Unified Extensible Firmware Interface in
modern PCs) is stored in non-volatile memory (ROM – read only memory) on the motherboard.

It controls the start-up steps, provides initial system configuration (power saving, security, etc.), and
initially configures accessible hardware.

Boot process: POST


The BIOS starts with a power-on self-test (POST), which checks that:

-System memory is OK


-System clock / timer is running

-Processor is OK

-Keyboard is present

-Screen display memory is working

-BIOS is not corrupted

Boot process: Video card


The first thing after a successful POST is to initialise the video card and show some initial message on
the screen.

Note that the BIOS can only do a rudimentary initialisation. Use of 3D, fancy graphics, etc. needs
additional software, the so-called driver.

Boot process: Other hardware


Then, the BIOS goes through all available hardware and initialises as far as possible without more
complex driver software (UEFI has more options).

Examples are type and size of hard-disk, DVD drive, timing of RAM (random access memory) chips,
networking, sound, etc.

Boot process: Find Operating System


Only devices with very restricted resources have all the software needed to run (the Operating
System) stored in the firmware. All other computers need to load the Operating System (e.g.
Windows, Linux, Android, iOS) from some non-volatile storage. This storage must be configured to
support booting from it and it must be enabled for booting in the BIOS configuration.

Thus, BIOS needs to look for a bootable drive. This can be on a hard-disk, SD-Card, USB Stick, DVD,
floppy disk, etc.

The order in which the BIOS checks for anything bootable is defined in the BIOS configuration (usually
accessible by holding a particular key while the start-up screen is shown). In UEFI systems that enable
faster booting, this option is not always available; it first needs to be activated in the system
configuration on a running system, or it becomes automatically available if the Operating System fails
to load.

Boot process: Boot sector


On a bootable drive, there needs to be a boot sector with code that can be executed (called the boot
loader). On a hard disk, this information is in the Master Boot Record (MBR). The boot loader first
loads the core part of the operating system, the kernel. Then it loads various modules, device
drivers, etc.

Once all drivers are loaded, the Graphical User Interface (GUI) is started and personal settings are
loaded.

The computer is now ready to use.


Operating Systems
An operating system provides a level of abstraction between hardware and software. This is an
important concept in IT: We want to hide complicated, diverse, low-level concepts behind a simple
interface. The following figure shows how an OS fits into our overall view of a computer:

Operating Systems have the following core tasks:

-Managing multiple processes running in parallel. A process is a program that is currently being
executed.

-Managing the memory that processes use.

-Providing access to file systems, the network and other I/O resources.

Abstraction
The main goal of an OS is to make computers easier to use, both for end users and for programmers.

For end users, an OS typically provides a consistent user interface, and it manages multiple
applications running simultaneously. Most OSs also provide some level of protection against
malicious or buggy code.

For programmers, the OS provides a programming interface that enables easy access to the
hardware and input/output devices. The OS also manages system resources such as memory,
storage and network.

We can summarise these functions in one word: abstraction. The OS hides some of the complexity
behind consistent, well-documented interfaces – both for the end user, and for the programmer.

Processes and Programs


Process: A process is a running instance of a program.

Program: A sequence of instructions.


So a program is the code that you write, or the result of compiling the code into machine code. In
most cases, a program is a file that’s stored on disk (or maybe printed in a textbook). A process, on
the other hand, is created by loading the program code into the computer and then executing it.

OS abstractions through virtualisation


Operating Systems achieve abstraction through virtualisation. This means that they provide a virtual
form of each physical resource to each process. Physical resources include the CPU, the memory,
the external storage (hard disks etc.) and the network and other I/O devices. Instead of using these
physical resources directly, the process uses functionality in the OS to get access, and the OS can
provide the “illusion” that each process

• has the CPU completely to itself (i.e., there are no other processes running)
• has a large, contiguous memory just for itself
• has exclusive access to system resources (i.e., it doesn’t have to worry about sharing these
resources with other processes)

Virtualising the CPU


The goal of virtualising the CPU is to run multiple processes “simultaneously”, but in a way that
individual processes don’t need to know that they are running in parallel with any other process. The
Operating System creates this illusion of many processes running in parallel by switching between
the processes several times per second.

Virtualisation Mechanisms
The mechanisms for virtualising a CPU classify each process as being in one of three states: ready,
running, or blocked.

When the process is created by loading program code into memory, it is put into the ready state.
Ready means ready for execution, but not currently being executed by the CPU.

When the OS decides that the process should now be executed, it puts it into the running state. We
say that the OS schedules the process. In the running state, the code for the process is actually
executed on the CPU.

Now one of two things can happen. Either the OS decides that the time for this process is up. In this
case, the process is de-scheduled, putting it back into the ready state, from which it will be
scheduled again in the future.

Or, the process requests some I/O to happen, e.g. opening a file from an external storage. Since I/O
can take a long time, and the process will have to wait until the file has been opened, the OS takes
advantage of this opportunity and puts the process into the blocked state. As soon as the I/O is
finished, the process will be put back into ready, from where it will be scheduled again.
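
The allowed transitions can be summarised in a small sketch; the state and event names below are just labels chosen for this illustration.

# The three process states and the transitions between them.
TRANSITIONS = {
    ("ready",   "schedule"):    "running",  # OS picks the process to run
    ("running", "deschedule"):  "ready",    # time slice is up
    ("running", "request_io"):  "blocked",  # process has to wait for I/O
    ("blocked", "io_finished"): "ready",    # I/O done, can be scheduled again
}

def next_state(state, event):
    # Events that are not allowed in the current state leave it unchanged.
    return TRANSITIONS.get((state, event), state)

state = "ready"
for event in ["schedule", "request_io", "io_finished", "schedule", "deschedule"]:
    state = next_state(state, event)
    print(event, "->", state)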


Limited Direct Execution (LDE)


In order to achieve good performance, each process is allowed to run directly on the CPU. To keep this safe, the CPU distinguishes between two modes of execution: user mode and kernel mode.

In user mode, only a subset of all instructions are allowed to be executed. Typically, instructions that
perform I/O would not be allowed in user mode, and we will see in the module on Virtual Memory
that certain memory operations are also blocked. Normal applications are executed in user mode.


In kernel mode, code is run without any restrictions, i.e., it has full access to I/O devices and
memory. The OS runs in kernel mode. Whenever an interrupt happens (as discussed in Input/Output
devices), the CPU switches into kernel mode to execute the interrupt handler (which is part of the
OS, since it deals with I/O). Kernel mode is sometimes also called supervisor mode.

System Calls
A system call is, at its core, a special CPU instruction that switches the CPU from user mode into
kernel mode and then jumps to a special subroutine of the OS.

To enable this, the OS sets up a table of system call handlers. This table is just a contiguous block of
memory, and each location contains an address of a subroutine that performs one particular
function. A user mode application can then put the number of the OS subroutine it wants to call into
a register before triggering an interrupt or calling the special system call instruction.
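
A rough sketch of the dispatch mechanism is shown below. The handler names and numbers are invented for illustration; a real OS stores subroutine addresses in the table and runs the handlers in kernel mode.

# Toy model of a system call table: each entry maps a system call number
# to the OS subroutine that handles it.
def sys_read(args):
    return f"reading {args}"

def sys_write(args):
    return f"writing {args}"

SYSCALL_TABLE = {0: sys_read, 1: sys_write}   # hypothetical numbering

def syscall(number, args):
    # In hardware: switch to kernel mode, then jump to the subroutine whose
    # address is stored at entry `number` of the table.
    handler = SYSCALL_TABLE[number]
    return handler(args)

print(syscall(1, "hello.txt"))   # user code asks the OS to perform a write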

Cooperative and Preemptive Timesharing


Cooperative timesharing means that all processes must cooperate with the OS and make system calls at
regular intervals; otherwise other processes can “starve” (i.e., not get scheduled at all).

The advantage of cooperative timesharing is that it is relatively easy to implement, but the downside
is that buggy or malicious processes can make a system unusable.

The alternative is based on timer interrupts. These are hardware circuits that generate an interrupt at regular intervals, usually
after a programmable number of clock ticks. This gives the OS full control: it sets up a timer interrupt
before executing the context switch to a process, so it is guaranteed that the process will be
preempted at the latest when the timer “fires”, if it doesn’t make any system calls before that.
Consequently, we call this preemptive timesharing.

In preemptive timesharing systems, the OS (or the user) can always kill buggy or malicious processes
(e.g. through the task manager in Windows or the kill command in Linux and Mac OS), since the OS
will regain control of the system several times per second.

Process Scheduling
First-come first-served
This is a simple policy that you are familiar with from the supermarket checkout. Let’s assume a
single checkout is open, and five people are queuing. The first person buys two items, the next one
three, the third has a single item, the fourth buys six items, and you are the final customer with just
a single item in your trolley. Let’s make the simple assumption that the time each customer takes at
the checkout just depends on the number of items they buy. So customer 1 requires two time units,
customer 4 requires six time units, and you take a single time unit. Serving everyone in arrival order
gives completion times of 2, 5, 6, 12 and 13, i.e., an average turnaround time of 7.6.

Shortest job first


As you can see, the average turnaround time has been reduced to 5.4, and in fact it’s possible to
show that this policy always results in an optimal schedule with respect to turnaround time. We call
this the shortest job first policy. Of course it wouldn’t be that easy to implement this strategy in a
supermarket with a single checkout, because customers would get angry if we start allowing people
with few items in their trolleys to jump the queue. But adding “express checkouts” for customers
with few items has a similar effect.
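
A short sketch that reproduces these numbers for the checkout example (service times 2, 3, 1, 6, 1), assuming every customer arrives at time 0 and turnaround time is the time until a customer is finished:

# Average turnaround time for first-come first-served vs. shortest job first.
def average_turnaround(service_times):
    clock, total = 0, 0
    for t in service_times:
        clock += t        # this customer finishes at the current clock time
        total += clock    # turnaround time, assuming arrival at time 0
    return total / len(service_times)

jobs = [2, 3, 1, 6, 1]                    # items per customer, in queue order
print(average_turnaround(jobs))           # first-come first-served: 7.6
print(average_turnaround(sorted(jobs)))   # shortest job first: 5.4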

Round-robin scheduling
Compared to the previous two policies, the next one will split up each process into short time
slices. The OS can then cycle through all the processes one by one.

In the figure, you can see how P1 (which takes two time units in total) has been split into four short
time slices, and P2 has been split into six. The schedule first cycles through all five processes twice.
After that, P3 and P5 have already finished, so we keep cycling through P1, P2 and P4, which are
repeated twice. Then P1 finishes, and we get P2, P4, P2, and then the rest of P4.

This type of scheduling produces a fair schedule, which means that during a certain time interval, all
processes get roughly equal access to the CPU.

Virtualising the Memory


Virtualising the memory has three main goals:

1. To enable protection of a process’s memory against access from other (malicious or buggy)
processes.
2. To make programming easier because a programmer does not need to know exactly how
the memory on the target computer is organised (e.g. how much RAM is installed, and at
which address the program will be loaded into RAM).
3. To enable processes to use more memory than is physically installed as RAM in the
computer, by using external storage (e.g. hard disks) as temporary memory.

Virtualising the CPU meant that for each process, it looks as if it had exclusive access to the CPU. The
same holds for virtual memory: a process does not need to “know” that there are other processes
with which it shares the memory.

Address space
We call the addresses that can be used by a process its address space.


This is the same situation as in early computers, which only ran a single program at a time. The OS
was typically loaded as a library, starting at address 0, and the user program and data used the
addresses behind the OS, as illustrated in the figure below.

Multiprogramming
The first multiprogramming systems divided up the memory, and each process was allocated a fixed
region that it was allowed to use. This figure shows three processes sharing the memory, and a few
regions of memory still being free for new processes.

Virtual memory
In a virtual memory system, the instructions operate on virtual addresses, which the OS, together
with the hardware, translates into physical addresses (i.e., the actual addresses of the RAM).


Memory Protection
Staying with our simple model, we can extend the system by one more register, the bounds register,
which contains the highest address that the current process is allowed to access. The CPU will then
check for each memory access whether it is an address between the base register and the bounds
register. If the process tries to access memory outside of its address space, this generates an
interrupt, which causes a context switch and gives the operating system the chance to kill the
process.
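
A minimal sketch of that check, following the description above (the trap is simplified to a Python exception):

# Simplified memory protection: every access must lie between the base
# register and the bounds register of the current process.
def check_access(address, base, bounds):
    if not (base <= address <= bounds):
        # In hardware this generates an interrupt; after the context switch
        # the OS may decide to kill the offending process.
        raise MemoryError("access outside of the process's address space")
    return address

print(check_access(5000, base=4096, bounds=8191))   # allowed
try:
    check_access(9000, base=4096, bounds=8191)      # outside the region
except MemoryError as error:
    print("trap:", error)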

Networks
A client is a device that enables users to access the network, e.g. a computer, a laptop or a
smartphone.

A server is a device (usually a dedicated computer) that provides services to clients. For example,
when you open the Monash homepage in a web browser, your client establishes a connection to the
Monash web server computer, which sends the requested information (text, images etc.) back to
you. In addition to sending information back to you, a server can also provide other types of services.

A switch connects multiple devices to form a Local Area Network (LAN). All devices in the same LAN
can directly communicate with each other.

A router connects different networks. If a device wants to communicate with another device that is
outside of its own network, it has to send the messages via a router.

Addresses
Clients, routers and servers are all identified by addresses (for example, their IP addresses).
An address, in general, is a unique identifier.

Types of Networks
A Local Area Network (LAN) is a group of clients and/or servers that share a local circuit, i.e., they
are directly connected to each other using just switches and cables or radio waves. All devices in a
LAN can communicate directly with each other (without going through a router).


A Metropolitan Area Network (MAN) is the next larger scale network. It can span several kilometres,
and connects LANs and BNs across different locations. For instance, the Monash Caulfield and
Clayton campuses are connected via a MAN. A MAN is usually not built and owned by the
organisation that uses it, but leased from a telecommunications company.

A Wide Area Network (WAN) is very similar to a MAN except that it would connect networks over
large distances. For example, if Monash had a direct connection between its Australian and
Malaysian campuses, that would be considered a WAN. Just as with MANs, the actual circuits used
for WANs are usually owned and operated by third-party companies who sell access to their
networks.

Network application architectures


The presentation logic is the part of the application that provides the user interface, i.e., the
elements such as menus, windows, buttons etc. that the end user interacts with.


The application logic or business logic defines how the application behaves, i.e., it implements what
should happen when the user clicks a button or receives a new message or types a word.

The data access logic defines how the application manages its data; it is responsible, for example, for
updating text documents when the user makes changes, or retrieving a piece of information when
the user performs a search.

The data storage is where the data is kept, e.g. in the form of files on a disk.

Protocols
A protocol is a formal language that defines how two applications talk to each other. In the case of
the “middle layers” of the Internet Model (i.e., data link, network and transport), the protocols take
the form of well-defined headers of data that are added to the actual message data by the sender,
containing information such as sender and receiver addresses, the type of content of the message,
and error detection codes like a CRC.

In analogy to the regular postal service, each layer adds its own envelope and puts the envelope from
the layer above inside. So, the actual packet that is being transmitted by the hardware would be a
data link layer envelope that contains a network layer envelope that contains a transport layer
envelope that contains (part of) the application layer message! This is called message encapsulation.

These “envelopes” are called protocol data units (PDU), and the PDU for each layer has its own
name. At the hardware layer, the PDU is simply a bit. At the data link layer, the PDU is called
a frame. The network layer PDU is called a packet, and the transport layer PDU is a segment or
a datagram (depending on the concrete protocol used). At the application layer, we generally talk
about messages.

The Internet Model


Hardware Layer
The hardware layer (layer 1, also known as the physical layer) is concerned with the actual
hardware, such as cables, plugs and sockets, antennas, etc. It also specifies the signals that are
transmitted over cables or radio waves, i.e., how the sequence of bits that make up a packet is
converted into electrical or optical wave forms.

Network Hardware
Network Interface Card
The NICs are the hardware components that connect devices to the network. In the case of wired
networks, the physical connection is provided in the form of a socket into which you can plug the
network cable. For wireless networks, a NIC is connected to an antenna in order to send and receive
radio signals.

Network Cables
Networks can be built using a variety of different cables. The first distinction can be made by the
main material of the cable. If the network uses electrical signals, the cable will contain copper wires.
Examples for this type of cable are


• UTP (Unshielded Twisted Pair) cables contain several pairs of copper wires, and each pair
is twisted together in order to reduce interference. Each individual copper wire is
surrounded by a plastic insulation layer, and there’s another layer of insulation surrounding
all the pairs. This is the most common type of LAN cable. UTP cables come in
different categories. The higher the number, the better the quality of the cable, and the
higher the transmission rate they can be used for.

• STP (Shielded Twisted Pair) is similar to UTP, but adds metal shielding to provide
better protection from electromagnetic interference. STP cables are required for very high
speed Ethernet (beyond 10 Gb/s), or in environments with strong electromagnetic
interference.

• Coaxial cables. These look like TV cables, with an inner wire surrounded by an insulation
layer and a wire mesh. The original Ethernet used coaxial cables, but they are not common
in network installations any more.

Physical media
We transmit information using physical signals

A signal travels through a medium:

• electrical signals through e.g. copper wires

• radio waves through “air” (or, really, space)

• light signals through space or optical fibres

Digital vs Analog
DATA
Digital data:
• Discrete values (e.g. 0 and 1, or characters in the alphabet)

• Discrete step from one symbol to the next

Analog data:
• Range of possible values (e.g. temperature, air pressure)

• Continuous variation over time

SIGNAL
Digital signal:
• Waveform with limited number of discrete states

Analog signal:
• Continuous, often sinusoidal wave

• E.g. sound (pressure wave in air), light and radio (electromagnetic waves)

Transmission types
Analog signals for analog data:
• e.g. analog FM radio


Digital signals for digital data:


• e.g. old Ethernet, USB, the bus in a computer

Analog signals for digital data


• e.g. modems, ADSL, Ethernet, WiFi, 4G

Digital transmission
• Digital signals are typically transmitted through copper cables

• A digital signal encodes 0s and 1s into different voltage levels on the cable

• This results in a square wave

• Simplest encoding: unipolar
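
A tiny sketch of the unipolar encoding, assuming 0 is sent as 0 volts and 1 as some fixed positive voltage (the exact levels are illustrative):

# Unipolar encoding: each bit becomes one voltage level per clock period,
# producing the square wave described above.
HIGH, LOW = 5.0, 0.0     # illustrative voltage levels

def unipolar(bits):
    return [HIGH if bit == 1 else LOW for bit in bits]

print(unipolar([1, 0, 1, 1, 0]))   # [5.0, 0.0, 5.0, 5.0, 0.0]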


Analog Transmission


Data Link Layer


The data link layer (layer 2) defines the interface between hardware and software. It specifies how
devices within one local area network, e.g. those connected directly via cables or radio waves to a
switch, can exchange packets.

The data link layer:

• controls access to the physical layer (MAC = Media Access Control)

• encodes/decodes between frames and signals

• implements error detection

• interfaces to the network layer

There are two approaches to MAC.

The first one is controlled access, where only one device has permission to send at any point in time,
and we either have a central authority assigning permission to send, or the permission gets passed
from device to device.

The second approach to MAC is called contention-based access. Here, access is provided on a first-
come first-served basis, i.e., any device can start transmitting at any time.

Any device can transmit at any time

• “first come first served”

Collisions: two devices transmitting at the same time

• packets in a collision are damaged

• avoid collisions by carrier sensing (listening on the network for transmission)

• detect collisions and re-transmit

Used in Ethernet

Ethernet MAC
Media Access Control: CSMA/CD

• Carrier Sense (CS):

listen on bus, only transmit if no other signal is "sensed"

• Multiple Access (MA):

several devices access the same medium

• Collision Detection (CD):

when signal other than own is detected:

• transmit jam signal (so all other devices detect collision)

• both wait random time before re-transmitting
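
A very simplified sketch of this loop for a single device is given below. The helper functions (medium_busy, send, collision_detected, send_jam) are hypothetical stand-ins for the hardware operations; real Ethernet NICs implement this in hardware and use exponential backoff.

import random
import time

# Simplified CSMA/CD transmit loop for one device.
def transmit(frame, medium_busy, send, collision_detected, send_jam):
    while True:
        while medium_busy():           # Carrier Sense: wait until medium is free
            time.sleep(0.001)
        send(frame)                    # Multiple Access: start transmitting
        if not collision_detected():   # Collision Detection
            return True                # frame went through without a collision
        send_jam()                     # tell all other devices about the collision
        time.sleep(random.random())    # wait a random time before re-transmitting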


Network Layer
The network layer (layer 3) is responsible for routing, i.e., deciding which path a packet takes
through the network. There are often multiple possible paths, and each individual packet may take a
different path from source to destination. The network layer is of course most important for routers,
which are the network devices whose primary task it is to perform routing.


Transport Layer
The transport layer (layer 4) establishes a logical connection between an application sending a
message and the receiving application. It takes care of breaking up a large message into individual
packets and reassembles them at the receiving side. It also makes sure that messages are received
correctly, re-sending packets if they were received with errors (or not received at all).

Transmission Control Protocol (TCP)


Connection-oriented

• A virtual circuit is established between two devices

• To the application it always looks like a point-to-point Full duplex connection

• Messages split into segments for transmission

Reliable

• Errors are detected and corrected

• Segments are re-assembled in the correct order

Used by e.g. HTTP, SMTP, IMAP, SSH

Addresses
Since many applications (processes) are running on the same machine, the TCP/IP system needs a
way of sending each packet to the correct process. IP addresses are not good enough: they only
identify an interface (and therefore one device), but not a process on that device.

The port number (or port address) is used to address an individual application (or process). Server
processes use standard port numbers so that clients know which port to connect to (e.g. port 80
stands for http, port 25 for SMTP).

Clients create random port numbers for each connection (which can be reused when the connection
is closed). A connection is therefore uniquely identified by four numbers: source IP, source port,
destination IP, destination port.

Addresses per Layer


Application Layer

• URL (e.g. http://www.csse.monash.edu)

Transport Layer (TCP)

• Port number (e.g. 80 for HTTP)

• identifies the application that handles a message

Network Layer (IP)

• IP address (e.g. 130.194.66.43)

• used for identifying devices across networks


Data Link Layer (Ethernet)

• MAC address (e.g. 00:23:ae:e7:52:85)

• used for sending frames in a LAN

TCP ARQ
Error control

• Data Link Layer discards frames that have errors

• Frames may not arrive at all

• But TCP should be a reliable channel!

Solution: Automatic Repeat ReQuest

• Exchange acknowledgements (ACK), letting sender know that packets were received correctly

• Sender re-transmits if no ACK within certain time

When we send a TCP packet, it includes two numbers.

Sequence number:

• how many bytes we've already transmitted (before this one)

Acknowledgement number:

• how many bytes we've received from the other side

Sender can therefore check how many bytes have been received correctly!

Establishing a connection
Three-way handshake:

• Client sends a SYN packet with random sequence number A

• Server replies with SYN, ACK, acknowledgement number A+1, and random sequence number B

• Client sends ACK with sequence number A+1 and acknowledgement number B+1

After the three-way handshake, the server and client both know each other's random sequence
number. That way they now have shared knowledge and can start transmitting the actual data.
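
A small sketch of just the sequence/acknowledgement arithmetic (packets reduced to their flags and the two numbers):

import random

# Numbers exchanged during the TCP three-way handshake.
A = random.randrange(2**32)   # client's random initial sequence number
B = random.randrange(2**32)   # server's random initial sequence number

syn     = {"flags": "SYN",     "seq": A}                      # client -> server
syn_ack = {"flags": "SYN,ACK", "seq": B,     "ack": A + 1}    # server -> client
ack     = {"flags": "ACK",     "seq": A + 1, "ack": B + 1}    # client -> server

for packet in (syn, syn_ack, ack):
    print(packet)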

Closing a connection
Four-way handshake:

• Computer A (client or server!) sends a FIN packet

• Computer B acknowledges with an ACK

• Computer B sends a FIN packet

• Computer A acknowledges with an ACK


• Can be simplified to three-way (combining a FIN/ACK)

Necessary because TCP is full duplex!

We don't know in advance whether e.g. the client or the server is finished sending data first, so both
need to be able to close the connection individually. If computer A sends FIN first, it needs to wait
until computer B also sends a FIN.


Application Layer
The application layer (layer 5), finally, is the actual application software that a user interacts with.
For example, your web browser or your instant messaging app are implemented at the application
layer.

URLs, HTML and HTTP


Uniform Resource Locators
A URL is a textual address that uniquely identifies where to find a particular document on the
Internet, and how to retrieve it. For example, the following URL identifies this document:

https://www.alexandriarepository.org/module/application-layer/

The URL above has three components:

 The scheme describes which protocol must be used to retrieve the document. In this case,
the scheme is https, which is a secure version of HTTP (discussed in more detail in Security).
 The host identifies the server. In the URL above, the host is www.alexandriarepository.org.
 The path identifies a particular document on the server. The name of the document here is /
module/application-layer/
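
These components can be recovered with Python's standard urllib.parse module, applied here to the URL from the text:

from urllib.parse import urlsplit

# Split the example URL into scheme, host and path.
url = "https://www.alexandriarepository.org/module/application-layer/"
parts = urlsplit(url)

print(parts.scheme)   # https
print(parts.netloc)   # www.alexandriarepository.org
print(parts.path)     # /module/application-layer/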

HTML documents
The earliest web pages contained just plain text and links to other pages. Modern web pages are of
course often results of very careful design, using many graphical elements, different fonts, and
media such as audio and video. But at its core, a web page is still represented by a document written
in the Hypertext Markup Language (HTML).

The HTTP request-response cycle


The final piece of the puzzle is the Hypertext Transfer Protocol (HTTP). It defines how a web browser can
request a document from a web server, and how the web server responds (delivering the document,
or perhaps sending an error message if something went wrong).

A request consists of the request line, which includes the method, a path, and the protocol version
that is to be used (HTTP/1.1 in our example). The request line is followed by the request header,
which must always include the Host: ... line, indicating the name of the host, but it can include other
lines specifying additional information. For example, the web browser can request that the server
only return certain types of files, or only files newer than a certain date, or only documents written
in a particular language. The request header needs to be followed by a blank line. If the request
requires sending a document to the server (as in the case of a POST request), then the document is
sent as the so-called request body after the header.

A response has a similar structure. The first line is always the response status. In the example above,
the server replied with HTTP/1.0 200 OK. This identifies the protocol version that the server uses,
and the status both as a number (200) and as text (OK). They both mean the same thing: the numerical
version is easy to interpret for the browser, the text version is easy to read for humans. Other
potential status codes include 404 Not found if the server can’t find the requested document, or 403
Forbidden if the browser doesn’t have the right permissions to access the document (e.g. you need to
log in first).
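
The structure described above can be illustrated by building a request by hand. The host and path below are placeholders, and only the text format is shown; no connection is actually opened.

# A minimal HTTP/1.1 GET request: request line, Host header, optional extra
# headers, then a blank line to end the header.
host = "www.example.org"            # placeholder host
path = "/index.html"                # placeholder path

request = (
    f"GET {path} HTTP/1.1\r\n"      # request line: method, path, version
    f"Host: {host}\r\n"             # mandatory Host header
    f"Accept-Language: en\r\n"      # example of an additional header line
    "\r\n"                          # blank line ends the request header
)
print(request)

# The first line of a response is the status line, e.g.:
status_line = "HTTP/1.0 200 OK"
version, code, reason = status_line.split(" ", 2)
print(version, code, reason)        # HTTP/1.0 200 OK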

Email Protocols
Simple Mail Transfer Protocol (SMTP)
• Handles transfer of text messages between email client and mail server,
and between mail servers
Post Office Protocol (POP)

• Messages are downloaded onto client and deleted from server

Internet Message Access Protocol (IMAP)

• Messages remain on server

• Multiple clients can be connected simultaneously to same mailbox


Security
Security Protocols


Cryptography
The Merriam Webster Dictionary defines cryptography as

1: secret writing
2: the enciphering and deciphering of messages in secret code or cipher; also : the computerized
encoding and decoding of information

A more restricted view of cryptography would define it as “the coding and decoding of secret
messages” (Merriam Webster Student Dictionary).

Symmetric Encryption

The main idea of a symmetric algorithm is to use the same key for encryption and decryption.
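
As a toy illustration only (a simple XOR scheme, not a real algorithm such as AES), the following sketch shows the defining property: the same key is used to encrypt and to decrypt.

# Toy symmetric "cipher": XOR each byte of the message with the key.
# For illustration only; this provides no real security.
def xor_crypt(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret"
ciphertext = xor_crypt(b"attack at dawn", key)   # encrypt with the key
plaintext = xor_crypt(ciphertext, key)           # decrypt with the same key
print(plaintext)                                 # b'attack at dawn'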

Public key cryptography

The main difference from symmetric cryptography is that instead of a single symmetric key,
there is now a pair of keys (the private/secret key and the public key) used for encryption or for
digital signatures.
