3 Transport Layer
IP (ICMP, ARP): best-effort connectionless packet transfer
Ethernet frame: Ethernet header | IP header | TCP header | HTTP Request | FCS
The Ethernet header contains the source & destination MAC addresses
Chapter 3 outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 principles of congestion control
3.7 TCP congestion control
[Figure: application processes P1 to P4 running on Host 1, Host 2 and Host 3]
Demultiplexing at the Transport Layer
Host receives IP datagrams
  Each datagram has source IP address, destination IP address
  Each datagram carries one transport-layer segment
  Each segment has source, destination port number
Host uses IP addresses & port numbers to direct segment to appropriate socket
[Figure: TCP/UDP segment format, 32 bits wide: Source Port #, Dest Port #; other header fields; Application Data (Message)]
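The lookup described above can be sketched as two small tables. This is a minimal illustration, not how an operating system actually stores sockets: the socket names and the second TCP entry (host A, port 9157) are hypothetical, while the B/5775/C/80 values are the ones used in the example segment on these slides.

```python
from typing import Dict, Tuple

# UDP: the destination port alone selects the socket (hypothetical table).
udp_sockets: Dict[int, str] = {53: "dns_socket"}

# TCP: the full 4-tuple (source IP, source port, dest IP, dest port)
# selects the socket, so two clients talking to port 80 get different sockets.
tcp_sockets: Dict[Tuple[str, int, str, int], str] = {
    ("B", 5775, "C", 80): "web_socket_1",
    ("A", 9157, "C", 80): "web_socket_2",  # hypothetical second client
}


def demux_udp(dst_port: int) -> str:
    """Return the socket that receives a UDP segment for dst_port."""
    return udp_sockets[dst_port]


def demux_tcp(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Return the socket that receives a TCP segment with this 4-tuple."""
    return tcp_sockets[(src_ip, src_port, dst_ip, dst_port)]
```

Calling `demux_tcp("B", 5775, "C", 80)` and `demux_tcp("A", 9157, "C", 80)` returns two different sockets even though both segments target port 80.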
[Figure: processes P1 to P6 and their sockets. Example segment: SP: 5775, DP: 80, S-IP: B, D-IP: C]
• This is a Web Server example as the segments are being sent to Port 80 of the
server which corresponds to the HTTP Service
• Note that in this case, the server is creating a separate process for each of the
sockets. This would be inefficient (see next slide for a more efficient example with
“threading”)
Connection-oriented Demultiplexing (TCP)
Threaded Web Server
[Figure: threaded server with processes P1 to P4. Example segment: SP: 5775, DP: 80, S-IP: B, D-IP: C]
• This is also a Web Server example as the segments are being sent to Port 80 of the
server which corresponds to the HTTP Service
• Note that in this case, the server is creating one process for all the sockets. A new
thread (kind of like a sub-process) is created for each socket
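One way to picture the one-process, one-thread-per-socket design is Python's standard `socketserver.ThreadingTCPServer`. This is only a sketch under assumptions: the `EchoHandler` class, the `demo` function and the echo behaviour are made up for illustration, and it binds to an OS-chosen loopback port rather than port 80.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Runs in its own thread for each accepted connection socket."""

    def handle(self) -> None:
        data = self.request.recv(1024)
        self.request.sendall(b"echo:" + data)


def demo() -> bytes:
    # Port 0 lets the OS pick a free port; a real web server would use 80.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler) as server:
        server.daemon_threads = True
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"GET /")
            reply = client.recv(1024)
        server.shutdown()  # stop the accept loop before closing the server
        return reply
```

All connections are served inside a single process; only the handler threads multiply.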
UDP Pseudoheader
(used in checksum calculation but never actually transmitted, nor is it included in the “Length”)
Fields shown: Source IP Address, Destination IP Address
Note that IP Address information will come from another layer (Network Layer).
Strictly speaking, this goes against the philosophy of keeping the layers separate
from each other.
Internet Checksum
Several Internet protocols (e.g. IP, TCP, UDP) use check bits to detect errors in the IP header (or in the header and data for TCP/UDP)
A checksum is calculated for the header contents and included in a special field.
The IP header checksum is recalculated at every router, so the algorithm was selected for ease of implementation in software
Let the header consist of L 16-bit words, b_0, b_1, b_2, ..., b_{L-1}
The algorithm appends a 16-bit checksum b_L
Don’t take this example too seriously, as it uses only 4-bit words for simplicity! Real Internet checksum calculations (shown next) use 16-bit words.
Kurose & Ross 3-22
Internet Checksum Example
(A more complex one, using 16-bit words and mod 2^16 − 1 arithmetic)
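A minimal sketch of the 16-bit calculation follows. Adding with end-around carry is the usual way to do the mod 2^16 − 1 arithmetic in software; the function names and the three sample words are assumptions chosen for illustration, not values from these slides.

```python
from typing import Sequence


def internet_checksum(words: Sequence[int]) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    total = 0
    for word in words:
        total += word
        # Fold any carry out of bit 15 back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verify(words: Sequence[int], checksum: int) -> bool:
    """Receiver check: words plus checksum must sum to all ones."""
    return internet_checksum(list(words) + [checksum]) == 0


sample = [0x6660, 0x5555, 0x8F0C]  # illustrative 16-bit words
checksum = internet_checksum(sample)
```

For the sample words the folded sum is 0x4AC2, so the checksum is its complement, 0xB53D, and `verify(sample, checksum)` is true.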
UDP Demultiplexing (based on destination port #)
Datagram is demultiplexed to its appropriate port
[Figure: send side and receive side]
Transmit next frame only after ACK is received for the earlier frame
Pipelined protocols
pipelining: sender allows multiple “in-flight”, yet-to-be-acknowledged packets
range of sequence numbers must be increased
buffering at sender and/or receiver
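$$U_{\text{sender}} = \frac{3L/R}{RTT + L/R} = \frac{0.024}{30.008} \approx 0.0008$$
(times in ms: the denominator gives RTT = 30 and L/R = 0.008, so 3L/R = 0.024)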
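The utilization arithmetic can be checked with a few lines of Python. The inputs are read off the denominator above (RTT = 30, L/R = 0.008, in the same time unit); the function name is hypothetical.

```python
def sender_utilization(window: int, l_over_r: float, rtt: float) -> float:
    """Fraction of time the sender is busy with `window` packets in flight."""
    return (window * l_over_r) / (rtt + l_over_r)


stop_and_wait = sender_utilization(1, 0.008, 30.0)  # one packet per RTT
pipelined = sender_utilization(3, 0.008, 30.0)      # three packets in flight
```

With three packets in flight the utilization is three times the stop-and-wait value, roughly 0.0008 instead of roughly 0.00027.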
[Figure: TCP segment layout, fields as recovered]
Sequence Number
Acknowledgment Number
Header Length | Reserved | URG ACK PSH RST SYN FIN | Window Size
Options | Padding
Data
Each TCP segment has header of 20 or more bytes + 0 or more bytes of data
TCP Header
Reserved: 6 bits; Control: 6 bits
URG: urgent pointer flag
  Urgent message end = SN + urgent pointer
ACK: ACK packet flag
PSH: override TCP buffering
RST: reset connection
  Upon receipt of RST, connection is terminated and application layer notified
SYN: establish connection
FIN: close connection
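As a rough illustration of where the six control bits sit, the sketch below packs and unpacks a minimal 20-byte header with Python's `struct` module. It assumes the standard TCP layout (ports, sequence number, acknowledgment number, header length plus flags, window, checksum, urgent pointer) and leaves the checksum and urgent pointer at zero; the helper names are hypothetical.

```python
import struct
from typing import Iterable, Set

# Bit masks for the six control flags in the low bits of the 16-bit word
# that also carries the 4-bit header length.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}
HEADER_FORMAT = "!HHIIHHHH"  # network byte order, 20 bytes in total


def build_header(src_port: int, dst_port: int, seq: int, ack: int,
                 flags: Iterable[str], window: int) -> bytes:
    """Build a 20-byte header (header length = 5 words, no options)."""
    offset_flags = (5 << 12) | sum(FLAG_BITS[f] for f in flags)
    # Checksum and urgent pointer are left as 0 in this sketch.
    return struct.pack(HEADER_FORMAT, src_port, dst_port, seq, ack,
                       offset_flags, window, 0, 0)


def parse_flags(header: bytes) -> Set[str]:
    """Return the names of the control flags set in the header."""
    offset_flags = struct.unpack(HEADER_FORMAT, header[:20])[4]
    return {name for name, bit in FLAG_BITS.items() if offset_flags & bit}
```

For example, `parse_flags(build_header(5775, 80, 42, 79, ["SYN", "ACK"], 1024))` gives back the set containing `SYN` and `ACK`.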
TCP Header
[Figure: 32-bit-wide layout (bit positions 0, 8, 16, 31) with Source IP address and Destination IP address]
Options
  Variable length
  NOP (No Operation) option is used to pad TCP header to a multiple of 32 bits
  Time stamp option is used for round trip measurements
  Maximum Segment Size (MSS) option specifies the largest segment a receiver wants to receive
  Window Scale option increases TCP window from 16 to 32 bits
TCP Services
[Figure: sequence and acknowledgment numbers in a simple echo exchange]
User types ‘C’: Seq=42, ACK=79, data = ‘C’
Host ACKs receipt of ‘C’, echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
Host ACKs receipt of echoed ‘C’: Seq=43, ACK=80
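The numbers in this exchange follow one rule: a reply uses the peer's ACK value as its own sequence number and acknowledges the peer's sequence number plus the number of data bytes received. A small sketch, with a hypothetical `Segment` type and `reply_to` helper:

```python
from typing import NamedTuple


class Segment(NamedTuple):
    seq: int
    ack: int
    data: bytes


def reply_to(received: Segment, data: bytes = b"") -> Segment:
    """Build the reply to `received`, optionally carrying `data`."""
    return Segment(
        seq=received.ack,                        # next byte the peer expects
        ack=received.seq + len(received.data),   # next byte we expect
        data=data,
    )


typed = Segment(seq=42, ack=79, data=b"C")  # user types 'C'
echo = reply_to(typed, b"C")                # ACKs 'C' and echoes it back
final = reply_to(echo)                      # ACKs the echoed 'C'
```

This reproduces the three segments listed above: Seq=42/ACK=79, Seq=79/ACK=43 and Seq=43/ACK=80.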
[Figure: SampleRTT and EstimatedRTT, RTT (milliseconds) versus time (seconds)]
$$\text{DevRTT} = (1-\beta)\cdot\text{DevRTT} + \beta\cdot|\text{SampleRTT}-\text{EstimatedRTT}|$$
(typically, $\beta = 0.25$)
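The DevRTT update can be tried out with the sketch below. Only the DevRTT formula and β = 0.25 come from the slide; the EstimatedRTT update with α = 0.125 and the timeout rule EstimatedRTT + 4·DevRTT are the commonly used companions (as in RFC 6298) and are assumptions here, as is the choice to update DevRTT before EstimatedRTT.

```python
from typing import Tuple


def update_rtt(estimated: float, dev: float, sample: float,
               alpha: float = 0.125, beta: float = 0.25
               ) -> Tuple[float, float, float]:
    """Return (EstimatedRTT, DevRTT, timeout) after one SampleRTT."""
    # DevRTT uses the estimate from before this sample (assumed ordering).
    dev = (1 - beta) * dev + beta * abs(sample - estimated)
    # Assumed companion formula: exponentially weighted moving average.
    estimated = (1 - alpha) * estimated + alpha * sample
    # Assumed timeout rule: estimate plus a safety margin of 4 * DevRTT.
    timeout = estimated + 4 * dev
    return estimated, dev, timeout


est, dev, timeout = update_rtt(estimated=100.0, dev=10.0, sample=120.0)
```

Starting from an estimate of 100 ms and a deviation of 10 ms, a 120 ms sample moves the estimate to 102.5 ms and the deviation to 12.5 ms.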
[Figure: retransmission scenarios, including cumulative ACK. Labels: SendBase=92; Seq=92, 8 bytes of data (shown twice); timeout; ACK=100; ACK=120; SendBase=120; X marks a lost segment]
TCP ACK generation [RFC 1122, RFC 2581]
[Figure: four ACK=100 segments, a timeout, and Seq=100, 20 bytes of data]
flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure labels: IP, code, from sender]
[Figure: flow-control timeline from t0 to t4. The sender repeatedly has 1024 bytes to transmit (128 bytes at one step). At one step only 512 bytes are sent, as that is the advertised value of Win]
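The 512-byte case can be reproduced with a one-line rule: the sender transmits no more than the advertised window allows. The sketch below is an assumption-laden simplification; the function name and the `in_flight` parameter (bytes already sent but not yet acknowledged) are not from the slides.

```python
def bytes_allowed(pending: int, advertised_win: int, in_flight: int = 0) -> int:
    """How many of `pending` bytes the sender may transmit right now."""
    # Unacknowledged bytes already occupy part of the advertised window.
    usable = max(advertised_win - in_flight, 0)
    return min(pending, usable)
```

With 1024 bytes waiting and an advertised window of 512, `bytes_allowed(1024, 512)` returns 512; the remainder has to wait for a new window advertisement.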
Nagle Algorithm
Solution:
TCP sends data & waits for ACK
New characters buffered
Send new characters when ACK arrives
Algorithm adjusts to RTT as follows:
• Short RTT: send frequently at low efficiency
• Long RTT: send less frequently at greater efficiency
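The send-then-buffer-until-ACK behaviour can be sketched as a tiny state machine. This is a simplified model: the `NagleSender` class is hypothetical, it records segments in a list instead of using a socket, and it omits the rule in real implementations that a full-sized segment may be sent without waiting.

```python
from typing import List


class NagleSender:
    """Buffers new characters while one segment is unacknowledged."""

    def __init__(self) -> None:
        self.buffer = b""
        self.awaiting_ack = False
        self.sent: List[bytes] = []  # segments handed to the network

    def write(self, data: bytes) -> None:
        if self.awaiting_ack:
            self.buffer += data      # hold new characters until the ACK
        else:
            self._send(data)

    def on_ack(self) -> None:
        self.awaiting_ack = False
        if self.buffer:
            data, self.buffer = self.buffer, b""
            self._send(data)         # everything buffered goes in one segment

    def _send(self, data: bytes) -> None:
        self.sent.append(data)
        self.awaiting_ack = True
```

Typing `a`, `b`, `c` quickly produces the segment `a` immediately and a single segment `bc` once the first ACK arrives, so a longer RTT leads to fewer, larger segments.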
Silly Window Syndrome
Situation:
Transmitter sends large amount of data
Receiver’s buffer is depleted slowly, so buffer fills up
Every time a few bytes read from buffer, a new advertisement to
transmitter is generated
Sender immediately sends data & fills buffer
This leads to many small, inefficient segments being transmitted
Solution:
Receiver does not advertise window until window is at least ½ of receiver buffer or is equal to the maximum segment size (MSS)
Transmitter refrains from sending small segments
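The receiver-side rule can be written as a single threshold test. This sketch reads the rule as "advertise once the free space reaches half the buffer or one MSS, whichever is smaller"; the function name and the choice to advertise zero below the threshold are assumptions.

```python
def advertised_window(free_space: int, buffer_size: int, mss: int) -> int:
    """Window to advertise under the silly-window avoidance rule."""
    threshold = min(buffer_size // 2, mss)
    # Below the threshold, keep the window closed rather than inviting
    # a tiny segment from the sender.
    return free_space if free_space >= threshold else 0
```

With a 4096-byte buffer and an MSS of 1460 bytes, freeing 10 bytes advertises nothing, while freeing 1460 bytes reopens the window.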
Sequence Number Wraparound
(Potential problem at high data rates)
$2^{32} = 4.29 \times 10^9$ bytes $= 34.4 \times 10^9$ bits (TCP has 32-bit seq. no.)
Therefore, at 1 Gbps, sequence numbers will wrap around in just 34.4 seconds, so the transmitter can only transmit for very brief periods
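The wraparound time is just the size of the sequence space in bits divided by the link rate. A short check, with a hypothetical function name:

```python
def wraparound_seconds(rate_bps: float, seq_bits: int = 32) -> float:
    """Time to send 2**seq_bits bytes at rate_bps bits per second."""
    return (2 ** seq_bits) * 8 / rate_bps


at_1_gbps = wraparound_seconds(1e9)    # about 34.4 seconds
at_10_gbps = wraparound_seconds(10e9)  # ten times shorter
```

At 10 Gbps the same calculation gives under 3.5 seconds, which is why the problem grows with the data rate.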
2-way handshake:
[Figure: one side says “Let’s talk”, the other answers “OK”, and both enter ESTAB; in protocol terms: choose x, req_conn(x), ESTAB, acc_conn(x), ESTAB]
Q: will 2-way handshake always work in network?
  variable delays
  retransmitted messages (e.g. req_conn(x)) due to message loss
  message reordering
  can’t “see” other side
Three-Way Handshake
A sends a SYN segment specifying the port number of the other party B, the initial sequence number (ISN) that A will use, and other info (e.g. max. segment size)
B responds with its own SYN segment containing its ISN. B also acknowledges A’s SYN by ACKing A’s ISN plus one
A acknowledges B’s SYN by ACKing B’s ISN plus one
The ISN exchange protects against responding falsely to old segments from prior connections
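The three segments can be written out as plain dictionaries to make the "ISN plus one" rule concrete. This is only a sketch: the function name, the dictionary layout and the ISN values used in the example are made up for illustration.

```python
from typing import Dict, List


def three_way_handshake(isn_a: int, isn_b: int) -> List[Dict]:
    """Return the SYN, SYN+ACK and ACK segments exchanged by A and B."""
    syn = {"from": "A", "flags": {"SYN"}, "seq": isn_a}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"},
               "seq": isn_b, "ack": syn["seq"] + 1}      # ACKs A's ISN + 1
    ack = {"from": "A", "flags": {"ACK"},
           "seq": isn_a + 1, "ack": syn_ack["seq"] + 1}  # ACKs B's ISN + 1
    return [syn, syn_ack, ack]


segments = three_way_handshake(isn_a=1000, isn_b=5000)
```

With ISNs 1000 and 5000, B acknowledges 1001 and A acknowledges 5001.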
TCP: closing a connection
client, server each close their side of connection
send TCP segment with FIN bit = 1
respond to received FIN with ACK
on receiving FIN, ACK can be combined with own FIN
simultaneous FIN exchanges can be handled
[Figure: closing a connection. Labels: LAST_ACK; FINbit=1, seq=y; can no longer send data; TIMED_WAIT; ACKbit=1; ACKnum=y+1; timed wait for 2*max segment lifetime; CLOSED]
[Figure notes: Host B still delivers 150 bytes; Host B now sends its own FIN]
congestion:
informally: “too many sources sending too much
data too fast for network to handle”
different from flow control!
manifestations:
lost packets (buffer overflow at routers)
long delays (queueing in router buffers)
Low delay
Can accommodate more
2. Knee (congestion onset)
  Arrival rate approaches R
  Delay increases rapidly
  Throughput begins to saturate
3. Congestion collapse
  Arrival rate > R
[Figure axes: Delay (sec) versus Arrival Rate]
[Figure: λ_out and delay plotted against λ_in, with R/2 marked; Host A, Host B; annotation: no buffer space!]
sender sends only when router buffers available
Causes/costs of congestion: scenario 2
Idealization: known loss
packets can be lost, dropped at router due to full buffers
sender only resends if packet known to be lost
when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2
[Figure: λ_out versus λ_in, axes up to R/2; Host B]
Causes/costs of congestion: scenario 2
Realistic: duplicates
packets can be lost, dropped at router due to full buffers
sender times out prematurely, sending two copies, both of which are delivered
when sending at R/2, some packets are retransmissions, including duplicated packets that are delivered!
[Figure: λ_out versus λ_in, axes up to R/2; timeout and retransmitted copy on the path to Host B]
“costs” of congestion:
more work (retransmissions) for given “goodput”
unneeded retransmissions: link carries multiple copies of a packet, decreasing goodput
[Figure: Host A, Host B, Host C and Host D with finite shared output link buffers
  λ_in: original data
  λ'_in: original data, plus retransmitted data
  λ_out]
[Figure: λ_out versus λ'_in, both marked at C/2]
TCP Congestion Control: details
sender limits transmission:
$$\text{LastByteSent} - \text{LastByteAcked} \le \text{cwnd}$$
cwnd is dynamic, function of perceived network congestion
TCP sending rate:
roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes
$$\text{rate} \approx \frac{\text{cwnd}}{RTT}\ \text{bytes/sec}$$
[Figure: sender sequence number space: last byte ACKed; sent, not-yet ACKed (“in-flight”); last byte sent; cwnd]
double cwnd every RTT
done by incrementing cwnd for every ACK received
summary: initial rate is slow but ramps up exponentially fast
[Figure axes: RTT, time]
Implementation:
variable ssthresh
on loss event, ssthresh
is set to 1/2 of cwnd just
before loss event
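The doubling of cwnd and the ssthresh rule can be traced per RTT with the sketch below. Two behaviours are assumptions not stated on these slides: growth becomes linear (one MSS per RTT) once cwnd reaches ssthresh, and cwnd restarts at one MSS after a loss. The function name and the parameters are hypothetical; cwnd is counted in MSS units.

```python
from typing import List, Set, Tuple


def simulate_cwnd(rtts: int, ssthresh: float,
                  loss_at: Set[int]) -> Tuple[List[float], float]:
    """Return cwnd at the start of each RTT and the final ssthresh."""
    cwnd = 1.0
    history: List[float] = []
    for rtt in range(rtts):
        history.append(cwnd)
        if rtt in loss_at:
            ssthresh = cwnd / 2  # half of cwnd just before the loss event
            cwnd = 1.0           # assumed: restart from one MSS
        elif cwnd < ssthresh:
            cwnd *= 2            # slow start: double every RTT
        else:
            cwnd += 1            # assumed: linear growth above ssthresh
    return history, ssthresh


history, final_ssthresh = simulate_cwnd(rtts=8, ssthresh=8, loss_at={5})
```

With an initial ssthresh of 8 and a loss in the sixth RTT, cwnd goes 1, 2, 4, 8, 9, 10, then drops back to 1 and doubles again, with ssthresh now at 5.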
$$\text{avg TCP thruput} = \frac{3}{4}\,\frac{W}{RTT}\ \text{bytes/sec}$$
[Figure label: W/2]
$$\text{TCP throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$
➜ to achieve 10 Gbps throughput, need a loss rate of $L = 2 \cdot 10^{-10}$, a very small loss rate!
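The loss-rate figure can be reproduced by solving the throughput formula for L. The segment size (1500 bytes) and RTT (100 ms) used below are assumptions, since the slide does not state them; they are the values under which the formula yields roughly 2·10^-10. The function names are hypothetical.

```python
import math


def tcp_throughput_bps(mss_bits: float, rtt_s: float, loss: float) -> float:
    """Throughput from the formula 1.22 * MSS / (RTT * sqrt(L))."""
    return 1.22 * mss_bits / (rtt_s * math.sqrt(loss))


def required_loss(target_bps: float, mss_bits: float, rtt_s: float) -> float:
    """Loss rate L at which the formula yields target_bps."""
    return (1.22 * mss_bits / (rtt_s * target_bps)) ** 2


# Assumed parameters: 1500-byte segments, 100 ms RTT, 10 Gbps target.
loss = required_loss(target_bps=10e9, mss_bits=1500 * 8, rtt_s=0.1)
```

Under these assumptions `loss` comes out near 2.1·10^-10, in line with the value quoted above.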
new versions of TCP for high-speed
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R; plot axis: Connection 1 throughput, up to R]
Fairness (more)