Unit 1
Fig: Client and server on the same Ethernet communicating using TCP
In the above fig, the client and server communicate through the single LAN.
In the following fig, we show the client and server on different LANs, with both LANs
connected to WAN using routers.
Routers are the building blocks of WANs. The largest WAN today is the Internet.
2. Overview of TCP/IP Protocol
TCP/IP protocol suite has 5 layers. They are Application, Transport, Network, Data link and
Physical layer.
Most client/server applications use either TCP or UDP.
These two protocols in turn use the network-layer protocol IP, which may be IP version 4 or IP
version 6.
It is possible to use IPv4 or IPv6 directly bypassing the transport layer.
UDP is a simple, unreliable datagram protocol, while TCP is a reliable byte-stream protocol.
In the diagram below, the rightmost four applications use IPv6 and the next five applications use
IPv4.
The leftmost application, tcpdump, communicates directly with the datalink using either the
BSD packet filter (BPF) or the datalink provider interface (DLPI).
The dashed line below the nine applications on the right marks the API, which is normally
sockets or XTI. The interface to either BPF or DLPI does not use sockets or XTI.
In the figure, the traceroute program uses two sockets: one for IP and another for ICMP.
IPv4 Internet Protocol version 4. It is often denoted simply as IP. It uses 32-bit addresses. It provides packet
delivery service for TCP, UDP, SCTP, ICMP, and IGMP.
IPv6 Internet Protocol version 6. It uses 128-bit addresses. It provides packet delivery service
for TCP, UDP, SCTP, and ICMPv6.
TCP Transmission Control Protocol. It is a connection-oriented protocol that provides a
reliable, full-duplex byte stream to its users. TCP sockets are an example of stream sockets. It
provides the facilities like acknowledgments, timeouts, and retransmissions. Most Internet
application programs use TCP. TCP can use either IPv4 or IPv6.
UDP User Datagram Protocol. It is a connectionless protocol, and UDP sockets are an
example of datagram sockets. There is no guarantee for the delivery of UDP datagrams. It
can use either IPv4 or IPv6.
ICMP Internet Control Message Protocol. It handles error and control information between
routers and hosts. These messages are generated and processed by the TCP/IP networking
software. The ping and traceroute programs also use ICMP.
IGMP Internet Group Management Protocol. It is used with multicasting, which is optional
with IPv4.
ARP Address Resolution Protocol. ARP maps an IPv4 address into a hardware address
(such as an Ethernet address). ARP is normally used on broadcast networks such as Ethernet,
token ring, and FDDI.
RARP Reverse Address Resolution Protocol. It maps a hardware address into an IPv4
address.
ICMPv6 Internet Control Message Protocol version 6. It combines the functionality of
ICMPv4, IGMP, and ARP.
BPF BSD ( Berkeley Software Distribution) packet filter. This interface provides access to
the datalink layer.
DLPI Datalink provider interface. This interface provides access to the datalink layer.
TCP Connection Establishment and Termination
Three-Way handshake
The following scenario occurs when a TCP connection is established:
1. The server must be prepared to accept an incoming connection. This is normally done by
calling socket, bind, and listen and is called a passive open.
2. The client issues an active open by calling connect. This causes the client TCP to send a
"synchronize" (SYN) segment, which tells the server the client's initial sequence number for
the data that the client will send on the connection. Normally, there is no data sent with
the SYN; it just contains an IP header, a TCP header, and possible TCP options.
3. The server must acknowledge (ACK) the client's SYN and the server must also send its
own SYN containing the initial sequence number for the data that the server will send on
the connection. The server sends its SYN and the ACK of the client's SYN in a single
segment.
4. The client must acknowledge the server's SYN.
The minimum number of packets required for this exchange is three; hence, this is called
TCP's three-way handshake.
The process by which two computers on a network establish a connection is called a
handshake.
Figure 2.2. TCP three-way handshake.
We show the client's initial sequence number as J and the server's initial sequence
number as K. The acknowledgment number in the ACK of each SYN is the initial sequence
number plus one.
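The sequence-number bookkeeping described above can be sketched in C. This is a toy model under stated assumptions, not a real TCP implementation: the struct segment type and the make_syn, make_syn_ack, and make_ack helpers are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a TCP segment: only the fields
 * needed to follow the three-way handshake. */
struct segment {
    bool     syn;      /* SYN flag */
    bool     ack;      /* ACK flag */
    uint32_t seq;      /* sequence number */
    uint32_t ack_num;  /* acknowledgment number (meaningful only if ack is set) */
};

/* Segment 1: client -> server, SYN carrying the client's ISN (J). */
struct segment make_syn(uint32_t client_isn)
{
    struct segment s = { .syn = true, .ack = false,
                         .seq = client_isn, .ack_num = 0 };
    return s;
}

/* Segment 2: server -> client, SYN carrying the server's ISN (K)
 * together with an ACK of the client's SYN (J + 1). */
struct segment make_syn_ack(uint32_t server_isn, struct segment client_syn)
{
    struct segment s = { .syn = true, .ack = true,
                         .seq = server_isn,
                         .ack_num = client_syn.seq + 1 };
    return s;
}

/* Segment 3: client -> server, ACK of the server's SYN (K + 1). */
struct segment make_ack(struct segment syn_ack)
{
    struct segment s = { .syn = false, .ack = true,
                         .seq = syn_ack.ack_num,
                         .ack_num = syn_ack.seq + 1 };
    return s;
}
```

Because the fields are unsigned 32-bit integers, "plus one" wraps around from the largest value to 0, which mirrors how the 32-bit sequence number space behaves.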
TCP Options
Each SYN can contain TCP options. Commonly used options are
i. MSS option
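The Maximum Segment Size (MSS) is announced when the TCP connection is established: in its SYN,
each end tells its peer the maximum amount of data it is willing to accept in each TCP segment.
The MSS option can appear only in a SYN segment.
If one end does not receive an MSS option from the other end, a default of 536 bytes is assumed.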
ii. Window scale option.
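TCP always tells its peer exactly how many bytes of data it is willing to accept from the
peer. This is called the advertised window.
Because the window field in the TCP header is 16 bits wide, the maximum window size without
this option is 65,535 bytes.
The window scale option specifies a shift count by which the advertised window is scaled,
which allows larger windows on high-speed or long-delay connections.
The scale factor can be set only during connection establishment (each end sends the option in
its SYN). It does not change during the life of the connection, although the advertised window
itself changes as data is received and consumed.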
iii. Timestamp option. This option is needed for high-speed connections to prevent possible
data corruption caused by old, delayed, or duplicated segments.
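As a rough illustration of the window scale option, the sketch below shows how a shift count agreed at connection establishment turns the 16-bit advertised window into a larger effective window. The function name effective_window is hypothetical; the cap of 14 on the shift count is taken from the TCP window scale specification (RFC 7323), not from these notes.

```c
#include <stdint.h>

/* Largest shift count permitted by the window scale option (RFC 7323). */
#define MAX_WINDOW_SHIFT 14

/* Scale the 16-bit window field from the TCP header by the shift count
 * negotiated in the SYN segments. Larger shift counts are clamped to 14. */
uint32_t effective_window(uint16_t advertised, unsigned shift)
{
    if (shift > MAX_WINDOW_SHIFT)
        shift = MAX_WINDOW_SHIFT;
    return (uint32_t) advertised << shift;
}
```

With a shift count of 0 the result is the plain 16-bit window; with the maximum shift count the window can approach 1 GB.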
TCP Connection Termination
TCP takes three segments to establish a connection; it takes four to terminate a connection.
1. One application calls close first, and we say that this end performs the active close. This
end's TCP sends a FIN segment, which means it is finished sending data.
2. The other end that receives the FIN performs the passive close. The received FIN is
acknowledged by TCP. The receipt of the FIN is also passed to the application as an end-of-
file, since the receipt of the FIN means the application will not receive any additional data on
the connection.
3. Sometime later, the application that received the end-of-file will close its socket. This causes
its TCP to send a FIN.
4. The TCP on the system that receives this final FIN acknowledges the FIN.
Since a FIN and an ACK are required in each direction, four segments are normally required. We
use the qualifier "normally" because in some scenarios, the FIN in step 1 is sent with data.
Figure 2.3. Packets exchanged when a TCP connection is closed.
Between Steps 2 and 3 it is possible for data to flow from the end doing the passive close
to the end doing the active close. This is called a half-close.
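A half-close can be requested from an application with the standard shutdown function and the SHUT_WR argument. The sketch below is only an approximation of the TCP case: to stay self-contained it uses a connected pair of Unix domain stream sockets from socketpair as a stand-in for a TCP connection, and the function name half_close_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if the half-close behaved as described, -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;
    int ok = -1;

    /* Stand-in for a TCP connection: sv[0] plays the end doing the
     * active close, sv[1] the end doing the passive close. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* The active closer sends its last data, then closes only its
     * sending direction (with TCP, this is what sends the FIN). */
    if (write(sv[0], "ping", 4) != 4 || shutdown(sv[0], SHUT_WR) < 0)
        goto done;

    /* The peer reads the data, then sees end-of-file (read returns 0). */
    n = read(sv[1], buf, sizeof(buf));
    if (n != 4 || memcmp(buf, "ping", 4) != 0)
        goto done;
    if (read(sv[1], buf, sizeof(buf)) != 0)
        goto done;

    /* Data can still flow from the passive closer to the active closer. */
    if (write(sv[1], "pong", 4) != 4)
        goto done;
    n = read(sv[0], buf, sizeof(buf));
    if (n == 4 && memcmp(buf, "pong", 4) == 0)
        ok = 0;

done:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```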
TCP State Transition Diagram
The operation of TCP with regard to connection establishment and connection
termination can be specified with a state transition diagram.
There are 11 different states defined for a connection, and the rules of TCP dictate the transitions
from one state to another, based on the current state and the segment received in that state.
For example, if an application performs an active open in the CLOSED state, TCP sends a
SYN and the new state is SYN_SENT.
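The state transitions can be sketched as a small lookup table. This is a simplified model: the event names are hypothetical labels chosen for illustration, only the common transitions for connection establishment and termination are included, and the segments sent on each transition are noted only in comments.

```c
#include <stddef.h>

/* The 11 TCP connection states. */
enum tcp_state {
    CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED,
    CLOSE_WAIT, LAST_ACK, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT
};

/* Hypothetical event labels: application calls, received segments, timer. */
enum tcp_event {
    APP_ACTIVE_OPEN, APP_PASSIVE_OPEN, APP_CLOSE,
    RECV_SYN, RECV_SYN_ACK, RECV_ACK, RECV_FIN, TIMEOUT_2MSL
};

struct transition {
    enum tcp_state from;
    enum tcp_event on;
    enum tcp_state to;
};

static const struct transition table[] = {
    { CLOSED,      APP_ACTIVE_OPEN,  SYN_SENT    },  /* send SYN */
    { CLOSED,      APP_PASSIVE_OPEN, LISTEN      },
    { LISTEN,      RECV_SYN,         SYN_RCVD    },  /* send SYN, ACK */
    { SYN_SENT,    RECV_SYN_ACK,     ESTABLISHED },  /* send ACK */
    { SYN_RCVD,    RECV_ACK,         ESTABLISHED },
    { ESTABLISHED, APP_CLOSE,        FIN_WAIT_1  },  /* send FIN (active close) */
    { ESTABLISHED, RECV_FIN,         CLOSE_WAIT  },  /* send ACK (passive close) */
    { CLOSE_WAIT,  APP_CLOSE,        LAST_ACK    },  /* send FIN */
    { LAST_ACK,    RECV_ACK,         CLOSED      },
    { FIN_WAIT_1,  RECV_ACK,         FIN_WAIT_2  },
    { FIN_WAIT_1,  RECV_FIN,         CLOSING     },  /* simultaneous close */
    { FIN_WAIT_2,  RECV_FIN,         TIME_WAIT   },  /* send ACK */
    { CLOSING,     RECV_ACK,         TIME_WAIT   },
    { TIME_WAIT,   TIMEOUT_2MSL,     CLOSED      },
};

/* Return the next state, or the current state if this sketch does not
 * model a transition for the given event. */
enum tcp_state tcp_next_state(enum tcp_state s, enum tcp_event e)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (table[i].from == s && table[i].on == e)
            return table[i].to;
    return s;
}
```

Starting from CLOSED, an active open followed by the server's SYN and ACK takes the client to ESTABLISHED; a server goes through LISTEN and SYN_RCVD instead.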
3. Introduction to Socket
A socket is an abstract identifier used by TCP/IP-based protocols. It is an endpoint for
communication.
When an application program needs a resource from the network, it asks the
operating system to create a socket.
The system returns a small integer (the socket descriptor) that the application program uses to
refer to the newly created socket.
Socket allows us to exchange information between processes on the same machine or across
a network.
The IP address concatenated with the port number forms a socket.
Most applications use TCP or UDP and perform a series of operations on the socket.
The operations that can be performed on a socket include control operations (associating a
port number with the socket, accepting a connection, destroying the socket), data transfer
operations (read, write), and status operations (finding the IP address associated with the socket).
Some basic socket system calls are
socket ( ) – To create a new socket and return its descriptor.
bind ( ) – To associate a socket with a port and address.
listen ( ) – To establish queue for connection request.
accept ( ) – To accept a connection request.
connect ( ) – To initiate a connection to remote host.
read ( ) – To read data from a socket descriptor.
write( ) – To write data to a socket descriptor.
close( ) – To close a socket descriptor.
A socket type is uniquely determined by a triple <domain, type, protocol>.
Possible domains are AF_INET (Internet domain, IP version 4), AF_INET6 (Internet domain,
IP version 6), and the Unix domain, AF_LOCAL (historically named AF_UNIX).
AF_LOCAL sets up a socket connection between two processes on the same host.
The possible socket types are Stream socket (SOCK_STREAM), Datagram socket
(SOCK_DGRAM) and Raw socket (SOCK_RAW).
Stream Socket
These sockets provide reliable, ordered delivery: data arrives at the destination in the same
order in which it was sent.
They use TCP; hence a connection is established between the sockets before the data
transfer occurs.
Datagram Socket
These sockets use UDP. There is no need to establish a connection before sending data on a datagram socket.
Raw Socket
They skip the transport layer completely.
If protocol is 0, all packets go to the socket. If protocol is specified, then only the packets
with that protocol are received.
In TCP/IP model interface 1 refers to the interface of Stream socket, interface 2 refers to the
interface of Datagram socket, and interface 3 refers to the interface of Raw socket.
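The <domain, type, protocol> triple maps directly onto the three arguments of socket. The sketch below creates a socket for a given triple and then asks the kernel which type it has, using the standard SO_TYPE socket option. The helper name socket_type_of is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a socket for the given <domain, type, protocol> triple and
 * report the type the kernel associates with it.
 * Returns the socket type (for example SOCK_STREAM), or -1 on error. */
int socket_type_of(int domain, int type, int protocol)
{
    int fd;
    int actual = -1;
    socklen_t len = sizeof(actual);

    fd = socket(domain, type, protocol);
    if (fd < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;
    close(fd);   /* the descriptor is no longer needed */
    return actual;
}
```

Passing 0 as the protocol lets the system choose the default for the domain and type, which is TCP for an AF_INET stream socket and UDP for an AF_INET datagram socket.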
4. Socket Address Structures
Both the IPv4 address and the TCP or UDP port number are always stored in the IPv4 socket
address structure (sockaddr_in) in network byte order.
Socket address structures are used only on a given host: the structure itself is not
communicated between different hosts although certain fields (IP address and port) are used
for communication.
Generic Socket Address Structure
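A socket address structure is always passed by reference when passed as an argument to
any socket function. Because the socket functions predate ANSI C's generic pointer type,
void *, a generic socket address structure was defined for this purpose.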
The generic socket address structure: sockaddr.
struct sockaddr
{
uint8_t sa_len;
sa_family_t sa_family; /* address family: AF_xxx value */
char sa_data[14]; /* protocol-specific address */
};
The socket functions are then defined as taking a pointer to the generic socket address structure,
as shown here in the function prototype for the bind function:
int bind(int, struct sockaddr *, socklen_t);
This requires that any calls to these functions must cast the pointer to the protocol-specific socket
address structure to be a pointer to a generic socket address structure. For example,
struct sockaddr_in serv; /* IPv4 socket address structure */
/* fill in serv{} */
bind(sockfd, (struct sockaddr *) &serv, sizeof(serv));
From an application programmer's point of view, the only use of these generic socket address
structures is to cast pointers to protocol-specific structures.
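A slightly fuller sketch of the same cast is shown below. It binds a TCP socket to the wildcard address with port 0, so that the kernel chooses a free port, and then reads the chosen port back with getsockname. The helper name bind_any_port is hypothetical, and the use of the wildcard address and port 0 is an assumption made to keep the example self-contained.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a new TCP socket to the wildcard address on a kernel-chosen port.
 * Returns the descriptor (the caller closes it) and stores the chosen
 * port, in host byte order, through port_out. Returns -1 on error. */
int bind_any_port(uint16_t *port_out)
{
    struct sockaddr_in addr;          /* IPv4 socket address structure */
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* stored in network byte order */
    addr.sin_port = htons(0);                  /* 0 asks the kernel to pick a port */

    /* The pointer to the protocol-specific structure is cast to a
     * pointer to the generic structure, and the length is passed too. */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}
```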
IPv6 Socket address Structure
The IPv6 socket address structure is defined by including the <netinet/in.h> header.
IPv6 socket address structure: sockaddr_in6.
struct in6_addr
{
uint8_t s6_addr[16]; /* 128-bit IPv6 address */
/* network byte ordered */
};
#define SIN6_LEN /* required for compile-time tests */
struct sockaddr_in6
{
    uint8_t         sin6_len;      /* length of this struct (28) */
    sa_family_t     sin6_family;   /* AF_INET6 */
    in_port_t       sin6_port;     /* transport layer port# */
                                   /* network byte ordered */
    uint32_t        sin6_flowinfo; /* flow information */
                                   /* network byte ordered */
    struct in6_addr sin6_addr;     /* IPv6 address */
                                   /* network byte ordered */
    uint32_t        sin6_scope_id; /* set of interfaces for a scope */
};
The SIN6_LEN constant must be defined if the system supports the length member for
socket address structures.
The IPv6 family is AF_INET6, whereas the IPv4 family is AF_INET.
The members in this structure are ordered so that if the sockaddr_in6 structure is 64-bit
aligned, so is the 128-bit sin6_addr member.
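The sin6_flowinfo member is divided into two fields:
The low-order 20 bits are the flow label.
The high-order 12 bits are reserved.
(Older descriptions, based on the original IPv6 header layout, split this member into a
24-bit flow label, a 4-bit priority, and 4 reserved bits.)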
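Filling in an IPv6 socket address structure might look like the sketch below. The helper name make_loopback6 is hypothetical; it uses the standard in6addr_loopback constant (the address ::1) and sets the length member only when SIN6_LEN is defined.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Fill in *sa so that it names the IPv6 loopback address (::1) and the
 * given port. The port argument is in host byte order. */
void make_loopback6(struct sockaddr_in6 *sa, uint16_t port)
{
    memset(sa, 0, sizeof(*sa));        /* also zeroes sin6_flowinfo */
#ifdef SIN6_LEN
    sa->sin6_len = sizeof(*sa);        /* only on systems with a length member */
#endif
    sa->sin6_family = AF_INET6;
    sa->sin6_port = htons(port);       /* stored in network byte order */
    sa->sin6_addr = in6addr_loopback;  /* 128-bit address ::1 */
}
```

Zeroing the whole structure first is a common precaution so that members the caller does not set explicitly hold a known value.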
Comparison of Socket Address Structure
In this comparison, we assume that the socket address structures all contain a one-byte length
field and that the family field also occupies one byte.
Two of the socket address structures (IPv4 and IPv6) are fixed-length, while the Unix domain
structure and the datalink structure are variable-length.
To handle variable-length structures, whenever we pass a pointer to a socket address
structure as an argument to one of the socket functions, we pass its length as another
argument.
The sockaddr_un structure itself is not variable-length, but the amount of information—the
pathname within the structure—is variable-length.
5. Byte ordering functions
Consider a 16-bit integer that is made up of 2 bytes. There are two ways to store the two bytes in
memory: with the low-order byte at the starting address, known as little-endian byte order, or
with the high-order byte at the starting address, known as big-endian byte order.
Fig: Little-endian byte order and big-endian byte order for a 16-bit integer.
In this fig. we show increasing memory addresses going from right to left in the top, and
from left to right in the bottom. There is no standard between these two byte orderings and we
encounter systems that use both formats. We refer to the byte ordering used by a given system as
the host byte order.
Program to determine host byte order.
#include "unp.h"

int main(int argc, char **argv)
{
    union {
        short s;
        char  c[sizeof(short)];
    } un;

    un.s = 0x0102;   /* high-order byte is 0x01, low-order byte is 0x02 */
    printf("%s: ", CPU_VENDOR_OS);
    if (sizeof(short) == 2) {
        if (un.c[0] == 1 && un.c[1] == 2)
            printf("big-endian\n");      /* high-order byte stored first */
        else if (un.c[0] == 2 && un.c[1] == 1)
            printf("little-endian\n");   /* low-order byte stored first */
        else
            printf("unknown\n");
    } else
        printf("sizeof(short) = %zu\n", sizeof(short));
    exit(0);
}
We store the 2 byte value 0x0102 into the short integer and then look at the two
consecutive bytes c[0] and c[1] to determine the byte order. The string CPU_VENDOR_OS
identifies the CPU type, vendor, and operating system release.
We must deal with these byte ordering differences as network programmers because
networking protocols must specify a network byte order. For example, in a TCP segment, there
is a 16-bit port number and a 32-bit IPv4 address. The sending protocol stack and the receiving
protocol stack must agree on the order in which the bytes of these multibyte fields will be
transmitted. The Internet protocols use big-endian byte ordering for these multibyte integers.
In theory, an implementation could store the fields in a socket address structure in host
byte order and then convert to and from the network byte order when moving the fields to and
from the protocol headers. In practice, however, certain fields in the socket address structures
must be maintained in network byte order, so we use the following four functions to convert
between these two byte orders.
#include <netinet/in.h>
uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);
Both return: value in network byte order
uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);
Both return: value in host byte order
In the names of these functions, h stands for host, n stands for network, s stands for short, and l
stands for long.
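The effect of these functions can be observed by looking at the bytes in memory after a conversion. In the sketch below, the helper names port_wire_bytes and addr_wire_bytes are hypothetical; they copy the converted value into a byte array so that the order in which the bytes would be transmitted becomes visible.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store the two bytes of a 16-bit value, converted to network byte
 * order, in out[0] and out[1] exactly as they sit in memory. */
void port_wire_bytes(uint16_t host_value, unsigned char out[2])
{
    uint16_t net = htons(host_value);
    memcpy(out, &net, sizeof(net));
}

/* The same for a 32-bit value such as an IPv4 address. */
void addr_wire_bytes(uint32_t host_value, unsigned char out[4])
{
    uint32_t net = htonl(host_value);
    memcpy(out, &net, sizeof(net));
}
```

On any host, big-endian or little-endian, the high-order byte ends up first, because network byte order is big-endian. On a big-endian host the four functions do nothing.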
6. Address Conversion functions
inet_aton, inet_addr, and inet_ntoa Functions
inet_aton and inet_addr convert an IPv4 address from a dotted-decimal string
(e.g., "206.168.112.96") to its 32-bit network byte ordered binary value; inet_ntoa performs
the reverse conversion.
#include <arpa/inet.h>
int inet_aton(const char *strptr, struct in_addr *addrptr);
Returns: 1 if string was valid, 0 on error
in_addr_t inet_addr(const char *strptr);
Returns: 32-bit binary network byte ordered IPv4 address; INADDR_NONE if error
char *inet_ntoa(struct in_addr inaddr);
Returns: pointer to dotted-decimal string
inet_aton converts the C character string pointed to by strptr into its 32-bit binary network
byte ordered value, which is stored through the pointer addrptr. If successful, 1 is returned;
otherwise, 0 is returned.
The inet_addr does the same conversion, returning the 32-bit binary network byte ordered
value as the return value.
All 2^32 possible binary values are valid IP addresses (0.0.0.0 through 255.255.255.255),
but the function returns the constant INADDR_NONE (typically 32 one-bits) on an error.
This means the dotted-decimal string 255.255.255.255 (the IPv4 limited broadcast address)
cannot be handled by this function since its binary value appears to indicate failure of the
function.
The inet_ntoa function converts a 32-bit binary network byte ordered IPv4 address into its
corresponding dotted-decimal string.
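Note that inet_ntoa takes a structure as its argument, not a pointer to a structure. Functions
that take actual structures as arguments are rare; it is more common to pass a pointer to the
structure. The string that inet_ntoa returns resides in static memory, so the function is not
reentrant.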
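A short sketch of these three functions follows. The wrapper names parse_ipv4 and format_ipv4 are hypothetical. One assumption to note: inet_aton is widely available but is not part of POSIX, so the feature-test macro at the top is there for glibc when a strict language standard is requested.

```c
#define _DEFAULT_SOURCE   /* assumption: lets glibc declare inet_aton in strict modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Parse a dotted-decimal string. On success, returns 1 and stores the
 * address in host byte order through out; returns 0 on invalid input. */
int parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr addr;

    if (inet_aton(text, &addr) == 0)
        return 0;
    *out = ntohl(addr.s_addr);   /* s_addr is in network byte order */
    return 1;
}

/* Format an address given in host byte order into the caller's buffer.
 * inet_ntoa returns a pointer to static storage, so the result is
 * copied out immediately. buflen must be at least 1. */
void format_ipv4(uint32_t host_addr, char *buf, size_t buflen)
{
    struct in_addr addr;

    addr.s_addr = htonl(host_addr);
    strncpy(buf, inet_ntoa(addr), buflen - 1);
    buf[buflen - 1] = '\0';
}
```

Because inet_aton reports success through its return value rather than through the converted address, it can handle 255.255.255.255, which inet_addr cannot distinguish from an error.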
inet_pton and inet_ntop Functions
These two functions are new with IPv6 and work with both IPv4 and IPv6 addresses.
The letters "p" and "n" stand for presentation and numeric.
The presentation format for an address is often an ASCII string and the numeric format is
the binary value that goes into a socket address structure.
#include <arpa/inet.h>
int inet_pton(int family, const char *strptr, void *addrptr);
Returns: 1 if OK, 0 if input not a valid presentation format, -1 on error
const char *inet_ntop(int family, const void *addrptr, char *strptr, socklen_t len);
Returns: pointer to result if OK, NULL on error
The family argument for both functions is either AF_INET or AF_INET6.
If family is not supported, both functions return an error with errno set to EAFNOSUPPORT.
The first function tries to convert the string pointed to by strptr, storing the binary result
through the pointer addrptr.
If successful, the return value is 1. If the input string is not a valid presentation format for the
specified family, 0 is returned.
The inet_ntop does the reverse conversion, from numeric (addrptr) to presentation (strptr).
The len argument is the size of the destination, to prevent the function from overflowing the
caller's buffer.
To help specify this size, the following two constants are defined by including the <netinet/in.h> header:
#define INET_ADDRSTRLEN 16 /* for IPv4 dotted-decimal */
#define INET6_ADDRSTRLEN 46 /* for IPv6 hex string */
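The two functions can be combined into a round trip from presentation format to numeric format and back. In the sketch below, the helper names roundtrip_address and addresses_equal are hypothetical, and the length argument is declared as socklen_t, which is the type POSIX uses for inet_ntop.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Convert text -> binary -> text for the given family.
 * Returns 1 on success, 0 if text is not a valid address for the family,
 * and -1 on error (unsupported family, or out is too small). */
int roundtrip_address(int family, const char *text, char *out, socklen_t outlen)
{
    unsigned char bin[sizeof(struct in6_addr)];   /* large enough for IPv4 and IPv6 */
    int rc = inet_pton(family, text, bin);

    if (rc != 1)
        return rc;
    if (inet_ntop(family, bin, out, outlen) == NULL)
        return -1;
    return 1;
}

/* Compare two presentation strings by their binary values.
 * Returns 1 if equal, 0 if different, -1 if either string is invalid. */
int addresses_equal(int family, const char *a, const char *b)
{
    unsigned char bin_a[sizeof(struct in6_addr)] = {0};
    unsigned char bin_b[sizeof(struct in6_addr)] = {0};

    if (inet_pton(family, a, bin_a) != 1 || inet_pton(family, b, bin_b) != 1)
        return -1;
    return memcmp(bin_a, bin_b, sizeof(bin_a)) == 0;
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for either family. Comparing the binary values rather than the strings matters for IPv6, where one address can be written in several ways.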
7. Elementary TCP Sockets
The following fig. shows a timeline of the typical scenario that takes place between a
TCP client and server. First, the server is started, and then sometime later, a client is started that
connects to the server. We assume that the client sends a request to the server, the server
processes the request, and the server sends a reply back to the client. This continues until the
client closes its end of the connection, which sends an end-of-file notification to the server. The
server then closes its end of the connection and either terminates or waits for a new client
connection.
i. socket ( ) Function
To perform network I/O, the first thing a process must do is call the socket( ) function,
specifying the type of communication protocol desired.
#include<sys/socket.h>
int socket(int family, int type, int protocol);
On success, the socket( ) function returns a small non-negative integer value, which we call
a socket descriptor, or sockfd.
family: specifies the protocol family (AF_INET for TCP/IP over IPv4)
type: indicates the communication semantics
    SOCK_STREAM    stream socket      TCP
    SOCK_DGRAM     datagram socket    UDP
    SOCK_RAW       raw socket
protocol: set to 0 except for raw sockets
Example: sd = socket(AF_INET, SOCK_STREAM, 0);
Figure. Socket functions for elementary TCP client/server.
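For a given listening socket, the kernel maintains two queues: an incomplete connection
queue, which holds an entry for each SYN that has arrived and for which the three-way
handshake is not yet complete, and a completed connection queue, which holds an entry for
each client whose three-way handshake has completed.
The figures for this section depict these two queues and the packets exchanged during
connection establishment.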
When a SYN arrives from a client, TCP creates a new entry on the incomplete queue and
then responds with the second segment of the three-way handshake: the server's SYN with
an ACK of the client's SYN.
This entry will remain on the incomplete queue until the third segment of the three-way
handshake arrives.
If the three-way handshake completes normally, the entry moves from the incomplete
queue to the end of the completed queue.
When the process calls accept, first entry on the completed queue is returned to the
process.
v. accept ( ) Function
The accept( ) is called by a TCP server to return the next completed connection from the
front of the completed connection queue.
If the completed connection queue is empty, the process is put to sleep.
#include<sys/socket.h>
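int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
Returns: non-negative descriptor if OK, -1 on error
sockfd: This is the same descriptor that was passed to listen.
The cliaddr and addrlen arguments are used to return the protocol address of the connected
peer process. addrlen is a value-result argument: before the call, it holds the size of the
structure pointed to by cliaddr; on return, it holds the number of bytes actually stored.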
If accept is successful, its return value is a brand-new descriptor automatically created by
the kernel.
This new descriptor refers to the TCP connection with the client.
We call the first argument to accept the listening socket (the descriptor created by socket
and then used as the first argument to both bind and listen), and we call the return value
from accept the connected socket.
It is important to differentiate between these two sockets. A given server normally creates
only one listening socket, which then exists for the lifetime of the server.
The kernel creates one connected socket for each client connection that is accepted.
When the server is finished serving a given client, the connected socket is closed.
The connected socket is closed each time, but the listening socket remains open for the
life of the server.
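The difference between the listening socket and the connected socket can be seen in a single process that plays both roles over the loopback address. This is only a sketch: the function name accept_demo is hypothetical, the use of 127.0.0.1 with a kernel-chosen port is an assumption made to keep it self-contained, and a real server and client would run as separate processes.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if accept handed back a new descriptor, distinct from the
 * listening one, whose peer address is the client's; -1 on any failure. */
int accept_demo(void)
{
    struct sockaddr_in srv, cli, local;
    socklen_t len;
    int listenfd, clientfd = -1, connfd = -1, ok = -1;

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);                  /* kernel picks a free port */
    len = sizeof(srv);
    if (bind(listenfd, (struct sockaddr *) &srv, sizeof(srv)) < 0 ||
        listen(listenfd, 5) < 0 ||
        getsockname(listenfd, (struct sockaddr *) &srv, &len) < 0)
        goto done;

    /* The kernel completes the handshake, so connect can return before
     * accept is called; the connection waits on the completed queue. */
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (clientfd < 0 ||
        connect(clientfd, (struct sockaddr *) &srv, sizeof(srv)) < 0)
        goto done;

    len = sizeof(cli);                        /* value-result argument */
    connfd = accept(listenfd, (struct sockaddr *) &cli, &len);
    if (connfd < 0)
        goto done;

    /* The peer address returned by accept is the client's own address. */
    len = sizeof(local);
    if (getsockname(clientfd, (struct sockaddr *) &local, &len) < 0)
        goto done;
    if (connfd != listenfd && cli.sin_port == local.sin_port &&
        cli.sin_addr.s_addr == local.sin_addr.s_addr)
        ok = 0;

done:
    if (connfd >= 0)
        close(connfd);      /* connected socket: closed per client */
    if (clientfd >= 0)
        close(clientfd);
    close(listenfd);        /* listening socket: closed when the server ends */
    return ok;
}
```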
vi. write ( )Function
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t nbytes);
Example: char Buff[ ] = "hello"; write(fd, Buff, strlen(Buff) + 1); /* fd is a connected socket descriptor */
The return value is the number of bytes written, or -1 on error.
The number of bytes written may be less than nbytes.
This function transfers the data from your application to a buffer in the kernel on your
machine; it does not directly transmit the data over the network.
TCP is in complete control of sending the data, and this is implemented inside the kernel.
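Because write may accept fewer bytes than requested, callers commonly wrap it in a loop. The sketch below shows one such loop; the name write_all is hypothetical, and treating a return value of 0 as an error is a simplifying assumption.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling write until all n bytes have been handed to the kernel.
 * Returns n on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t written = write(fd, p, left);

        if (written < 0 && errno == EINTR)
            continue;              /* interrupted by a signal: try again */
        if (written <= 0)
            return -1;             /* real error (or no progress) */
        p += written;              /* skip over what was accepted */
        left -= (size_t) written;
    }
    return (ssize_t) n;
}
```

A call such as write_all(sockfd, Buff, strlen(Buff) + 1) would then either deliver the whole message to the kernel buffer or report an error.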
vii. read ( ) Function
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t nbytes);
Example: char Buff[50]; read(fd, Buff, sizeof(Buff)); /* fd is a connected socket descriptor */
The value returned is the number of bytes read, which may be less than nbytes. A return value
of 0 indicates end-of-file, and -1 indicates an error.
This function transfers data from a buffer in the kernel to your application.
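Since read on a stream socket may return fewer bytes than requested, a loop is often used when a fixed amount of data is expected. The sketch below is one way to write it; the name read_full is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read until n bytes have arrived or end-of-file is reached.
 * Returns the number of bytes read (less than n only at end-of-file),
 * or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t got = read(fd, p, left);

        if (got < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: try again */
            return -1;
        }
        if (got == 0)
            break;                 /* end-of-file: the peer sent no more data */
        p += got;
        left -= (size_t) got;
    }
    return (ssize_t) (n - left);
}
```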
viii. close( ) Function
The close ( ) function is used to close a socket and terminate a TCP connection.
#include <unistd.h>
int close (int sockfd);
Returns 0 if Ok, -1 on error.
The default action of close with a TCP socket is to mark the socket as closed and return
to the process immediately.
The socket descriptor is no longer usable by the process:
It cannot be used as an argument to read or write.
But, TCP will try to send any data that is already queued to be sent to the other end, and
after this occurs, the normal TCP connection termination sequence takes place.
A Simple Daytime Client & Server Program
Client
This client establishes a TCP connection with a server and the server simply sends back
the current time and date in a human-readable format.
#include "unp.h"

int main(int argc, char **argv)
{
    int sockfd, n;
    char recvline[MAXLINE + 1];
    struct sockaddr_in servaddr;

    if (argc != 2)
        err_quit("usage: a.out <IPaddress>");
    if ( (sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        err_sys("socket error");

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(13); /* daytime server */
    if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0)
        err_quit("inet_pton error for %s", argv[1]);
    if (connect(sockfd, (SA *) &servaddr, sizeof(servaddr)) < 0)
        err_sys("connect error");

    /* read the server's reply until end-of-file and print it */
    while ( (n = read(sockfd, recvline, MAXLINE)) > 0) {
        recvline[n] = 0; /* null terminate */
        if (fputs(recvline, stdout) == EOF)
            err_sys("fputs error");
    }
    if (n < 0)
        err_sys("read error");
    exit(0);
}
The next step in the concurrent server is to call fork. The figure shows the status after fork
returns.
Fig: Status of client/server after fork returns.
Both descriptors, listenfd and connfd, are shared (duplicated) between the parent and child.
The next step is for the parent to close the connected socket and the child to close the
listening socket.
Fig: Status of client/server after parent and child close appropriate sockets.
This is the desired final state of the sockets. The child is handling the connection with the
client and the parent can call accept again on the listening socket, to handle the next client
connection.
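The descriptor handling after fork can be sketched as follows. The names listenfd and connfd come from the text; spawn_handler and wait_for_child are hypothetical helpers, the reply string is made up for illustration, and a real server would typically reap children from a SIGCHLD handler rather than wait for each one.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child to serve the client on connfd.
 * Returns the child's process ID in the parent, or -1 if fork failed. */
pid_t spawn_handler(int listenfd, int connfd)
{
    static const char msg[] = "hello from child\n";   /* illustrative reply */
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: serves this one client, so it closes the listening socket. */
        close(listenfd);
        if (write(connfd, msg, sizeof(msg) - 1) < 0)
            _exit(1);
        close(connfd);
        _exit(0);
    }

    /* Parent: the child holds its own copy of connfd, so closing the
     * parent's copy does not terminate the connection. The parent keeps
     * listenfd open and can call accept again. */
    close(connfd);
    return pid;
}

/* Wait for one child and return its exit status, or -1 on failure. */
int wait_for_child(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In a server loop, spawn_handler would be called with the descriptor returned by each accept call, so every client is served by its own child process.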
PART A
1. What is a socket?
A socket is a logical endpoint for communication between two hosts on a TCP/IP
network. A socket type is uniquely determined by a triple <domain, type, protocol>.
2. State the differences between TCP and UDP.
Transmission Control Protocol (TCP)    User Datagram Protocol (UDP)
Connection-oriented                    Connectionless
Sophisticated                          Simple
Reliable                               Unreliable
Byte-stream protocol                   Datagram protocol
*****************************