Sockets Introduction: inet_addr, inet_ntoa, inet_pton, inet_ntop
INTRODUCTION:
Socket address structures can be passed in two directions: from the process to the kernel,
and from the kernel to the process. The latter case is an example of a value-result
argument, and we will encounter other examples of these arguments throughout the text.
The address conversion functions convert between a text representation of an address and
the binary value that goes into a socket address structure. Most existing IPv4 code uses
inet_addr and inet_ntoa, but two new functions, inet_pton and inet_ntop, handle
both IPv4 and IPv6.
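The conversion in both directions can be sketched as follows. This is a minimal illustration, not code from the text: the helper name addr_roundtrip is hypothetical, and it assumes a POSIX system that provides inet_pton and inet_ntop in <arpa/inet.h>.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: convert a text address to binary with inet_pton
 * and back to text with inet_ntop. family is AF_INET or AF_INET6.
 * Returns 0 on success, -1 if the text is not a valid address for the
 * family or the output buffer is too small. */
int addr_roundtrip(int family, const char *text, char *out, socklen_t outlen)
{
    /* large enough for the binary form of either address family */
    unsigned char bin[sizeof(struct in6_addr)];

    /* inet_pton returns 1 on success, 0 for invalid text, -1 on error */
    if (inet_pton(family, text, bin) != 1)
        return -1;

    /* inet_ntop returns NULL on failure (for example, buffer too small) */
    if (inet_ntop(family, bin, out, outlen) == NULL)
        return -1;
    return 0;
}
```

The same two calls serve both families; only the family argument and the size of the binary buffer change.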
Most socket functions require a pointer to a socket address structure as an argument. Each
supported protocol suite defines its own socket address structure. The names of these
structures begin with sockaddr_ and end with a unique suffix for each protocol suite.
The socket functions are then defined as taking a pointer to the generic socket address
structure, as shown here in the ANSI C function prototype for the bind function:
int bind (int, struct sockaddr *, socklen_t);
VALUE-RESULT ARGUMENTS:
When a socket address structure is passed to any socket function, it is always passed by
reference. That is, a pointer to the structure is passed. The length of the structure is also
passed as an argument. But the way in which the length is passed depends on which
direction the structure is being passed: from the process to the kernel, or vice versa.
1. Three functions, bind, connect, and sendto, pass a socket address structure from
the process to the kernel. One argument to these three functions is the pointer to
the socket address structure and another argument is the integer size of the
structure, as in
/* fill in serv{} */
connect (sockfd, (SA *) &serv, sizeof(serv));
Since the kernel is passed both the pointer and the size of what the pointer points
to, it knows exactly how much data to copy from the process into the kernel.
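The two directions can be contrasted in a short sketch. The helper name bound_ephemeral_port is hypothetical, and getsockname, which the text above does not discuss, is used here only as an assumed example of a call that passes the structure from the kernel back to the process.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to an ephemeral port on the
 * loopback address and return the port the kernel chose (host byte
 * order), or -1 on error. */
int bound_ephemeral_port(void)
{
    struct sockaddr_in addr;
    socklen_t len;
    int port = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);       /* 0 asks the kernel to pick a port */

    /* process to kernel: the size is passed by value */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == 0) {
        /* kernel to process: the size is passed by reference, set to the
         * buffer size on entry and to the stored size on return */
        len = sizeof(addr);
        if (getsockname(fd, (struct sockaddr *) &addr, &len) == 0)
            port = ntohs(addr.sin_port);
    }
    close(fd);
    return port;
}
```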
Each client connection causes the server to fork a new process just for that client. First,
the server is started; later, a client is started that connects to the server. The client sends a
request to the server, the server processes the request, and the server sends a reply back to
the client. This continues until the client closes its end of the connection, which sends an
end-of-file notification to the server. The server then closes its end of the connection and
either terminates or waits for a new client connection.
socket Function
To perform network I/O, the first thing a process must do is call the socket function,
specifying the type of communication protocol desired (TCP using IPv4, UDP using
IPv6, Unix domain stream protocol, etc.).
#include <sys/socket.h>
int socket (int family, int type, int protocol);
connect Function
The connect function is used by a TCP client to establish a connection with a TCP
server.
#include <sys/socket.h>
int connect (int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);
bind Function
The bind function assigns a local protocol address to a socket. With the Internet
protocols, the protocol address is the combination of either a 32-bit IPv4 address or a
128-bit IPv6 address, along with a 16-bit TCP or UDP port number.
#include <sys/socket.h>
int bind (int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
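listen Function
The listen function is called only by a TCP server and it performs two actions:
1. It converts an unconnected (active) socket into a passive socket, indicating that the
kernel should accept incoming connection requests directed to this socket.
2. Its second argument specifies the maximum number of connections the kernel should
queue for this socket.
#include <sys/socket.h>
int listen (int sockfd, int backlog);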
accept Function
accept is called by a TCP server to return the next completed connection from the front
of the completed connection queue. If the completed connection queue is empty, the
process is put to sleep (assuming the default of a blocking socket).
#include <sys/socket.h>
int accept (int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
close Function
The normal Unix close function is also used to close a socket and terminate a TCP
connection.
#include <unistd.h>
int close (int sockfd);
INTRODUCTION:
A simple TCP echo client and server perform the following steps:
1. The client reads a line of text from its standard input and writes the line to the
server.
2. The server reads the line from its network input and echoes the line back to the
client.
3. The client reads the echoed line and prints it on its standard output.
A TCP socket is created. An Internet socket address structure is filled in with the
wildcard address (INADDR_ANY) and the server's well-known port (SERV_PORT, which is
defined as 9877 in our unp.h header). Binding the wildcard address tells the system that
we will accept a connection destined for any local interface, in case the system is
multihomed. The socket is converted into a listening socket by listen.
The server blocks in the call to accept, waiting for a client connection to complete.
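The server setup described above (socket, bind to the wildcard address, listen) can be sketched as one function. The name tcp_listen_any is hypothetical; the text uses SERV_PORT (9877) from unp.h, whereas this sketch takes the port and backlog as parameters.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket, bind it to the wildcard
 * address on the given port, and convert it to a listening socket.
 * Returns the listening descriptor, or -1 on error. */
int tcp_listen_any(uint16_t port, int backlog)
{
    struct sockaddr_in servaddr;
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);

    if (listenfd < 0)
        return -1;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
    servaddr.sin_port = htons(port);

    if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0 ||
        listen(listenfd, backlog) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;    /* the caller would now block in accept */
}
```

A server following the text would call tcp_listen_any(9877, backlog) and then loop on accept.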
Concurrent server
For each client, fork spawns a child, and the child handles the new client. The child
closes the listening socket and the parent closes the connected socket. The child then
calls str_echo to handle the client.
The function str_echo performs the server processing for each client: It reads data from
the client and echoes it back to the client.
read reads data from the socket and the line is echoed back to the client by writen. If the
client closes the connection, the receipt of the client's FIN causes the child's read to return
0. This causes the str_echo function to return, which terminates the child.
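The echo loop can be sketched as below. This is not the str_echo or writen source from the text: echo_until_eof and write_all are hypothetical stand-ins that follow the same idea (read, write everything back, stop when read returns 0).

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for writen: write exactly n bytes, retrying after short
 * writes and after interruption by a signal. */
static ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;           /* interrupted: try again */
            return -1;
        }
        p += w;
        left -= (size_t) w;
    }
    return (ssize_t) n;
}

/* Stand-in for str_echo: echo everything read from sockfd back to it.
 * Returns 0 when the peer closes the connection, -1 on error. */
int echo_until_eof(int sockfd)
{
    char buf[4096];

    for (;;) {
        ssize_t n = read(sockfd, buf, sizeof(buf));
        if (n > 0) {
            if (write_all(sockfd, buf, (size_t) n) < 0)
                return -1;
        } else if (n == 0) {
            return 0;               /* peer's FIN received */
        } else if (errno != EINTR) {
            return -1;
        }
    }
}
```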
A TCP socket is created and an Internet socket address structure is filled in with the
server's IP address and port number. We take the server's IP address from the command-
line argument and the server's well-known port (SERV_PORT) is from our unp.h header.
Connect to server
connect establishes the connection with the server. The function str_cli handles the
rest of the client processing.
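The client-side setup can be sketched as follows. The helper name tcp_connect_ipv4 is hypothetical; it assumes the server address arrives as a dotted-decimal string, as it would from the command line, and leaves the str_cli processing to the caller.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill in an Internet socket address structure
 * from a dotted-decimal string and a port, then connect a TCP socket.
 * Returns the connected descriptor, or -1 on error. */
int tcp_connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in servaddr;
    int sockfd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;                  /* not a valid IPv4 address string */

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```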
This function handles the client processing loop: It reads a line of text from standard
input, writes it to the server, reads back the server's echo of the line, and outputs the
echoed line to standard output.
Read a line, write to server
fgets reads a line of text and writen sends the line to the server.
readline reads the line echoed back from the server and fputs writes it to standard
output.
Return to main
The loop terminates when fgets returns a null pointer, which occurs when it encounters
either an end-of-file (EOF) or an error. Our Fgets wrapper function checks for an error
and aborts if one occurs, so Fgets returns a null pointer only when an end-of-file is
encountered.
I/O FUNCTIONS
sock_aread :
Description :
Read exactly len bytes from the socket or, if that amount of data is not yet available, do
not read anything. Unlike sock_fastread(), this function will never return less than the
requested amount of data. This can be useful when the application knows that it will be
receiving a fixed amount of data, but does not wish to handle the arrival of only part of
the data, as it would have to do if sock_fastread() was used.
len must be less than or equal to the socket receive buffer size, otherwise
sock_fastread() must be used.
sock_awrite :
Description
Write exactly len bytes to the socket or, if that amount of data can not be written, do not
write anything. Unlike sock_fastwrite(), this function will never return less than the
requested amount of data. This can be useful when the application needs to write a fixed
amount of data, but does not wish to handle the transmission of only part of the data, as it
would have to do if sock_fastwrite() was used.
len must be less than or equal to the socket transmit buffer size, otherwise
sock_fastwrite() must be used.
sock_read :
Description
Reads up to len bytes into dp from socket s. This function will busy wait until either len
bytes are read or there is an error condition. If sock_yield() has been called, the user-
defined function that is passed to it will be called in a tight loop while sock_read() is
busy waiting.
sock_write :
Description
Writes up to len bytes from dp to socket s. This function busy waits until either the
buffer is completely written or a socket error occurs. If sock_yield() has been called,
the user-defined function that is passed to it will be called in a tight loop while
sock_write() is busy waiting.
sock_recv_init :
The basic socket reading functions (sock_read(), sock_fastread(), etc.) are not
adequate for all your UDP needs. The most basic limitation is their inability to treat UDP
as a record service.
A record service must receive distinct datagrams and pass them to the user program as
such. You must know the length of the received datagram and the sender (if you opened
in broadcast mode). You may also receive the datagrams very quickly, so you must have
a mechanism to buffer them.
Once a socket is opened with udp_open(), you can use sock_recv_init() to initialize
that socket for sock_recv() and sock_recv_from(). sock_recv_init() installs a large
buffer area which gets segmented into smaller buffers. Whenever a UDP datagram
arrives, DCRTCP.LIB stuffs that datagram into one of these new buffers. The new
functions scan those buffers. You must select the size of the buffer you submit to
sock_recv_init(); make it as large as possible, say 4K, 8K or 16K.
select Function
The select function allows the process to instruct the kernel to wait for any one of multiple
events to occur and to wake up the process only when one or more of these events occurs
or when a specified amount of time has passed.
As an example, we can call select and tell the kernel to return only when:
• Any of the descriptors in the set {1, 4, 5} are ready for reading
• Any of the descriptors in the set {2, 7} are ready for writing
• Any of the descriptors in the set {1, 4} have an exception condition pending
• 10.2 seconds have elapsed
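#include <sys/select.h>
#include <sys/time.h>
int select (int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset,
            struct timeval *timeout);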
We start our description of this function with its final argument, which tells the kernel
how long to wait for one of the specified descriptors to become ready. A timeval
structure specifies the number of seconds and microseconds.
struct timeval {
long tv_sec; /* seconds */
long tv_usec; /* microseconds */
};
There are three possibilities for the timeout argument:
1. Wait forever: Return only when one of the specified descriptors is ready for I/O.
For this, we specify the timeout argument as a null pointer.
2. Wait up to a fixed amount of time: Return when one of the specified descriptors
is ready for I/O, but do not wait beyond the number of seconds and microseconds
specified in the timeval structure pointed to by the timeout argument.
3. Do not wait at all: Return immediately after checking the descriptors. This is
called polling. To specify this, the timeout argument must point to a timeval
structure and the timer value (the number of seconds and microseconds specified
by the structure) must be 0.
The three middle arguments, readset, writeset, and exceptset, specify the descriptors that
we want the kernel to test for reading, writing, and exception conditions.
The maxfdp1 argument specifies the number of descriptors to be tested. Its value is the
maximum descriptor to be tested plus one (hence our name of maxfdp1). The descriptors
0, 1, 2, up through and including maxfdp1–1 are tested.
select modifies the descriptor sets pointed to by the readset, writeset, and exceptset
pointers. These three arguments are value-result arguments. When we call the function,
we specify the values of the descriptors that we are interested in, and on return, the result
indicates which descriptors are ready. We use the FD_ISSET macro on return to test a
specific descriptor in an fd_set structure. Any descriptor that is not ready on return will
have its corresponding bit cleared in the descriptor set. To handle this, we turn on all the
bits in which we are interested in all the descriptor sets each time we call select.
The two most common programming errors when using select are to forget to add one
to the largest descriptor number and to forget that the descriptor sets are value-result
arguments. The second error results in select being called with a bit set to 0 in the
descriptor set, when we think that bit is 1.
The return value from this function indicates the total number of bits that are ready across
all the descriptor sets. If the timer value expires before any of the descriptors are ready, a
value of 0 is returned. A return value of –1 indicates an error (which can happen, for
example, if the function is interrupted by a caught signal).
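A minimal sketch of a single-descriptor select call follows. The helper name wait_readable is hypothetical, and the sketch assumes the descriptor is smaller than FD_SETSIZE. It rebuilds the descriptor set on every call and passes the largest descriptor plus one, which addresses the two common errors named above.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or the timeout expires.
 * A timeout of 0 seconds and 0 microseconds polls without waiting.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long sec, long usec)
{
    fd_set rset;
    struct timeval tv;
    int n;

    FD_ZERO(&rset);             /* value-result: rebuild before each call */
    FD_SET(fd, &rset);
    tv.tv_sec = sec;
    tv.tv_usec = usec;

    n = select(fd + 1, &rset, NULL, NULL, &tv);  /* maxfdp1 is fd plus one */
    if (n <= 0)
        return n;               /* 0: timer expired, -1: error */
    return FD_ISSET(fd, &rset) ? 1 : 0;
}
```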
poll Function
The poll function originated with SVR3 and was originally limited to STREAMS
devices. SVR4 removed this limitation, allowing poll to work with any descriptor. poll
provides functionality that is similar to select, but poll provides additional information
when dealing with STREAMS devices.
#include <poll.h>
int poll (struct pollfd *fdarray, unsigned long nfds, int timeout);
The first argument is a pointer to the first element of an array of structures. Each element
of the array is a pollfd structure that specifies the conditions to be tested for a given
descriptor, fd.
struct pollfd {
int fd; /* descriptor to check */
short events; /* events of interest on fd */
short revents; /* events that occurred on fd */
};
The conditions to be tested are specified by the events member, and the function returns
the status for that descriptor in the corresponding revents member. (Having two
variables per descriptor, one a value and one a result, avoids value-result arguments.
Recall that the middle three arguments for select are value-result.) Each of these two
members is composed of one or more bits that specify a certain condition.
There are three classes of data identified by poll: normal, priority band, and high-
priority. These terms come from the STREAMS-based implementations. POLLIN can be
defined as the logical OR of POLLRDNORM and POLLRDBAND. The POLLIN constant exists
from SVR3 implementations that predated the priority bands in SVR4, so the constant
remains for backward compatibility. Similarly, POLLOUT is equivalent to POLLWRNORM,
with the former predating the latter.
The return value from poll is –1 if an error occurred, 0 if no descriptors are ready before
the timer expires, otherwise it is the number of descriptors that have a nonzero revents
member.
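The same single-descriptor wait can be sketched with poll. The helper name poll_events is hypothetical. Because events and revents are separate members, the pollfd structure needs no re-initialization of the requested events between calls.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: test one descriptor for the given events.
 * timeout_ms is in milliseconds; 0 polls without waiting and a
 * negative value waits forever.
 * Returns the revents bits (nonzero) if anything was reported,
 * 0 if the timer expired, -1 on error. */
int poll_events(int fd, short events, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = events;    /* value: the conditions we ask about */
    pfd.revents = 0;        /* result: filled in by poll */

    n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;
    return pfd.revents;
}
```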
SOCKET OPTIONS
#include <sys/socket.h>
int getsockopt(int sockfd, int level, int optname, void *optval,
               socklen_t *optlen);
int setsockopt(int sockfd, int level, int optname, const void *optval,
               socklen_t optlen);
Both return: 0 if OK, –1 on error
sockfd must refer to an open socket descriptor. level specifies the code in the system that
interprets the option: the general socket code or some protocol-specific code (e.g., IPv4,
IPv6, TCP, or SCTP).
optval is a pointer to a variable from which the new value of the option is fetched by
setsockopt, or into which the current value of the option is stored by getsockopt. The
size of this variable is specified by the final argument, as a value for setsockopt and as a
value-result for getsockopt.
IP_OPTIONS Socket Option
Setting this option allows us to set IP options in the IPv4 header. This requires intimate
knowledge of the format of the IP options in the IP header.
IP_RECVDSTADDR Socket Option
This socket option causes the destination IP address of a received UDP datagram to be
returned as ancillary data by recvmsg.
IP_RECVIF Socket Option
This socket option causes the index of the interface on which a UDP datagram is received
to be returned as ancillary data by recvmsg.
IP_TTL Socket Option
With this option, we can set and fetch the default TTL that the system will use for
unicast packets sent on a given socket. (The multicast TTL is set using the
IP_MULTICAST_TTL socket option.) 4.4BSD, for example, uses the default of 64 for both
TCP and UDP sockets (specified in the IANA's "IP Option Numbers" registry [IANA])
and 255 for raw sockets. As with the TOS field, calling getsockopt returns the default
value of the field that the system will use in outgoing datagrams; there is no way to
obtain the value from a received datagram.
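Setting and fetching the unicast TTL can be sketched as below. The helper name set_and_get_ttl is hypothetical, and the sketch assumes the option is IP_TTL at level IPPROTO_IP with an int value, as on common Unix systems.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set the default unicast TTL on a fresh UDP
 * socket and read it back.
 * Returns the TTL reported by getsockopt, or -1 on error. */
int set_and_get_ttl(int ttl)
{
    int current = -1;
    socklen_t len = sizeof(current);    /* value-result for getsockopt */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    /* optlen is passed by value to setsockopt ... */
    if (setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) < 0) {
        close(fd);
        return -1;
    }
    /* ... and by reference to getsockopt */
    if (getsockopt(fd, IPPROTO_IP, IP_TTL, &current, &len) < 0)
        current = -1;

    close(fd);
    return current;
}
```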
There are two socket options for TCP. We specify the level as IPPROTO_TCP.
TCP_MAXSEG Socket Option
This socket option allows us to fetch or set the MSS for a TCP connection. The value
returned is the maximum amount of data that our TCP will send to the other end; often, it
is the MSS announced by the other end with its SYN, unless our TCP chooses to use a
smaller value than the peer's announced MSS. If this value is fetched before the socket is
connected, the value returned is the default value that will be used if an MSS option is not
received from the other end. Also be aware that a value smaller than the returned value
can actually be used for the connection if the timestamp option, for example, is in use,
because this option occupies 12 bytes of TCP options in each segment.
The maximum amount of data that our TCP will send per segment can also change during
the life of a connection if TCP supports path MTU discovery. If the route to the peer
changes, this value can go up or down.
TCP_NODELAY Socket Option
If set, this option disables TCP's Nagle algorithm (Section 19.4 of TCPv1 and pp. 858–859
of TCPv2). By default, this algorithm is enabled.
The purpose of the Nagle algorithm is to reduce the number of small packets on a WAN.
The algorithm states that if a given connection has outstanding data (i.e., data that our
TCP has sent, and for which it is currently awaiting an acknowledgment), then no small
packets will be sent on the connection in response to a user write operation until the
existing data is acknowledged. The definition of a "small" packet is any packet smaller
than the MSS. TCP will always send a full-sized packet if possible; the purpose of the
Nagle algorithm is to prevent a connection from having multiple small packets
outstanding at any time.
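Disabling the Nagle algorithm on a socket can be sketched as follows, assuming the option is TCP_NODELAY from <netinet/tcp.h>. The helper names disable_nagle and nagle_disabled are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: turn the Nagle algorithm off for a TCP socket.
 * Returns 0 if OK, -1 on error. */
int disable_nagle(int sockfd)
{
    int on = 1;     /* a nonzero value sets the option */
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: report whether the option is currently set.
 * Returns 1 if Nagle is disabled, 0 if it is enabled, -1 on error. */
int nagle_disabled(int sockfd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);

    if (getsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0)
        return -1;
    return flag != 0;
}
```

An application that sends many small writes and needs each one transmitted without waiting for an acknowledgment is the usual candidate for this option.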
Elementary UDP Sockets
Introduction:
There are some fundamental differences between applications written using TCP versus
those that use UDP. These are because of the differences in the two transport layers: UDP
is a connectionless, unreliable, datagram protocol, quite unlike the connection-oriented,
reliable byte stream provided by TCP. Nevertheless, there are instances when it makes
sense to use UDP instead of TCP. Some popular applications are built using UDP: DNS,
NFS, and SNMP, for example.
The client does not establish a connection with the server. Instead, the client just sends a
datagram to the server using the sendto function, which requires the address of the
destination (the server) as a parameter. Similarly, the server does not accept a connection
from a client. Instead, the server just calls the recvfrom function, which waits until data
arrives from some client. recvfrom returns the protocol address of the client, along with
the datagram, so the server can send a response to the correct client.
recvfrom and sendto Functions:
These two functions are similar to the standard read and write functions, but three
additional arguments are required.
#include <sys/socket.h>
ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags,
                 struct sockaddr *from, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags,
               const struct sockaddr *to, socklen_t addrlen);
The first three arguments, sockfd, buff, and nbytes, are identical to the first three
arguments for read and write: descriptor, pointer to buffer to read into or write from,
and number of bytes to read or write.
The to argument for sendto is a socket address structure containing the protocol address (e.g.,
IP address and port number) of where the data is to be sent. The size of this socket
address structure is specified by addrlen. The recvfrom function fills in the socket
address structure pointed to by from with the protocol address of who sent the datagram.
The number of bytes stored in this socket address structure is also returned to the caller in
the integer pointed to by addrlen. The final two arguments to recvfrom are similar to the
final two arguments to accept: The contents of the socket address structure upon return
tell us who sent the datagram (in the case of UDP) or who initiated the connection (in the
case of TCP). The final two arguments to sendto are similar to the final two arguments
to connect: We fill in the socket address structure with the protocol address of where to
send the datagram (in the case of UDP) or with whom to establish a connection (in the
case of TCP).
Both functions return the length of the data that was read or written as the value of the
function. In the typical use of recvfrom, with a datagram protocol, the return value is the
amount of user data in the datagram received.
If the from argument to recvfrom is a null pointer, then the corresponding length
argument (addrlen) must also be a null pointer, and this indicates that we are not
interested in knowing the protocol address of who sent us data.
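A single datagram exchange can be sketched with both ends in one process. The helper name udp_loopback is hypothetical, and the sketch assumes a working loopback interface; the receiving socket is bound to a kernel-chosen port so that no fixed port number is needed.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send one datagram from an unbound "client"
 * socket to a "server" socket over loopback.
 * Returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback(const char *msg, size_t msglen, char *out, size_t outlen)
{
    struct sockaddr_in servaddr, cliaddr;
    socklen_t len;
    ssize_t n = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0)
        goto done;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    servaddr.sin_port = htons(0);               /* kernel picks the port */
    if (bind(server, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
        goto done;
    len = sizeof(servaddr);
    if (getsockname(server, (struct sockaddr *) &servaddr, &len) < 0)
        goto done;

    /* no connect: the destination address travels with each sendto */
    if (sendto(client, msg, msglen, 0,
               (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
        goto done;

    /* addrlen is value-result: buffer size in, stored address size out */
    len = sizeof(cliaddr);
    n = recvfrom(server, out, outlen, 0, (struct sockaddr *) &cliaddr, &len);

done:
    if (server >= 0)
        close(server);
    if (client >= 0)
        close(client);
    return n;
}
```

After recvfrom returns, cliaddr holds the sender's address, which is what a real server would pass to sendto for its reply.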
The DNS is used primarily to map between hostnames and IP addresses. A hostname can
be either a simple name, such as solaris or freebsd, or a fully qualified domain name
(FQDN), such as solaris.unpbook.com.
Resource Records
Entries in the DNS are known as resource records (RRs). There are only a few types of
RRs that we are interested in.
A       An A record maps a hostname into a 32-bit IPv4 address. For example, here
        are the four DNS records for the host freebsd in the unpbook.com domain,
        the first of which is an A record:
AAAA    A AAAA record, called a "quad A" record, maps a hostname into a 128-bit
        IPv6 address. The term "quad A" was chosen because a 128-bit address is four
        times larger than a 32-bit address.
PTR     PTR records (called "pointer records") map IP addresses into hostnames. For
        an IPv4 address, the 4 bytes of the 32-bit address are reversed, each byte is
        converted to its decimal ASCII value (0–255), and in-addr.arpa is then
        appended. The resulting string is used in the PTR query.
CNAME   CNAME stands for "canonical name." A common use is to assign CNAME
        records for common services, such as ftp and www.
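The PTR naming rule for IPv4 described in the list above can be sketched as a small function. The helper name ipv4_ptr_name is hypothetical, and the sketch only builds the query name; it performs no DNS lookup.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the in-addr.arpa name used in a PTR query
 * for a dotted-decimal IPv4 address.
 * Returns 0 on success, -1 on invalid input or a too-small buffer. */
int ipv4_ptr_name(const char *ipv4, char *out, size_t outlen)
{
    struct in_addr addr;
    const unsigned char *b;
    int n;

    if (inet_pton(AF_INET, ipv4, &addr) != 1)
        return -1;

    /* s_addr is in network byte order, so b[0] is the first octet */
    b = (const unsigned char *) &addr.s_addr;

    /* reverse the 4 bytes, print each in decimal, append the suffix */
    n = snprintf(out, outlen, "%u.%u.%u.%u.in-addr.arpa",
                 (unsigned) b[3], (unsigned) b[2],
                 (unsigned) b[1], (unsigned) b[0]);
    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```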
Organizations run one or more name servers, often the program known as BIND
(Berkeley Internet Name Domain). Applications such as the clients and servers that we
are writing in this text contact a DNS server by calling functions in a library known as the
resolver. The common resolver functions are gethostbyname and gethostbyaddr, both
of which are described in this chapter. The former maps a hostname into its IPv4
addresses, and the latter does the reverse mapping.
The resolver code reads its system-dependent configuration files to determine the location
of the organization's name servers. (We use the plural "name servers" because most
organizations run multiple name servers, even though we show only one local server in
the figure. Multiple name servers are absolutely required for reliability and redundancy.)
The file /etc/resolv.conf normally contains the IP addresses of the local name servers.
gethostbyname Function
Host computers are normally known by human-readable names. All the examples that we
have shown so far in this book have intentionally used IP addresses instead of names, so
we know exactly what goes into the socket address structures for functions such as
connect and sendto, and what is returned by functions such as accept and recvfrom.
But, most applications should deal with names, not addresses. This is especially true as
we move to IPv6, since IPv6 addresses (hex strings) are much longer than IPv4 dotted-
decimal numbers. (The example AAAA record and ip6.arpa PTR record in the previous
section should make this obvious.)
#include <netdb.h>
struct hostent *gethostbyname (const char *hostname);
The function returns a pointer to a hostent structure, or a null pointer on error:
struct hostent {
    char  *h_name;        /* official (canonical) name of host */
    char **h_aliases;     /* pointer to array of pointers to alias names */
    int    h_addrtype;    /* host address type: AF_INET */
    int    h_length;      /* length of address: 4 */
    char **h_addr_list;   /* ptr to array of ptrs with IPv4 addrs */
};
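Reading the first address out of the hostent structure can be sketched as follows. The helper name first_ipv4_address is hypothetical. Note that gethostbyname returns a pointer to static storage, so the result should be copied out before the next resolver call; newer code generally uses getaddrinfo instead.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: look up a hostname and write its first IPv4
 * address, in dotted-decimal form, into out.
 * Returns 0 on success, -1 if the lookup fails or out is too small. */
int first_ipv4_address(const char *host, char *out, socklen_t outlen)
{
    struct hostent *hp = gethostbyname(host);

    if (hp == NULL || hp->h_addrtype != AF_INET || hp->h_addr_list[0] == NULL)
        return -1;

    /* each h_addr_list entry points to a binary address of h_length bytes */
    if (inet_ntop(AF_INET, hp->h_addr_list[0], out, outlen) == NULL)
        return -1;
    return 0;
}
```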
gethostbyaddr Function:
The function gethostbyaddr takes a binary IPv4 address and tries to find the hostname
corresponding to that address. This is the reverse of gethostbyname.
#include <netdb.h>
struct hostent *gethostbyaddr (const char *addr, socklen_t len, int family);
This function returns a pointer to the same hostent structure that we described with
gethostbyname. The field of interest in this structure is normally h_name, the canonical
hostname.
The addr argument is not a char*, but is really a pointer to an in_addr structure
containing the IPv4 address. len is the size of this structure: 4 for an IPv4 address. The
family argument is AF_INET.
INDEX
• Socket introduction
• Elementary TCP sockets
• TCP client server
• I/O functions
• Select & poll functions
• Socket options
• Elementary UDP sockets
• Elementary node and address conversions
• Echo Service