Notes On Socket Programming
Table of Contents
1. Before you start
2. Understanding IP networks and network layers
3. Writing a client application in C
4. Writing a server application in C
5. Writing socket applications in Python
6. Summary and resources
ibm.com/developerWorks
The trick is using a wrapped version of the same gethostbyname() function we also find in C. Usage is as simple as:

    #!/usr/bin/env python
    "USAGE: nslookup.py <inet_address>"
    import socket, sys
    print socket.gethostbyname(sys.argv[1])
    /* Bare nslookup utility (w/ minimal error checking) */
    #include <stdio.h>       /* stderr, stdout */
    #include <stdlib.h>      /* exit() */
    #include <string.h>      /* memcpy() */
    #include <netdb.h>       /* hostent struct, gethostbyname() */
    #include <arpa/inet.h>   /* inet_ntoa() to format IP address */
    #include <netinet/in.h>  /* in_addr structure */

    int main(int argc, char **argv) {
        struct hostent *host;   /* host information */
        struct in_addr addr;    /* Internet address (not named h_addr,
                                   which <netdb.h> may define as a macro) */

        if (argc != 2) {
            fprintf(stderr, "USAGE: nslookup <inet_address>\n");
            exit(1);
        }
        if ((host = gethostbyname(argv[1])) == NULL) {
            fprintf(stderr, "(mini) nslookup failed on '%s'\n", argv[1]);
            exit(1);
        }
        /* Copy the first address; it is already in network byte order */
        memcpy(&addr, host->h_addr_list[0], sizeof(addr));
        fprintf(stdout, "%s\n", inet_ntoa(addr));
        exit(0);
    }
Notice that the value returned from gethostbyname() is a pointer to a hostent structure that describes the named host. The member host->h_addr_list contains a list of addresses, each of which is a 32-bit value in "network byte order"; in other words, the byte order may or may not match the machine's native order. To convert an address to dotted-quad form, use the function inet_ntoa().
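As a rough illustration of the byte-order point, here is a small Python 3 sketch (the tutorial's own listings use Python 2 syntax). The helper names dotted_quad() and as_integer() are ours, not part of any library; they only wrap the standard socket.inet_ntoa() and struct.unpack() calls.

```python
import socket
import struct

def dotted_quad(packed):
    """Format a 4-byte IPv4 address (network byte order) as a dotted quad."""
    return socket.inet_ntoa(packed)

def as_integer(packed):
    """Read the same 4 bytes as an unsigned integer, most significant byte first."""
    # "!" tells struct to use network (big-endian) order on every platform.
    return struct.unpack("!I", packed)[0]

# inet_aton() is the inverse of inet_ntoa(): text in, 4 packed bytes out.
packed = socket.inet_aton("192.168.2.103")
print(dotted_quad(packed))
print(hex(as_integer(packed)))
```

The packed form is what travels in a sockaddr_in; the dotted quad exists only for people to read.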
There is not too much to the setup. A particular buffer size is allocated, which limits the amount of data echoed at each pass (but we loop through multiple passes, if needed). A small error function is also defined.
    int main(int argc, char *argv[]) {
        int sock;
        struct sockaddr_in echoserver;
        char buffer[BUFFSIZE];
        unsigned int echolen;
        int received = 0;

        if (argc != 4) {
            fprintf(stderr, "USAGE: TCPecho <server_ip> <word> <port>\n");
            exit(1);
        }
        /* Create the TCP socket */
        if ((sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
            Die("Failed to create socket");
        }
The value returned is a socket handle, which is similar to a file handle; specifically, if the socket creation fails, socket() returns -1 rather than a non-negative handle.
        /* Construct the server sockaddr_in structure */
        memset(&echoserver, 0, sizeof(echoserver));       /* Clear struct */
        echoserver.sin_family = AF_INET;                  /* Internet/IP */
        echoserver.sin_addr.s_addr = inet_addr(argv[1]);  /* IP address */
        echoserver.sin_port = htons(atoi(argv[3]));       /* Server port */
        /* Establish connection */
        if (connect(sock, (struct sockaddr *) &echoserver,
                    sizeof(echoserver)) < 0) {
            Die("Failed to connect with server");
        }
As with creating the socket, the attempt to establish a connection returns -1 if it fails. Otherwise, the socket is now ready for sending and receiving data. See Resources for a reference on port numbers.
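A call to send() takes as arguments the socket handle itself, the string to send, the length of the sent string (for verification), and a flag argument. Normally the flag is the default value 0. The return value of the send() call is the number of bytes successfully sent.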
        /* Send the word to the server */
        echolen = strlen(argv[2]);
        if (send(sock, argv[2], echolen, 0) != echolen) {
            Die("Mismatch in number of sent bytes");
        }
        /* Receive the word back from the server */
        fprintf(stdout, "Received: ");
        while (received < echolen) {
            int bytes = 0;
            if ((bytes = recv(sock, buffer, BUFFSIZE-1, 0)) < 1) {
                Die("Failed to receive bytes from server");
            }
            received += bytes;
            buffer[bytes] = '\0';   /* Assure null terminated string */
            /* Print via "%s": received data must never be the format string */
            fprintf(stdout, "%s", buffer);
        }
The recv() call is not guaranteed to get everything in transit on a particular call; it simply blocks until it gets something. Therefore, we loop until we have gotten back as many bytes as were sent, writing each partial string as we get it. Obviously, a different protocol might decide when to terminate receiving bytes in a different manner (perhaps a delimiter within the bytestream).
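The same count-based loop can be sketched in Python 3 (the tutorial's own listings use Python 2 syntax). This is only an illustration under our own assumptions: recv_exactly() and demo() are hypothetical names, and socket.socketpair() stands in for a real client/server connection so the example runs in one process.

```python
import socket

def recv_exactly(sock, nbytes, bufsize=32):
    """Call recv() repeatedly until nbytes have arrived."""
    chunks = []
    received = 0
    while received < nbytes:
        # recv() may return fewer bytes than requested, so keep asking.
        data = sock.recv(min(bufsize, nbytes - received))
        if not data:
            # An empty result means the peer closed the connection.
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(data)
        received += len(data)
    return b"".join(chunks)

def demo():
    # A connected pair of sockets, used here in place of a network peer.
    left, right = socket.socketpair()
    with left, right:
        message = b"x" * 100
        left.sendall(message)
        return recv_exactly(right, len(message))
```

With a 32-byte buffer, the 100-byte message in demo() cannot arrive in fewer than four recv() calls, which is exactly the situation the loop exists for.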
    #define MAXPENDING 5    /* Max connection requests */
    #define BUFFSIZE 32
    void Die(char *mess) { perror(mess); exit(1); }
The BUFFSIZE constant limits the data sent per loop. The MAXPENDING constant limits the number of connections that will be queued at a time (only one will be serviced at a time in our simple server). The Die() function is the same as in our client.
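For a short echo string (less than BUFFSIZE bytes) on a typical connection, only one pass through the while loop will occur. But the underlying sockets interface (and TCP/IP) does not make any guarantees about how the bytestream will be split between calls to recv().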
    void HandleClient(int sock) {
        char buffer[BUFFSIZE];
        int received = -1;
        /* Receive message */
        if ((received = recv(sock, buffer, BUFFSIZE, 0)) < 0) {
            Die("Failed to receive initial bytes from client");
        }
        /* Send bytes and check for more incoming data in loop */
        while (received > 0) {
            /* Send back received data */
            if (send(sock, buffer, received, 0) != received) {
                Die("Failed to send bytes to client");
            }
            /* Check for more data */
            if ((received = recv(sock, buffer, BUFFSIZE, 0)) < 0) {
                Die("Failed to receive additional bytes from client");
            }
        }
        close(sock);
    }
The socket that is passed in to the handler function is one that is already connected to the requesting client. Once we are done with echoing all the data, we should close this socket; the parent server socket stays around to spawn new children, like the one just closed.
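To watch the handler loop take several passes, here is a Python 3 sketch of the same logic (the tutorial's own listings use Python 2 syntax). The names handle_client() and demo() are ours, the 4-byte buffer is deliberately tiny, and socket.socketpair() replaces a real accepted connection; treat it as an illustration rather than the tutorial's server.

```python
import socket

def handle_client(sock, bufsize=4):
    """Echo data back until the peer signals end of stream; return the pass count."""
    passes = 0
    with sock:                      # closes the connected socket when we are done
        data = sock.recv(bufsize)
        while data:                 # an empty result means no more data is coming
            sock.sendall(data)
            passes += 1
            data = sock.recv(bufsize)
    return passes

def demo(message=b"hello, echo"):
    client, server_side = socket.socketpair()
    with client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)   # tell the handler nothing more follows
        passes = handle_client(server_side)
        echoed = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            echoed += chunk
    return echoed, passes
```

Because each recv() here returns at most 4 bytes, the 11-byte message needs at least three passes, yet the client still reads back one unbroken bytestream.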
    int main(int argc, char *argv[]) {
        int serversock, clientsock;
        struct sockaddr_in echoserver, echoclient;

        if (argc != 2) {
            fprintf(stderr, "USAGE: echoserver <port>\n");
            exit(1);
        }
        /* Create the TCP socket */
        if ((serversock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
            Die("Failed to create socket");
        }
        /* Construct the server sockaddr_in structure */
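        memset(&echoserver, 0, sizeof(echoserver));       /* Clear struct */
        echoserver.sin_family = AF_INET;                  /* Internet/IP */
        echoserver.sin_addr.s_addr = htonl(INADDR_ANY);   /* Incoming addr */
        echoserver.sin_port = htons(atoi(argv[1]));       /* Server port */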
Notice that both IP address and port are converted to network byte order for the sockaddr_in structure. The reverse functions to return to native byte order are ntohs() and ntohl(). These functions are no-ops on some platforms, but it is still wise to use them for cross-platform compatibility.
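Python exposes the same conversion functions, so the round trip is easy to try. The following Python 3 sketch (the tutorial's own listings use Python 2 syntax) uses two helper names of our own, round_trip_port() and port_on_the_wire(); the calls they wrap are the standard socket.htons(), socket.ntohs(), and struct.pack().

```python
import socket
import struct
import sys

def round_trip_port(port):
    """Convert a 16-bit port to network byte order and back again."""
    return socket.ntohs(socket.htons(port))

def port_on_the_wire(port):
    """Return the two bytes that represent the port in network byte order."""
    # "!H" means: network (big-endian) order, unsigned 16-bit integer.
    return struct.pack("!H", port)

print(sys.byteorder)            # native order of this machine
print(socket.htons(7))          # unchanged on a big-endian host, byte-swapped otherwise
print(round_trip_port(7))       # always the original value
print(port_on_the_wire(7))      # high byte first, whatever the host order
```

On a big-endian machine htons() returns its argument unchanged, which is the "no-op" case mentioned above.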
        /* Bind the server socket */
        if (bind(serversock, (struct sockaddr *) &echoserver,
                 sizeof(echoserver)) < 0) {
            Die("Failed to bind the server socket");
        }
        /* Listen on the server socket */
        if (listen(serversock, MAXPENDING) < 0) {
            Die("Failed to listen on server socket");
        }
Once the server socket is bound, it is ready to listen(). As with most socket functions, both bind() and listen() return -1 if they have a problem. Once a server socket is listening, it is ready to accept() client connections, acting as a factory for sockets on each connection.
        /* Run until cancelled */
        while (1) {
            unsigned int clientlen = sizeof(echoclient);
            /* Wait for client connection */
            if ((clientsock =
                 accept(serversock, (struct sockaddr *) &echoclient,
                        &clientlen)) < 0) {
                Die("Failed to accept client connection");
            }
            fprintf(stdout, "Client connected: %s\n",
                    inet_ntoa(echoclient.sin_addr));
            HandleClient(clientsock);
        }
    }
We can see the populated structure in echoclient with the fprintf() call that accesses the client IP address. The client socket handle is passed to HandleClient(), which we saw at the start of this section.
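The "factory" behavior of a listening socket can be seen in a few lines of Python 3 (the tutorial's own listings use Python 2 syntax). This is a sketch under our own assumptions: accept_one() is a hypothetical name, the loopback address and port 0 (meaning "any free port") are chosen so the example is self-contained, and the client connects from the same process.

```python
import socket

def accept_one():
    """Bind, listen, accept a single loopback connection, and report on it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    with listener:
        listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
        listener.listen(5)                 # queue up to 5 pending connections
        port = listener.getsockname()[1]

        # The connection is queued by the listening socket until accept() runs.
        client = socket.create_connection(("127.0.0.1", port))
        conn, client_addr = listener.accept()
        with client, conn:
            # accept() hands back a new socket; the listener itself is unchanged.
            is_new_socket = conn.fileno() != listener.fileno()
            # The reported peer address is the client's own local address.
            addr_matches = client_addr == client.getsockname()
    return is_new_socket, addr_matches
```

The address returned by accept() plays the role of the echoclient structure in the C version.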
Section 5. Writing socket applications in Python
The socket and SocketServer module
Python's standard module socket provides almost exactly the same range of capabilities you would find in C sockets. However, the interface is generally more flexible, largely because of the benefits of dynamic typing. Moreover, an object-oriented style is also used. For example, once you create a socket object, methods like .bind(), .connect(), and .send() are methods of that object, rather than global functions operating on a socket handle.

At a higher level than socket, the module SocketServer provides a framework for writing servers. This is still relatively low level, and higher-level interfaces are available for serving higher-level protocols, such as SimpleHTTPServer, DocXMLRPCServer, and CGIHTTPServer.
    #!/usr/bin/env python
    "USAGE: echoclient.py <server> <word> <port>"
    from socket import *    # import *, but we'll avoid name conflict
    import sys
    if len(sys.argv) != 4:
        print __doc__
        sys.exit(0)
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((sys.argv[1], int(sys.argv[3])))
    message = sys.argv[2]
    messlen, received = sock.send(message), 0
    if messlen != len(message):
        print "Failed to send complete message"
    print "Received: ",
    while received < messlen:
        data = sock.recv(32)
        sys.stdout.write(data)
        received += len(data)
    print
    sock.close()
At first brush, we seem to have left out some of the error-catching code from the C version. But since Python raises descriptive errors for every situation that we checked for in the C echo client, we can let the built-in exceptions do our work for us. Of course, if we wanted the precise wording of errors that we had before, we would have to add a few try/except clauses around the calls to methods of the socket object.
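If we did want the C client's wording back, a try/except clause might look like the following Python 3 sketch (the tutorial's own listings use Python 2 syntax). The names try_connect() and unused_port() are hypothetical helpers of ours; unused_port() exists only so the failure case can be shown without a real server.

```python
import socket

def try_connect(host, port, timeout=2.0):
    """Return (sock, None) on success or (None, message) on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError as err:      # in Python 3, socket errors derive from OSError
        sock.close()
        return None, "Failed to connect with server: %s" % err
    return sock, None

def unused_port():
    """Find a local port that was free a moment ago (helper for the demo only)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]

sock, problem = try_connect("127.0.0.1", unused_port())
if problem:
    print(problem)
```

The exception text after the colon comes from the operating system, so its wording varies by platform.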
While shorter, the Python client is somewhat more powerful. Specifically, the address we feed to a .connect() call can be either a dotted-quad IP address or a symbolic name, without need for extra lookup work; for example:
    $ ./echoclient 192.168.2.103 foobar 7
    Received: foobar
    $ ./echoclient.py fury.gnosis.lan foobar 7
    Received: foobar
We also have a choice between the methods .send() and .sendall(). The former sends as many bytes as it can at once; the latter sends the whole message (or raises an exception if it cannot). For this client, we indicate if the whole message was not sent, but proceed with getting back as much as actually was sent.
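One way to see the difference is to write the loop that .sendall() saves us from. The Python 3 sketch below (the tutorial's own listings use Python 2 syntax) is an approximation for illustration, not the library's actual implementation; send_all() and demo() are our own names, and socket.socketpair() stands in for a network connection.

```python
import socket

def send_all(sock, data):
    """Call send() repeatedly until every byte has been accepted."""
    total = 0
    while total < len(data):
        # send() reports how many bytes it took, which may be fewer than offered.
        sent = sock.send(data[total:])
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total += sent
    return total

def demo(message=b"foobar"):
    left, right = socket.socketpair()
    with left, right:
        count = send_all(left, message)
        # MSG_WAITALL asks recv() to wait for the full requested length.
        return count, right.recv(len(message), socket.MSG_WAITALL)
```

For a message as short as the echo client's, a single .send() almost always takes everything, which is why the client above gets away with checking the count once.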
    #!/usr/bin/env python
    "USAGE: echoserver.py <port>"
    from SocketServer import BaseRequestHandler, TCPServer
    import sys, socket

    class EchoHandler(BaseRequestHandler):
        def handle(self):
            print "Client connected:", self.client_address
            self.request.sendall(self.request.recv(2**16))
            self.request.close()

    if len(sys.argv) != 2:
        print __doc__
    else:
        TCPServer(('',int(sys.argv[1])), EchoHandler).serve_forever()
The only thing we need to provide is a child of SocketServer.BaseRequestHandler that has a .handle() method. The self instance has some useful attributes, such as .client_address, and .request, which is itself a connected socket object.
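For readers on Python 3, where the module is named socketserver, a self-contained variant might look like this. It is a sketch under our own assumptions: echo_once() is a hypothetical name, the server binds to the loopback address on port 0 (any free port), and it serves exactly one request with handle_request() instead of running forever.

```python
import socket
import socketserver

class EchoHandler(socketserver.BaseRequestHandler):
    seen = []                           # client addresses, recorded for the demo

    def handle(self):
        # self.request is the connected socket; self.client_address is (host, port).
        EchoHandler.seen.append(self.client_address)
        self.request.sendall(self.request.recv(2**16))

def echo_once(message=b"foobar"):
    """Start a server, send it one message from the same process, return the echo."""
    with socketserver.TCPServer(("127.0.0.1", 0), EchoHandler) as server:
        with socket.create_connection(server.server_address) as client:
            client.sendall(message)
            server.handle_request()     # accept once, run handle(), close
            return client.recv(len(message), socket.MSG_WAITALL)
```

The framework closes the connected socket after handle() returns, so the handler does not need to call .close() itself.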
    #!/usr/bin/env python
    "USAGE: echoserver.py <port>"
    from socket import *    # import *, but we'll avoid name conflict
    import sys

    def handleClient(sock):
        data = sock.recv(32)
        while data:
            sock.sendall(data)
            data = sock.recv(32)
        sock.close()

    if len(sys.argv) != 2:
        print __doc__
    else:
        sock = socket(AF_INET, SOCK_STREAM)
        sock.bind(('',int(sys.argv[1])))
        sock.listen(5)
        while 1:    # Run until cancelled
            newsock, client_addr = sock.accept()
            print "Client connected:", client_addr
            handleClient(newsock)
In truth, this "hard way" still isn't very hard. But as in the C implementation, we manufacture new connected sockets using .accept() on the listening socket, and call our handler for each such connection.
Resources
A good introduction to sockets programming in C is TCP/IP Sockets in C, by Michael J. Donahoo and Kenneth L. Calvert (Morgan-Kaufmann, 2001). Examples and more information are available on the book's Author pages.

The UNIX Systems Support Group document Network Layers explains the functions of the lower network layers.

The Transmission Control Protocol (TCP) is covered in RFC 793.

The User Datagram Protocol (UDP) is the subject of RFC 768.

You can find a list of widely used port assignments at the IANA (Internet Assigned Numbers Authority) Web site.

"Understanding Sockets in Unix, NT, and Java" (developerWorks) illustrates fundamental sockets principles with sample source code in C and in Java.

"RunTime: Programming sockets" (developerWorks) compares the performance of sockets on Windows and Linux.

The Sockets section from the AIX C Programming book Communications Programming Concepts goes into depth on a number of related issues.

Volume 2 of the AIX 5L Version 5.2 Technical Reference focuses on Communications, including, of course, a great deal on sockets programming.

The Robocode project (alphaWorks) has an article on "Using Serialization with Sockets," which includes Java source code and examples.
Sockets, network layers, UDP, and much more are also discussed in the conversational Beej's Guide to Network Programming.

You may find Gordon McMillan's Socket Programming HOWTO and Jim Frost's BSD Sockets: A Quick and Dirty Primer useful as well.

Find more resources for Linux developers in the developerWorks Linux zone.
Feedback
Please let us know whether this tutorial was helpful to you and how we could make it better. We'd also like to hear about other tutorial topics you'd like to see covered. For questions about the content of this tutorial, contact the author, David Mertz, at mertz@gnosis.cx.
Colophon
This tutorial was written entirely in XML, using the developerWorks Toot-O-Matic tutorial generator. The open source Toot-O-Matic tool is an XSLT stylesheet and several XSLT extension functions that convert an XML file into a number of HTML pages, a zip file, JPEG heading graphics, and two PDF files. Our ability to generate multiple text and binary formats from a single source file illustrates the power and flexibility of XML. (It also saves our production team a great deal of time and effort.) You can get the source code for the Toot-O-Matic at www6.software.ibm.com/dl/devworks/dw-tootomatic-p. The tutorial Building tutorials with the Toot-O-Matic demonstrates how to use the Toot-O-Matic to create your own tutorials. developerWorks also hosts a forum devoted to the Toot-O-Matic; it's available at www-105.ibm.com/developerworks/xml_df.nsf/AllViewTemplate?OpenForm&RestrictToCategory=11. We'd love to know what you think about the tool.