Networking Basics
Computers running on the Internet communicate with each other using either the Transmission Control
Protocol (TCP) or the User Datagram Protocol (UDP), as this diagram illustrates:
When you write Java programs that communicate over the network, you are programming at the
application layer. Typically, you don't need to concern yourself with the TCP and UDP layers. Instead, you
can use the classes in the java.net package. These classes provide system-independent network
communication. However, to decide which Java classes your programs should use, you do need to
understand how TCP and UDP differ.
TCP
When two applications want to communicate with each other reliably, they establish a connection and send
data back and forth over that connection. This is analogous to making a telephone call. If you want to
speak to Aunt Beatrice in Kentucky, a connection is established when you dial her phone number and she
answers. You send data back and forth over the connection by speaking to one another over the phone
lines. Like the phone company, TCP guarantees that data sent from one end of the connection actually
gets to the other end and in the same order it was sent. Otherwise, an error is reported.
TCP provides a point-to-point channel for applications that require reliable communications. The
Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), and Telnet are all examples of
applications that require a reliable communication channel. The order in which the data is sent and
received over the network is critical to the success of these applications. When HTTP is used to read from
a URL, the data must be received in the order in which it was sent. Otherwise, you end up with a jumbled
HTML file, a corrupt zip file, or some other invalid information.
Definition:
TCP (Transmission Control Protocol) is a connection-based protocol that provides a reliable flow of data
between two computers.
UDP
The UDP protocol provides for communication that is not guaranteed between two applications on the
network. UDP is not connection-based like TCP. Rather, it sends independent packets of data,
called datagrams, from one application to another. Sending datagrams is much like sending a letter
through the postal service: The order of delivery is not important and is not guaranteed, and each
message is independent of any other.
Definition:
UDP (User Datagram Protocol) is a protocol that sends independent packets of data, called datagrams,
from one computer to another with no guarantees about arrival. UDP is not connection-based like TCP.
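As a rough illustration of the definition above, the following sketch sends a single datagram between two sockets on the loopback interface. The class name UdpLoopbackDemo, the roundTrip helper, and the 2-second timeout are assumptions made for this example; only the java.net classes themselves come from the standard library.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one datagram sent and received on the local machine.
class UdpLoopbackDemo {

    // Sends one datagram to a receiver on the loopback interface and returns what arrived.
    static String roundTrip(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            // UDP gives no delivery guarantee, so never wait forever (2000 ms is arbitrary).
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection is set up: the packet itself carries the destination address and port.
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket received = new DatagramPacket(buffer, buffer.length);
            receiver.receive(received); // throws SocketTimeoutException if nothing arrives in time
            return new String(received.getData(), 0, received.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello over UDP"));
    }
}
```

On a real network the receive call could time out even though the send succeeded, which is exactly the "no guarantees about arrival" part of the definition.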
For many applications, the guarantee of reliability is critical to the success of the transfer of information
from one end of the connection to the other. However, other forms of communication don't require such
strict standards. In fact, they may be slowed down by the extra overhead or the reliable connection may
invalidate the service altogether.
Consider, for example, a clock server that sends the current time to its client when requested to do so. If
the client misses a packet, it doesn't really make sense to resend it because the time will be incorrect
when the client receives it on the second try. If the client makes two requests and receives packets from
the server out of order, it doesn't really matter because the client can figure out that the packets are out of
order and make another request. The reliability of TCP is unnecessary in this instance because it causes
performance degradation and may hinder the usefulness of the service.
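A clock service of this kind might look roughly like the sketch below. It is not a standard time protocol: the one-byte request, the 8-byte reply holding milliseconds since the epoch, and the names UdpClockDemo, serveOnce, askTime, and demo are all assumptions made for illustration. The point is that the client simply gives up after a timeout and asks again later, instead of relying on retransmission of a stale reply.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;

// Hypothetical UDP clock service; the wire format is invented for this sketch.
class UdpClockDemo {

    // Answers exactly one request with the current time in milliseconds (8 bytes).
    static void serveOnce(DatagramSocket server) throws IOException {
        DatagramPacket request = new DatagramPacket(new byte[1], 1);
        server.receive(request);
        byte[] now = ByteBuffer.allocate(Long.BYTES).putLong(System.currentTimeMillis()).array();
        // Reply to whatever address and port the request came from.
        server.send(new DatagramPacket(now, now.length, request.getSocketAddress()));
    }

    // Asks the server for the time; a lost request or reply surfaces as SocketTimeoutException.
    static long askTime(SocketAddress server, int timeoutMillis) throws IOException {
        try (DatagramSocket client = new DatagramSocket()) {
            client.setSoTimeout(timeoutMillis);
            client.send(new DatagramPacket(new byte[1], 1, server));
            DatagramPacket reply = new DatagramPacket(new byte[Long.BYTES], Long.BYTES);
            client.receive(reply);
            return ByteBuffer.wrap(reply.getData(), 0, reply.getLength()).getLong();
        }
    }

    // Runs server and client in one process on the loopback interface.
    static long demo() throws IOException {
        try (DatagramSocket server = new DatagramSocket(0, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    // In this sketch a server failure simply makes the client time out.
                }
            });
            serverThread.setDaemon(true);
            serverThread.start();
            return askTime(server.getLocalSocketAddress(), 2000);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Server time (ms since epoch): " + demo());
    }
}
```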
Another example of a service that doesn't need the guarantee of a reliable channel is the ping command.
The purpose of the ping command is to test the communication between two programs over the network.
In fact, ping needs to know about dropped or out-of-order packets to determine how good or bad the
connection is. A reliable channel would invalidate this service altogether.
Note:
Many firewalls and routers have been configured not to allow UDP packets. If you're having trouble
connecting to a service outside your firewall, or if clients are having trouble connecting to your service,
ask your system administrator if UDP is permitted.
Understanding Ports
Generally speaking, a computer has a single physical connection to the network. All data destined for a
particular computer arrives through that connection. However, the data may be intended for different
applications running on the computer. So how does the computer know to which application to forward
the data? Through the use of ports.
Data transmitted over the Internet is accompanied by addressing information that identifies the computer
and the port for which it is destined. The computer is identified by its IP address (32 bits in IPv4, 128 bits
in IPv6), which IP uses to deliver data to the right computer on the network. Ports are identified by a
16-bit number, which TCP and UDP use to deliver the data to the right application.
In connection-based communication such as TCP, a server application binds a socket to a specific port
number. This has the effect of registering the server with the system to receive all data destined for that
port. A client can then rendezvous with the server at the server's port, as illustrated here:
Definition:
The TCP and UDP protocols use ports to map incoming data to a particular process running on a
computer.
In datagram-based communication such as UDP, the datagram packet contains the port number of its
destination and UDP routes the packet to the appropriate application, as illustrated in this figure:
Port numbers range from 0 to 65,535 because ports are represented by 16-bit numbers. The port
numbers ranging from 0 to 1023 are restricted; they are reserved for use by well-known services such as
HTTP and FTP and other system services. These ports are called well-known ports. Your applications
should not attempt to bind to them.
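The port ranges described above can be checked with a few lines of Java. In this sketch the class PortDemo and its helpers isWellKnown and pickFreePort are hypothetical names; binding to port 0 is the standard way to let the operating system choose a free port outside the reserved range.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper class illustrating the 16-bit port range.
class PortDemo {

    // The largest value a 16-bit port number can hold: 65,535.
    static final int MAX_PORT = 0xFFFF;

    // True for the restricted range reserved for well-known services.
    static boolean isWellKnown(int port) {
        return port >= 0 && port <= 1023;
    }

    // Binds to port 0 so the operating system picks a free port, then reports which one it chose.
    static int pickFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Port 80 well-known?   " + isWellKnown(80));
        System.out.println("Port 8080 well-known? " + isWellKnown(8080));
        System.out.println("OS-assigned port:     " + pickFreePort());
    }
}
```

The port returned by pickFreePort is released again when the socket closes, so it is only a hint that the port was free at that moment.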
The term network programming refers to writing programs that execute across multiple devices (computers), in which
the devices are all connected to each other using a network.
The java.net package of the J2SE APIs contains a collection of classes and interfaces that provide the low-level
communication details, allowing you to write programs that focus on solving the problem at hand.
The java.net package provides support for the two common network protocols:
TCP: TCP stands for Transmission Control Protocol, which allows for reliable communication between two
applications. TCP is typically used over the Internet Protocol, which is referred to as TCP/IP.
UDP: UDP stands for User Datagram Protocol, a connection-less protocol that allows for packets of data to be
transmitted between applications.
Socket Programming: This is the most widely used concept in networking and is explained in detail below.
URL Processing: This is covered separately.
Socket Programming:
Sockets provide the communication mechanism between two computers using TCP. A client program creates a
socket on its end of the communication and attempts to connect that socket to a server.
When the connection is made, the server creates a socket object on its end of the communication. The client and
server can now communicate by writing to and reading from the socket.
The java.net.Socket class represents a socket, and the java.net.ServerSocket class provides a mechanism for the
server program to listen for clients and establish connections with them.
The following steps occur when establishing a TCP connection between two computers using sockets:
The server instantiates a ServerSocket object, denoting which port number communication is to occur on.
The server invokes the accept() method of the ServerSocket class. This method waits until a client connects to the
server on the given port.
After the server is waiting, a client instantiates a Socket object, specifying the server name and port number to
connect to.
The constructor of the Socket class attempts to connect the client to the specified server and port number. If
communication is established, the client now has a Socket object capable of communicating with the server.
On the server side, the accept() method returns a reference to a new socket on the server that is connected to the
client's socket.
After the connections are established, communication can occur using I/O streams. Each socket has both an
OutputStream and an InputStream. The client's OutputStream is connected to the server's InputStream, and the
client's InputStream is connected to the server's OutputStream.
TCP is a two-way communication protocol, so data can be sent across both streams at the same time. The
following useful classes provide a complete set of methods to implement sockets.
If the ServerSocket constructor does not throw an exception, it means that your application has successfully bound to
the specified port and is ready for client requests.
When the ServerSocket invokes accept(), the method does not return until a client connects. After a client does
connect, the ServerSocket creates a new Socket and returns a reference to it. The new Socket uses the same local
port as the ServerSocket; the connection is told apart by the client's address and port. A TCP connection now
exists between the client and server, and communication can begin.
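The connection steps described above can be sketched in a single program that plays both roles on the loopback interface. This is a minimal sketch, not a production server: the class TcpEchoDemo and the method echoOnce are hypothetical names, and the "protocol" (echo one line of text) is an assumption chosen to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: a one-shot echo server and its client in the same process.
class TcpEchoDemo {

    static String echoOnce(String line) throws IOException {
        // The server binds a ServerSocket; port 0 lets the operating system choose a free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                // accept() blocks until a client connects, then returns a connected Socket.
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(peer.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println(in.readLine()); // echo one line back to the client
                } catch (IOException e) {
                    // In this sketch a server-side failure shows up as a null reply on the client.
                }
            });
            serverThread.start();

            // The Socket constructor connects to the server's address and port.
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                out.println(line);    // client OutputStream feeds the server InputStream
                return in.readLine(); // server OutputStream feeds the client InputStream
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(echoOnce("hello over TCP"));
    }
}
```

A real server would normally call accept() in a loop and hand each returned Socket to its own thread or executor task.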
public Socket(String host, int port, InetAddress localAddress, int localPort) throws IOException
Connects to the specified host and port, creating a socket on the local host at the specified
address and port.
public Socket(InetAddress host, int port, InetAddress localAddress, int localPort) throws IOException
This constructor is identical to the previous one, except that the host is denoted by an
InetAddress object instead of a String.
public Socket()
Creates an unconnected socket. Use the connect() method to connect this socket to a server.
Except for the no-argument form, a Socket constructor does not simply instantiate a Socket object: it also
attempts to connect to the specified server and port before it returns.
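The no-argument constructor is the one case where creating the object and connecting are separate steps. The sketch below, with the hypothetical class ConnectDemo and an arbitrary 2-second timeout, creates an unconnected Socket and then calls connect() explicitly against a local ServerSocket.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo: observe a Socket before and after an explicit connect().
class ConnectDemo {

    // Returns { connected before connect(), connected after connect() }.
    static boolean[] connectStates() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) { // no-argument form: created but not connected
            boolean before = socket.isConnected();
            // connect() performs the step the other constructors carry out implicitly.
            // The timeout of 2000 ms is an arbitrary choice for this sketch.
            socket.connect(new InetSocketAddress(server.getInetAddress(), server.getLocalPort()), 2000);
            boolean after = socket.isConnected();
            return new boolean[] { before, after };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] states = connectStates();
        System.out.println("Connected before connect(): " + states[0]);
        System.out.println("Connected after connect():  " + states[1]);
    }
}
```

Separating the two steps is mainly useful when you want a connect timeout or need to set socket options before the connection is made.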
Some methods of interest in the Socket class are listed here. Notice that both the client and server have a Socket
object, so these methods can be invoked by both the client and server.
The following methods belong to the java.net.InetAddress class, which represents an IP address:
String getHostAddress()
Returns the IP address as a string in textual form.
String getHostName()
Gets the host name for this IP address.
String toString()
Converts this IP address to a String.
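These three methods are defined on java.net.InetAddress, and a short sketch shows how they might be called. The class InetAddressDemo and the parse helper are hypothetical; the literal 127.0.0.1 is used so that no name lookup is needed to create the object.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo of the InetAddress accessor methods.
class InetAddressDemo {

    // For a literal IP address, getByName only validates the format of the text.
    static InetAddress parse(String literal) throws UnknownHostException {
        return InetAddress.getByName(literal);
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress address = parse("127.0.0.1");
        System.out.println("toString():       " + address);
        System.out.println("getHostAddress(): " + address.getHostAddress());
        // getHostName() may trigger a reverse lookup, so its result depends on the machine.
        System.out.println("getHostName():    " + address.getHostName());
    }
}
```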
import java.net.*;
import java.io.*;
TCP and UDP are two transport layer protocols that are used extensively on the Internet
to transmit data from one host to another. A good knowledge of how TCP and UDP work
is essential for any programmer, which is why the difference between TCP and UDP is
one of the most popular programming interview questions. I have seen this question
many times in Java interviews, especially for server-side Java developer positions.
Since FIX (Financial Information Exchange) is also a TCP-based protocol, several
investment banks, hedge funds, and exchange solution providers look for Java
developers with a good knowledge of TCP and UDP. Writing FIX engines and server-side
components for high-speed electronic trading platforms needs capable developers with
a solid understanding of fundamentals, including data structures, algorithms, and
networking. By the way, the use of TCP and UDP is not limited to one area; they are at
the heart of the Internet. HTTP, the protocol at the core of the web, is traditionally
based on TCP. One more reason why a Java developer should understand these two
protocols in detail is that Java is used extensively to write multi-threaded, concurrent,
and scalable servers. Java also provides a rich socket programming API for both TCP-
and UDP-based communication. In this article, we will learn the key differences
between the TCP and UDP protocols, which are useful to every Java programmer. To
start with, TCP stands for Transmission Control Protocol and UDP stands for User
Datagram Protocol, and both are used extensively to build Internet applications.
Difference between TCP and UDP Protocols
I love to compare two things point by point; this not only makes them easy to
compare but also makes the differences easy to remember. When we compare TCP to
UDP, we learn how each protocol works, which one provides reliable and guaranteed
delivery and which doesn't, which one is faster and why, and, most importantly, when
to choose TCP over UDP while building your own distributed application. In this article
we will see the difference between UDP and TCP in 9 points, e.g. connection set-up,
reliability, ordering, speed, overhead, header size, congestion control, application,
different protocols based upon TCP and UDP, and how they transfer data. By learning
these differences, you will not only be able to answer this interview question better
but also understand some important details about two of the most important
protocols of the Internet.
1) Connection Set-up
The first and foremost difference between them is that TCP is a connection-oriented
protocol and UDP is a connectionless protocol. With TCP, a connection must be
established between client and server before they can send data. The connection
establishment process is also known as the TCP handshake, in which control messages
are exchanged between client and server. The attached image describes the TCP
handshake, for example which control messages are exchanged between client and
server. The client, which is the initiator of the TCP connection, sends a SYN message to
the server, which is listening on a TCP port. The server receives it and sends a SYN-ACK
message, which the client receives and answers with an ACK. Once the server receives
this ACK message, the TCP connection is established and ready for data transmission.
On the other hand, UDP is a connectionless protocol, and no point-to-point connection
is established before sending messages. That is why UDP is more suitable for multicast
distribution of messages, the one-to-many distribution of data in a single transmission.
2) Reliability
TCP provides a delivery guarantee, which means a message sent using TCP is either
delivered to the receiver or an error is reported to the sender. If a message is lost in
transit, it is recovered by retransmission, which is handled by the TCP protocol itself.
On the other hand, UDP is unreliable; it doesn't provide any delivery guarantee. A
datagram packet may be lost in transit. That's why UDP is not suitable for programs
that require guaranteed delivery.
3) Ordering
Apart from the delivery guarantee, TCP also guarantees the order of messages. Messages
are delivered to the receiving application in the same order in which the sender sent
them, even though the underlying packets may reach the other end of the network out
of order. The TCP protocol does all the sequencing and ordering for you. UDP doesn't
provide any ordering or sequencing guarantee; datagram packets may arrive in any
order. That's why TCP is suitable for applications that need delivery in sequence,
though there are also UDP-based protocols that provide ordering and reliability by
using sequence numbers and redelivery, e.g. TIBCO Rendezvous, which is actually a
UDP-based application.
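To make the sequence-number idea concrete, here is a minimal sketch of how an application on top of UDP might restore ordering itself. It does not reflect how TIBCO Rendezvous or any other real product works: the 4-byte sequence prefix and the names SequencedDatagrams, encode, and reorder are assumptions made for this example, and redelivery of missing datagrams is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Hypothetical application-level ordering on top of unordered datagrams.
class SequencedDatagrams {

    // Prefixes the payload with a 4-byte sequence number (a wire format invented for this sketch).
    static byte[] encode(int sequence, String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + body.length).putInt(sequence).put(body).array();
    }

    // Restores sending order from datagrams that arrived in any order.
    static List<String> reorder(List<byte[]> arrived) {
        TreeMap<Integer, String> bySequence = new TreeMap<>(); // keeps keys sorted
        for (byte[] datagram : arrived) {
            ByteBuffer buffer = ByteBuffer.wrap(datagram);
            int sequence = buffer.getInt();
            byte[] body = new byte[buffer.remaining()];
            buffer.get(body);
            bySequence.put(sequence, new String(body, StandardCharsets.UTF_8));
        }
        return new ArrayList<>(bySequence.values());
    }

    public static void main(String[] args) {
        // Simulate datagrams 0, 1, 2 arriving as 2, 0, 1.
        List<byte[]> arrived = Arrays.asList(encode(2, "third"), encode(0, "first"), encode(1, "second"));
        System.out.println(reorder(arrived));
    }
}
```

A gap in the sequence numbers tells the receiver that a datagram was lost, which is the hook a real protocol would use to request redelivery.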
4) Data Boundary
TCP does not preserve data boundaries; UDP does. In the Transmission Control Protocol,
data is sent as a byte stream, and no distinguishing indications are transmitted to
signal message (segment) boundaries. In UDP, packets are sent individually and are
checked for integrity only if they arrive. Packets have definite boundaries, which are
honoured upon receipt, meaning a read operation at the receiver socket will yield an
entire message as it was originally sent. TCP also delivers every byte, in order, but
the application must mark where one message ends and the next begins. TCP may
hold data in its buffers before sending to make optimum use of network bandwidth.
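Because TCP hands the receiver a plain byte stream, applications usually add their own message framing. One common approach, sketched here under the assumption of a 4-byte length prefix, writes the length of each message before its bytes. The class LengthPrefixedFraming and its methods are hypothetical names, and in-memory streams stand in for the streams of a connected Socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical framing layer: each message is sent as [4-byte length][UTF-8 bytes].
class LengthPrefixedFraming {

    static void writeMessage(DataOutputStream out, String message) throws IOException {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        out.writeInt(body.length); // tells the reader where this message ends
        out.write(body);
    }

    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] body = new byte[length];
        in.readFully(body); // keeps reading until the whole message has arrived
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // The byte array stands in for a TCP stream carrying two back-to-back messages.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "first message");
        writeMessage(out, "second");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in));
        System.out.println(readMessage(in));
    }
}
```

With a real connection, the same two methods would wrap socket.getOutputStream() and socket.getInputStream().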
5) Speed
In a word, TCP is slower and UDP is faster. Since TCP has to create a connection and
ensure guaranteed and ordered delivery, it does a lot more work than UDP. This costs
TCP in terms of speed, which is why UDP is more suitable where speed is a concern, for
example online video streaming, telecasts, or online multiplayer games.
7) Header size
TCP has a bigger header than UDP. The usual header size of a TCP packet is 20 bytes
(without options), which is more than double the 8-byte header of a UDP datagram
packet. The TCP header contains Sequence Number, Ack Number, Data Offset, Reserved,
Control Bits, Window, Urgent Pointer, Options, Padding, Checksum, Source Port, and
Destination Port, while the UDP header contains only Length, Source Port, Destination
Port, and Checksum. Here is what the TCP and UDP headers look like: