
Network Protocols


M.Sc.

Information Technology
(DISTANCE MODE)

DIT 116 Network Protocols

I SEMESTER COURSE MATERIAL

Centre for Distance Education


Anna University Chennai, Chennai 600 025

Author

Dr. P. Yogesh
Senior Lecturer, Department of Computer Science and Engineering, Anna University Chennai, Chennai 600 025

Reviewer

Dr. Ranjani Parthasarathy


Professor, Department of Computer Science and Engineering, Anna University Chennai, Chennai 600 025

Editorial Board

Dr. C. Chellappan
Professor, Department of Computer Science and Engineering, Anna University Chennai, Chennai 600 025

Dr. T.V. Geetha


Professor, Department of Computer Science and Engineering, Anna University Chennai, Chennai 600 025

Dr. H. Peeru Mohamed


Professor, Department of Management Studies, Anna University Chennai, Chennai 600 025

Copyrights Reserved (For Private Circulation only)

ACKNOWLEDGEMENT

I would like to convey my sincere thanks to Dr. C. Chellappan, Professor and Deputy Director, Centre for Distance Education, Anna University, Chennai, for providing me the opportunity to prepare this course material. I thank my wife Mrs. A. Thiruchelvi and Master Y. Mukil Kumar for their constant encouragement in preparing this course material. I have drawn inputs from several sources for the preparation of this course material, to meet the requirements of the syllabus. The author gratefully acknowledges the following sources: Douglas E. Comer, Internetworking with TCP/IP: Principles, Protocols and Architectures, Fourth Edition, Prentice Hall of India, 2002; Behrouz A. Forouzan, TCP/IP Protocol Suite, Third Edition, Tata McGraw Hill, 2006; Uyless Black, Computer Networks: Protocols, Standards and Interfaces, Second Edition, Prentice Hall of India, 2002.

Dr. P. Yogesh, Senior Lecturer, Department of Computer Science and Engineering, Anna University, Chennai 25.

DIT 116 NETWORK PROTOCOLS

UNIT I
Internet Protocol: Routing IP Datagrams - Error and Control Messages (ICMP) - Reliable Stream Transport Service (TCP): TCP State Machine, Response to Congestion, Tail Drop and TCP Random Early Discard - Routing: Exterior Gateway Protocols and Autonomous Systems (BGP).

UNIT II
Internet Multicasting - Mobile IP - Bootstrap and Autoconfiguration (BOOTP, DHCP).

UNIT III
The Domain Name System (DNS) - Applications: Remote Login (TELNET, Rlogin) - File Transfer and Access (FTP, TFTP, NFS).

UNIT IV
Applications: Electronic Mail (SMTP, POP, IMAP, MIME) - World Wide Web (HTTP) - Voice and Video over IP (RTP).

UNIT V
Applications: Internet Management (SNMP) - Internet Security and Firewall Design (IPsec) - The Future of TCP/IP (IPv6).

TEXT BOOK
1. Douglas E. Comer, Internetworking with TCP/IP: Principles, Protocols and Architectures, Fourth Edition, Prentice Hall of India Private Limited, 2002.

REFERENCES
1. Uyless Black, Computer Networks: Protocols, Standards and Interfaces, Second Edition, Prentice Hall of India, Delhi, 2002.
2. Udupa, Network Management System Essentials, McGraw Hill, 1999.


UNIT - 1
1.1 Introduction
1.2 Learning Objectives
1.3 TCP/IP Reference Model
1.4 Internet Architecture and Design Philosophy
1.5 Routing in an internet
1.6 Internet Protocol (IP)
1.7 Internet Control Message Protocol
1.8 Necessity of Transport Layer
1.9 TCP State Machine
1.10 Timer Management of TCP
1.11 Congestion Control Behavior of TCP
1.12 Congestion-control Mechanisms in Network Layer
1.13 User Datagram Protocol
1.14 Autonomous Systems

UNIT - 2
2.1 Introduction
2.2 Learning Objectives
2.3 Obtaining IP Addresses
2.4 IP Multicast
2.5 Internet Group Management Protocol
2.6 Multicast Routing Issues and Protocols
2.7 Multicasting over the Internet
2.8 Mobile IP

UNIT - 3
3.1 Introduction
3.2 Learning Objectives
3.3 Domain Name System
3.4 Shared File Access
3.5 File Transfer Protocol
3.6 Trivial File Transfer Protocol
3.7 Network File System
3.8 TELNET Protocol
3.9 Rlogin (BSD UNIX)

UNIT - 4
4.1 Introduction
4.2 Learning Objectives
4.3 World Wide Web
4.4 Hyper Text Transfer Protocol
4.5 Electronic Mail
4.6 Protocols of E-Mail System
4.7 Message Formats
4.8 Multimedia Applications
4.9 Real-Time Transport Protocol (RTP)
4.10 IP Telephony and Signaling

UNIT - 5
5.1 Introduction
5.2 Learning Objectives
5.3 Network Management
5.4 Simple Network Management Protocol
5.5 Network Security
5.6 IP Security (IPSec)
5.7 Firewalls and Internet Access
5.8 The Future of TCP/IP

UNIT - 1

Network Protocols
1.1 INTRODUCTION

The Internet is a packet-switched network that provides worldwide connectivity to its users. It enables users to exchange information even when they are geographically separated across continents. The communication software that provides this ability is the TCP (Transmission Control Protocol)/IP (Internet Protocol) protocol stack. To understand how data is sent from a machine in one corner of the world to a machine in another, it is necessary to understand the various protocols of this suite. This unit introduces the TCP/IP reference model and the operations required to transmit data across multiple networks. The unit then discusses IP, the core protocol of the entire stack, and its adjunct protocol ICMP (Internet Control Message Protocol). To fully understand the functioning of the Internet, it is also necessary to understand the events that take place once a datagram reaches its destination; the discussion of TCP covers these events. Finally, this unit introduces the concept of autonomous systems. Overall, it gives you the foundation needed for the rest of the units of this subject.

1.2 LEARNING OBJECTIVES

To understand the basics of network reference models
To introduce the TCP/IP reference model
To discuss the issues of routing
To study the various fields of IP
To understand the necessity of ICMP
To understand the end-to-end issues of data communication
To discuss the TCP protocol in detail
To understand the various congestion control policies
To discuss the idea of Autonomous Systems

1.3 TCP/IP REFERENCE MODEL

1.3.1 Network Software

Basically, a computer network is an interconnection of computers. This may give you the idea that it is sufficient to provide a mere physical interconnection between these computers. However, a physical interconnection of a set of computers alone cannot provide any useful service. The basic requirement for a computer network to provide useful services is that the interconnected computers should be able to exchange information among themselves. To exchange information and to provide useful services, software also plays a major role in networking. Initially, computer networks were designed with the hardware as the main concern, and software was not given due importance. This strategy failed as the expectations of users increased. Nowadays, network software is highly structured. To reduce design complexity, network software is organized as a stack of layers or levels, each one built upon the one below it. The purpose of each layer is to offer certain services to the higher layers, shielding those layers from the implementation details of the services. Layer n on one machine communicates with layer n on another machine, and the rules and conventions followed in this communication are collectively known as the layer n protocol. A set of layers and protocols is called a network architecture. The set of protocols followed by a communication system, one per layer, is called a protocol stack. ARPANET (Advanced Research Projects Agency Network), the forerunner of the Internet, used a set of protocols during its initial deployment and demonstration. During these demonstrations, the researchers of ARPA realized the limitations of the existing protocols and initiated more research with the aim of developing a protocol suite that works better in a highly internetworked environment. The outcome of their research is the TCP/IP protocol suite, and you are going to learn about this protocol stack in detail in this unit.

1.3.2 Internetworking and the Internet

As you know, a computer network N1 is formed by interconnecting a set of computers, and a computer in N1 can communicate with any other computer in N1. Apart from network N1, however, many other networks also exist in the world. Assume that host Hx of network N1 wants to communicate with host Hy of N2. Such communication is not possible as long as N1 and N2 function like islands; interconnection of networks N1 and N2 is necessary to achieve it. The process of interconnecting different networks is called internetworking, and the resultant network is called an internetwork. The short form of internetwork is internet. Note, however, that the term internet is different from the global Internet (note the capitalization of the letter I in the global Internet). Internetworking is difficult because it involves incompatible networks. To have a viable internet, you need special computers that are willing to transfer packets from one network to another. Computers that interconnect two networks and pass packets from one to the other are called internet gateways or internet routers. The global Internet, which provides many useful services like browsing, file transfer and electronic mail, is a packet-switched network to which billions of users, millions of computers and thousands of networks across various continents are connected. Host Hs of network Ns is able to send and receive datagrams to and from host Hd of network Nd, possibly through many intermediate networks Ni, Nj, Nk, etc. Since the underlying network is packet switched, different datagrams from source machine S may take different routes before reaching destination machine D. The TCP/IP protocol stack is the glue, or lynchpin, that holds the entire Internet together, and routers or gateways make it possible to transport datagrams from source S to destination D across various incompatible networks.

1.3.3 TCP/IP Protocol Stack

The TCP/IP protocol stack was designed and implemented by the researchers of ARPA from the beginning with the sole purpose of internetworking different and incompatible networks. The TCP/IP reference model and its correspondence with the OSI (Open System Interconnection) reference model are shown in Figure 1.1. The TCP/IP reference model may be considered to have either four or five layers: the bottom-most layer can be considered either as a single layer (the host-to-network layer) or as a collection of two layers (the data link layer and the physical layer).
    OSI layers                              Corresponding TCP/IP layer
    Application, Presentation, Session      Application
    Transport                               Transport
    Network                                 Internet
    Data Link, Physical                     Host to Network


Figure 1.1 TCP/IP Reference Model


The Internet Layer

The job of this layer is to permit hosts to inject packets into any network and have them travel independently to the destination (which may be on a different network). Datagrams may arrive in a different order than they were sent. This layer defines an official packet format and protocol called IP (Internet Protocol). The job of the internet layer is to deliver IP packets where they are supposed to go.

The Transport Layer

This layer allows peer entities on the source and destination hosts to carry on a conversation. The transport layer defines two protocols, namely TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). TCP provides a reliable, connection-oriented service that allows a byte stream originating on one machine to be delivered without error on any other machine in the internet. It fragments the incoming byte stream into discrete messages and passes each one on to the internet layer. At the destination, the receiving TCP process reassembles the received messages into the output stream. Flow control and error control are also done by TCP. UDP provides an unreliable, connectionless protocol for applications that do not need flow control and error control. From this you should understand that, irrespective of reliability, a transport layer is required to achieve end-to-end functionality. (A short sketch after the review questions below illustrates how applications choose between TCP and UDP.)

The Application Layer

This layer contains all the higher-level protocols. Some popular application layer protocols are TELNET, SMTP, HTTP, etc. Applications are developed using the client/server paradigm with respect to these protocols. Please observe the difference between the functionality of the application layer in OSI and in TCP/IP, irrespective of the similarity in name: the application layer of the TCP/IP protocol stack also has to carry out the functions of the session layer and the presentation layer.

The Host-to-Network Layer

The reference model does not say much about what happens below the internet layer. It simply points out that the host has to connect to the network using some protocol so that it can send IP packets over it. This layer is also called the network access layer.

Have you understood?
1. What is an internetwork?
2. What is the difference between an internet and the Internet?
3. What are the functions of the internet layer of the TCP/IP protocol stack?
4. Is the host-to-network layer a layer or an interface? Justify your answer.
5. Mention the protocols of the transport layer.
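To make the transport-layer distinction concrete, here is a minimal Python sketch (an illustration added to this material, not part of the original text). The host names, addresses and port numbers are arbitrary placeholders.

    import socket

    # TCP (SOCK_STREAM): reliable, connection-oriented byte stream.
    # Connection setup and retransmission happen inside the protocol stack.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.connect(("example.com", 80))            # placeholder host and port
    tcp.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    reply = tcp.recv(4096)                      # bytes arrive in order, without loss
    tcp.close()

    # UDP (SOCK_DGRAM): unreliable, connectionless datagrams.
    # Each sendto() becomes an independent datagram; no flow or error control.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.sendto(b"hello", ("192.0.2.1", 9999))   # placeholder address and port
    udp.close()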

1.4 INTERNET ARCHITECTURE AND DESIGN PHILOSOPHY

Conceptually, TCP/IP provides three sets of services as shown in Figure 1.2. At the lowest level, a connectionless delivery service provides a foundation on which everything rests. At the next level, a reliable transport service provides a higher platform on which applications depend.
    APPLICATION SERVICES
    RELIABLE TRANSPORT SERVICE
    CONNECTIONLESS PACKET DELIVERY SERVICE
Figure 1.2 Three conceptual layers of internet services

1.4.1 The Conceptual Service Organization

Each of these services is implemented in the form of protocol software. Nevertheless, it is useful to identify them as conceptual parts of the internet in order to understand the design philosophy of the TCP/IP reference model. Internet software is designed around three conceptual networking services arranged in a hierarchy; much of its success has resulted because this architecture is surprisingly robust and adaptable. One of the most significant advantages of this conceptual separation is that it becomes possible to replace one service without disturbing the others. Thus, research and development can proceed concurrently on all three.

1.4.2 Connectionless Delivery System

The most fundamental internet service consists of a packet delivery system. Technically, the service is defined as an unreliable, best-effort, connectionless packet delivery system, analogous to the service provided by the network hardware that operates on a best-effort delivery paradigm. The service is called unreliable because delivery is not guaranteed. The packet may be lost, duplicated, delayed, or delivered out of order, but the service will not detect such conditions, nor will it inform the sender or receiver. The service is called connectionless because each packet is treated independently from all others. A sequence of packets sent from one computer to another may travel over different paths, or some may be lost while others are delivered. Hence, the transport layer has to provide the reliability if users expect reliable services from the network. Finally, the service is said to use best-effort delivery because the internet software makes an earnest attempt to deliver packets. You should understand that the internet does not drop the datagrams intentionally and unreliability arises only when resources are exhausted or underlying networks fail.

Have you understood?
1. What is the advantage of the conceptual service organization of the internet architecture?
2. What type of service is provided by the network in the TCP/IP reference model?

1.5 ROUTING IN AN INTERNET

To understand routing, you have to understand the basic fact that, in a switched network, the source and destination are not directly connected by a single medium; they are connected through a number of intermediate devices such as routers and switches. It therefore becomes necessary to send the voice or data through these intermediate devices. In a packet-switched network, routing refers to the process of choosing a path over which to send packets, and router refers to a computer making that choice. Routing is a major task in a packet-switched network, since a routing decision has to be taken for the individual packets of the data flow between the source and destination.

In general, routing involves two different but related tasks. The first is to build the routing table, which contains information about the next hop through which datagrams are forwarded for a particular destination; separate protocols such as the Routing Information Protocol (RIP) and Open Shortest Path First (OSPF) are available to construct the routing tables at the routers. The second is to actually forward the datagrams to the next hop; separate protocols such as IP (Internet Protocol) and IPX (Internetwork Packet Exchange) exist to forward the datagrams through one of the outgoing interfaces. In the internet, IP is the protocol that is responsible for forwarding a datagram towards the destination. This action of IP is called IP routing, and the information used to make the routing decision is known as IP routing information; it is available in the routing tables of routers and hosts. Routing in an internet is more difficult than in a single network, since an internet is composed of multiple physical networks interconnected by routers, and the source and destination may be in different networks. Ideally, the routing table construction software (routing protocols such as RIP or OSPF) would examine network load, datagram length, or the type of service specified in the datagram header when selecting the best path; IP, the forwarding protocol, makes use of the information built by the routing protocols. Hosts can also take part in the routing activity if they have been provided with multiple network interface cards; such hosts are called multi-homed hosts. However, nowadays routing in the Internet is by and large confined to routers alone. Both multi-homed hosts and routers participate in routing an IP datagram to its destination. When an application program on a host attempts to communicate, the TCP/IP protocols eventually generate one or more IP datagrams. The host must make an initial routing decision when it chooses where to send the datagrams. As Figure 1.3 shows, hosts must make routing decisions even if they have only one network connection.

(The figure shows a single-homed HOST attached to one network on which two routers, R1 and R2, are reachable; the path to some destinations leads through R1 and the path to other destinations through R2.)

Figure 1.3 An example of a single-homed host that must route datagrams

The primary purpose of routers is to make IP routing decisions. Any computer with multiple network connections can act as a router, and hence multi-homed hosts running TCP/IP have all the software needed for routing. Note, however, that the TCP/IP standards draw a sharp distinction between the functions of a host and those of a router, and sites that try to mix host and router functions on a single machine sometimes find that their multi-homed hosts engage in unexpected interactions. So it becomes clear that even a multi-homed host cannot be considered a router according to the TCP/IP standards. The routing process delivers packets to the destination using either direct delivery or indirect delivery. Two machines can engage in direct delivery only if they both attach directly to the same underlying physical transmission system (e.g., a single Ethernet or a single Token Ring). Indirect delivery occurs when the destination is not on a directly attached network, forcing the sender to pass the datagram to a router for delivery.

1.5.1 Datagram Delivery in a Single Network

If the source and destination are present in the same network, the datagrams can be delivered using direct delivery. In such a case, to transfer an IP datagram, the sender encapsulates the datagram in a physical frame, maps the destination IP address into a physical address, and uses the network hardware to deliver it. The issue, however, is how the sender knows whether the destination lies on a directly connected network. The issue is easily resolved because IP version 4 follows a hierarchical addressing scheme: an IP address has two parts, a net id and a host id. To see if a destination lies on one of the directly connected networks, the sender extracts the network portion of the destination IP address and compares it to the network portion of its own IP address. If the two match, the datagram can be handed over through direct delivery. From an internet perspective, you can think of direct delivery as the final step in any datagram transmission, even if the datagram traverses many networks and intermediate routers. The final router along the path between the source and its destination connects directly to the same physical network as the destination. Thus, the final router delivers the datagram using direct delivery. You should understand that direct delivery between the source and destination is a special case of general-purpose routing.
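The comparison described above can be sketched in a few lines of Python (an illustrative addition, not part of the original text). The classful prefix lengths used here are the standard class A/B/C boundaries, and the sample addresses come from the example in Figure 1.4 later in this unit.

    import ipaddress

    def classful_network(addr: str) -> ipaddress.IPv4Network:
        """Return the classful network an IPv4 address belongs to (classes D/E ignored)."""
        first_octet = int(addr.split(".")[0])
        if first_octet < 128:        # class A: 8-bit network prefix
            prefix = 8
        elif first_octet < 192:      # class B: 16-bit network prefix
            prefix = 16
        else:                        # class C: 24-bit network prefix
            prefix = 24
        return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

    def can_deliver_directly(sender: str, destination: str) -> bool:
        """Direct delivery is possible when both addresses share the same network portion."""
        return classful_network(sender) == classful_network(destination)

    print(can_deliver_directly("20.0.0.6", "20.0.0.99"))   # True  - same network 20.0.0.0
    print(can_deliver_directly("20.0.0.6", "40.0.0.7"))    # False - needs indirect delivery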

1.5.2 Indirect Delivery

Indirect delivery is more difficult than direct delivery because the sender must identify a router to which the datagram can be sent; the router must then forward the datagram on toward its destination network. To visualize how indirect delivery works, imagine a large internet with many networks interconnected by routers but with only two hosts at the far ends. When one host wants to send to the other, it encapsulates the datagram and sends it to the nearest router. We know that the host can reach a router because the host, or the network in which the host is present, has a physical connection to the router. Once the frame reaches the router, software extracts the encapsulated datagram, and the IP software selects the next router along the path towards the destination. The datagram is again placed in a frame and sent over the next physical network to a second router, and so on, until it can be delivered directly.

Indirect delivery is similar to the following real-life scenario. Assume that you are a native of Tirunelveli and you want to go to Chennai. Your brother drops you at the Tirunelveli bus station in his vehicle (direct delivery from the host to the nearest router, with datagram encapsulation). You get into a bus (extracting the datagram) and reach the Madurai bus station (the next router along the path, with the datagram encapsulated again). The bus stations of Tiruchirappalli and Villuppuram are the other intermediate routers. Finally you reach Chennai (the last router) and your cousin takes you to his house in his vehicle (direct delivery, the last step in the whole indirect delivery process). The remaining issue is how the intermediate routers know where to forward the datagrams; hence it becomes necessary for the routers to maintain routing tables.

1.5.3 Table-Driven IP Routing

The IP routing algorithm employs a routing table (sometimes called an IP routing table) on each machine that stores information about possible destinations and how to reach them. Because both hosts and routers route datagrams, both have IP routing tables. Whenever the IP routing software in a host or router needs to transmit a datagram, it consults the routing table to decide where to send it. Note that achieving scalability is the major challenge in the maintenance of routing tables: if a router had to maintain details about all the hosts of the connected networks, the size of the routing table would grow in proportion to the number of hosts in those networks. Hence IP follows a hierarchical addressing scheme. The hierarchical addressing followed in IP version 4 uses a two-level hierarchy, network id and host id; as a result, it is enough for the routers to store only details about the local hosts and about other networks.

1.5.4 Next-Hop Routing


Using the network portion of a destination address instead of the complete host address makes routing efficient and keeps routing tables small. More importantly, it helps hide information, keeping the details of specific hosts confined to the local environment in which those hosts operate. Typically, a routing table consists of pairs (N, R), where N is the IP address of a destination network and R is the IP address of the next router along the path to network N. Router R is called the next hop, and the idea of using a routing table to store a next hop for each destination is called next-hop routing. Thus, the routing table in a router R only specifies one step along the path from R to a destination network; the router does not know the complete path to a destination. It is important to understand that each entry in a routing table points to a router that can be reached across a single network. That is, all routers listed in machine M's routing table must lie on networks to which M connects directly. When a datagram is ready to leave M, IP software locates the destination IP address and extracts the network portion. M then uses the network portion to make a routing decision, selecting a router that can be reached directly. Figure 1.4 shows a concrete example that helps explain routing tables. The example internet consists of four networks connected by three routers, and the figure shows the routing table maintained at router R that allows it to select the next hop. Because R connects directly to networks 20.0.0.0 and 30.0.0.0, it uses direct delivery to send to a host on either of those networks. Given a datagram destined for a host on network 40.0.0.0, R routes it to the address of router S, 30.0.0.7; S will then deliver the datagram directly. R can reach address 30.0.0.7 because both R and S attach directly to network 30.0.0.0.

(In the figure, the four networks 10.0.0.0, 20.0.0.0, 30.0.0.0 and 40.0.0.0 are connected in a chain by three routers. The first router attaches to networks 10.0.0.0 and 20.0.0.0 with addresses 10.0.0.5 and 20.0.0.5, router R attaches to networks 20.0.0.0 and 30.0.0.0 with addresses 20.0.0.6 and 30.0.0.6, and router S attaches to networks 30.0.0.0 and 40.0.0.0 with addresses 30.0.0.7 and 40.0.0.7. The routing table below is the one kept in R.)

    To Reach Hosts on Network     Route to this Address
    20.0.0.0                      Deliver Directly
    30.0.0.0                      Deliver Directly
    10.0.0.0                      20.0.0.5
    40.0.0.0                      30.0.0.7

Figure 1.4 An example internet with 4 networks and 3 routers, and the routing table in R
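As a rough illustration (not part of the original text), the next-hop lookup that router R performs with the table above can be sketched as follows. The table entries are copied from Figure 1.4, and the /8 prefixes reflect the class A networks used in the example.

    import ipaddress

    # Router R's table from Figure 1.4: destination network -> next hop
    # (None means "deliver directly" over the attached network).
    ROUTING_TABLE = {
        ipaddress.ip_network("20.0.0.0/8"): None,
        ipaddress.ip_network("30.0.0.0/8"): None,
        ipaddress.ip_network("10.0.0.0/8"): ipaddress.ip_address("20.0.0.5"),
        ipaddress.ip_network("40.0.0.0/8"): ipaddress.ip_address("30.0.0.7"),
    }

    def next_hop(destination: str):
        """Return the next-hop router for a destination, or None for direct delivery."""
        dest = ipaddress.ip_address(destination)
        for network, hop in ROUTING_TABLE.items():
            if dest in network:               # match on the network portion only
                return hop
        raise LookupError("routing error: no entry for " + destination)

    print(next_hop("40.0.0.9"))   # 30.0.0.7  (indirect delivery via router S)
    print(next_hop("30.0.0.1"))   # None      (direct delivery)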

As Figure 1.4 demonstrates, the size of the routing table depends on the number of networks in the internet; it grows only when new networks are added. The table size and contents are independent of the number of individual hosts connected to the networks. The whole idea behind this scalability is to keep information only about destination networks, not about individual hosts.

1.5.5 Reducing the Size of the Routing Tables

It is possible to reduce the size of the routing tables considerably in cases where a network has a small set of local hosts and only one connection to the rest of the internet. Assume that the network has k local hosts. The number of entries in the routing table is then k + 1 (k entries for the k local hosts and one entry for the rest of the internet). In other words, forwarding a datagram requires only two tests: one for the local network and another for the default internet connection. Note also that most routers at the edge and distribution levels of internets have default routes in order to reduce the size of their routing tables. Since internetworks follow a hierarchical addressing scheme, it is enough for a router to maintain details about other networks alone; it is not necessary to maintain details about each and every host in other networks. Entries in the routing table that carry routing information about other networks are network-specific, and entries that convey information about hosts in the local network are host-specific. Although most of the entries in a routing table are network-specific rather than host-specific, it can be useful to have host-specific routes for hosts to which datagrams are frequently forwarded; host-specific entries speed up routing decisions and thereby datagram forwarding.

Have you understood?
1. What is meant by indirect delivery in a network?
2. List down the steps required in next-hop routing.
3. What is the advantage of having host-specific entries in the routing table?
4. Define a logical subnet.
5. What is the purpose of a subnet mask?
6. List down the steps involved in forwarding a datagram with subnetting.

1.6 INTERNET PROTOCOL (IP)

The protocol that is responsible for the selection of routes and the forwarding of datagrams is the Internet Protocol (IP). IP is the glue that holds the entire internet together. Its importance can be understood from the fact that the whole suite is referred to as TCP/IP, where TCP is the other dominant protocol (at the transport layer) of the suite. Irrespective of the underlying network (be it an Ethernet LAN, a wireless LAN, an X.25 WAN or an ATM network), IP is able to forward a datagram in the correct direction towards its destination. One important point to understand is that the information required for route selection and data forwarding is not gathered by IP itself: the construction and maintenance of the routing tables in the routers is done by another set of protocols, such as the Routing Information Protocol (RIP) and Open Shortest Path First (OSPF). IP provides three important definitions. First, the IP protocol defines the basic unit of data transfer used throughout a TCP/IP internet; thus, it specifies the exact format of all data as it passes across the internet. Second, IP software performs the routing function, choosing a path over which data will be sent. Third, in addition to the precise, formal specification of data formats and routing, IP includes a set of rules that embody the idea of unreliable packet delivery.

1.6.1 The Internet Datagram

The internet calls its basic transfer unit an Internet datagram, sometimes referred to as an IP datagram, or merely a datagram, or sometimes a packet. Like other protocol data units, a datagram is divided into header and data areas. When a datagram is in transit, the intermediate routers do not examine the payload; they examine only the header. Figure 1.5 shows the general form of a datagram.

    DATAGRAM HEADER | DATAGRAM DATA AREA

Figure 1.5 General form of an IP datagram

1.6.2 IP Header

Figure 1.6 shows the various fields of the IP datagram.

     0       4        8               16      19                      31
    +-------+--------+----------------+-------------------------------+
    | VERS  |  HLEN  |  SERVICE TYPE  |         TOTAL LENGTH          |
    +-------+--------+----------------+-----+-------------------------+
    |         IDENTIFICATION          |FLAGS|     FRAGMENT OFFSET     |
    +----------------+----------------+-----+-------------------------+
    |  TIME TO LIVE  |    PROTOCOL    |        HEADER CHECKSUM        |
    +----------------+----------------+-------------------------------+
    |                       SOURCE IP ADDRESS                         |
    +------------------------------------------------------------------+
    |                     DESTINATION IP ADDRESS                      |
    +------------------------------------------------------------------+
    |               IP OPTIONS (IF ANY)               |    PADDING    |
    +-------------------------------------------------+---------------+
    |                            DATA                                  |
    |                            ...                                   |
    +------------------------------------------------------------------+

Figure 1.6 IP header
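To relate the figure to concrete bytes, the following sketch (added for illustration; the sample header bytes are invented) unpacks the 20-octet fixed part of an IPv4 header with Python's struct module.

    import struct, socket

    def parse_ipv4_header(packet: bytes) -> dict:
        """Unpack the 20-octet fixed IPv4 header (any options are ignored here)."""
        vers_hlen, tos, total_len, ident, flags_frag, ttl, proto, cksum, src, dst = \
            struct.unpack("!BBHHHBBH4s4s", packet[:20])
        return {
            "version":         vers_hlen >> 4,
            "hlen_words":      vers_hlen & 0x0F,       # header length in 32-bit words
            "service_type":    tos,
            "total_length":    total_len,
            "identification":  ident,
            "flags":           flags_frag >> 13,       # top 3 bits
            "fragment_offset": flags_frag & 0x1FFF,    # measured in units of 8 octets
            "ttl":             ttl,
            "protocol":        proto,                  # e.g. 1 = ICMP, 6 = TCP, 17 = UDP
            "checksum":        cksum,
            "source":          socket.inet_ntoa(src),
            "destination":     socket.inet_ntoa(dst),
        }

    # A fabricated header: version 4, HLEN 5, TTL 64, protocol 6 (TCP), 10.0.0.1 -> 20.0.0.2
    sample = bytes.fromhex("45000054abcd4000400600000a00000114000002")
    print(parse_ipv4_header(sample))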

VERS - The first 4-bit field contains the version of the IP protocol that was used to create the datagram. It is used to verify that the sender, the receiver, and any routers between them agree on the format of the datagram. The current popular version is 4, and the next generation of IP aims at version 6.

HLEN - This field gives the datagram header length measured in 32-bit words. It is necessary to include this field because the size of the header may vary depending on the options that may be present in the header.

TOTAL LENGTH - This field gives the length of the IP datagram measured in octets, including the length of the header. Because the TOTAL LENGTH field is 16 bits long, the maximum possible size of an IP datagram is 65,535 octets. In most applications this is not a severe limitation. It may become more important in the future if higher-speed networks can carry data packets larger than 65,535 octets.

SERVICE TYPE - Also called Type Of Service (TOS), this field specifies how the datagram should be handled. It is the forerunner of Quality of Service (QoS) in IP networks. It is optional for routers to interpret this field; if a router is not capable of handling the TOS field, it ignores it. Hence you should understand that the mere inclusion of the TOS field cannot by itself provide differentiated service to connections. The field was originally divided into five subfields, as shown in Figure 1.7.

      0     1     2     3     4     5     6     7
    +-----------------+-----+-----+-----+-----------+
    |   PRECEDENCE    |  D  |  T  |  R  |  UNUSED   |
    +-----------------+-----+-----+-----+-----------+

Figure 1.7 The original five subfields that comprise the 8 bit SERVICE TYPE field


Three PRECEDENCE bits specify datagram precedence, with values ranging from 0 (normal precedence) through 7 (network control), allowing senders to indicate the importance of each datagram. Bits D, T and R specify the type of transport desired for the datagram. When set, the D bit requests low delay, the T bit requests high throughput, and the R bit requests high reliability. In the 1990s the IETF redefined the meaning of the 8-bit SERVICE TYPE field to accommodate a set of differentiated services (DS) and renamed it the Differentiated Services Code Point (DSCP). Figure 1.8 illustrates the resulting definition.

      0     1     2     3     4     5     6     7
    +-----------------------------------+-----------+
    |            CODE POINT             |  UNUSED   |
    +-----------------------------------+-----------+

Figure 1.8 Differentiated Services Code Point

Under the differentiated services interpretation, the first six bits comprise a codepoint, which is sometimes abbreviated DSCP, and the last two bits are left unused. A codepoint value maps to an underlying service definition.

Identification, Flags and Fragment Offset - As you know, the size of a frame (the protocol data unit of layer 2) is limited because the underlying network hardware or technology imposes limits on it. It may appear that there is no such limitation on the size of datagrams, since they are handled by software; however, IP allots only 16 bits to the TOTAL LENGTH field, limiting a datagram to at most 65,535 octets. In practice the size of datagrams is considerably smaller, because as datagrams move from one machine to another they must always be transported by the underlying physical network, and it is desirable to map a datagram directly onto a physical frame whenever possible. Each network technology places a fixed upper bound on the amount of data that can be transferred in one physical frame (e.g., Ethernet - 1500 octets, FDDI - approximately 4470 octets). This limit is referred to as the Maximum Transfer Unit (MTU). Since the internet (network) layer provides an abstraction that is independent of the underlying network technology, the TCP/IP reference model chooses a convenient initial datagram size and arranges a way to divide large datagrams into smaller pieces when a datagram must traverse a network that has a small MTU. The smaller pieces into which a datagram is divided are called fragments, and the process of dividing a datagram is known as fragmentation. Fragments must be reassembled to produce a complete copy of the original datagram before it can be processed at the destination. Three fields in the datagram header, IDENTIFICATION, FLAGS, and FRAGMENT OFFSET, control fragmentation and reassembly of datagrams.
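As a rough sketch of the idea (an illustration added here, with an assumed MTU and a dummy payload), the following code splits a payload into fragments whose data portions are multiples of 8 octets, which is the unit in which the FRAGMENT OFFSET field is expressed. A real IP implementation also copies the header into each fragment and sets the FLAGS bits described below.

    def fragment(payload: bytes, mtu: int, header_len: int = 20):
        """Split a payload into (offset_in_8_octet_units, more_fragments, data) tuples."""
        max_data = ((mtu - header_len) // 8) * 8     # data per fragment, multiple of 8 octets
        fragments = []
        offset = 0
        while offset < len(payload):
            chunk = payload[offset:offset + max_data]
            more = (offset + len(chunk)) < len(payload)   # the 'more fragments' bit
            fragments.append((offset // 8, more, chunk))
            offset += len(chunk)
        return fragments

    # A 4000-octet payload crossing an Ethernet-like network (assumed MTU of 1500 octets):
    for off, more, data in fragment(b"x" * 4000, mtu=1500):
        print(f"offset={off:4d} (x8 octets)  more={more}  data_len={len(data)}")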


The IDENTIFICATION field contains a unique integer that identifies the datagram. When a router fragments a datagram, it copies most of the fields in the datagram header into each fragment; thus, the IDENTIFICATION field must be copied. Its primary purpose is to allow the destination to know which arriving fragments belong to which datagrams. As a fragment arrives, the destination uses the IDENTIFICATION field along with the datagram source address to identify the datagram. Computers sending IP datagrams must generate a unique value for the IDENTIFICATION field of each datagram. One technique used by IP software keeps a global counter in memory, increments it each time a new datagram is created, and assigns the result as the datagram's IDENTIFICATION field.

Each fragment has exactly the same format as a complete datagram. For a fragment, the FRAGMENT OFFSET field specifies the offset in the original datagram of the data being carried in the fragment, measured in units of 8 octets, starting at offset zero. To reassemble the datagram, the destination must obtain all fragments, from the fragment that has offset 0 through the fragment with the highest offset. Fragments do not necessarily arrive in order, and there is no communication between the router that fragmented the datagram and the destination trying to reassemble it.

The low-order two bits of the 3-bit FLAGS field control fragmentation. Usually, application software using TCP/IP does not care about fragmentation, because both fragmentation and reassembly are automatic procedures that occur at a low level in the operating system, invisible to end users. However, to test internet software or debug operational problems, it may be important to test sizes of datagrams for which fragmentation occurs. The first control bit aids in such testing by specifying whether the datagram may be fragmented; it is called the 'do not fragment' bit, because setting it to 1 specifies that the datagram should not be fragmented. An application may choose to disallow fragmentation when only the entire datagram is useful. For example, consider a bootstrap sequence in which a small embedded system executes a program in ROM that sends a request over the internet, to which another machine responds by sending back a memory image. If the embedded system has been designed so that it needs the entire image or nothing at all, the datagram should have the 'do not fragment' bit set. Whenever a router needs to fragment a datagram that has the 'do not fragment' bit set, the router discards the datagram and sends an error message back to the source.

The low-order bit in the FLAGS field specifies whether the fragment contains data from the middle of the original datagram or from the end; it is called the 'more fragments' bit. To see why such a bit is needed, consider the IP software at the ultimate destination attempting to reassemble a datagram. It will receive fragments (possibly out of order) and needs to know when it has received all fragments for a datagram. When a fragment arrives, the TOTAL LENGTH field in the header refers to the size of the fragment and not to the size of the original datagram, so the destination cannot use the TOTAL LENGTH field to tell whether it has collected all fragments.


The 'more fragments' bit solves the problem easily: once the destination receives a fragment with the 'more fragments' bit turned off, it knows this fragment is the last one of the current datagram. From the FRAGMENT OFFSET and TOTAL LENGTH fields of that fragment, it can compute the length of the original datagram. By examining the FRAGMENT OFFSET and TOTAL LENGTH of all fragments that have arrived, a receiver can tell whether the fragments on hand contain all the pieces needed to reassemble the original datagram.

TIME TO LIVE - This field theoretically specifies how long, in seconds, the datagram is allowed to remain in the internet system. Ideally, the routers of the internet should decrement the TIME TO LIVE field as time passes and remove a datagram from the internet when its time expires. In practice, however, it is very difficult to estimate the exact time at which a datagram should expire, since routers do not usually compute transit times for datagrams. A simple way of implementing this field is to take the decision in terms of the number of hops taken through the network. The TTL field of a datagram is initialized to some value and decremented by each router for every hop taken by the datagram; once the value reaches zero, the network discards the datagram and sends an error message back to the source. Hence, in practice, TIME TO LIVE acts as a hop limit rather than an estimate of delay.

PROTOCOL - This field of the IP header specifies the high-level protocol that created the DATA area of the datagram. From the protocol layering of the TCP/IP reference model, it may look as if the PROTOCOL field has only two possible values, namely TCP and UDP, since these are the two transport layer protocols immediately above the internet layer. In fact, ICMP messages (of layer 3 itself) or sometimes messages of applications themselves (e.g., ping) may also be encapsulated within an IP datagram.

HEADER CHECKSUM - The purpose of this field is to ensure that the IP header is intact while the datagram is in transit. The IP checksum is formed by treating the header as a sequence of 16-bit integers (in network byte order), adding them together using one's complement arithmetic, and then taking the one's complement of the result. For the purpose of computing the checksum, the HEADER CHECKSUM field is assumed to contain zero. Observe that the checksum applies only to values in the IP header and not to the data. Separating the checksums for header and data has advantages and disadvantages. The advantage is that, since the header usually occupies fewer octets than the data, having a header checksum reduces processing time at routers, which only need to compute header checksums; the separation also allows higher-level protocols to choose their own checksum scheme for the data. The chief disadvantage is that higher-level protocols are forced to add their own checksum or risk having corrupted data go undetected.
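A minimal sketch of the header checksum computation described above (an illustration, not part of the original text); the sample header bytes are invented and have their CHECKSUM field set to zero, as the algorithm requires.

    def ip_checksum(header: bytes) -> int:
        """One's complement of the one's complement sum of the header's 16-bit words."""
        if len(header) % 2:                  # pad odd-length input with a zero octet
            header += b"\x00"
        total = 0
        for i in range(0, len(header), 2):
            total += (header[i] << 8) | header[i + 1]   # words in network byte order
        while total >> 16:                   # fold carries back into 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    # Fabricated header with its CHECKSUM field zeroed before computing:
    hdr = bytes.fromhex("45000054abcd4000400600000a00000114000002")
    print(hex(ip_checksum(hdr)))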

SOURCE IP ADDRESS and DESTINATION IP ADDRESS - These fields contain the 32-bit IP addresses of the datagram's sender and intended recipient. One important and interesting point to note is that, although the datagram may be routed through many intermediate routers, the source and destination fields never change. The next-hop IP addresses used in the intermediate routers are simply used and forgotten; this is one of the stateless features of IP.

DATA - This field carries the payload of the IP datagram. This part of the datagram is not examined or interpreted by the routers of the internet. In the best-effort model of the Internet, the network does not differentiate between datagrams with respect to their content. The length of this field depends on the nature of the programs running in the application layer of the end systems.

IP OPTIONS - As already discussed, the options field of the header is variable in size and depends on the various restrictions the network service provider or the application wants to impose on the process of routing. Many routers of the internet are not able to interpret the options and take the appropriate measures to satisfy the restrictions; if a router cannot interpret the options, it simply ignores them.

PADDING - This field depends on the options selected. It consists of zero bits that may be needed to ensure that the datagram header extends to an exact multiple of 32 bits.

1.6.3 IP Addresses

Every host and router on the internet has an IP address, which encodes its network number and host number. The combination is unique: in principle, no two machines on the internet have the same IP address. All IP addresses are 32 bits long and are used in the source address and destination address fields of IP datagrams. It is important to note that an IP address does not actually refer to a host. It really refers to a network interface, so if a host is on two networks, it must have two IP addresses. However, in practice, most hosts are on one network and thus have one IP address.

Figure 1.9 IP address format

For several decades, IP addresses were divided into the five categories shown in Figure 1.9. This allocation has come to be called classful addressing. It is no longer used, but references to it in the literature are still common. The class A, B, C and D formats allow for up to 128 networks with 16 million hosts each, 16,384 networks with up to 64K hosts, and 2 million networks (e.g., LANs) with up to 256 hosts each (although a few of these are special). Also supported is multicast, in which a datagram is directed to multiple hosts. Addresses beginning with 1111 are reserved for future use. Over 500,000 networks are now connected to the Internet, and the number grows every year. Network numbers are managed by a non-profit corporation called ICANN (Internet Corporation for Assigned Names and Numbers) to avoid conflicts. In turn, ICANN has delegated parts of the address space to various regional authorities, which then dole out IP addresses to ISPs and other companies. Network addresses, which are 32-bit numbers, are usually written in dotted decimal notation. In this format, each of the four bytes is written in decimal, from 0 to 255. For example, the 32-bit hexadecimal address C0290614 is written as 192.41.6.20. The lowest IP address is 0.0.0.0 and the highest is 255.255.255.255. The values 0 and -1 (all 1s) have special meanings, as shown in Figure 1.10.

Figure 1.10 Special IP addresses

The value 0 means this network or this host. The value of -1 is used as a broadcast address to mean all hosts on the indicated network. The IP address 0.0.0.0 is used by hosts when they are being booted. IP addresses with 0 as the network number refer to the current network. These addresses allow machines to refer to their own network without knowing its number (but they have to know its class to know how many 0s to include). The address consisting of all 1s allows broadcasting on the local network, typically a LAN. Addresses with a proper network number and all 1s in the host field allow machines to send broadcast packets to distant LANs anywhere in the internet. Finally, all addresses of the form 127.x.y.z are reserved for loopback testing. Packets sent to such addresses are not put out onto the wire; they are processed locally and treated as incoming packets. This allows packets to be sent to the local network without the sender knowing its number.
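A tiny sketch (added for illustration) that reproduces the dotted-decimal example from the text and checks a couple of the special values using Python's ipaddress module.

    import ipaddress

    # Dotted decimal is just the four bytes of the 32-bit number written in decimal.
    addr = ipaddress.ip_address(0xC0290614)
    print(addr)                                            # 192.41.6.20, the example above

    print(int(ipaddress.ip_address("0.0.0.0")))            # 0: "this host", used while booting
    print(ipaddress.ip_address("255.255.255.255"))         # all 1s: broadcast on the local network
    print(ipaddress.ip_address("127.0.0.1").is_loopback)   # True: the 127.x.y.z loopback block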

Even though the class-based, hierarchical addressing scheme of IP version 4 (IPv4) improves the scalability of the design, it imposes serious limitations on the way IP addresses can be allotted to networks and hosts, and it makes many address combinations among the 2^32 addresses unusable. As the size of the Internet grows exponentially, IPv4 suffers from the problem of address space exhaustion. Many temporary solutions have been proposed, and the most important one is subnetting. Subnetting is the process of splitting a large network into many small logical subnets. (Please note the difference between a logical subnet and the communication subnet: a logical subnet is a subnetwork of a large network that is transparent to the outside world, whereas the communication subnet is the collection of lines (trunks, media or channels) and networking devices such as routers and switches.) Even after the split, the network appears as a single network to the outside world. Subnets are created by borrowing certain bits from the host portion for the network portion. The problem then is how to identify the subnet in which the destination host is present. The solution is to have a subnet mask which is logically ANDed with the destination address to extract the net id, the subnet id and the host id. For example, the 32-bit subnet mask 11111111 11111111 00000000 00000000 specifies that the first two octets identify the network and the last two octets identify a host on that network; this is the default subnet mask for class B networks. However, if subnetting is supported, then it is possible to have a subnet mask like 11111111 11111111 11111100 00000000, which is written as 255.255.252.0 or /22. A packet addressed to 130.50.15.6 and arriving at the main router is ANDed with the subnet mask 255.255.252.0 to give the address 130.50.12.0. This address is looked up in the routing tables to find out which output line to use to get to the router for subnet 3.

1.6.4 IP Data Forwarding Protocol

IP is able to forward datagrams from the source all the way to the destination across the communication subnet. However, to do this IP needs the routing information that must be present in the intermediate routers. Gathering and maintaining the routing information in the routers is not the task of IP; these two functions are done by routing protocols such as the Routing Information Protocol (RIP) and Open Shortest Path First (OSPF). Moreover, within the network IP does not examine the contents of the payload field. Instead, it checks the fields of the header and forwards the datagram according to the entry present in the routing table. That is the reason for having a checksum field that covers the header alone and not the data. However, if the IP header of a datagram is not well formed, the datagram may be discarded or dropped from the network. Not examining the contents of the datagram reduces the complexity of the routers; however, in the context of multimedia applications and other time-sensitive applications, IP suffers in terms of quality of service. The Internet Engineering Task Force (IETF) has taken efforts in this direction and has proposed two models, namely the Integrated Services Architecture (ISA) and the Differentiated Services Architecture (DSA), to overcome these limitations of the Internet.

The IP Routing Algorithm

Once a datagram arrives at a router, the router has to take a series of decisions before forwarding it. Conventionally, this forwarding algorithm is called the IP routing algorithm. These decisions are summarized in the form of a routing algorithm, as shown in Figure 1.11.


Algorithm: RouteDatagram(Datagram, RoutingTable)

  Extract destination IP address, D, from the datagram and compute the network prefix, N;
  if N matches any directly connected network address
      deliver datagram to destination D over that network
      (this involves resolving D to a physical address, encapsulating the datagram,
       and sending the frame)
  else if the table contains a host-specific route for D
      send datagram to next hop specified in table
  else if the table contains a route for network N
      send datagram to next hop specified in table
  else if the table contains a default route
      send datagram to the default router specified in table
  else
      declare a routing error;

Figure 1.11. The algorithm IP uses to forward a datagram.

The unified routing algorithm that is capable of handling subnetting is shown in figure 1.12.

Algorithm: Route_IP_Datagram(datagram, routing_table)

  Extract destination IP address, ID, from datagram;
  if prefix of ID matches address of any directly connected network
      send datagram to destination over that network
      (this involves resolving ID to a physical address, encapsulating the datagram,
       and sending the frame)
  else
      for each entry in routing table do
          let N be the bitwise AND of ID and the subnet mask;
          if N equals the network address field of the entry
              then route the datagram to the specified next hop address
      endforloop
  if no matches were found, declare a routing error;

Figure 1.12. The unified IP routing algorithm
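Below is a hedged Python rendering of the unified algorithm of Figure 1.12 (an illustration added to this material, not code from the text). The table entries, including the next-hop addresses, are invented for demonstration; the first entry reuses the 130.50.0.0 subnetting example with mask 255.255.252.0 discussed earlier.

    import ipaddress

    # Each entry: (subnet mask, network address, next hop); None next hop = deliver directly.
    ROUTING_TABLE = [
        ("255.255.252.0", "130.50.12.0", "130.50.12.1"),   # hypothetical router for subnet 3
        ("255.0.0.0",     "40.0.0.0",    "30.0.0.7"),
        ("0.0.0.0",       "0.0.0.0",     "128.1.0.1"),     # default route matches everything
    ]

    def route(destination: str):
        """Unified lookup: AND the destination with each entry's mask and compare."""
        dest = int(ipaddress.ip_address(destination))
        for mask, network, hop in ROUTING_TABLE:
            masked = dest & int(ipaddress.ip_address(mask))      # bitwise AND with subnet mask
            if masked == int(ipaddress.ip_address(network)):
                return hop
        raise LookupError("routing error")

    print(route("130.50.15.6"))   # 130.50.15.6 AND 255.255.252.0 = 130.50.12.0 -> 130.50.12.1
    print(route("40.0.0.9"))      # matched by the 40.0.0.0 entry -> 30.0.0.7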

Have you understood?
1. IP is a forwarding protocol. Justify this statement.
2. What are the two parts of a datagram?
3. What is the purpose of the TTL field?
4. Do the address fields of the IP header change or not? Justify your answer.
5. How many octets indicate the host in a class C address?
6. What is the purpose of class D addresses?
7. What are the limitations of IP?
8. What is meant by fragmentation and reassembly?

1.7 INTERNET CONTROL MESSAGE PROTOCOL

An internetwork based on packet switching provides unreliable, connectionless service to its users; in other words, such a network is based on a best-effort service model. The system works correctly when everything goes well. However, the internet is subject to conditions that may result in failure to deliver datagrams. Communication lines and networking devices along the path may fail. Another possible reason is that the destination may be temporarily or permanently disconnected from the network. Even congestion can be a reason for failure, since congestion leads to buffer overflow and exhaustion of bandwidth: congested intermediate routers are not able to handle the incoming traffic, and the subnet starts dropping datagrams. The IP protocol itself does not provide any mechanism to give feedback to the sender about what happened to a transmitted datagram. To solve this problem, the TCP/IP protocol stack provides an adjunct protocol to IP named the Internet Control Message Protocol (ICMP). As the name implies, it is basically a messaging system: it only sends suitable messages to the sender and does not try to fix the problem. In other words, ICMP is an error reporting mechanism rather than an error correcting mechanism. Note also that ICMP messages are themselves encapsulated inside IP datagrams, so they are subject to all the risks faced by IP datagrams. Another important fact is that the ultimate destination of an ICMP message is the IP software running in another machine of the network; in other words, ICMP provides communication between the IP software running in one machine and the IP software running in another. ICMP can be used by both hosts and routers, so both routers and hosts can use ICMP messages to communicate among themselves about the errors encountered by datagrams in the network.

1.7.1 Error Reporting Mechanism


The phrase control protocol may give you the wrong idea that errors encountered in the network are kept under control by ICMP. In fact, ICMP is only an error reporting mechanism: problems faced by datagrams in the subnet cannot be corrected by ICMP itself. The purpose of ICMP is to inform the sender of a datagram about what went wrong in the subnet; it is left to the upper layers to decide what to do about the errors. The protocol specification says that ICMP may suggest possible actions to the sender, but in practice ICMP does not fully specify the action to be taken for each possible error. In most cases, it is left to the sender of the datagram to relate the ICMP message to an individual application program or to take other action to correct the problem.

When a datagram is transmitted from the source (sender) to the destination (receiver), it may encounter a problem anywhere along the path between them; the problem may well have been caused by one of the intermediate routers. However, ICMP is capable of reporting the problem only to the original sender, and this is due to the stateless nature of IP. As discussed in the previous sections, throughout its travel from source to destination a datagram retains only the source address and the destination address; datagrams do not record the complete route of their journey. Similarly, the source is not able to determine which router caused the problem. Because of the stateless nature of IP, if a router detects a problem, it cannot know the set of intermediate machines that processed the datagram. Hence the router uses ICMP to inform the original source that a problem has occurred, and trusts that host administrators will cooperate with network administrators to locate and repair the problem. Moreover, since route discovery and route maintenance take place in a distributed manner, routers can establish and change their routing tables independently of other routers. As there is no global knowledge of the routes, ICMP restricts the communication to the original source.

1.7.2 Message Delivery

Once ICMP messages are generated, they should be delivered to the appropriate destination (either a sending host or a sending router). Since ICMP is an adjunct protocol of IP, control messages are created at layer 3. Just like ordinary datagrams, ICMP messages may need to take multiple hops before reaching the destination; ICMP also makes use of the connectionless service of IP to deliver the messages. Hence ICMP messages require two levels of encapsulation, as shown in figure 1.13. Each ICMP message travels across the internet in the data portion of an IP datagram, which itself travels across each physical network in the data portion of a frame. Datagrams carrying ICMP messages are routed exactly like datagrams carrying information for users; there is no additional reliability or priority. Thus, error messages themselves may be lost or discarded. Furthermore, in an already congested network, the error message may cause additional congestion. An exception is made to the error handling procedures if an IP datagram carrying an ICMP message causes an error. The exception, established to avoid the problem of a flood of error messages, specifies that ICMP messages are not generated for errors that result from datagrams carrying ICMP error messages.

Figure 1.13. Two levels of ICMP encapsulation.


Encapsulation of ICMP messages is a classical example of encapsulating the protocol data units of one layer in the same layer itself (Layer 3). Later you will study about mobile IP where the encapsulation of IP within IP is followed.
1.7.3 Message Format


Dozens of reasons exist for failure in delivering datagrams to the destination. Hence, ICMP has to support different types of messages, and each of them may have its own format. Irrespective of this, they all begin with the same three fields: an 8-bit integer TYPE field that identifies the message, an 8-bit CODE field that provides further information about the message type, and a 16-bit CHECKSUM field. ICMP uses the same additive checksum algorithm as IP; however, the ICMP checksum covers only the ICMP message. In addition, ICMP messages that report errors always include the header and the first 64 data bits of the datagram causing the problem. The reason for returning more than the datagram header alone is to allow the receiver to determine more precisely which protocol and which application program were responsible for the datagram. Such a design works properly since higher level protocols in the TCP/IP suite are designed so that crucial information is encoded in the first 64 bits. The ICMP TYPE field defines the meaning of the message as well as its format. The types include:

Type Field    ICMP Message Type
0             Echo Reply
3             Destination Unreachable
4             Source Quench
5             Redirect (change a route)
8             Echo Request
9             Router Advertisement
10            Router Solicitation
11            Time Exceeded for a Datagram
12            Parameter Problem on a Datagram
13            Timestamp Request
14            Timestamp Reply
15            Information Request (obsolete)
16            Information Reply (obsolete)
17            Address Mask Request
18            Address Mask Reply
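As an aside, the common TYPE/CODE/CHECKSUM prefix and the additive Internet checksum are easy to reproduce in code. The following sketch (in Python, with hypothetical helper names of our own) assembles a raw ICMP message and computes the 16-bit one's complement checksum over it; it is illustrative only and not the implementation used by any particular IP stack.

import struct

def inet_checksum(data: bytes) -> int:
    """16-bit one's complement sum of all 16-bit words (pad odd length with a zero byte)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold any carry back into the low 16 bits
    return ~total & 0xFFFF

def build_icmp(msg_type: int, code: int, rest_of_header: bytes, payload: bytes = b"") -> bytes:
    """Assemble TYPE, CODE and CHECKSUM plus the type-specific remainder of the message."""
    header = struct.pack("!BBH", msg_type, code, 0) + rest_of_header
    csum = inet_checksum(header + payload)
    return struct.pack("!BBH", msg_type, code, csum) + rest_of_header + payload

A receiver can verify a message simply by running inet_checksum over the entire ICMP message; the result is 0xFFFF complemented to zero when the message is intact.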


The following sections describe the purpose of each of these types and explain the corresponding message format.
1.7.4 Testing Destination Reachability and Status (Ping)
In network administration, if the services at the application layer are not working properly, it is necessary to check the connectivity between the source and the destination at the network layer before checking the applications themselves. The TCP/IP protocol suite provides facilities to help network administrators, managers, and even users identify network problems. One of the most frequently used debugging tools is ping. The ping tool is based on ICMP echo request and echo reply messages. A host or router sends an ICMP echo request message to a specified destination. Any machine that receives an echo request formulates an echo reply and returns it to the original sender. The request contains an optional data area and the reply contains a copy of the data sent in the request. The echo request and associated reply can be used to test whether a destination is reachable and responding. Because both the request and the reply travel in IP datagrams, successful receipt of a reply verifies that major pieces of the transport system work. If the ICMP echo reply arrives in time, it implies that the IP software on the source computer routes the datagram properly, that the intermediate routers between the source and destination machines are running properly, that both the ICMP and IP software are working properly and, finally, that all routers along the return path have correct routes. Thus the correct arrival of an echo reply ensures the proper functioning of the major elements of the transport system. The format of echo request and reply messages is shown in figure 1.14. Observe that even this request/reply message has TYPE, CODE and CHECKSUM fields.

Figure 1.14. ICMP echo request or reply

The value of the TYPE field specifies whether the message is a request (8) or a reply (0). The CODE field gives further information about the message. The IDENTIFIER and SEQUENCE NUMBER fields are used by the sender to match replies to requests. The field listed as OPTIONAL DATA is a variable length field that contains data to be returned to the sender; an echo reply always returns exactly the same data as was received in the request.
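To make the use of IDENTIFIER and SEQUENCE NUMBER concrete, the sketch below shows how a ping-like tool could match an echo reply to the request it sent, by comparing those two fields and the returned data. The function names are ours and the code is a self-contained illustration, not part of any standard library.

import struct

def parse_icmp_echo(message: bytes):
    """Split an ICMP echo request/reply into its fixed fields and optional data."""
    msg_type, code, checksum, identifier, sequence = struct.unpack("!BBHHH", message[:8])
    return msg_type, code, checksum, identifier, sequence, message[8:]

def is_matching_reply(request: bytes, reply: bytes) -> bool:
    """A reply matches when TYPE is 0 and IDENTIFIER, SEQUENCE and data echo the request."""
    _, _, _, req_id, req_seq, req_data = parse_icmp_echo(request)
    rep_type, _, _, rep_id, rep_seq, rep_data = parse_icmp_echo(reply)
    return rep_type == 0 and (rep_id, rep_seq, rep_data) == (req_id, req_seq, req_data)

A real ping utility would build the request with a raw socket (which usually requires administrator privileges), time how long the matching reply takes to arrive, and report that as the round-trip time.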


When a router cannot forward or deliver an IP datagram, it sends a destination unreachable message back to the original source, using the format shown in Figure 1.15.


Figure 1.15. ICMP destination unreachable message

The CODE field in a destination unreachable message contains an integer that further describes the problem. Possible values are listed as follows.

Code Value    Meaning
0             Network unreachable
1             Host unreachable
2             Protocol unreachable
3             Port unreachable
4             Fragmentation needed and DF set
5             Source route failed
6             Destination network unknown
7             Destination host unknown
8             Source host isolated
9             Communication with destination network administratively prohibited
10            Communication with destination host administratively prohibited
11            Network unreachable for type of service
12            Host unreachable for type of service

Although IP is a best-effort delivery mechanism, discarding datagrams should not be taken lightly. IP follows the best-effort delivery model on the assumption that the underlying network is by and large able to deliver datagrams properly to the destination. Another motivation behind the best-effort model is to make the network as simple as possible. However, if datagrams are dropped or discarded frequently, the performance of the network may degrade considerably. ICMP, through its error and control messages carrying a short prefix of the datagram that caused the problem, helps the source identify what went wrong and thereby helps the sender take corrective action. Destinations may be unreachable because hardware is temporarily out of service, because the sender specified a nonexistent destination address, or because the router

does not have a route to the destination network. At this point you should understand that, although routers report failures they encounter, they may not know of all delivery failures. For example, if the destination machine connects to an Ethernet network, the network hardware does not provide acknowledgements. Therefore, a router can continue to send packets to a destination after the destination is powered down without receiving any indication that the packets are not being delivered.
1.7.5 Role of ICMP in Congestion Control

The design philosophy of IP networks is to make the network as simple as possible and to leave the issue of reliability to the end systems (hosts). This results in simple routers whose sole responsibility is forwarding datagrams across multiple hops up to the destination. An IP router does not have the facility to reserve memory or communication resources in advance of receiving datagrams. Moreover, routers have only limited buffer capacity. Hence, if too many datagrams are injected into the network by various hosts, routers are overrun with traffic and start dropping datagrams, a condition known as congestion. Congestion can arise due to several factors.

1. If, all of a sudden, streams of datagrams begin arriving on three or four input lines and all of them want the same output line, a queue builds up. If the buffer associated with that particular output line becomes full, datagrams are lost.

2. If the router's CPU is slow at operations like queuing buffers, updating tables, etc., queues can build up even though there is excess line capacity.

3. A high speed computer may be able to generate traffic faster than a network can transfer it. For example, imagine a supercomputer generating internet traffic. The datagrams may eventually need to cross a slower speed Wide Area Network (WAN) even though the supercomputer itself attaches to a high speed Local Area Network (LAN). Congestion will occur in the router that attaches the LAN to the WAN because datagrams arrive faster than they can be sent.

If the datagrams are part of a small burst, buffering may solve the problem up to a certain extent. If the traffic continues, the host or router eventually exhausts memory and must discard additional datagrams that arrive. Although efficient congestion control algorithms running at the end systems are available, it is also necessary to identify congestion in the network itself to deal with the problem more effectively. Routers use ICMP source quench messages to report congestion to the original source. A source quench message is a request for the source to reduce its current rate of datagram transmission. It is important for you to know that there is no ICMP message to reverse the effect of a source quench. Instead, a host that receives source quench messages for a destination, D, lowers the rate at which it sends datagrams to D until it stops receiving source quench messages. Then the source gradually increases the rate as long as no further source quench requests are received. The format of the source quench message is shown in figure 1.16.

Figure 1.16. ICMP source quench message format

In addition to the usual ICMP TYPE, CODE and CHECKSUM fields, and an unused 32-bit field, source quench messages have a field that contains a datagram prefix. Figure 1.16 illustrates the format. As with most ICMP messages that report an error, the datagram prefix field contains a prefix of the datagram that triggered the source quench request.
1.7.6 Role of ICMP in the Maintenance of Optimal Routes

Forwarding of datagrams takes place properly only if correct information is maintained in the routing tables of the routers. Routing tables in hosts and routers usually remain static over long periods of time. Hosts initialize them from a configuration file at system startup, and system administrators seldom make routing changes during normal operations. However, the network topology is subject to changes such as router failure or communication line failure. New hosts and routers may be added to the existing network, or a line that had been down may come up again. In all these cases the routing tables in a router or host may become incorrect. If the routing protocol used is a static one, these changes cannot be reflected in the routing tables. However, many internets follow a dynamic routing protocol, like the Routing Information Protocol (RIP) or Open Shortest Path First (OSPF), that enables the routers to exchange routing information periodically to accommodate network changes and keep their routes up-to-date. In internets, the initial host route configuration specifies the minimum possible routing information needed to communicate. The host begins with minimal information and relies on routers to update its routing table. There is a possibility that a host may use a non-optimal route in the process of data forwarding. When a router detects this, it sends the host an ICMP message, called a redirect, requesting the host to change its entry in the routing table. The router also forwards the original datagram on to its destination. The advantage of the ICMP redirect scheme is that it enables the host to maintain a small routing table that nevertheless contains optimal routes for all destinations in use.
1.7.6.1 Redirect Messages
Redirect messages do not solve the problem of propagating routes in a general way, however, because they are limited to interactions between a router and a host on

a directly connected network. Figure 1.17 illustrates the limitation. In the figure, assume source S sends a datagram to destination D. Assume that router R1 incorrectly routes the datagram through router R2 instead of through router R4 (i.e., R1 incorrectly chooses a longer path than necessary). When router R2 receives the datagram, it cannot send an ICMP redirect message to R1, because it does not know R1's address (the datagram identifies only the original source, S).

[Figure 1.17 shows source S attached to router R1, with routers R2, R3, R4 and R5 on alternative paths toward destination D.]

Figure 1.17. A routing scenario

In addition to the requisite TYPE, CODE, and CHECKSUM fields, each redirect message contains a 32-bit ROUTER INTERNET ADDRESS field and an INTERNET HEADER field, as figure 1.18 shows.

Figure 1.18. ICMP redirect message format

The ROUTER INTERNET ADDRESS field contains the address of a router that the host is to use to reach the destination mentioned in the datagram header. The INTERNET HEADER field contains the IP header plus the next 64 bits of the datagram that triggered the message. Thus, a host receiving an ICMP redirect examines the datagram prefix to determine the datagram's destination address. The CODE field of an ICMP redirect message further specifies how to interpret the destination address, based on values assigned as follows.

Code Value    Meaning
0             Redirect datagrams for the Net (now obsolete)
1             Redirect datagrams for the Host
2             Redirect datagrams for the Type of Service and Net
3             Redirect datagrams for the Type of Service and Host


As a general rule, routers only send ICMP redirects to hosts and not to other routers.
1.7.6.2 Time Exceeded Messages
Since next-hop routing is the form of routing commonly used in an internet, errors in routing tables can produce a routing cycle for some destination, D. A routing cycle can consist of two routers that each route a datagram for destination D to the other, or it can consist of several routers. Even in the case of several routers, it is ultimately an erroneous entry in one router that closes the cycle. When several routers form a cycle, they each route a datagram for destination D to the next router in the cycle, so a datagram that enters a routing cycle will circulate endlessly. We have already seen that the IP header has a field called TTL, or hop count, which is initialized to some value and then decremented on every hop. At a router, if the TTL becomes zero, the datagram is dropped. However, instead of dropping the datagram silently, the source is given an indication with the help of ICMP. Whenever a router discards a datagram because its hop count has reached zero, or because a timeout occurred while waiting for fragments of a datagram, it sends an ICMP time exceeded message back to the datagram's source, using the format shown in figure 1.19.


Figure 1.19. ICMP time exceeded message format

ICMP uses the CODE field in each time exceeded message (value zero or one) to explain the nature of the timeout being reported:

Code Value    Meaning
0             Time-to-live count exceeded
1             Fragment reassembly time exceeded

Fragment reassembly refers to the task of collecting all the fragments from a datagram. When the first fragment of a datagram arrives, the receiving host starts a timer and considers it an error if the timer expires before all the pieces of the datagram arrive. Code value 1 is used to report such errors to the sender: one message is sent for each such error.


1.7.6.3 Parameter Problem Messages
When a router or host finds problems with a datagram not covered by previous ICMP error messages, it sends a parameter problem message to the original source. One possible cause of such problems occurs when arguments to an option are incorrect. The message, formatted as shown in figure 1.20, is only sent when the problem is so severe that the datagram must be discarded.

Figure 1.20. ICMP parameter problem message format

To make the message unambiguous, the sender uses the POINTER field in the message header to identify the octet in the datagram that caused the problem. Code 1 is used to report that a required option is missing (e.g., a security option in the military community); the POINTER field is not used for code 1.
1.7.7 Clock Synchronization
The hosts on an internet operate independently, with each host maintaining its own notion of the current time, even though they communicate among themselves over the internet. Because different hosts use different clocks, applications for which time is a sensitive parameter may not provide proper service, and users of distributed systems software may be confused. This problem can be solved to a certain extent with the help of ICMP messages. One of the simplest techniques uses an ICMP message to obtain the time from another machine. A requesting machine sends an ICMP timestamp request message to another machine, asking that the second machine return its current value for the time of day. The receiving machine returns a timestamp reply back to the machine making the request. Figure 1.21 shows the format of timestamp request and reply messages.

Figure 1.21. ICMP timestamp request or reply message format

The TYPE field identifies the message as a request (13) or a reply (14); the IDENTIFIER and SEQUENCE NUMBER fields are used by the source to associate replies

with requests. The remaining fields specify times, given in milliseconds since midnight, Universal Time. The ORIGINATE TIMESTAMP field is filled in by the original sender just before the packet is transmitted, the RECEIVE TIMESTAMP field is filled immediately upon receipt of a request, and the TRANSMIT TIMESTAMP field is filled immediately before the reply is transmitted. Hosts use the three timestamp fields to compute estimates of the delay time between them and to synchronize their clocks. Because the reply includes the ORIGINATE TIMESTAMP field, a host can compute the total time required for a request to travel to a destination, be transformed into a reply, and return. Because the reply carries both the time at which the request entered the remote machine and the time at which the reply left, the host can compute the network transit time and, from that, estimate the difference between the remote and local clocks. In practice, accurate estimation of round-trip delay can be difficult and substantially restricts the utility of ICMP timestamp messages. Of course, to obtain an accurate estimate of round trip delay, one must take many measurements and average them. However, the round-trip delay between a pair of machines that connect to a large internet can vary dramatically, even over short periods of time. Furthermore, recall that because IP is a best-effort technology, datagrams can be dropped, delayed, or delivered out of order. Thus, merely taking many measurements may not guarantee consistency; sophisticated statistical analysis is needed to produce precise estimates.
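As an illustration of how the three timestamps can be combined, the sketch below computes a round-trip estimate and a rough clock offset in one common way of interpreting such an exchange. The variable names are ours: t_originate, t_receive and t_transmit are the three timestamp fields, and t_arrival is the local time at which the reply comes back.

def timestamp_estimates(t_originate: int, t_receive: int, t_transmit: int, t_arrival: int):
    """All arguments are milliseconds since midnight UT, as carried in ICMP timestamp messages."""
    # Time spent inside the remote machine between receiving the request and sending the reply.
    remote_processing = t_transmit - t_receive
    # Total elapsed time minus remote processing gives the two-way network transit time.
    round_trip = (t_arrival - t_originate) - remote_processing
    # Assuming the forward and return paths take roughly half the round trip each,
    # the remote clock appears to lead the local clock by this amount.
    clock_offset = ((t_receive - t_originate) + (t_transmit - t_arrival)) / 2
    return round_trip, clock_offset

# Example: request sent at 1000 ms, received remotely at 1600, reply sent at 1605, reply back at 1030.
print(timestamp_estimates(1000, 1600, 1605, 1030))   # -> (25, 587.5)

The symmetric-path assumption in the offset calculation is exactly why, as the text notes, many measurements and statistical analysis are needed in practice.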


1.7.8 Obtaining Subnet Mask
It is important to understand that when hosts use subnet addressing, some bits of the host portion of their IP address identify a physical network. To participate in subnet addressing, a host needs to know which bits of the 32-bit internet address correspond to the physical network and which correspond to the host identifier. The information needed to interpret the address is represented in a 32-bit quantity called the subnet mask. To learn the subnet mask used for the local network, a machine can send an address mask request message to a router and receive an address mask reply. The machine making the request can either send the message directly, if it knows the router's address, or broadcast the message if it does not. Figure 1.22 shows the format of address mask messages.

Figure 1.22. ICMP address mask request or reply message format


The TYPE field in an address mask message specifies whether the message is a request (17) or a reply (18). A reply contains the network's subnet address mask in the ADDRESS MASK field. As usual, the IDENTIFIER and SEQUENCE NUMBER fields allow a machine to associate replies with requests.
1.7.9 Route Discovery
After a host boots, it must learn the address of at least one router on the local network before it can send datagrams to destinations on other networks. ICMP supports a router discovery scheme that allows a host to discover a router address. ICMP router discovery is not the only mechanism a host can use to find a router address. The BOOTP and DHCP protocols provide the main alternative - each of these protocols provides a way for a host to obtain the address of a default router along with other bootstrap information. However, BOOTP and DHCP have a serious deficiency: the information they return comes from a database that network administrators configure manually. Thus, the information cannot change quickly. Of course, static router configuration does work well in some situations. For example, consider a network that has only a single router connecting it to the rest of the internet. There is no need for a host on such a network to dynamically discover routers or change routes. However, if a network has multiple routers connecting it to the rest of the internet, a host that obtains a default route at startup can lose connectivity if a single router crashes. More important, the host cannot detect the crash. The ICMP router discovery scheme helps in two ways. First, instead of providing a statically configured router address via a bootstrap protocol, the mechanism permits a host to obtain information directly from the router itself. Second, the mechanism uses a soft state technique with timers to prevent hosts from retaining a route after a router crashes. In this method, routers advertise their information periodically, and a host discards a route if the timer for that route expires. Since the state information expires automatically, it is called soft state. Figure 1.23 illustrates the format of the advertisement message a router sends.

Figure 1.23. ICMP router advertisement message format used with IPv4.


Besides the TYPE, CODE, and CHECKSUM fields, the message contains a field labeled NUM ADDRS that specifies the number of address entries which follow (often 1), an ADDR SIZE field that specifies the size of an address in 32-bit units (1 for IPv4 addresses), and a LIFETIME field that specifies the time in seconds a host may use the advertised address(es). The default value for LIFETIME is 30 minutes, and the default for periodic retransmission is 10 minutes, which means that a host will not discard a route if it misses a single advertisement message. The remainder of the message consists of NUM ADDRS pairs of fields, where each pair contains a ROUTER ADDRESS and an integer PRECEDENCE LEVEL for the route. The precedence value is a two's complement integer: a host chooses the route with the highest precedence. If the router and the network support multicast, a router multicasts ICMP router advertisement messages to the all-systems multicast address (i.e., 224.0.0.1). If not, the router sends the message to the limited broadcast address (i.e., the all 1's address). Of course, a host must never send a router advertisement message.
1.7.10 Router Solicitation
Although the designers provided a range of values to be used as the delay between successive router advertisements, they chose a default of 10 minutes. The value was selected as a compromise between rapid failure detection and overhead. A smaller value would allow more rapid detection of router failure, but would increase network traffic; a larger value would decrease traffic but would delay failure detection. One of the issues the designers considered was how to accommodate a large number of routers on the same network. From the point of view of a host, the default delay has a severe disadvantage: a host cannot afford to wait many minutes for an advertisement when it first boots. To avoid such delays, the designers included an ICMP router solicitation message that allows a host to request an immediate advertisement. Figure 1.24 illustrates the message format.


Figure 1.24. ICMP router solicitation message

If a host supports multicasting, the host sends the solicitation to the all-routers multicast address (i.e., 224.0.0.2); otherwise the host sends the solicitation to the limited broadcast address (i.e., the all 1s address). The arrival of a solicitation message causes a router to send a normal router advertisement. As the figure shows, the solicitation does not need to carry information beyond the TYPE, CODE, and CHECKSUM fields.
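As an illustration of the advertisement format in figure 1.23, the sketch below parses the address/precedence pairs that a host would examine after sending a solicitation. It assumes IPv4 advertisements (ADDR SIZE of 1) and is our own illustrative code, not taken from any particular implementation.

import socket
import struct

def parse_router_advertisement(message: bytes):
    """Return (lifetime_seconds, [(router_address, precedence), ...]) from an ICMP type 9 message."""
    msg_type, code, checksum, num_addrs, addr_size, lifetime = struct.unpack("!BBHBBH", message[:8])
    if msg_type != 9 or addr_size != 1:
        raise ValueError("not an IPv4 router advertisement")
    entries = []
    offset = 8
    for _ in range(num_addrs):
        addr, precedence = struct.unpack("!4si", message[offset:offset + 8])
        entries.append((socket.inet_ntoa(addr), precedence))   # precedence is a signed (two's complement) integer
        offset += 8
    return lifetime, entries

# A host would install the advertised router with the highest precedence as its default route,
# and discard it if no new advertisement arrives before the lifetime expires (soft state).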

Have you understood?
1. ICMP is an adjunct protocol of IP. Justify this statement.
2. What is the encapsulation procedure followed for ICMP messages?
3. What are the common fields of all ICMP messages?
4. Is ICMP an error reporting mechanism or an error correcting mechanism? Justify your statement.
5. What are the ICMP messages used by the ping utility?
6. What is the relationship between the TYPE and CODE fields of ICMP messages?
7. What is the purpose of the ICMP source quench message?
8. When does a router send an ICMP redirect message to the source?
9. How does ICMP achieve clock synchronization?
10. What is the role of the ICMP router advertisement message?
1.8 NECESSITY OF TRANSPORT LAYER

A generic internet or the specific Internet is basically a packet switched network. The TCP/IP reference model suggests that it is enough for the network layer to provide a simple, connectionless and unreliable service. Hence in an internet, packets can be lost or destroyed when transmission errors interfere with data, when network hardware fails, or when networks become too heavily loaded to accommodate the load presented. Moreover, an internet based on packet switching can deliver datagrams out of order, deliver them after a substantial delay, or deliver duplicates. Furthermore, underlying network technologies may dictate an optimal packet size or pose other constraints needed to achieve efficient transfer rates. In short, an internet is unreliable. However, users expect reliable services from the network. Since the network is not capable of providing reliability, the responsibility is left to the end systems (hosts). Applications running at the end systems cannot be built directly over the internet or network layer due to the inherent limitations of the network. So in between the application layer and the network layer, another layer or abstraction called the transport layer is required. This transport layer has to compensate for all the limitations of the network and shield the applications or users from the implementation details of reliability. One question that may arise in your mind is why the application layer itself cannot take care of the reliability issues. That is not a reasonable expectation, since most application programmers do not have the necessary technical background.
1.8.1 Requirements of a Reliable Service

It becomes necessary for the transport layer to provide the following features to implement reliable and effective services.


1. Stream Orientation
The underlying network deals in terms of datagrams and not in terms of bits and bytes. However, if the network has to provide a reliable service, it is necessary to ensure that the receiver receives the octets (bytes) in exactly the sequence in which the sender sent them. Note that preserving the sequence at the octet level does not require the transport layer entities to preserve message boundaries.
2. Virtual Circuit at Transport Level
The ISO/OSI reference model suggests that the network layer can follow either a datagram subnet or a virtual circuit subnet. However, according to the TCP/IP recommendations the network layer need not support a virtual circuit subnet; the internet layer provides only a datagram subnet. Hence it becomes necessary for the transport layer entities of the TCP/IP reference model to emulate a telephone call, and such an emulated call is called a virtual circuit. Protocol software modules in the two operating systems communicate by sending messages across an internet, verifying that the transfer is authorized and that both sides are ready. Once all details have been settled, the protocol modules inform the application programs that a connection has been established and that transfer can begin. During transfer, protocol software on the two machines continues to communicate to verify that data is received correctly. If the communication fails for any reason, both machines detect the failure and report it to the appropriate programs.
3. Buffered Transfer
It is not advisable to send and receive datagrams at the same rate as generated by the application at the sender side, since it may result in ineffective utilization of network resources. To make transfer more efficient and to minimize network traffic, transport layer entities usually collect enough data from a stream to fill a reasonably large datagram before transmitting it across an internet. Thus, even if the application program generates the stream one octet at a time, transfer across an internet may be quite efficient. Similarly, if the application program chooses to generate extremely large blocks of data, the protocol software can choose to divide each block into smaller pieces for transmission.
4. Unstructured Stream
It is important to understand that the TCP/IP stream service does not honor structured data streams. For example, there is no way for a payroll application to have the stream service mark boundaries between employee records, or to identify the contents of the stream as being payroll data. Application programs using the stream service must understand stream content and agree on stream format before they initiate a connection.
5. Full Duplex Connection Connections provided by the TCP/IP stream service allow concurrent transfer in both directions. Such connections are called full duplex. From the point of view of an application process, a full duplex connection consists of two independent streams flowing in opposite directions, with no apparent interaction. The stream service allows an application process to terminate flow in one direction while data continues to flow in the other direction, making the connection half duplex. The advantage of a full duplex connection is that the underlying protocol software can send control information for one stream back to the source in datagrams carrying data in the opposite direction. This ability is called piggybacking and piggybacking reduces network traffic. 1.8.2 Providing Reliability

The major issue in the network design is that the transport layer (mostly functioning at end systems) has to provide reliability over the unreliable service provided by the network layer. The basic mechanism followed by transport layer entities to provide reliability is positive acknowledgement with retransmission. The technique requires a recipient to communicate with the source, sending back an acknowledgement (ACK) message as it receives data. The sender keeps a record of each packet it sends and waits for an acknowledgement before sending the next packet. The sender also starts a timer when it sends a packet and retransmits a packet if the timer expires before an acknowledgement arrives. Figure 1.25 shows how the simplest positive acknowledgement protocol transfers data.

Figure 1.25. Positive acknowledgement with retransmission


In the figure 1.25, events at the sender and receiver are shown on the left and right. Each diagonal line crossing the middle shows the transfer of one message across the network. Figure 1.26 uses the same format diagram as figure 1.25 to show what happens when a packet is lost or corrupted. The sender starts a timer after transmitting a packet. When the timer expires, the sender assumes the packet was lost and retransmits it.


Figure 1.26. Timeout and retransmission that occurs when a packet is lost
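To make the behaviour of figures 1.25 and 1.26 concrete, here is a minimal, self-contained sketch of positive acknowledgement with retransmission. It is a toy simulation of our own (the loss rate and helper names are invented), not any particular protocol implementation: the sender transmits one packet at a time and repeats the transmission whenever the acknowledgement fails to arrive.

import random

def unreliable_send(packet, loss_rate=0.3):
    """Simulated channel: returns an ACK for the packet unless the packet or ACK is 'lost'."""
    if random.random() < loss_rate:
        return None                      # models a timer expiry with no acknowledgement
    return ("ACK", packet["seq"])        # receiver acknowledges the sequence number

def stop_and_wait_send(data_items):
    """Send items one at a time; retransmit an item until its acknowledgement arrives."""
    seq = 0
    for item in data_items:
        packet = {"seq": seq, "data": item}
        attempts = 0
        while True:
            attempts += 1
            ack = unreliable_send(packet)
            if ack == ("ACK", seq):
                print(f"seq {seq} acknowledged after {attempts} attempt(s)")
                break                    # move on to the next packet only after the ACK
        seq += 1

stop_and_wait_send(["alpha", "beta", "gamma"])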

The final reliability problem arises when an underlying packet delivery system duplicates packets. Solving duplication requires careful thought because both packets and acknowledgements can be duplicated. Usually, reliable protocols detect duplicate packets by assigning each packet a sequence number and requiring the receiver to remember which sequence numbers it has received. To avoid confusion caused by delayed or duplicated acknowledgements, positive acknowledgement protocols send sequence numbers back in acknowledgements, so that acknowledgements can be correctly associated with the packets they refer to.
1.8.3 Sliding Window Algorithms
The major functions required to provide reliability are flow control and error control; error control is performed along with flow control as an adjunct process. Even though we did not say so explicitly, the mechanisms explained in figures 1.25 and 1.26 are stop and wait algorithms. The major limitations of the stop and wait algorithm are


the inefficient usage of bandwidth and the wastage of time. Both of these limitations are effectively overcome in sliding window algorithms. The easiest way to understand the sliding window algorithm is to consider the stop and wait protocol as a sliding window protocol with a window size of 1. In sliding window algorithms, the sender can send more than one segment (the maximum number is restricted by the size of the window). Sliding window protocols use network bandwidth better because they allow the sender to transmit multiple packets before waiting for an acknowledgement. The functioning of the sliding window protocol is explained in figure 1.27.

Figure 1.27. An example of three packets transmitted using a sliding window protocol

Figure 1.27 shows an example of the operation of a sliding window protocol when sending three packets. Note that the sender transmits all three packets before receiving any acknowledgement. We say that a packet is unacknowledged if it has been transmitted but no acknowledgement has been received. Technically, the number of packets that can be unacknowledged at any given time is constrained by the window size and is limited to a small, fixed number. For example, in a sliding window protocol with window size 8, the sender is permitted to transmit 8 packets before it receives an acknowledgement. As the figure shows, once the sender receives an acknowledgement for the first packet inside the window, it slides the window along and sends the next packet. The window continues to slide as long as acknowledgements are received.
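The following toy sender is a sketch of this window behaviour (our own illustration, not TCP itself): it keeps at most window_size unacknowledged packets outstanding and slides the window forward as cumulative acknowledgements arrive. A real protocol would also run a retransmission timer for each outstanding packet, which is omitted here for brevity.

from collections import deque

def sliding_window_send(packets, window_size, ack_stream):
    """packets: list of payloads; ack_stream: iterable of cumulative ACK numbers received."""
    base = 0                 # lowest unacknowledged sequence number (left edge of the window)
    next_seq = 0             # next sequence number to transmit
    outstanding = deque()
    acks = iter(ack_stream)
    while base < len(packets):
        # Fill the window: transmit while fewer than window_size packets are unacknowledged.
        while next_seq < len(packets) and next_seq - base < window_size:
            print(f"transmit seq={next_seq}")
            outstanding.append(next_seq)
            next_seq += 1
        # Wait for the next cumulative acknowledgement and slide the window past it.
        ack = next(acks)
        while outstanding and outstanding[0] <= ack:
            outstanding.popleft()
        base = max(base, ack + 1)
        print(f"ACK {ack} received, window now starts at {base}")

# Example: 6 packets, window of 3, receiver acknowledges each packet in order.
sliding_window_send(["p0", "p1", "p2", "p3", "p4", "p5"], 3, [0, 1, 2, 3, 4, 5])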

Conceptually, a sliding window protocol always remembers which packets have been acknowledged and keeps a separate timer for each unacknowledged packet. If a packet is lost, its timer expires and the sender retransmits that packet. When the sender slides its window, it moves past all acknowledged packets. At the receiving end, the protocol software keeps an analogous window, accepting and acknowledging packets as they arrive. Thus, the window partitions the sequence of packets into three sets: those packets to the left of the window have been successfully transmitted, received and acknowledged; those packets to the right have not yet been transmitted; and those packets that lie in the window are being transmitted. The lowest numbered packet in the window is the first packet in the sequence that has not been acknowledged.
1.8.4 The Transmission Control Protocol


Most of the applications of the Internet have to provide reliable services to the users. Some popular examples are e-mail, file transfer and the World Wide Web. The reliability required by these applications is provided by a particular protocol of the transport layer called the Transmission Control Protocol (TCP). Since reliability is the major requirement of many applications, the whole protocol stack is referred to as TCP/IP, along with the dominant protocol of the internet layer. TCP is a communication protocol, not a piece of software, and it is important to understand the difference between a protocol and the software that implements it. The difference is analogous to the difference between the definition of a programming language and a compiler. The best way to understand a protocol is to learn what the protocol does and what it does not do. The protocol specifies the format of the data and acknowledgements that two computers exchange to achieve a reliable transfer, as well as the procedures the computers use to ensure that the data arrives correctly. It specifies how TCP software distinguishes among multiple destinations on a given machine, and how communicating machines recover from errors like lost or duplicated packets. The protocol also specifies how two computers initiate a TCP stream transfer and how they agree when it is complete. It is also important to understand what the protocol does not include. Although the TCP specifications describe how application programs use TCP in general terms, they do not dictate the details of the interface between an application program and TCP. That is, the protocol documentation only discusses the operations TCP supplies; it does not specify the exact procedures application programs invoke to access these operations. The reason for leaving the application program interface unspecified is flexibility. In particular, because programmers usually implement TCP in the computer's operating system, they need to employ whatever interfaces the operating system supplies. Allowing the implementor flexibility makes it possible to have a single specification for TCP that can be used to build software for a variety of machines. Because TCP assumes little about the underlying communication system, it can be used with a variety of packet delivery systems, including the IP datagram delivery service.
1.8.5 TCP Segment Format

The unit of transfer between the TCP software on two machines is called a segment. Segments are exchanged to establish connections, transfer data, send acknowledgements, advertise window sizes, and close connections. Because TCP uses piggybacking, an acknowledgement traveling from machine A to machine B may travel in the same segment as data traveling from machine A to machine B, even though the acknowledgement refers to data sent from B to A. Similar to an IP datagram, a TCP segment has two parts, namely a header and a payload. Figure 1.28 shows the TCP segment format.

Figure 1.28. The format of a TCP segment

SOURCE PORT and DESTINATION PORT - contain the TCP port numbers that identify the application programs at the ends of the connection.
SEQUENCE NUMBER - identifies the position in the sender's byte stream of the data in the segment.
ACKNOWLEDGEMENT NUMBER - identifies the number of the octet that the source expects to receive next. Observe that the sequence number refers to the stream flowing in the same direction as the segment, while the acknowledgement number refers to the stream flowing in the opposite direction from the segment.
HLEN - contains an integer that specifies the length of the segment header measured in 32-bit multiples. As with IP, this field is required because the options field can vary in length.

OPTIONS - varies in length based on the service to be provided to the user. This field provides a way to add extra facilities not covered by the regular header. The most important option is the one that allows each host to specify the maximum TCP payload it is willing to accept. Using large segments is more efficient than using small ones, because the 20-byte header can then be amortized over more data, but small hosts may not be able to handle big segments.
RESERVED - the 6-bit field marked RESERVED is reserved for future use.
CODE BITS - these bits are also called flag bits. TCP segments are used not only to carry data but also to carry the control information required in the process of connection establishment and connection release. Some segments carry only an acknowledgement while some carry data; others carry requests to establish or close a connection. TCP software uses this field to determine the purpose and contents of the segment. The six bits tell how to interpret other fields in the header, according to the table in figure 1.29.


Bit (left to right)    Meaning if bit set to 1
URG                    Urgent pointer field is valid
ACK                    Acknowledgement field is valid
PSH                    This segment requests a push
RST                    Reset the connection
SYN                    Synchronize sequence numbers to initiate connection
FIN                    Sender has reached end of its byte stream and wants to close the connection

Figure 1.29. Bits of the CODE field in the TCP header

WINDOW - TCP software uses this field to advertise how much data it is willing to accept every time it sends a segment, by specifying its buffer size. The field contains a 16-bit unsigned integer in network-standard byte order. Window advertisements provide another example of piggybacking, because they accompany all segments, including those carrying data as well as those carrying only an acknowledgement.
CHECKSUM - covers the header, the data, and the conceptual pseudo header shown in figure 1.30. When performing this computation, the TCP checksum field is set to zero and the data is padded with an additional zero byte if its length is an odd number. The checksum algorithm is simply to add up all the 16-bit words in one's complement and then take the one's complement of the sum. As a consequence, when the receiver performs the calculation on the entire segment, including the checksum field, the result should be zero.


Figure 1.30. Pseudo header included in the checksum
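To illustrate the arithmetic, the sketch below computes a TCP checksum over the pseudo header of figure 1.30 followed by the TCP header and data, as described above. It is a demonstration only; the addresses, port numbers and payload in the example are made up.

import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit words with end-around carry; pad odd-length data with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)
    return total

def tcp_checksum(src_ip: str, dst_ip: str, tcp_segment: bytes) -> int:
    # Pseudo header: source address, destination address, zero, protocol (6), TCP length.
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                         0, socket.IPPROTO_TCP, len(tcp_segment))
    return ~ones_complement_sum(pseudo + tcp_segment) & 0xFFFF

# A 20-byte TCP header with the checksum field set to zero, followed by a small payload.
header = struct.pack("!HHIIBBHHH", 1024, 80, 0, 0, 5 << 4, 0x02, 8192, 0, 0)
payload = b"hello"
print(hex(tcp_checksum("192.0.2.1", "192.0.2.2", header + payload)))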

URGENT POINTER - indicates a byte offset from the current sequence number at which urgent data are to be found. This facility is provided in lieu of interrupt messages; it is a bare-bones way of allowing the sender to signal the receiver without getting TCP itself involved in the reason for the interrupt. There are occasions in which an application program needs to send urgent bytes, meaning that the sending application wants a piece of data to be read out of order by the receiving application. The sending application tells the sending TCP that the piece of data is urgent. The sending TCP creates a segment and inserts the urgent data at the beginning of the segment.
1.9 TCP STATE MACHINE
1.9.1 TCP Connection Establishment

To establish a connection, TCP uses a three-way handshake. The three-way handshake is shown in figure 1.31.

Figure 1.31. Three way handshake protocol


The first segment of a handshake can be identified because it has the SYN (synchronization) bit set in the code field. The second message has both the SYN and ACK (acknowledgement) bits set, indicating that it acknowledges the first SYN segment as well as continuing the handshake. The final handshake message is only an acknowledgement and is merely used to inform the other side that both parties agree that a connection has been established. Usually the TCP software on one machine waits passively for the handshake, and the TCP software on the other machine initiates it. However, the handshake is carefully designed to work even if both machines attempt to initiate a connection simultaneously. Once a connection has been established, data can flow in both directions equally well; there is no master or slave. The three-way handshake is both necessary and sufficient for correct synchronization between the two ends of the connection. To understand why, remember that TCP builds on an unreliable packet delivery service, so messages can be lost, delayed, duplicated or delivered out of order. Thus the protocol must use a timeout mechanism and retransmit lost requests. Trouble arises if retransmitted and original requests arrive while the connection is being established, or if retransmitted requests are delayed until after a connection has been established, used and terminated. A three-way handshake (plus the rule that TCP ignores additional requests for a connection after a connection has been established) solves these problems.
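In practice, application programs never build these segments themselves: the operating system's TCP carries out the handshake when one side performs a passive open and the other an active open. A minimal sketch using Python's standard socket API is shown below; the loopback address and port 5050 are arbitrary choices for the example.

import socket
import threading

def passive_open(ready):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", 5050))
        srv.listen(1)                      # passive open: wait for an incoming SYN
        ready.set()
        conn, peer = srv.accept()          # returns once the three-way handshake completes
        with conn:
            print("server side: connection established with", peer)

ready = threading.Event()
t = threading.Thread(target=passive_open, args=(ready,))
t.start()
ready.wait()                               # make sure the listener exists before connecting

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect(("127.0.0.1", 5050))       # active open: SYN, SYN+ACK, then the final ACK
    print("client side: connection established")
t.join()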


1.9.2 TCP Connection Release
Two programs that use TCP to communicate can terminate the conversation gracefully using the close operation. Internally, TCP uses a modified three-way handshake to close connections. TCP connections are full duplex and can be viewed as containing two independent stream transfers, one going in each direction. When an application program tells TCP that it has no more data to send, TCP will close the connection in one direction. To close its half of a connection, the sending TCP finishes transmitting the remaining data, waits for the receiver to acknowledge it, and then sends a segment with the FIN (finish) bit set. The receiving TCP acknowledges the FIN segment and informs the application program on its end that no more data is available. Once a connection is closed in one direction, TCP refuses to accept more data for that direction. Meanwhile, data can continue to flow in the opposite direction until the sender closes it. Of course, acknowledgements continue to flow back to the sender even after a connection has been closed. When both directions have been closed, the TCP software at each endpoint deletes its record of the connection. Figure 1.32 illustrates the procedure.


Figure 1.32 TCP Connection Release

The difference between the three-way handshakes used to establish and break connections occurs after a machine receives the initial FIN segment. Instead of generating a second FIN segment immediately, TCP sends an acknowledgement and then informs the application of the request to shut down. Informing the application program of the request and obtaining a response may take considerable time; the acknowledgement prevents retransmission of the initial FIN segment during the wait. Finally, when the application program instructs TCP to shut down the communication completely, TCP sends the second FIN segment and the original site replies with the third message, an ACK.
1.9.3 TCP Connection Reset

Normally, an application program uses the close operation to shut down a connection when it finishes using it. Thus, closing connections is considered a normal part of use, analogous to closing files. Sometimes abnormal conditions arise that force an application program or the network software to break a connection. TCP provides a reset facility for such abnormal disconnections. To reset a connection, one side initiates termination by sending a segment with the RST bit in the CODE field set. The other side responds to a reset segment immediately by aborting the connection. TCP also informs the application program that a reset occurred. A reset is an instantaneous abort, which means that transfer in both directions ceases immediately and resources such as buffers are released.
1.9.4 Finite State Machine

Like most protocols, the operation of TCP can best be explained with a theoretical model called a finite state machine. Figure 1.33 shows the TCP finite state machine, with circles representing states and arrows representing transitions between them. The label on each transition shows what TCP receives to cause the transition and what it sends in response. For example, the TCP software at each end point begins in the

CLOSED state. Application programs must issue either a passive open command (to wait for a connection from another machine), or an active open command (to initiate a connection). An active open command forces a transition from the CLOSED state to the SYN SENT state. When TCP follows the transition, it emits a SYN segment. When the other end returns a segment that contains a SYN plus ACK, TCP moves to the ESTABLISHED state and begins data transfer. The TIMED WAIT state reveals how TCP handles some of the problems incurred with unreliable delivery. TCP keeps a notion of the maximum segment lifetime (MSL), the maximum time an old segment can remain alive in an internet. To avoid having segments from a previous connection interfere with the current one, TCP moves to the TIMED WAIT state after closing a connection. It remains in that state for twice the maximum segment lifetime before deleting its record of the connection. If any duplicate segments happen to arrive for the connection during this interval, TCP will reject them. However, to handle cases where the last acknowledgement was lost, TCP acknowledges valid segments and restarts the timer. Because the timer allows TCP to distinguish old connections from new ones, it prevents TCP from responding with a RST (reset) if the other end retransmits a FIN request.


[Figure: TCP finite state machine with the states CLOSED, LISTEN, SYN SENT, SYN RCVD, ESTABLISHED, FIN WAIT-1, FIN WAIT-2, CLOSE WAIT, CLOSING, LAST ACK and TIMED WAIT, and labelled transitions between them such as passive open, active open/syn, syn/syn+ack, close/fin, fin/ack, ack and timeout/reset.]

Figure 1.33 TCP Finite State Machine
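The state diagram can be captured in code as a simple transition table. The sketch below models only a subset of the transitions in figure 1.33; the event names are our own shorthand and this is an illustration, not a complete TCP implementation.

# (current state, event) -> next state, for a subset of the TCP finite state machine
TRANSITIONS = {
    ("CLOSED",      "passive_open"): "LISTEN",
    ("CLOSED",      "active_open"):  "SYN_SENT",
    ("LISTEN",      "recv_syn"):     "SYN_RCVD",
    ("SYN_SENT",    "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD",    "recv_ack"):     "ESTABLISHED",
    ("ESTABLISHED", "close"):        "FIN_WAIT_1",
    ("ESTABLISHED", "recv_fin"):     "CLOSE_WAIT",
    ("FIN_WAIT_1",  "recv_ack"):     "FIN_WAIT_2",
    ("FIN_WAIT_2",  "recv_fin"):     "TIMED_WAIT",
    ("CLOSE_WAIT",  "close"):        "LAST_ACK",
    ("LAST_ACK",    "recv_ack"):     "CLOSED",
    ("TIMED_WAIT",  "timeout_2msl"): "CLOSED",
}

def run(events, state="CLOSED"):
    for event in events:
        state = TRANSITIONS.get((state, event), state)   # unknown events leave the state unchanged
        print(f"{event:>14} -> {state}")
    return state

# An active open followed by a local close traces one side of a connection through the machine.
run(["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout_2msl"])

Note how the final timeout_2msl event models the wait of twice the maximum segment lifetime described above before the record of the connection is deleted.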


Have you understood?
1. Why is a three way handshake necessary in TCP connection establishment?
2. What are the various control segments involved in TCP connection establishment?
3. Closing a connection is more subtle than establishment of a connection. Justify this statement.
4. Identify the various states in the process of connection establishment and connection release.
5. Is reliability provided by IP or not?
6. What are the requirements of a reliable service?
7. What is meant by positive acknowledgement with retransmission?
8. What is meant by expectational acknowledgement?
10. What are the advantages of the simple stop and wait flow control scheme?
11. What is meant by sliding window in TCP?
12. What are the bits of the CODE field in TCP?
13. What does a port number refer to in TCP?
14. What is the purpose of the window field in the TCP header format?
15. What is meant by the conceptual pseudo header in the checksum computation of TCP?
1.10 TIMER MANAGEMENT OF TCP

As we have already seen, the basic mechanism followed by TCP to ensure reliability is the acknowledgement. Every time the TCP entity of the sender sends a segment, it starts a timer and waits for an acknowledgement from the TCP entity of the receiver. Like other reliable protocols, TCP expects the destination to send acknowledgements whenever it successfully receives new octets from the data stream. If the timer expires before the data in the segment has been acknowledged, TCP assumes that the segment was lost or corrupted and retransmits it. Even though retransmissions consume considerable bandwidth, they are necessary to ensure that the segments have really reached the other side.
1.10.1 Basic Timer Management Algorithm
The basic idea in retransmission is that the sender has to wait for a certain amount of time before choosing the option of retransmission. The issue here is how long the sender should wait before going for retransmission. The following scenarios help you to understand the difficulty in deciding the timeout period. In an internet, a segment traveling between a pair of machines may traverse a single, low delay network (e.g., a high speed LAN), or it may travel across multiple intermediate networks through multiple routers. Thus, it is impossible to know a priori how quickly acknowledgements will return to the source. Furthermore, the delay at each router depends on traffic, so the total time required for a segment to travel to the destination and an acknowledgement

to return to the source varies dramatically from one instant to the next. If we measure the round trip time of various segments transmitted over a period of time, we can observe considerable variation in the RTT. Hence, TCP must be able to accommodate these wide variations in RTT and time out at the appropriate time. TCP accommodates varying internet delays by using an adaptive retransmission algorithm. In essence, TCP monitors the performance of each connection and deduces a reasonable value for timeouts. As the performance of a connection changes, TCP revises its timeout value (i.e., it adapts to the change). To collect the data needed for an adaptive algorithm, TCP records the time at which each segment is sent and the time at which an acknowledgement arrives for the data in the segment. From the two times, TCP computes an elapsed time known as a sample round trip time, or round trip sample. Whenever it obtains a new round trip sample, TCP adjusts its notion of the average round trip time for the connection. Usually, TCP software stores the estimated round trip time, RTT, as a weighted average and uses new round trip samples to change the average slowly. For example, when computing a new weighted average, one early averaging technique used a constant weighting factor, α, where 0 ≤ α < 1, to weight the old average against the latest round trip sample as shown in equation 1.1.

RTT = (α * Old_RTT) + ((1 − α) * New_Round_Trip_Sample)        (1.1)


Choosing a value for α close to 1 makes the weighted average immune to changes that last a short time (e.g., a single segment that encounters a long delay). Choosing a value for α close to 0 makes the weighted average respond to changes in delay very quickly. When it sends a packet, TCP computes a timeout value as a function of the current round trip estimate. Early implementations of TCP used a constant weighting factor, β (β > 1), and made the timeout greater than the current round trip estimate as shown in equation 1.2.

Timeout = β * RTT        (1.2)

Choosing a value for β can be difficult. On one hand, to detect packet loss quickly, the timeout value should be close to the current round trip time (i.e., β should be close to 1). Detecting packet loss quickly improves throughput because TCP will not wait an unnecessarily long time before retransmitting. On the other hand, if β = 1, TCP is overly eager: any small delay will cause an unnecessary retransmission, which wastes network bandwidth. The original specification set β = 2; more recent work, described below, has produced better techniques for adjusting the timeout. To accommodate the varying delays encountered in internet environments, TCP uses an adaptive retransmission algorithm that monitors delays on each connection and adjusts its timeout parameter accordingly.
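A small sketch of the adaptive scheme of equations 1.1 and 1.2 is shown below, with α and β as plain parameters. The specific values and class name are only illustrative.

class RttEstimator:
    """Weighted-average RTT estimation with a multiplicative timeout (equations 1.1 and 1.2)."""

    def __init__(self, initial_rtt=1.0, alpha=0.875, beta=2.0):
        self.rtt = initial_rtt     # estimated round trip time, in seconds
        self.alpha = alpha         # weight given to the old estimate (0 <= alpha < 1)
        self.beta = beta           # timeout multiplier (beta > 1)

    def add_sample(self, sample):
        # Equation 1.1: the new estimate is a weighted average of the old estimate and the sample.
        self.rtt = self.alpha * self.rtt + (1 - self.alpha) * sample
        return self.rtt

    def timeout(self):
        # Equation 1.2: the retransmission timeout is a multiple of the current estimate.
        return self.beta * self.rtt

est = RttEstimator()
for sample in (0.9, 1.4, 1.1, 3.0):      # measured round trip samples, in seconds
    est.add_sample(sample)
    print(f"sample={sample:.1f}s  RTT={est.rtt:.3f}s  timeout={est.timeout():.3f}s")

With α close to 1 (as here), the single large sample of 3.0 seconds nudges the estimate only slightly, illustrating the immunity to short-lived changes mentioned above.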


1.10.2 Refinements in the Basic Timer Mechanism
One problem that occurs with the dynamic estimation of RTT is what to do when a segment times out and is sent again. When the acknowledgement comes in, it is unclear whether the acknowledgement refers to the first transmission or to a later one. Guessing wrong can seriously contaminate the estimate of RTT. Figures 1.34 and 1.35 explain the consequences of wrong estimation.

Figure 1.34. Associating the ACK with original transmission

Figure 1.35. Associating the ACK with retransmission

The above two figures reveal the fact that associating the acknowledgement with the original transmission can make the estimated round trip grow without bound in cases where an internet loses datagrams. If an acknowledgement arrives after one or more retransmissions, TCP will measure the round trip sample from the original transmission, and compute a new RTT using the excessively long sample. Thus, RTT will grow slightly. The next time TCP sends a segment, the larger RTT will result in slightly
longer timeouts, and so if an acknowledgement arrives after one or more retransmissions, the next sample round trip time will be even larger, and so on. Associating the acknowledgement with the most recent retransmission can also fail. Consider what happens when the end-to-end delay suddenly increases. When TCP sends a segment, it uses the old round trip estimate to compute the timeout, which is now too small. The segment arrives and an acknowledgement starts back, but the increase in delay means the timer expires before the acknowledgement arrives, and TCP retransmits the segment. Shortly after TCP retransmits, the first acknowledgement arrives and is associated with the retransmission. The round trip sample will be much too small and will result in a slight decrease of the estimated round trip time, RTT. Unfortunately, lowering the estimated round trip time guarantees that TCP will set the timeout too small for the next segment. Ultimately, the estimated round trip time can stabilize at a value T, such that the correct round trip is slightly longer than some multiple of T. Implementations of TCP that associate acknowledgements with the most recent retransmission have been observed to send each segment exactly twice even though no loss occurs.
1.10.3 Karn's Algorithm and Timer Backoff
If the original transmission and the most recent transmission both fail to provide accurate round trip times, what should TCP do? The accepted answer is simple: TCP should not update the round trip estimate for retransmitted segments. The idea, known as Karn's Algorithm, avoids the problem of ambiguous acknowledgements altogether by only adjusting the estimated round trip for unambiguous acknowledgements (acknowledgements that arrive for segments that have only been transmitted once). Of course, a simplistic implementation of Karn's algorithm, one that merely ignores times from retransmitted segments, can lead to failures as well. Consider what happens when TCP sends a segment after a sharp increase in delay. TCP computes a timeout using the existing round trip estimate. The timeout will be too small for the new delay and will force retransmission. If TCP ignores acknowledgements from retransmitted segments, it will never update the estimate and the cycle will continue. To accommodate such failures, Karn's algorithm requires the sender to combine retransmission timeouts with a timer backoff strategy. The backoff technique computes an initial timeout using a formula like the one shown. However, if the timer expires and causes a retransmission, TCP increases the timeout (to keep timeouts from becoming ridiculously long, most implementations limit increases to an upper bound that is larger than the delay along any path in the internet). Implementations use a variety of techniques to compute backoff. Most choose a multiplicative factor, γ, and set the new value as shown in equation 1.3.

New_timeout = γ * timeout        (1.3)

Typically, γ is 2. (It has been argued that values of γ less than 2 lead to instabilities.) Other implementations use a table of multiplicative factors, allowing arbitrary backoff at each step. Karn's algorithm combines the backoff technique with round trip estimation to solve the problem of never increasing round trip estimates. When computing the round trip estimate, ignore samples that correspond to retransmitted segments, but use a backoff strategy, and retain the timeout value from a retransmitted packet for subsequent packets until a valid sample is obtained. Generally speaking, when an internet misbehaves, Karn's algorithm separates computation of the timeout value from the current round trip estimate. It uses the round trip estimate to compute an initial timeout value, but then backs off the timeout on each retransmission until it can successfully transfer a segment. When it sends subsequent segments, it retains the timeout value that results from backoff. Finally, when an acknowledgement arrives corresponding to a segment that did not require retransmission, TCP recomputes the round trip estimate and resets the timeout accordingly. Experience shows that Karn's algorithm works well even in networks with high packet loss.
1.10.4 Jacobson/Karels Algorithm
Jacobson and Karels suggested that TCP implementations should estimate both the average round trip time and the variance, and use the estimated variance in place of the constant β. Estimation of the timeout period involves the following set of equations.

DIFF = SAMPLE - Old_RTT        (1.4)
Smoothed_RTT = Old_RTT + δ * DIFF        (1.5)
DEV = Old_DEV + ρ * (|DIFF| - Old_DEV)        (1.6)
Timeout = Smoothed_RTT + η * DEV        (1.7)

where DEV is the estimated mean deviation, δ is a fraction between 0 and 1 that controls how quickly the new sample affects the weighted average, ρ is a fraction between 0 and 1 that controls how quickly the new sample affects the mean deviation, and η is a factor that controls how much the deviation affects the round trip timeout. To make the computation efficient, TCP chooses δ and ρ to each be an inverse of a power of 2, scales the computation by 2^n for an appropriate n, and uses integer arithmetic.

Have you understood?
1. What is the basis for providing reliability in TCP?
2. When does TCP decide to retransmit a segment?
3. What are the difficulties in measuring RTT?
4. How are RTT and Timeout estimated in the original implementation of TCP?
5. What is the suggestion proposed by Karn to improve the measurement of RTT?
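Putting the pieces of this section together, the following sketch combines Jacobson/Karels estimation (equations 1.4 to 1.7), Karn's rule of ignoring samples from retransmitted segments, and exponential timer backoff (equation 1.3). It is a simplified model rather than production TCP code; the gain values δ = 1/8, ρ = 1/4 and η = 4, the 60 second upper bound, and the method names are assumptions made only for illustration.

```python
class AdaptiveRetransmissionTimer:
    """Sketch of Jacobson/Karels estimation with Karn's rule and timer backoff."""

    def __init__(self, delta=0.125, rho=0.25, eta=4.0, gamma=2.0, max_timeout=60.0):
        self.delta = delta            # gain applied to DIFF for the smoothed RTT
        self.rho = rho                # gain applied to the mean deviation
        self.eta = eta                # weight of the deviation in the timeout
        self.gamma = gamma            # backoff multiplier after a retransmission
        self.max_timeout = max_timeout
        self.srtt = 1.0               # smoothed round trip time (seconds)
        self.dev = 0.0                # estimated mean deviation
        self.timeout = 3.0            # current retransmission timeout

    def on_sample(self, sample, from_retransmission):
        # Karn's rule: a sample measured for a retransmitted segment is ambiguous,
        # so it is ignored and the backed-off timeout is retained.
        if from_retransmission:
            return
        diff = sample - self.srtt                                  # equation 1.4
        self.srtt = self.srtt + self.delta * diff                  # equation 1.5
        self.dev = self.dev + self.rho * (abs(diff) - self.dev)    # equation 1.6
        self.timeout = self.srtt + self.eta * self.dev             # equation 1.7

    def on_retransmission_timeout(self):
        # Equation 1.3: back off the timer, bounded by an upper limit.
        self.timeout = min(self.timeout * self.gamma, self.max_timeout)


timer = AdaptiveRetransmissionTimer()
timer.on_sample(0.3, from_retransmission=False)
print(round(timer.timeout, 3))
```

A caller would invoke on_sample() with each round trip measurement and on_retransmission_timeout() whenever the retransmission timer fires.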

1.11 CONGESTION CONTROL BEHAVIOR OF TCP

If we look at the operation of networks superficially, it may appear that only flow control is an end to end issue and congestion control is an issue of the network. However, the fact is that a network can be relieved of congestion only if the end systems react to the indication of congestion given by the subnet. Hence TCP has the provision to react to the problems caused by congestion in the network. Congestion is a condition of severe delay caused by an overload of datagrams at one or more switching points (e.g., at routers). When congestion occurs, delays increase and the router begins to enqueue datagrams until it can route them. We must remember that each router has finite storage capacity and that datagrams compete for that storage (i.e., in a datagram based internet, there is no preallocation of resources to individual TCP connections). In the worst case, the total number of datagrams arriving at the congested router grows until the router reaches capacity and starts to drop datagrams. If steps are not taken to control congestion, the network may reach a state in which the network will not be able to deliver any datagrams to the destination. Such a condition is referred to as congestion collapse. Hosts do not usually know the details of where congestion has occurred or why. To them, congestion simply means increased delay. Unfortunately, most transport protocols use timeout and retransmission, so they respond to increased delay by retransmitting datagrams. Retransmission aggravates congestion instead of alleviating it. If unchecked, the increased traffic will produce increased delay, leading to increased traffic, and so on, until the network becomes useless. To avoid congestion collapse, TCP must reduce transmission when congestion occurs. Routers watch queue lengths and use techniques like ICMP source quench to inform hosts that congestion has occurred, but transport protocols like TCP can help to avoid congestion by reducing transmission rates automatically whenever delays occur. Of course, algorithms to avoid congestion must be constructed carefully because even under normal operating conditions an internet will exhibit wide variation in round trip delays. To avoid congestion, the TCP standard now recommends using two techniques: slow-start and multiplicative decrease. They are related and can be implemented easily. We said that for each connection, TCP must remember the size of the receiver's window (i.e., the buffer size advertised in acknowledgements). To control congestion TCP maintains a second limit, called the congestion window limit or congestion window, that it uses to restrict data flow to less than the receiver's buffer size when congestion occurs. The TCP entity at the sender has to decide the sending rate as shown in equation 1.4.

Allowed_window = min (receiver_advertisement, congestion_window)        (1.4)


In the steady state on a non-congested network, the congestion window is the same size as the receiver's window. Reducing the congestion window reduces the traffic TCP will inject into the connection. To estimate congestion window size, TCP assumes that most datagram losses come from congestion and uses the multiplicative decrease algorithm. The multiplicative decrease algorithm reduces the congestion window by half (down to a minimum of at least one segment). For those segments that remain in the allowed window, the retransmission timer is backed off exponentially. Because TCP reduces the congestion window by half for every loss, it decreases the window exponentially if loss continues. In other words, if congestion is likely, TCP reduces the volume of traffic exponentially and the rate of retransmission exponentially. If the loss continues, TCP eventually limits transmission to a single datagram and continues to double timeout values before retransmitting. The idea is to provide quick and significant traffic reduction to allow routers enough time to clear the datagrams already in their queues. To recover from the congestion, TCP follows a technique called slow-start to scale up transmission. Whenever starting traffic on a new connection or increasing traffic after a period of congestion, start the congestion window at the size of a single segment and increase the congestion window by one segment each time an acknowledgement arrives. Slow-start avoids swamping the internet with additional traffic immediately after congestion clears or when a new connection suddenly starts. The term slow-start may be a misnomer because under ideal conditions, the start is not very slow. TCP initializes the congestion window to 1, sends an initial segment, and waits. When the acknowledgement arrives, it increases the congestion window to 2, sends two segments, and waits. When the two acknowledgements arrive they each increase the congestion window by 1, so TCP can send 4 segments. Acknowledgements for those will increase the congestion window to 8. Within four round trip times, TCP can send 16 segments, often enough to reach the receiver's window limit. Even for extremely large windows, it takes only log2 N round trips before TCP can send N segments. Taken together, slow start, additive increase, multiplicative decrease, congestion avoidance, measurement of variation, and exponential timer backoff improve the performance of TCP dramatically without adding any significant computational overhead to the protocol software. Versions of TCP that use these techniques have improved the performance of previous versions by factors of 2 to 10.

Have you understood?
1. List down the factors that affect the data transmission rate of a host.
2. What is the size of effective window in congestion control?


3. What is meant by slow start?
4. What is meant by additive increase and multiplicative decrease in TCP congestion control?
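Before moving on to the network layer mechanisms, the toy model below illustrates the slow start and multiplicative decrease behaviour described in section 1.11, together with the allowed window computation. Window sizes are counted in whole segments; the class and its numeric choices are assumptions for the sake of the example, not the exact rules of the TCP standard (in particular, the slow start threshold and the congestion avoidance phase are omitted).

```python
class TcpCongestionSketch:
    """Toy model of slow start, multiplicative decrease and the allowed window."""

    def __init__(self, receiver_window=16):
        self.receiver_window = receiver_window    # advertised by the receiver
        self.congestion_window = 1                # slow start begins with one segment

    def allowed_window(self):
        # The sender may transmit no more than either limit permits.
        return min(self.receiver_window, self.congestion_window)

    def on_ack(self):
        # Slow start: every acknowledgement opens the window by one segment,
        # which doubles the window once per round trip time.
        if self.congestion_window < self.receiver_window:
            self.congestion_window += 1

    def on_loss(self):
        # Multiplicative decrease: halve the window, never below one segment.
        self.congestion_window = max(1, self.congestion_window // 2)


tcp = TcpCongestionSketch()
for _ in range(10):
    tcp.on_ack()
print("after 10 ACKs:", tcp.allowed_window())   # window has grown towards the receiver limit
tcp.on_loss()
print("after a loss:", tcp.allowed_window())    # window cut in half
```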
1.12 CONGESTION-CONTROL MECHANISMS IN NETWORK LAYER

Congestion can occur in the network due to many reasons. We will present the problem of congestion from two different angles. First, hosts are generating data at a normal rate; however, the network is not able to deliver the datagrams properly. There are many possible reasons for congestion in this scenario. Some routers might have failed or datagrams arriving through different input lines may demand the same outgoing line. Second, the network functions properly and tries its best to deliver the datagrams. However, certain hosts generate data at excessive rates and the network is not able to accommodate them. The conclusion is that congestion can be caused either by the end systems or by the network. The network layer is responsible for taking steps to control congestion in the network and the transport layer is responsible at the end systems. Congestion control policies employed in each of these layers have an impact on the other layer and hence on the overall performance of the network. In the previous section, we discussed the TCP congestion control algorithm through which the hosts are made to adjust their sending rate. In this section, we are going to discuss a few congestion-control mechanisms employed in the network layer.
1.12.1 Tail Drop Policy
Even though layers function in isolation of each other except in terms of the interface, it becomes necessary to know the details about other layers to implement a particular layer efficiently. Especially, the congestion control or congestion avoidance algorithms followed in the network layer have a tremendous impact on the performance of TCP. The most important interaction between IP implementation policies and TCP occurs when a router becomes overrun and drops datagrams. Because a router places each incoming datagram in a queue in memory until it can be processed, the policy focuses on queue management. When datagrams arrive faster than they can be forwarded, the queue increases. However, because memory is finite, the queue cannot grow without bound. Early router software used a tail-drop policy to manage queue overflow. Tail drop refers to a policy in which, if the input queue is full when a datagram arrives, the datagram is discarded. Tail-drop has an interesting effect on TCP. In the simple case where datagrams traveling through a router carry segments from a single TCP connection, the loss causes TCP to enter slow-start, which reduces throughput until TCP begins receiving ACKs and increases the congestion window. A more severe problem can occur, however,
when the datagrams traveling through a router carry segments from many TCP connections, because tail-drop can cause global synchronization. To see why, observe that datagrams are typically multiplexed, with successive datagrams each coming from a different source. Thus, a tail-drop policy makes it likely that the router will discard one segment from N connections rather than N segments from one connection. The simultaneous loss causes all N instances of TCP to enter slow-start at the same time.
1.12.2 Random Early Discard (RED)
It is better to avoid the occurrence of congestion rather than permitting congestion to occur and then trying to control it. A simple tail drop policy permits congestion to occur and once the queues become full, datagrams are lost. Random Early Discard tries to sense congestion and avoid it through early warnings. Other abbreviations for RED are Random Early Drop, or Random Early Detection. A router that implements RED uses two threshold values to mark positions in the queue: Tmin and Tmax. The general operation of RED can be described by three rules that determine the disposition of each arriving datagram:
If the queue currently contains fewer than Tmin datagrams, add the new datagram to the queue.
If the queue contains more than Tmax datagrams, discard the new datagram.
If the queue contains between Tmin and Tmax datagrams, randomly discard the datagram according to a probability, p.
The randomness of RED means that instead of waiting until the queue overflows and then driving many TCP connections into slow-start, a router slowly and randomly drops datagrams as congestion increases. The key to making RED work well lies in the choice of the thresholds Tmin and Tmax, and the discard probability p. Tmin must be large enough that the output link has high utilization. Furthermore, because RED operates like tail-drop when the queue exceeds Tmax, the value must be greater than Tmin by more than the typical increase in queue size during one TCP round trip time (e.g., set Tmax at least twice as large as Tmin). Otherwise, RED can cause the same global oscillations as tail-drop. Computation of the discard probability, p, is the most complex aspect of RED. Instead of using a constant, a new value of p is computed for each datagram; the value depends on the relationship between the current queue size and the thresholds. To understand the scheme, observe that all RED processing can be viewed probabilistically. When the queue size is less than Tmin, RED does not discard any datagrams, making the discard probability 0. Similarly, when the queue size is greater than Tmax, RED discards all datagrams, making the discard probability 1. For intermediate values of queue size (i.e., those between Tmin and Tmax), the probability can vary from 0 to 1 linearly.


Although the linear scheme forms the basis of RED's probability computation, a change must be made to avoid overreacting. The need for the change arises because network traffic is bursty, which results in rapid fluctuations of a router's queue. If RED used a simplistic linear scheme, later datagrams in each burst would be assigned a high probability of being dropped (because they arrive when the queue has more entries). However, a router should not drop datagrams unnecessarily because doing so has a negative impact on TCP throughput. Thus, if a burst is short, it is unwise to drop datagrams because the queue will never overflow. Of course, RED cannot postpone discard indefinitely because a long-term burst will overflow the queue, resulting in a tail-drop policy which has the potential to cause global synchronization problems. How can RED assign a higher discard probability as the queue fills without discarding datagrams from each burst? The answer lies in a technique borrowed from TCP; instead of using the actual queue size at any instant, RED computes a weighted average queue size, avg, and uses the average size to determine the probability. The value of avg is an exponential weighted average, updated each time a datagram arrives according to equation 1.5.

avg = (1 - γ) * Old_avg + γ * Current_queue_size        (1.5)


where γ denotes a value between 0 and 1. If γ is small enough, the average will track long term trends, but will remain immune to short bursts. In addition to the equations that determine γ, RED contains other details that we have glossed over. For example, RED computations can be made extremely efficient by choosing constants as powers of two and using integer arithmetic. Another important detail concerns the measurement of queue size, which affects both the RED computation and its overall effect on TCP. In particular, because the time required to forward a datagram is proportional to its size, it makes sense to measure the queue in octets rather than in datagrams; doing so requires only minor changes to the equations for p and γ. Measuring queue size in octets affects the type of traffic dropped because it makes the discard probability proportional to the amount of data a sender puts in the stream rather than the number of segments. Small datagrams (e.g., those that carry remote login requests to servers) have a lower probability of being dropped than large datagrams (e.g., those that carry file transfer traffic). One positive consequence of using size is that when acknowledgements travel over a congested path, they have a lower probability of being dropped. As a result, if a (large) data segment does arrive, the sending TCP will receive the ACK and will avoid unnecessary retransmission. Both analysis and simulations show that RED works well. It handles congestion, avoids the synchronization that results from tail drop, and allows short bursts without dropping datagrams unnecessarily. The IETF now recommends that routers implement RED.
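A minimal sketch of the RED decision described above is given below. The thresholds, the weight γ, and the use of a simple linear probability between Tmin and Tmax follow the text; the concrete numbers and the class name are illustrative assumptions rather than recommended settings.

```python
import random

class RedQueueSketch:
    """Illustrative RED discard decision using a weighted average queue size
    and a drop probability that rises linearly from 0 at Tmin to 1 at Tmax."""

    def __init__(self, t_min=5, t_max=15, gamma=0.002):
        self.t_min = t_min
        self.t_max = t_max
        self.gamma = gamma     # weight used for the average-queue computation
        self.avg = 0.0         # weighted average queue size, in datagrams

    def should_discard(self, current_queue_size):
        # Equation 1.5: exponentially weighted average of the queue size.
        self.avg = (1 - self.gamma) * self.avg + self.gamma * current_queue_size
        if self.avg < self.t_min:
            return False                     # short queue: always enqueue
        if self.avg > self.t_max:
            return True                      # long queue: behave like tail drop
        # Between the thresholds the discard probability grows linearly.
        p = (self.avg - self.t_min) / (self.t_max - self.t_min)
        return random.random() < p


red = RedQueueSketch(gamma=0.2)              # larger weight than usual so the effect shows quickly
arrivals = [3, 6, 9, 12, 14, 16, 18]         # instantaneous queue sizes
print([red.should_discard(q) for q in arrivals])
```

Because the decision is driven by the average rather than the instantaneous queue size, a short burst does not immediately push the probability up, which is the behaviour the text motivates.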

Have you understood?
1. Is cooperation required between the end systems and the network in congestion control?
2. What is meant by FCFS scheduling in routers?
3. What is meant by drop tail policy?
4. What are the advantages of the RED scheme?

1.13 USER DATAGRAM PROTOCOL The discussion about the transport layer is not complete if another protocol of the transport layer by name User Datagram Protocol (UDP) is not discussed. If the applications decide that reliability is not required or they themselves can implement it to the desired level, then a sophisticated protocol like TCP is not required. Applications in which late data is worse than bad data use UDP as the transport layer protocol. UDP takes care of the basic end to end issues like fixing the end points for communication, multiplexing and demultiplexing. UDP does not take care of the issues like out of order delivery of datagrams, duplicate packets, flow control, error control and congestion control. Compared with TCP, UDP is considered as a light weight protocol since the functions performed by UDP are limited. UDP datagram format is shown in figure 1.36.

Figure 1.36 UDP datagram format
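Figure 1.36 shows the four 16-bit header fields of a UDP datagram: source port, destination port, length and checksum. The short sketch below packs and unpacks such a header; the function names are illustrative assumptions, and the checksum is simply left at zero (meaning "not computed") rather than being calculated.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")   # source port, destination port, length, checksum

def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    # The length field covers the 8-byte header plus the payload.
    header = UDP_HEADER.pack(src_port, dst_port, 8 + len(payload), checksum)
    return header + payload

def parse_udp_header(datagram):
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}

dgram = build_udp_datagram(5000, 53, b"query")
print(parse_udp_header(dgram))
```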

UDP was used rarely in the early days of the Internet. The reason is that the conventional applications of the Internet like file transfer, electronic mail and the web are error-sensitive applications, and they expect reliability from the transport layer, which can be provided by TCP only. Recently UDP has gained significance due to the emergence of multimedia applications in the Internet. Multimedia applications are delay-sensitive applications rather than error-sensitive applications. In other words, we can say that in multimedia applications, late data is worse than bad data. If multimedia applications use TCP in the transport layer, they may not be able to satisfy the timing requirements of the users. Hence they choose light weight UDP rather than heavy weight TCP.

Have you understood?
1. What type of applications prefer UDP over TCP?
2. UDP is a light weight protocol compared to TCP. Justify this statement.


1.14 AUTONOMOUS SYSTEMS

The Internet is a network of networks. To handle the traffic effectively, it is necessary to organize these networks properly. Initially, the Internet followed a single core architecture in which the ARPANET was the core and all other networks were considered as secondary networks. Due to the inclusion of NSFNET into the Internet, a dual core architecture was introduced in which the ARPANET and the NSFNET were the core networks and other networks were organized as the secondary networks. However, as more and more networks were connected to the Internet, even the dual core architecture was not able to manage the Internet effectively. Hence the Internet follows an architecture based on Autonomous Systems. An Autonomous System (AS) is a collection of IP networks and routers under the control of one entity (or sometimes more) that represents a common routing policy in the Internet. Each AS is under the control of a single administrative authority. Because the networks and routers fall under a single administrative authority, that authority has to guarantee that internal routes remain consistent and viable. Furthermore, the administrative authority can choose one of its routers to serve as the machine that will apprise the outside world of networks within the organization. In figure 1.37, because routers R2, R3, and R4 fall under control of one administrative authority, that authority can arrange to have R3 advertise networks 2, 3 and 4 (R1 already knows about network 1 because it has a direct connection to it). For purposes of routing, a group of networks and routers controlled by a single administrative authority is called an autonomous system (AS). Routers within an autonomous system are free to choose their own mechanism for discovering, propagating, validating and checking the consistency of routes. Note that, under the definition, the original internet core routers formed an autonomous system. Each change in routing protocol within the core autonomous system was made without affecting the routers in other autonomous systems.
1.14.1 An Exterior Gateway Protocol
Computer scientists use the Exterior Gateway Protocol (EGP) to pass routing information between two autonomous systems. Currently a single exterior protocol is used in most TCP/IP internets. Known as the Border Gateway Protocol (BGP), it has evolved through four (quite different) versions. Each version is numbered, which gives rise to the formal name of the current version: BGP-4. In this material, the term BGP refers to BGP-4 by default. If a pair of autonomous systems agrees to exchange routing information, each must designate a router that will speak BGP on its behalf; the two routers are said to become BGP peers of one another. Because a router speaking BGP must communicate with a
peer in another autonomous system, it makes sense to select a machine that is near the edge of the autonomous system. Hence, BGP terminology calls the machine a border gateway or border router. Figure 1.37 gives a conceptual illustration of routers using BGP to advertise the networks in their autonomous systems after collecting the information from other routers internally. An organization using BGP usually chooses a router that is close to the outer edge of the autonomous system. In figure 1.37, router R1 gathers information about networks in autonomous system 1 and reports that information to R2 using BGP, while router R2 reports information from autonomous system 2.

Figure 1.37 Internet with Autonomous Systems (autonomous systems AS1, AS2, ..., ASn attached to a backbone network through routers R1, R2, ..., Rn)

1.14.2 BGP Characteristics
BGP is unusual in several ways. Most important, BGP is neither a pure distance vector protocol nor a pure link state protocol. It can be characterized by the following features.

Inter-Autonomous System Communication: Because BGP is designed as an exterior gateway protocol, its primary role is to allow one autonomous system to communicate with another.

Coordination Among Multiple BGP Speakers: If an autonomous system has multiple routers each communicating with a peer in an outside autonomous system, BGP can be used to coordinate among routers in the set to guarantee that they all propagate consistent information.

Propagation of Reachability Information: BGP allows an autonomous system to advertise destinations that are reachable either in or through it, and to learn such information from another autonomous system.

Next-Hop Paradigm: Like a distance-vector routing protocol, BGP supplies next-hop information for each destination.

Policy Support: Unlike most distance-vector protocols that advertise exactly the routes in the local routing table, BGP can implement policy that the local administrator chooses. In particular, a router running BGP can be configured to distinguish between the set of destinations reachable by computers inside its autonomous system and the set of destinations advertised to other autonomous systems.

Reliable Transport: BGP is unusual among protocols that pass routing information because it assumes reliable transport. Thus, BGP uses TCP for all communication.

Path Information: In addition to specifying destinations that can be reached and a next-hop for each, BGP advertisements include path information that allows the receiver to learn a series of autonomous systems along a path to the destination.

Incremental Updates: To conserve network bandwidth, BGP does not pass full information in each update message. Instead, full information is exchanged once, and then successive messages carry incremental changes called deltas.

Support For Classless Addressing: BGP supports an addressing scheme in which, rather than expecting addresses to be self-identifying, the protocol provides a way to send a mask along with each address.

Route Aggregation: BGP conserves network bandwidth by allowing a sender to aggregate route information and send a single entry to represent multiple, related destinations.

Authentication: BGP allows a receiver to authenticate messages (i.e., verify the identity of a sender).
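To illustrate how the next-hop paradigm, path information and classless prefixes fit together, the sketch below models a BGP-style advertisement as a simple record. The field layout, function names and AS numbers are assumptions made for illustration only; they do not reproduce the actual BGP-4 message encoding.

```python
from dataclasses import dataclass, field

@dataclass
class BgpRoute:
    """Illustrative shape of a BGP-style advertisement: a classless prefix with
    its mask length, a next hop, and the AS path accumulated so far."""
    prefix: str          # e.g. "10.1.0.0"
    prefix_len: int      # mask sent along with the address (classless addressing)
    next_hop: str        # next-hop paradigm: where traffic should be forwarded
    as_path: list = field(default_factory=list)   # series of autonomous systems

def advertise(route, local_as):
    # Before passing a route to a peer, a speaker prepends its own AS number.
    return BgpRoute(route.prefix, route.prefix_len, route.next_hop,
                    [local_as] + route.as_path)

def accept(route, local_as):
    # A receiver that finds its own AS already in the path discards the route;
    # this is how the path information prevents routing loops.
    return local_as not in route.as_path

r = BgpRoute("10.1.0.0", 16, "192.0.2.1", [65001])
r2 = advertise(r, 65002)
print(r2.as_path, accept(r2, 65001))   # [65002, 65001] False -> loop detected
```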


Have you understood?
1. What is an Autonomous System?
2. Differentiate between interior gateway protocols and exterior gateway protocols.
3. Give an example for an exterior gateway protocol.
4. Is RIP an exterior gateway protocol or an interior gateway protocol? Justify your answer.

Summary
1. A network is an interconnection of computers in which computers are able to exchange information among themselves and the master/slave relationship is excluded.
2. When a set of networks are connected together, the resultant network is called an internetwork. The short form of internetwork is internet. "i" in lower case refers to any generic internet and "I" in uppercase refers to the specific world wide Internet.
3. The TCP/IP reference model was designed and developed from the beginning itself by keeping internetworking in mind.
4. The underlying network is a packet switched network in which different datagrams may be forwarded through different routes. Packet switched networks are more suitable for data communication than the circuit switched networks.
5. The internet layer of the TCP/IP reference model provides an unreliable, connectionless datagram delivery system in which datagrams may arrive out of order, some datagrams may be duplicated and some datagrams may be lost.
6. Routing is the process of forwarding the datagram from the source to the destination through intermediate routers with the help of the routing tables maintained at the routers.
7. IP is a forwarding protocol which has the required headers to hide the implementation details of the underlying network technologies from the higher layers.
8. When a datagram is in transit, only the headers are examined and the payload or the data is not examined. Hence IP in its original form is not able to provide differentiated services to the users.
9. RIP and OSPF are the routing protocols that are responsible for the creation and maintenance of routing tables at the routers by exchanging the topological information among the routers. The routing tables built by these protocols are used by IP to forward the datagrams.
10. Various types of routing are static routing, dynamic routing, next hop routing, table driven routing, multicast routing etc.
11. The transport layer of the TCP/IP reference model provides the reliability by taking measures to overcome the limitations of the internet layer.




12. The transport layer has two protocols by name TCP and UDP. TCP provides a reliable stream oriented service and UDP provides an unreliable message oriented service.
13. TCP employs suitable error control, flow control and congestion control algorithms to provide the reliability. These algorithms require considerable time and resources and as a result TCP is called a heavy weight protocol.
14. UDP is preferred by delay sensitive applications since it is a light weight protocol in comparison with the heavy weight TCP. UDP is chosen as the transport layer protocol where late data is worse than bad data.
15. The basic mechanism in TCP to provide reliability is retransmission. After sending a segment, the TCP entity on the source waits for an acknowledgement for some amount of time, then times out and retransmits the segment.
16. TCP follows a weighted average scheme to estimate the RTT and timeout period in a better way. TCP estimates RTT over a period of time and does not take the values at a particular instant.
17. TCP follows techniques like slow start, additive increase and multiplicative decrease to control the sending rate of the source in reaction to the congestion indication from the network.
18. TCP employs a three way handshake protocol to establish the connection. A three way handshake is required and sufficient to overcome the limitations of the internet layer in establishing a TCP connection.
19. TCP provides a full duplex connection; the connection in one direction can be closed and still the connection in the other direction can continue to send segments.
20. An Autonomous System is a single network or a collection of networks maintained by a single authority.
21. Initially the Internet followed a single core architecture and then moved to the dual core architecture. Because of the limitations of these architectures, the Internet right now follows the Autonomous System architecture.
22. An interior gateway protocol is one which is used to exchange the routing information within an autonomous system. It cannot exchange the routing information across the autonomous systems. An exterior gateway protocol can exchange the routing information between the autonomous systems.


Exercises
1. Are the data link layer and physical layer present in the TCP/IP reference model? Justify your answer.
2. What are the steps followed in a router in routing a datagram?
3. How are the fields IDENTIFICATION, FLAGS and FRAGMENT OFFSET of the IP header used in the process of fragmentation and reassembly?





4. Find the class of each of the following IP addresses.
   a. 227.12.14.87
   b. 193.14.56.22
   c. 14.23.120.8
   d. 252.5.15.111
   e. 134.11.78.56
5. Must the loopback address be 127.0.0.1?
6. An organization is granted a block of addresses with the beginning address 14.24.74.0/24. There are 2^(32-24) = 256 addresses in this block. The organization needs to have 11 subnets as shown below:
   a. Two subnets, each with 64 addresses.
   b. Two subnets, each with 32 addresses.
   c. Three subnets, each with 16 addresses.
   d. Four subnets, each with 4 addresses.
   Design the subnets.
7. For what type of datagrams will ICMP error messages not be generated?
8. The values of the parameters in timestamp-request and timestamp-reply messages are as follows.
   Original Timestamp: 46 ms
   Receive Timestamp: 59 ms
   Transmit Timestamp: 60 ms
   Return Time: 67 ms
   Find out the values of sending time, receiving time, round-trip time and time difference.


10.

Answers 1. Data link layer and physical layer are present in the TCP/IP reference model. However, unlike the specifications of ISO/OSI reference model TCP/IP does not explain about these two layers in detail. It just points out that the host has to connect to the network using some protocol so it can send IP packets to it. Hence the name network access layer or host-to-network layer is used to refer both data link layer and physical layer in TCP/IP reference model.





2.

Actions taken by a router as soon as it receives a datagram in the process of routing are as follows. i. Extract the destination IP address of the datagram. ii. If prefix portion the destination IP address matches address of any directly connected network, send the datagram to the destination over that network. iii. If a match does not occur perform the following steps fro every entry in the routing table. a. Perform the bitwise-and of the destination IP address and the subnet mask. b. If the result of the above operation equals the network address field of the entry, route the datagram to the specified next hop address. iv. If no matches were found, declare a routing error. The IDENTIFICATION field is neede to allow the destination host to determine which datagram a newly arrived fragment belongs to. All the fragments of a datagram contain the same identification value. DF bit of the flags is an order to the routers not to fragment the datagram because the destination is incapable of putting the pieces back together again. MF stands for more fragments. All fragments except the last one have this bit set. It is needed to know when all fragments of a datagram have arrived. The FRAGMENT OFFSET tells where in the current datagram this fragment belongs. All fragments except the last one in a datagram must be a multiple of 8 bytes, the elementary fragment unit. Since 13 bits are provided, there is a maximum of 8192 fragments per datagram, giving a maximum datagram length of 65,536 bytes, one more than the TOTAL LENGTH field.


3.

4.

a. b. c. d. e.

227.12.14.87 Class D (Since first byte is 227 (between 224 and 139)) 193.14.56.22 Class C (Since first byte is 193 (between 192 and 223)) 14.23.120.8 Class A (Since first byte is 14 (between 0 and 127)) 252.5.15.111 Class D (Since first byte is 252 (between 240 and 255)) 134.11.78.56 Class B (Since first byte is 134 (between 128 and 191))

5.

Need not be. According to IPv4 addressing scheme any IP address of the form 127.x.y.z can be used as loopback address. Hence addresses of the form 127.5.6.9, 127.10.4.7 etc can be used as loopback addresses. However, conventionally in many of the organizations, loopback address is always specified as 127.0.0.1. 14.24.74.0/26 14.24.74.64/26 14.24.7.128/27 14.24.74.160/27 14.24.74.192/28 64 addresses 64 addresses 32 addresses 32 addresses 16 addresses

6.




7.

8.

14.24.74.208/28 16 addresses 14.24.74.224/28 16 addresses 14.24.74.240/30 4 addresses 14.24.74.244/30 4 addresses 14.24.74.248/30 4 addresses 14.24.74.252/30 4 addresses We use the first 128 addresses for the first two subnets. (each with 64 addresses) We use the next 64 addresses for the next two subnets. (each with 32 addresses) We use the next 48 addresses for the next three subnets.(each with 16 addresses) We use the next 16 addresses for the next four subnets. (each with 4 addresses) i. No ICMP error messages will be generated in response to a datagram carrying an ICMP error message. i. No ICMP error message will be generated for a fragmented datagram that is not the first fragment. ii. No ICMP error message will be generated for a datagram having a multicast address. iii. No ICMP error message will be generated for a datagram having a special address such as 127.0.0.0 or 0.0.0.0. Given parameters are (all in milliseconds) Original timestamp:46 Receive timestamp:59 Transmit timestamp:60 Return time:67 Sending time = 59-46=13 ms Receiving time = 67-60=7 ms Round-trip time=13+7=20ms Time difference = receive timestamp (original timestamp field + one way time duration) =59-(46+10)=3 ms

9.

Using the Jacobson estimator RTT = α * RTT + (1 - α) * M with α = 0.9:
RTT = 0.9 * 30 + 0.1 * 26 = 29.6 msec
RTT = 0.9 * 29.6 + 0.1 * 32 = 29.84 msec
RTT = 0.9 * 29.84 + 0.1 * 24 = 29.256 msec

10.

Slow start of Additive Increase and Multiplicative Decrease of TCP congestion control is not slow at all. It is exponential because the congestion window keeps growing exponentially until either a timeout occurs or the receivers window is reached. It has been given the name slow start in comparison with the original congestion control algorithm in which the size of the congestion window was set to the maximum possible window directly.




UNIT - 2

2.1 INTRODUCTION

IP is the glue that holds the entire internet together and accomplishes the Himalayan task of forwarding the datagram from a source which may be in one continent to a destination that may be in another continent, and is supported by ICMP to indicate to the source the possible errors that would have arisen in the delivery of datagrams. TCP of the transport layer hides all the limitations of the network from the users and provides a reliable service. However, to provide useful services to the customers, these protocols alone are not sufficient. Especially, the task of IP is not as simple as that of mere forwarding as we have discussed in Unit I. The basic mode of communication provided by IP is one to one. However, many applications involve one sender and multiple receivers and for such applications IP has to provide the multicasting capability. Moreover, IP in its original form assumes that the hosts are stationary. But in fact, the end systems can be mobile also. Hence it becomes necessary for IP to accommodate roaming hosts also. Whether unicast or multicast, stationary hosts or mobile hosts, the basic requirement for the communication to take place is addresses. Every machine on the internet should have an address (IP address). The issue is how to assign the addresses to the hosts dynamically and effectively. This unit gives an idea about all these issues and the corresponding solutions.

2.2 LEARNING OBJECTIVES
To understand the ideas and features of multicasting
To learn the functional components of multicasting
To study about IP multicast addresses
To discuss about the scope and delivery of multicasting
To learn the multicast group management
To introduce multicast routing paradigms and routing protocols
To learn about mobile hosts
To learn about the agents required to support mobile hosts
To study about tunneling in mobile hosts
To understand the importance of uniqueness of IP addresses
To study about the protocols involved in the assignment of IP addresses

2.3 OBTAINING IP ADDRESSES

A network host needs to obtain a globally unique address in order to function on the Internet. The physical or MAC address that a host has is only locally significant, identifying the host within the local area network. Since this is a Layer 2 address, the router does not use it to forward outside the LAN. IP addresses are the most commonly used addresses for Internet communications. This protocol is a hierarchical addressing scheme that allows individual addresses to be associated together and treated as groups. These groups of addresses allow efficient transfer of data across the Internet. Network administrators use two methods to assign IP addresses. These methods are static and dynamic. Regardless of which addressing scheme is chosen, no two interfaces can have the same IP address. Two hosts that have the same IP address could create a conflict that might cause both of the hosts involved not to operate properly. Static assignment works best on small, infrequently changing networks. The system administrator manually assigns and tracks IP addresses for each computer, printer, or server on the intranet. Good recordkeeping is critical to prevent problems which occur with duplicate IP addresses. This is possible only when there are a small number of devices to track. Servers should be assigned a static IP address so workstations and other devices will always know how to access needed services. Consider how difficult it would be to phone a business that changed its phone number every day. Other devices that should be assigned static IP addresses are network printers, application servers, and routers. 2.3.1 Address Mapping Protocols

One of the major problems in networking is how to communicate with other network devices. In TCP/IP communications, a datagram on a local-area network must contain both a destination MAC address and a destination IP address. These addresses must be correct and match the destination MAC and IP addresses of the host device. If it does not match, the datagram will be discarded by the destination host. Communications within a LAN segment require two addresses. There needs to be a way to automatically map IP to MAC addresses. It would be too time consuming for the user to create the maps manually. 2.3.1.1 Address Resolution Protocol The TCP/IP suite has a protocol, called Address Resolution Protocol (ARP), which can automatically obtain MAC addresses for local transmission. Different issues are raised when data is sent outside of the local area network. Communications between two LAN segments have an additional task. Both the IP and MAC addresses are needed for both the destination host and the intermediate routing device. TCP/IP has a variation on ARP called Proxy ARP that will provide the MAC address of an intermediate device for transmission outside the LAN to another
network segment. With TCP/IP networking, a data packet must contain both a destination MAC address and a destination IP address. If the packet is missing either one, the data will not pass from Layer 3 to the upper layers. In this way, MAC addresses and IP addresses act as checks and balances for each other. After devices determine the IP addresses of the destination devices, they can add the destination MAC addresses to the data packets. Some devices will keep tables that contain MAC addresses and IP addresses of other devices that are connected to the same LAN. These are called Address Resolution Protocol (ARP) tables. ARP tables are stored in RAM memory, where the cached information is maintained automatically on each of the devices. It is very unusual for a user to have to make an ARP table entry manually. Each device on a network maintains its own ARP table. When a network device wants to send data across the network, it uses information provided by the ARP table. When a source determines the IP address for a destination, it then consults the ARP table in order to locate the MAC address for the destination. If the source locates an entry in its table, destination IP address to destination MAC address, it will associate the IP address to the MAC address and then uses it to encapsulate the data. The data packet is then sent out over the networking media to be picked up by the destination device. There are two ways that devices can gather MAC addresses that they need to add to the encapsulated data. One way is to monitor the traffic that occurs on the local network segment. All stations on an Ethernet network will analyze all traffic to determine if the data is for them. Part of this process is to record the source IP and MAC address of the datagram to an ARP table. So as data is transmitted on the network, the address pairs populate the ARP table. Another way to get an address pair for data transmission is to broadcast an ARP request. The computer that requires an IP and MAC address pair broadcasts an ARP request. All the other devices on the local area network analyze this request. If one of the local devices matches the IP address of the request, it sends back an ARP reply that contains its IP-MAC pair. If the IP address is for the local area network and the computer does not exist or is turned off, there is no response to the ARP request. In this situation, the source device reports an error. If the request is for a different IP network, there is another process that can be used. 2.3.1.2 Proxy ARP A proxy ARP is an ARP that acts on behalf of a set of hosts. Whenever a router running a proxy ARP receives an ARP request looking for the IP address of one of these hosts, the router sends an ARP reply announcing its own hardware (physical) address. After the router receives the actual IP packet, it sends the packet to the appropriate host or router.
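As a rough illustration of the ARP table and proxy ARP behaviour described above, the sketch below keeps a small IP-to-MAC cache and answers off-subnet queries with the router's hardware address. The class name, the subnet and all addresses are made up for the example; a real host would of course broadcast an ARP request rather than return None.

```python
import ipaddress

class ArpTableSketch:
    """Toy ARP cache plus a proxy-ARP style decision for off-subnet destinations."""

    def __init__(self, local_network, router_mac):
        self.local_network = ipaddress.ip_network(local_network)
        self.router_mac = router_mac
        self.cache = {}                      # IP address -> MAC address

    def learn(self, ip, mac):
        # Entries are learned from traffic seen on the segment or from ARP replies.
        self.cache[ip] = mac

    def resolve(self, ip):
        if ip in self.cache:
            return self.cache[ip]
        if ipaddress.ip_address(ip) in self.local_network:
            return None      # on-link but unknown: broadcast an ARP request and wait
        # Off-subnet destination: a proxy-ARP router (or the default gateway)
        # answers with its own hardware address, as described above.
        return self.router_mac


table = ArpTableSketch("192.168.1.0/24", "aa:bb:cc:dd:ee:01")
table.learn("192.168.1.20", "aa:bb:cc:dd:ee:20")
print(table.resolve("192.168.1.20"))   # known local host
print(table.resolve("8.8.8.8"))        # off-subnet: the router's MAC is used
```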


Routers do not forward broadcast packets. If the feature is turned on, a router performs a proxy ARP. In this version of ARP, a router sends an ARP response with the MAC address of the interface, on which the request was received, to the requesting host. The router responds with the MAC addresses for those requests in which the IP address is not in the range of addresses of the local subnet. Another method to send data to the address of a device that is on another network segment is to set up a default gateway. The default gateway is a host option where the IP address of the router interface is stored in the network configuration of the host. The source host compares the destination IP address and its own IP address to determine if the two IP addresses are located on the same segment. If the receiving host is not on the same segment, the source host sends the data using the actual IP address of the destination and the MAC address of the router. The MAC address for the router was learned from the ARP table by using the IP address of that router. If the default gateway on the host or the proxy ARP feature on the router is not configured, no traffic can leave the local area network. One or the other is required to have a connection outside of the local area network. 2.3.1.3 Reverse Address Resolution Protocol (RARP) Reverse Address Resolution Protocol (RARP) associates a known MAC addresses with an IP addresses. A network device, such as a diskless workstation, might know its MAC address but not its IP address. RARP allows the device to make a request to learn its IP address. Devices using RARP require that a RARP server be present on the network to answer RARP requests. Consider an example where a source device wants to send data to another device. In this example, the source device knows its own MAC address but is unable to locate its own IP address in the ARP table. The source device must include both its MAC address and IP address in order for the destination device to retrieve data, pass it to higher layers of the OSI model, and respond to the originating device. Therefore, the source initiates a process called a RARP request. This request helps the source device to get its own IP address from a RARP server. RARP requests are broadcast onto the LAN and are responded to by the RARP server which is usually a router. RARP uses the same packet format as ARP. However, in a RARP request, the MAC headers and operation code are different from an ARP request. The RARP packet format contains places for MAC addresses of both the destination and source devices. The source IP address field is empty. The broadcast goes to all devices on the network. Therefore, the destination MAC address will be set to all binary 1s. Workstations running RARP have codes in ROM that direct them to start the RARP process. 2.3.2 BOOTP

The RARP protocol has three drawbacks. First, because RARP operates at a low level, using it requires direct access to the network hardware. Thus, it may be
difficult or impossible for an application programmer to build a server. Second, although RARP requires a packet exchange between a client machine and a computer that answers its request, the reply contains only one small piece of information: the clients 4 octet address. This drawback is annoying on networks like an Ethernet that enforce a minimum packet size because additional information could be sent in response at no additional cost. Third, because RARP uses a computers hardware address to identify the machine, it cannot be used on networks that dynamically assign hardware addresses to identify the machine. BOOTP eliminates these drawbacks to certain extent and DHCP is more dynamic and effective. 2.3.2.1 Using BOOTP to determine an IP address BOOTP uses UDP to carry messages and that UDP messages are encapsulated in IP datagrams for delivery. Here the issue is how a computer can send BOOTP in an IP datagram before the computer learns its IP address. BOOTP solves this problem with the help of special case IP addresses. IP address consisting of all 1s specifies limited broadcast. IP software can accept the broadcast address even before the software has discovered its local IP address information. Suppose client machine A wants to use BOOTP to find bootstrap information including its IP address and suppose B is the server on the same physical net that will answer the request. Because A does not know Bs IP address or the IP address of the network, it must broadcast its initial BOOTP request using the IP limited broadcast address. Now the issue is whether B has to respond with As IP address (since B knows As IP address) or with the limited broadcast address. The server B uses the limited broadcast address. The reason is if B uses As IP address, Bs network interface software has to use mechanisms like ARP to find out As MAC address. However till the BOOTP reply reaches A, A does not know its IP address. As a result A will not respond to the Bs ARP request. Therefore B has only two alternatives: either broadcast the reply or use the information from the request packet to manually add an entry to its ARP cache. On systems that do not allow programs to modify the ARP cache, broadcasting is the only solution. 2.3.2.2 BOOTP Transmission Policy BOOTP places all responsibility for reliable communication on the client. We know that because UDP uses IP for delivery, messages can be delayed, lost, delivered out of order, or duplicated. Furthermore, because IP does not provide checksum for data, the UDP datagram could arrive with some bits corrupted. To guard against corruption, BOOTP requires that UDP use checksums. It also specifies that requests and replies should be sent with the do not fragment bit set to accommodate clients that have too little memory to reassemble datagrams. BOOTP is also constructed to allow multiple replies; it accepts and processes the first.

To handle datagram loss, BOOTP uses the conventional technique of timeout and retransmission. When the client transmits a request, it starts a timer. If no reply arrives before the timer expires, the client must retransmit the request. Of course, after a power failure all machines on a network will reboot simultaneously, possibly overrunning the BOOTP server (s) with requests. If all clients exactly follow the same transmission timeout, many or all of them will attempt to retransmit simultaneously. To avoid the resulting collisions, the BOOTP specification recommends using a random delay. In addition, a specification recommends starting with a random type of value between 0 and 4 seconds, and doubling the timer after each retransmission. After the timer reaches a large value, 60 seconds, the client does not increase the timer, but continuous to use randomization. Doubling the timeout of each retransmission keeps BOOTP from adding excessive traffic to a congested netwok; the randomization helps avoids simultaneous transmissions. 2.3.2.3 The BOOTP Message Format To keep an implementation as simple as possible, BOOTP messages have fixed length fields, and replies have the same format as requests. Although we said that the client and servers are programs, the BOOTP protocol uses the terms loosly, referring to the machine that ends a BOOTP request as the client and any machine that sends a reply as a server. The figure 2.1 shows the BOOTP message format. Field OP specifies that the message is a request (1) or a reply (2). As in ARP, fields HTYPE and HLEN specify the network hardware type and length of the hardware address (eg Ethernet has type 1 and address length 6). The client places 0 in the HOPS field. If it receives the request and decide to pass the request on to another machine (e.g. to allow bootstrapping across multiple routers), the BOOTP server increments the HOPS counts. The TRANSACTION ID field contains a integer that diskless machines use to match responses with requests. The SECONDS field reports the number of seconds since the client started to boot. The CLIENT IP ADDRESS field and all fields following it contains the most important information. To allow the greatest flexibility, clients fill in as much information as they know and leave remaining fields set to zero. For example, if a client knows the name or address of a specific server from which it wants information, it can fill in the SERVER IP ADDRESS or SERVER HOST NAME fields. If these fields are nonzero, only the server with matching name/address will answer the request; if they are zero, any server that receives the request will reply.


Figure 2.1. The format of BOOTP message
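The fixed-length layout shown in figure 2.1 can be illustrated with a short packing sketch. The field sizes below follow the classic 300-octet BOOTP message described above (OP, HTYPE, HLEN, HOPS, TRANSACTION ID, SECONDS, a 2-octet unused field, the four IP address fields, the client hardware address, server host name, boot file name and a vendor-specific area); the helper name and the example values are assumptions for illustration.

```python
import struct

# Fixed-length BOOTP message layout (requests and replies share the same format).
BOOTP = struct.Struct("!BBBB I H H 4s 4s 4s 4s 16s 64s 128s 64s")

def bootp_request(transaction_id, hw_addr, seconds=0):
    # OP=1 (request), HTYPE=1 and HLEN=6 for Ethernet, HOPS=0; the client fills in
    # what it knows and leaves the remaining fields zero, as described above.
    return BOOTP.pack(1, 1, 6, 0, transaction_id, seconds, 0,
                      b"\x00" * 4,                    # client IP address (unknown yet)
                      b"\x00" * 4,                    # "your" IP address (server fills this in)
                      b"\x00" * 4,                    # server IP address
                      b"\x00" * 4,                    # router (gateway) IP address
                      hw_addr.ljust(16, b"\x00"),     # client hardware address
                      b"\x00" * 64,                   # server host name
                      b"unix".ljust(128, b"\x00"),    # generic boot file name
                      b"\x00" * 64)                   # vendor-specific area

msg = bootp_request(0x1234, bytes.fromhex("aabbccddeeff"))
print(len(msg), "bytes")   # the fixed-size 300-byte message
```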

BOOTP can be used by a client that already knows its IP address (e.g., to obtain boot file information). A client that knows its IP address places it in the CLIENT IP ADDRESS field; other clients use zero. If the client's IP address is zero in the request, a server returns the client's IP address in the YOUR IP ADDRESS field.

2.3.2.4 The Two-Step Bootstrap Procedure

BOOTP uses a two-step bootstrap procedure. It doesn't provide the client with a memory image; it only provides the client with the information needed to obtain an image. The client then uses a second protocol to obtain the memory image. While the two-step procedure may seem unnecessary, it allows a clean separation of configuration and storage. A BOOTP server doesn't need to run on the same machine that stores memory images. In fact, the BOOTP server operates from a simple database that only knows the names of memory images. Keeping configuration separate from storage is important because it allows an administrator to configure sets of machines so they act identically or independently. The BOOT FILE NAME field of a BOOTP message illustrates the concept. Suppose an
administrator has several workstations with different architectures, and suppose that when users boot one of the workstations, they choose either to run UNIX or a local operating system. Because the set of workstations includes multiple hardware architectures, no single memory image will operate on all machines. To accommodate such diversity, BOOTP allows the BOOT FILE NAME field in a request to contain a generic name like "unix", which means "I want to boot the UNIX operating system for this machine." The BOOTP server consults its configuration database to map the generic name into a specific file name that contains a UNIX memory image appropriate for the client hardware, and returns the specific (i.e., fully qualified) name in its reply. Of course, the configuration database also allows completely automatic bootstrapping, in which the client places zeros in the BOOT FILE NAME field and BOOTP selects a memory image for the machine. The advantage of the automatic approach is that it allows users to specify generic names that work on any machine; they don't need to remember specific file names or hardware architectures.

2.3.3 Dynamic Host Configuration Protocol

BOOTP is not a dynamic configuration protocol. When a client requests its IP address, the BOOTP server consults a table that matches the physical address of the client with its IP address. This implies that the binding between the physical address and the IP address of the client already exists; the binding is predetermined. BOOTP cannot handle situations like assigning a temporary IP address or changing an IP address. The Dynamic Host Configuration Protocol (DHCP) has been devised to provide static and dynamic address allocation that can be manual or automatic. DHCP automates the assignment of IP addresses, subnet masks, default gateways, and other IP parameters. The assignment occurs when the DHCP-configured machine boots up or regains connectivity to a network. The DHCP client sends out a query requesting a response from a DHCP server on the locally attached network. The query is typically initiated immediately after booting up and before the client initiates any IP-based communication with other hosts. The DHCP server then replies to the client with its assigned IP address, subnet mask, DNS server and default gateway information. The assignment of the IP address generally expires after a predetermined period of time, before which the DHCP client and server renegotiate a new IP address from the server's predefined pool of addresses. Typical intervals range from one hour to several months, and can, if desired, be set to infinite (never expire). The length of time the address is available to the device it was assigned to is called a lease, and is determined by the server. Configuring firewall rules to accommodate access from machines that receive their IP addresses via DHCP is therefore more difficult, because the remote IP address will vary from time to time. Administrators must usually allow access to the entire
remote DHCP subnet for a particular TCP/UDP port. Most home routers and firewalls are configured in the factory to be DHCP servers for a home network. An alternative to a home router is to use a computer as a DHCP server. ISPs generally use DHCP to assign clients individual IP addresses. DHCP is a broadcast-based protocol. As with other types of broadcast traffic, it does not cross a router unless specifically configured to do so. Users who desire this capability must configure their routers to pass DHCP traffic across UDP ports 67 and 68. Home users, however, will practically never need this functionality.

2.3.3.1 IP Address Allocation

Depending on the implementation, the DHCP server has three methods of allocating IP addresses:
1. Manual allocation, where the DHCP server performs the allocation based on a table of MAC address to IP address pairs manually filled in by the server administrator. Only requesting clients with a MAC address listed in this table get the IP address given in the table.
2. Automatic allocation, where the DHCP server permanently assigns to a requesting client a free IP address from a range given by the administrator.
3. Dynamic allocation, the only method which provides dynamic re-use of IP addresses. A network administrator assigns a range of IP addresses to DHCP, and each client computer on the LAN has its TCP/IP software configured to request an IP address from the DHCP server when that client computer's network interface card starts up. The request-and-grant process uses a lease concept with a controllable time period. This eases the network installation procedure on the client computer side considerably.
The choice of method remains transparent to clients. Some DHCP server implementations can update the DNS name associated with the client hosts to reflect the new IP address.

2.3.3.2 DHCP Discovery

The client broadcasts on the local physical subnet to find available servers. Network administrators can configure a local router to forward DHCP packets to a DHCP server on a different subnet. The client implementation creates a UDP packet with the broadcast destination of 255.255.255.255 or the subnet broadcast address. A client can also request its last-known IP address. If the client is still in a network where this IP is valid, the server might grant the request. Otherwise, it depends on whether the server is set up as authoritative or not. An authoritative server will deny the request, making the client ask for a new IP immediately. A non-authoritative server simply ignores the request, leading to an implementation-dependent timeout after which the client gives up on the request and asks for a new IP.
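Looking back at the allocation policies of section 2.3.3.1, the simplified sketch below contrasts the three methods on the server side. The class and method names are illustrative assumptions, not part of any real DHCP server.

    import time

    class AddressPool:
        def __init__(self, manual_table, free_addresses, default_lease=3600):
            self.manual = dict(manual_table)    # MAC -> fixed IP (manual allocation)
            self.free = list(free_addresses)    # addresses available for allocation
            self.permanent = {}                 # MAC -> IP (automatic allocation)
            self.leases = {}                    # MAC -> (IP, expiry) (dynamic allocation)
            self.default_lease = default_lease

        def allocate(self, mac, dynamic=True):
            if mac in self.manual:                     # manual: preconfigured binding
                return self.manual[mac], None
            if mac in self.permanent:
                return self.permanent[mac], None
            if not self.free:
                raise RuntimeError("address pool exhausted")
            ip = self.free.pop(0)
            if not dynamic:                            # automatic: permanent, first free address
                self.permanent[mac] = ip
                return ip, None
            expiry = time.time() + self.default_lease  # dynamic: leased for a limited time
            self.leases[mac] = (ip, expiry)
            return ip, expiry

        def reclaim_expired(self):
            now = time.time()
            for mac, (ip, expiry) in list(self.leases.items()):
                if expiry < now:                       # lease ran out: address becomes reusable
                    del self.leases[mac]
                    self.free.append(ip)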

2.3.3.3 DHCP Request/Response and Acknowledgement

When a DHCP server receives an IP lease request from a client, it extends an IP lease offer. This is done by reserving an IP address for the client and sending a DHCPOFFER message across the network to the client. This message contains the client's MAC address, followed by the IP address that the server is offering, the subnet mask, the lease duration, and the IP address of the DHCP server making the offer. The server determines the configuration based on the client's hardware address as specified in the CHADDR field. The server (for example, 192.168.1.1) places the offered IP address in the YIADDR field.

When the client PC receives an IP lease offer, it must tell all the other DHCP servers that it has accepted an offer. To do this, the client broadcasts a DHCPREQUEST message containing the IP address of the server that made the offer. When the other DHCP servers receive this message, they withdraw any offers that they might have made to the client. They then return the addresses that they had reserved for the client to the pool of valid addresses that they can offer to another computer. Any number of DHCP servers can respond to an IP lease request, but the client can only accept one offer per network interface card.

After 50% of the lease time has passed, the client will attempt to renew the lease with the original DHCP server from which it obtained the lease, using a DHCPREQUEST message. Any time the client boots and 50% or more of the lease has passed, the client will attempt to renew the lease. At 87.5% of the lease completion, the client will attempt to contact any DHCP server for a new lease. If the lease expires, the client will send a request as in the initial boot, when the client had no IP address. If this fails, the client TCP/IP stack will cease functioning.

When the DHCP server receives the DHCPREQUEST message from the client, it initiates the final phase of the configuration process. This acknowledgement phase involves sending a DHCPACK packet to the client. This packet includes the lease duration and any other configuration information that the client might have requested. At this point, the TCP/IP configuration process is complete. The server acknowledges the request and sends the acknowledgement to the client. The system as a whole expects the client to configure its network interface with the supplied options. The exchange of messages in DHCP is shown in figure 2.2.
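The renewal timing described above is simple arithmetic on the lease duration; a small sketch makes it explicit. The 50% and 87.5% points are often called T1 and T2, names used here only for convenience.

    def lease_timers(lease_seconds: float):
        t1 = 0.5 * lease_seconds       # try to renew with the original server
        t2 = 0.875 * lease_seconds     # rebind: contact any DHCP server
        return t1, t2

    # Example: a one-day lease gives T1 after 12 hours and T2 after 21 hours.
    print(lease_timers(24 * 3600))     # (43200.0, 75600.0)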

Figure 2.2 DHCP message exchange between client and server (DHCPDISCOVER, DHCPOFFER, DHCPREQUEST, DHCPACK, and later DHCPREQUEST, DHCPNACK, DHCPRELEASE)

2.3.3.4 DHCP Message Format

As figure 2.3 illustrates, DHCP uses the BOOTP message format, but modifies the contents and meanings of some fields.

Figure 2.3. The format for DHCP message


As the figure shows, most of the fields in a DHCP message are identical to fields in a BOOTP message. In fact, the two protocols are compatible; a DHCP server can be programmed to answer BOOTP requests. However, DHCP changes the meaning of two fields. First, DHCP interprets BOOTP's UNUSED field as a 16-bit FLAGS field. In fact, figure 2.4 shows that only the high-order bit of the FLAGS field has been assigned a meaning.

Figure 2.4. The format of the 16-bit FLAGS field in a DHCP message.

Because the DHCP request message contains the client hardware address, a DHCP server normally sends its responses to the client using hardware unicast. A client sets the high-order bit in the FLAGS field to request that the server respond using hardware broadcast instead of hardware unicast. To understand why a client might choose a broadcast response, recall that while the client communicates with a DHCP server, it does not yet have an IP address. If a datagram arrives via hardware unicast and the destination address does not match the computer's address, IP can discard the datagram. However, IP is required to accept and handle any datagram sent to the IP broadcast address. To ensure that IP software accepts and delivers DHCP messages that arrive before the machine's IP address has been configured, a DHCP client can request that the server send responses using IP broadcast.

2.3.3.5 DHCP Options and Message Type

Surprisingly, DHCP does not add new fixed fields to the BOOTP message format, nor does it change the meaning of most fields. For example, the OP field in the DHCP message contains the same values as the OP field in a BOOTP message: the message is either a boot request (1) or a boot reply (2). To encode information such as the lease duration, DHCP uses options. In particular, figure 2.5 illustrates the DHCP message type option used to specify which DHCP message is being sent. The options field has the same format as the VENDOR SPECIFIC AREA, and DHCP honors all the vendor-specific information items defined for BOOTP. As in BOOTP, each option consists of a 1-octet code field and a 1-octet length field followed by the octets of data that comprise the option. As the figure shows, the option used to specify a DHCP message type consists of exactly three octets. The first octet contains the code 53, the second contains the length 1, and the third contains the value used to identify one of the possible DHCP messages. (A short sketch after figure 2.5 illustrates both the FLAGS bit and this option encoding.)


TYPE FIELD    Corresponding DHCP Message Type
1             DHCPDISCOVER
2             DHCPOFFER
3             DHCPREQUEST
4             DHCPDECLINE
5             DHCPACK
6             DHCPNACK
7             DHCPRELEASE

Figure 2.5 Possible values for TYPE in DHCP Message
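As promised above, the sketch below encodes the broadcast bit of the FLAGS field and the three-octet message-type option, and walks a raw options area to find the type again. It is a hedged fragment: a real DHCP options area is additionally preceded by a 4-octet magic cookie, which is omitted here.

    import struct

    BROADCAST_FLAG = 0x8000            # high-order bit of the 16-bit FLAGS field

    DHCP_TYPES = {
        "DHCPDISCOVER": 1, "DHCPOFFER": 2, "DHCPREQUEST": 3, "DHCPDECLINE": 4,
        "DHCPACK": 5, "DHCPNACK": 6, "DHCPRELEASE": 7,
    }

    def flags_field(want_broadcast_reply: bool) -> bytes:
        return struct.pack("!H", BROADCAST_FLAG if want_broadcast_reply else 0)

    def message_type_option(name: str) -> bytes:
        return bytes([53, 1, DHCP_TYPES[name]])    # code 53, length 1, value

    def parse_message_type(options: bytes) -> int:
        i = 0
        while i < len(options):
            code = options[i]
            if code == 0:              # pad octet, no length field
                i += 1
                continue
            if code == 255:            # end-of-options marker
                break
            length = options[i + 1]
            if code == 53:
                return options[i + 2]
            i += 2 + length
        raise ValueError("no message type option present")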

Fields SERVER HOST NAME and BOOT FILE NAME in the DHCP message header each occupy many octets. If a given message does not contain information in either of those fields, the space is wasted. To allow a DHCP server to use the two fields for other options, DHCP defines an option overload option. When present, the overload option tells the receiver to ignore the usual meaning of the SERVER HOST NAME and BOOT FILE NAME fields, and to look for options in those fields instead.

2.4 IP MULTICAST

2.4.1 Motivation and Requirements

The term multicasting is used to describe the distribution of a copy of the packets/ datagrams generated by each host to all other hosts in the group called multicast group. The motivation for developing multicast is that there are applications that want to send a packet to more than one destination host. For example, applications such as audio and video conferencing require a copy of the information generated by each host participating in a conference to be sent to all the other hosts that belong to the same conference. Some other popular examples that involve one sender and multiple receivers are updating replicated, distributed databases, transmitting stock quote to multiple brokers etc. The fact to be observed by you in all these cases is that, the basic forwarding mechanism of IP, i.e., unicasting is not sufficient. In such applications the required mode of communication is one to many, i.e., the datagram generated by the source should reach all the members of the multicast group. Now the issue is who is responsible for the creation of multiple datagrams and delivery to all members of the group. The choices are either the host or the network. The idea followed in multicasting is instead of forcing the source host to send a separate packet to each of the destination
hosts in the multicast group, we want the source to be able to send a single packet to a multicast address, and the network is expected to deliver a copy of that packet to each of a group of hosts. Besides delivering the datagrams to multiple hosts, the network has to satisfy the following requirements of the hosts: hosts must be able to join or leave a group at will, without synchronizing or negotiating with other members of the group, and a host may belong to more than one group at a time. Internet multicast can be implemented on top of a collection of networks that support hardware multicast (or broadcast) by extending the routing and forwarding functions implemented by the routers that connect these networks. In the internet, multicasting is implemented by making use of the services provided by normal IP. However, to make normal IP support multicasting, a number of additional data structures and tables have to be maintained by the routers (at least by a subset of routers), because normal IP communication is between one sender and one receiver. In this section, you are going to learn the various concepts and protocols involved in the process of multicasting.

2.4.2 Characteristics and Components of IP Multicasting

Multicasting can be implemented either with the support of the network technology (hardware) or as a software abstraction. Implementing multicast through hardware is feasible in LAN technologies like token ring, Ethernet, etc. However, hardware implementation of multicast in a worldwide network like the Internet is almost impossible. Hence IP multicasting is the software abstraction of hardware multicasting. The three conceptual pieces required for a general-purpose internet multicasting system are as follows.
1. A multicast addressing scheme
2. An effective notification and delivery mechanism
3. An efficient internetwork forwarding facility
It is important for you to understand that any transmission scheme should have an addressing scheme in which the destinations can be identified without any conflict in the addresses. The normal addresses that are assigned to hosts cannot be used for multicasting. Hence it becomes necessary for the address field to have special fields or an indication to inform the network that the mode of communication required is not the usual unicast, but instead that the information is to be delivered to a set of hosts. The major challenge in devising multicast addresses is that the addressing scheme has to satisfy two conflicting goals. It has to allow local autonomy in assigning the IP addresses, and at the same time the assigned addresses should have global meaning.


Similarly, hosts need a notification mechanism to inform routers about multicast groups in which they are participating, and routers need a delivery mechanism to transfer multicast packets to hosts. Even here complications arise, because the multicasting scheme should make effective use of hardware multicast when it is available and, at the same time, should allow IP multicast delivery over networks which do not have hardware support for multicast. A multicasting scheme has to support an effective and dynamic forwarding mechanism that can meet the requirements of the group without wasting the valuable resources of the network. It should route multicast packets along the shortest paths, should not send a copy of a datagram along a path if the path does not lead to a member of the group, and should allow hosts to join and leave groups at any time. IP multicasting includes all three aspects. It defines IP multicast addressing, specifies how hosts send and receive multicast datagrams, and describes the protocol routers use to determine multicast group membership on a network. All these issues indicate that multicasting is much more complicated than the simple unicasting scheme. In IP terminology, a given subset of hosts is known as a multicast group. IP multicasting has the following characteristics.
1. Group Address: each multicast group has a unique class D address. A few IP multicast addresses are permanently assigned by the Internet authority, and correspond to groups that always exist even if they have no current members. Other addresses are temporary, and are available for private use.
2. Number of Groups: IP provides addresses for up to 2^28 simultaneous multicast groups. Thus, the number of groups is limited by practical constraints on routing table size rather than by addressing.
3. Dynamic Group Membership: a host can join or leave an IP multicast group at any time. Furthermore, a host may be a member of an arbitrary number of multicast groups.
4. Use of Hardware: if the underlying network hardware supports multicast, IP uses hardware multicast to send IP multicast. If the hardware does not support multicast, IP uses broadcast or unicast to deliver IP multicast.
5. Inter-Network Forwarding: because members of an IP multicast group can attach to multiple physical networks, special multicast routers are required to forward IP multicast; the capability is usually added to conventional routers.
6. Delivery Semantics: IP multicast uses the same best-effort delivery semantics as other IP datagram delivery, meaning that multicast datagrams can be lost, delayed, duplicated or delivered out of order.
7. Membership and Transmission: an arbitrary host may send datagrams to any multicast group; group membership is only used to determine whether the host receives datagrams sent to the group.
2.4.3 Multicast Addresses and Multicast Delivery

Recall that many hardware technologies contain mechanisms to send packets to multiple destinations simultaneously (or at least nearly simultaneously). In particular, shared-media networks are able to deliver frames to all the nodes of the network, and this type of delivery is called broadcast. In other words, most Local Area Networks (LANs) support broadcast, and hence multicasting, which is a subset of broadcasting, can be accomplished at the hardware level itself. Note, however, that in the case of switched networks or internets it is not possible to achieve multicasting at the hardware level; in such networks either broadcast or multicast has to be implemented at the software level. Moreover, even in technologies where hardware-level broadcast is available, special addresses are required for multicast. Hence, in technologies where broadcast is supported it is possible to have multicast addresses at the layer 2 level; in other networks, it becomes necessary to have multicast addresses at the layer 3 level.

2.4.3.1 Layer 2 Multicast Addresses

Normally, Network Interface Cards (NICs) on a LAN segment will receive only packets destined for their burned-in Medium Access Control (MAC) address or the broadcast MAC address. Some means had to be devised so that multiple hosts could receive the same packet and still be capable of differentiating among multicast groups. Fortunately, the IEEE LAN specifications made provisions for the transmission of broadcast and/or multicast packets. In the 802.3 standard, bit 0 of the first octet is used to indicate a broadcast and/or multicast frame. Figure 2.6 shows the location of the broadcast/multicast bit in an Ethernet frame.

Figure 2.6. IEEE 802.3 MAC Address Format

This bit indicates that the frame is destined for an arbitrary group of hosts or all hosts on the network (in the case of the broadcast address, 0xFFFF.FFFF.FFFF). IP multicast makes use of this capability to transmit IP packets to a group of hosts on a LAN segment. The IANA (Internet Assigned Numbers Authority) owns a block of Ethernet MAC addresses that start with 01:00:5E in hexadecimal. Half of this block is allocated for multicast addresses. This creates the range of available Ethernet MAC addresses to be 0100.5e00.0000 through 0100.5e7f.ffff. This allocation allows for 23 bits in the Ethernet address to correspond to the IP multicast group address. The mapping places the lower 23 bits of the IP multicast group address into these available 23 bits in the Ethernet address as shown in Figure 2.7.

Figure 2.7. Mapping of IP Multicast to Ethernet/FDDI MAC Address

Because the upper 5 bits of the IP multicast address are dropped in this mapping, the resulting address is not unique. In fact, 32 different multicast group IDs all map to the same Ethernet address, as shown in figure 2.8.

Figure 2.8. MAC Address Ambiguities
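The 23-bit mapping just described is mechanical enough to express in a few lines. The sketch below copies the low-order 23 bits of a class D address into the 01:00:5E block; the example shows two addresses that differ only in the dropped bits and therefore share one Ethernet address.

    import socket
    import struct

    def multicast_mac(group: str) -> str:
        addr = struct.unpack("!I", socket.inet_aton(group))[0]
        if (addr >> 28) != 0b1110:
            raise ValueError("not a class D (multicast) address")
        low23 = addr & 0x7FFFFF                  # keep only the lower 23 bits
        mac = (0x01005E << 24) | low23           # prefix 01:00:5E plus those 23 bits
        return ":".join(f"{(mac >> s) & 0xFF:02x}" for s in range(40, -1, -8))

    # 224.1.1.1 and 239.129.1.1 differ only in the 5 dropped bits,
    # so both map to 01:00:5e:01:01:01.
    print(multicast_mac("224.1.1.1"), multicast_mac("239.129.1.1"))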

Since Ethernet is the most popular LAN technology, layer 2 multicast addressing has been discussed in terms of Ethernet. It is worth knowing that Ethernet device driver software can reconfigure the interface to allow it to also recognize one or more multicast addresses. After the reconfiguration, an interface will accept any packet sent to the computer's unicast address, the broadcast address, or the configured multicast addresses. Ideally, IP multicast would support all types of layer 2 multicast addresses; in practice, IP multicast supports only popular technologies like Ethernet.

2.4.3.2 Class D Multicast Address

The TCP/IP reference model supports multicast addressing at layer 3, and these addresses must be used when the members of a multicast group are geographically distributed across networks. IP supports multicasting using class D addresses. Each class D address identifies a group of hosts. Twenty-eight bits are available for identifying groups, so over 250 million groups can exist at the same time.

When a process sends a packet to a class D address, a best-effort attempt is made to deliver it to all the members of the group associated, but no guarantees are given. Some members may not get the packet. Two kinds of group addresses are supported: permanent addresses and temporary ones. Permanent addresses are called well known; they are used for major services on the global Internet as well as for infrastructure maintenance (e.g. multicast routing protocols). Other multicast addresses correspond to transient multicast groups that are created when needed and discarded when the count of group members reaches zero. A temporary group must be created before it can be used. The format of a class D address is shown in figure 2.9.
| 1 1 1 0 | Group Identification (28 bits) |

Figure 2.9 The format of a class D address

The first 4 bits contain 1110 and identify the address as a multicast address. The remaining 28 bits specify a particular multicast group. There is no further structure in the group bits. In particular, the group field is not partitioned into bits that identify the origin or owner of the group, nor does it contain administrative information such as whether all members of the group are on one physical network. When expressed in dotted decimal notation, multicast addresses range from 224.0.0.0 through 239.255.255.255. However, many parts of the address space have been assigned special meanings. For example, the lowest address, 224.0.0.0, is reserved; it cannot be assigned to any group. Furthermore, the remaining addresses up through 224.0.0.255 are devoted to multicast routing and group maintenance protocols. A router is prohibited from forwarding a datagram sent to any address in that range. Figure 2.10 shows a few examples of popular permanently assigned addresses.

Address                                 Meaning
224.0.0.0                               Base address (Reserved)
224.0.0.1                               All systems on this subnet
224.0.0.2                               All routers on this subnet
224.0.0.3                               Unassigned
224.0.0.4                               DVMRP routers
224.0.0.5                               OSPFIGP All Routers
224.0.0.6                               OSPFIGP Designated Routers
224.0.0.7                               ST Routers
224.0.0.8                               ST Hosts
224.0.0.9                               RIP2 Routers
224.0.0.10                              IGRP Routers
224.0.0.11                              Mobile Agents
224.0.0.12                              DHCP Server / Relay Agent
224.0.0.13                              All PIM Routers
224.0.0.14                              RSVP Encapsulation
224.0.0.15                              All CBT Routers
224.0.0.16                              Designated-Sbm
224.0.0.17                              All-Sbms
224.0.0.18                              VRRP
224.0.0.19 through 224.0.0.255          Unassigned
224.0.1.21                              DVMRP on MOSPF
224.0.1.84                              Jini Announcement
224.0.1.85                              Jini Request
239.192.0.0 through 239.251.255.255     Scope restricted to one organization

Figure 2.10. Examples of a few permanent IP multicast addresses
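Based on the ranges in figure 2.10 and the discussion of scoping later in this section, a small helper can classify an address. This is an illustration only; the exact administrative boundaries in a real deployment are set by the site.

    import ipaddress

    def classify(addr: str) -> str:
        ip = ipaddress.ip_address(addr)
        if not ip.is_multicast:                                   # outside 224.0.0.0/4
            return "not a multicast (class D) address"
        if ip in ipaddress.ip_network("224.0.0.0/24"):
            return "local control group: never forwarded by routers"
        if ipaddress.ip_address("239.192.0.0") <= ip <= ipaddress.ip_address("239.251.255.255"):
            return "administratively scoped: restricted to one organization"
        return "ordinary multicast group"

    print(classify("224.0.0.5"))     # OSPF all-routers: local control range
    print(classify("239.200.1.1"))   # organization-local scope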

Among the above set of permanent multicast group addresses, two of the addresses are especially important to the multicast delivery mechanism. Address 224.0.0.1 is permanently assigned to the all systems group, and address 224.0.0.2 is permanently assigned to the routers group. The all system group includes all hosts and routers on a network that are participating in IP multicast, whereas the all routers group includes only the routers that are participating. In general, both of these groups are used for control protocols and not for the normal delivery of data. Furthermore, datagrams sent to these addresses only reach machines on the same local network as the sender; there are no IP multicast addresses that refer to all systems in the internet or all routers in the internet.

IP treats multicast addresses differently than unicast addresses. For example, a multicast address can only be used as a destination address. Thus, a multicast address can never appear in the source address field of a datagram, nor can it appear in a source route or record route option. Furthermore, no ICMP error messages can be generated about multicast datagrams (e.g. destination unreachable, source quench, echo reply, or time exceeded). Thus, a ping sent to a multicast address will go unanswered. The rule prohibiting ICMP errors is somewhat surprising because IP routers do honor the time-to-live field in the header of a multicast datagram. As usual, each router decrements the count, and discards the datagram (without sending an ICMP message) if the count reaches zero. We will see that some protocols use the time-to-live count as a way to limit datagram propagation.

2.4.3.3 Role of Hosts and Routers in Multicast Delivery

It is important for you to understand the roles played by the host and the network (routers and multicast routers) in delivering datagrams to all the members of the group. In multicasting, the datagrams transmitted by the source (host) have to reach all other members (hosts) of the multicast group. Two possible scenarios exist in which we have to achieve this: a single network, and an internet. In the former case, a host can send directly to a destination host merely by placing the datagram in a frame and using a hardware multicast address to which the receiver is listening. In the latter case, a host sends the datagram to the nearest multicast router, and multicast routers forward the datagram across the internet. However, IP multicasting reduces the responsibility of the hosts: it does not expect a host either to install a route to a multicast router or to make its default route point to a multicast router. Instead, the technique a host uses to forward a multicast datagram to a router is unlike the routing lookup used for unicast and broadcast datagrams: the host merely uses the local network hardware's multicast capability to transmit the datagram. Multicast routers listen for all IP multicast transmissions. If a multicast router is present on the network, it will receive the datagram and forward it on to another network if necessary. Thus the primary difference between local and nonlocal multicast lies in multicast routers, not in hosts.

2.4.3.4 Multicast Scope

The scope of a multicast group refers to the range of group members. If all the members are on the same physical network, we say that the group's scope is restricted to one network. Similarly, if all members of a group lie within a single organization, we say that the group has a scope limited to one organization. In addition to the group's scope, each multicast datagram has a scope, which is defined to be the set of networks over which a given multicast datagram will be propagated. Informally, a datagram's scope is referred to as its range.


IP uses two techniques to control multicast scope. The first technique relies on the datagram's time-to-live (TTL) field to control its range. By setting the TTL to a small value, a host can limit the distance the datagram will be routed. For example, the standards specify that control messages used for communication between a host and a router on the same network must have a TTL of 1. As a consequence, a router never forwards a datagram carrying such control information, because the TTL expires and causes the router to discard the datagram. Similarly, if two applications running on a single host want to use IP multicast for interprocess communication (e.g. for testing software), they can choose a TTL value of 0 to prevent the datagram from leaving the host. It is possible to use successively larger values of the TTL field to further extend the notion of scope. For example, some router vendors suggest configuring routers at a site to restrict multicast datagrams from leaving the site unless the datagram has a TTL greater than 15. We conclude that it is possible to use the TTL field in a datagram header to provide coarse-grain control over the datagram's scope. The second technique is administrative scoping, which consists of reserving parts of the address space for groups that are local to a given site or local to a given organization. According to the standard, routers in the internet are forbidden from forwarding any datagram that has an address chosen from the restricted space. Thus, to prevent multicast communication among group members from accidentally reaching outsiders, an organization can assign the group an address that has local scope. Figure 2.10 shows an example of an address range that corresponds to administrative scoping.

2.4.3.5 Extending Host Software to Handle Multicasting

A host participates in IP multicast at one of the three levels shown in figure 2.11.
Level    Meaning
0        Host can neither send nor receive IP multicast
1        Host can send but not receive IP multicast
2        Host can both send and receive IP multicast

Figure 2.11. Three levels of participation in IP multicast
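As a concrete illustration of level 2 participation, the sketch below uses the standard socket interface to ask the host to join a group on a particular interface; the host's IP/IGMP software then takes care of informing local multicast routers. The group address and port are arbitrary values chosen for the example.

    import socket
    import struct

    GROUP = "239.200.1.1"      # an organization-scoped group chosen for the example
    PORT = 5000

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))

    # Join the group on the default interface; the second field of the ip_mreq
    # structure names the interface, which matters on multihomed hosts.
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    data, sender = sock.recvfrom(1500)    # blocks until one datagram sent to the group arrives
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, mreq)   # leave the group
    sock.close()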

Modifications that allow a host to send IP multicast are not difficult. The IP software must allow an application program to specify a multicast address as a destination IP address, and the network interface software must be able to map an IP multicast address into the corresponding hardware multicast address (or use broadcast if the hardware does not support multicasting). Extending host software to receive IP multicast datagrams is more complex. IP software on the host must have an API that allows an application program to declare that it wants to join or leave a particular multicast group. If multiple application programs join the same group, the IP software must remember to pass each of them a copy of the datagrams that arrive destined for the group. A process can ask its host to join a specific group. It can also ask its host to leave the group. When the last process on a host leaves the group, that group is no longer present on the host. Each host keeps track of which groups its processes currently belong to. A host with multiple network connections may join a particular multicast group on one network and not on another. To understand the reason for keeping group
membership associated with networks, remember that it is possible to use IP multicasting among local sets of machines. The host may want to use a multicast application to interact with machines on one physical net and not with the machines of another. Because group membership is associated with particular networks, the software must keep separate lists of multicast addresses for each network to which the machine attaches. Furthermore, an application program must specify a particular network when it asks to join or leave a multicast group. The TCP/IP protocol stack includes a group management protocol (IGMP) exclusively to maintain the multicast groups available in the internet.

Have you understood?
1. What is the basic difference between unicasting and multicasting?
2. What are the roles of hosts and routers in multicasting?
3. What are the conceptual pieces of IP multicasting?
4. Give some examples for LAN technologies that support multicasting.
5. What is the necessity of layer 3 or IP multicast?

2.5 INTERNET GROUP MANAGEMENT PROTOCOL

From the discussion in the above sections, you might have understood that the management of multicast groups is a basic necessity for implementing IP multicasting. The major goal of multicast IP is to relieve the source of the overhead of sending a separate packet to each of the destination hosts in the multicast group and to put the responsibility in the network. When we say that the network has to do something, we refer to the multicast routers in the network. The assumption is that each multicast router has stored in its multicast address table the complete set of group addresses in which it has an interest. The issue, then, is how a multicast router learns the multicast addresses associated with its own attached networks/subnets. This is achieved by a protocol named the Internet Group Management Protocol (IGMP). IGMP is used by IP hosts to report their multicast group memberships to any immediately neighboring multicast routers. To participate in IP multicast on a local network, a host must have IGMP software that allows it to send and receive multicast datagrams. To participate in a multicast that spans multiple networks, the host must inform local multicast routers. Routers that are members of multicast groups are expected to behave as hosts as well as routers, and may even respond to their own queries. Like ICMP, IGMP is an integral part of IP. It is required to be implemented by all hosts wishing to receive IP multicasts. IGMP messages are encapsulated in IP datagrams, with an IP protocol number of two. In the following sections, you will learn about the header format and operation of the IGMP protocol.


2.5.1 IGMP Header


The various fields of the IGMP header are shown in figure 2.12.

Figure 2.12. IGMP header (fields: TYPE, RESP TIME, CHECKSUM, and GROUP ADDRESS, which is zero in a general query)
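Before walking through each field, here is a hedged sketch of building the 8-octet IGMPv2 message of figure 2.12, including the ones-complement checksum described below. Actually transmitting it would additionally require a raw socket using IP protocol number 2 and a TTL of 1, which is omitted here.

    import socket
    import struct

    def internet_checksum(data: bytes) -> int:
        if len(data) % 2:
            data += b"\x00"                          # pad to a whole number of 16-bit words
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        while total >> 16:                           # fold carries back into the low 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF                       # ones complement of the ones-complement sum

    def igmp_message(msg_type: int, max_resp: int, group: str) -> bytes:
        body = struct.pack("!BBH4s", msg_type, max_resp, 0, socket.inet_aton(group))
        csum = internet_checksum(body)
        return struct.pack("!BBH4s", msg_type, max_resp, csum, socket.inet_aton(group))

    # Version 2 membership report (type 0x16) for group 224.1.1.1
    report = igmp_message(0x16, 0, "224.1.1.1")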

TYPE - There are three types of IGMP messages of concern to the host-router interaction. The value 0x11 indicates a membership query. There are two sub-types of membership query: a general query, used to learn which groups have members on an attached network, and a group-specific query, used to learn whether a particular group has any members on an attached network. The value 0x16 indicates a membership report, and leave group messages are indicated by 0x17.

RESP TIME - The maximum response time field is meaningful only in membership query messages; it specifies the maximum allowed time before sending a responding report, in units of 1/10 second. In all other messages, it is set to zero by the sender and ignored by receivers.

CHECKSUM - The checksum is the 16-bit ones complement of the ones-complement sum of the whole IGMP message (the entire IP payload). The computation of the checksum in IGMP is similar to the computation of the checksum in IP.

GROUP ADDRESS - In a membership query message, the group address field is set to zero when sending a general query, and set to the group address being queried when sending a group-specific query. In a membership report or leave group message, the group address field holds the IP multicast group address of the group being reported or left.

Other fields - Certain IGMP messages may be longer than 8 octets. As long as the type is one that is recognized, an IGMPv2 implementation must ignore anything past the first 8 octets while processing the packet. However, the IGMP checksum is always computed over the whole IP payload, not just over the first 8 octets.

2.5.2 IGMP Operation

Multicast routers use IGMP to learn which groups have members on each of their attached physical networks. A multicast router keeps a list of multicast group memberships for each attached network, and a timer for each membership. Multicast group memberships mean the presence of at least one member of a multicast group on a given attached network, not a list of all of the members. With respect to each of its attached

networks, a multicast router may assume one of two roles: querier or non-querier. Querier is the router that initiates the query message and the non-querier is the one that responds to the messages initiated by the queriers. There is normally only one querier per physical network. All multicast routers start up as a querier on each attached network. If a multicast router hears a query message from a router with a lower IP address, it must become a non-querier on that network. If a router has not heard a query message from another router for certain interval, it resumes the role of querier. Routers periodically send a general query on each attached network for which this router is the querier, to solicit membership information. On startup, a router should send specified number of general queries spaced closely together in order to quickly and reliably determine membership information. A general query is addressed to the all-systems multicast group (224.0.0.1), has a group address field of 0, and has a maximum response time called query response interval. When a host receives a general query, it sets delay timers for each group (excluding the all-systems group) of which it is a member on the interface from which it received the query. Each timer is set to a different random value, using the highest clock granularity available on the host, selected from the range (0, Max Response Time] with maximum response time as specified in the query packet. When a host receives a group-specific query, it sets a delay timer to a random value selected from the range (0, Max Response Time] for the group being queried if it is a member on the interface from which it received the query. If a timer for the group is already running, it is reset to the random value only if the requested maximum response time is less than the remaining value of the running timer. When a groups timer expires, the host multicasts a version 2 membership report to the group, with IP TTL of 1. If the host receives another hosts report (version 1 or 2) while it has a timer running, it stops its timer for the specified group and does not send a report, in order to suppress duplicate reports. When a router receives a Report, it adds the group being reported to the list of multicast group memberships on the network on which it received the report and sets the timer for the membership to the group membership interval. Repeated reports refresh the timer. If no Reports are received for a particular group before this timer has expired, the router assumes that the group has no local members and that it need not forward remotelyoriginated multicasts for that group onto the attached network. When a host joins a multicast group, it should immediately transmit an unsolicited version 2 membership report for that group, in case it is the first member of that group on the network. To cover the possibility of the initial membership report being lost or damaged, it is recommended that it be repeated once or twice after short delays called unsolicited report interval. When a host leaves a multicast group, if it was the last host to reply to a query with a membership report for that group, it should send a leave group message to the all-routers multicast group (224.0.0.2). If it was not the last host


to reply to a query, it may send nothing, as there must be another member on the subnet. This is an optimization to reduce traffic; a host without sufficient storage to remember whether or not it was the last host to reply may always send a leave group message when it leaves a group. Routers should accept a leave group message addressed to the group being left, in order to accommodate implementations of an earlier version of this standard. Leave group messages are addressed to the all-routers group because other group members have no need to know that a host has left the group, but it does no harm to address the message to the group.

When a querier receives a leave group message for a group that has group members on the reception interface, it sends last member query count group-specific queries to the group being left. These group-specific queries have their maximum response time set to the last member query interval. If no reports are received after the response time of the last query expires, the routers assume that the group has no local members, as above. Any querier to non-querier transition is ignored during this time; the same router keeps sending the group-specific queries. Non-queriers must ignore leave group messages, and queriers should ignore leave group messages for which there are no group members on the reception interface. When a non-querier receives a group-specific query message, if its existing group membership timer is greater than last member query count times the maximum response time specified in the message, it sets its group membership timer to that value.

2.5.3 Group Membership State Transitions


On a host, IGMP must remember the status of each multicast group to which the host belongs (i.e., a group from which the host accepts datagrams). We think of a host as keeping a table in which it records group membership information. Initially, all entries in the table are unused. Whenever an application on the host joins a new group, IGMP software allocates an entry and fills in information about the group. Among the information, IGMP keeps a group reference counter, which it initializes to 1. Each time another application program joins the group, IGMP increments the reference counter in the entry. If one of the application programs terminates execution (or explicitly drops out of the group), IGMP decrements the group's reference counter. When the reference count reaches zero, the host informs multicast routers that it is leaving the multicast group. An entry in a host's multicast group table can be in one of three possible states; the transitions among them, each labeled with an event and the resulting action, are depicted in figure 2.13.

Figure 2.13 Three possible states of an entry in the host's multicast group table (NON-MEMBER, DELAYING MEMBER, MEMBER), with transitions such as join group/start timer, timer expires/send report, query arrives/start timer, and leave group or duplicate report/cancel timer
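The transitions of figure 2.13, which the following paragraph describes in prose, can be sketched as a small per-group state machine. Timer handling is reduced to explicit method calls so that the transitions are easy to follow; the class is an illustration, not part of any real IGMP implementation.

    import random

    NON_MEMBER, DELAYING_MEMBER, MEMBER = "NON-MEMBER", "DELAYING MEMBER", "MEMBER"

    class GroupEntry:
        def __init__(self, group):
            self.group = group
            self.state = NON_MEMBER
            self.delay = None

        def join(self, max_resp=10.0):
            # joining starts a random report timer so an unsolicited report gets sent
            self.state = DELAYING_MEMBER
            self.delay = random.uniform(0, max_resp)

        def query_arrives(self, max_resp):
            if self.state == MEMBER:
                self.state = DELAYING_MEMBER
                self.delay = random.uniform(0, max_resp)

        def report_heard(self):
            # another member answered first: cancel our timer, suppress the duplicate
            if self.state == DELAYING_MEMBER:
                self.state = MEMBER

        def timer_expires(self, send_report):
            if self.state == DELAYING_MEMBER:
                send_report(self.group)
                self.state = MEMBER

        def leave(self):
            self.state = NON_MEMBER    # leaving cancels any pending report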

A host maintains an independent table entry for each group of which it is currently a member. When a host first joins the group or when a query arrives from a multicast router, the host moves the entry to the DELAYING MEMBER state and chooses a random delay. If another host in the group responds to the router's query before the timer expires, the host cancels its timer and moves to the MEMBER state. If the timer expires, the host sends a response message before moving to the MEMBER state. Because a router only generates a query message every 125 seconds, one expects the host to remain in the MEMBER state most of the time.

Have you understood?
1. IGMP is an integral part of IP. Justify this statement.
2. What are the complications involved in the maintenance of multicast groups?
3. If the group address field of the IGMP header is zero, what does it indicate?
4. What do the terms querier and non-querier mean in IGMP?
5. What are the three possible states of an entry in the IGMP table?

2.6 MULTICAST ROUTING ISSUES AND PROTOCOLS

Multicast routing is considerably different from unicast routing because of the many complications involved. Note that multicast groups can be formed in different ways: the members of some groups may be spread across multiple networks, while the members of other groups may all be present on a single network. Hence it becomes necessary for the routers to identify the networks in which the members of a group are present, to forward the datagrams to all the networks in which members are present, and to ensure that the datagrams are not forwarded to networks in which no members of the group are present. In unicast routing, routes change only when the topology changes or equipment fails, whereas in multicast routing, routes can change simply because an application program joins or leaves a multicast group. Usually in the internet, the forwarding of
datagrams is done just by examining the destination address of the datagram. In multicasting, however, it is not sufficient to examine the destination address alone. Even though the members are logically present in the same multicast group, they may be physically located on different networks. Hence, the actual way or direction in which datagrams are delivered to the destinations may differ, especially in cross-group communication. One more complication is that the source of information may not be a member of the multicast group, and it may be necessary for the internet to forward the datagrams across networks that do not have any group members attached. The major issue in multicast routing is exactly what information should be present in a multicast router to enable it to forward a datagram to a multicast group. Since a multicast destination represents a set of computers, multicast routing needs an optimal forwarding system, one that can transmit to all members of the multicast group without sending a datagram across a given network twice. Moreover, when a datagram is to be forwarded across multiple routers, care must be taken to ensure that the chosen routers do not form a cycle or routing loop. To avoid such routing loops, multicast routers rely on the datagram's source address.

2.6.1 Truncated Reverse Path Forwarding


One of the first ideas to emerge for multicast forwarding was a form of broadcasting described earlier. Known as Reverse Path Forwarding (RPF), the scheme uses a datagram's source address to prevent the datagram from traveling around a loop repeatedly. To use RPF, a multicast router must have a conventional routing table with shortest paths to all destinations. When a datagram arrives, the router extracts the source address, looks it up in the local routing table, and finds I, the interface that leads to the source. If the datagram arrives over interface I, the router forwards a copy to each of the other interfaces; otherwise, the router discards the copy. Because it ensures that a copy of each multicast datagram is sent across every network in the internet, the basic RPF scheme guarantees that every host in a multicast group will receive a copy of each datagram sent to the group. However, RPF alone is not used for multicast routing because it wastes bandwidth by transmitting multicast datagrams over networks that neither have group members nor lead to group members. The solution to this problem is a modified form of RPF known as Truncated Reverse Path Forwarding (TRPF). TRPF basically follows the RPF algorithm; however, it restricts propagation by avoiding paths that do not lead to group members. To use TRPF, a multicast router needs two pieces of information: a conventional routing table and a list of multicast groups reachable through each network interface. When a multicast datagram arrives, the router first applies the RPF rule. If RPF specifies discarding the copy, the router does so. However, if RPF specifies transmitting the datagram over a
particular interface, the router first makes an additional check to verify that one or more members of the group designated in the datagram's destination address are reachable over the interface. If no group members are reachable over the interface, the router skips that interface and continues examining the next one. In fact, we can now understand the origin of the term truncated: a router truncates forwarding when no more group members lie along the path. We said that TRPF is used instead of conventional RPF to avoid unnecessary traffic: TRPF does not forward a datagram to a network unless that network leads to at least one member of the group. Consequently, a multicast router must have knowledge of group membership. We also said that IP allows any host to join or leave a multicast group at any time, which results in rapid membership changes. More important, membership does not follow local scope: a host that joins may be far from some router that is forwarding datagrams to the group. So group membership information must be propagated across the internet. In general, because membership can change rapidly, the information available at a given router is imperfect, so routing may lag changes. Therefore, a multicast design represents a tradeoff between traffic overhead and inefficient data transmission. On one hand, if group membership information is not propagated rapidly, multicast routers will not make optimal decisions (i.e. they either forward datagrams across some networks unnecessarily or fail to send datagrams to all group members). On the other hand, a multicast routing scheme that communicates every membership change to every router is doomed because the resulting traffic can overwhelm an internet. Each design chooses a compromise between the two extremes.
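The TRPF decision just described combines the reverse-path check with the per-interface membership check. In the sketch below, the routing-table lookup and the membership lists are passed in as plain dictionaries (assumed inputs); in a real router they would come from the unicast routing table and from IGMP.

    def trpf_forward(datagram, arrival_iface, route_to_source, members_via):
        """route_to_source: source network prefix -> interface on the shortest path back
           members_via:     interface -> set of groups with members reachable that way"""
        if route_to_source.get(datagram["source_net"]) != arrival_iface:
            return []                                 # fails the reverse-path check: discard
        out = []
        for iface, groups in members_via.items():
            if iface == arrival_iface:
                continue                              # never send back the way it came
            if datagram["group"] in groups:           # truncation: forward only toward members
                out.append(iface)
        return out

    # Example: a datagram from net 10.1.0.0/16 to group 224.5.5.5 arriving on eth0
    # is forwarded only on interfaces that lead to members of that group.
    routes = {"10.1.0.0/16": "eth0"}
    members = {"eth1": {"224.5.5.5"}, "eth2": set()}
    print(trpf_forward({"source_net": "10.1.0.0/16", "group": "224.5.5.5"},
                       "eth0", routes, members))      # ['eth1']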
2.6.2 Tree Structure of TRPF
The set of all paths from a given source to all members of a multicast group can be described using graph theory concepts. The set of paths in a multicast group can be said to form a tree, called a forwarding tree or delivery tree. Each multicast router corresponds to a node in the tree, and a network that connects two routers corresponds to an edge in the tree. The source of a datagram is the root or root node of the tree. Finally, the last router along each of the paths from the source is called a leaf router. The terminology is sometimes applied to networks as well: researchers call a network hanging off a leaf router a leaf network. A multicast forwarding tree is defined as a set of paths through multicast routers from a source to all members of a multicast group. For a given multicast group, each possible source of datagrams can determine a different forwarding tree. One of the immediate consequences of this principle concerns the size of the tables used to forward multicast datagrams. Unlike conventional routing tables, each entry in a multicast table is identified by a pair:

(multicast group, source)

Conceptually, source identifies a single host that can send datagrams to the group (i.e., any host in the internet). In practice, keeping a separate entry for each host is unwise because the forwarding trees defined by all hosts on a single network are identical. Thus, to save space, routing protocols use a network prefix as a source. That is, each router defines one forwarding entry that is used for all hosts on the same physical network. Aggregating entries by network prefix instead of by host address reduces the table size dramatically. However, a multicast routing table can grow much larger than a conventional routing table. Unlike a conventional table, in which the size is proportional to the number of networks in the internet, a multicast table has size proportional to the product of the number of networks in the internet and the number of multicast groups.
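As a rough illustration of this table organisation, the sketch below keys forwarding entries on a (group, source network prefix) pair rather than on individual source hosts. The class name, prefix length and example addresses are assumptions made for the example, not part of any protocol specification.

    # Sketch of a multicast forwarding table keyed by (group, source prefix).
    # Aggregating sources by network prefix keeps one entry per source network
    # instead of one entry per sending host.
    import ipaddress

    class MulticastTable:
        def __init__(self):
            self.entries = {}      # (group, source network) -> set of interfaces

        def add(self, group, source_prefix, interfaces):
            key = (group, ipaddress.ip_network(source_prefix))
            self.entries[key] = set(interfaces)

        def lookup(self, group, source_host):
            src = ipaddress.ip_address(source_host)
            for (g, net), ifaces in self.entries.items():
                if g == group and src in net:
                    return ifaces
            return set()

    table = MulticastTable()
    # One entry covers every host on 10.1.0.0/24 that sends to 224.1.1.1.
    table.add("224.1.1.1", "10.1.0.0/24", ["eth1", "eth2"])
    print(table.lookup("224.1.1.1", "10.1.0.77"))   # {'eth1', 'eth2'}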

2.6.3 Limitations of TRPF

Although TRPF guarantees that each member of a multicast group receives a copy of each datagram sent to the group, it has two surprising consequences. First, because it relies on RPF to prevent loops, TRPF delivers an extra copy of datagrams to some networks just like conventional RPF. Figure 2.14 illustrates how duplicates arise.
Figure 2.14. Topology with RPF scheme (networks 1 to 4 interconnected by routers R1 to R4)

In the figure, when host A sends a datagram, routers R1 and R2 each receive a copy. Because the datagram arrives over the interface that lies along the shortest path to A, R1 forwards a copy to network 2, and R2 forwards a copy to network 3. When it receives a copy from network 2 (the shortest path to A), R3 forwards the copy to network 4. Thus, although RPF allows R3 and R4 to prevent a loop by discarding the copy that arrives over network 4, host B receives two copies of the datagram. A second surprising consequence arises because TRPF uses both source and destination addresses when forwarding datagrams: delivery depends on a datagram's source. For example, figure 2.15 shows how multicast routers forward datagrams from two different sources across a fixed topology.
Figure 2.15a. Sample multicast scenario (datagrams sent by host X across networks 1 to 6 connected by routers R1 to R6)

Figure 2.15b. Sample multicast scenario (datagrams sent by host Z across the same topology)

As the figure shows, the source affects both the path a datagram follows to reach a given network and the delivery details. For example, in part (a) of the figure, a transmission by host X causes TRPF to deliver two copies of the datagram to network 5. In part (b), only one copy of a transmission by host Z reaches network 5, but two copies reach networks 2 and 4.

2.6.4 Reverse Path Multicasting

Reverse Path Multicasting (RPM) extends TRPF to make it more dynamic. Three assumptions underlie the design. First, it is more important to ensure that a multicast
datagram reaches each member of the group to which it is sent than to eliminate unnecessary transmissions. Second, multicast routers each contain a conventional routing table that has correct information. Third, multicast routing should improve efficiency when possible (i.e. eliminate needless transmissions).

RPM uses a two-step process. When it begins, RPM uses the RPF broadcast scheme to send a copy of each datagram across all networks in the internet. Doing so ensures that all group members receive the copy. Simultaneously, RPM proceeds to have multicast routers inform one another about paths that do not lead to group members. Once it learns that no group members lie along a given path, a router stops forwarding along that path.

How do routers learn about the location of group members? As in most multicast routing schemes, RPM propagates membership information bottom up. The information starts with hosts that choose to join or leave groups. Hosts communicate membership information with their local router by using IGMP. Thus, although a multicast router does not know about distant group members, it knows about local members. As a consequence, routers attached to leaf networks can decide whether to forward over the leaf network: if a leaf network contains no members for the given group, the router connecting that network to the rest of the internet does not forward on the network. In addition to taking local action, the leaf router informs the next router along the path back to the source. Once it learns that no group members lie beyond a given router, the next router stops forwarding datagrams for the group across that network. When a router finds that no group members lie beyond it, the router informs the next router along the path back toward the source. Using graph theoretic terminology, we say that when a router learns that a group has no members along a path and stops forwarding, it has pruned the path from the forwarding tree. In fact, RPM is called a broadcast and prune strategy, because a router broadcasts until it receives information that allows it to prune a path. Researchers also use another term for the RPM algorithm: they say that the system is data driven, because a router does not send group membership information to any other routers until datagrams arrive for the group.

In the data driven model, a router must also handle the case where a host decides to join a particular group after the router has pruned the path for that group. RPM handles joins bottom-up: when a host informs a local router that it has joined a group, the router consults its record of the group and obtains the address of the router to which it had previously sent a prune request. The router sends a new message that undoes the effect of the previous prune and causes datagrams to flow again. Such a message is known as a graft request, and the algorithm is said to graft the previously pruned branch back onto the tree.
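The broadcast-and-prune behaviour just described can be pictured with a small state sketch. The class, message handlers and interface names below are illustrative assumptions, not the actual DVMRP/RPM packet formats or data structures.

    # Illustrative sketch of RPM prune and graft handling on one router.
    # pruned records, per (group, source) pair, the interfaces behind which a
    # downstream neighbour has reported that no group members lie.

    class RpmRouter:
        def __init__(self, interfaces):
            self.interfaces = set(interfaces)
            self.pruned = {}          # (group, source) -> set of pruned interfaces

        def on_prune(self, group, source, iface):
            # A downstream router reports no members beyond iface for this pair.
            self.pruned.setdefault((group, source), set()).add(iface)

        def on_graft(self, group, source, iface):
            # A host beyond iface joined again: undo the earlier prune.
            self.pruned.get((group, source), set()).discard(iface)

        def outgoing(self, group, source, arrival_iface):
            # Data driven: broadcast on all interfaces except pruned ones and
            # the interface the datagram arrived on.
            blocked = self.pruned.get((group, source), set()) | {arrival_iface}
            return self.interfaces - blocked

    r = RpmRouter(["eth0", "eth1", "eth2"])
    r.on_prune("224.1.1.1", "10.1.0.0", "eth2")
    print(sorted(r.outgoing("224.1.1.1", "10.1.0.0", "eth0")))   # ['eth1']
    r.on_graft("224.1.1.1", "10.1.0.0", "eth2")
    print(sorted(r.outgoing("224.1.1.1", "10.1.0.0", "eth0")))   # ['eth1', 'eth2']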
Have you understood?
1. Should all the members of a multicast group be on the same physical network? Justify your answer.
2. What is the basic principle of Reverse Path Forwarding?
3. How does TRPF overcome the limitations of RPF?
4. What do nodes and edges in a multicast tree represent?
5. What is meant by pruning in multicast forwarding?
6. What extensions does RPM add when compared to TRPF?

2.7 MULTICASTING OVER THE INTERNET

When the hosts that are part of a multicast group are attached to different networks/subnetworks geographically distributed around the Internet, intermediate subnet routers and/or interior/exterior gateways may be involved. Since an IP multicast address (class D address) has no structure and hence no net id associated with it, a different type of routing from that used to route unicast packets must be used. The sequence of steps followed to route a packet with a multicast address is as follows:
- A router that can route packets containing a (destination) IP multicast address is known as a multicast router (mrouter).
- Normally, in the case of a network that comprises multiple subnets interconnected by subnet routers, a single subnet router also acts as the mrouter for the network.
- Each mrouter learns the set of multicast group addresses of which the hosts attached to the networks it serves are currently members.
- The information gathered by each mrouter is passed on to all the other mrouters, so that each knows the complete list of group addresses that every mrouter has an interest in.
- On receipt of a packet with a destination IP multicast address, each mrouter uses an appropriate routing algorithm to pass the packet only to those mrouters that are attached to a network which has an attached host that is a member of the multicast group indicated in the destination IP address field.

As with broadcast routing, two different algorithms are used, the choice being determined by the routing algorithm that is used to route unicast packets. The aim of both algorithms is to minimize the amount of bandwidth required to deliver each multicast packet to those multicast routers that have an interest in that packet. The two algorithms we are going to discuss are the Distance Vector Multicast Routing Protocol and Multicast Open Shortest Path First; the former is a distance vector algorithm and the latter is a link state algorithm.
2.7.1 Distance Vector Multicast Routing
The Distance Vector Multicast Routing Protocol (DVMRP) is an interior gateway protocol, suitable for use within an autonomous system but not between different autonomous systems. DVMRP is not currently developed for use in routing non-multicast datagrams, so a router that routes both multicast and unicast datagrams must run two separate routing processes. DVMRP is designed to be easily extensible and could be extended to route unicast datagrams. DVMRP differs from the Routing Information Protocol (RIP) in one very important way: RIP thinks in terms of routing and forwarding datagrams to a particular destination, while the purpose of DVMRP is to keep track of the return paths to the source of multicast datagrams.

When distance vector routing is used, an additional set of routing tables (in addition to those used to route unicast packets), based on multicast-router-to-multicast-router (MR-to-MR) distances, is derived. These tables are based on a routing metric of hop count and are derived using the procedure used in the ordinary distance vector routing algorithm. A Multicast Address Table (MAT) maintained in each multicast router holds the multicast addresses in which that router is interested. IP first consults its MAT and finds the list of multicast routers to which a copy of the packet should be sent. It then consults its routing table and finds the shortest path to those multicast routers. If a set of multicast routers lies on the path to some other multicast router, it is not necessary to create separate copies of the packet.

2.7.1.1 DVMRP in UNIX

Mrouted is a well known program that implements DVMRP for UNIX systems. mrouted cooperates closely with the operating system kernel to install multicast routing information. However, mrouted can be used only with a special version of UNIX known as a multicast kernel. A UNIX multicast kernel contains a special multicast routing table as well as the code needed to forward multicast datagrams. Just like ordinary routing protocols, mrouted also has to perform two different but related functions, namely routing and forwarding. mrouted uses DVMRP to propagate multicast routing information from one router to another. A computer running mrouted interprets multicast routing information and constructs a multicast routing table. Each entry in the table specifies a (group, source) pair and a corresponding set of interfaces over which to forward datagrams that match the entry. However, mrouted also needs a base routing protocol. Since not all routers on the internet support multicasting, mrouted arranges a tunnel to carry multicast datagrams from one router to another through intermediate routers that do not participate in multicast routing.


Although a single mrouted program can perform both the tasks, a given computer may not need both functions. To allow a manager to specify exactly how it should operate, mrouted uses a configuration file. The configuration file contains entries that specify which multicast groups mrouted is permitted to advertise on each interface, and how it should forward datagrams. Furthermore, the configuration file associates a metric and threshold with each route. The metric allows a manager to assign a cost to each path. The threshold gives the minimum IP TTL that a datagram needs to complete the path. If a datagram does not have a sufficient TTL to reach its destination, a multicast kernel does not forward the datagram. Instead, it discards the datagram, which avoids wasting bandwidth. Multicast tunneling is perhaps the most interesting capability of mrouted. A tunnel is needed when two or more hosts wish to participate in multicast applications, and one or more routers along the path between the participating hosts do not run multicast routing software. Figure 2.16 illustrates the concept.

Figure 2.16 Multicast Tunneling

To allow hosts on networks 1 and 2 to exchange multicast datagrams, the managers of the two routers configure an mrouted tunnel. The tunnel merely consists of an agreement between the mrouted programs running on the two routers to exchange datagrams. Each router listens on its local net for datagrams sent to the specified multicast destination for which the tunnel has been configured. When a multicast datagram arrives that has a destination address equal to one of the configured tunnels, mrouted encapsulates the datagram in a conventional unicast datagram and sends it across the internet to the other router. When it receives a unicast datagram through one of its tunnels, mrouted extracts the multicast datagram, and then forwards according to its multicast routing table. The encapsulation technique that mrouted uses to tunnel datagrams is known as IP-in-IP; figure 2.17 illustrates the encapsulation.


Figure 2.17. IP-in-IP encapsulation in mrouted (the multicast datagram, header and data area, is carried as the data area of a conventional unicast datagram)

IP-in-IP encapsulation preserves the original multicast datagram, including the header, by placing it in the data area of a conventional unicast datagram. On the receiving machine, the multicast kernel extracts and processes the multicast datagram as if it arrived over a local interface. In particular, once it extracts the multicast datagram, the receiving machine must decrement the TTL field in the header by one before forwarding. Thus, when it creates a tunnel, mrouted treats the internet connecting two multicast routers like a single, physical network.

2.7.1.2 Multicast Backbone (MBONE)

Multicast tunnels form the basis of the Internet's Multicast Backbone (MBONE). Many Internet sites participate in the MBONE; the MBONE allows hosts at participating sites to send and receive multicast datagrams, which are then propagated to all other participating sites. The MBONE is often used to propagate audio and video. To participate in the MBONE, a site must have at least one multicast router connected to at least one local network. Another site must agree to tunnel traffic, and a tunnel is configured between routers at the two sites. When a host at the site sends a multicast datagram, a multicast router at the site forwards the datagram over the tunnel using IP-in-IP. When it receives a multicast datagram over a tunnel, a multicast router removes the outer encapsulation, and then forwards the datagram according to the local multicast routing table.

The easiest way to understand the MBONE is to think of it as a virtual network built on top of the Internet. Conceptually, the MBONE consists of mrouters that are interconnected by a set of point-to-point networks. Some of the conceptual point-to-point connections coincide with physical networks; others are achieved by tunneling. The details are hidden from the multicast routing software. Thus, when mrouted computes a multicast forwarding tree for a given (group, source), it thinks of a tunnel as a single link connecting two routers. Tunneling has two consequences. First, because some tunnels are much more expensive than others, they cannot all be treated equally. Mrouted handles the problem by allowing a manager to assign a cost to each tunnel, and uses the costs when choosing routes. Typically, a manager assigns a cost that reflects the number of hops in the underlying internet. It is also possible to assign costs that reflect administrative boundaries. Second, because DVMRP forwarding depends
on knowing the shortest path to each source, and because multicast tunnels are completely unknown to conventional routing protocols, DVMRP must compute its own version of unicast forwarding that includes the tunnels.

2.7.2 Multicast Open Shortest Path First

Multicast Open Shortest Path First (MOSPF) provides the ability to forward multicast datagrams from one IP network to another (i.e., through internet routers). MOSPF forwards a multicast datagram on the basis of both the datagram's source and destination (this is sometimes called source/destination routing). The OSPF link state database provides a complete description of the Autonomous System's topology. By adding a new type of link state advertisement, the group-membership-LSA, the location of all multicast group members is pinpointed in the database. The path of a multicast datagram can then be calculated by building a shortest-path tree rooted at the datagram's source. All branches not containing multicast members are pruned from the tree. These pruned shortest-path trees are initially built when the first datagram is received (i.e., on demand). The results of the shortest path calculation are then cached for use by subsequent datagrams having the same source and destination.

OSPF allows an Autonomous System to be split into areas. However, when this is done, complete knowledge of the Autonomous System's topology is lost. When forwarding multicasts between areas, only incomplete shortest-path trees can be built. This may lead to some inefficiency in routing. An analogous situation exists when the source of the multicast datagram lies in another Autonomous System. In both cases (i.e., the source of the datagram belongs to a different OSPF area, or to a different Autonomous System) the neighborhood immediately surrounding the source is unknown. In these cases the source's neighborhood is approximated by OSPF summary link advertisements or by OSPF AS external link advertisements, respectively.

Routers running MOSPF can be intermixed with non-multicast OSPF routers. Both types of routers can interoperate when forwarding regular (unicast) IP data traffic. Obviously, the forwarding extent of IP multicasts is limited by the number of MOSPF routers present in the Autonomous System (and their interconnection, if any). An ability to tunnel multicast datagrams through non-multicast routers is not provided. In MOSPF, just as in the base OSPF protocol, datagrams (multicast or unicast) are routed as is; they are not further encapsulated or decapsulated as they transit the Autonomous System.
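MOSPF's on-demand calculation can be pictured as a cache of pruned shortest-path trees keyed by (source, group). The sketch below only illustrates that caching idea; the tree computation itself is stubbed out, and every name in it is an assumption for the example rather than MOSPF's actual data structures.

    # Sketch of MOSPF-style on-demand forwarding state: the pruned shortest-path
    # tree for a (source, group) pair is built when the first datagram arrives
    # and cached for later datagrams with the same source and group.

    class MospfCache:
        def __init__(self, build_tree):
            self.build_tree = build_tree    # function that runs the SPF calculation
            self.cache = {}                 # (source, group) -> pruned tree

        def tree_for(self, source, group):
            key = (source, group)
            if key not in self.cache:       # first datagram for this pair: compute
                self.cache[key] = self.build_tree(source, group)
            return self.cache[key]          # later datagrams reuse the cached tree

    # Stub SPF calculation standing in for the real link-state computation.
    calls = []
    def fake_spf(source, group):
        calls.append((source, group))
        return {"root": source, "group": group, "branches": ["R2", "R4"]}

    cache = MospfCache(fake_spf)
    cache.tree_for("10.1.0.5", "224.1.1.1")
    cache.tree_for("10.1.0.5", "224.1.1.1")
    print(len(calls))                       # 1: the tree was computed only once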
2.7.3 Protocol Independent Multicast
Protocol Independent Multicast (PIM) was developed in response to the scaling problems of existing multicast routing protocols like DVMRP and MOSPF. The major problem with the existing protocols is that they do not scale well in environments where
a relatively small proportion of the routers want to receive traffic for a certain group. For example, broadcasting traffic to all routers until they are explicitly asked to be removed from the distribution is not a good design choice if most routers do not want to receive the traffic in the first place. This situation is sufficiently common that PIM divides the problem space into sparse mode and dense mode.

PIM Sparse Mode (PIM-SM) is used when there is only a slight possibility that each router is involved in multicasting (sparse mode). In this environment, the use of a protocol that broadcasts the packet is not justified; a protocol such as the Core-Based Tree (CBT) protocol is more appropriate. [The Core-Based Tree (CBT) protocol is a group-shared protocol that uses a core as the root of the tree. The autonomous system is divided into regions and a core (center router or rendezvous router) is chosen for each region.] PIM-SM is a group-shared tree routing protocol that has a Rendezvous Point (RP) as the source of the tree. Its operation is like CBT; however, it is simpler because it does not require an acknowledgement for a join message. In addition, it creates a backup set of RPs for each region to cover RP failures. One of the characteristics of PIM-SM is that it can switch from a group-shared tree strategy to a source-based tree strategy when necessary. This can happen if there is a dense area of activity far from the RP. That area can be more efficiently handled with a source-based tree strategy instead of a group-shared tree strategy.

PIM Dense Mode (PIM-DM) is used when there is a good possibility that each router is involved in multicasting (dense mode). In this environment, the use of a protocol that broadcasts the packet is justified because almost all routers are involved in the process. PIM-DM is a source-based tree routing protocol that uses RPF and pruning/grafting strategies for multicasting. Its operation is like DVMRP; however, unlike DVMRP, it does not depend on a specific unicasting protocol. It assumes that the autonomous system is using a unicast protocol and that each router has a table that can find the outgoing interface that has an optimal path to a destination. This unicast protocol can be a distance vector protocol (RIP) or a link state protocol (OSPF).

Have you understood?
1. Is class D addressing a hierarchical address or a flat address? Justify your answer.
2. What are the differences between an ordinary router and a mrouter?
3. DVMRP is an extension of RIP. Justify this statement.
4. What type of encapsulation is used in the mrouted program of UNIX?
5. How is tunneling performed in MBONE networks?

2.8 MOBILE IP


Mobile communication has received a lot of attention in the last decade. The interest in mobile communication on the Internet means that the IP protocol, originally
designed for stationary devices, must be enhanced to allow the use of mobile computers. The main problem that must be solved in providing mobile communication using the IP protocol is addressing. The original IP addressing was based on the assumption that a host is stationary, attached to one specific network. Hence the address is valid only when the host is attached to that network. If the network changes, the address is no longer valid. Routers use the net id portion to route a packet to the particular network and, once a packet reaches the destination network, the host id is used to deliver the packet to the destination host. This scheme works fine with stationary hosts. When a host moves from one network to another, the IP addressing structure needs to be modified. Several solutions have been proposed.

2.8.1 Characteristics of Mobile IP Support

According to the IETF, mobile IP should have the following characteristics.

Transparency
Mobility is transparent to applications and transport layer protocols as well as to routers not involved in the change. In particular, as long as they remain idle, all open TCP connections survive a change in network and are ready for further use.

Interoperability with IPv4
A host using mobile IP can interoperate with stationary hosts that run conventional IPv4 software as well as with other mobile hosts. Furthermore, no special addressing is required: the addresses assigned to mobile hosts do not differ from addresses assigned to fixed hosts.

Scalability
The solution scales to large internets. In particular, it permits mobility across the global internet.

Security
Mobile IP provides security facilities that can be used to ensure all messages are authenticated (i.e., to prevent an arbitrary computer from impersonating a mobile host).

Macro mobility
Rather than attempting to handle rapid network transitions such as one encounters in a wireless cellular system, mobile IP focuses on the problem of long-duration moves. For example, mobile IP works well for a user who takes a portable computer on a business trip and leaves it attached to the new location for a week.

2.8.2 Architectural Entities and Terminology
Three major architectural entities are involved in mobile IP: the mobile node, the home agent and the foreign agent. They are described as follows.

A mobile node is a host or router that changes its point of attachment from one network or subnetwork to another. A mobile node may change its location without changing its IP address; it may continue to communicate with other Internet nodes at any location using its (constant) IP address, assuming link-layer connectivity to a point of attachment is available.

A home agent is a router on a mobile node's home network which tunnels datagrams for delivery to the mobile node when it is away from home, and maintains current location information for the mobile node.

A foreign agent is a router on a mobile node's visited network which provides routing services to the mobile node while it is registered. The foreign agent detunnels and delivers datagrams to the mobile node that were tunneled by the mobile node's home agent. For datagrams sent by a mobile node, the foreign agent may serve as a default router for registered mobile nodes.

A mobile node is given a long-term IP address on its home network. This home address is administered in the same way as a permanent IP address provided to a stationary host. When away from its home network, a care-of address is associated with the mobile node; it reflects the mobile node's current point of attachment. The mobile node uses its home address as the source address of all IP datagrams that it sends, except for datagrams sent for certain mobility management functions. A typical communication scenario using mobile IP is shown in figure 2.18a.

Figure 2.18a. Mobile IP scenario (mobile node A and its home agent on A's home network, a foreign agent on a foreign network, and server X, interconnected through the Internet)


To understand the architecture and functioning, it is also necessary to become familiar with the following terms.

Agent Advertisement
An advertisement message constructed by attaching a special extension to a router advertisement message.

Care-of Address
The termination point of a tunnel toward a mobile node, for datagrams forwarded to the mobile node while it is away from home. The protocol can use two different types of care-of address: a foreign agent care-of address is an address of a foreign agent with which the mobile node is registered, and a co-located care-of address is an externally obtained local address which the mobile node has associated with one of its own network interfaces.

Correspondent Node
A peer with which a mobile node is communicating. A correspondent node may be either mobile or stationary.

Foreign Network
Any network other than the mobile node's home network.

Home Address
An IP address that is assigned for an extended period of time to a mobile node. It remains unchanged regardless of where the node is attached to the Internet.

Home Network
A network, possibly virtual, having a network prefix matching that of a mobile node's home address. Note that standard IP routing mechanisms will deliver datagrams destined to a mobile node's home address to the mobile node's home network.

Link
A facility or medium over which nodes can communicate at the link layer. A link underlies the network layer.

Link-Layer Address
The address used to identify an endpoint of some communication over a physical link. Typically, the link-layer address is an interface's Media Access Control (MAC) address.


Mobility Agent
Either a home agent or a foreign agent.

Agent Discovery
Home agents and foreign agents may advertise their availability on each link for which they provide service. A newly arrived mobile node can send a solicitation on the link to learn if any prospective agents are present.

Registration
When the mobile node is away from home, it registers its care-of address with its home agent. Depending on its method of attachment, the mobile node will register either directly with its home agent, or through a foreign agent which forwards the registration to the home agent.

2.8.3 Operation of Mobile IP

Mobility agents (i.e., foreign agents and home agents) advertise their presence via Agent Advertisement messages. A mobile node may optionally solicit an Agent Advertisement message from any locally attached mobility agents through an Agent Solicitation message. A mobile node receives these Agent Advertisements and determines whether it is on its home network or a foreign network. When the mobile node detects that it is located on its home network, it operates without mobility services. If returning to its home network from being registered elsewhere, the mobile node deregisters with its home agent, through exchange of a Registration Request and Registration Reply message with it.

When a mobile node detects that it has moved to a foreign network, it obtains a care-of address on the foreign network. The care-of address can either be determined from a foreign agent's advertisements (a foreign agent care-of address), or by some external assignment mechanism such as DHCP (a co-located care-of address). The mobile node operating away from home then registers its new care-of address with its home agent through exchange of a Registration Request and Registration Reply message with it, possibly via a foreign agent.

Datagrams sent to the mobile node's home address are intercepted by its home agent, tunneled by the home agent to the mobile node's care-of address, received at the tunnel endpoint (either at a foreign agent or at the mobile node itself), and finally delivered to the mobile node. In the reverse direction, datagrams sent by the mobile node are generally delivered to their destination using standard IP routing mechanisms, not necessarily passing through the home agent.

When away from home, mobile IP uses protocol tunneling to hide a mobile node's home address from intervening routers between its home network and its current location. The tunnel terminates at the mobile node's care-of address. The care-of address
must be an address to which datagrams can be delivered via conventional IP routing. At the care-of address, the original datagram is removed from the tunnel and delivered to the mobile node. A foreign agent care-of address is a care-of address provided by a foreign agent through its Agent Advertisement messages. In this case, the care-of address is an IP address of the foreign agent. In this mode, the foreign agent is the endpoint of the tunnel and, upon receiving tunneled datagrams, decapsulates them and delivers the inner datagram to the mobile node. This mode of acquisition is preferred because it allows many mobile nodes to share the same care-of address and therefore does not place unnecessary demands on the already limited IPv4 address space. A co-located care-of address is a care-of address acquired by the mobile node as a local IP address through some external means, which the mobile node then associates with one of its own network interfaces. The address may be dynamically acquired as a temporary address by the mobile node such as through DHCP, or may be owned by the mobile node as a long-term address for its use only while visiting some foreign network. When using a co-located care-of address, the mobile node serves as the endpoint of the tunnel and itself performs decapsulation of the datagrams tunneled to it. The mode of using a co-located care-of address has the advantage that it allows a mobile node to function without a foreign agent, for example, in networks that have not yet deployed a foreign agent. The operation of mobile IP is illustrated with the help of figure 2.18b.

Figure 2.18b. Registration Request/Reply (the mobile host sends a registration request to the foreign agent, which relays it to the home agent; the registration reply is relayed back through the foreign agent to the mobile host)

2.8.4 Sample Scenario

Let us take an example of IP datagrams being exchanged over a TCP connection between the mobile node A and another host, server X, as shown in figure 2.18a. The sequence of events is shown below.
- The home address of A is advertised and known to X.
- X wants to transmit a datagram to A.
- X does not know whether A is in the home network or a foreign network.
- X sends the packet to A with A's home address as the destination IP address.
- The home agent at A's home network intercepts the packet.
- The home agent discovers that A is in a foreign network.
- The home agent encapsulates the entire packet within a new datagram (IP within IP) with A's COA as the destination address.
- The foreign agent at A's foreign network intercepts the packet, strips off the outer IP header and delivers the original datagram to A.
- A intends to respond and sends traffic to X.
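The home agent's part of this scenario can be sketched in a few lines. The binding dictionary, addresses and function names below are assumptions made for the example, not a real implementation.

    # Sketch of the home agent's choice for a datagram addressed to A's home
    # address: deliver locally if A is at home, otherwise tunnel (IP within IP)
    # to the registered care-of address.

    bindings = {}                        # home address -> care-of address

    def home_agent_forward(datagram):
        coa = bindings.get(datagram["dst"])
        if coa is None:
            return ("deliver_on_home_network", datagram)
        outer = {"src": "home-agent", "dst": coa, "proto": 4, "payload": datagram}
        return ("tunnel_to_foreign_agent", outer)

    packet_from_x = {"src": "X", "dst": "A-home-address", "data": b"request"}
    print(home_agent_forward(packet_from_x)[0])      # deliver_on_home_network
    bindings["A-home-address"] = "care-of-address"   # A registers from the foreign net
    print(home_agent_forward(packet_from_x)[0])      # tunnel_to_foreign_agent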


In this example, X is not mobile. Therefore X has a fixed IP address. For routing A's IP datagrams to X, each datagram is sent to some router in the foreign network. Typically, this router is the foreign agent. A uses X's static IP address as the destination address in the IP header. The IP datagram from A to X travels directly across the network, using X's IP address as the destination address.

2.8.5 Discovery and Registration

The mobile IP discovery procedure has been built on top of the existing ICMP router discovery and advertisement procedure specified in RFC 1256. Using these procedures, a router can detect whether a new mobile node has entered its network. Also, using this procedure, the mobile node determines whether it is in a foreign network. For the purpose of discovery, a router or an agent periodically issues a router advertisement ICMP message. The mobile node, on receiving this advertisement packet, compares the network portion of the router's IP address with the network portion of its own IP address allocated by the home network. If these network portions do not match, then the mobile node knows that it is in a foreign network. A router advertisement can carry information about default routers and information about one or more care-of-addresses. If a mobile node needs a care-of-address without waiting for agent advertisements, the mobile node can broadcast a solicitation that will be answered by any foreign agent.

Once a mobile node obtains a care-of-address from the foreign network, the same needs to be registered with the home agent. The mobile node sends a registration request to the home agent with the care-of-address information. When the home agent receives this request, it updates its routing table and sends a registration reply back to the mobile node. As a part of the registration, the mobile host needs to be authenticated. Using a 128-bit secret key and the MD5 hashing algorithm, a digital signature is generated. Each mobile node and home agent share a common secret. This secret
makes the digital signature unique and allows the agent to authenticate the mobile node. At the end of the registration, a triplet containing the home address, care-of-address and registration lifetime is maintained in the home agent. This is called a binding for the mobile node. The home agent maintains this association until the registration lifetime expires. The registration process involves the following four steps.
1. The mobile node requests forwarding service from the foreign network by sending a registration request to the foreign agent.
2. The foreign agent relays this registration request to the home agent of that mobile node.
3. The home agent either accepts or rejects the request and sends the registration reply to the foreign agent.
4. The foreign agent relays this reply to the mobile node.
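The registration exchange and the resulting binding can be sketched as follows. The authenticator shown is a simplified keyed-MD5 hash over the registration fields, intended only to illustrate the idea of the shared-secret signature described above; it is not the exact algorithm or message format of the mobile IP RFCs, and all names and values are assumptions.

    # Sketch of mobile IP registration at the home agent: verify a keyed-MD5
    # authenticator computed with the shared secret, then record the binding
    # (home address, care-of-address, lifetime). Simplified for illustration.
    import hashlib, time

    SECRET = b"128-bit-shared-secret"    # shared between mobile node and home agent

    def authenticator(home_addr, coa, lifetime, secret=SECRET):
        data = f"{home_addr}|{coa}|{lifetime}".encode()
        return hashlib.md5(secret + data + secret).hexdigest()

    bindings = {}                        # home address -> (care-of-address, expiry time)

    def register(home_addr, coa, lifetime, auth):
        if auth != authenticator(home_addr, coa, lifetime):
            return "rejected"            # signature does not match: not authenticated
        bindings[home_addr] = (coa, time.time() + lifetime)
        return "accepted"

    def current_coa(home_addr):
        entry = bindings.get(home_addr)
        if entry and entry[1] > time.time():
            return entry[0]
        return None                      # no binding, or registration lifetime expired

    req_auth = authenticator("A-home-address", "care-of-address", 600)
    print(register("A-home-address", "care-of-address", 600, req_auth))  # accepted
    print(current_coa("A-home-address"))                                 # care-of-address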

We have assumed that the foreign agent will allocate the care-of-address. However, it is possible that a mobile node moves to a network that has no foreign agents, or on which all foreign agents are busy. As an alternative, therefore, the mobile node can act as its own foreign agent by using a co-located care-of-address. A co-located care-of-address is an IP address obtained by the mobile node that is associated with the foreign network. If the mobile node is using a co-located care-of-address, then the registration happens directly with its home agent.

2.8.6 Tunneling

Once the home agent finds that the mobile node has moved away from the home network, the home agent looks up the mobile user's new (temporary) location and finds the address of the foreign agent handling the mobile user. Now, the home agent encapsulates the packet in the payload field of an outer packet and sends the latter to the foreign agent handling the mobile user. This mechanism is called tunneling. The home agent also tells the sender to send the packets hereafter to the mobile host by encapsulating them in the payload of packets explicitly addressed to the foreign agent. Hereafter the packets are sent to the user via the foreign agent and the home agent of the user is bypassed. Figure 2.19 shows the tunneling operation in mobile IP. In mobile IP, the IP-within-IP encapsulation mechanism is used. Using IP-within-IP, the home agent adds a new IP header, called the tunnel header. The new tunnel header uses the mobile node's care-of-address as the tunnel destination IP address. The tunnel source IP address is the home agent's IP address. The tunnel header uses 4 as the protocol number (figure 2.20), indicating that the next protocol header is again an IP header. In IP-within-IP, the entire original IP header is preserved as the first part of the payload of the tunnel header. The foreign agent, after receiving the packet, drops the tunnel header and delivers the
rest to the mobile node. When a mobile node is roaming in a foreign network, the home agent must be able to intercept all IP datagrams sent to the mobile node so that these datagrams can be forwarded via tunneling. The home agent, therefore, needs to inform other nodes in the home network that all IP datagrams with the destination address of the mobile node should be delivered to the home agent. In essence, the home agent steals the identity of the mobile node in order to capture packets destined for that node that are transmitted across the home network. For this purpose, ARP is used to notify all nodes in the home network.
(The figure shows the tunneling operation: the datagram addressed to the mobile host MH is encapsulated by the home agent in an outer datagram whose source is HA, whose destination is the care-of address and whose protocol field is 4 or 55; the foreign agent removes the outer header and delivers the original datagram to MH.)

Figure 2.19. Tunneling Operation

Let us take the example of figure 2.18. The original IP datagram from X to A has the IP address of X as the source address and the home IP address of A as the destination address. The datagram is routed through the Internet to A's home network, where it is intercepted by the home agent. The home agent encapsulates the incoming datagram within an outer IP header. This outer header has the IP address of the home agent as the source address and the care-of-address as the destination address. As the care-of-address has the network portion of the foreign network, the packet will find its way directly to the mobile host. When this new datagram reaches the host in the foreign network, it strips off the outer IP header to extract the original datagram. From this stripped-off packet it also finds out the original sender. This is necessary for the host to know who has sent the packet so that the response reaches the right destination.
(Figure 2.20 shows the two headers used in IP-within-IP: the outer tunnel header, whose source IP address is the home agent address and whose destination IP address is the care-of-address, followed by the original IP header, whose source IP address is the original sender and whose destination IP address is the home address.)

Figure 2.20. IP within IP
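To make the tunnel header of figure 2.20 concrete, the sketch below builds a minimal outer IPv4 header with Python's struct module: protocol number 4 (IP-in-IP), the home agent's address as the source and the care-of-address as the destination, with the original datagram carried unchanged as the payload. Header options and real checksum handling are omitted, and the addresses are made-up examples.

    # Sketch of the outer (tunnel) IPv4 header used for IP-within-IP in mobile IP.
    # Protocol 4 tells the receiver that another IP datagram follows the header.
    import socket
    import struct

    def outer_header(home_agent_ip, care_of_ip, inner_len, ttl=64):
        version_ihl = (4 << 4) | 5             # IPv4, 5 x 32-bit words, no options
        total_length = 20 + inner_len          # outer header plus encapsulated datagram
        return struct.pack("!BBHHHBBH4s4s",
                           version_ihl, 0,     # version/IHL, type of service
                           total_length,
                           0, 0,               # identification, flags/fragment offset
                           ttl, 4,             # TTL, protocol = 4 (IP-in-IP)
                           0,                  # header checksum left as 0 in this sketch
                           socket.inet_aton(home_agent_ip),
                           socket.inet_aton(care_of_ip))

    inner_datagram = bytes(20)                 # stand-in for the original datagram (sender -> home address)
    tunnel_packet = outer_header("192.0.2.1", "198.51.100.7", len(inner_datagram)) + inner_datagram
    print(len(tunnel_packet))                  # 20-byte tunnel header + original datagram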

2.8.7 Cellular IP

Hosts connecting to the Internet via a wireless interface are likely to change their point of access frequently. A mechanism is required that ensures that packets addressed to moving hosts are successfully delivered with high probability. A change of access point during active data transmission or reception is called a handoff. During or immediately after a handoff, packet losses may occur due to delayed propagation of new location information. These losses should be minimized in order to avoid a degradation of service quality as handoffs become more frequent. Mobile IP is not appropriate for fast and seamless handoff control. Cellular IP is a protocol that provides mobility and handoff support for frequently moving hosts. It is intended to be used at a local level, for instance in a campus or metropolitan area network. Cellular IP can interwork with Mobile IP to support wide area mobility, that is, mobility between Cellular IP networks.

Cellular IP uses a two-tier addressing scheme to manage mobility and handoff. One address is for a fixed location which is known to all; the other is for a dynamic location which changes as the user moves. In the case of GSM, this is done through the Home Location Register (HLR) and the Visitor Location Register (VLR). The same is true in mobile IP, where the mobile host is associated with two IP addresses: a fixed home address that serves as the host identifier, and a care-of-address that reflects its current point of attachment. The mobile IP architecture comprises three functions:
1. A database that contains the most up-to-date mapping between the two address spaces (home address to care-of-address).
2. The translation of the host identifier to the actual destination address.
3. Agents that ensure the source and destination addresses of arriving and outgoing packets are updated appropriately, so that packets are routed properly.

Whenever the mobile host moves to a new network managed by a different foreign agent, the dynamic care-of-address will change. This changed care-of-address needs to be communicated to the home agent. This process works for slowly moving hosts. For a high-speed mobile host, the rate of update of the addresses needs to match the rate of change of the addresses. Otherwise, packets will be forwarded to the wrong (old) address. Mobile IP fails to update the addresses quickly enough for high-speed mobility. Cellular IP (figure 2.21), a newer host mobility protocol, has been designed to address this issue. In Cellular IP, none of the nodes knows the exact location of a mobile host. Packets addressed to a mobile host are routed to its current base station on a hop-by-hop basis, where each node only needs to know on which of its outgoing ports to forward packets.

Figure 2.21. Relationship between Mobile IP and Cellular IP (Mobile IP provides coarse-grained global mobility across the Internet, while Cellular IP handles fine-grained local handoffs inside a wireless access network connected through a gateway)

This limited routing information (referred to as a mapping) is local to the node and does not assume the nodes have any knowledge of the wireless network topology. Mappings are created and updated based on the packets transmitted by mobile hosts. Cellular IP uses two parallel mapping structures: paging caches (PCs) and routing caches (RCs). PCs maintain mappings for stationary and idle (not in data communication state) hosts, whereas RCs maintain mappings for active hosts. Mapping entries in PCs have a large timeout interval, of the order of seconds or minutes. RCs maintain mappings of mobile hosts currently receiving data or expecting to receive
data. For RC mappings, the timeouts are on the packet time scale. Figure 2.22 illustrates the relationship between PCs and RCs. While idle at location 1, the mobile host X keeps the PCs up to date by transmitting dummy packets at low frequency (step 1 in figure 2.22). Let us assume that the host is mobile and has moved to location 2 without transferring any data. The PC mapping for X now points to location 2. While the host is at location 2, if there are data packets to be routed to it, the PC mappings are used to find the host (step 2). As there is now data transmission, the mapping database to be used will be the RC. As long as data packets keep arriving, the host maintains the RC mappings, either through its outgoing data packets or through the transmission of dummy packets (step 3).

Figure 2.22. Cellular IP paging and Routing
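The paging-cache / routing-cache split can be pictured as two mapping tables with very different timeouts. The class, timeout values and port names below are assumptions made for illustration only.

    # Sketch of Cellular IP mappings: a paging cache (PC) for idle hosts with a
    # long timeout, and a routing cache (RC) for active hosts with a short,
    # packet-time-scale timeout. Purely illustrative values.
    import time

    class MappingCache:
        def __init__(self, timeout):
            self.timeout = timeout
            self.entries = {}             # host -> (downlink port, last refresh time)

        def refresh(self, host, port):    # updated by data or dummy packets
            self.entries[host] = (port, time.time())

        def lookup(self, host):
            entry = self.entries.get(host)
            if entry and time.time() - entry[1] < self.timeout:
                return entry[0]
            return None                   # stale or unknown: the host must be paged

    paging_cache = MappingCache(timeout=60.0)   # idle hosts: refreshed every few seconds
    routing_cache = MappingCache(timeout=0.5)   # active hosts: refreshed per packet

    paging_cache.refresh("host-X", "port-1")    # dummy paging-update packet while idle
    routing_cache.refresh("host-X", "port-1")   # outgoing data packet while active
    print(paging_cache.lookup("host-X"), routing_cache.lookup("host-X"))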

Idle mobile hosts periodically generate short control packets, called paging update packets. These are sent to the nearest available base station. Paging update packets travel in the access network from the base station toward the gateway router, on a hop-by-hop basis. Handoff in Cellular IP is always initiated by the mobile host. As the host approaches a new base station, it redirects its data packets from the old to the new base station. The first few redirected packets automatically configure a new path of RC mappings for the host to the new base station. For a time equal to the timeout of the RC mappings, packets addressed to the mobile host will be delivered at both the old and the new base stations.

Have you understood?
1. What are the limitations of IPv4 in the context of mobile communication?
2. What are the functions of the home agent and the foreign agent?
3. What is meant by Care-Of-Address (COA)?
4. What are the four steps involved in registering the COA with the home agent?


5. Why is IP-in-IP type of encapsulation required in mobile IP?
6. List down the major differences between mobile IP and cellular IP.


Have you understood?
1. MAC address is a physical address and IP address is a logical address. Justify this statement.
2. Which protocol is used to map an IP address into its equivalent MAC address?
3. How do diskless workstations obtain their IP addresses?
4. What are the limitations of static allocation schemes of IP addresses?
5. What are the two steps involved in the bootstrap procedure of BOOTP?
6. What are the limitations of BOOTP?
7. What are the various modes of IP address allotment supported by DHCP?
8. What is the advantage of the leasing scheme followed in DHCP?
9. How does a client discover DHCP servers?
10. What is meant by early lease termination in DHCP?

Summary
1. Multicasting is a mode of data communication where the datagrams from a source are to be delivered to all the members of a multicast group.
2. The key requirement of multicasting is to relieve the source from creating the required number of copies and transmitting them across the network. Instead, the source follows the usual transmission procedure with only one exception: the destination address is not a usual unicast address but a multicast address.
3. Class D IPv4 addresses are used for multicast. Class D IPv4 addresses cannot be assigned to individual hosts of the network. Class D addresses are not hierarchical addresses like class A, B and C addresses, where certain octets indicate the net id portion and the remaining octets indicate the host id portion.
4. The major challenge in devising multicast addresses is that the addressing scheme has to satisfy two conflicting goals: it has to allow local autonomy in assigning the addresses, and at the same time the assigned addresses should have global meaning.
5. Another challenge in multicasting is the requirement that the multicasting scheme should make effective use of hardware multicast when it is available and, at the same time, allow IP multicast delivery over networks which do not have hardware support for multicast.
6. Members of a multicast group may be in the same network or across different networks of an internet. It becomes necessary for the multicast routers to maintain these details to forward the datagrams to all members of the multicast group with minimum overhead and effective utilization of bandwidth.
7. Hosts expect a highly dynamic multicast service from the network, so that they can choose to join or leave a group at will, without synchronizing or negotiating with other members of the group. Also, a host may be willing to become a member of more than one group at a time.
8. A mrouter learns the multicast addresses associated with its own attached networks/subnetworks using a protocol by name Internet Group Management Protocol (IGMP). Like ICMP, IGMP is also an integral part of IP.
9. Unlike unicast routing, in which routes change only when the topology changes or equipment fails, multicast routes can change simply because an application program joins or leaves a multicast group.
10. A multicast datagram may originate on a computer that is not a part of the multicast group, and may be routed across networks that do not have any group members attached.
11. Mobile IP overcomes the limitations of IPv4 in the context of mobile communication by allowing a single computer to hold two addresses simultaneously. The first address is the computer's primary address; it is fixed and permanent. The second address, or secondary address, changes as the computer moves.
12. Mobile IP involves considerable overhead after each move, and hence mobile IP is intended for situations in which a host moves infrequently and remains at a given location for a relatively long period of time.
13. Every site that wants to allow its users to roam has to create a home agent. Every site that wants to allow visitors has to create a foreign agent. When a mobile host shows up at a foreign site, it contacts the foreign agent there and registers. The foreign agent then contacts the user's home agent and gives it a care-of-address, normally the foreign agent's own IP address.
14. Once a mobile node has obtained a care-of-address from the foreign network, the same needs to be registered with the home agent. The mobile node sends a registration request to the home agent with the care-of-address information. When the home agent receives this request, it updates its routing table and sends a registration reply back to the mobile node.
15. In mobile IP, the IP-within-IP encapsulation mechanism is used. Using IP-within-IP, the home agent adds a new IP header called the tunnel header. The new tunnel header uses the mobile node's care-of-address as the tunnel destination IP address. The tunnel source IP address is the home agent's IP address.
16. If a mobile node does not have a unique foreign address, a foreign agent must use the mobile's home address for communication. Instead of relying on ARP for address binding, the agent records the mobile's hardware address when a request arrives and uses the recorded information to supply the necessary binding.
17. When a host (source) wants to communicate with another host (destination) in a different network, it becomes necessary for the source to know the IP address of the destination. The network has to ensure that all hosts in the internet have unique IP addresses.
18. When the size of a network is small, it is possible to assign IP addresses manually to all the systems in the network. The human network administrator configures the IP addresses of all the systems in such a way that no two machines on the network have the same IP address. When the size of the network becomes very large, it is almost impossible to assign the addresses in this way.
19. Reverse Address Resolution Protocol (RARP) associates a known MAC address with an IP address. A network device, such as a diskless workstation, might know its MAC address but not its IP address. RARP allows the device to make a request to learn its IP address. Devices using RARP require that a RARP server be present on the network to answer RARP requests.
20. BOOTP uses UDP messages, which can be forwarded over routers. It also provides diskless workstations with additional information, including the IP address of the file server holding the memory image, the IP address of the default router and the subnet mask to use.
21. DHCP allows both manual IP address assignment and automatic assignment. In most systems, it has largely replaced RARP and BOOTP.
22. Since the DHCP server may not be reachable by broadcasting, a DHCP relay agent is needed on each LAN. To find its IP address, a newly booted host broadcasts a DHCP DISCOVER packet and the DHCP server responds to it. DHCP follows the idea of leasing in order to avoid losing IP addresses when hosts go down.


Exercises
1. Does broadcasting really increase the amount of traffic?
2. What are the advantages of IP multicasting?
3. Can multicasting use TCP?
4. What is the advantage of notation like 233/8 and 224/4?
5. Change the multicast IP address 230.43.14.7 to an Ethernet multicast physical address.
6. What are the functions of the group joining module of IGMP?
7. What are the functions of the group leaving module?
8. What are the general characteristics of Mobile IP?
9. Why is it necessary to use the limited broadcast in the BOOTP request/response?
10. How is DHCP different from RARP and BOOTP?

Answers 1. Broadcasting by itself does not add to network traffic, but it adds extra host processing. Broadcasting can lead to additional traffic if the receiving hosts incorrectly respond with errors such as with errors such as ICMP port unreachables. Also, routers normally dont forward broadcast packets, whereas bridges normally do, so broadcasts on a bridged network can travel much farther than they would on a routed network.

119

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

2. Multicasting is useful because it conserves bandwidth, in many cases the most expensive part of network operations. It does this by replicating packets as needed within the network, thereby not transmitting unnecessary packets. Multicasting is the most economical technique for sending a packet stream (which could be audio, video, or data) from one location to many other locations on the Internet simultaneously. Its commercial applications include webcasting over the Internet (one-to-many multicasts), multiparty computer games and conference calls (many-to-many multicasts) and communication between devices behind the scenes (the focus of recent work on small group multicast, also called explicit multicast or Xcast).

3. No. Multicasting uses UDP (User Datagram Protocol) as its underlying transport protocol. TCP (Transmission Control Protocol) uses frequent transmission of acknowledgement (ACK) packets between the receiver and the transmitter for flow control, and also to determine if packets have arrived safely, so that dropped packets can be retransmitted. This form of feedback and retransmission does not scale well to the one-to-many case, although some forms of reliable multicast do use negative acknowledgements (NACKs) to signal the need for retransmission. UDP is a simpler protocol in which there is no acknowledgement of the success or failure of the transmission of any packet, and no retransmission, at the transport layer. (In the jargon, UDP is called "best effort".) Strictly speaking, therefore, multicast data transport is unreliable, and any reliability must be engineered in at a higher level.

4. It is cumbersome to refer to address blocks in complete dotted decimal notation. Hence CIDR developed a shorthand, which IP multicasting has also adopted, in which the start of a block and the number of bits that are fixed are specified. In that shorthand, the multicast address space can be described as 224.0.0.0/4 or, even more simply, as 224/4. The fixed part of the address is referred to as the prefix, and this block would be pronounced "two twenty four slash four". So, in the above question, 233/8 means all addresses from 233.0.0.0 to 233.255.255.255, which is the GLOP address space. Note that the larger the number after the slash, the longer the prefix and the smaller the actual address block.

5. The solution can be obtained in two steps (a short script after these answers illustrates the computation).
   a. Write the rightmost 23 bits of the IP address in hexadecimal. This can be done by changing the rightmost 3 bytes to hexadecimal and then subtracting 8 from the leftmost digit if it is greater than or equal to 8. In our example the result is 2B:0E:07.
   b. Add the result of part (a) to the starting Ethernet multicast address, which is 01:00:5E:00:00:00. The result is 01:00:5E:2B:0E:07.

6. Receive a request from a process to join a group:
   i. Look for the corresponding entry in the table.
   ii. If (found), increment the reference count.
   iii. If (not found)
        1. Create an entry with the reference count set to one.
        2. Add the entry to the table.
        3. Request a membership report from the output module.
        4. Inform the data link layer to update its configuration table.
   iv. Return.

7. Receive a request from a process to leave a group:
   i. Look for the corresponding entry in the table.
   ii. If (found)
        1. Decrement the reference count.
        2. If (the reference count is zero)
             a. If (there is any timer for this entry), cancel the timer.
             b. Change the state to free.
             c. Request a leave report from the output module.
        3. Return.

8. The general characteristics of IP mobility support are:
   i. Transparency: Mobility is transparent to applications and transport layer protocols as well as to routers that are not involved in the change. In particular, as long as they remain idle, all open TCP connections survive a change in network and are ready for further use.
   ii. Interoperability with IPv4: A host using mobile IP can interoperate with stationary hosts that run conventional IPv4 software as well as with other mobile hosts. Furthermore, no special addressing is required; the addresses assigned to mobile hosts do not differ from addresses assigned to fixed hosts.
   iii. Scalability: The solution scales to large internets. In particular, it permits mobility across the global Internet.
   iv. Security: Mobile IP provides security facilities that can be used to ensure all messages are authenticated (i.e., to prevent an arbitrary computer from impersonating a mobile host).
   v. Macro mobility: Rather than attempting to handle rapid network transitions such as one encounters in a wireless cellular system, mobile IP works well for a user who takes a portable computer on a business trip and leaves it attached to the new location for a week.

9. Suppose client machine A wants to use BOOTP to find bootstrap information, including its IP address, and suppose B is the server on the same physical net that will answer the request. Because A does not know B's IP address or the IP address of the network, it must broadcast its initial BOOTP request using the IP limited broadcast address. The issue is whether B should send the reply to A's IP address (since B knows A's IP address) or to the limited broadcast address. The server B uses the limited broadcast address. The reason is that if B used A's IP address, B's network interface software would have to use a mechanism like ARP to find out A's MAC address. However, until the BOOTP reply reaches A, A does not know its IP address, so A will not respond to B's ARP request. Therefore B has only two alternatives: either broadcast the reply or use the information from the request packet to manually add an entry to its ARP cache. On systems that do not allow programs to modify the ARP cache, broadcasting is the only solution.

10. DHCP is based on BOOTP and maintains some backward compatibility. The main difference is that BOOTP was designed for manual pre-configuration of the host information in a server database, while DHCP allows for dynamic allocation of network addresses and configurations to newly attached hosts. Additionally, DHCP allows for recovery and reallocation of network addresses through a leasing mechanism. RARP is a protocol used by Sun and other vendors that allows a computer to find out its own IP number, which is one of the protocol parameters typically passed to the client system by DHCP or BOOTP. RARP does not support other parameters and, using it, a server can only serve a single LAN. DHCP and BOOTP are designed so they can be routed.
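The mapping arithmetic used in answer 5 can be checked with a short Python sketch. It is only an illustration of the rule described above (keep the low-order 23 bits of the class D address and place them in the fixed Ethernet multicast prefix 01:00:5E); the function name is ours.

    def multicast_ip_to_mac(ip):
        # Map an IPv4 multicast address to its Ethernet multicast MAC address.
        octets = [int(part) for part in ip.split(".")]
        # Keep only the low-order 23 bits of the IP address.
        low23 = ((octets[1] & 0x7F) << 16) | (octets[2] << 8) | octets[3]
        # Place them in the fixed prefix 01:00:5E:00:00:00.
        mac = (0x01005E << 24) | low23
        return ":".join("%02X" % b for b in mac.to_bytes(6, "big"))

    print(multicast_ip_to_mac("230.43.14.7"))   # prints 01:00:5E:2B:0E:07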


UNIT - 3


3.1 INTRODUCTION

In the first two units of this course, you learnt about the protocols that are required for the basic functioning of the network. IP, ICMP, TCP and UDP are the basic protocols required to transfer data from one machine to another across multiple networks. To be more specific, the network layer protocols along with the transport layer protocols provide data transport from an application program running on one machine to an application program running on another machine. Due to the demands from the users, it also becomes necessary for IP to support multicasting and mobility. However, all these protocols do not serve the programmers or end users directly.

In this unit, you are going to learn about the application layer protocols with which application programmers develop application programs; these application programs are used by the end users to carry out their tasks. All these applications have to identify the remote machine with which they want to communicate. Since remembering the dotted decimal notation of IP addresses is difficult, machines are assigned mnemonic addresses called domain names. However, the TCP/IP protocol stack can understand only IP addresses in specifying the end points. Hence it becomes necessary for the client to map the domain name into its equivalent IP address. The Domain Name System (DNS) is the application protocol meant for this.

Another frequently used application is sharing files among a set of users who belong to the same organization or work on the same project. The protocols that support this are the File Transfer Protocol (FTP), the Trivial File Transfer Protocol (TFTP) and the Network File System (NFS). At the end of this unit you should be familiar with these protocols. Another important aspect we discuss in this unit is the elimination of specialized servers in favour of more generality, by allowing the user to establish a login session on the remote machine and then execute commands. The protocol that achieves this is TELNET.

3.2 LEARNING OBJECTIVES

To understand the ideas of domain names
To differentiate between flat name space and hierarchical name space
To study about the different types of resource records of DNS
To learn about the various types of queries made by the DNS in resolving the mapping

To study about the message formats of DNS
To understand the ideas and challenges of shared file access
To study about the process model of FTP
To know the features of anonymous FTP
To understand the differences between reliable file transfer and file transfer with limited features
To study about the nature of file transfer in TFTP
To understand the NFS protocol and other related protocols
To learn about remote login facility
To study about the TELNET protocol
To know the features of Rlogin (BSD UNIX)

3.3 DOMAIN NAME SYSTEM

The Domain Name System (DNS) is an application used by other application layer protocols or applications to map the mnemonic addresses called domain names into their equivalent IP addresses, so that they can communicate with applications running on other hosts of the network. You will learn about the principles and the implementation details of DNS in this section.

Remembering the IP addresses of servers is difficult for users, and hence they prefer mnemonic addresses. However, the network is able to understand only numerical addresses. Hence some mechanism is required to convert the mnemonic addresses into numerical addresses, and DNS is the application layer protocol that achieves this. The name-to-address translation is easy for small-scale networks: the table can be stored manually in the computer memory by the network administrators. But this is difficult for large networks, as one single hosts file cannot relate every name to its IP address and vice versa; the host file grows too large. One solution is to store all the information in a single computer and allow every computer that needs mapping to access this centralized information. But this would create huge traffic on the network. Hence it is necessary to divide this huge amount of information into smaller parts and store each part on a different host. The end system that needs mapping will contact the closest server holding the relevant mapping information.

We will discuss the DNS application protocol in terms of the name space, the domain name space, name servers, the client/server architecture and the DNS message format.

3.3.1 The Name Space

The name space is the one from which the names are chosen and assigned to machines. Assigning names to machines should take place in such a way that the names are unique and unambiguous. Organization of the name space is easier in small networks, since the size of the name space is small; it is easy to satisfy uniqueness and unambiguity in such cases. However, the size of the Internet is very large (approximately one hundred million computers connected) and choosing symbolic names is very difficult in such a large network. Two approaches are followed in the organization of the name space, namely flat and hierarchical.

In a flat name space, a name is assigned to an address. A name in this space is a sequence of characters without structure. The names may or may not have a common section; even if they do, it has no significance. The main disadvantage of a flat name space is that it cannot be used in a large system such as the Internet, because it must be centrally controlled to avoid ambiguity and duplication.

In a hierarchical name space, each name is made of several parts. The first part can define the nature of the organization, the second part can define the name of the organization, the third part can define departments in the organization, and so on. For example, in cs.annauniv.edu, edu indicates an educational institution, annauniv indicates the name of the organization and cs indicates the computer science department. A central authority can assign the part of the name that defines the nature of the organization and the name of the organization. The responsibility for the rest of the name can be given to the organization itself. In cs.annauniv.edu, only annauniv.edu has been assigned by a central authority; the name cs has been assigned by the university itself.

3.3.2 Domain Name Space

The domain name space follows a hierarchical name space to make the Domain Name System scalable. One well known example of a hierarchical naming space is the addressing scheme followed by the postal system. In the postal system, name management is done by requiring letters to specify (implicitly or explicitly) the country, state or province, city, and street address of the addressee. In this scheme, even if there are two persons with the same name in different areas of the city, it does not lead to duplication.

In the hierarchical organization of the name space, the names are designed in an inverted-tree structure with the root at the top. The tree can have only 128 levels: level 0 (root) to level 127, as shown in figure 3.1. Conceptually, the Internet is divided into over 200 top-level domains, where each domain covers many hosts. Each domain is partitioned into subdomains, and these are further partitioned, and so on. The leaves of the tree represent domains that have no subdomains. A leaf domain may contain a single host, or it may represent a company and contain thousands of hosts. The top-level domains come in two flavors: generic and countries. The original generic domains were com (commercial), edu (educational institutions), gov (the US Federal Government), int (certain international organizations), mil

(the US armed forces), net (network providers) and org (nonprofit organizations). The country domains include one entry for every country, as defined in ISO 3166.


Figure 3.1. A portion of the Internet domain name space

Each node in the tree has a label, which is a string with a maximum of 63 characters. The root label is a null string. DNS requires that nodes that branch from the same node have different labels, which guarantees the uniqueness of the domain names.

Each node in the tree has a domain name. A full domain name is a sequence of labels separated by dots (.). The domain names are always read from the node up to the root. The last label is the label of the root (null). This means that a full domain name always ends in a null label, and hence the last character is a dot, because the null string is nothing.

If a label is terminated by a null string, it is called a fully qualified domain name (FQDN). An FQDN is a domain name that contains the full name of a host. It contains all labels, from the most specific to the most general, that uniquely define the name of the host. For example, the domain name challenger.atc.fhda.edu is the FQDN of a computer named challenger installed at the Advanced Technology Center (ATC) at De Anza College. A DNS server can only match an FQDN to an address.

If a label is not terminated by a null string, it is called a partially qualified domain name (PQDN). A PQDN starts from a node, but it does not reach the root. It is used when the name to be resolved belongs to the same site as the client. For example, if a user at the fhda.edu site wants to get the IP address of the challenger computer, he or


she can define the partial name challenger. The DNS client adds the suffix atc.fhda.edu before passing the address to the DNS server.

3.3.3 Name Servers

The entire information content of any domain name space is enormous. It is difficult to store it in a single computer, as this makes it inefficient when requests are sent from all over the world to the same computer. In that case, the single computer is subjected to heavy load and may not be able to respond within a reasonable period. Moreover, this way of maintaining the information leads to a single point of failure. The solution to these problems is to distribute the information among many computers called DNS servers. One way to do this is to divide the whole name space into many domains based on the first level: the root stands alone, and there are as many domains or subtrees as there are first-level nodes. Since a domain created this way could be very large, DNS allows domains to be divided further into smaller domains (subdomains). As a result, we have a hierarchy of servers in the same way that we have a hierarchy of names. Since the complete domain name hierarchy cannot be stored on a single server, it is divided among many servers.

The domain or subdomain over which a server has complete authority is called a zone. If a server accepts responsibility for a domain and does not divide the domain into smaller domains, the domain and the zone refer to the same thing. The server makes a database called a zone file and keeps all the information for every node under the domain. However, if a server divides its domain into subdomains and delegates part of its authority to other servers, domain and zone refer to different things. The information about the nodes in the subdomains is stored in the servers at the lower levels, with the original server keeping some sort of reference to these lower-level servers. The original server is still vested with the overall responsibility: it still has a zone, but the details of the data are maintained by the lower-level servers. A server can also divide part of its domain and delegate responsibility but still keep part of the domain for itself. Its zone then contains detailed information regarding the part of the domain not delegated to other servers.

A root server is a server whose zone consists of the whole tree. A root server usually does not store any information about domains but delegates its authority to other servers, keeping references to those servers. There are several root servers, each covering the whole domain name space, and they are distributed all around the world. Besides the root servers, DNS servers are of two types, namely primary servers and secondary servers. A primary server is a server that stores a file about the

zone for which it is an authority. It is responsible for creating, maintaining, and updating the zone file, and it stores the zone file on a local disk. A secondary server is a server that transfers the complete information about a zone from another server (primary or secondary) and stores the file on its local disk. The secondary server neither creates nor updates the zone files. If updating is required, it must be done by the primary server, which sends the updated version to the secondary. The primary and secondary servers are both authoritative for the zones they serve. The idea is not to put the secondary server at a lower level of authority but to create redundancy for the data, so that if one server fails, the other can continue serving clients.

3.3.4 The DNS Client Server Model

DNS is designed as a client-server application. A host that needs to map an address to a name or a name to an address calls a DNS client called a resolver. The resolver accesses the closest DNS server with a mapping request. The server checks the generic domains or the country domains to find the mapping. In both cases, if the server has the information, it satisfies the resolver; otherwise, it either refers the resolver to other servers or asks other servers to provide the information. After receiving the mapping, the resolver interprets the response to see whether it is a real resolution or an error, and delivers the result to the process that requested it.

The client-server communication in resolving domain names into their equivalent IP addresses can take place in two different ways: servers can resolve the mapping either by recursive resolution or by iterative resolution. In recursive resolution, the resolver asks for a recursive answer from a name server. This means that the resolver expects the server to supply the final answer. If the server is the authority for the domain name, it checks its database and responds. If the server is not the authority, it sends the request to another server (usually the parent) and waits for the response. If the parent is the authority, it responds; otherwise, it sends the query to yet another server. When the query is finally resolved, the response travels back until it finally reaches the requesting client.

If the client does not ask for a recursive solution, the mapping is done iteratively. If the server is an authority for the name, it sends the answer. If it is not, it returns (to the client) the IP address of the server that it thinks can resolve the query. The client is responsible for repeating the query to this second server. If the newly addressed server can resolve the problem, it answers the query with the IP address; otherwise, it returns the IP address of a new server to the client. Now the client must repeat the query to the third server. This process is called iterative because the client repeats the same query to multiple servers.
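From an application's point of view, all of this resolution machinery is hidden behind a single resolver call. The following Python sketch simply asks the local stub resolver to map a name to its addresses; the host name is only an example, and whether the lookup is served recursively or iteratively is decided by the servers involved.

    import socket

    def resolve(name):
        # Ask the system's resolver (the DNS client) to map a name to IP addresses.
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr); the address is sockaddr[0].
        return sorted({info[4][0] for info in infos})

    print(resolve("www.example.com"))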


Each time a server receives a query for a name that is not in its domain, it needs to search its database for a server IP address. Reduction of this search time would increase efficiency. DNS handles this with a mechanism called caching. When a server asks for a mapping from another server and receives the response, it stores this information in its cache memory before sending it to the client. If the same or another client asks for the same mapping, it can check its cache memory and resolve the problem. However, to inform the client that the response is coming from the cache memory and not from an authoritative source, the server marks the response as unauthoritative.

Caching speeds up resolution, but it can also become problematic. If a server caches a mapping for a long time, it may send an outdated mapping to the client. To counter this, two techniques are used. First, the authoritative server always adds information to the mapping called the time to live (TTL). The TTL indicates the time in seconds that the receiving server can cache the information. After this time, the mapping is invalid and any query must be sent again to the authoritative server. Second, DNS requires that each server keep a TTL counter for each mapping it caches. The cache memory must be searched periodically and those mappings with an expired TTL must be removed.

3.3.5 DNS Message Format

The DNS message format allows a client to ask multiple questions in a single message. Each question consists of a domain name for which the client seeks an IP address, a specification of the query class and the type of object desired. The server responds by returning a similar message that contains answers to the questions for which the server has bindings. If the server cannot answer all questions, the response will contain information about other name servers that the client can contact to obtain the answers. Responses also contain information about the servers that are authorities for the replies and the IP addresses of those servers. Figure 3.2 shows the message format.
    IDENTIFICATION                          PARAMETER
    NUMBER OF QUESTIONS                     NUMBER OF ANSWERS
    NUMBER OF AUTHORITY                     NUMBER OF ADDITIONAL
                    QUESTION SECTION ...
                    ANSWER SECTION ...
                    AUTHORITY SECTION ...
                    ADDITIONAL INFORMATION SECTION ...

Figure 3.2 Domain Name Server message format


As the figure shows, each message begins with a fixed header. The header contains a unique IDENTIFICATION field that the client uses to match responses to queries, and a PARAMETER field that specifies the operation requested and a response code. Figure 3.3 gives the interpretation of the bits in the PARAMETER field. The fields labeled NUMBER OF each give a count of entries in the corresponding sections that occur later in the message. For example, the field labeled NUMBER OF QUESTIONS gives the count of entries that appear in the QUESTION SECTION of the message.

The QUESTION SECTION contains queries for which answers are desired. The client fills in only the question section; the server returns the questions and answers in its response. Each question consists of a QUERY DOMAIN NAME followed by QUERY TYPE and QUERY CLASS fields, as figure 3.4 shows.

    Bits of PARAMETER field    Meaning
    0                          Operation: 0 Query, 1 Response
    1-4                        Query Type: 0 Standard, 1 Inverse,
                               2 Completion 1 (now obsolete), 3 Completion 2 (now obsolete)
    5                          Set if answer authoritative
    6                          Set if message truncated
    7                          Set if recursion desired
    8                          Set if recursion available
    9-11                       Reserved
    12-15                      Response Type: 0 No error, 1 Format error in query,
                               2 Server failure, 3 Name does not exist

Figure 3.3 Meaning of bits of the PARAMETER field
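The bit positions in figure 3.3 translate directly into shift-and-mask operations. The Python sketch below (the helper name is ours) pulls the individual fields out of a 16-bit PARAMETER value, treating bit 0 as the most significant bit, as the figure does:

    def decode_parameter(param):
        # Decode the 16-bit PARAMETER field of a DNS message header.
        return {
            "response":            (param >> 15) & 0x1,   # bit 0: query (0) or response (1)
            "query_type":          (param >> 11) & 0xF,   # bits 1-4
            "authoritative":       (param >> 10) & 0x1,   # bit 5
            "truncated":           (param >> 9) & 0x1,    # bit 6
            "recursion_desired":   (param >> 8) & 0x1,    # bit 7
            "recursion_available": (param >> 7) & 0x1,    # bit 8
            "response_type":       param & 0xF,           # bits 12-15
        }

    print(decode_parameter(0x8180))   # a typical answer: response, recursion desired and available, no error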

    QUERY DOMAIN NAME ...
    QUERY TYPE                              QUERY CLASS

Figure 3.4 Format of entries in the QUESTION SECTION


Although the QUERY DOMAIN NAME field has variable length, the internal representation of domain names makes it possible for the receiver to know the exact length. The QUERY TYPE encodes the type of the question. The QUERY CLASS field allows domain names to be used for arbitrary objects, because official Internet names are only one possible class. It should be noted that, although the diagram in figure 3.4 follows our convention of showing formats in 32-bit multiples, the QUERY DOMAIN NAME field may contain an arbitrary number of octets; no padding is used. Therefore, messages to or from domain name servers may contain an odd number of octets.

In a domain name server message, each of the ANSWER SECTION, AUTHORITY SECTION, and ADDITIONAL INFORMATION SECTION consists of a set of resource records that describe domain names and mappings. Each resource record describes one name. Figure 3.5 shows the format.
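Before turning to the resource-record format of figure 3.5, it helps to see how the header of figure 3.2 and a question entry of figure 3.4 fit together on the wire. The sketch below builds a standard query for the address of a host, assuming the usual length-prefixed label encoding of domain names; the helper names are ours, not part of any library.

    import struct

    def encode_name(name):
        # Encode a domain name as length-prefixed labels ending with a zero octet.
        out = b""
        for label in name.rstrip(".").split("."):
            out += bytes([len(label)]) + label.encode("ascii")
        return out + b"\x00"

    def build_query(name, ident=0x1234):
        # Header: IDENTIFICATION, PARAMETER, then the four NUMBER OF fields.
        parameter = 0x0100                        # standard query with recursion desired
        header = struct.pack("!HHHHHH", ident, parameter, 1, 0, 0, 0)
        # One question: QUERY DOMAIN NAME, QUERY TYPE (1 = address), QUERY CLASS (1 = Internet).
        question = encode_name(name) + struct.pack("!HH", 1, 1)
        return header + question

    print(build_query("cs.annauniv.edu").hex())

Sending this datagram to UDP port 53 of a name server and decoding the reply according to figures 3.2 through 3.5 would complete a hand-built lookup.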


Figure 3.5 Format of resource records used in messages returned by DNS

The RESOURCE DOMAIN NAME field contains the domain name to which this resource record refers; it may be of arbitrary length. The TYPE field specifies the type of the data included in the resource record; the CLASS field specifies the data's class. The TIME TO LIVE field contains a 32-bit integer that specifies the number of seconds that information in this resource record can be cached. It is used by clients who have requested a name binding and may want to cache the results. The last two fields contain the result of the binding, with the RESOURCE DATA LENGTH field specifying the count of octets in the RESOURCE DATA field.

Have you understood?

1. What is meant by a domain name?
2. What are the difficulties in working with IP addresses themselves, without mnemonic addresses?
3. What are the limitations of a flat naming space?
4. Give examples of flat schemes and hierarchical schemes (apart from the domain name space).
5. What are the two flavors of top level domains?

6. Mention the original generic domains of DNS and the later additions to them.
7. After getting a domain name from the higher authority, is it necessary to get permission from higher authorities to divide it into further subdomains?
8. Mention the five tuples of a resource record of DNS.
9. What does a type A record indicate in resource records?
10. Write short notes on MX type records.
11. What is meant by an iterative query in resolving domain names?
12. How does a recursive query resolve domain names?
13. Mention the factors that affect the cost of a lookup in resolving domain names.
14. What are the mechanisms supported by DNS to make the mapping mechanism effective?
15. Apart from the IP address of the specified domain name, what are the other possible details that may be returned by the DNS server?

3.4 SHARED FILE ACCESS

Many network systems provide computers with the ability to access files on remote machines. However, many complications are involved in providing this facility to the users. If cost is the important consideration in providing this facility, a single, centralized file server provides secondary storage for a set of inexpensive computers that have no local disk storage. Examples of inexpensive computers are diskless machines, hand-held devices, etc. Such machines communicate with the centralized server over a high speed wireless network. If the computers involved in file sharing are conventional desktop computers, then they send copies of files across a network to an archival facility, where they are stored in case of accidental loss. In other types of environments, it may be necessary to share data across multiple programs, multiple users or multiple sites.

The ability to access files remotely can be achieved in two distinct forms, namely shared on-line access and whole-file copying (sharing by file transfer). Shared on-line access means allowing multiple programs to access a single file concurrently. Changes to the file take effect immediately and are available to all programs that access the file. Whole-file copying means that whenever a program wants to access a file, it obtains a local copy. Copying is often used for read-only data, but if the file must be modified, the program makes changes to the local copy and transfers a copy of the modified file back to the original site.

A file system that provides shared, on-line access for remote users does not necessarily require a user to invoke a special client program as a database system does. Instead, the operating system provides access to remote, shared files exactly the same way it provides access to local files. A user can execute any application program using a remote file as input or output. We say that the remote file is integrated with local files, and that the entire file system provides transparent access to shared files.

The major advantage of transparent access is that remote file access occurs with no visible changes to application programs. Users can access both local and remote files, allowing them to perform arbitrary computations on shared data. However, on-line shared access is difficult to implement in heterogeneous environments.

The alternative to integrated, transparent on-line access is file transfer. Accessing remote data with a transfer mechanism is a two-step process: the user first obtains a local copy and then operates on the copy. Most transfer mechanisms operate outside the local file system. A user must invoke a special purpose client program to transfer files. When invoking the client, the user specifies a remote computer on which the desired file resides and possibly an authorization needed to obtain access. The client contacts the server on the remote machine and requests a copy of the file. Once the transfer is complete, the user terminates the client and uses an application program on the local system to read or modify the local copy.

However, sharing by file transfer also has to face challenges in heterogeneous environments. In some applications the two computers involved may both be large servers, each running a different operating system with a different file system and a different character set. In another application, one of the computers may be a server and the other an item of equipment such as a cable modem or a set top box which does not have a hard disk; in this case all the data that is transferred must have been formatted specifically for running in the cable modem or set top box. Clearly, therefore, the file transfer protocol associated with the second type of application can be much simpler than the first. Hence, to meet the requirements of these two different types of application, the TCP/IP protocol stack provides two different file transfer protocols, namely FTP (File Transfer Protocol) and TFTP (Trivial FTP).

Have you understood?

1. Give a sample application scenario where it may be necessary to share files.
2. What are the challenges in sharing files across systems and across networks?
3. What are the two major types of shared file access?
4. What are the limitations of on-line shared access of files?
5. What are the advantages of the file transfer method?


3.5 FILE TRANSFER PROTOCOL

File transfer is among the most frequently used TCP/IP applications, and it accounts for a major portion of the traffic in the Internet. FTP is a standard application layer protocol provided by the TCP/IP protocol stack for copying a file from one host to another host. FTP is also based on the client/server model like many other applications of TCP/IP. However, it differs from other applications in that it uses two connections


between the client and the server. One is used for data traffic and the other for the control traffic. Separation of control and data transfer makes FTP more efficient. FTP is discussed in terms of the process model, the data connection, the control connection and the communication between the client and server.

3.5.1 Features

Although transferring files from one system to another in the presence of a TCP connection seems trivial, FTP is a complex protocol for many reasons. Two systems may use different naming conventions. Two systems may have different ways to represent text and data. Two systems may have different directory structures. In addition, FTP offers many facilities beyond the transfer function itself.

1. Interactive access: Although FTP is designed to be used by programs, most implementations provide an interactive interface that allows humans to interact easily with remote servers. For example, a user can ask for a listing of all files in a directory on a remote machine. Also, the client usually responds to the input "help" by showing the user information about the possible commands that can be invoked.

2. Format specification: FTP allows the client to specify the type and format of stored data. For example, the user can specify whether a file contains text or binary integers and whether text files use the ASCII or EBCDIC character sets.

3. Authentication control: FTP requires clients to authorize themselves by sending a login name and password to the server before requesting file transfers. The server refuses access to clients that cannot supply a valid login and password. (The client sketch after this list shows these facilities in use.)
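These facilities are what an FTP client library exposes to programs. As a rough illustration (the host name and the account details below are placeholders, not a real server), the standard Python ftplib module can be used to authenticate, issue an interactive-style directory listing and select the representation type:

    from ftplib import FTP

    with FTP("ftp.example.com") as ftp:                   # connects to the control port (21)
        print(ftp.login(user="alice", passwd="secret"))   # authentication control
        ftp.dir()                                         # directory listing, as in interactive access
        ftp.voidcmd("TYPE I")                             # format specification: binary (image) type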

Anna University Chennai

134




Fig 3.6. FTP Client/Server Interaction

As the figure shows, the client control process connects to the server control process using one TCP connection, while the associated data transfer processes use their own TCP connections. In general, the control process and the control connection remain alive as long as the user keeps the FTP session active. However, FTP establishes a new data transfer connection for each file transfer. In fact, many implementations create a new pair of data transfer processes, as well as a new TCP connection, whenever the server needs to send information to the client. Data transfer connections and the data transfer processes that use them can be created dynamically when needed, but the control connection persists throughout a session. Once the control connection disappears, the session is terminated and the software at both ends terminates all data transfer processes.

Of course, client implementations that execute on a computer without operating system support for multiple processes may have a less complex structure. Such implementations sacrifice generality by using a single application program to perform both the data transfer and control functions. However, the protocol requires that such clients still use multiple TCP connections, one for control and the other(s) for data transfer.

3.5.3 Control Connection and Data Connection

When a client forms an initial connection to a server, the client uses a random, locally assigned protocol port number, but contacts the server at a well-known port (port number 21). A server that uses only one protocol port can accept connections




from many clients because TCP uses both endpoints to identify a connection. However, an issue arises when the control processes create a new TCP connection for a given data transfer. Obviously, they cannot use the same pair of port numbers used in the control connection. Instead, the client obtains an unused port on its machine, which will be used for a TCP connection with the data transfer process on the server's machine. The data transfer process on the server machine uses the well-known port reserved for FTP data transfer (port number 20). To ensure that a data transfer process on the server connects to the correct data transfer process on the client machine, the server side must not accept connections from an arbitrary process. Instead, when it issues the TCP active open request, the server specifies the port that will be used on the client machine as well as the local port. We can now see why the protocol uses two connections: the client control process obtains a local port to be used in the file transfer, creates a transfer process on the client machine to listen at the port, communicates the port number to the server over the control connection, and then waits for the server to establish a TCP connection to the port. In addition to passing user commands to the server, FTP uses the control connection to allow the client and server control processes to coordinate their use of dynamically assigned TCP protocol ports and the creation of data transfer processes that use those ports.

The designers of FTP allowed FTP to use the TELNET network virtual terminal protocol. Unlike the full TELNET protocol, FTP does not allow option negotiation; it uses only the basic Network Virtual Terminal (NVT) definition. Thus, management of an FTP control connection is much simpler than management of a standard TELNET connection. Despite its limitations, using the TELNET definition instead of inventing a new one helps simplify FTP considerably.

3.5.4 Communication between Client and Server

The FTP client and the FTP server run on different computers in the Internet. These two computers may use different operating systems, different character sets, different file structures and different file formats. Hence it is necessary for FTP to make this heterogeneity compatible. FTP has to achieve this on both the control connection and the data connection.

3.5.4.1 Communication over the Control Connection

FTP uses the control connection between the client control process and the server control process. During this communication, commands are sent from the client to the server and responses are sent from the server to the client. Commands, which are sent from the FTP client control process, are in the form of ASCII uppercase, which may or may not be followed by an argument. The control commands are divided into six groups, namely access commands, file management commands, data formatting

Anna University Chennai

136

DIT 116

NETWORK PROTOCOLS

commands, port defining commands, file transferring commands, and miscellaneous commands. Access commands are used to access the remote system. Access commands are used to provide the details like user information, password, account information etc. File management commands are used to access the file system on the remote computer. They allow the user to navigate through the directory structure, create new directories, and delete files and so on. The data formatting commands allow the user to define the data structure, file type and transmission mode. Port defining commands define the port number for the data connection on the client side. This can be done in two ways. The client can choose an ephemeral port number and send it to the server or the client asks the server to first choose a port number. File transfer commands are used to actually transfer the files. These commands are used to retrieve files, store files, allocate storage space for the files at the server, position the file marker at a specified data point etc. Miscellaneous command helps the client to know the various details of the server. 3.5.4.2 Communication over Data Connection The purpose and implementation of the data connection are different from that of the control connection. We want to transfer files through the data connection. The client must define the type of file to be transferred, the structure of the data, and the transmission mode. This is required to resolve the heterogeneity problem that may exist between the client and the server. FTP can transfer one of the following file types across the data connection. ASCII file is the default format for transferring test files. If one or both ends of the connection use EBCDIC encoding, the file can be transferred using EBCDIC encoding. Image file is the default format for transferring binary files. Image file is sent as continuous streams of bits without any interpretation or encoding. This is mostly used to transfer binary files such as compiled programs. FTP can transfer a file across the data connection using any one of the interpretations about the structure of the data. File structure is the default structure. In this file has no structure and considered as a continuous stream of bytes. In record structure, the file is divided into records. Page structure is used to divide the file into pages, with each page having a page number and a page header. The pages can be stored and accessed randomly or sequentially. FTP can transfer a file across the data connection using one of the three transmission modes. Stream mode is the default mode and in this mode data are delivered from FTP to TCP as a continuous stream of bytes. TCP is responsible for chopping data into segments of appropriate size. In block mode, data can be delivered from FTP to TCP in blocks. In compressed mode, if the file is big, the data can be compressed. The
137

NOTES

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

compression method normally used is run-length encoding. 3.5.5 The Users View of FTP Users view FTP as an interactive system. Once invoked, the client performs the following operations repeatedly: read a line of input, parse the line to extract a command and its arguments, and executed the command with the specified arguments. For example, to initiate the version of FTP available under UNIX, the user invokes the ftp command: The local FTP client program begins and issues a prompt to the user. Following the prompt, the user can issue commands like help. ftp> help Commands may be abbreviated. Commands are: ! cr macdef proxy $ delete mdelete sendport account debug mdir put append dir mget pwd ascii disconnect mkkir quit bell form mls quote binary get mode recv bye glob mput remotehelp case hash nmap rename cd help ntrans reset cdup lcd open rm close ls prompt runique

send status struct snique tenex trace type user verbose ? dir

To obtain more information about a given command the user types help command as in the following example (output is shown in the format ftp produces): ftp> help is ls ftp> help cdup cdup ftp> help glob glob ftp> help bell bell

list contents of remote directory change remote working directory to parent directory toggle metacharacter expansion of local file names beep when command completed

To execute a command, the user types the command name: ftp> bell Bell mode on.
Anna University Chennai 138

DIT 116

NETWORK PROTOCOLS

3.5.6 Anonymous FTP Normal assumption in FTP is the client can access a file from a remote server only if the client has the access right. In an FTP session, a client is authenticated by the server before permitting the client user to transfer a file from the server. This is not always the case since FTP is also used to access information from a server that allows unknown users to log on to it. To access information from this type of server, the user must know the DNS name of the server but, when prompted for a user name he or she enters anonymous and, for the password, his or her e-mail address. In some instances, however, before granting the user access, the server carries out a rudimentary check that the client host has a valid domain name. Although the IP address of the client host has not been formally sent at this point this does not occur until the PORT command is sent it is present in the (IP) source address field of each of the IP datagrams that have been used to set up the (TCP) control connection and to send the username and password. Hence before granting access, the control part in the server uses its own resolver to check that the IP address of the host is in the DNS database. A sample interaction between the client and the server in anonymous FTP is described by comer as follows. % ftp ftp.cs.purdue.edu Connected to lucan.cs.purdue.edu. 220 lucan.cs.purdue.edu FTP server (Version wu-2.4.2-VR16(1) ready. Name (ftp.cs.purdue.edu:usera): anonymous 331 Guest login ok, send e-mail address as password. Password: guest 230 Guest login ok, access restrictions apply. ftp>get pub/comer/tcpbook.tar bookfile 200 PORT command okay. 150 Opening ASCII mode data connection for tcpbook.tar (9895469 bytes). 226 Transfer Complete. 9895469 bytes received in 22.76 seconds (4.3e+02 Kbytes/s) ftp>close 221 Goodbye ftp>quit The user specifies machine ftp.cs.purdue.edu as an argument to the FTP command and opens a TCP connection with the server. Once the TCP connection is complete, the user invokes anonymous FTP by specifying user name as anonymous and

NOTES

139

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

password as guest. After typing a login and password, the user requests a copy of a file using the get command. In the example, the get command is followed by two arguments that specify the remote file name and a name for the local copy. The remote file name is pub/comer/tcpbook.tar and the local copy will be placed in bookfile. Once the transfer completes, the user types close to break the connection with the server, and types quit to leave the client. Along with the commands, the interaction involves a lot of informational messages also. FTP messages always begin with a 3-digit number followed by text. Most come from the server; other output comes from the local client. Foe example, the message that begins 220 comes from the server and contains the domain name of the machine on which the server executes. The statistics that report the number of bytes received and the rate of transfer come from the client. Another important point to be noted from the above session is FTP is an out of band protocol. The client PORT command reports that a new TCP port number has been obtained for use as a data connection. The client sends the port information to the server over the control connection; data transfer processes at both ends use the new port number when forming a connection. After the transfer completes, the data transfer processes at each end close the connection. Have you understood? 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. What is the transport layer used by FTP? Apart from file transfer, what are the other functions provided by FTP? Why FTP is called an out of band protocol? How many processes should be active in an FTP client on the assumption that it interacts with only one FTP server? How many processes are active in an FTP server? Mention the usage of port numbers 20 and 21 in FTP. What are the issues involved in the data representation of FTP? What are the responsibilities of the FTP server? Under what conditions an FTP server closes the connection? What is meant by anonymous FTP?

3.6 TRIVIAL FILE TRANSFER PROTOCOL

FTP is the most general file transfer protocol and it provides a reliable service to the users. However, to achieve this reliability FTP has to perform many additional functions apart from the core file transfer function. As a result, FTP becomes a heavyweight protocol. If reliability is not the major issue or if the underlying network itself is reliable, a sophisticated file transfer protocol like FTP may not be required. Hence the TCP/IP suite provides a second file transfer protocol that provides an inexpensive, unso-



phisticated service. Know as the Trivial File Transfer Protocol, or (TFTP), it is intended for applications that do not need complex interactions between the client and server. TFTP restricts operations to simple file transfers and do not provide authentications. Because it is more restrictive, TFTP software is much smaller that FTP. Small size is important in many applications. For example, manufacturers of diskless services can import TFTP in read only memory (ROM) and used it to obtain an initial memory image when the machine is powered on. The program in ROM is called the system bootstrap. The advantage of using TFTP is that it allows bootstrapping code to use the same underlying TCP/IP protocols that the operating system uses once it begins execution. Thus, it is possible for a computer to bootstrap from a server on another physical network. Unlike FTP, TFTP does not need a reliable stream transport service. It runs on top of UDP or any other unreliable packet delivery system, using timeout and retransmission to ensure that data arrives. We discuss TFTP in terms of overview, connection protocol and transfer protocol. 3.6.1 Protocol Overview

NOTES

Any transfer begins with a request to read or write a file, which also serves to request a connection. If the server grants the request, the connection is opened and the file is sent in fixed length blocks of 512 bytes. Each data packet contains one block of data, and must be acknowledged by an acknowledgment packet before the next packet can be sent. A data packet of less than 512 bytes signals termination of a transfer. If a packet gets lost in the network, the intended recipient will timeout and may retransmit his last packet (which may be data or an acknowledgment), thus causing the sender of the lost packet to retransmit that lost packet. The sender has to keep just one packet on hand for retransmission, since the lock step acknowledgment guarantees that all older packets have been received. You please notice that both machines involved in a transfer are considered senders and receivers. One sends data and receives acknowledgments; the other sends acknowledgments and receives data. Most errors cause termination of the connection. An error is signaled by sending an error packet. This packet is not acknowledged, and not retransmitted (i.e., a TFTP server or user may terminate after sending an error message), so the other end of the connection may not get it. Therefore timeouts are used to detect such a termination when the error packet has been lost. Errors are caused by three types of events: not being able to satisfy the request (e.g., file not found, access violation, or no such user), receiving a packet which cannot be explained by a delay or duplication in the network (e.g., an incorrectly formed packet), and losing access to a necessary resource (e.g., disk full or access denied during a transfer).
141 Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

TFTP recognizes only one error condition that does not cause termination, the source port of a received packet being incorrect. In this case, an error packet is sent to the originating host. This protocol is very restrictive, in order to simplify implementation. For example, the fixed length blocks make allocation straight forward, and the lock step acknowledgement provides flow control and eliminates the need to reorder incoming data packets. 3.6.2 Initial Connection Protocol A transfer is established by sending a write request (WRQ) to write onto a foreign file system or a read request (RRQ) to read from it, an acknowledgment packet for write (ACK), or the first data packet (DATA) for read. In general an acknowledgment packet will contain the block number of the data packet being acknowledged. Each data packet has associated with it a block number; block numbers are consecutive and begin with one. Since the positive response to a write request is an acknowledgment packet, in this special case the block number will be zero. (Normally, since an acknowledgment packet is acknowledging a data packet, the acknowledgment packet will contain the block number of the data packet being acknowledged.) If the reply is an error packet, then the request has been denied. In order to create a connection, each end of the connection chooses a terminal identifier (TID) for itself, to be used for the duration of that connection. The TIDs chosen for a connection should be randomly chosen, so that the probability that the same number is chosen twice in immediate succession is very low. Every packet has associated with it the two TIDs of the ends of the connection, the source TID and the destination TID. These TIDs are handed to the supporting UDP (or other datagram protocol) as the source and estination ports. A requesting host chooses its source TID as described above, and sends its initial request to the known TID 69 decimal (105 octal) on the serving host. The response to the request, under normal operation, uses a TID chosen by the server as its source TID and the TID chosen for the previous message by the requestor as its destination TID. The two chosen TIDs are then used for the remainder of the transfer. As an example, the following shows the steps used to establish a connection to write a file. Note that WRQ, ACK, and DATA are the names of the write request, acknowledgment, and data types of packets respectively. 1. 2. Host A sends a WRQ to host B with source= As TID, destination= 69. Host B sends a ACK (with block number= 0) to host A with source= Bs TID, destination= As TID.

At this point the connection has been established and the first data packet can be sent by Host A with a sequence number of 1. In the next step, and in all succeeding
Anna University Chennai 142

DIT 116

NETWORK PROTOCOLS

steps, the hosts should make sure that the source TID matches the value that was agreed on in steps 1 and 2. If a source TID does not match, the packet should be discarded as erroneously sent from somewhere else. An error packet should be sent to the source of the incorrect packet, while not disturbing the transfer. This can be done only if the TFTP in fact receives a packet with an incorrect TID. If the supporting protocols do not allow it, this particular error condition will not arise. The following example demonstrates a correct operation of the protocol in which the above situation can occur. Host A sends a request to host B. Somewhere in the network, the request packet is duplicated, and as a result two acknowledgments are returned to host A, with different TIDs chosen on host B in response to the two requests. When the first response arrives, host A continues the connection. When the second response to the request arrives, it should be rejected, but there is no reason to terminate the first connection. Therefore, if different TIDs are chosen for the two connections on host B and host A checks the source TIDs of the messages it receives, the first connection can be maintained while the second is rejected by returning an error packet. 3.6.3 Transfer Protocol The sending side transmits a file in fixed size (512 byte) blocks and awaits an acknowledgment for each block before sending the next. The receiver acknowledges each block upon receipts. The rules for TFTP are simple. The first packet sent requests a file transfer and establishes the interaction between client and server the packet specifies a file name and whether the file will be read (transferred to the client) or written (transferred to the server). Blocks of the file are numbered consecutively starting at 1. Each data packet contains a header that specifies the number of block it carries, and each acknowledgement contains the number of the block being acknowledged. A block of 512 bytes signals the end of file. It is possible to send an error message either in the place of data or an acknowledgement: error terminates the transfer.

NOTES

143

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES
    READ REQ. (opcode 1):   2-octet opcode | FILENAME (n octets) | 0 | MODE (n octets) | 0
    WRITE REQ. (opcode 2):  2-octet opcode | FILENAME (n octets) | 0 | MODE (n octets) | 0
    DATA (opcode 3):        2-octet opcode | 2-octet BLOCK # | up to 512 octets of DATA
    ACK (opcode 4):         2-octet opcode | 2-octet BLOCK #
    ERROR (opcode 5):       2-octet opcode | 2-octet ERROR CODE | ERROR MESSAGE (n octets) | 0

Figure 3.7 The five message types
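The fixed layouts of figure 3.7 can be produced with a few lines of packing code. The sketch below (the helper names and the file name are ours, used only for illustration) builds a read request, a data block and an acknowledgement exactly as the figure describes; in a real transfer the read request would be sent to UDP port 69 of the server.

    import struct

    RRQ, WRQ, DATA, ACK, ERROR = 1, 2, 3, 4, 5     # the five opcodes

    def rrq(filename, mode="octet"):
        # opcode, then FILENAME and MODE as zero-terminated strings
        return struct.pack("!H", RRQ) + filename.encode() + b"\x00" + mode.encode() + b"\x00"

    def data(block, payload):
        # opcode, 2-octet block number, up to 512 octets of data
        return struct.pack("!HH", DATA, block) + payload

    def ack(block):
        return struct.pack("!HH", ACK, block)

    print(rrq("startup.cfg").hex())
    print(ack(0).hex())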

Once a read or write request has been made, the server uses the IP address and UDP protocol port number of the client to identify subsequent operations. Thus, neither data messages (the messages that carry blocks from the file) nor ack messages (the messages that acknowledge data blocks) need to specify the file name. The final message type illustrated in figure 3.7 is used to report errors. Lost messages can be retransmitted after a timeout, but most other errors simply cause termination of the interaction.

TFTP retransmission is unusual because it is symmetric. Each side implements a timeout and retransmission. If the side sending data times out, it retransmits the last data block. If the side responsible for acknowledgements times out, it retransmits the last acknowledgement. Having both sides participate in retransmission helps ensure that the transfer will not fail after a single packet loss.

While symmetric retransmission guarantees robustness, it can lead to excessive retransmissions. The problem, known as the Sorcerer's Apprentice bug, arises when an acknowledgement for data packet k is delayed, but not lost. The sender retransmits the data packet, which the receiver acknowledges. Both acknowledgements eventually arrive, and each triggers a transmission of data packet k+1. The receiver will acknowledge both copies of data packet k+1, and the two acknowledgements will each cause
the sender to transmit data packet k+2. The Sorcerer's Apprentice bug can also start if the underlying internet duplicates packets. Once started, the cycle continues indefinitely, with each data packet being transmitted exactly twice. Although TFTP contains little except the minimum needed for transfer, it does support multiple file types. One interesting aspect of TFTP allows it to be integrated with electronic mail. A client can specify to the server that it will send a file that should be treated as mail, with the FILENAME field taken to be the name of the mailbox to which the server should deliver the message.
Have you understood?
1. What is the transport layer used by TFTP?
2. Mention the circumstances in which TFTP is preferred over FTP.
3. TFTP plays a major role in configuring routers. Justify this statement.
4. Mention the steps involved in establishing a TFTP session.
5. What are the steps involved in the transfer of a file in TFTP?
3.7 NETWORK FILE SYSTEM
Initially developed by Sun Microsystems Incorporated, the Network File System (NFS) provides on-line shared file access that is transparent and integrated; many TCP/IP sites use NFS to interconnect their computers' file systems. From the user's perspective, NFS is almost invisible. A user can execute an arbitrary application program and use arbitrary files for input or output. The file names themselves do not show whether the files are local or remote. The NFS protocol is designed to be portable across different machines, operating systems, network architectures, and transport protocols. This portability is achieved through the use of Remote Procedure Call (RPC) primitives built on top of an eXternal Data Representation (XDR). Implementations already exist for a variety of machines, from personal computers to supercomputers. The supporting mount protocol allows the server to hand out remote access privileges to a restricted set of clients. It performs the operating system-specific functions that allow, for example, attaching remote directory trees to some local file system.
3.7.1 RPC
The remote procedure call model is similar to the local procedure call model. In the local case, the caller places arguments to a procedure in some well-specified location (such as a result register). It then transfers control to the procedure, and eventually gains back control. At that point, the results of the procedure are extracted from the well-specified location, and the caller continues execution.

The remote procedure call is similar, in that one thread of control logically winds through two processes: one is the caller's process, the other is a server's process. That is, the caller process sends a call message to the server process and waits (blocks) for a reply message. The call message contains the procedure's parameters, among other things. The reply message contains the procedure's results, among other things. Once the reply message is received, the results of the procedure are extracted, and the caller's execution is resumed. On the server side, a process is dormant awaiting the arrival of a call message. When one arrives, the server process extracts the procedure's parameters, computes the results, sends a reply message, and then awaits the next call message. Note that in this model, only one of the two processes is active at any given time. However, this model is only given as an example. The RPC protocol makes no restrictions on the concurrency model implemented, and others are possible. For example, an implementation may choose to have RPC calls be asynchronous, so that the client may do useful work while waiting for the reply from the server. Another possibility is to have the server create a task to process an incoming request, so that the server can be free to receive other requests.
3.7.1.1 Transports and Semantics
The RPC protocol is independent of transport protocols. That is, RPC does not care how a message is passed from one process to another. The protocol deals only with the specification and interpretation of messages. It is important to point out that RPC does not try to implement any kind of reliability and that the application must be aware of the type of transport protocol underneath RPC. If it knows it is running on top of a reliable transport such as TCP, then most of the work is already done for it. On the other hand, if it is running on top of an unreliable transport such as UDP, it must implement its own retransmission and time-out policy, as the RPC layer does not provide this service. Because of transport independence, the RPC protocol does not attach specific semantics to the remote procedures or their execution. Semantics can be inferred from (but should be explicitly specified by) the underlying transport protocol. For example, consider RPC running on top of an unreliable transport such as UDP. If an application retransmits RPC messages after short time-outs, the only thing it can infer if it receives no reply is that the procedure was executed zero or more times. If it does receive a reply, then it can infer that the procedure was executed at least once. A server may wish to remember previously granted requests from a client and not regrant them in order to ensure some degree of execute-at-most-once semantics. A server can do this by taking advantage of the transaction ID that is packaged with every RPC request. The main use of this transaction ID is by the client RPC layer in matching replies to requests.

However, a client application may choose to reuse its previous transaction ID when retransmitting a request. The server application, knowing this fact, may choose to remember this ID after granting a request and not regrant requests with the same ID in order to achieve some degree of execute-at-most-once semantics. The server is not allowed to examine this ID in any other way except as a test for equality. On the other hand, if using a reliable transport such as TCP, the application can infer from a reply message that the procedure was executed exactly once, but if it receives no reply message, it cannot assume the remote procedure was not executed. Note that even if a connection-oriented protocol like TCP is used, an application still needs time-outs and reconnection to handle server crashes. There are other possibilities for transports besides datagram- or connection-oriented protocols.
3.7.1.2 Protocol Requirements
The RPC protocol must provide for the following:
(1) Unique specification of a procedure to be called.
(2) Provisions for matching response messages to request messages.
(3) Provisions for authenticating the caller to the service and vice-versa.

Besides these requirements, features that detect the following are worth supporting because of protocol roll-over errors, implementation bugs, user error, and network administration:
(1) RPC protocol mismatches.
(2) Remote program protocol version mismatches.
(3) Protocol errors (such as misspecification of a procedure's parameters).
(4) Reasons why remote authentication failed.
(5) Any other reasons why the desired procedure was not called.
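Returning to the transaction ID discussed in section 3.7.1.1, the sketch below shows one way (purely illustrative, not taken from any particular RPC implementation) a server might remember granted requests in order to obtain a degree of execute-at-most-once behaviour.

    # A minimal sketch of a server-side reply cache keyed on the client's
    # transaction ID: retransmitted requests get the old reply instead of
    # being executed a second time.
    reply_cache = {}   # (client_address, transaction_id) -> previously sent reply

    def handle_request(client_address, transaction_id, procedure, args):
        key = (client_address, transaction_id)
        if key in reply_cache:
            return reply_cache[key]      # retransmission: resend old reply, do not re-execute
        result = procedure(*args)        # execute the remote procedure once
        reply_cache[key] = result
        return result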

3.7.1.3 RPC Programs and Procedures
The RPC call message has three unsigned fields: remote program number, remote program version number, and remote procedure number. The three fields uniquely identify the procedure to be called. Program numbers are administered by some central authority (like Sun). Once an implementor has a program number, he can implement his remote program; the first implementation would most likely have the version number of 1. Because most new protocols evolve into better, stable, and mature protocols, a version field of the call message identifies which version of the protocol the caller is using. Version numbers make speaking old and new protocols through the same server process possible. The procedure number identifies the procedure to be called. These numbers are documented in the specific program's protocol specification.
For example, a file services protocol specification may state that its procedure number 5 is read and procedure number 12 is write. Just as remote program protocols may change over several versions, the actual RPC message protocol could also change. Therefore, the call message also has in it the RPC version number, which is always equal to two for the version of RPC described here. The reply message to a request message has enough information to distinguish the following error conditions:
(1) The remote implementation of RPC does not speak protocol version 2. The lowest and highest supported RPC version numbers are returned.
(2) The remote program is not available on the remote system.
(3) The remote program does not support the requested version number. The lowest and highest supported remote program version numbers are returned.
(4) The requested procedure number does not exist. (This is usually a caller side protocol or programming error.)
(5) The parameters to the remote procedure appear to be garbage from the server's point of view. (Again, this is usually caused by a disagreement about the protocol between client and service.)
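To make these identifying fields concrete, the fragment below packs the first part of an RPC call header (transaction ID, message type, RPC version, program, version, and procedure numbers) in network byte order. The numeric values are invented for illustration, and the sketch omits the credential, verifier, and argument fields that a complete call message would also carry.

    import struct

    CALL = 0          # message type 0 = call, 1 = reply
    RPC_VERSION = 2   # the RPC protocol version described in the text

    def call_header(xid, program, version, procedure):
        # Six big-endian unsigned 32-bit fields: xid, message type, RPC version,
        # program number, program version number, procedure number.
        return struct.pack("!6I", xid, CALL, RPC_VERSION, program, version, procedure)

    # Example: procedure 5 ("read") of version 1 of an imaginary program number 200100
    header = call_header(xid=1, program=200100, version=1, procedure=5)
    print(header.hex())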

3.7.2 XDR

XDR is a standard for the description and encoding of data. It is useful for transferring data between different computer architectures, and has been used to communicate data between such diverse machines as the SUN WORKSTATION, VAX, IBM PC, and Cray. XDR fits into the ISO presentation layer, and is roughly analogous in purpose to X.409, ISO Abstract Syntax Notation. The major difference between these two is that XDR uses implicit typing, while X.409 uses explicit typing. XDR uses a language to describe data formats. The language can only be used to describe data; it is not a programming language. This language allows one to describe intricate data formats in a concise manner. The alternative of using graphical representations (itself an informal language) quickly becomes incomprehensible when faced with complexity. The XDR language itself is similar to the C language, just as Courier is similar to Mesa. Protocols such as ONC RPC (Remote Procedure Call) and the NFS (Network File System) use XDR to describe the format of their data. The XDR standard makes the following assumption: that bytes (or octets) are portable, where a byte is defined to be 8 bits of data. A given hardware device should encode the bytes onto the various media in such a way that other hardware devices may decode the bytes without loss of meaning. For example, the Ethernet standard suggests that bytes be encoded in little-endian style, or least significant bit first.
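As a small illustration of XDR-style encoding (not a complete XDR implementation), the fragment below encodes an unsigned integer and a variable-length string the way XDR does: 32-bit quantities with the most significant byte first, and strings carried as a length followed by the bytes, padded to a multiple of four octets.

    import struct

    def xdr_uint(value):
        # XDR unsigned int: 4 bytes, most significant byte first
        return struct.pack("!I", value)

    def xdr_string(text):
        # XDR string: length as unsigned int, then the bytes, padded to a 4-octet boundary
        data = text.encode("ascii")
        padding = (4 - len(data) % 4) % 4
        return xdr_uint(len(data)) + data + b"\x00" * padding

    encoded = xdr_uint(12) + xdr_string("notes.txt")
    print(encoded.hex())   # 0000000c 00000009 6e6f7465732e747874 000000 (spaces added here)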

3.7.3 Stateless Servers
The NFS protocol was intended to be as stateless as possible. That is, a server should not need to maintain any protocol state information about any of its clients in order to function correctly. Stateless servers have a distinct advantage over stateful servers in the event of a failure. With stateless servers, a client need only retry a request until the server responds; it does not even need to know that the server has crashed, or that the network temporarily went down. The client of a stateful server, on the other hand, needs to either detect a server failure and rebuild the server's state when it comes back up, or cause client operations to fail. This may not sound like an important issue, but it affects the protocol in some unexpected ways. We feel that it may be worth a bit of extra complexity in the protocol to be able to write very simple servers that do not require fancy crash recovery. Note that even if a so-called reliable transport protocol such as TCP is used, the client must still be able to handle interruptions of service by re-opening connections when they time out. Thus, a stateless protocol may actually simplify the implementation. On the other hand, NFS deals with objects such as files and directories that inherently have state; what good would a file be if it did not keep its contents intact? The goal was to not introduce any extra state in the protocol itself. Inherently stateful operations such as file or record locking, and remote execution, are implemented as separate services. The basic way to simplify recovery was to make operations as idempotent as possible (so that they can potentially be repeated). Some operations in this version of the protocol did not attain this goal; luckily most of the operations (such as Read and Write) are idempotent. Also, most server failures occur between operations, not between the receipt of an operation and the response. Finally, although actual server failures may be rare, in complex networks, failures of any network, router, or bridge may be indistinguishable from a server failure.
3.7.4 File System Model
NFS assumes a file system that is hierarchical, with directories as all but the bottom level of files. Each entry in a directory (file, directory, device, etc.) has a string name. Different operating systems may have restrictions on the depth of the tree or the names used, as well as using different syntax to represent the pathname, which is the concatenation of all the components (directory and file names) in the name. A file system is a tree on a single server (usually a single disk or physical partition) with a specified root. Some operating systems provide a mount operation to make all file systems appear as a single tree, while others maintain a forest of file systems. Files
are unstructured streams of uninterpreted bytes. Version 3 of NFS uses a slightly more general file system model. NFS looks up one component of a pathname at a time. It may not be obvious why it does not just take the whole pathname, traipse down the directories, and return a file handle when it is done. There are several good reasons not to do this. First, pathnames need separators between the directory components, and different operating systems use different separators. We could define a Network Standard Pathname Representation, but then every pathname would have to be parsed and converted at each end. Although files and directories are similar objects in many ways, different procedures are used to read directories and files. This provides a network standard format for representing directories. The same argument as above could have been used to justify a procedure that returns only one directory entry per call. The problem is efficiency. Directories can contain many entries, and a remote call to return each would be just too slow.
3.7.5 Implementation
Figure 3.8 illustrates how NFS is embedded in an operating system. When an application program executes, it calls the operating system to open a file, or to store and retrieve data in files. The file access mechanism accepts the request and automatically passes it to either the local file system software or to the NFS client, depending on whether the file is on the local disk or on a remote machine. When it receives a request, the client software uses the NFS protocol to contact the appropriate server on a remote machine and perform the requested operation. When the remote server replies, the client software returns the results to the application program.

Figure 3.8 NFS code in an OS (the application's file operations go either to the local file system software and the local disk, or to the NFS client, which uses an Internet connection to reach the NFS server)
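A toy sketch of the dispatch shown in figure 3.8 follows; the mount table, host name, and helper function are invented purely for illustration.

    # Decide whether a path belongs to a remote (NFS-mounted) subtree or the local disk.
    NFS_MOUNTS = {"/home/shared": ("nfs.example.com", "/export/shared")}

    def open_file(path):
        for mount_point, (server, remote_root) in NFS_MOUNTS.items():
            if path.startswith(mount_point + "/"):
                remote_path = remote_root + path[len(mount_point):]
                return nfs_client_open(server, remote_path)   # request goes over the network
        return open(path, "rb")                               # ordinary local file system access

    def nfs_client_open(server, remote_path):
        raise NotImplementedError("would issue NFS requests to the server here")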

Have you understood?
1. What is meant by transparent file access?
2. What are the building blocks of NFS?
3. What are the actual steps that take place in RPC when a client calls the functions of a remote server?
4. What are the advantages provided by RPC?
5. What is the standard used to encode the values in the RPC call and reply messages?
6. In which layer of the ISO/OSI reference model does XDR fit?
7. What is the advantage of stateless servers?
8. What is the file system model followed in NFS?
9. What is the relationship between NFS and the operating system?
10. How is the mount protocol related to NFS?
3.8 TELNET PROTOCOL

Remote login is one of the most popular Internet applications. Instead of having a hard-wired terminal on each host, we can log in to one host and then remote login across the network to any other host. The TCP/IP suite includes a simple remote terminal protocol called TELNET that allows a user to log into a computer across an internet. TELNET establishes a TCP connection, and then passes keystrokes from the user's keyboard directly to the remote computer as if they had been typed on a keyboard attached to the remote machine. TELNET also carries output from the remote machine back to the user's screen. The service is called transparent because it gives the appearance that the user's keyboard and display attach directly to the remote machine. Although TELNET is not as sophisticated as some remote terminal protocols, it is widely available. Usually, TELNET client software allows the user to specify a remote machine either by giving its domain name or IP address. Because it accepts IP addresses, TELNET can be used with hosts even if a name-to-address binding cannot be established (e.g. when domain naming software is being debugged).
3.8.1 Protocol Overview
TELNET offers three basic services. First, it defines a network virtual terminal that provides a standard interface to remote systems. Client programs do not have to understand the details of all possible remote systems; they are built to use the standard interface. Second, TELNET includes a mechanism that allows the client and the server to negotiate options, and it provides a set of standard options (e.g. one of the options controls whether data passed across the connection uses the standard 7-bit ASCII character set or an 8-bit character set). Finally, TELNET treats both ends of the connection symmetrically. In particular, TELNET does not force client input to come
from a keyboard, nor does it force the client to display output on a screen. Thus, TELNET allows an arbitrary program to become a client. Furthermore, either end can negotiate options. Figure 3.9 illustrates how application programs implement a TELNET client and server.

Figure 3.9 Path of data in a TELNET session

As the figure shows, when a user invokes TELNET, an application program on the user's machine becomes the client. The client establishes a TCP connection to the server over which they will communicate. Once the connection has been established, the client accepts keystrokes from the user's keyboard and sends them to the server, while it concurrently accepts characters that the server sends back and displays them on the user's screen. The server must accept a TCP connection from the client, and then relay data between the TCP connection and the local operating system. In practice, the server is more complex than the figure shows because it must handle multiple, concurrent connections. Usually, a master server process waits for new connections and creates a new slave to handle each connection. Thus, the TELNET server shown in figure 3.9 represents the slave that handles one particular connection. The figure does not show the master server that listens for new requests, nor does it show the slaves handling other connections. We use the term pseudo terminal to describe the operating system entry point that allows a running program like the TELNET server to transfer characters to the operating system as if they came from a keyboard. It is impossible to build a TELNET server unless the operating system supplies such a facility. If the system supports a pseudo terminal abstraction, the TELNET server can be implemented with application
programs. Each slave server connects a TCP stream from one client to a particular pseudo terminal. Arranging for the TELNET server to be an application level program has advantages and disadvantages. The most obvious advantage is that it makes modification and control of the server easier than if the code were embedded in the operating system. The obvious disadvantage is inefficiency. Each keystroke travels from the user's keyboard through the operating system and across the internet to the server machine. After reaching the destination machine, the data must travel up through the server's operating system to the server application program, and from the server application program back into the server's operating system at a pseudo terminal entry point. Finally, the remote operating system delivers the character to the application program the user is running. Meanwhile, output (including remote character echo if that option has been selected) travels back from the server to the client over the same path. Readers who understand operating systems will appreciate that for the implementation shown in figure 3.9, every keystroke requires the computers to switch process context several times. In most systems, an additional context switch is required because the operating system on the server's machine must pass characters from the pseudo terminal back to another application program (e.g. a command interpreter). Although context switching is expensive, the scheme is practical because users do not type at high speed.
3.8.2 Network Virtual Terminal
To make TELNET interoperate between as many systems as possible, it must accommodate the details of heterogeneous computers and operating systems. For example, some systems require lines of text to be terminated by the ASCII carriage return character (CR). Others require the ASCII linefeed (LF) character. Still others require the two-character sequence CR-LF. A specific example is the end-of-file token, which in the disk operating system (DOS) is Ctrl+z, whereas in UNIX it is Ctrl+d. In addition, most interactive systems provide a way for a user to enter a key that interrupts a running program. However, the specific keystroke used to interrupt a program varies from system to system (e.g. some systems use Control-C, while others use ESCAPE). To accommodate heterogeneity, TELNET defines how data sequences are sent across the Internet. The definition is known as the network virtual terminal (NVT). NVT is a universal interface. Via this interface, the client TELNET translates characters (data or commands) that come from the local terminal into NVT form and delivers them to the network. The server TELNET, on the other hand, translates data and commands from NVT form into the form acceptable by the remote computer. These are illustrated in figure 3.10.
Figure 3.10 NVT format in TELNET (the client translates its system format into NVT format, and the server translates NVT format into its own system format)

The definition of NVT format is fairly straightforward. All communication involves 8-bit bytes. At startup, NVT uses the standard 7-bit USASCII representation for data and reserves bytes with the high order bit set for command sequences. The USASCII character set includes 95 characters that have printable graphics (e.g. letters, digits, and punctuation marks) as well as 33 control codes. All printable characters are assigned the same meaning as in the standard USASCII character set. The NVT standard defines interpretations for control characters as shown in figure 3.11.
ASCII Control Code   Decimal Value   Assigned Meaning
NUL                  0               No operation (has no effect on output)
BEL                  7               Sound audible/visible signal (no motion)
BS                   8               Move left one character position
HT                   9               Move right to the next horizontal tab stop
LF                   10              Move down (vertically) to the next line
VT                   11              Move down to the next vertical tab stop
FF                   12              Move down to the top of the next page
CR                   13              Move to the left margin on the current line
Other Control        -               No operation (has no effect on output)

Figure 3.11 TELNET NVT interpretations of ASCII Control Characters

In addition to the control character interpretations in figure 3.11, NVT defines the standard line termination to be the two-character sequence CR-LF. When a user presses the key that corresponds to end-of-line on the local terminal (e.g. ENTER or RETURN), the TELNET client must map it into CR-LF for transmission. The TELNET server translates CR-LF into the appropriate end-of-line character sequence for the remote machine.
3.8.3 Controlling the Server
We said that most systems provide a mechanism that allows users to terminate a running program. Usually, the local operating system binds such mechanisms to a

DIT 116

NETWORK PROTOCOLS

particular key or keystroke sequence. For example, unless the user specifies otherwise, many UNIX systems reserve the character generated by CONTROL-C as the interrupt key. Depressing CONTROL-C causes UNIX to terminate the executing program; the program does not receive CONTROL-C as input. The system may reserve other characters or character sequences for other control functions. TELNET NVT accommodates control functions by defining how they are passed from the client to the server. Conceptually, we think of NVT as accepting input from a keyboard that can generate more than 128 possible characters. We assume the keyboard has virtual (imaginary) keys that correspond to the functions typically used to control processing. For example, NVT defines a conceptual interrupt key that requests program termination. Figure 3.12 lists the control functions that NVT allows.

Signal   Meaning
IP       Interrupt Process (terminate running program)
AO       Abort Output (discard any buffered output)
AYT      Are You There (test if server is responding)
EC       Erase Character (delete the previous character)
EL       Erase Line (delete the entire current line)
SYNCH    Synchronize (clear data path until TCP urgent data point, but do interpret commands)
BRK      Break (break key or attention signal)

Figure 3.12 Control Functions of TELNET NVT

In practice, most keyboards do not provide extra keys for commands. Instead, individual operating systems or command interpreters have a variety of ways to generate them. We already mentioned the most common technique: binding an individual ASCII character to a control function, so that when the user presses the key, the operating system takes the appropriate action instead of accepting the character as input. The NVT designers chose to keep commands separate from the normal ASCII character set for two reasons. First, defining the control functions separately means TELNET has greater flexibility. It can transfer all possible ASCII character sequences between client and server as well as all possible control functions. Second, by separating signals from normal data, NVT allows the client to specify signals unambiguously; there is never confusion about whether an input character should be treated as data or as a control function. To pass control functions across the TCP connection, TELNET encodes them using an escape sequence. An escape sequence uses a reserved octet to indicate that a control code octet follows. In TELNET, the reserved octet that starts an escape
sequence is known as the interpret as command (IAC) octet. Figure 3.13 lists the possible commands and the decimal coding used for each.

Command   Decimal Encoding   Meaning
IAC       255                Interpret next octet as command (when the IAC octet appears as data, the sender doubles it and sends the 2-octet sequence IAC-IAC)
DONT      254                Denial of request to perform specified option
DO        253                Approval to allow specified option
WONT      252                Refusal to perform specified option
WILL      251                Agreement to perform specified option
SB        250                Start of option sub-negotiation
GA        249                The go ahead signal
EL        248                The erase line signal
EC        247                The erase character signal
AYT       246                The are you there signal
AO        245                The abort output signal
IP        244                The interrupt process signal
BRK       243                The break signal
DMARK     242                The data stream portion of a SYNCH (always accompanied by TCP Urgent notification)
NOP       241                No operation
SE        240                End of option sub-negotiation
EOR       239                End of record

Figure 3.13 TELNET Commands and encoding for each
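As a rough sketch of how a client might emit these sequences (the function names are invented for illustration), the fragment below builds the interrupt process command and applies the doubling rule to data octets equal to IAC.

    IAC, IP = 255, 244

    def interrupt_process():
        # The 2-octet sequence IAC IP (255 followed by 244) asks the server to
        # interrupt the running program.
        return bytes([IAC, IP])

    def escape_data(data: bytes) -> bytes:
        # Any data octet equal to IAC (255) must be doubled so the server does
        # not mistake it for the start of a command sequence.
        return data.replace(bytes([IAC]), bytes([IAC, IAC]))

    # sock.sendall(interrupt_process())       # hypothetical TCP socket to the server
    # sock.sendall(escape_data(user_input))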

As the figure shows, the signals generated by conceptual keys on an NVT keyboard each have a corresponding command. For example, to request that the server interrupt the executing program, the client must send the 2-octet sequence IAC IP (255 followed by 244). Additional commands allow the client and server to negotiate which options they will use and to synchronize communication.
3.8.4 Out-of-band Signaling
Sending control functions along with normal data is not always sufficient to guarantee the desired results. To see why, consider the situation under which a user might send the interrupt process control function to the server. Usually, such control is only needed when the program executing on the remote machine is misbehaving and the user wants the server to terminate the program. For example, the program might be executing an endless loop without reading input or generating output. Unfortunately, if the application at the server's site stops reading input, operating system buffers will eventually fill and the server will be unable to write the data it reads from the TCP connection, causing its buffers to fill. Eventually, TCP on the server machine will begin advertising a zero window size, preventing data from flowing across the connection.
If the user generates an interrupt control function when buffers are filled, the control function will never reach the server. That is, the client can form the command sequence IAC IP and write it to the TCP connection, but because TCP has stopped sending to the server's machine, the server will not read the control sequence. TELNET cannot rely on the conventional data stream alone to carry control sequences between client and server, because a misbehaving application that needs to be controlled might inadvertently block the data stream. To solve the problem, TELNET uses an out of band signal. TCP implements out of band signaling with the urgent data mechanism. Whenever it places a control function in the data stream, TELNET also sends a SYNCH command. TELNET then appends a reserved octet called the data mark, and causes TCP to signal the server by sending a segment with the URGENT DATA bit set. Segments carrying urgent data bypass flow control and reach the server immediately. In response to an urgent signal, the server reads and discards all data until it finds the data mark. The server returns to normal processing when it encounters the data mark.
3.8.5 TELNET Options and Negotiation
Our simple description of TELNET omits one of the most complex aspects: options. In TELNET, options are negotiable, making it possible for the client and the server to reconfigure their connection. For example, we said that usually the data stream passes 7-bit data and uses octets with the eighth bit set to pass control information like the Interrupt Process Command. However, TELNET also provides an option that allows the client and server to pass 8-bit data (when passing 8-bit data, the reserved octet IAC must still be doubled if it appears in the data). The client and server must negotiate, and both must agree to pass 8-bit data before such transfers are possible. The range of TELNET options is wide: some extend the capabilities in major ways while others deal with minor details. For example, the original protocol was designed for a half-duplex environment where it was necessary to tell the other end to go ahead before it would send more data. One of the options controls whether TELNET operates in half- or full-duplex mode. Another option allows the server on a remote machine to determine the user's terminal type. The terminal type is important for software that generates cursor positioning sequences (e.g. a full screen editor executing on a remote machine). Figure 3.14 lists several of the most commonly implemented TELNET options. The way TELNET negotiates options is interesting. Because it sometimes makes sense for the server to initiate a particular option, the protocol is designed to allow either end to make a request. Thus, the protocol is said to be symmetric with respect
to option processing. The receiving end either responds to a request with a positive acceptance or a rejection. In TELNET terminology, the request is WILL X, meaning 'will you agree to let me use option X'; and the response is either DO X or DON'T X, meaning 'I do agree to let you use option X' or 'I don't agree to let you use option X'. The symmetry arises because DO X requests that the receiving party begin using option X, and WILL X or WON'T X means 'I will start using option X' or 'I won't start using it'.
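A small, hypothetical fragment illustrating this terminology is shown below: a client that implements only the basic NVT can simply refuse every option the other side proposes.

    IAC, WILL, WONT, DO, DONT = 255, 251, 252, 253, 254

    def refuse_option(command, option):
        # If the peer says WILL <option>, answer DONT <option>;
        # if it says DO <option>, answer WONT <option>.
        if command == WILL:
            return bytes([IAC, DONT, option])
        if command == DO:
            return bytes([IAC, WONT, option])
        return b""   # a WONT or DONT is itself a refusal and needs no counter-refusal

    # Example: the server offers to echo (option 1); the client declines.
    print(refuse_option(WILL, 1))   # b'\xff\xfe\x01'  ->  IAC DONT ECHO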
Name             Code   RFC    Meaning
Transmit Binary  0      856    Change transmission to 8-bit binary
Echo             1      857    Allow one side to echo data it receives
Suppress-GA      3      858    Suppress (no longer send) Go-ahead signal after data
Status           5      859    Request for status of a TELNET option from remote site
Timing-Mark      6      860    Request timing mark be inserted in return stream to synchronize two ends of a connection
Terminal-Type    24     884    Exchange information about the make and model of a terminal being used (allows programs to tailor output like cursor positioning sequences for the user's terminal)
End-of-Record    25     885    Terminate data sent with EOR code
Linemode         34     1116   Use local editing and send complete lines instead of individual characters

Figure 3.14 Commonly used TELNET options

Another interesting negotiation concept arises because both ends are required to run an unenhanced NVT implementation (i.e. one without any options turned on). If one side tries to negotiate an option that the other does not understand, the side receiving the request can simply decline. Thus, it is possible to interoperate newer, more sophisticated versions of TELNET clients and servers (i.e. software that understands more options) with older, less sophisticated versions. If both the client and the server understand the new options, they may be able to improve interaction. If not, they will revert to a less efficient, but workable style. TELNET uses a symmetric option negotiation mechanism to allow clients and servers to reconfigure the parameters controlling their interactions. Because all TELNET software understands a basic NVT protocol, clients and servers can interoperate even if one understands options another does not. 3.9 RLOGIN (BSD UNIX)

Operating Systems derived from BSD UNIX include a remote login service, rlogin that supports trusted hosts. It allows system administration to choose a set of machines over which login names and file access protections are shared and to establish equivalences among user logins. Users can control access to their accounts by authorizing remote login based on remote host and remote user name. Thus, it is possible for a user to have login name X on one machine and Y on another, and still be able to remotely login from one of the machines to the other without typing a password each time.
Having automatic authorization makes remote login facilities useful for general purpose programs as well as human interaction. One variant of the rlogin command, rsh, invokes a command interpreter on the remote UNIX machine and passes the command line arguments to the command interpreter, skipping the login step completely. The format of a command invocation using rsh is:
rsh machine command
Thus, typing
rsh merlin ps
on any of the machines in the Computer Science Department at Purdue University executes the ps command on machine merlin, with UNIX's standard input and standard output connected across the network to the user's keyboard and display. The user sees the output as if he or she were logged into machine merlin. Because the user can arrange to have rsh invoke remote commands without prompting for a password, it can be used in programs as well as from the keyboard. Because protocols like rlogin understand both the local and remote computing environments, they communicate better than general purpose remote login protocols like TELNET. For example, rlogin understands the UNIX notions of standard input, standard output, and standard error, and uses TCP to connect them to the remote machine. Thus, it is possible to type
rsh merlin ps > filename
and have output from the remote command redirected into file filename. Rlogin also understands terminal control functions like flow control characters (typically Control-S and Control-Q). It arranges to stop output immediately without waiting for the delay required to send them across the network to the remote host. Finally, rlogin exports part of the user's environment to the remote machine, including information like the user's terminal type (i.e. the TERM variable). As a result, a remote login session appears to behave almost exactly like a local login session.
3.9.1 Connection Establishment
Upon connection establishment, the client sends four null-terminated strings to the server. The first is an empty string (i.e., it consists solely of a single zero byte), followed by three non-null strings: the client username, the server username, and the terminal type and speed. More explicitly:
<null> client-user-name<null> server-user-name<null> terminal-type/speed<null>
For example:
<null> bostic<null> kbostic<null> vt100/9600<null>
The server returns a zero byte to indicate that it has received these strings and is now in data transfer mode. Window size negotiation may follow this initial exchange.
From Client to Server (and Flow Control)
Initially, the client begins operation in cooked (as opposed to raw) mode. In this mode, the START and STOP (usually ASCII DC1, DC3) characters are intercepted and interpreted by the client to start and stop output from the remote server to the local terminal, whereas all other characters are transmitted to the remote host as they are received. In raw mode, the START and STOP characters are not processed locally, but are sent as any other character to the remote server. The server thus determines the semantics of the START and STOP characters when in raw mode; they may be used for flow control or have quite different meanings independent of their ordinary usage on the client.
Screen/Window Size
The remote server indicates to the client that it can accept window size change information by requesting a window size message just after connection establishment and user identification exchange. The client should reply to this request with the current window size. If the remote server has indicated that it can accept client window size changes and the size of the client's window or screen dimensions changes, a 12-byte special sequence is sent to the remote server to indicate the current dimensions of the client's window, should the user process running on the server care to make use of that information. The window change control sequence is 12 bytes in length, consisting of a magic cookie (two consecutive bytes of hex FF), followed by two bytes containing lowercase ASCII s, then 8 bytes containing the 16-bit values for the number of character rows, the number of characters per row, the number of pixels in the X direction, and the number of pixels in the Y direction, in network byte order. Thus:
FF FF s s rr cc xp yp
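Both messages are easy to construct; the sketch below uses placeholder user names and arbitrary window dimensions, and is only meant to illustrate the byte layouts described above.

    import struct

    def startup_message(client_user, server_user, terminal, speed):
        # Empty string, then three null-terminated strings, as described above.
        return b"\x00" + b"\x00".join(
            s.encode("ascii") for s in (client_user, server_user, f"{terminal}/{speed}")
        ) + b"\x00"

    def window_size_message(rows, cols, xpixels, ypixels):
        # Magic cookie FF FF, two ASCII 's' bytes, then four 16-bit values in network byte order.
        return b"\xff\xff" + b"ss" + struct.pack("!4H", rows, cols, xpixels, ypixels)

    print(startup_message("bostic", "kbostic", "vt100", 9600))
    print(window_size_message(24, 80, 0, 0).hex())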

Other flags than ss may be used in the future for other in-band control messages. None are currently defined.
From Server to Client
Data from the remote server is sent to the client as a stream of characters. Normal data is simply sent to the client's display, but may be processed before actual display (tabs expanded, etc.). The server can embed single-byte control messages in the data stream by inserting the control byte in the stream of data and pointing the TCP urgent-data pointer at the control byte. When a TCP urgent-data pointer is received by the client, data in the TCP stream up to the urgent byte is buffered for possible display after the control byte is handled, and the control byte pointed to is received and interpreted as follows:
02   A control byte of hex 02 causes the client to discard all buffered data received from the server that has not yet been written to the client user's screen.
10   A control byte of hex 10 commands the client to switch to raw mode, where the START and STOP characters are no longer handled by the client, but are instead treated as plain data.
20   A control byte of hex 20 commands the client to resume interception and local processing of START and STOP flow control characters.
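The urgent-data delivery used here corresponds to the out-of-band facility of TCP sockets. The following rough sketch (function names and structure invented for illustration) shows a server pushing a control byte as urgent data and a client reacting to it; real implementations handle buffering and urgent-pointer placement with more care.

    import socket

    RAW_MODE = 0x10          # control byte: switch the client to raw mode

    # Server side: push the control byte as TCP urgent (out-of-band) data.
    def send_control(server_conn: socket.socket, control_byte: int) -> None:
        server_conn.send(bytes([control_byte]), socket.MSG_OOB)

    # Client side: read the single out-of-band byte and act on it.
    def receive_control(client_conn: socket.socket) -> int:
        control = client_conn.recv(1, socket.MSG_OOB)[0]
        if control == 0x02:
            pass   # discard any data buffered but not yet displayed
        elif control == 0x10:
            pass   # enter raw mode
        elif control == 0x20:
            pass   # resume local START/STOP processing
        return control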

All other values of the urgent-data control byte are ignored. In all cases, the byte pointed to by the urgent data pointer is NOT written to the client user's display.
3.9.2 Connection Closure
When the TCP connection closes in either direction, the client or server process which notices the close should perform an orderly shut-down, restoring terminal modes and notifying the user or processes of the close before it closes the connection in the other direction.
Have you understood?
1. What is the necessity of remote login?
2. What are the two popular applications provided by TCP/IP for remote login?
3. What are the three basic services provided by TELNET?
4. What is meant by pseudo terminal in TELNET?
5. What are the control functions supported by NVT?
6. What are the various options provided by TELNET?
7. What are the circumstances under which TELNET goes for out of band signaling?
8. Rlogin is a simpler protocol than TELNET. Justify this statement.
9. What is the importance of window size changes in Rlogin?
10. How is flow control done in Rlogin?




Summary 1. It is very difficult for the end users to work with IP addresses itself in communicating with remote machines in the network because remembering IP addresses is cumbersome and error prone. Hence TCP/IP introduced an application layer protocol by name Domain Name System that enables the users to work with mnemonic addresses instead of dotted decimal notation of IP addresses. DNS is an indirect application used by other application layer protocols or applications like SMTP, HTTP, and TELNET etc in resolving the mapping between a domain name (mnemonic address) and its equivalent IP address. DNS is a distributed database that is used by TCP/IP applications to map between hostnames and IP addresses and to provide electronic mail routing information. Distributed refers to the fact no single site on the Internet knows all the information. In DNS, each site maintains its own database of information, and runs a server program that other systems across the Internet (clients) can query. The DNS provides the protocol that allows the clients and servers to communicate with each other. The top level domains of the DNS form two completely different naming hierarchies: geographic and organizational. The geographic scheme divides the universe of machines by country. Organizational scheme divide the Internet according to the type of the organization to which the machines belong. One important feature of DNS is the delegation of responsibility within the DNS. No single entity manages every label in the DNS tree. NIC maintains a portion of the tree (top level domains) and delegates the responsibility to others for specific zones. A zone is a subtree of the DNS tree that is administered separately. A common zone is a second level domain and the second level domains divide their zone into smaller zones. Once the authority for a zone is delegated, it is up to the person responsible for the zone to provide multiple name servers for that zone. Apart from the basic mapping of domain name into its equivalent IP address, DNS provides other services like mail address aliasing, canonical name aliasing etc and hence DNS maintains different type of resource records. DNS may work with two different types of queries namely recursive query and iterative query. In recursive query, each server that does not have the requested information goes and finds it somewhere, then reports back. In iterative query, when a query cannot be satisfied locally, the query fails, but the name of the next server along the line to try is returned. In a networked environment, where a group of related users are working, it becomes necessary for one user to access or modify the files that are present in other systems. Two methods are used to share the files between the systems of a network by name file access and file transfer.

2.

3.

4.

5.

6.

7.

8.

9.

10.

11.





12.

13.

14.

15.

16.

17.

18.

19.

20.

21.

22.

File access provides only the portions of a file that a process references and the goal of file access is to make the access transparent. With file transfer a complete copy of the file transferred to the client side. File Transfer Protocol is an out of band protocol that maintains two different connections for control information and data. Initially the control processes running in the client and server machines establish a TCP connection at port number 21 for control information and then the data connection is established at port number 20. In FTP, data is transferred from a storage device in the sending host to a storage device in the receiving host. Often it is necessary to perform certain transformations on the data because data storage representations in the two systems are different. A different problem in representation arises when transmitting binary data (not character codes) between host systems with different word lengths. Although FTP is highly reliable, it is heavy weight protocol and the overheads involve din the process of file transfer is heavy. Hence FTP is required only if the client and server requires reliability of very high degree and the reliability of the network is less (e.g., Internet) FTP permits a client to transfer a file from the remote server only if the client is an authenticate user and have the access rights for a particular file. Anonymous FTP is another version of FTP where every one can transfer a file in the guest account itself. When reliability is not a major issue and if the client and server are on the same Local area Network, a simple version of file transfer protocol by name TFTP (Trivial FTP) is sufficient. TFTP uses UDP at the transport layer. The NFS protocol is designed to be portable across different machines, operating systems, network architectures, and transport protocols. A user can execute an arbitrary application program and use arbitrary files for input or output. The file names themselves do not show whether the files are local or remote. Usually in networked environment, clients and servers are developed using socket programming. Servers run always and wait for the clients request. Client sends the request and gets the response from the server. Remote Procedure Call (RPC) is a different way of doing network network programming. A client program is written that just calls functions in the server program. In RPC the client stub packages the procedure arguments into a network message, and sends this message to the server. A server stub on the server host receives the network message. It takes the arguments from the network message and calls the server procedure that the application programmer wrote. NFS is a client/server application built using Sun RPC. NFS clients access files on an NFS server by sending RPC requests to the server. In NFS it is transparent to the client whether its accessing a local file or an NFS file. The kernel determine this when the file is opened.






23.

Remote login is one of the most popular Internet applications that enables us to login to one host and then remote login across the network to any other host instead of having a hard-wired terminal on each host. The TCP/IP suite includes a simple remote terminal protocol called TELNET that allows a user to log into a computer across an internet.

Exercises 1. 2. 3. DNS uses UDP instead of TCP. If a DNS packet is lost, there is no automatic recovery. Does this cause a problem, and if so, how is it solved? Classify a DNS resolver and a DNS name server as either client, server, or both. Changes are made in the list of root servers in DNS. Unfortunately system administrators do not update their DNS files whenever changes are made. How do you think the DNS handles this? What is the problem with maintaining the cache in the name server, and having a stateless resolver? Consider the following ftp session. $ ftp voyager.deanza.fhda.edu. Connected to voyager.deanze.fhda.edu. 220 (vsFTPd 1.2.2) 530 Please login with USER and PASS Name (voyager.deanza.fhda.edu:forouzan): forouzan 331 Please specify the password Password: 230 Login Successful Remote system type is UNIX Using binary mode to transfer files ftp>ls reports 227 Entering Passive Mode (153,18,17,11,238,169) 150 Here comes the directory listing. drwxr-xr-x 2 3027 411 4096 Sep 24 2002 business drwxr-xr-x 2 3027 411 4096 Sep 24 2002 personal drwxr-xr-x 2 3027 411 4096 Sep 24 2002 school 226 Directory send OK ftp> quit 221 Goodbye Explain this ftp session. Give an example for anonymous FTP session.

4. 5.

6.

Anna University Chennai

164

DIT 116

NETWORK PROTOCOLS

7.

8. 9. 10.

TFTP sender performs the timeout and retransmission to handle lost packets. How does this affect the use of TFTP when its being used as a part of the bootstrap process? What is the limiting factor in time required to transfer a file using TFTP? What is the master-slave relationships among the server processes running in a TELNET server? What are the four options that may be exchanged by either side in a TELNET session?

NOTES

Answers 1. Since the DNS primarily uses UDP, both the resolver and the name server must perform their own timeout and retransmission. Also, unlike many other Inetrnet applications that use UDP (TFTP, BOOTP and SNMP), which operate mostly on local area networks. DNS queries and responses often traverse wide area networks. The packet loss rate and variability in round-trip times are normally higher on a WAN than a LAN, increasing the importance of a good retransmission and timeout algorithm for DNS clients. A resolver is always a client. A resolver cannot function like a server. A name server can function like a server as well as client. When a name server receives a request either from a resolver or from another name server, it functions like a server. Suppose if the name server does not have the required mapping, it may be necessary for it to forward the request to some other name server. In the later case, it functions like a client. When a name server starts, it normally reads the list of root servers from a disk file. The issue is, the disk file may have the out of date entries also. It then tries to contact one of these root servers, requesting the name server records (a query type of NS) for the root domain. This returns the current up-to-date list of root servers. Minimally this requires one of the root server entries in the start-up disk file to be current. Since the resolver comes and goes, as applications come and go, if the system is configured to use multiple name servers and the resolver maintains no state, the resolver cannot keep track of the round-trip times to its various name servers. This can often lead to timeouts for resolver queries that are too short, causing unnecessary retransmissions. After the control connection to port 21 is created, the FTP server sends the 220 (service ready) response on the control connection. The client sends the USER command. The server responds with 331 (user name is OK, password is required) The client sends the PASS command. The server responds with 230 (user login is OK) The client issues a passive open on an ephemeral port for the data connection and sends the PORT command to give this port number to the server.
165 Anna University Chennai

2.

3.

4.

5.i. ii. iii. iv. v. vi.

DIT 116

NETWORK PROTOCOLS

NOTES

vii. The server does not open the connection at this time, but it prepares itself for issuing an active open on the data connection between port 20 (server side) and the ephemeral port received from the client. It sends response 150 (data connection will open shortly) viii. The client sends the LIST message. ix. Now the server responds with 125 and opens the data connection. x. The server then sends the list of files or directories (as a file) on the data connection. When the whole list (file) is sent, the server responds with 226 (closing data connection) over the control connection. xi. The client now has two choices. It can use the QUIT command to request the closing of the control connection or it can send another command to start another activity (and eventually open another data connection). In our example, the client sends a QUIT command. xii. After receiving the QUIT command, the server responds with 221 (server closing) and then closes the control connection. 6. $ftp internic.net Connected to internic.net 220 Server ready Name: anonymous 331 Guest login OK, send guest as password Password:guest ftp>pwd 257 / is current directory ftp>ls 200 OK 150 Operating ASCII mode bin .. .. .. ftp>close 221 Goodbye ftp>quit 7. This simplifies coding a TFTP client to fit in read-only memory, because the server is the sender of the bootstrap files, so the server must implement the timeout and retransmission. 8. With its stop and wait protocol, TFTP can transfer a maximum amount of 512 bytes per client-server round trip. The maximum throughput of TFTP is then 512 bytes divided by the round-trip time between the client and the server. On an
166

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

9.

10. i. ii. iii. iv.

Ethernet, assuming a round-trip time of 3ms, the maximum throughput is around 170,000 bytes/sec. In its simple form the TELNET server accepts a TCP connection from the client, and then relay data between the TCP connection and the local operating system. In practice, the server is more complex, because it must handle multiple, concurrent connections. Usually, a master server process waits for a new connection and creates a new slave to handle the connection. WILL The sender wants to enable the option itsef DO The sender wants the receiver to enable the option WONT The sender wants to disable the option itself DONT The sender wants the receiver to disable the option

NOTES










UNIT - 4


4.1

INTRODUCTION

This unit introduces two most popular applications of the Internet namely World Wide Web (WWW) and Electronic Mail (E-Mail). World Wide Web is an open ended information retrieval system that enables to access the documents (web pages) that are available in various servers that are situated in various parts of the world. WWW is very popular to the extent that layman to computing and communication and many end users think that both WWW and the Internet are one and the same. But the fact is WWW is just one among the many services provided on the Internet. Now days, it is very difficult to find people who do not use electronic mail service of the Internet to send and receive the messages. Electronic mail provides a lot of features and advantages when compared to conventional postal mail (called as snail mail now days) and telephone systems. The third protocol we introduce in this unit is Real Time Protocol (RTP). The Internet is based on best-effort model and hence it is not able to support real time applications and multimedia applications. RTP is an effort to accommodate multimedia applications in the Internet by providing the required information along with the packets so that applications running in the end systems are able to interpret the headers and try to satisfy the timing requirements of the multimedia applications. 4.2 LEARNING OBJECTIVES To understand the framework of World Wide web (WWW) To learn the functions of the client side in WWW To know the details of the server side of WWW To have an exposure to the browsers and their interfaces To study about plug-ins, helper applications and their interaction with the browsers To learn the protocol specifications of Hyper Text Transfer Protocol (HTTP) To discuss about the various methods supported by HTTP To understand the architecture of Electronic Mail (E-Mail) system To study about the functions of User Agent and Mail Transfer Agent of E-Mail To know the message types supported by E-Mail To study about Multipurpose Internet Mail Extension (MIME) standard

To learn the Simple Mail Transfer Protocol (SMTP), the heart of an e-mail system
To know the details of the final delivery of mail to the users
To learn the basics of audio and video transmission
To learn about the Quality of Service (QoS) metrics
To study the details of Real-Time Transport Protocol (RTP) encapsulation
To understand the RTP Control Protocol, an adjunct protocol of RTP

4.3 WORLD WIDE WEB

The World Wide Web (WWW) is an open-ended information retrieval system for accessing linked documents spread out over millions of machines all over the Internet. It is a distributed client-server system, in which a client using a browser can access a service using a server. However, a single server is not able to provide all the services, and the services are distributed over many locations called sites. The client is popularly known as the browser. Today's browsers, with excellent and user-friendly interfaces, have made the WWW accessible to everyone. WWW provides an enormous wealth of information on different subjects.
4.3.1 Architecture of the Web

The WWW is based on the client/server architecture. A web client (that is, a browser such as Netscape Navigator, or Microsoft Internet Explorer) sends request for information to any web server. A web server is a program which upon receipt of the request sends the requested document (or an error message if needed) to the requesting client. Typically the browser runs on a separate machine from that of the server. The server takes care of all the issues related to document storage, whereas the task of presenting the information to the user is taken care by the client program. The web client and the web server communicate with each other using the HyperText Transfer Protocol (HTTP). The protocol transfers data in the form of plain text, hypertext, audio, video and so on. However, it is called hypertext transfer protocol because it allows its use in hypertext environment where there are rapid jumps from one document to another. Web servers maintain a vast, worldwide collection of documents or Web pages, often just called pages for short. The web is a hypermedia system that supports interactive access. A hypermedia system provides a straightforward extension of traditional hypertext system. In either system, information is stored as a set of documents. Besides the basic information, each page may contain links to other pages anywhere in the world. Users can follow a link by clicking on it, which then takes them to the pages pointed to. This process can be repeated indefinitely. A page that has links that point to other pages is said to have hypertext. The hypermedia information available on the web is called a page. An example of the web page is given in figure 4.1a. This page starts

with a title, contains some information, and ends with the e-mail address of the pages maintainer. Strings of text that are links to other pages, called hyperlinks, are often highlighted, by underlining, displaying them in a special color, or both. To follow a link, the user places the mouse cursor on the highlighted area, which causes the cursor to change, and clicks on it. Theoretically speaking, a browser can exist without any graphical user interface (e.g., Lynx). However, only browsers with user-friendly graphical user interface were able to survive in the market. As the next stage in the development of the browsers, voice-based browsers are also being developed. Users who are curious about the Department of Computer Science and engineering can learn more about it by clicking on its (underlined) name. The browser then fetches the page to which the name is linked and displays it, as shown in figure 4.1b.


WELCOME TO ANNA UNIVERSITYS HOME PAGE Campus Information Admission Information Campus Map Directions to campus The Student body Academic Departments Department of Computer Science And Engineering Department of Electronics And Communication Engineering Department of Mechanical Engineering Department of Civil Engineering Department of Management studies Department of Science And Humanities webmaster@annauniv.edu

Figure 4.1 (a) A web page

THE DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Information of courses Personnel Faculty members Research Scholars Post graduate students Under graduate students Research Projects Positions available Curriculum and Syllabi Placement records Publications Seminars and conferences webmaster@cs.annauniv.edu

Figure 4.1 (b) The page reached by clicking on Department of Computer Science and Engineering

The underlined items here can also be clicked on to fetch other page, and so on. The new page can be on the same machine itself or in some other machine in the same network or in a machine in a different network. These details are transparent to the users and the web is able to fetch the pointed page irrespective of its physical location. Page fetching is done by the browser, without any help from the user. If the user ever returns to the main page, the links that have already been followed may be shown with a dotted underline and possibly a different color to distinguish them from links that have not been followed. Clicking on the Campus Information line in the main page does nothing. It is not underlined, which means that it is just text and not a hyper text. The client-server model of the Web is shown in Figure 4.2. The browser is displaying a web page with many links to other pages on the client machine. When the user clicks on a link that is linked to a page on the first.com server, the browser follows the hyperlink by sending a message to the first.com server asking for it for the page. When the page arrives, it is displayed. If this page contains a hyperlink to a page on the second.com server that is clicked on, the browser then sends a request to that machine for the page, and so on indefinitely.

[Diagram: a browser program on the client machine displays the current page; hyperlinks on that page point to pages held on the first.com and second.com web servers, which the browser reaches over TCP connections across the Internet.]

Figure 4.2 The parts of the web

4.3.2 The Client Side
The client side of the web is the browser program. From the user's point of view, the browser is a program that is used to fetch and display web pages. In addition to fetching and displaying web pages, the browser has to catch mouse clicks on items on the displayed page. It is necessary for the web to have an addressing scheme or a naming mechanism for the web pages so that the browser can establish the TCP connection with the appropriate server in the Internet. Only then can a hyperlink on a page point to the correct page on the web. Pages are named using URLs (Uniform Resource Locators). A typical URL is
http://www.annauniv.edu/research/index.html
A URL has three parts: the name of the protocol (http), the DNS name of the machine where the page is located (www.annauniv.edu), and (usually) the name of the file containing the page (/research/index.html). When a user clicks on a hyperlink, the browser carries out a series of steps in order to fetch the page pointed to. Suppose that a user is browsing the web and finds a link on Internet telephony that points to Anna University's home page, which is http://www.annauniv.edu/. The following steps take place in the process of fetching and displaying the web page.


1. The browser determines the URL (by seeing what was selected).
2. The browser asks DNS for the IP address of www.annauniv.edu.
3. DNS replies with 156.106.192.32.
4. The browser makes a TCP connection to port 80 on 156.106.192.32.
5. It then sends over a request asking for the file /research/index.html.
6. The www.annauniv.edu server sends the file /research/index.html.
7. The TCP connection is released.
8. The browser displays all the text in the file /research/index.html.
9. The browser fetches and displays all images in this file.
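As a small illustration that is not part of the original text, steps 2 through 7 can be sketched with Python's standard socket module. The host name and file are the ones used in the example above; the sketch simply assumes that the server is reachable and listening on port 80.

    import socket

    HOST = "www.annauniv.edu"      # DNS name from the example above
    PATH = "/research/index.html"  # file named in the URL

    # Steps 2-3: ask DNS for the IP address of the server
    ip_address = socket.gethostbyname(HOST)

    # Step 4: make a TCP connection to port 80 on that address
    with socket.create_connection((ip_address, 80)) as sock:
        # Step 5: send a request asking for the file
        request = ("GET " + PATH + " HTTP/1.1\r\n"
                   "Host: " + HOST + "\r\n"
                   "Connection: close\r\n\r\n")
        sock.sendall(request.encode("ascii"))

        # Step 6: read the reply (headers followed by the page)
        reply = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            reply += chunk
    # Step 7: the connection is released when the with-block ends
    print(reply.split(b"\r\n", 1)[0])   # e.g. b'HTTP/1.1 200 OK'

A real browser would go on to parse the HTML it received and fetch every embedded image the same way (steps 8 and 9).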

All of the above steps take place using the application layer protocol Hypertext Transfer Protocol (HTTP). Hence the browser is considered the HTTP client or web client. To be able to display the new page (or any page), the browser has to understand its format. To allow all browsers to understand all web pages, web pages are written in a standardized language called Hypertext Markup Language (HTML), which describes web pages. From this viewpoint, a browser is considered an HTML interpreter.
Although a browser is basically an HTML interpreter, most browsers have numerous buttons and features to make it easier to navigate the Web. This is the major difference between text-based browsers like Lynx and commercially successful browsers like Netscape Navigator and Internet Explorer. Many graphical browsers display which step they are currently executing in a status line at the bottom of the screen. In this way, when the performance is poor, the user can see if it is due to DNS not responding, the server not responding, or simply network congestion during page transmission. Most have a button for going back to the previous page, a button for going forward to the next page, and a button for going straight to the user's own start page. Most browsers have a button or menu item to set a bookmark on a given page and another one to display the list of bookmarks, making it possible to revisit any of them with only a few mouse clicks. Pages can also be saved to disk or printed. Numerous options are generally available for controlling the screen layout and setting various user preferences.
4.3.3 The Server Side
In this section we are going to discuss the functional requirements of a web server. When the user types in a URL or clicks on a line of hypertext, the browser parses the URL and interprets the part between http:// and the next slash as a DNS name to look up. It sends the request to the DNS server and gets the equivalent IP address. Once the browser gets the IP address it establishes a TCP connection to port 80 on that server. Then it sends over a command containing the rest of the URL, which
is the name of a file on that server. The server then returns the file for the browser to display. Once the request is made by the client, the server has to perform a set of functions to supply the requested web page to the client. Just like any other server, the web server is also expected to wait round the clock for client requests. Typically a web server performs the following functions in its main loop:
1. Accept a TCP connection from a client (a browser)
2. Get the name of the file requested
3. Get the file (from disk)
4. Return the file to the client
5. Release the TCP connection.
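The five-step loop above can be sketched in a few lines of Python. This is only an illustration, not a production server: the port number (8080) and the use of the current directory as the document root are assumptions made for the example.

    import socket

    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("", 8080))
    server.listen(5)

    while True:
        conn, _ = server.accept()                            # 1. accept a TCP connection
        request = conn.recv(4096).decode("ascii", "replace")
        parts = request.split()
        filename = parts[1].lstrip("/") if len(parts) > 1 else ""
        filename = filename or "index.html"                  # 2. name of the file requested
        try:
            with open(filename, "rb") as f:                  # 3. get the file from disk
                body = f.read()
            header = (b"HTTP/1.1 200 OK\r\nContent-Length: "
                      + str(len(body)).encode() + b"\r\n\r\n")
        except OSError:
            body = b"page not found"
            header = b"HTTP/1.1 404 Not Found\r\nContent-Length: 14\r\n\r\n"
        conn.sendall(header + body)                          # 4. return the file (or an error)
        conn.close()                                         # 5. release the TCP connection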


In the above sequence of steps, the third step (Get the file (from disk)) becomes the bottleneck in fetching the web page since it involves the secondary storage devices. The data rate at which the secondary storage devices (disks) operate is considerably less than the rate at which the processor operates. Because of this bottleneck, the web server cannot serve more requests per second than it can make disk accesses. For example, a high-end SCSI disk has an average access time of around 5 msec, which limits the server to at most 200 requests/sec, less if large files have to be read often. For a major web site provider with a large customer base, this figure is too low.
One obvious improvement is to maintain a cache in memory of the n most recently used files. Before going to disk to get a file, the server checks the cache. If the file is there, it can be served directly from memory and the disk is eliminated in the process of fetching the web page. Even though caching requires extra space and extra overhead, the benefits obtained out of caching outweigh these limitations. A better solution is to build a faster server based on multithreading.
The five steps we have discussed earlier are the bare minimum steps required in supplying the requested web page to the client. However, modern commercial web servers have to perform many additional steps to make applications like electronic commerce a reality. A modern web server performs the following set of functions.
1. Resolve the name of the Web page requested.
2. Authenticate the client.
3. Perform access control on the client.
4. Perform access control on the Web page.
5. Check the cache.
6. Fetch the requested page from disk.
7. Determine the MIME type to include in the response.
8. Take care of miscellaneous odds and ends.
9. Return the reply to the client.
10. Make an entry in the server log.
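The cache of the n most recently used files mentioned above (and consulted in step 5) can be sketched as a small Python class. This is an illustrative sketch only; the class name and the eviction limit are invented for the example.

    from collections import OrderedDict

    class FileCache:
        """Keep the n most recently used files in memory."""
        def __init__(self, n=100):
            self.n = n
            self.files = OrderedDict()          # filename -> file contents

        def get(self, filename):
            if filename in self.files:          # cache hit: no disk access needed
                self.files.move_to_end(filename)
                return self.files[filename]
            with open(filename, "rb") as f:     # cache miss: go to disk
                data = f.read()
            self.files[filename] = data
            if len(self.files) > self.n:        # evict the least recently used file
                self.files.popitem(last=False)
            return data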

When a web server has to perform all of the above steps for each and every transaction for every client request, a single processor with multiple disks and multiple threads may not be sufficient when the web server receives too many requests each second. Hence the designers of web servers decided to use a set of CPUs in the web servers to satisfy the users' requests in a better way.
Have you understood?
1. What is World Wide Web?
2. What is meant by a home page?
3. What are the minimal functions to be performed by the browser?
4. List down the steps required in fetching and displaying a web page.
5. What is the information present in the MIME type of a web page?
6. What is the role of a plug-in in displaying a web page?
7. Differentiate between a plug-in and helper application.
8. What are the minimal functions to be performed by the web server?
9. Which step is the bottleneck in accessing the web pages?
10. List down the various functions to be performed by a web server in an e-commerce application.
11. What is the necessity of multithreading in a web server?
12. What is meant by a server farm?
13. What is meant by TCP handoff in a web server?
14. What are the different ways in which a cache can be maintained by the web servers?
15. WWW is stateless. Justify this statement.

4.3.4 Statelessness and Cookies
The World Wide Web was developed as a stateless entity. A client sends a request; a server responds. Their relationship is over. The original design of the WWW was to retrieve publicly available documents on the Internet, and for that purpose the stateless approach is suitable. However, nowadays the WWW is considered an effective medium to carry out business. In this scenario, some web sites need to allow access to registered clients only. Certain web sites are being used as electronic stores that allow users to browse through the store, select wanted items, put them in an electronic cart, and pay at the end with a credit card. Some web sites are used as portals: the user selects the web pages he wants to see. For web sites like these, the original stateless approach is not sufficient and hence cookies were introduced by Netscape.
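The exchange described in the next paragraph (the server sending a cookie and the browser returning it later) can be illustrated with Python's standard http.cookies module. The cookie name and value below are invented for the example.

    from http.cookies import SimpleCookie

    # Server side: build a Set-Cookie header for a newly registered client.
    outgoing = SimpleCookie()
    outgoing["reg_id"] = "12345"
    print(outgoing.output())            # Set-Cookie: reg_id=12345

    # Client side: the browser stores the cookie and sends it back later.
    returned = SimpleCookie()
    returned.load("reg_id=12345")       # value taken from the Cookie: request header
    print(returned["reg_id"].value)     # the server now recognizes the old client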

When a server receives a request from a client, it stores information about the client in a file or a string. The information may include the domain name of the client, the contents of the cookie (information the server has gathered about the client such as name, registration number, and so on), a timestamp, and other information depending on the implementation. The server includes the cookie in the response that it sends to the client. When the client receives the response, the browser stores the cookie in the cookie directory, which is sorted by the domain server name. When a client sends a request to the server, the browser looks in the cookie directory to see if it can find a cookie sent by that server. If found, the cookie is included in the request. When the server receives the request, it knows that this is an old client, not a new one. Note that the contents of the cookie are never read by the browser or disclosed to the user. It is a cookie made by the server and used by the server.
A site that restricts access to registered clients only sends a cookie to the client when the client registers for the first time. For any repeated access, only those clients that send the appropriate cookie are allowed. An electronic store can use a cookie for its client shoppers. When a client selects an item and inserts it into a cart, a cookie that contains information about the item such as its number and unit price is sent to the browser. If the client selects a second item, the cookie is updated with the new selection information. This is repeated till the client finishes shopping. A web portal uses a cookie in a similar way. When a user selects her favorite pages, a cookie is made and sent. If the site is accessed again, the cookie is sent to the server to show what the client is looking for.
Have you understood?
1. Why has the web been designed as stateless?
2. What are the limitations of the web being stateless?
3. What are the problems in keeping track of the clients in terms of their IP addresses?
4. What is meant by a cookie?
5. What are the security risks in using cookies?

4.4 HYPER TEXT TRANSFER PROTOCOL


The protocol used by the client and the server in the process of transferring the web documents or pages is Hyper Text Transfer Protocol (HTTP). The specification of HTTP discusses about the methods that are used by the client and the server through which various types of requests are made and the responses are obtained. Each interaction consists of one ASCII request, followed by one RFC 822 MIME-like response. All clients and all servers must obey this protocol. It is defined in RFC 2616.

177

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

4.4.1 Connections The usual way for a browser to contact a server is to establish a TCP connection to port 80 on the servers machine. However, it is possible for the client to make use of UDP or some other unreliable transport layer protocol also to communicate with the server. TCP is preferred since it takes care of the problems like lost messages, duplicate messages, or acknowledgements. Otherwise either the application layer protocol or the application has to take care of these issues. In HTTP 1.0, the web client (the browser) establishes a connection with the web server and makes a single request and the server sends the single response over the established connection. Once the reply is sent the connection is released. When WWW was confined to documents with HTML text alone, the above mode of communication between the client and the sender was adequate. However, WWW in its present status contains pages that have many other things apart from the HTML text such as icons, images, and other eye candy, so establishing a TCP connection to transport a single icon became a very expensive way to operate. Hence many modifications were done in HTTP 1.0 and its next version by name HTTP 1.1 was released. HTTP 1.1 supports persistent connections. Persistent connections refer to the ability to send additional requests and get additional responses over the same TCP connection. By amortizing the TCP setup and release over multiple requests, the relative overhead due to TCP is much less per request. HTTP 1.1 supports two type of persistent connections namely persistent connection with pipeline and without pipeline. In persistent connection without pipeline, client issues new request only when previous response has been received. This means that one RTT for each referenced object. In pipelined requests, it is possible to send request 2 before the response to request 1 has arrived. 4.4.2 Methods HTTP has been designed in such a way that it can be used for other types of object-oriented applications also apart from fetching and displaying the web pages. Hence HTTP supports a variety of methods instead of just supporting the methods for web page request and response. Each request consists of one or more lines of ASCII text, with the first word on the first line being the name of the method requested. The built-in methods are listed in figure 4.3. For accessing general objects, additional object-specific methods may also be available. The names are case sensitive, so GET is a legal method but get is not. The GET method requests the server to send the page (by which we mean object, in the most general case, but in practice normally just a file). The page is suitably encoded in MIME. The vast majority of requests to Web servers are GETs. The usual form of GET is

Anna University Chennai

178

DIT 116

NETWORK PROTOCOLS

GET filename HTTP/1.1 Where filename names the resource (file to be fetched and 1.1 is the protocol version being used). The HEAD method just asks for the message header, without the actual page. This method can be used to get a pages time of last modification, to collect information for indexing purposes, or just to test a URL for validity.

NOTES

Method GET HEAD PUT POST DELETE TRACE CONNECT OPTIONS

Description Request to read a web page Request to read a web pages header Request to store a web page Append to a named resource(e.g., a Web page) Remove the Web page Echo the incoming request Reserved for future use Query certain options

Figure 4.3 The built-in HTTP request methods

The PUT method is the reverse of GET: Instead of reading the page, it writes the page. This method makes it possible to build a collection of web pages on a remote server. The body of the request contains the page. It may be encoded using MIME, in which case the lines following the PUT might include Content-Type: and authentication headers to prove that the caller indeed has permission to perform the requested operation. Somewhat similar to PUT is the POST method. It, too, bears the URL, but instead of replacing the existing data, the new data is appended to it in some generalized sense. Posting a message to a newsgroup or adding a file to a bullet-in-board system are examples of appending in this context. In particular neither PUT nor POST is used very much. DELETE does what you might expect: it removes the page. As with PUT, authentication and permission play a major role here. There is no guarantee that DELETE succeeds, since even if the remote HTTP server is willing to delete the page, the underlying file may have a mode that forbids the HTTP server from modifying or removing it. The TRACE method is for debugging. It instructs the server to send back the request. This method is useful when requests are not being processed correctly and the client wants to know what request the server has actually got.

179

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

The CONNECT method is not currently used. It is reserved for future use. The OPTIONS method provides a way for the client to query the server about is properties or those of a specific file. Every request gets a response consisting of a status line, and possibly additional information (e.g., all are part of a web page). The status line contains the 3 digit status code telling whether the request was satisfied, and if not, why not. The first digit is used to divide the responses into five major groups as shown in figure 4.4.

Code 1xx 2xx 3xx 4xx 5xx

Meaning Information Success Redirection Client error Server error

Examples 100=server agrees to handle clients request 200=request succeeded;204=no content present 301=page moved; 304= cached page still valid 403 = forbidden page; 404=page not found 500=internal server error; 503=try again later

Figure 4.4. The status code response groups

The 1xx codes are rarely used in practice. The 2xx codes mean that the request was handled successfully and the content (if any) is being returned. The 3xx codes tell the client to look elsewhere, either using a different URL or in its own cache. The 4xx codes means the request failed due to a client error such as invalid request or a nonexistent page. Finally, the 5xx errors mean the server itself has the problem, either due to an error in its code or to a temporary overload. 4.4.3 Message Headers HTTP is basically a request/response scheme. In addition to the actual method, HTTP messages are to be provided with the required additional information called headers. Some of them are used as request headers and some of them are used as response headers. Few headers can be used as both request as well as response headers. The request line followed by additional lines with more information is called request headers. Additional information present in the response lines is called response headers. A selection of the most important ones is given in figure 4.5. The User-Agent header allows the client to inform the server about its browser, operating system and other properties. This header is used by the client to provide the server with the information. The four Accept headers tell the server what the client is willing to accept in the event that it has a limited repertoire of what is acceptable. The first header specifies the MIME types that are welcome (e.g., text/html). The second gives the character set (e.g., ISO-8859-5 or Unicode-1-1). The third deals with compression methods (e.g., gzip). The fourth indicates a natural language (e.g., Spanish). If

Anna University Chennai

180

DIT 116

NETWORK PROTOCOLS

the server has a choice of pages, it can use this information to supply the one the client is looking for. If it is unable to satisfy the request, an error code is written and the request fails.

NOTES

Header User-Agent Accept Accept-Charset Accept-Encoding Accept-Language Host Authorization Cookie Date Upgrade Server Content-Encoding Content-language Content-Length Content-Type Last-Modified Location Accept-Ranges Set-Cookie

Type Request Request Request Request Request Request Request Request Both Both Response Response Response Response Response Response Response Response Response

Contents Information about the browser and its platform The type of pages the client can handle The character sets that are acceptable to the client The page encodings the client can handle The natural languages the client can handle The servers DNS name A list of the clients credentials Sends a previously set cookie back to the server Date and time the message was sent The protocol the sender wants to switch on Information about the server How the content is encoded (e.g., gzip) The natural language used in the page The pages length in bytes The pages MIME type Time and date the page was last changed A command to the client to send its request elsewhere The server will accept byte range requests The server wants the client to save a cookie

Figure 4.5 Some HTTP message headers

The Host header names the server. It is taken from the URL. This header is mandatory. It is used because some IP addresses may serve multiple DNS names and the server needs some way to tell which host to hand the request to. The Authorization header is needed for pages that are protected. In this case, the client may have to prove it has a right to see the page requested. This header is used for that case. Although cookies are dealt with in RFC 2109, rather than RFC 2616, they also have two headers. The Cookie header is used by client to return to the server a cookie that was previously sent by some machine in the servers domain. The Date header can be used in both directions and contains the time and date the message was sent. The Upgrade header is used to make it easier to make the transition to a future (possibly incompatible) version of the HTTP protocol. It allows the client to announce what it can support and the server to asset what it is using.

181

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

The first one, Server, allows the server to tell who it is, and some of its properties if it wishes. The next four headers, all starting with Content-, allow the server to describe properties of the page it is sending. The Last-Modified header tells when the page was last modified. This header plays an important role in page caching. The Location header is used by the server to inform the client that it should try a different URL. This can be used if the page has moved or to allow multiple URLs to refer to the same page (possibly on different servers). It is also used for companies that have a main web page in the .com domain, but which redirect clients to a national or regional page based on their IP addresses or preferred languages. If a page is very large, a small client may not want it all at once. Some servers will accept request for byte ranges, so the page can be fetched in multiple small units. The Accept-Ranges header announces the servers willingness to handle this type of partial page request. The second cookie header, Set-Cookie, is how servers send cookies to clients. The client is expected to save the cookie and return it on subsequent requests to the server. Have you understood? 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 4.5 What is the application layer protocol of the web? What is meant by a hypertext? What are the limitations of HTTP 1.0? What is meant by a persistent connection in HTTP? Differentiate between the persistent connection without pipeline and persistent connection without pipeline. What is the purpose of GET method of HTTP? Differentiate between PUT and POST methods of HTTP. What is the purpose of status response code? Give examples for Request type HTTP message headers. What is meant by Response Message header? ELECTRONIC MAIL

The electronic mail (e-mail) service of the Internet allows users to send memos or messages or mails across the Internet. E-mail is one of the most widely used application services. Indeed, some users rely on e-mail for normal business activity. E-mail is also popular because it offers a fast, convenient method of transferring information. E-mail accommodates small notes or large voluminous memos with a single mechanism.

Mail delivery is a new concept because it differs fundamentally from other services of the Internet. In other applications of the Internet, network protocols send packets directly to destinations, using timeout and retransmission for individual segments if no acknowledgement returns. Since e-mail has to support off-line delivery also, the application should be able to deliver the message even if the recipient is not on line. A sender does not want to wait for the remote machine to respond before continuing work nor does the user want the transfer to abort to merely because the destination is temporarily unavailable. To handle delay delivery mail system uses a technique known as spooling. When the user sends a mail message, the system places a copy in its private storage (spool) area along with identification of the sender, recipient, destination machine, and time of deposit. The system then initiates the transfer to the remote machine as a background activity, allowing the sender to proceed with other computational activities. Figure 4.6 illustrates the concept.
[Diagram: the user sends mail through the user interface, which deposits it in an outgoing mail spool area; a client process transfers it in the background over a TCP connection, while a server process accepts incoming mail over a TCP connection and places it in the mailboxes from which the user reads mail.]

Figure 4.6 Conceptual components of an e-mail system

The background mail transfer process becomes a client. It first uses the domain name system to map the destination machine name to an IP address, and then attempts to form a TCP connection to the mail server on the destination machine. If it succeeds, the transfer process passes a copy of the message to the remote server, which stores the copy in the remote systems spool area. Once the client and server agree that the copy has been accepted and stored, the client removes the local copy. If it cannot form a TCP connection or if the connection fails, the transfer process records the time delivery was attempted and terminates. The background transfer process sweeps through the spool area periodically, typically once every 30 minutes, checking for undelivered mail. Whenever it finds a message or whenever user deposits new outgoing mail, the background process attempts delivery. If it finds that a mail message cannot be delivered after an extended time (e.g., 3 days), the mail software returns the message to the sender.
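The spool-and-retry behaviour described above can be sketched as follows. The 30-minute sweep interval and the three-day give-up time come from the text; the function and variable names are placeholders invented for this illustration.

    import time

    SWEEP_INTERVAL = 30 * 60          # sweep the spool every 30 minutes
    GIVE_UP_AFTER = 3 * 24 * 3600     # return undelivered mail after about 3 days

    def sweep_spool(spool, try_delivery, return_to_sender):
        """One pass over the spool area; the three arguments are placeholders
        for the real spool list, delivery routine and bounce routine."""
        now = time.time()
        for message in list(spool):
            if try_delivery(message):             # copy accepted and stored remotely
                spool.remove(message)             # safe to remove the local copy
            elif now - message["deposited"] > GIVE_UP_AFTER:
                return_to_sender(message)         # bounce the message back
                spool.remove(message)
            else:
                message["last_attempt"] = now     # leave it for the next sweep

    # The background process would loop forever:
    #     sweep_spool(spool, try_delivery, return_to_sender); time.sleep(SWEEP_INTERVAL)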

4.5.1 Basic Functions of an E-mail System

An e-mail system has to support five basic functions. However, modern e-mail systems have added more sophisticated functions.
Composition: This refers to the process of creating messages and answers. The e-mail system should provide an editing facility to compose a mail and to make changes in it (if required). In addition to the functions of an ordinary editor, the composition facility should provide addressing fields and many header fields to convey additional meaning.
Transfer: This refers to moving messages from the sender to the recipient. This requires establishing a connection to the destination (in TCP/IP based e-mail systems) or to the intermediate machines (in the application gateway approach), outputting the message and releasing the connection.
Reporting: This has to do with telling the originator what happened to the message. It may be necessary to answer questions like: was it delivered? was it lost? and so on.
Displaying: This function is required to enable the users to read their e-mail. Sometimes the e-mail system itself may not be able to display the message (the message is a postscript file or digitized voice) and it may seek the help of special viewer applications.
Disposition: This is the final step in the e-mail system. It deals with what the recipient does with the message after receiving it. Possibilities include throwing it away before reading, throwing it away after reading, saving it and so on.
4.5.2 Mailbox Names and Aliases
There are three important ideas hidden in our simplistic description of mail delivery. First, users specify recipients by giving a pair of strings that identify the mail destination machine name and a mailbox address on that machine. Second, the names used in such specifications are independent of other names assigned to machines. Usually, a mailbox address is the same as a user's login id, and a destination machine name is the same as the machine's domain name, but that is not necessary. It is possible to assign a mailbox to a position of employment (e.g., the mailbox vc in the mail-id vc@annauniv.edu always refers to the present vice-chancellor of Anna University and is not confined to an individual).


Also, because the domain name system includes a separate query type for mail destinations, it is possible to decouple mail destination names from the usual domain names assigned to machines. Thus, mail sent to a user at example.com may go to a different machine than a telnet connection to the same name. Third, our simplistic diagram fails to account for mail processing and mail forwarding, which include mail sent from one user to another on the same machine, and mail that arrives on a machine but which should be forwarded to another machine.
4.5.3 Alias Expansion and Mail Forwarding
Most systems provide mail forwarding software that includes a mail alias expansion mechanism. A mail forwarder allows the local site to map identifiers used in mail addresses to a set of one or more new mail addresses. Usually, after a user composes a message and names a recipient, the mail interface program consults the local aliases to replace the recipient with the mapped version before passing the message to the delivery system. Recipients for whom no mapping has been specified remain unchanged. Similarly, the underlying mail system uses the aliases to map incoming recipient addresses.


[Diagram: the mail system of figure 4.6 extended with an alias database; an alias expansion and forwarding module sits between the user interface, the mailboxes for incoming mail served by the server, and the outgoing mail spool area served by the background transfer client.]

Figure 4.7 An extension of the mail system shown in figure 4.6

Aliases increase mail system functionality and convenience substantially. In mathematical terms, alias mappings can be many-one or one-many. For example, the aliases system allows a single user to have multiple mail identifiers, including nicknames and positions, by mapping a set of identifiers to a single person. The system also allows a site to associate groups to recipients with a single identifier. Using aliases that map an identifier to a list of identifiers makes it possible to establish a mail exploder that accepts

one incoming message and sends it to a large set of recipients. The set of recipients associated with an identifier is called an electronic mailing list. Not all the recipients on a list need to be local. Although it is uncommon, it is possible to have a mailing list at site, Q, with none of the recipients from the list located at Q. expanding a mail alias into a large set of recipients is a popular technique used widely. Figure 4.7 illustrates the components of a mail system that supports mail aliases and list expansion. As figure 4.13 shows, incoming and outgoing mail passes through the mail forwarder that expands aliases. Thus, if alias database specifies that mail address x maps to replacement y, alias expansion will rewrite destination address x, changing it to y. the alias expansion program then determines whether y specifies a local or remote address, so it knows whether to place the message in the incoming mail queue or outing mail queue. Mail alias expansion can be dangerous. Suppose two sites establish conflicting aliases. For example, assume site A maps mail address x into mail address y at site B, while site B maps mail address y into address x at site A. A mail message sent to address x at site A could bounce forever between the two sites. Similarly, if the manager at site A accidentally maps a users login name at that site to an address at another site, the user will be unable to receive mail. The mail may go to another user or, if the alias specifies an illegal address, senders will receive error messages. 4.5.4 Two different Approaches of providing E-Mail Commercial services exist that can forward electronic mail among computers without using TCP/IP and without having the computers connected to the global internet. There are two crucial differences between these services and TCP/IP e-mail system. First, a TCP/IP internet makes possible universal delivery service. Second, electronic mail systems built on TCP/IP are inherently more reliable than those systems built from arbitrary networks. TCP/IP makes possible universal mail delivery because it provides universal interconnection among machines. In essence, all machines attached to an internet behave as if attached to a single, vendor independent network. With the basic network services in place, devising a standard mail exchange protocol becomes easier. The key idea behind TCP/IP e-mail systems is that TCP provides end-to-end connectivity. That is, mail software on the sending machine acts as a client, contacting a server on the ultimate destination. Only after the client successfully transfers a mail message to the server does it remove the message from the local machine. With such systems, the sender can always determine the exact status of a message by checking the local mail spool area. The alternative form of electronic mail delivery uses the application gateway approach. The message is transferred through a series of mail gateways, sometimes called


mail bridges, mail relays or intermediate mail stops. In such systems, the senders machine does not contact the recipients machine directly. Instead, a complete mail message is sent from the original sender to the first gateway. The main disadvantage of using mail gateways is that they introduce unreliability. Once it transfers a message to the first intermediate machine, the senders computer discards the local copy. Thus, while the message is in transit, neither the sender nor the recipient has a copy. Failures at intermediate machines may result in message loss without either the sender or the receiver being informed. Message loss can also result if the mail gateways route mail incorrectly. Another disadvantage of mail gateways is that they introduce delay. A mail gateway can hold messages for minutes, hours or even days if it can not forward them on to the next machine. Neither the sender nor receiver can determine where a message has been delayed, why it has not arrived, or how long the delays will last. The important point is that the sender and recipient must depend on computers over which they may have no control. If e-mail gateways are less reliable than end-to-end delivery, why are they used? The chief advantage of mail-gateways is interoperability. Mail gateways provide connections among standard TCP/IP mail systems and other mail systems, as well as between TCP/IP internets and networks that do not support Internet protocols. Suppose, for example, that company X has a large internal network and that the employees use electronic mail, but that the network software does not support TCP/IP. Although it may be infeasible to make the companys network part of the global Internet, it might be easy to place a mail gateway between the companys private network and the Internet and to devise software that accepts mail messages from the local network and forwards them to the Internet. While the idea of mail gateways may seem somewhat awkward, electronic mail has become such an important tool that users who do not have Internet access depend on the gateways. Thus, although gateway service is not reliable or convenient as end-to-end delivery, it can still be useful. 4.5.5 Electronic Mail Addresses A user familiar with electronic mail knows that mail address formats vary among e-mail systems. Thus, it can be difficult to determine a correct electronic mail address, or even to understand a senders intentions. Within the global internet, addresses have a simple, easy to remember form: local-part @ domain-name where domain-name is the domain name of a mail destination to which the mail should be delivered, and local-part is the address of a mailbox on that machine. For example, within the internet, an electronic mail address is:




ramesh @ annauniv.edu However, mail gateways make addresses complex. Someone outside the internet must either address the mail to the nearest mail gateway that connected between outside networks and the internet, someone with access to the gateway might have used the following address to reach the recipient: ramesh%annauniv.edu @ vsnl.net Once the mail reached machine vsnl.net, the mail gateway software extracted local-part, changed the percent sign (%) into an at sign (@), and used the result as a destination address to forward the mail. The reason addresses become complex when they include non-internet sites is that the mail address mapping function is local to each machine. Thus, some mail gateways require the local part to contain addresses of the form: user % domain-name while others require: user : domain - name and still others use completely different forms. More important, electronic mail systems do not usually agree on conventions for precedence or quoting, making it impossible for a user to guarantee how addresses will be interpreted. For example, consider the electronic mail address: ramesh % annauniv.edu @ vsnl.net mentioned earlier. A site using the TCP/IP standard for mail would interpret the address to mean, send the message to mail exchanger vsnl.net and let that mail exchanger decide how to interpret ramesh % annauniv.edu (the local part). In essence, the site acts as if the address were parenthesized: (ramesh % annauniv.edu) @ (vsnl.net) At a site that uses % to separate user names from destination machines, the same address might mean, send the mail to user ramesh at the site given by the remainder of the address. That is., such sites act as if the address were parenthesized: (ramesh) % (annauniv.edu @ vsnl.net) Have you understood? 1. 2. 3. How is e-mail different from other applications of the Internet? What are the advantages of e-mail over postal system and telephone system? What is meant by spooling in an e-mail system?

4. What is the transport layer protocol used by e-mail systems?
5. List down the basic functions to be performed by an e-mail system.
6. What is meant by a mailbox in an e-mail system?
7. What are the advantages of aliases in e-mail systems?
8. What are the two popular mechanisms of providing e-mail facility?
9. What are the limitations of the application-gateway approach in e-mail systems?
10. Mention the various parts of an e-mail address with an example.

4.6 PROTOCOLS OF E-MAIL SYSTEM


In addition to message formats, the TCP/IP protocol suite specifies a standard for the exchange of mail between machines. That is, the standard specifies the exact format of messages a client on one machine uses to transfer mail to a server on another. Apart from the protocol used to transfer the messages between machines, additional protocols are required to enable the end users to retrieve the mail from the mailboxes. This section introduces the various types of protocols involved in sending and delivering messages. 4.6.1 Simple Mail Transfer Protocol

The standard transfer protocol is known as the Simple Mail Transfer Protocol (SMTP). SMTP has been named so since it is simpler than the earlier Mail Transfer Protocol (MTP). The SMTP protocol focuses specifically on how the underlying mail delivery system passes messages across an internet from one machine to another. It does not specify how the mail system accepts mail from a user or how the user interface presents the user with incoming mail. Also, SMTP does not specify how mail is stored or how frequently the mail system attempts to send messages. Usually the sequence of events that takes place in SMTP is as follows:
1. The source machine establishes a TCP connection to port 25 of the destination machine. (The process that waits at this port is the e-mail daemon that speaks SMTP.)
2. After establishing a TCP connection at port number 25, the sending machine, operating as the client, waits for the receiving machine (the server) to talk first.
3. The server starts by sending a line of text giving its identity and telling whether it is prepared to receive the mail.
4. If the server is willing to accept the e-mail, the client announces whom the e-mail is coming from and whom it is going to.
5. The SMTP daemon accepts incoming connections and copies messages from them into the appropriate mailboxes.
6. If the message cannot be delivered, an error report containing the first part of the undeliverable message is sent to the client.
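The client side of this exchange can be driven with Python's standard smtplib module. This is only a sketch: the host names and mailbox addresses follow the Alpha.EDU/Beta.GOV example used later in figure 4.9, and the code assumes the destination actually runs an SMTP server on port 25.

    import smtplib

    with smtplib.SMTP("beta.gov", 25) as server:    # step 1: TCP connection to port 25
        server.helo("alpha.edu")                     # identify ourselves to the server
        server.sendmail(
            "smith@alpha.edu",                       # MAIL FROM:
            ["jones@beta.gov", "brown@beta.gov"],    # one RCPT TO: per recipient
            "Subject: Test\r\n\r\nBody of the mail message.\r\n",
        )
    # QUIT is sent automatically when the with-block exits.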


These steps are depicted in figure 4.8. Communication between a client and server consists of readable ASCII text. Although SMTP rigidly defines the command format, humans can easily read a transcript of interactions between a client and server. Initially, the client establishes a reliable stream connection to the server and waits for the server to send a 220 READY FOR MAIL message. (If the server is overloaded, it may delay sending the 220 message temporarily). Upon receipt of the 220 messages, the client sends a HELLO command. The end of a line marks the end of a command. The server responds by identifying itself. Once communication has been established, the sender can transmit one or more mail messages, terminate the connection, or request the server to exchange the roles of sender and receiver so messages can flow in the opposite direction. The receiver must acknowledge each message. It can also abort the entire connection or abort the current message transfer. Mail transactions begin with a MAIL command that gives the sender identification as well as a FROM: field that contains the address to which errors should be reported. A recipient prepares its data structures to receive a new mail message, and replies to a MAIL command by sending the response 250. Response 250 means that all is well. The full response consists of the text 250 OK. As with other application protocols, programs read the abbreviated commands and 3-digit numbers at the beginning of lines the remaining text is intended to help humans debug mail software.

[Diagram: several user agents hand messages to their mail servers, and the mail servers exchange the messages with other mail servers across the Internet using SMTP.]

Figure 4.8 SMTP Protocol


After a successful MAIL command, the sender issues a series of RCPT commands that identify recipients of the mail message. The receiver must acknowledge each RCPT command by sending 250 OK or by sending the error message 550 No such user here. After all RCPT commands have been acknowledged, the sender issues a DATA command. In essence, a DATA command informs the receiver that the sender is ready to transfer a complete mail message. The receiver responds with message 354 Start mail input and specifies the sequence of characters used to terminate the mail message. The termination sequence consists of 5 characters: carriage return, line feed, period, carriage return, and line feed. An example will clarify the SMTP exchange. Suppose user Smith at host Alpha.EDU sends a message to users Jones, Green, and Brown at host Beta.GOV. The SMTP client software on host Alpha.EDU contacts the SMTP server software on host Beta.GOV and begins the exchange as shown in figure 4.9.
S: 220 Beta.GOV Simple Mail Transfer Service Ready
C: HELO Alpha.EDU
S: 250 Beta.GOV
C: MAIL FROM: <Smith@Alpha.EDU>
S: 250 OK
C: RCPT TO: <Jones@Beta.GOV>
S: 250 OK
C: RCPT TO: <Green@Beta.GOV>
S: 550 No such user here
C: RCPT TO: <Brown@Beta.GOV>
S: 250 OK
C: DATA
S: 354 Start mail input; end with <CR><LF>.<CR><LF>
C: (sends body of mail message)
C: (continues for as many lines as the message contains)
C: <CR><LF>.<CR><LF>
S: 250 OK
C: QUIT
S: 221 Beta.GOV Service closing transmission channel
Figure 4.9 Example of SMTP transfer


In the example, the server rejects recipient Green because it does not recognize the name as a valid mail destination (i.e., it is neither a user nor a mailing list). The

SMTP protocol does not specify the details of how a client handles such errors the client must decide. Although clients can abort the delivery completely if an error occurs, most clients do not. Instead, they continue delivery to all valid recipients and then report problems to the original sender. Usually, the client reports errors using electronic mail. The error message contains a summary of the error as well as the header of the mail message that caused the problem. Once the client has finished sending all the mail messages it has for a particular destination, the client may issue the TURN command to turn the connection around. If it does, the receiver responds 250 OK and assumes control of the connection. With the roles reversed the side that was originally a server sends back any waiting mail messages. Whichever side controls the interaction can choose to terminate the session; to do so, it issues a QUIT command. The other side responds with command 221, which means it agrees to terminate. Both sides then close the TCP connection gracefully. SMTP is much more complex than we have outlined here. For example, if a user has moved, the server may know the users new mailbox address. SMTP allows the server to inform the client about the new address so the client can use it in the future. When informing the client about the new address, the server may choose forward the mail triggered the message, or it may request that the client take responsibility for forwarding. 4.6.2 Final Delivery SMTP is not involved in the final delivery of the mail to the recipient. SMTP is a push protocol; it pushes the message from the client to the server. In other words, in SMTP, the direction of bulk data (messages) is from the client to the server. On the other hand, the final delivery of the mail needs a pull protocol; the client must pull messages from the server. The required direction for the bulk data is from the client to the server. For the final delivery, currently two message access protocols are available: Post Office Protocol (POP) and Internet Mail Access Protocol (IMAP). The scenario is as depicted in figure 4.10.

[Diagram: SMTP carries mail from the sender's user agent to the sender's mail server and then to the receiver's mail server; a separate mail access protocol moves it from the receiver's mail server to the receiver's user agent.]

Figure 4.10 Mail access protocol


4.6.3 POP3


The final delivery can be achieved with the help of a pull protocol that allows user transfer agents (on client PCs) to contact the message transfer agent (on the ISPs machine) and allow e-mail to be copied from the ISP to the user. POP3 (Post Office Protocol, version 3) and it is described in RFC 1939. POP3 is simple and limited in functionality. The client POP3 software is installed on the recipient computer and the server POP3 software is installed on the mail server. The situations of both the sender and receiver are available on line and only the sender is on line are depicted in figures 4.11a and 4.11b.

[Diagram: the UA and MTA on the sending host deliver mail with SMTP across the Internet directly into the mailbox on the receiving host, which has a permanent connection.]

Figure 4.11a. Receiver with a Permanent Internet Connection

[Diagram: the UA and MTA on the sending host deliver mail with SMTP across the Internet into a mailbox on the POP3 server running on the ISP's machine; the user's PC later retrieves it over a dial-up connection using POP3.]

Figure 4.11b. Receiver with a dial-up connection to an ISP


Mail access starts with the client when the user needs to download its e-mail from the mailbox on the mail server. The client opens a connection to the server on TCP port 110. It then sends its user name and password to access the mailbox. The user can then list and retrieve the messages one by one. POP3 begins when the user starts the mail reader. The mail reader calls up the ISP (unless there is already a connection) and establishes a TCP connection with the message transfer agent at port 110. Once the connection has been established, the POP3 protocol goes through three states in sequence: 1. Authorization 2. Transactions 3.Update. The authorization state deals with having the user log in. The transaction state deals with the user collecting the e-mails and marking them for deletion from the mail box. The update state actually causes the e-mails to be deleted. This behavior can be observed by typing something like telnet mail.isp.com 110 where mail.isp.com represents the DNS name of your ISPs mail server. Telnet establishes a TCP connection to port no 110, on which the POP3 server listens. Upon accepting the TCP connection, the server sends an ASCII message announcing that is present. Usually, it begins with +OK followed by a comment. An example of the scenario is shown in figure 4.12 starting after the TCP connection has been established. As before, the lines marked C: are from the client (user) and those marked S: are from the server (message transfer agent on the ISPs machine). S:+OK POP3 server ready C:USER Carolyn S:+OK C:PASS Vegetables S:+OK login successful C:LIST S:1 2505 S:2 14302 S:3 8122 S:. C:RETR 1 S: (sends message 1) C:DELE 1 C:RETR 2 S: (sends message 2) C:DELE 2 C:RETR 3 S: (sends message 3)


C: DELE 3
C: QUIT
S: +OK POP3 server disconnecting


Figure 4.12 Interaction between the user and server using POP3


During the authorization state, the client sends over its user name and then its password. After a successful login, the client can then send over the LIST command, which causes the server to list the contents of the mail box, one message per line, giving the length of that message. The list is terminated by a period. Then the client and retrieve messages using the RETR command and marks them for deletion with DELE. Then all messages have been retrieved (and possibly marked for deletion), the client gives the QUIT command to terminate the transaction state and enter the update state. When the server has deleted all the messages, it sends a reply and breaks the TCP connection. 4.6.4 IMAP

Another mail access protocol is Internet Mail access Protocol, version 4 (IMAP4). IMAP4 is similar to POP3, but it has more features. IMAP4 is more powerful and more complex. POP3 has many limitations. It does not allow the user to organize her mail on the server; the user cannot have different folders on the server. In addition POP3 does not allow the user to partially check the contents of the mail before downloading. IMPA4 provides many extra functions. A user can check the email header prior to downloading. A user can search the contents of the email for a specific string of characters prior to downloading. A user can partially download email. This is especially useful if bandwidth is limited and the email contains multimedia with high bandwidth requirements. A user can create, delete or rename mailboxes. A user can create a hierarchy of mailboxes in a folder for email storage. The general style of the IMAP protocol is similar to that of POP3 as shown in Figure 4.13, except that are there dozens of commands. The IMAP server listens to port 143. A comparison of POP3 and IMAP is given in figure 4.19. It should be noted, however, that not every ISP supports both protocols. Thus, when choosing an e-mail program, it is important to find out which protocol(s) it supports and make sure the ISP supports at least one of them.
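As an illustration that is not part of the original text, the POP3 session of figure 4.12 can be driven from a program with Python's standard poplib module. The server name and credentials below are the placeholder values used in that example (mail.isp.com, Carolyn, Vegetables).

    import poplib

    server = poplib.POP3("mail.isp.com", 110)     # authorization state
    server.user("Carolyn")
    server.pass_("Vegetables")

    count, size = server.stat()                   # transaction state
    for i in range(1, count + 1):
        response, lines, octets = server.retr(i)  # like RETR in figure 4.12
        message = b"\r\n".join(lines)
        server.dele(i)                            # mark the message for deletion

    server.quit()                                 # update state: deletions happen now

IMAP access looks similar in outline but, as the comparison in figure 4.13 shows, it leaves the messages organized on the server instead of downloading and deleting them.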

Feature                          POP3        IMAP
Where is protocol defined        RFC 1939    RFC 2060
TCP port used                    110         143
Where e-mail is stored           User's PC   Server
Where is e-mail read             Off-line    On-line
Connect time required            Little      Much
Use of server resources          Minimal     Extensive
Multiple mailboxes               No          Yes
Who backs up mailboxes           User        ISP
Good for mobile users            No          Yes
User control over downloading    Little      Great
Partial message downloads        No          Yes
Are disk quotas a problem        No          Yes
Simple to implement              Yes         No
Widespread support               Yes         Growing

Figure 4.13 POP3 Vs IMAP
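As a rough illustration of the extra capabilities listed above, the following sketch uses Python's standard imaplib module to examine only selected header fields of each message before deciding whether to download the body; the server name and credentials are hypothetical.

```python
# A sketch of IMAP's "inspect before download" capability using imaplib.
# Server name and account details are hypothetical placeholders.
import imaplib

imap = imaplib.IMAP4("mail.isp.com", 143)   # IMAP server listens on port 143
imap.login("Carolyn", "Vegetables")
imap.select("INBOX")                        # mailboxes (folders) stay on the server
typ, data = imap.search(None, "ALL")
for num in data[0].split():
    # Fetch only the From: and Subject: header fields, not the whole body.
    typ, parts = imap.fetch(num, "(BODY.PEEK[HEADER.FIELDS (FROM SUBJECT)])")
    for part in parts:
        if isinstance(part, tuple):
            print(part[1].decode(errors="replace"))
imap.logout()
```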

Have you understood?
1. List down the steps involved in sending the mail by TCP.
2. How is the word Simple justified in SMTP?
3. What does the code 250 indicate in SMTP?
4. What are the various fields present in the headers sent by the client to the server in SMTP interaction?
5. What are the limitations of SMTP?
6. What is the necessity of off-line delivery in SMTP?
7. How is 24-hour delivery ensured in e-mail systems?
8. What is the sequence of states followed by POP3?
9. What are the limitations of POP3?
10. Where is e-mail stored in POP3 and IMAP?

4.7 MESSAGE FORMATS

4.7.1 RFC 822

Messages consist of a primitive envelope, some number of header fields, a blank line, and then the message body. Each header field (logically) consists of a single line of ASCII text containing the field name, a colon, and, for most fields, a value. RFC 822 was designed decades ago and does not clearly distinguish the envelope fields from the header fields. Although it was revised in RFC 2822, completely redoing it was not possible due to its widespread usage. In normal usage, the user agent builds a message and passes it to the message transfer agent, which then uses some of the header fields to construct the actual envelope, a somewhat old-fashioned mixing of message and envelope.

The principal header fields related to message transport are listed in Figure 4.14. The To: field gives the DNS address of the primary recipient. Having multiple recipients is also allowed. The Cc: field gives the addresses of any secondary recipients. In terms of delivery, there is no distinction between the primary and secondary recipients. It is entirely a psychological difference that may be important to the people involved but is not important to the mail system. The term Cc: (Carbon copy) is a bit dated, since computers do not use carbon paper, but it is well established. The Bcc: (Blind Carbon Copy) field is like the Cc: field, except that this line is deleted from all copies sent to the primary and secondary recipients. This feature allows people to send copies to third parties without the primary and secondary recipients knowing this.


Header          Meaning
To:             E-mail address(es) of primary recipient(s)
Cc:             E-mail address(es) of secondary recipient(s)
Bcc:            E-mail address(es) for blind carbon copies
From:           Person or people who created the message
Sender:         E-mail address of the actual sender
Received:       Line added by each transfer agent along the route
Return-Path:    Can be used to identify a path back to the sender

Figure 4.14 RFC 822 header fields related to message transport

The next two fields, From: and Sender:, tell who wrote and sent the message, respectively. These need not be the same. For example, a business executive may write a message, but her secretary may be the one who actually transmits it. In this case, the executive would be listed in the From: field and the secretary in the Sender: field. The From: field is required, but the Sender: field may be omitted if it is the same as the From: field. These fields are needed in case the message is undeliverable and must be returned to the sender.

A line containing Received: is added by each message transfer agent along the way. The line contains the agent's identity, the date and time the message was received, and other information that can be used for finding bugs in the routing system. The Return-Path: field is added by the final message transfer agent and was intended to tell how to get back to the sender. In theory, this information can be gathered from all the Received: headers (except for the name of the sender's mailbox), but it is rarely filled in as such and typically just contains the sender's address.

In addition to the fields of Figure 4.14, RFC 822 messages may also contain a variety of header fields used by the user agents or human recipients. The most common ones are listed in Figure 4.15.

Header          Meaning
Date:           The date and time the message was sent
Reply-To:       E-mail address to which replies should be sent
Message-Id:     Unique number for referencing this message later
In-Reply-To:    Message-Id of the message to which this is a reply
References:     Other relevant Message-Ids
Keywords:       User-chosen keywords
Subject:        Short summary of the message for the one-line display

Figure 4.15 Some fields used in the RFC 822 message header
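As a small illustration of the header fields in Figures 4.14 and 4.15, the sketch below composes a message with Python's standard email package; all of the addresses are hypothetical.

```python
# A sketch of composing RFC 822 style headers with the standard email package.
# Every address below is a made-up placeholder.
from email.message import EmailMessage
from email.utils import formatdate, make_msgid

msg = EmailMessage()
msg["From"] = "executive@abcd.com"
msg["Sender"] = "secretary@abcd.com"      # actual submitter, if different from From:
msg["To"] = "carolyn@xyz.com"
msg["Cc"] = "sales@xyz.com"
msg["Bcc"] = "archive@abcd.com"           # stripped from copies delivered to recipients
msg["Reply-To"] = "orders@abcd.com"       # where replies should go
msg["Date"] = formatdate(localtime=True)
msg["Message-Id"] = make_msgid()
msg["Subject"] = "New product announcement"
msg.set_content("Message body goes here.")
print(msg["Message-Id"])
```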

The Reply-To: field is sometimes used when neither the person composing the message nor the person sending the message wants to see the reply. For example, a marketing manager writes an e-mail message telling customers about a new product. The message is sent by a secretary, but the Reply-To: field lists the head of the sales department, who can answer questions and take orders. The field is also useful when the sender has two e-mail accounts and wants the reply to go to the other one.

The RFC 822 document explicitly says that users are allowed to invent new headers for their own private use, provided that these headers start with the string X-. It is guaranteed that no future headers will use names starting with X-, to avoid conflicts between official and private headers. Sometimes wise-guy undergraduates make up fields like X-Fruit-of-the-Day: or X-Disease-of-the-Week:, which are legal, although not always illuminating.

After the headers comes the message body. Users can put whatever they want here. Some people terminate their messages with elaborate signatures, including simple ASCII cartoons, quotations from greater and lesser authorities, political statements and disclaimers of all kinds.

4.7.2 MIME - The Multipurpose Internet Mail Extension

In the early days of the ARPANET, e-mail consisted exclusively of text messages written in English and expressed in ASCII. For this environment, RFC 822 did the job completely: it specified the headers but left the content entirely up to the users. Nowadays, on the worldwide Internet, this approach is no longer adequate. The problems include sending and receiving:
1. Messages in languages with accents
2. Messages in non-Latin alphabets
3. Messages in languages without alphabets
4. Messages not containing text at all


A solution was proposed in RFC 1341 and updated in RFCs 2045-2049. This solution, called MIME, is now widely used. The basic idea of MIME is to continue to use the RFC 822 format, but to add structure to the message body and define encoding rules for non-ASCII messages. By not deviating from RFC 822, MIME messages can be sent using the existing mail programs and protocols. All that has to be changed are the sending and receiving programs, which users can do for themselves. MIME defines five new message headers, as shown in Figure 4.16. The first of these simply tells the user agent receiving the message that it is dealing with a MIME message, and which version of MIME it uses. Any message not containing a MIME-Version: header is assumed to be an English plaintext message and is processed as such.
Header                      Meaning
MIME-Version:               Identifies the MIME version
Content-Description:        Human-readable string telling what is in the message
Content-Id:                 Unique identifier
Content-Transfer-Encoding:  How the body is wrapped for transmission
Content-Type:               Type and format of the content


Figure 4.16 RFC 822 headers added by MIME

The Content-Description: header is an ASCII string telling what is in the message. This header is needed so the recipient will know whether it is worth decoding and reading the message. If the string says Photo of Barbara's Hamster and the person getting the message is not a big hamster fan, the message will probably be discarded rather than decoded into a high-resolution color photograph. The Content-Id: header identifies the content. It uses the same format as the standard Message-Id: header.

The Content-Transfer-Encoding: header tells how the body is wrapped for transmission through a network that may object to most characters other than letters, numbers and punctuation marks. Five schemes are provided. The simplest scheme is just ASCII text. ASCII characters use just 7 bits and can be carried directly by the e-mail protocol, provided that no line exceeds 1000 characters. The next simplest scheme is the same thing, but using 8-bit characters, that is, all values from 0 up to and including 255. This encoding scheme violates the (original) Internet e-mail protocol but is used by some parts of the Internet that implement extensions to the original protocol. While declaring the encoding does not make it legal, having it explicit may at least explain things when something goes wrong. Messages using the 8-bit encoding must still adhere to the standard maximum line length.

The correct way to encode binary messages is to use base64 encoding, sometimes called ASCII armor. In this scheme, groups of 24 bits are broken up into four 6-bit units, with each unit being sent as a legal ASCII character. The coding is A for 0, B for 1, and so on, followed by the 26 lower-case letters, the ten digits, and finally + and / for 62 and 63, respectively. The == and = sequences indicate that the last group contained only 8 or 16 bits, respectively. Carriage returns and line feeds are ignored, so they can be inserted at will to keep the lines short enough. Arbitrary binary data can be sent safely using this scheme.

For messages that are almost entirely ASCII but with a few non-ASCII characters, base64 encoding is somewhat inefficient. Instead, an encoding known as quoted-printable encoding is used. This is just 7-bit ASCII, with all the characters above 127 encoded as an equal sign followed by the character's value as two hexadecimal digits.

Content-Type: The last header shown in Figure 4.16 is really the most interesting one. It specifies the nature of the message body. Seven types are defined in RFC 2045, each of which has one or more subtypes. The type and subtype are separated by a slash, as in

Content-Type: video/mpeg

The subtype must be given explicitly in the header; no defaults are provided. The initial list of types and subtypes specified in RFC 2045 is given in Figure 4.17. Many new ones have been added since then, and additional entries are being added all the time as the need arises.

Let us now go briefly through the list of types. The text type is for straight ASCII text. The text/plain combination is for ordinary messages that can be displayed as received, with no encoding and no further processing. This option allows ordinary messages to be transported in MIME with only a few extra headers.
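Before turning to the Content-Type values, the two transfer encodings just described can be tried directly with Python's standard library; the byte values below are chosen only to illustrate the 24-bit grouping and the padding rules.

```python
# base64 and quoted-printable encodings from the standard library.
import base64
import quopri

# base64 ("ASCII armor"): three octets (24 bits) become four 6-bit units.
assert base64.b64encode(b"Man") == b"TWFu"
# A final group of only 8 or 16 bits is marked with "==" or "=" respectively.
assert base64.b64encode(b"M") == b"TQ=="
assert base64.b64encode(b"Ma") == b"TWE="

# quoted-printable: mostly-ASCII text stays readable; each byte above 127
# is replaced by "=" followed by its value in hexadecimal.
print(quopri.encodestring("café".encode("utf-8")))   # the non-ASCII bytes appear as =C3=A9
```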
Type          Subtype         Description
Text          Plain           Unformatted text
              Enriched        Text including simple formatting commands
Image         Gif             Still picture in GIF format
              Jpeg            Still picture in JPEG format
Audio         Basic           Audible sound
Video         Mpeg            Movie in MPEG format
Application   Octet-stream    An uninterpreted byte sequence
              Postscript      A printable document in PostScript
Message       Rfc822          A MIME RFC 822 message
              Partial         Message has been split for transmission
              External-body   Message itself must be fetched over the net
Multipart     Mixed           Independent parts in the specified order
              Alternative     Same message in different formats
              Parallel        Parts must be viewed simultaneously
              Digest          Each part is a complete RFC 822 message

Figure 4.17 The MIME types and subtypes defined in RFC 2045


The text/enriched subtype allows a simple markup language to be included in the text. This language provides a system-independent way to express boldface, italics, smaller and larger point sizes, indentation, justification, sub- and super-scripting, and simple page layout. The markup language is based on SGML, the Standard Generalized Markup Language also used as the basis for the World Wide Web's HTML. For example, the message

The <bold> time </bold> has come, the <italic> walrus </italic> said ...

would be displayed as

The time has come, the walrus said ...

It is up to the receiving system to choose the appropriate rendition. If boldface and italics are available, they can be used; otherwise, colors, blinking, underlining, reverse video, etc. can be used for emphasis. Different systems can, and do, make different choices. When the web became popular, a new subtype text/html was added (in RFC 2854) to allow web pages to be sent in RFC 822 e-mail. A subtype for the extensible markup language, text/xml, is defined in RFC 3023.

The next MIME type is image, which is used to transmit still pictures. Many formats are widely used for storing and transmitting images nowadays, both with and without compression. Two of these, GIF and JPEG, are built into nearly all browsers, but many others exist as well and have been added to the original list.

The audio and video types are for sound and moving pictures, respectively. Please note that video includes only the visual information, not the soundtrack. If a movie with sound is to be transmitted, the video and audio portions may have to be transmitted separately, depending on the encoding system used. The first video format defined was the one devised by the modestly-named Moving Pictures Experts Group (MPEG), but others have been added since. In addition to audio/basic, a new audio type, audio/mpeg, was added in RFC 3003 to allow people to e-mail MP3 audio files.

The application type is a catchall for formats that require external processing not covered by one of the other types. An octet-stream is just a sequence of uninterpreted bytes. Upon receiving such a stream, a user agent should probably display it by suggesting to the user that it be copied to a file and prompting for a file name. Subsequent processing is then up to the user. The other defined subtype is postscript, which refers to the PostScript language defined by Adobe Systems and widely used for describing printed pages. Many printers have built-in PostScript interpreters. Although a user agent can just call an external PostScript interpreter to display incoming PostScript files, doing so is not without some danger. PostScript is a full-blown programming language. Given enough time, a sufficiently masochistic person could write a C compiler or a database management system in PostScript. Displaying an incoming PostScript message is done by executing the PostScript program contained in it. In addition to displaying some text, this program can read, modify, or delete the user's files, and have other nasty side effects.

The message type allows one message to be fully encapsulated inside another. This scheme is useful for forwarding e-mail, for example. When a complete RFC 822 message is encapsulated inside an outer message, the rfc822 subtype should be used. The partial subtype makes it possible to break an encapsulated message into pieces and send them separately (for example, if the encapsulated message is too long). Parameters make it possible to reassemble all the parts at the destination in the correct order. Finally, the external-body subtype can be used for very long messages (e.g., video films). Instead of including the MPEG file in the message, an FTP address is given and the receiver's user agent can fetch it over the network at the time it is needed. This facility is especially useful when sending a movie to a mailing list of people, only a few of whom are expected to view it (think of electronic junk mail containing advertising videos).

The final type is multipart, which allows a message to contain more than one part, with the beginning and end of each part being clearly delimited. The mixed subtype allows each part to be different, with no additional structure imposed. Many e-mail programs allow the user to provide one or more attachments to a text message. These attachments are sent using the multipart type. In contrast to mixed, the alternative subtype allows the same message to be included multiple times but expressed in two or more different media. For example, a message could be sent in plain ASCII, in enriched text, and in PostScript. A properly designed user agent getting such a message would display it in PostScript if possible. Second choice would be enriched text. If neither of these were possible, the flat ASCII text would be displayed. The parts should be ordered from simplest to most complex to help recipients with pre-MIME user agents make some sense of the message (e.g., even a pre-MIME user can read flat ASCII text).

A multipart message is shown in Figure 4.18. Here a birthday greeting is transmitted both as text and as a song. If the receiver has an audio capability, the user agent there will fetch the sound file, birthday.snd, and play it. If not, the lyrics are displayed on the screen in stony silence. The parts are delimited by two hyphens followed by a (software-generated) string in the boundary parameter.





Note that the Content-Type header occurs in three positions within this example. At the top level, it indicates that the message has multiple parts. Within each part, it gives the type and subtype of that part. Finally, within the body of the second part, it is required to tell the user agent what kind of external file it is to fetch. To indicate this slight difference in usage, we have used lower-case letters here, although all headers are case insensitive. The content-transfer-encoding is similarly required for any external body that is not encoded as 7-bit ASCII.

Getting back to the subtypes for multipart messages, two more possibilities exist. The parallel subtype is used when all parts must be viewed simultaneously. For example, movies often have an audio channel and a video channel. Movies are more effective if these two channels are played back in parallel, instead of consecutively. Finally, the digest subtype is used when many messages are packed together into a composite message. For example, some discussion groups on the Internet collect messages from subscribers and then send them out to the group as a single multipart/digest message.

From: elinor@abcd.com
To: Carolyn@xyz.com
MIME-Version: 1.0
Message-Id: 0704760941.AA00747@abcd.com
Content-Type: multipart/alternative; boundary=qwertyuiopasdfghjklzxcvbnm
Subject: Earth orbits sun integral number of times

This is the preamble. The user agent ignores it. Have a nice day.

--qwertyuiopasdfghjklzxcvbnm
Content-Type: text/enriched

Happy birthday to you
Happy birthday to you
Happy birthday to dear <bold> Carolyn </bold>
Happy birthday to you

--qwertyuiopasdfghjklzxcvbnm
Content-Type: message/external-body; access-type=anon-ftp; site=bicycle.abcd.com; directory=pub; name=birthday.snd

Content-type: audio/basic
Content-transfer-encoding: base64

--qwertyuiopasdfghjklzxcvbnm--


Figure 4.18 A multipart message containing enriched and audio alternatives
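A message with the same multipart/alternative structure as Figure 4.18 can be produced with Python's standard email package, as the sketch below shows; the addresses are the hypothetical ones from the figure, and a simple HTML part stands in for the enriched-text and external-body parts.

```python
# A sketch of building a multipart/alternative message like Figure 4.18.
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "elinor@abcd.com"
msg["To"] = "Carolyn@xyz.com"
msg["Subject"] = "Earth orbits sun integral number of times"

# Plain-text part (simplest alternative, listed first)...
msg.set_content("Happy birthday to you\nHappy birthday to dear Carolyn\n")
# ...and a richer alternative; the receiving user agent picks the best
# form it can display.
msg.add_alternative(
    "<p>Happy birthday to dear <b>Carolyn</b></p>", subtype="html"
)
print(msg)   # shows MIME-Version, Content-Type and the generated boundary
```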

Have you understood?
1. What is the difference between the Cc: and Bcc: headers of RFC 822?
2. What is the purpose of the Reply-To: header?
3. What type of messages cannot be interpreted by RFC 822 properly?
4. What is the necessity of the MIME standard?
5. What is the multimedia support provided by the MIME standard?

4.8 MULTIMEDIA APPLICATIONS

This section focuses on the transfer of real-time data such as voice and video over an IP network. In addition to discussing the protocols used to transport such data, this section considers two broader issues. First, it examines the question of how IP can be used to provide commercial telephone service. Second, it examines the question of how routers in an IP network can guarantee sufficient service to provide high-quality video and audio reproduction.

Although it was designed and optimized to transport data, IP has successfully carried audio and video since its inception. In fact, researchers began to experiment with audio transmission across the ARPANET before the Internet was in place. By the 1990s, commercial radio stations were sending audio across the Internet, and software was available that allowed an individual to send audio across the Internet or to the standard telephone network. Commercial telephone companies also began using IP technology internally to carry voice.

4.8.1 Audio Clips and Encoding Standards

The simplest way to transfer audio across an IP network consists of digitizing an analog audio signal to produce a data file, using a conventional protocol to transfer the file, and then decoding the digital file to reproduce the original analog signal. Of course, the technique does not work well for interactive exchange because placing coded audio in a file and transferring the file introduces a long delay. Thus, file transfer is typically used to send short audio recordings, which are known as audio clips.

Special hardware is used to form high-quality digitized audio. Known as a coder/decoder (codec), the device can convert in either direction between an analog audio signal and an equivalent digital representation. The most common type of codec, a waveform coder, measures the amplitude of the input signal at regular intervals and converts each sample to a digital value; in the other direction, it accepts a sequence of integers as input and recreates the continuous analog signal that matches the digital values.

Several digital encoding standards exist, with the main tradeoff being between quality of reproduction and the size of the digital representation. For example, the conventional telephone system uses the Pulse Code Modulation (PCM) standard, which specifies taking an 8-bit sample every 125 microseconds (i.e., 8000 times per second). As a result, a digitized telephone call produces data at a rate of 64 Kbps. The PCM encoding produces a surprising amount of output: storing a 128-second audio clip requires one megabyte of memory.

There are three ways to reduce the amount of data generated by digital encoding: take fewer samples per second, use fewer bits to encode each sample, or use a digital compression scheme to reduce the size of the resulting output. Various systems exist that use one or more of the techniques, making it possible to find products that produce encoded audio at a rate of only 2.2 Kbps. However, each technique has disadvantages. The chief disadvantage of taking fewer samples or using fewer bits to encode a sample is lower quality audio: the system cannot reproduce as large a range of sounds. The chief disadvantage of compression is delay: digitized output must be held while it is compressed. Furthermore, because greater reduction in size requires more processing, the best compression either requires a fast CPU or introduces longer delay. Thus, compression is most useful when delay is unimportant (e.g., when the output from a codec is being stored in a file).
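The PCM figures quoted above can be checked with a few lines of Python:

```python
# Checking the PCM numbers: 8-bit samples taken 8000 times per second.
SAMPLE_RATE = 8000            # samples per second (one every 125 microseconds)
BITS_PER_SAMPLE = 8

bit_rate = SAMPLE_RATE * BITS_PER_SAMPLE          # 64,000 bits per second
clip_seconds = 128
clip_bytes = bit_rate * clip_seconds // 8         # 1,024,000 bytes, about 1 MB

print(bit_rate, clip_bytes)   # 64000 1024000
```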


4.8.2 Audio and Video Transmission and Reproduction

Many audio and video applications are classified as real-time because they require timely transmission and delivery. For example, an interactive telephone call is a real-time exchange because audio must be delivered without significant delay or users find the system unsatisfactory. Timely transfer means more than low delay, because the resulting signal is unintelligible unless it is presented in exactly the same order as the original, and with exactly the same timing. Thus, if a sender takes a sample every 125 microseconds, the receiver must convert digital values to analog at exactly the same rate.

How can a network guarantee that the stream is delivered at exactly the same rate the sender used? The conventional telephone system introduced one answer: an isochronous architecture. Isochronous design means that the entire system, including the digital circuits, must be engineered to deliver output with exactly the same timing as was used to generate input. Thus, an isochronous system that has multiple paths between any two points must be engineered so all paths have exactly the same delay.






An IP internet is not isochronous. We have already seen that datagrams can be duplicated, delayed, or arrive out of order. Variance in delay is called jitter, and it is especially pervasive in IP networks. To allow meaningful transmission and reproduction of digitized signals across a network with IP semantics, additional protocol support is required. To handle datagram duplication and out-of-order delivery, each transmission must contain a sequence number. To handle jitter, each transmission must contain a timestamp that tells the receiver at which time the data in the packet should be played back. Separating sequence and timing information allows a receiver to reconstruct the signal accurately, independent of how the packets arrive. Such timing information is especially critical when a datagram is lost or if the sender stops encoding during periods of silence; it allows the receiver to pause during playback for the amount of time specified by the timestamps.

Jitter and Playback Delay

How can a receiver recreate a signal accurately if the network introduces jitter? The receiver must implement a playback buffer, as Figure 4.19 illustrates.
(Figure: a playback buffer into which items are inserted at a variable rate and from which items are extracted at a fixed rate.)

Figure 4.19 The conceptual organization of a playback buffer

When a session begins, the receiver delays playback and places incoming data in the buffer. When the data in the buffer reaches a predetermined threshold, known as the playback point, output begins. The playback point, labeled K in the figure, is measured in time units of data to be played. Thus, playback begins when a receiver has accumulated K time units' worth of data. As playback proceeds, datagrams continue to arrive. If there is no jitter, new data will arrive at exactly the same rate old data is being extracted and played, meaning the buffer will always contain exactly K time units of unplayed data. If a datagram experiences a small delay, playback is unaffected. The buffer size decreases steadily as data is extracted, and playback continues uninterrupted for K time units. When the delayed datagram arrives, the buffer is refilled.

Of course, a playback buffer cannot compensate for datagram loss. In such cases, playback eventually reaches an unfilled position in the buffer, and output pauses for a time period corresponding to the missing data. Furthermore, the choice of K is a compromise between loss and delay. If K is too small, a small amount of jitter causes the system to exhaust the playback buffer before the needed data arrives. If K is too large, the system remains immune to jitter, but the extra delay, when added to the transmission delay in the underlying network, may be noticeable to users. Despite the disadvantages, most applications that send real-time data across an IP internet depend on playback buffering as the primary solution for jitter.

Have you understood?
1. What is the limitation of the basic digitization technique in interactive exchange between the client and server?
2. What is the function of a codec?
3. Mention the sampling rate and data rate of PCM.
4. Define jitter.
5. What is meant by a playback application?

4.9 REAL-TIME TRANSPORT PROTOCOL (RTP)


The protocol used to transmit digitized audio or video signals over an IP internet is known as the Real-Time Transport Protocol (RTP). Interestingly, RTP does not contain mechanisms that ensure timely delivery; such guarantees must be made by the underlying system. Instead, RTP provides two key facilities: a sequence number in each packet that allows a receiver to detect out-of-order delivery or loss, and a timestamp that allows a receiver to control playback. Because RTP is designed to carry a wide variety of real-time data, including both audio and video, RTP does not enforce a uniform interpretation of the semantics. Instead, each packet begins with a fixed header; fields in the header specify how to interpret the remaining header fields and how to interpret the payload. Figure 4.20 illustrates the format of RTP's fixed header.

VER | P | X | CC | M | PTYPE | SEQUENCE NUM
TIMESTAMP
SYNCHRONIZATION SOURCE IDENTIFIER
CONTRIBUTING SOURCE ID ...

Figure 4.20 Illustration of the fixed header used in RTP
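The fixed header in Figure 4.20 occupies twelve octets and can be unpacked with a few lines of Python; the sketch below follows the field layout described in the following paragraphs and uses an invented sample packet.

```python
# A minimal sketch of parsing the 12-byte fixed RTP header (RFC 3550 layout).
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Return the fixed-header fields of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet too short")
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version":      first >> 6,           # VER (2 bits)
        "padding":      (first >> 5) & 0x1,   # P
        "extension":    (first >> 4) & 0x1,   # X
        "csrc_count":   first & 0x0F,         # CC
        "marker":       second >> 7,          # M
        "payload_type": second & 0x7F,        # PTYPE
        "sequence":     seq,                  # SEQUENCE NUM
        "timestamp":    timestamp,            # TIMESTAMP
        "ssrc":         ssrc,                 # SYNCHRONIZATION SOURCE IDENTIFIER
    }

# Example: a version-2 packet, payload type 0 (PCM audio), sequence 1.
sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x1234ABCD)
assert parse_rtp_header(sample)["version"] == 2
```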

As the figure shows, each packet begins with a two-bit RTP version number in field VER; the current version is 2. The sixteen-bit SEQUENCE NUM field contains a sequence number for the packet; the first sequence number in a particular session is chosen at random. Some applications define an optional header extension to be placed between the fixed header and the payload. If the application type allows an extension, the X bit is used to specify whether the extension is present in the packet. The interpretation of most of the remaining fields in the header depends on the seven-bit PTYPE field that specifies the payload type. The P bit specifies whether zero padding follows the payload; it is used with encryption that requires data to be allocated in fixed-size blocks. Interpretation of the M (marker) bit also depends on the application; it is used by applications that need to mark points in the data stream (e.g., the beginning of each frame when sending video).

The payload type also affects the interpretation of the TIMESTAMP field. A timestamp is a 32-bit value that gives the time at which the first octet of digitized data was sampled, with the initial timestamp for a session chosen at random. The standard specifies that the timestamp is incremented continuously, even during periods when no signal is detected and no values are sent, but it does not specify the exact granularity. Instead, the granularity is determined by the payload type, which means that each application can choose a clock granularity that allows a receiver to position items in the output with accuracy appropriate to the application. For example, if a stream of audio data is being transmitted over RTP, a logical timestamp granularity of one clock tick per sample is appropriate. However, if video data is being transmitted, the timestamp granularity needs to be higher than one tick per frame to achieve smooth playback. In any case, the standard allows the timestamp in two packets to be identical if the data in the two packets was sampled at the same time.

4.9.1 Streams, Mixing, and Multicasting

A key part of RTP is its support for translation (i.e., changing the encoding of a stream at an intermediate station) or mixing (i.e., receiving streams of data from multiple sources, combining them into a single stream, and sending the result). To understand the need for mixing, imagine that individuals at multiple sites participate in a conference call using IP. To minimize the number of RTP streams, the group can designate a mixer, and arrange for each site to establish an RTP session to the mixer. The mixer combines the audio streams (possibly by converting them back to analog and resampling the resulting signal), and sends the result as a single digital stream.

Fields in the RTP header identify the sender and indicate whether mixing occurred. The field labeled SYNCHRONIZATION SOURCE IDENTIFIER specifies the source of a stream. Each source must choose a unique 32-bit identifier; the protocol includes a mechanism for resolving conflicts if they arise. When a mixer combines multiple streams, the mixer becomes the synchronization source for the new stream. Information about the original sources is not lost, however, because the mixer uses the variable-size CONTRIBUTING SOURCE ID field to provide the synchronization IDs of the streams that were mixed together. The four-bit CC field gives a count of contributing sources; a maximum of 15 sources can be listed.





RTP is designed to work with IP multicasting, and mixing is especially attractive in a multicast environment. To understand why, imagine a teleconference that includes many participants. Unicasting requires a station to send a copy of each outgoing RTP packet to each participant. With multicasting, however, a station sends one copy of the packet, which will be delivered to all participants. Furthermore, if mixing is used, all sources can unicast to a mixer, which combines them into a single stream before multicasting. Thus, the combination of mixing and multicast results in substantially fewer datagrams being delivered to each participating host.

4.9.2 RTP Encapsulation


Its name implies that RTP is a transport-level protocol. Indeed, if it functioned like a conventional transport protocol, RTP would require each message to be encapsulated directly in an IP datagram. In fact, RTP does not function like a transport protocol; although direct encapsulation in IP is allowed, it does not occur in practice. Instead, RTP runs over UDP, meaning that each RTP message is encapsulated in a UDP datagram. The chief advantage of using UDP is concurrency: a single computer can have multiple applications using RTP without interference. Unlike many of the application protocols we have seen, RTP does not use a reserved UDP port number. Instead, a port is allocated for use with each session, and the remote application must be informed about the port number. By convention, RTP chooses an even numbered UDP port; the following section explains that a companion protocol, RTCP, uses the next port number.
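The encapsulation just described amounts to placing an RTP message in the payload of a UDP datagram sent from an even-numbered port. The sketch below illustrates the idea only; the addresses, ports and payload are hypothetical, and no real RTP session is negotiated.

```python
# A sketch (not a full RTP implementation) of RTP-over-UDP encapsulation.
import socket
import struct

rtp_header = struct.pack("!BBHII", 0x80, 0x00, 1, 0, 0x1234ABCD)
payload = b"\x00" * 160                      # e.g. 20 ms of 8-bit PCM samples
packet = rtp_header + payload

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5004))                 # even port for RTP (RTCP would use 5005)
sock.sendto(packet, ("192.0.2.10", 5004))    # hypothetical receiver address
sock.close()
```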

4.9.3 RTP Control Protocol

So far, our description of real-time transmission has focused on the protocol mechanisms that allow a receiver to reproduce content. However, another aspect of real-time transmission is equally important: monitoring of the underlying network during the session and providing out-of-band communication between the end points. Such a mechanism is especially important in cases where adaptive schemes are used. For example, an application might choose a lower-bandwidth encoding when the underlying network becomes congested, or a receiver might vary the size of its playback buffer when network delay or jitter changes. Finally, an out-of-band mechanism can be used to send information in parallel with the real-time data (e.g., captions to accompany a video stream).

A companion protocol and integral part of RTP, known as the RTP Control Protocol (RTCP), provides the needed control functionality. RTCP allows senders and receivers to transmit a series of reports to one another that contain additional information about the data being transferred and the performance of the network. RTCP messages are encapsulated in UDP for transmission, and are sent using a port number one greater than the port number of the RTP stream to which they pertain. RTCP uses five basic message types to allow senders and receivers to exchange information about a session. Figure 4.21 lists the types.

Type    Meaning
200     Sender report
201     Receiver report
202     Source description message
203     Bye message
204     Application specific message

Figure 4.21 The five RTCP message types

The bye and application specific messages are the most straightforward. A sender transmits a bye message when shutting down a stream. The application specific message type provides an extension of the basic facility to allow an application to define its own message types. For example, an application that sends a closed caption to accompany a video stream might choose to define an RTCP message that supports closed captioning.

Receivers periodically transmit receiver report messages that inform the source about the conditions of reception. Receiver reports are important for two reasons. First, they allow all receivers participating in a session, as well as the sender, to learn about reception conditions in the network. Second, receivers adapt their rate of reporting to avoid using excessive bandwidth and overwhelming the sender. The adaptive scheme guarantees that the total control traffic will remain less than 5% of the real-time data traffic, and that receiver reports generate less than 75% of the control traffic. Each receiver report identifies one or more synchronization sources and contains a separate section for each. A section specifies the highest sequence number packet received from the source, the cumulative and percentage packet loss experienced, the time since the last RTCP report arrived from the source, and the inter-arrival jitter.

Senders periodically transmit a sender report message that provides an absolute timestamp. To understand the need for such a timestamp, recall that RTP allows each stream to choose a granularity for its timestamp and that the first timestamp is chosen at random. The absolute timestamp in a sender report is essential because it provides the only mechanism a receiver has to synchronize multiple streams. In particular, because RTP requires a separate stream for each media type, the transmission of video and accompanying audio requires two streams. The absolute timestamp information allows a receiver to play the two streams simultaneously.
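The bandwidth limits mentioned above imply a minimum spacing between reports. The following rough sketch (not the full RFC 3550 interval algorithm) shows the arithmetic; the packet size and session parameters are assumptions chosen only for illustration.

```python
# A rough illustration of how the 5% / 75% limits translate into a report
# interval.  All numbers are assumptions, not values from any standard.
def rtcp_report_interval(session_bw_bps: float,
                         n_receivers: int,
                         avg_rtcp_packet_bits: float = 800.0) -> float:
    """Approximate seconds between receiver reports for one participant."""
    rtcp_bw = 0.05 * session_bw_bps           # control traffic <= 5% of data traffic
    receiver_share = 0.75 * rtcp_bw           # receivers use at most 75% of that
    # All receivers share the receiver portion of the RTCP bandwidth.
    return (n_receivers * avg_rtcp_packet_bits) / receiver_share

# A 64 kbps audio session with 20 listeners: roughly one report every 6.7 s.
print(round(rtcp_report_interval(64_000, 20), 1))
```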





In addition to the periodic sender report messages, senders also transmit source description messages, which provide general information about the user who owns or controls the source. Each message contains one section for each outgoing RTP stream; the contents are intended for humans to read. For example, the only required field consists of a canonical name for the stream owner, a character string in the form

user@host

where host is either the domain name of the computer or its IP address in dotted decimal form, and user is a login name. Optional fields in the source description contain further details such as the user's e-mail address (which may differ from the canonical name), telephone number, the geographical location of the site, the application program or tool used to create the stream, or other textual notes about the source.

Have you understood?
1. What is RTP?
2. Does RTP itself have any mechanism to ensure timely delivery? Justify your answer.
3. What is the purpose of the SEQUENCE NUM field in the RTP header?
4. What is the purpose of the TIMESTAMP field in the RTP header?
5. Is RTP a transport layer protocol or an application layer protocol? Justify your answer.
6. Is RTCP an independent protocol or not? Justify your answer.
7. What are the five message types of RTCP?
8. What type of QoS report is provided by RTCP?

4.10 IP TELEPHONY AND SIGNALING


One aspect of real-time transmission stands out as especially important: the use of IP as the foundation for telephone service. Known as IP telephony or voice over IP, the idea is endorsed by many telephone companies. The question arises: what additional technologies are needed before IP can be used in place of the existing isochronous telephone system? Although no simple answer exists, researchers are investigating three components. First, we have seen that a protocol like RTP is needed to transfer a digitized signal across an IP internet correctly. Second, a mechanism is needed to establish and terminate telephone calls. Third, researchers are exploring ways an IP internet can be made to function like an isochronous network.

The telephone industry uses the term signaling to refer to the process of establishing a telephone call. Specifically, the signaling mechanism used in the conventional Public Switched Telephone Network (PSTN) is Signaling System 7 (SS7). SS7 performs call routing before any audio is sent. Given a telephone number, it forms a circuit through the network, rings the designated telephone, and connects the circuit when the phone is answered. SS7 also handles details such as call forwarding and error conditions such as the destination phone being busy.

Before IP can be used to make phone calls, signaling functionality must be available. Furthermore, to enable adoption by the phone companies, IP telephony must be compatible with extant telephone standards: it must be possible for the IP telephony system to interoperate with the conventional phone system at all levels. Thus, it must be possible to translate the signaling used with IP to and from SS7, and to translate between the voice encodings used (such as standard PCM). As a consequence, the two signaling mechanisms must have equivalent functionality.

The general approach to interoperability uses a gateway between the IP phone system and the conventional phone system. A call can be initiated on either side of the gateway. When a signaling request arrives, the gateway translates and forwards the request; the gateway must also translate and forward the response. Finally, after signaling is complete and a call has been established, the gateway must forward voice in both directions, translating from the encoding used on one side to the encoding used on the other.

Two groups have proposed standards for IP telephony. The ITU has defined a suite of protocols known as H.323, and the IETF has proposed a signaling protocol known as the Session Initiation Protocol (SIP). The next sections summarize the two approaches.

4.10.1 H.323 Standards

The ITU originally created H.323 to allow the transmission of voice over local area network technologies. The standard has been extended to allow transmission of voice over IP internets, and telephone companies are expected to adopt it. H.323 is not a single protocol. Instead, it specifies how multiple protocols can be combined to form a functional IP telephony system. For example, in addition to gateways, H.323 defines devices known as gatekeepers that each provide a contact point for telephones using IP. To obtain permission to place outgoing calls and to enable the phone system to correctly route incoming calls, each IP telephone must register with a gatekeeper; H.323 includes the necessary protocols.

In addition to specifying a protocol for the transmission of real-time voice and video, the H.323 framework allows participants to transfer data. Thus, a pair of users engaged in an audio-video conference can also share an on-screen whiteboard, send still images, or exchange copies of documents. H.323 relies on the four major protocols listed in Figure 4.22.





Protocol   Purpose
H.225.0    Signaling used to establish a call
H.245      Control and feedback during the call
RTP        Real-time data transfer (sequence and timing)
T.120      Exchange of data associated with a call


Figure 4.22. The protocols used by H.323 for IP telephony

Together, the suite of protocols covers all aspects of IP telephony, including phone registration, signaling, real-time data encoding and transfer (both voice and video), and control. Figure 4.23 illustrates the relationships among the protocols that comprise H.323. As the figure shows, the entire suite ultimately depends on UDP and TCP running over IP.
(Figure: audio/video applications run over audio and video codecs, which use RTP and RTCP over UDP; signaling and control use H.225.0 registration and signaling and H.245 control over TCP; data applications use T.120; everything runs over IP.)

Figure 4.23 Relationship among protocols of H.323

4.10.2 Session Initiation Protocol (SIP)

The IETF has proposed an alternative to H.323 called the Session Initiation Protocol (SIP) that only covers signaling; it does not recommend specific codecs, nor does it require the use of RTP for real-time transfer. Thus, SIP does not supply all of H.323's functionality.

SIP uses client-server interaction, with servers being divided into two types. A user agent server runs in a SIP telephone. It is assigned an identifier (e.g., user@site), and can receive incoming calls. The second type of server is intermediate (i.e., between two SIP telephones) and handles tasks such as call setup and call forwarding. An intermediate server functions either as a proxy server that can forward an incoming call request to the next proxy server along the path to the phone, or as a redirect server that tells a caller how to reach the destination.

To provide information about a call, SIP relies on a companion protocol, the Session Description Protocol (SDP). SDP is especially important in a conference call, because participants join and leave the call dynamically. SDP specifies details such as the media encoding, protocol port number, and multicast address.






4.10.3 Resource Reservation and Quality of Service

The term Quality of Service (QoS) refers to statistical performance guarantees that a network system can make regarding loss, delay, throughput, and jitter. An isochronous network that is engineered to meet strict performance bounds is said to provide QoS guarantees, while a packet-switched network that uses best-effort delivery is said to provide no QoS guarantee.

Is guaranteed QoS needed for real-time transfer of voice and video over IP? If so, how should it be implemented? A major controversy surrounds the two questions. On one hand, engineers who designed the telephone system insist that toll-quality voice reproduction requires the underlying system to provide QoS guarantees about delay and loss for each phone call. On the other hand, engineers who designed IP insist that the Internet works reasonably well without QoS guarantees and that adding per-flow QoS is infeasible because it will make routers both expensive and slow.

The QoS controversy has produced many proposals, implementations, and experiments. Although it operates without QoS, the Internet is already used to send audio. Technologies like ATM that were derived from the telephone system model provide QoS guarantees for each individual connection. Finally, the IETF adopts a conservative differentiated services approach that divides traffic into separate QoS classes. The differentiated services scheme sacrifices fine-grain control for less complex forwarding.

4.10.4 QoS, Utilization, and Capacity

The debate over QoS is reminiscent of earlier debates on resource allocation such as those waged over operating system policies for memory allocation and processor scheduling. The central issue is utilization: when a network has sufficient resources for all traffic, QoS constraints are unnecessary; when traffic exceeds network capacity, no QoS system can satisfy all users' demands. That is, a network with 1% utilization does not need QoS, and a network with 101% utilization will fail under any QoS scheme. In between these extremes, QoS mechanisms achieve two goals. First, by dividing the existing resources among more users, they make the system more fair. Second, by shaping the traffic from each user, they allow the network to run at higher utilization without danger of collapse.

One of the major arguments against complicated QoS mechanisms arises from improvements in the performance of underlying networks. Network capacity has increased dramatically. As long as rapid increases in capacity continue, QoS mechanisms merely represent unnecessary overhead. However, if demand rises more rapidly than capacity, QoS may become an economic issue: by associating higher prices with higher levels of service, ISPs can use cost to ration capacity.





4.10.4.1 RSVP

If QoS is needed, how can an IP network provide it? Before announcing the differentiated services solution, the IETF worked on a scheme that can be used to provide QoS in an IP environment. The work produced a pair of protocols: the Resource Reservation Protocol (RSVP) and the Common Open Policy Services (COPS) protocol.

QoS cannot be added to IP at the application layer. Instead, the basic infrastructure must change: routers must agree to reserve resources (e.g., bandwidth) for each flow between a pair of endpoints. There are two aspects. First, before data is sent, the endpoints must send a request that specifies the resources needed, and all routers along the path must agree to supply the resources; the procedure can be viewed as a form of signaling. Second, as datagrams traverse the flow, routers need to monitor and control traffic forwarding. Monitoring, sometimes called traffic policing, is needed to ensure that the traffic sent on a flow does not exceed the specified bounds. Control of queuing and forwarding is needed for two reasons: the router must implement a queuing policy that meets the guaranteed bounds on delay, and the router must smooth packet bursts. The latter is sometimes referred to as traffic shaping, and is necessary because network traffic is often bursty. For example, a flow that specifies an average throughput of 1 Mbps may have 2 Mbps of traffic for a millisecond followed by no traffic for a millisecond. A router can reshape the burst by temporarily queuing incoming datagrams and sending them at a steady rate of 1 Mbps. A sketch of this shaping idea follows this section's reservation description.

RSVP handles reservation requests and replies. It is not a routing protocol, nor does it enforce policies once a flow has been established. Instead, RSVP operates before any data is sent. To initiate an end-to-end flow, an endpoint first sends an RSVP path message to determine the path to the destination; the datagram carrying the message uses the router alert option to guarantee that routers examine the message. After it receives a reply to its path message, the endpoint sends a request message to reserve resources for the flow. The request specifies the QoS bounds desired; each router that forwards the request on toward the destination must agree to reserve the resources the request specifies. If any router along the path denies the request, the router uses RSVP to send a negative reply back to the source. If all systems along the path agree to honor the request, RSVP returns a positive reply.

Each RSVP flow is simplex (i.e., unidirectional). If an application requires QoS guarantees in two directions, each endpoint must use RSVP to request a flow. Because RSVP uses existing routing, there is no guarantee that the two flows will pass through the same routers, nor does approval of a flow in one direction imply approval in the other.
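Traffic policing and shaping of the kind described above is commonly implemented with a token bucket. The sketch below is a minimal illustration of that idea, not the mechanism mandated by RSVP; the rates and packet sizes are assumptions.

```python
# A minimal token-bucket sketch of traffic shaping/policing.
# Bursts are admitted only as fast as tokens accumulate at the average rate.
class TokenBucket:
    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate = rate_bps          # long-term average rate (e.g. 1 Mbps)
        self.capacity = burst_bits    # largest burst passed unshaped
        self.tokens = burst_bits
        self.last = 0.0

    def conforms(self, now: float, packet_bits: float) -> bool:
        """Return True if the packet may be forwarded immediately."""
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False                  # otherwise queue (shape) or drop (police)

shaper = TokenBucket(rate_bps=1_000_000, burst_bits=10_000)
print(shaper.conforms(0.001, 12_000))   # burst larger than the bucket -> False
print(shaper.conforms(0.010, 8_000))    # after tokens accumulate -> True
```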







4.10.4.2 COPS

When an RSVP request arrives, a router must evaluate two aspects: feasibility (i.e., whether the router has the resources to satisfy the request) and policies (i.e., whether the request lies within policy constraints). Feasibility is a local decision: a router can decide how to manage the bandwidth, memory, and processing power that is available to it. Policies, however, require coordination: routers must agree to the same set of policies. To implement global policies, the IETF architecture uses a two-level model, with client-server interaction between the levels. When a router receives an RSVP request, it becomes a client that consults a server known as a Policy Decision Point (PDP) to determine whether the request meets policy constraints. The PDP does not handle traffic; it merely evaluates requests to see if they satisfy global policies. If a PDP approves a request, the router must operate as a Policy Enforcement Point (PEP) to ensure traffic does not exceed the approved policy.

The COPS protocol defines the client-server interaction between a router and a PDP (or between a router and a local PDP if the organization has multiple levels of policy servers). Although COPS defines its own message header, the underlying format shares many details with RSVP. In particular, COPS uses the same format as RSVP for individual items in a request message. Thus, when a router receives an RSVP request, it can extract the items related to policy, place them in a COPS message, and send the result to a PDP.

Have you understood?
1. What are the two standards available for voice traffic in IP networks?
2. Is H.323 a single protocol or not? Justify your answer.
3. What is SIP?
4. What is RSVP?
5. What is COPS?

Summary 1. The World Wide Web (WWW) is an open ended information retrieval system for accessing linked documents spread out over millions of machines all over the Internet. Web is one of the services provided by the Internet and is composed of a vast, worldwide collection of documents or Web pages, often just called pages for short. Each page may contain links to other pages anywhere in the world. The client side of the web is the browser program. From the users point of view, browser is a program that is used to fetch and display the web pages. In addition to fetching and displaying web pages, the browser has to catch mouse clicks to items on the displayed page.
216

2.

3.

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

4. 5.

6.

7.

8.

9.

10.

11.

12. 13.

14.

15.

16.

Technically a browser is a HTTP client and HTML interpreter. Most browsers have numerous buttons and features to make it easier to navigate the Web. This is the major difference between text based browsers like Lynx and commercially successful browsers like Netscape Navigator and Internet Explorer. A plug-in is a code module that the browser fetches from a special directory on the disk and installs as an extension to itself. The plug-in runs as a part of the browser. After the plug-in has done its job, the plug-in is removed from the browsers memory. Helper applications are alternate to plug-ins in supporting the file types that cant be interpreted by the browser. A helper application is a complete program and runs as a separate process. It accepts the name of a scratch file where the content file has been stored, opens the file, and displays the contents. A web server accepts a TCP connection from a client (a browser). Then it gets the name of the file requested. The server actually retrieves the file from the disk, returns the file to the client and releases the TCP connection Among the various operations performed by the web server, getting the file from disk becomes the bottleneck in fetching the web page since it involves the secondary storage devices. The data rate at which the secondary storage devices operate is considerably less than that of the rate at which the processor operates. If the web server has a very large client base (with more probability for simultaneous requests), many improvements are required in the basic design of a web server. With such improvements only web servers in e-commerce and e-business applications can cope up with the demand. Some mechanisms to improve the capability of web servers is to maintain a cache in memory of the n most recently used files, making the server multithreaded with or without a front end and using a set of CPUs to create a server farm. Fetching a web page requires three questions to be answered namely What is the page called? Where is the page located? How can the page be accessed? The addressing scheme of the web by name Uniform Resource Locator (URL) is composed of three parts namely protocol (How can the page be accessed?), domain name (Where is the page located?) and the file name (What is the page called?). To fetch a web page, the domain name part of the URL is mapped into its equivalent IP address with the help of DNS, a TCP connection is established at port number 80, the request is made and the response is obtained. URLs have been designed to not only allow users to navigate the web, but to deal with FTP, news, Gopher, e-mail and telnet as well, making all the specialized user interface programs for those other services unnecessary and thus integrating nearly all the Internet access into a single program, the Web browser. The web has been made as a stateless one intentionally to follow the design philosophy Keep It as Simple as Possible. However, the stateless approach
217

NOTES

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES
17.

18.

19.

20.

21.

22.

23.

24.

25.

26.

suffers from many limitations in the context of e-commerce and e-business applications. To overcome the limitation of the stateless nature of the web, Netscape came up with the idea of cookies, which is additional information sent along with the requested page. Browsers store these cookies in a cookie directory on the clients hard disk. Cookies are just files or strings, not executable programs. The protocol used by the client and the server in the process of transferring the web documents or pages is Hyper Text Transfer Protocol (HTTP). The specification of HTTP discusses about the methods that are used by the client and the server through which various types of requests are made and the responses are obtained. In HTTP 1.0, the web client (the browser) establishes a connection with the web server and makes a single request and the server sends the single response over the established connection. Once the reply is sent the connection is released. That is, HTTP 1.0 does not support persistent connections. HTTP 1.1 supports persistent connections. Persistent connections refer to the ability to send additional requests and get additional responses over the same TCP connection. By amortizing the TCP setup and release over multiple requests, the relative overhead due to TCP is much less per request. HTTP is basically a request/response scheme. In addition to the actual method, HTTP messages are to be provided with the required additional information called headers. Some of them are used as request headers and some of them are used as response headers. Few headers can be used as both request as well as response headers. Electronic mail is one of the services provided by the Internet to the users who have an e-mail account to exchange the messages among themselves. The greatest advantage of e-mail is that it reaches the recipient anywhere in the world within few minutes. Mail delivery is a new concept because it differs fundamentally from other services of the Internet. An e-mail system has to support off-line delivery also and hence the application should be able to deliver the message even if the recipient is not on line. To support the off-line delivery the mail system uses a technique known as spooling. In this technique, the system places a copy of the mail in its private storage (spool) area then initiates the transfer to the remote machine as a background activity, allowing the sender to proceed with other computational activities. Basic functions of an e-mail system are composition of messages, transfer of messages, reporting of messages, displaying of messages and disposing of messages. SMTP is the protocol used by the mail servers to transfer the messages among themselves and the end delivery of message sis provided by protocols like POP#, IMAP etc.
27. Messages of e-mail systems consist of a primitive envelope, some number of header fields, a blank line, and then the message body. Each header field consists of a single line of ASCII text containing the field name, a colon, and, for most fields, a value.

28. The RFC 822 standard supports only ASCII messages and does not support non-ASCII messages such as multimedia content and messages in non-Latin alphabets. The basic RFC was revised, and the revised standard is called MIME. The basic idea of MIME is to continue to use the RFC 822 format, but to add structure to the message body and define encoding rules for non-ASCII messages (a short sketch following this summary illustrates the idea).

29. The protocol used to transmit digitized audio or video signals over an IP internet is known as the Real-Time Transport Protocol (RTP). Interestingly, RTP does not contain mechanisms that ensure timely delivery; such guarantees must be made by the underlying system.

30. A companion protocol and integral part of RTP, known as the RTP Control Protocol (RTCP), provides the needed control functionality. RTCP allows senders and receivers to transmit a series of reports to one another that contain additional information about the data being transferred and the performance of the network.
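The following Python sketch is not part of the original summary; the addresses and attachment bytes are invented for illustration. It uses the standard email library to show how a MIME message keeps the RFC 822 layout of header fields, a blank line and a body, while carrying a UTF-8 text part and a binary attachment.

# A minimal sketch, assuming only the Python standard library.
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication

msg = MIMEMultipart()                      # container: Content-Type multipart/mixed
msg["From"] = "alice@example.org"          # header fields: name, colon, value
msg["To"] = "bob@example.org"
msg["Subject"] = "Report with attachment"

# A text body; the utf-8 charset lets the body carry non-Latin alphabets.
msg.attach(MIMEText("Monthly report attached.\n", "plain", "utf-8"))

# A binary attachment; the library applies a base64 transfer encoding for us.
msg.attach(MIMEApplication(b"%PDF-1.4 ...", Name="report.pdf"))

# as_string() yields the wire form: headers, a blank line, then the encoded
# body parts separated by MIME boundaries.
print(msg.as_string()[:400])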


Exercises

1. Write short notes on web based e-mail.
2. Explain the transfer of mail in the following scenarios.
   i. Sender and receiver are in the same system (shared system)
   ii. Sender and receiver are on two different systems
   iii. One of the users is separated from his system (connected through a network to the mail server)
   iv. Both the users are connected to their mail servers through networks.
3. Excluding the connection establishment and termination, what is the minimum number of network round trips to send a small message?
4. TCP is a full-duplex protocol, yet SMTP uses TCP in a half-duplex fashion. The client sends a command, then stops and waits for the reply. Why doesn't the client send multiple commands at once, for example, a single write that contains the HELO, MAIL, RCPT, DATA and QUIT commands?
5. How can the half-duplex operation of SMTP fool the slow start mechanism when the network is running near capacity?
6. Explain the steps involved in displaying web pages that have file types other than HTML.
7. How do Windows and UNIX handle helper applications?
8. What are the functions of a proxy server?
9. What are the built-in shortcuts supported by certain sites in the usage of URLs?
10. Mention the major characteristic features of HTTP.

Answers

1. Electronic mail is such a common application that some websites today provide this service to anyone who accesses the site. Two common sites are Hotmail and Yahoo. The idea is very simple. Mail transfer from A's browser to his mailbox is done through HTTP. The transfer of the message from the sending mail server to the receiving mail server is still through SMTP. Finally, the message from the receiving server (the web server) to B's browser is done through HTTP. Instead of POP3 or IMAP, HTTP is used. When B needs to retrieve his mail, he sends a message to the website. The website sends a form to be filled in by B, which includes the log-in name and the password. If the log-in name and password match, the e-mail is transferred from the web server to B's browser in HTML format.

2. i. When A needs to send a message to B, he runs a User Agent (UA) program to prepare the message and store it in B's mailbox. B can retrieve and read the contents of his mailbox at his convenience using the UA.
   ii. The messages need to be sent over the Internet. Here we need UAs and Mail Transfer Agents (MTAs). A needs to use the UA to send his message to the system at his own site. The system (sometimes called the mail server) at his site uses a queue to store messages waiting to be sent. B also needs a UA program to retrieve messages stored in the mailbox of the system at his site.
   iii. A needs a UA to prepare his message. He then needs to send the message through the LAN or WAN. This can be done through a pair of MTAs (client and server). Whenever A has a message to send, he calls the UA, which in turn calls the MTA client. The MTA client establishes a connection with the MTA server on the system, which is running all the time. The system at A's site queues all messages received. It then uses an MTA client to send the messages to the system at B's site. The system receives the message and reads it. Note that we need two pairs of MTA client-server programs.
   iv. After the message has arrived at B's mail server, B needs to retrieve it. Here, we need another set of client-server agents, which we call Message Access Agents (MAAs). B uses an MAA client to retrieve his messages. The client sends a request to the MAA server, which is running all the time, and requests the transfer of the messages.

3. Six round trips: the HELO command, MAIL, RCPT, DATA, the body of the message and QUIT.

4. This is legal and is called pipelining. Unfortunately, there exist brain-damaged SMTP receiver implementations that clear their input buffer after each command is processed, causing this technique to fail. If this technique is used, naturally the client cannot discard the message until all the replies have been checked to verify that the message was accepted by the server.
5. Consider the round trips because of HELO, MAIL, RCPT, DATA and the body of the message. Each is a small command (probably a single segment) that places little load on the network. If all five make it through to the server without retransmission, the congestion window could be six segments when the body is sent. If the body is large, the client could send the first six segments at once, which the network might not be able to handle.

6. When a web server returns a web page, it also returns some additional information about the page. This information includes the MIME type of the page. Pages of type text/html are just displayed directly, as are pages in a few other built-in types. If the MIME type is not one of the built-in ones, the browser consults its table of MIME types to tell it how to display the page. This table associates a MIME type with a viewer.

7. On Windows, when a program is installed on the computer, it registers the MIME types it wants to handle. That is, the registration process is completely automatic in the Windows operating system. This mechanism leads to conflict when multiple viewers are available for some subtype, such as video/mpg. What happens is that the last program to register overwrites (MIME type, helper application) associations, capturing the type for itself. As a consequence, installing a new program may change the way a browser handles existing types. On UNIX, the registration process is generally not automatic. The user must manually update certain configuration files. Hence, in UNIX, the administrators have better control over plug-ins and helper applications.

8. HTTP supports proxy servers. A proxy server is a computer that keeps copies of responses to recent requests. The HTTP client sends a request to the proxy server. The proxy server checks its cache. If the response is not stored in the cache, the proxy server sends the request to the corresponding server. Incoming responses are sent to the proxy server and stored for future requests from other clients. The proxy server reduces the load on the original server, decreases traffic, and improves latency. However, to use the proxy server, the client must be configured to access the proxy instead of the target server.

9. Many sites have built-in shortcuts for file names. At many sites, a null file name defaults to the organization's main home page. Typically, when the file named is a directory, this implies a file named index.html. Another frequently used shortcut is that ~user/ might be mapped onto the user's WWW directory and then onto the file index.html in that directory.

10. The major characteristic features of HTTP are the following (the sketch after these answers illustrates some of them).
    i. HTTP operates at the application level. It assumes a reliable, connection-oriented transport protocol such as TCP, but does not provide reliability or retransmission itself.
    ii. Once a transport connection has been established, one side (usually a browser) must send an HTTP request to which the other side responds.
    iii. Each HTTP request is self-contained. The server does not keep a history of previous requests or previous sessions.
    iv. In most cases, a browser requests a web page, and the server transfers a copy to the browser. HTTP also allows transfer from a browser to a server (e.g., when a user submits a so-called form).
    v. HTTP allows browsers and servers to negotiate details such as the character set to be used during transfers. A sender can specify the capabilities it offers and a receiver can specify the capabilities it accepts.
    vi. To improve response time, a browser caches a copy of each web page it retrieves. If a user requests a page again, HTTP allows the browser to interrogate the server to determine whether the contents of the page have changed since the copy was cached.
    vii. HTTP allows a machine along the path between a browser and a server to act as a proxy server that caches web pages and answers a browser's request from its cache.
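The following Python sketch is offered as an illustration only; the host name is a placeholder and must be reachable for the code to run. It uses the standard http.client module to show several of the HTTP characteristics listed in answer 10: a request/response exchange with headers, and a second request reusing the same TCP connection, as HTTP 1.1 persistent connections allow.

# A minimal sketch, assuming www.example.com answers plain HTTP on port 80.
import http.client

conn = http.client.HTTPConnection("www.example.com", 80, timeout=10)

# First request: the client supplies request headers; the server answers with
# a status line, response headers, and the body.
conn.request("GET", "/", headers={"Accept": "text/html", "User-Agent": "demo"})
resp = conn.getresponse()
print(resp.status, resp.reason)
print(resp.getheader("Content-Type"))
body = resp.read()                    # must be read before reusing the connection

# Second request over the same TCP connection -- no new three-way handshake.
conn.request("GET", "/index.html", headers={"Accept": "text/html"})
resp2 = conn.getresponse()
print(resp2.status, len(resp2.read()), "bytes")

conn.close()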

UNIT - 5


5.1 INTRODUCTION

All the applications we have discussed in the previous sections work properly only if the underlying network is managed effectively. Moreover, we can expect optimal performance from these applications only if security is ensured in the network. When the size of a network is small, exclusive software for network management may not be required; a small network can be effectively managed using routine network administration and the ICMP protocol. However, in a large network (especially an internetwork) ICMP alone is not sufficient to manage the network. In this unit, we are going to discuss a protocol by name Simple Network Management Protocol (SNMP) that is exclusively meant for network management.

Regarding network security, we are going to discuss a collection of protocols, collectively known as IP Security (IPsec), that provides secure communication in the network. This unit also discusses firewalls, which block all unauthorized communication between computers in the organization and computers outside the organization.

Another feature we discuss in this unit is the future of the TCP/IP protocol stack. This becomes necessary since IPv4 suffers from the major limitations of exhaustion of address space and inability to support applications that are delay sensitive. This unit introduces the various features and issues of the next generation IP, by name IPv6.

5.2 LEARNING OBJECTIVES

To understand the necessity for exclusive software for network management
To learn the functional modules of network management
To study about the basic components of SNMP
To study about the Abstract Syntax Notation
To learn about Structure of Management Information, a subset of ASN.1
To discuss about the Management Information Base
To study about the interaction between the NMS and the Agent
To understand the various message types supported by SNMP
To understand the possible security breaches in a network
To understand the features of IPsec
To study about the implementation of firewalls
To understand the limitations of IPv4
To study about the basic IPv6 header
To study about the extension header of IPv6
To understand the various addressing schemes supported by IPv6
To have an exposure to the features of IPv6 that are debatable

5.3 NETWORK MANAGEMENT

In small to medium sized networks, the need for explicit network management does not arise. However, as the size of the network increases apart from the routine administration, it is necessary to monitor the behavior of the various devices in the network in order to utilize the resources of the network optimally and to achieve the maximum performance. Network management refers to the maintenance and administration of large-scale computer networks and telecommunications networks at the top level. Network management is the execution of the set of functions required for controlling, planning, allocating, deploying, coordinating, and monitoring the resources of a network, including performing functions such as initial network planning, frequency allocation, predetermined traffic routing to support load balancing, cryptographic key distribution authorization, configuration management, fault management, security management, performance management, bandwidth management, and accounting management. Network management means different things to different people. In some cases, it involves a solitary network consultant monitoring network activity with an outdated protocol analyzer. In other cases, network management involves a distributed database, autopolling of network devices, and high-end workstations generating real-time graphical views of network topology changes and traffic. In general, network management is a service that employs a variety of tools, applications, and devices to assist human network managers in monitoring and maintaining networks. The early 1980s saw tremendous expansion in the area of network deployment. As companies realized the cost benefits and productivity gains created by network technology, they began to add networks and expand existing networks almost as rapidly as new network technologies and products were introduced. By the mid-1980s, certain companies were experiencing growing pains from deploying many different (and sometimes incompatible) network technologies. The problems associated with network expansion affect both day-to-day network operation management and strategic network growth planning. Each new network technology requires its own set of experts. In the early 1980s, the staffing requirements alone for managing large, heterogeneous networks created a crisis for many organizations. An urgent need arose for automated network management (including
what is typically called network capacity planning) integrated across diverse environments.

5.3.1 Network Management Architecture


Most network management architectures use the same basic structure and set of relationships. Any Network Management System (NMS) has two basic entities, namely the managing device and the managed device. Managed devices, such as computer systems and other network devices, run software that enables them to send alerts to the managing devices when they recognize problems. Upon receiving these alerts, management entities are programmed to react by executing one, several, or a group of actions, including operator notification, event logging, system shutdown, and automatic attempts at system repair.

Management entities can also poll end stations to check the values of certain variables. Polling can be automatic or user-initiated, but agents in the managed devices respond to all polls. Agents are software modules that first compile information about the managed devices in which they reside, then store this information in a management database, and finally provide it (proactively or reactively) to management entities within NMSs via a network management protocol. Well-known network management protocols include the Simple Network Management Protocol (SNMP) and the Common Management Information Protocol (CMIP). Management proxies are entities that provide management information on behalf of other entities.

Figure 5.1 depicts a typical network management architecture.

Figure 5.1 A Typical Network Management Architecture Maintains Many Relationships
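To make the polling and alerting relationship above concrete, the following Python sketch is offered. It is not taken from any real management product: the device names, readings and threshold are invented, and the agents are simulated by a random-number function rather than queried over the network.

# A conceptual sketch of a managing entity polling agents and reacting to a
# user-defined threshold.
import random

THRESHOLD = 0.8            # e.g. 80% line utilisation

def poll(device):
    """Stand-in for querying an agent; returns the current utilisation."""
    return random.uniform(0.0, 1.0)

def poll_cycle(devices):
    alerts = []
    for device in devices:
        value = poll(device)
        print(f"{device}: utilisation {value:.2f}")
        if value > THRESHOLD:          # the management entity reacts to the event
            alerts.append((device, value))
    return alerts

for device, value in poll_cycle(["router-1", "switch-2", "host-3"]):
    print(f"ALERT: {device} exceeded threshold ({value:.2f} > {THRESHOLD})")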

5.3.2 ISO Network Management Model

The ISO has contributed a great deal to network standardization. Its network management model is the primary means for understanding the major functions of network management systems. This model consists of the following five conceptual areas.

Performance Management

The goal of performance management is to measure and make available various aspects of network performance so that internetwork performance can be maintained at an acceptable level. Examples of performance variables that might be provided include network throughput, user response times, and line utilization.

Performance management involves three main steps. First, performance data is gathered on variables of interest to network administrators. Second, the data is analyzed to determine normal (baseline) levels. Finally, appropriate performance thresholds are determined for each important variable so that exceeding these thresholds indicates a network problem worthy of attention. Management entities continually monitor performance variables. When a performance threshold is exceeded, an alert is generated and sent to the network management system.

Each of the steps just described is part of the process to set up a reactive system. When performance becomes unacceptable because of an exceeded user-defined threshold, the system reacts by sending a message. Performance management also permits proactive methods: for example, network simulation can be used to project how network growth will affect performance metrics. Such simulation can alert administrators to impending problems so that counteractive measures can be taken.

Configuration Management

The goal of configuration management is to monitor network and system configuration information so that the effects on network operation of various versions of hardware and software elements can be tracked and managed. Each network device has a variety of version information associated with it. An engineering workstation, for example, may be configured as follows:

Operating system, Version 3.2
Ethernet interface, Version 5.4
TCP/IP software, Version 2.0
NetWare software, Version 4.1
NFS software, Version 5.1
Serial communications controller, Version 1.1
X.25 software, Version 1.0
SNMP software, Version 3.1


Configuration management subsystems store this information in a database for easy access. When a problem occurs, this database can be searched for clues that may help solve the problem.

Accounting Management

The goal of accounting management is to measure network utilization parameters so that individual or group uses of the network can be regulated appropriately. Such regulation minimizes network problems (because network resources can be apportioned based on resource capacities) and maximizes the fairness of network access across all users.

As with performance management, the first step toward appropriate accounting management is to measure utilization of all important network resources. Analysis of the results provides insight into current usage patterns, and usage quotas can be set at this point. Some correction, of course, will be required to reach optimal access practices. From this point, ongoing measurement of resource use can yield billing information as well as information used to assess continued fair and optimal resource utilization.

Fault Management

The goal of fault management is to detect, log, notify users of, and, to the extent possible, automatically fix network problems to keep the network running effectively. Because faults can cause downtime or unacceptable network degradation, fault management is perhaps the most widely implemented of the ISO network management elements.

Fault management involves first determining symptoms and isolating the problem. Then the problem is fixed and the solution is tested on all important subsystems. Finally, the detection and resolution of the problem is recorded.

Security Management

The goal of security management is to control access to network resources according to local guidelines so that the network cannot be sabotaged (intentionally or unintentionally) and sensitive information cannot be accessed by those without appropriate authorization. A security management subsystem, for example, can monitor users logging on to a network resource and can refuse access to those who enter inappropriate access codes.

Security management subsystems work by partitioning network resources into authorized and unauthorized areas. For some users, access to any network resource is inappropriate, mostly because such users are usually company outsiders. For other (internal) network users, access to information originating from a particular department is inappropriate. Access to Human Resource files, for example, is inappropriate for most users outside the Human Resources department.
Security management subsystems perform several functions. They identify sensitive network resources (including systems, files, and other entities) and determine mappings between sensitive network resources and user sets. They also monitor access points to sensitive network resources and log inappropriate access to sensitive network resources.

Have you understood?

1. What is the necessity of network management in large networks?
2. What are the two basic entities of a Network Management System?
3. What are the functional areas of management according to the ISO/OSI reference model?
4. Give some examples of the metrics that are used to measure the performance of a network.
5. What is the purpose of the network management protocol in an NMS?

5.4 SIMPLE NETWORK MANAGEMENT PROTOCOL

The Simple Network Management Protocol (SNMP) is an application layer protocol that facilitates the exchange of management information between network devices. It is part of the Transmission Control Protocol/Internet Protocol (TCP/IP) protocol suite. SNMP enables network administrators to manage network performance, find and solve network problems, and plan for network growth.

Please note the word simple in SNMP. This implies that this network management protocol is simple in comparison with CMIP. CMIP is a full-fledged network management protocol through which networks based on the ISO/OSI reference model are managed. Like the ISO/OSI model, which never became popular, CMIP also never became popular. Researchers realized that instead of trying to design a full-fledged network management protocol like CMIP, it is enough to design and implement a Simple Network Management Protocol that suits TCP/IP networks.

5.4.1 Basic Components

An SNMP-managed network consists of five key components: managed devices, agents, network-management systems (NMSs) (or management stations), management information and a management protocol. The architecture of SNMP is shown in figure 5.2


Figure 5.2 SNMP Architecture

A managed device is a network node that contains an SNMP agent and that resides on a managed network. Managed devices collect and store management information and make this information available to NMSs using SNMP. Managed devices, sometimes called network elements, can be routers and access servers, switches and bridges, hubs, computer hosts, or printers.

To be managed directly by SNMP, a node must be capable of running an SNMP management process, called an SNMP agent. An agent is a network-management software module that resides in a managed device. An agent has local knowledge of management information and translates that information into a form compatible with SNMP.

Network management is done from management stations, which are, in fact, general-purpose computers running special management software. The management stations contain one or more processes that communicate with agents over the network, issuing commands and getting responses. In this design, all the intelligence is in the

management stations, in order to keep the agents as simple as possible and minimize their impact on the devices they are running on. Many management stations have a graphical user interface to allow the network manager to inspect the status of the network and take action when required.

In order to allow a management station to talk to the diverse components of the network, the nature of the information maintained by all the devices must be rigidly specified. Having the management station ask a router for its packet loss rate is of no use if the router does not keep track of that rate. Therefore, a network management system has to describe the exact information each kind of agent has to maintain and the format in which it has to supply it. The management station interacts with the agents using a management protocol (SNMP). This protocol allows the management station to query the state of an agent's local objects, and change them if necessary. Most of SNMP consists of this query-response type of communication.

5.4.2 Abstract Syntax Notation

In SNMP, each device maintains one or more variables that describe its state. In the SNMP literature, these variables are called objects, but the term is misleading because they are not objects in the sense of an object-oriented language: they have only state and no methods. The heart of the SNMP model is the set of objects managed by the agents and read and written by the management station.

To make multivendor communication possible, it is essential that these objects be defined in a standard and vendor-neutral way. Furthermore, a standard way is needed to encode them for transfer over a network. While definitions in C would satisfy the first requirement, such definitions do not define a bit encoding on the wire in such a way that a 32-bit two's complement little-endian management station can exchange information unambiguously with an agent on a 16-bit one's complement big-endian CPU. For this reason, a standard object definition language, along with encoding rules, is needed. The one used by SNMP is taken from OSI and is called ASN.1 (Abstract Syntax Notation One).

ASN.1 is a formal notation used for describing data transmitted by telecommunications protocols, regardless of language implementation and physical representation of these data, whatever the application, whether complex or very simple. The notation provides a certain number of pre-defined basic types such as: integers (INTEGER), booleans (BOOLEAN), character strings (IA5String, UniversalString...), bit strings (BIT STRING),
etc., and makes it possible to define constructed types such as: structures (SEQUENCE), lists (SEQUENCE OF), choice between types (CHOICE), etc.


Subtyping constraints can also be applied on any ASN.1 type in order to restrict its set of values. Unlike many other syntaxes which claim to be extensible, ASN.1 offers extensibility which addresses the problem of, and provides support for, the interworking between previously deployed systems and newer, updated versions designed years apart. ASN.1 can describe information in any form (audio, video, data, etc.) wherever it needs to be communicated digitally. ASN.1 only covers the structural aspects of information (there are no operators to handle the values once these are defined or to make calculations with); therefore it is not a programming language.

An ASN.1 definition can be contrasted with the concept in ABNF (Augmented Backus-Naur Form) of valid syntax, or in XSD (XML Schema Definition) of a valid document, where the focus is entirely on what are valid encodings of data, without concern for any meaning that might be attached to such encodings, that is, without any of the necessary semantic linkages.

One of the main reasons for the success of ASN.1 is that this notation is associated with several standardized encoding rules such as the BER (Basic Encoding Rules) or, more recently, the PER (Packed Encoding Rules), which prove useful for applications that face restrictions in terms of bandwidth. These encoding rules describe how the values defined in ASN.1 should be encoded for transmission (i.e., how they can be translated into the bytes on the wire and back), regardless of machine, programming language, or how the value is represented in an application program. ASN.1's encodings are more streamlined than many competing notations, enabling rapid and reliable transmission of extensible messages, which is a notable advantage in wireless broadband. Because ASN.1 has been an international standard since 1984, its encoding rules are mature and have a long track record of reliability and interoperability.

An ASN.1 definition can be readily mapped (by a pre-run-time processor) into a C or C++ or Java data structure that can be used by application code, and supported by run-time libraries providing encoding and decoding of representations in either an XML (Extensible Markup Language) or a TLV (Type Length Value) format, or a very compact packed encoding format. Tools on almost all operating systems support ASN.1. ASN.1 also supports popular programming languages such as Java, C and C++, as well as older ones including


COBOL. As an example of ASN.1's universality, there are tools that have been ported to over 150 different computing platforms. There are a lot of well-tested ASN.1 tools that have been used for a long time. Using such tools, there are less likely to be costly delays in bringing new products to market or, even worse, recalls of products based on new code that hasn't been sufficiently tested for flaws. ASN.1 is widely used in industry sectors where efficient (low-bandwidth, low-transaction-cost) computer communications are needed, but it is also used in sectors where XML-encoded data is required (for example, transfer of biometric information).

Suppose a company owns several sales outlets linked to a central warehouse where stocks are maintained and deliveries start from. The company requires that its protocol have the following features:

The orders are collected locally at the sales outlets
They are transmitted to the warehouse, where the delivery procedure should be managed
An account of the delivery must be sent back to the sales outlets for the client's order

This protocol can be specified with the two following ASN.1 modules:

Module-order DEFINITIONS AUTOMATIC TAGS ::=
BEGIN
Order ::= SEQUENCE {
    header Order-header,
    items SEQUENCE OF Order-line }

Order-header ::= SEQUENCE {
    number Order-number,
    date Date,
    client Client,
    payment Payment-method }

Order-number ::= NumericString (SIZE (12))
Date ::= NumericString (SIZE (8)) -- MMDDYYYY

Client ::= SEQUENCE {
    name PrintableString (SIZE (1..20)),


    street PrintableString (SIZE (1..50)) OPTIONAL,
    postcode NumericString (SIZE (5)),
    town PrintableString (SIZE (1..30)),
    country PrintableString (SIZE (1..20)) DEFAULT default-country }

default-country PrintableString ::= "France"

Payment-method ::= CHOICE {
    check NumericString (SIZE (15)),
    credit-card Credit-card,
    cash NULL }

Credit-card ::= SEQUENCE {
    type Card-type,
    number NumericString (SIZE (20)),
    expiry-date NumericString (SIZE (6)) -- MMYYYY
}

Card-type ::= ENUMERATED {
    cb(0), visa(1), eurocard(2), diners(3), american-express(4) }

Order-line ::= SEQUENCE {
    item-code Item-code,
    label Label,
    quantity Quantity,
    price Cents }

Item-code ::= NumericString (SIZE (7))
Label ::= PrintableString (SIZE (1..30))

Quantity ::= CHOICE {
    unites INTEGER,
    millimetres INTEGER,
    milligrammes INTEGER }

Cents ::= INTEGER

Delivery-report ::= SEQUENCE {
    order-code Order-number,
    delivery SEQUENCE OF Delivery-line }

Delivery-line ::= SEQUENCE {
    item Item-code,
    quantity Quantity }

END

Protocol DEFINITIONS AUTOMATIC TAGS ::=
BEGIN
IMPORTS Order, Delivery-report, Item-code, Quantity, Order-number FROM Module-order ;

PDU ::= CHOICE {
    question CHOICE {
        question1 Order,
        question2 Item-code,
        question3 Order-number,
        ... },
    answer CHOICE {
        answer1 Delivery-report,
        answer2 Quantity,
        answer3 Delivery-report,
        ... }}
END

ASN.1 transfer syntax defines how values of ASN.1 types are unambiguously converted to a sequence of bytes for transmission. It is also necessary that the converted bytes be properly decoded on the receiver side. ASN.1 defines the abstract syntax of information but does not restrict the way the information is encoded. Various ASN.1 encoding rules provide the transfer syntax (a concrete representation) of the data values whose abstract syntax is described in ASN.1. The standard ASN.1 encoding rules include:

Basic Encoding Rules (BER)
Canonical Encoding Rules (CER)
Distinguished Encoding Rules (DER)
XML Encoding Rules (XER)
Packed Encoding Rules (PER)
Generic String Encoding Rules (GSER)

ASN.1 together with specific ASN.1 encoding rules facilitates the exchange of structured data, especially between application programs over networks, by describing data structures in a way that is independent of machine architecture and implementation language. Application layer protocols such as X.400 electronic mail, X.500 and LDAP directory services, H.323 VoIP and SNMP use ASN.1 to describe the PDUs they exchange. It is also extensively used in the Access and Non-Access Strata of UMTS. There are many other application domains of ASN.1.


Among the above set of transfer syntaxes BER is the most popular one and used by SNMP. The Basic Encoding Rules were the original rules laid out by the ASN.1 standard for encoding abstract information into a concrete data stream. The rules, collectively referred to as a transfer syntax in ASN.1 parlance, specify the exact octet sequences which are used to encode a given data item. The syntax defines such elements as: the representations for basic data types, the structure of length information, and the means for defining complex or compound types based on more primitive types. The BER syntax, along with two subsets of BER (the Canonical Encoding Rules and the Distinguished Encoding Rules), are defined by the ITU-Ts X.690 standards document, which is part of the ASN.1 document series. The BER format specifies a self-describing and self-delimiting format to encoding the ASN.1 data structures. Each data element is encoded as a type identifier, a length description, the actual data elements, and where necessary, an end-of-content marker. These types of encodings are commonly called type-length-value or TLV encodings. This format allows a receiver to decode the ASN.1 information from an incomplete stream, without requiring any pre-knowledge of the size, content, or semantic meaning of the data. The key difference between the BER format and the CER or DER formats is the flexibility provided by the Basic Encoding Rules. As stated in the X.690 standard, Alternative encodings are permitted by the basic encoding rules as a senders option. Receivers who claim conformance to the basic encoding rules shall support all alternatives. For example, when encoding a constructed value (that is, a value that is made up of multiple smaller, already-encoded values), the sender can use one of three different forms to specify the length of the data. A receiver must be prepared to accept all legal encodings in order to legitimately claim BER-compliance. By contrast, both CER and DER restrict the available length specifications to a single option. There is a common perception of BER as being inefficient compared to alternative encoding rules. It has been argued by some that this perception is primarily due to poor implementations, not necessarily any inherent flaw in the encoding rules. These implementations rely on the flexibility that BER provides to use encoding logic that is easier to implement, but results in a larger encoded data stream than necessary. Whether this inefficiency is reality or perception, it has led to a number of alternative encoding schemes, such as the Packed Encoding Rules, which attempt to improve on BER performance and size. Other alternative formatting rules, which still provide the flexibility of BER but use alternative encoding schemes, are also being developed. The most popular of these are XML-based alternatives, such as the XML Encoding Rules and ASN.1 SOAP. In


addition, there is a standard mapping to convert an XML Schema to an ASN.1 schema, which can then be encoded using BER. Despite its perceived problems, BER is a popular format for transmitting data, particularly in systems with different native data encodings. The SNMP protocol specifies ASN.1 with BER as its required encoding scheme. The digital signature standard PKCS #7 also specifies ASN.1 with BER to encode encrypted messages and their digital signature or digital envelope. Many telecommunication systems, such as ISDN, toll-free call routing, and most cellular phone services, use ASN.1 with BER to some degree for transmitting control messages over the network. LDAP messages are encoded using BER.

5.4.3 Structure of Management Information

RFC 1442 first says that ASN.1 will be used to describe SNMP data structures, then it goes on for 57 pages scratching out parts of the ASN.1 standard that it does not want and adding new definitions (in ASN.1) that are needed. In particular, RFC 1442 defines four key macros and eight new data types that are heavily used throughout SNMP. It is this sub-super-set of ASN.1, which goes by the ungainly name of Structure of Management Information (SMI), that is really used to define the SNMP data structures.

At the lowest level, SNMP variables are defined as individual objects. Related objects are collected together into groups, and groups are assembled into modules. For example, groups exist for IP objects and TCP objects. A router might support the IP group, since its manager cares about how many packets it has lost. On the other hand, a low-end router might not support the TCP group, since it need not use TCP to perform its routing functions. It is the intention that vendors supporting a group support all the objects in that group. However, a vendor supporting a module need not support all of its groups, since not all may be applicable to the device.

All MIB modules start with an invocation of the MODULE-IDENTITY macro. Its parameters provide the name and address of the implementer, the revision history, and other administrative information. Typically, this call is followed by an invocation of the OBJECT-IDENTITY macro, which tells where the module fits in the naming tree shown in figure 5.3. Later on come one or more invocations of the OBJECT-TYPE macro, which name the actual variables being managed and specify their properties. Grouping variables into groups is done by convention; there are no BEGIN-GROUP and END-GROUP statements in ASN.1 or SMI.

The OBJECT-TYPE macro has four required parameters and four optional ones. The first required parameter is SYNTAX, which defines the variable's data type from among the following types.


INTEGER: Some variables are declared as an integer with no restrictions (e.g., the MTU of an interface), some are defined as taking on specific values (e.g., the IP forwarding flag is 1 if forwarding is enabled or 2 if forwarding is disabled), and others are defined with a minimum and maximum value (e.g., UDP and TCP port numbers are between 0 and 65535).

OCTET STRING: A string of 0 or more 8-bit bytes. Each byte has a value between 0 and 255. In the BER encoding used for this data type and the next, a count of the number of bytes in the string precedes the string. These strings are not null-terminated strings.

DisplayString: A string of 0 or more 8-bit bytes, but each byte must be a character from the NVT ASCII set. All variables of this type in the MIB-II must contain no more than 255 characters. (A 0-length string is OK.)

OBJECT IDENTIFIER: An object identifier is a data type specifying an authoritatively named object. By authoritative we mean that these identifiers are not assigned randomly, but are allocated by some organization that has responsibility for a group of identifiers. An object identifier is a sequence of integers separated by decimal points. These integers traverse a tree structure, similar to the DNS or a UNIX file system. There is an unnamed root at the top of the tree where the object identifiers start. (This is the same direction of tree traversal that is used with a UNIX file system.) Figure 5.3 shows the structure of this tree when used with SNMP. All variables in the MIB start with the object identifier 1.3.6.1.2.1.

NULL: This indicates that the corresponding variable has no value. It is used, for example, as the value of all the variables in a get or get-next request, since the values are being queried, not set.

IpAddress: An OCTET STRING of length 4, with 1 byte for each byte of the IP address.

PhysAddress: An OCTET STRING specifying a physical address (e.g., a 6-byte Ethernet address).

Counter: A nonnegative integer whose value increases monotonically from 0 to 2^32 - 1 (4,294,967,295), and then wraps back to 0.


Gauge: A nonnegative integer between 0 and 2^32 - 1, whose value can increase or decrease, but latches at its maximum value. That is, if the value increments to 2^32 - 1, it stays there until reset. The MIB variable tcpCurrEstab is an example: it is the number of TCP connections currently in the ESTABLISHED or CLOSE_WAIT state.


TimeTicks: A counter that counts the time in hundredths of a second since some epoch. Different variables can specify this counter from a different epoch, so the epoch used for each variable of this type is specified when the variable is declared in the MIB. For example, the variable sysUpTime is the number of hundredths of a second that the agent has been up. SEQUENCE: This is similar to a structure in the C programming language. For example, we will see that the MIB defines a SEQUENCE named UdpEntry containing information about an agents active UDP end points. (By active we mean ports currently in use by an application.) Two entries are in the structure: udpLocalAddress, of type IpAddress, containing the local IP address. udpLocalPort, of type INTEGER, in the range 0 through 65535, specifying the local port number.

SEQUENCE OF: This is the definition of a vector, with all elements having the same data type. If each element has a simple data type, such as an integer, then we have a simple vector (a one-dimensional array). But we will see that SNMP uses this data type with each element of the vector being a SEQUENCE (structure). We can think of it as a two-dimensional array or table.

Figure 5.3 Part of ASN.1 object naming tree
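Because SNMP carries SMI values over the wire in BER's type-length-value form, a small illustration may help. The following Python sketch was written for this section and is not a full ASN.1/BER library: it handles only short-form lengths and non-negative values, and the object identifier used is the familiar sysUpTime.0 instance.

# A minimal sketch of BER's type-length-value idea.

def ber_length(n: int) -> bytes:
    """Short definite form: lengths up to 127 fit in a single byte."""
    assert n < 128
    return bytes([n])

def encode_integer(value: int) -> bytes:
    """BER INTEGER (tag 0x02) for non-negative values."""
    length = (value.bit_length() + 8) // 8        # leaves room for the sign bit
    content = value.to_bytes(length, "big")
    return bytes([0x02]) + ber_length(len(content)) + content

def encode_oid(oid: str) -> bytes:
    """BER OBJECT IDENTIFIER (tag 0x06), e.g. '1.3.6.1.2.1.1.3.0'."""
    arcs = [int(x) for x in oid.split(".")]
    content = bytearray([40 * arcs[0] + arcs[1]])  # first two arcs share one byte
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        while arc > 0x7F:                          # base-128; high bit means "more"
            arc >>= 7
            chunk.insert(0, 0x80 | (arc & 0x7F))
        content.extend(chunk)
    return bytes([0x06]) + ber_length(len(content)) + bytes(content)

print(encode_integer(5).hex())                # 020105
print(encode_oid("1.3.6.1.2.1.1.3.0").hex())  # 06082b06010201010300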


5.4.4 Management Information Base

From the perspective of a network manager, network management takes place between two major types of systems: those in control, called managing systems, and those observed and controlled, called managed systems. The most common managing system is called a network management system (NMS). Managed systems can include hosts, servers, or network components such as routers or intelligent repeaters.

To promote interoperability, cooperating systems must adhere to a common framework and a common language, called a protocol. In the Internet Network Management Framework, that protocol is the Simple Network Management Protocol (SNMP). The exchange of information between managed network devices and a robust NMS is essential for reliable performance of a managed network. Because some devices have a limited ability to run management software, most of the computer processing burden is assumed by the NMS. The NMS runs the network management applications that present management information to network managers and other users.

The MIB structure is logically represented by a tree hierarchy, as shown in figure 5.4a. The root of the tree is unnamed and splits into three main branches: Consultative Committee for International Telegraph and Telephone (CCITT), International Organization for Standardization (ISO), and joint ISO/CCITT. These branches and those that fall below each category have short text strings and integers to identify them. Text strings describe object names, while integers allow computer software to create compact, encoded representations of the names. For example, the Cisco MIB variable authAddr is an object name and is denoted by number 5, which is listed at the end of its object identifier number 1.3.6.1.4.1.9.2.1.5.

The object identifier in the Internet MIB hierarchy is the sequence of numeric labels on the nodes along a path from the root to the object. The Internet standard MIB is represented by the object identifier 1.3.6.1.2.1, as shown in figure 5.4a. It also can be expressed as iso.org.dod.internet.mgmt.mib.


Figure 5.4a Internet MIB Hierarchy
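The following Python sketch, written for this section, shows how the numeric labels of an object identifier map onto the textual names of the hierarchy in figure 5.4a. The tree is deliberately truncated to a few well-known arcs; any arc not in the table is printed by number.

# A minimal sketch mapping OID prefixes to their textual names.
MIB_TREE = {
    (1,): "iso",
    (1, 3): "org",
    (1, 3, 6): "dod",
    (1, 3, 6, 1): "internet",
    (1, 3, 6, 1, 2): "mgmt",
    (1, 3, 6, 1, 2, 1): "mib-2",
    (1, 3, 6, 1, 4): "private",
    (1, 3, 6, 1, 4, 1): "enterprises",
}

def oid_to_name(oid: str) -> str:
    arcs = tuple(int(x) for x in oid.split("."))
    labels = []
    for i in range(1, len(arcs) + 1):
        # Use the known name for this prefix, or fall back to the number.
        labels.append(MIB_TREE.get(arcs[:i], str(arcs[i - 1])))
    return ".".join(labels)

print(oid_to_name("1.3.6.1.2.1"))     # iso.org.dod.internet.mgmt.mib-2
print(oid_to_name("1.3.6.1.4.1.9"))   # iso.org.dod.internet.private.enterprises.9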

The collection of objects managed by SNMP is defined in the MIB. For convenience, these objects are (currently) grouped into ten categories, which correspond to the ten nodes under mib-2 in figure 5.4b. (Note that mib-2 corresponds to SNMPv2 and that object 9 is no longer present)


Group          # Objects   Description
System         7           Name, location, and description of the equipment
Interfaces     23          Network interfaces and their measured traffic
AT             3           Address translation (deprecated)
IP             42          IP packet statistics
ICMP           26          Statistics about ICMP messages received
TCP            19          TCP algorithms, parameters, and statistics
UDP            6           UDP traffic statistics
EGP            20          Exterior gateway protocol traffic statistics
Transmission   0           Reserved for media-specific MIBs
SNMP           29          SNMP traffic statistics


Figure 5.4b The object groups of the internet MIB-II

The ten categories are intended to provide a basis of what a management station should understand. New categories and objects will certainly be added in the future, and vendors are free to define additional objects for their products. The ten categories are summarized in figure 5.4b. Although space limitations prevent us from delving into the details of all 175 objects defined in MIB-II, a few comments may be helpful.

The system group allows the manager to find out what the device is called, who made it, what hardware and software it contains, where it is located, and what it is supposed to do. The time of the last boot and the name and address of the contact person are also provided. This information means that a company can contract out its system management to another company in a distant city and have the latter be able to easily figure out what the configuration being managed actually is and who should be contacted if there are problems with various devices.

The interfaces group deals with the network adapters. It keeps track of the number of packets and bytes sent and received from the network, the number of discards, the number of broadcasts, and the current output queue size.

The AT group was present in MIB-I and provided information about address mapping (e.g., Ethernet to IP addresses). This information was moved to protocol-specific MIBs in SNMPv2.

The IP group deals with IP traffic into and out of the node. It is especially rich in counters keeping track of the number of packets discarded for each of a variety of reasons (e.g., no known route to the destination or lack of resources). Statistics about datagram fragmentation and reassembly are also available. All these items are particularly important for managing routers.


The ICMP group is about IP error messages. Basically, it has a counter for each ICMP message type that records how many of that type have been seen.

The TCP group monitors the current and cumulative number of connections opened, segments sent and received, and various error statistics.

The UDP group logs the number of UDP datagrams sent and received, and how many of the latter were undeliverable due to an unknown port or some other reason.

The EGP group is used for routers that support the exterior gateway protocol. It keeps track of how many packets of what kind went out, came in and were forwarded correctly, and came in and were discarded.

The transmission group is a placeholder for media-specific MIBs. For example, Ethernet-specific statistics can be kept here. The purpose of including an empty group in MIB-II is to reserve the identifier {internet 2 1 9} for such purposes.

The last group is for collecting statistics about the operation of SNMP itself: how many messages are being sent, what kinds of messages they are, and so on.

MIB-II is formally defined in RFC 1213. The bulk of RFC 1213 consists of 175 macro calls of the kind described earlier, with comments delineating the ten groups. For each of the 175 objects defined, the data type is given along with an English text description of what the variable is used for. For further information about MIB-II, the reader is referred to this RFC.

5.4.5 Protocol Operations

SNMP is a simple request/response protocol. The network-management system issues a request, and managed devices return responses. This behavior is implemented by using one of four protocol operations: Get, GetNext, Set, and Trap. The Get operation is used by the NMS to retrieve the value of one or more object instances from an agent. If the agent responding to the Get operation cannot provide values for all the object instances in a list, it does not provide any values. The GetNext operation is used by the NMS to retrieve the value of the next object instance in a table or a list within an agent. The Set operation is used by the NMS to set the values of object instances within an agent. The Trap operation is used by agents to asynchronously inform the NMS of a significant event. The interaction between the managing device and the managed device is illustrated in figure 5.5.


Figure 5.5 Interaction between Managing device and Managed device in SNMP

The important types of SNMP messages are summarized in figure 5.6.

Message             Description
Get-request         Requests the value of one or more variables
Get-next-request    Requests the variable following this one
Get-bulk-request    Fetches a large table
Set-request         Updates one or more variables
Inform-request      Manager-to-manager message describing local MIB
SnmpV2-trap         Agent-to-manager trap report
Figure 5.6 SNMP Message Types
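The following Python sketch is purely conceptual: it models the four protocol operations against an in-memory table of object instances, with no BER encoding, community strings or UDP transport, and the OIDs and values shown are illustrative. Real SNMP wraps these operations in the message types of figure 5.6.

# A conceptual sketch of Get, GetNext, Set and Trap against a toy MIB table.
class ToyAgent:
    def __init__(self, mib):
        self.mib = dict(mib)
        self.trap_sink = []

    @staticmethod
    def _key(oid):
        # Compare OIDs numerically, arc by arc, which is true OID ordering.
        return tuple(int(x) for x in oid.split("."))

    def get(self, oid):                 # Get: value of a named object instance
        return self.mib.get(oid)

    def get_next(self, oid):            # GetNext: first instance after oid
        later = [o for o in self.mib if self._key(o) > self._key(oid)]
        if not later:
            return None
        nxt = min(later, key=self._key)
        return nxt, self.mib[nxt]

    def set(self, oid, value):          # Set: manager writes a variable
        self.mib[oid] = value

    def trap(self, message):            # Trap: unsolicited agent-to-manager report
        self.trap_sink.append(message)

agent = ToyAgent({
    "1.3.6.1.2.1.1.3.0": 123456,        # sysUpTime.0
    "1.3.6.1.2.1.1.5.0": "router-1",    # sysName.0
})

print(agent.get("1.3.6.1.2.1.1.5.0"))           # router-1
print(agent.get_next("1.3.6.1.2.1.1.3.0"))      # the next instance in the table
agent.set("1.3.6.1.2.1.1.5.0", "router-1-lab")
agent.trap("link down on interface 2")
print(agent.trap_sink)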

5.4.6 Versions of SNMP

SNMPv1 was the standard version of SNMP. The SNMPv2 was created as an update of SNMPv1 with several features. The key enhancements of SNMPv2 are focused on the SMI, Manager-to-manager capability, and protocol operations. The SNMPv2c combined the community-based approach of SNMPv1 with the protocol operation of SNMPv2 and omitted all SNMPv2 security features. One notable deficiency in SNMP was the difficulty in monitoring networks, as opposed to nodes on networks. A substantial functional enhancement to SNMP was achieved by the definition of a set of standardized management objects referred to as the Remote Network Monitoring MIB (RMON MIB) objects. Another major deficiency in SNMP was the


complete lack of security facilities. The development of SNMPv3 was driven by these security issues. SNMPv3 defines two security-related capabilities, namely USM (User-based Security Model) and VACM (View-based Access Control Model).

SNMPv2c provides several advantages over SNMPv1. SNMPv2c has expanded data types, such as a 64-bit counter. It calls for improved efficiency and performance by introducing the GETBULK operation. Confirmed event notification is sought by the introduction of the Inform operation. An enhanced error handling approach, improved sets, and a fine-tuned Data Definition Language are some of the advantages of SNMPv2c over SNMPv1.

The SNMPv1 framework distinguishes between application entities and protocol entities. In SNMPv3, these are renamed as applications and engines respectively. The SNMPv1 framework also introduces the concept of an authentication service supporting one or more authentication schemes. In SNMPv3, the concept of an authentication service is expanded to include other services, such as privacy. The SNMPv1 framework introduces access control based on a concept called an SNMP MIB view. The SNMPv3 framework specifies a fundamentally similar concept called view-based access control.

Both versions v1 and v2c lack security-related features like authentication, privacy, authorization and access control, and remote configuration and administration capabilities. SNMPv3 was formed mainly to address the deficiencies related to security and administration.

Have you understood?

1. What are the functions of an agent in a managed device?
2. What are the functions of an NMS in a managing device?
3. Give some examples for the type of information to be maintained by the agents.
4. What is the necessity of ASN.1 in SNMP?
5. Why is ASN.1 preferred in representing the information maintained by the agents?
6. How many layers form the application component of a network reference model?
7. What are the layers present in the transport component of a protocol stack?
8. What are the various types of transfer syntaxes supported by ASN.1?
9. Mention the advantages of BER over other transfer syntax rules.
10. SMI is a subset of ASN.1 used for network management. Justify this statement.
11. What is an object identifier in SMI?
12. What is meant by a Management Information Base?
13. Mention the ten categories of objects maintained by the MIB.
14. Differentiate between the Get-request and Get-next-request message types of SNMP.
15. What is the purpose of the Set-request message of SNMP?


5.5 NETWORK SECURITY


Providing information security is a basic requirement of a computer network if the network is to provide really useful services to its users. Providing security in a networked system is more difficult than providing security in a stand-alone system, since systems cannot function in isolation in a network. Security in an internet environment is even more important and difficult. It is important because information has significant value: information can be bought and sold directly or used indirectly to create new products and services that yield high profits. Security in an internet is difficult because security involves understanding when and how participating users, computers, services, and networks can trust one another, as well as understanding the technical details of network hardware and protocols. Security is required on every computer and every node of the network; even if one node of the network is compromised, the whole network may become insecure. Since various organizations with different interests constitute the internet, it is very difficult to formulate security policies so that everyone's need is satisfied.

There are two basic approaches to network security. In the first approach, we do not permit external data to enter the network. This approach is called perimeter security. Perimeter security allows an organization to determine the services and networks it will make available to outsiders and the extent to which outsiders can use resources. Perimeter security solves the problem only to a certain extent, because to carry out business transactions it becomes necessary to permit external traffic to enter the network. The other approach is encrypting the information, so that even if unauthorized users get the information, they will not be able to interpret it.

5.5.1 Aspects of Information Security

The terms network security and information security refer in a broad sense to confidence that information and services available on a network cannot be accessed by unauthorized users. Security implies safety, including assurance of data integrity, freedom from unauthorized access of computational resources, freedom from snooping or wiretapping, and freedom from disruption of service. Of course, just as no physical property is absolutely secure against crime, no network is completely secure. Organizations make an effort to secure networks for the same reason they make an effort to secure buildings and offices: basic security measures can discourage crime by making it significantly more difficult.

Providing security for information requires protecting both physical and abstract resources. Physical resources include passive storage devices such as disks and CD-ROMs as well as active devices such as users' computers. In a network environment, physical security extends to the cables, bridges, and routers that comprise the network

Indeed, although physical security is seldom mentioned, it often plays an important role in an overall security plan. Obviously, physical security can prevent wiretapping. Good physical security can also eliminate sabotage (e.g., disabling a router to cause packets to be routed through an alternative, less secure path).

Protecting an abstract resource such as information is usually more difficult than providing physical security because information is elusive. Information security encompasses many aspects of protection:

Data Integrity: Protection of information from unauthorized change.

Data Availability: Data should always be available to authenticated and legitimate users.

Privacy or Confidentiality: The system must prevent outsiders from making copies of data as it passes across a network, or from understanding the contents if copies are made.

Authorization: Although physical security often classifies people and resources into broad categories, security for information usually needs to be more restrictive (e.g., some parts of an employee's record are available only to the personnel office, others are available only to the employee's boss, and others are available to the payroll office).

Authentication: The system must allow two communicating entities to validate each other's identity. This should not permit masquerading.

Replay Avoidance: To prevent outsiders from capturing copies of packets and using them later, the system must prevent a retransmitted copy of a packet from being accepted.

5.5.2 Security Policy

Before an organization can enforce network security, the organization must assess risks and develop a clear policy regarding information access and protection. The policy specifies who will be granted access to each piece of information, the rules an individual must follow in disseminating the information to others, and a statement of how the organization will react to violations. Humans are usually the most susceptible point in any security scheme. A worker who is malicious, careless, or unaware of an organization's information policy can compromise the best security. As the size of the network grows (in terms of the number of users and services offered), devising the information security policy becomes the most challenging part of information security.

5.5.3 Internet Security

Security in a private network (LANs, corporate networks, etc.) is easy to implement since the entire network comes under a single authority. However, it is not always possible for the sender and the receiver to communicate through their own network, since it may be too costly. Hence it becomes necessary for the sender and the recipient to make use of public networks such as the PSTN (Public Switched Telephone Network) or the Internet.

Provision of security is very difficult in an internet because datagrams traveling from source to destination often pass across many intermediate networks and through routers that are not owned or controlled by either the sender or the recipient. Thus, because datagrams can be intercepted or compromised, their contents cannot be trusted. As an example, consider a server that attempts to use source authentication to verify that requests originated from valid customers. Source authentication requires the server to examine the source IP address on each incoming datagram, and only accept requests from computers on an authorized list. Source authentication is weak because it can be broken easily. In particular, an intermediate router can watch traffic traveling to and from the server, and record the IP address of a valid customer. Later, the intermediate router can manufacture a request that has the same source address (and intercept the reply). An authorization scheme that uses a remote machine's IP address to authenticate its identity does not suffice in an unsecured internet. An imposter who gains control of an intermediate router can obtain access by impersonating an authorized client.

Stronger authentication requires encryption. To encrypt a message, the sender applies a mathematical function that rearranges the bits according to a secret key. The receiver uses another mathematical function, together with the corresponding key, to decrypt the message. Careful choice of an encryption algorithm, a key, and the contents of messages can make it virtually impossible for intermediate machines to decode messages or manufacture messages that are valid.

Have you understood?

1. What is the necessity of providing security mechanisms in networks?
2. What is the difference between network security and information security?
3. What is meant by perimeter security?
4. Define encryption.
5. Why is it difficult to provide security in public networks?

5.6 IP SECURITY (IPSEC)


The Internet must provide security features if it is to be used as a medium for serious applications like e-commerce, video on demand, etc. This need is accepted by everyone. The issue is where to put these security features. Two obvious choices are an end-to-end solution and a network-layer solution. Initially, everyone thought of leaving security to the application layer. In this approach, the sender encrypts the information and the receiver decrypts the information; both encryption and decryption take place in the application layer. The limitation of this approach is that all applications must be security aware. Another solution was also proposed, in which all the security issues are taken care of by a separate layer (between the transport layer and the application layer).

This solution is also end-to-end, but does not require the applications to be security aware. However, the idea of making the network layer itself secure finally got recognition. As a result, the IETF has devised a set of protocols that provide secure Internet communication. This set of protocols is collectively known as IPsec (short for IP security); these protocols offer authentication and privacy services at the IP layer, and can be used with both IPv4 and IPv6.

The major advantage of IPsec is that it does not restrict users to a particular encryption or authentication algorithm. Instead, it provides a general framework that allows each pair of communicating endpoints to choose algorithms and parameters (e.g., key size). To guarantee interoperability, IPsec does include a set of encryption algorithms that all implementations must recognize. In short, IPsec is not a single security protocol. Instead, IPsec provides a set of security algorithms plus a general framework that allows a pair of communicating entities to use whichever algorithms provide security appropriate for the communication.

The IPsec authentication mechanism is designed to ensure that an arriving datagram is identical to the datagram sent by the source. However, such a guarantee is impossible to make for every field. To understand why, recall that IP is a machine-to-machine layer, meaning that the layering principle only applies across one hop. In particular, each intermediate router decrements the time-to-live field and recomputes the checksum. IPsec uses the term mutable fields to refer to IP header fields that are changed in transit. To prevent such changes from causing authentication errors, IPsec specifically omits mutable fields from the authentication computation. Thus, when a datagram arrives, IPsec only authenticates immutable fields (e.g., the source address and protocol type).

5.6.1 Datagrams with IPsec

IPsec uses the idea of a separate header for security. The normal process of data flow is as follows:

1. The application generates the data and hands it over to the transport layer.
2. The transport layer adds its own header to the data and creates the TCP segment.
3. The transport layer hands over the segment to the Internet layer.
4. The Internet layer adds its own header and creates the datagram.

When IPsec is chosen, an additional step is performed: before forming the datagram, an Authentication Header (AH) is added to the segment. The process of forming the datagram in the absence and presence of IPsec is shown in figure 5.7a and figure 5.7b.


IPv4 HEADER | TCP HEADER | TCP DATA

Figure 5.7a Illustration of an IPv4 datagram

IPv4 HEADER | AUTHENTICATION HEADER | TCP HEADER | TCP DATA

Figure 5.7b Datagram after an IPsec authentication header has been added

As the figure shows, IPsec inserts the authentication header immediately after the original IP header, but before the transport header. Furthermore, the PROTOCOL field in the IP header is changed to the value 51 to indicate the presence of an authentication header. Now the issue is: if IPsec modifies the PROTOCOL field in the IP header, how does a receiver determine the type of information carried in the datagram? The answer is that the authentication header has a NEXT HEADER field that specifies the type of the payload; IPsec records the original PROTOCOL value in the NEXT HEADER field. When a datagram arrives, the receiver uses security information from the authentication header to verify the sender, and uses the NEXT HEADER value to further demultiplex the datagram. Figure 5.8 illustrates the header format.

Figure 5.8 The IPsec authentication header format.
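To make the header layout concrete, the sketch below packs an authentication header with Python's struct module. It is only an illustration of the fields described in the next paragraph; the 12-octet authentication-data length, the HMAC-SHA-1 choice, and the key and SPI values are assumptions made for the example, not requirements of the text.

    import hmac, hashlib, struct

    def build_auth_header(next_header, spi, seq, payload, key):
        """Sketch of an IPsec Authentication Header (AH).

        Fields follow the description in the text: NEXT HEADER, PAYLOAD LEN,
        SEQUENCE NUMBER, SECURITY PARAMETERS INDEX and AUTHENTICATION DATA.
        The authentication data here is a truncated HMAC-SHA-1 (96 bits),
        an assumption chosen for illustration.
        """
        icv_len = 12                          # 96-bit integrity check value (assumed)
        # PAYLOAD LEN describes the AH itself: its length in 32-bit words minus 2
        # (per RFC 2402); with a 12-octet fixed part and a 12-octet ICV it is 4.
        payload_len = (12 + icv_len) // 4 - 2
        header = struct.pack("!BBHII",
                             next_header,     # original PROTOCOL value (e.g., 6 = TCP)
                             payload_len,
                             0,               # RESERVED
                             spi,             # SECURITY PARAMETERS INDEX
                             seq)             # SEQUENCE NUMBER
        # AUTHENTICATION DATA: keyed hash over the AH fields and payload
        # (a simplification; real AH also covers immutable IP header fields).
        icv = hmac.new(key, header + payload, hashlib.sha1).digest()[:icv_len]
        return header + icv

    ah = build_auth_header(next_header=6, spi=0x1001, seq=1,
                           payload=b"TCP segment bytes", key=b"shared-secret")
    print(len(ah))   # 12-octet fixed part + 12-octet authentication data = 24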

The PAYLOAD LEN field does not specify the size of the data area in the datagram; instead, it specifies the length of the authentication header itself. The SEQUENCE NUMBER field contains a unique sequence number for each packet sent; the number starts at zero when a particular security algorithm is selected and increases monotonically. The SECURITY PARAMETERS INDEX field specifies the security scheme used. To save space in the header, IPsec arranges for each receiver to collect all the details about a security scheme into an abstraction known as a security association (SA). Each SA is given a number, known as a security parameters index, through which it is identified. Before a sender can use IPsec to communicate with a receiver, the sender must know the index value for a particular SA. The sender then places that index value in the SECURITY PARAMETERS INDEX field of each outgoing datagram. Index values are not globally specified. Instead, each destination creates as many SAs as it needs, and assigns an index value to each.

The destination can specify a lifetime for each SA, and reuse index values once an SA becomes invalid. Consequently, the index cannot be interpreted without consulting the destination. Finally, the AUTHENTICATION DATA field contains data for the selected security scheme.

5.6.2 Encapsulating Security Payload (ESP)

The alternative IPsec header is the ESP (Encapsulating Security Payload) header. ESP is more complex than the authentication header and is able to handle privacy as well as authentication. The conceptual idea of ESP is shown in figure 5.9.

IPv4 HEADER | TCP HEADER | TCP DATA

Figure 5.9a A datagram

IPv4 HEADER | ESP HEADER | TCP HEADER | TCP DATA | ESP TRAILER | ESP AUTH
(TCP HEADER through ESP TRAILER are encrypted; ESP HEADER through ESP TRAILER are authenticated)

Figure 5.9b Datagram using IPsec ESP

As the figure shows, ESP adds three additional areas to the datagram. The ESP HEADER immediately follows the IP header and precedes the encrypted payload. The ESP TRAILER is encrypted along with the payload; a variable-size ESP AUTH field follows the encrypted section. The Encapsulating Security Payload (ESP) extension header provides origin authenticity, integrity, and confidentiality protection of a packet. ESP also supports encryption-only and authentication-only configurations, but using encryption without authentication is strongly discouraged.
 0 - 7 bit    | 8 - 15 bit   | 16 - 23 bit  | 24 - 31 bit
 -------------------------------------------------------------
               Security Parameters Index (SPI)
                     Sequence Number
                  Payload Data (variable)
                  Padding (0-255 bytes)
    Padding (continued)       | Pad Length   | Next Header
               Authentication Data (variable)

Figure 5.10 ESP Header


Unlike the AH header, the IP packet header is not covered by ESP's protection. ESP operates directly on top of IP using IP protocol number 50. An ESP packet diagram is shown in figure 5.10. The various fields of an ESP packet are explained as follows:

Security Parameters Index (SPI): identifies the security parameters in combination with the IP address.
Sequence Number: a monotonically increasing number, used to prevent replay attacks.
Payload Data: the data to be transferred.
Padding: used with some block ciphers to pad the data to the full length of a block.
Pad Length: the size of the padding in bytes.
Next Header: identifies the protocol of the transferred data.
Authentication Data: contains the data used to authenticate the packet.

The ESP TRAILER consists of optional padding, a padding length field (PAD LENGTH), and a NEXT HEADER field that is followed by a variable amount of authentication data. The trailer is shown in figure 5.11.


Figure 5.11 ESP Trailer
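The way the padding, PAD LENGTH, and NEXT HEADER fields fit together can be sketched in a few lines. The 8-octet block size and the 1, 2, 3, ... padding bytes below are assumptions chosen for illustration; a real implementation derives them from the negotiated cipher.

    import struct

    def build_esp_body(payload: bytes, next_header: int, block_size: int = 8) -> bytes:
        """Sketch of the encrypted portion of an ESP packet: payload + trailer.

        The trailer is padding, a 1-octet PAD LENGTH, and a 1-octet NEXT HEADER.
        Padding brings the total length up to a multiple of the cipher block size.
        """
        # Two trailer octets (PAD LENGTH and NEXT HEADER) count toward the total.
        pad_len = (-(len(payload) + 2)) % block_size
        padding = bytes(range(1, pad_len + 1))   # conventional 1,2,3,... pad bytes
        trailer = padding + struct.pack("!BB", pad_len, next_header)
        return payload + trailer                 # this whole unit would be encrypted

    body = build_esp_body(b"TCP segment bytes", next_header=6)
    print(len(body) % 8 == 0)   # True: aligned to the assumed 8-octet block size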

Padding is optional; it may be present for three reasons. First, some decryption algorithms require zeroes following an encrypted message. Second, note that the NEXT HEADER field is shown right-justified within a 4-octet field. The alignment is important because IPsec requires the authentication data that follows the trailer to be aligned at the start of a 4-octet boundary. Thus, padding may be needed to ensure alignment. Third, some sites may choose to add random amounts of padding to each datagram so that eavesdroppers at intermediate points along the path cannot use the size of a datagram to guess its purpose.

5.6.3 Advanced Features of IPsec

VPN technology uses encryption along with IP-in-IP tunneling to keep inter-site transfers private. IPsec is specifically designed to accommodate an encrypted tunnel. In particular, the standard defines tunneled versions of both the authentication header and the encapsulating security payload.

Figure 5.12 illustrates the layout of datagrams in tunneling mode.

OUTER IP HEADER | AUTHENTICATION HEADER | INNER IP DATAGRAM (INCLUDING IP HEADER)

Figure 5.12a Illustration of IPsec tunneling mode for authentication

OUTER IP HEADER | ESP HEADER | INNER IP DATAGRAM (INCLUDING IP HEADER) | ESP TRAILER | ESP AUTH
(the inner datagram and ESP TRAILER are encrypted; ESP HEADER through ESP TRAILER are authenticated)

Figure 5.12b Illustration of IPsec tunneling mode for ESP

IPsec defines a minimal set of algorithms that are mandatory (i.e., that all implementations must supply). In each case, the standard defines specific uses. Figure 5.13 lists the required algorithms.

Authentication:
    HMAC with MD5           RFC 2403
    HMAC with SHA-1         RFC 2404

Encapsulating Security Payload:
    DES in CBC mode         RFC 2405
    HMAC with MD5           RFC 2403
    HMAC with SHA-1         RFC 2404
    Null Authentication
    Null Encryption

Figure 5.13 The security algorithms that are mandatory for IPsec.
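As an illustration of the keyed-hash authentication that the mandatory HMAC algorithms provide, the sketch below computes HMAC-MD5 and HMAC-SHA-1 digests over the same message with Python's standard hmac module; the key and message are arbitrary example values.

    import hmac, hashlib

    key = b"shared-secret"          # example key, not a recommended value
    message = b"immutable IP fields plus payload"

    # HMAC with MD5 (RFC 2403) and HMAC with SHA-1 (RFC 2404)
    md5_tag = hmac.new(key, message, hashlib.md5).hexdigest()
    sha1_tag = hmac.new(key, message, hashlib.sha1).hexdigest()

    print(md5_tag)    # 128-bit digest rendered as 32 hex characters
    print(sha1_tag)   # 160-bit digest rendered as 40 hex characters

    # A receiver holding the same key recomputes the digest and compares it
    # with the authentication data carried in the AH or ESP AUTH field.
    print(hmac.compare_digest(md5_tag,
                              hmac.new(key, message, hashlib.md5).hexdigest()))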

By the mid-1990s, when it became evident that security was important for internet commerce, several groups proposed security mechanisms for use with the web. Although not formally adopted by the IETF, one of the proposals has become a de facto standard. Known as the Secure Sockets Layer (SSL), the technology was originally developed by Netscape, Inc. As the name implies, SSL resides at the same layer as the socket API. When a client uses SSL to contact a server, SSL allows each side to authenticate itself to the other. The two sides then negotiate to select an encryption algorithm that they both support.


Finally, SSL allows the two sides to establish an encrypted connection (i.e., a connection that uses the chosen encryption algorithm to guarantee privacy).

Have you understood?

1. What are the limitations of providing security at the application layer?
2. What are the advantages of providing security in the network layer?
3. What is the difference in the structure of a datagram when we apply IPsec?
4. What is meant by a security association in IPsec?
5. What are the limitations of the Authentication Header?
6. What are the advantages of ESP?
7. What is meant by IPsec tunneling?
8. Is the set of security algorithms prescribed by IPsec? Justify your answer.
9. What is meant by SSL?
10. Why does IPsec not include mutable header fields in the authentication computation?

5.7 FIREWALLS AND INTERNET ACCESS

5.7.1 Need for Firewalls


A network is subject to two types of information leakage, namely leak-in and leak-out. Leak-out refers to the disclosure of confidential information like trade secrets, product development plans, marketing strategies, financial analyses, etc. Disclosure of this information to competitors may affect the viability of the organization in the market. Leak-in refers to the entry of viruses, worms, and other digital pests that can breach security, destroy valuable data, and waste large amounts of administrators' time in cleaning up the mess they leave. Consequently, mechanisms are needed to keep good information in and bad information out. Firewalls are one such mechanism used to provide security to the network.

Another important issue to be considered is that a variety of security mechanisms already exists for the purpose of network security. Cryptographic algorithms like DES (Data Encryption Standard), AES (Advanced Encryption Standard), RSA (Rivest, Shamir, Adleman), and MD5 (Message Digest 5) can be used. In addition to these algorithms, a set of authentication protocols such as Needham-Schroeder, Otway-Rees, and Kerberos exists. IPsec provides security at the network layer itself. In spite of the availability of all these techniques, firewalls are used by many organizations due to the fact that getting the security algorithms and protocols right is a very difficult task. So firewalls are used as a stopgap measure until standards and algorithms for other security techniques become popular.

Even in the long term, unless every single system runs IPsec or some similar end-to-end security mechanism, it seems likely that network administrators will continue to depend on firewalls. Moreover, a firewall allows the system administrator to implement a security policy in one centralized place.

5.7.2 Implementation of Firewalls

A firewall is a secure and trusted machine that sits between a private network and a public network. The firewall machine is configured with a set of rules that determine which network traffic will be allowed to pass and which will be blocked or refused. In some large organizations, you may even find a firewall located inside the corporate network to segregate sensitive areas of the organization from other employees; many cases of computer crime occur from within an organization, not just from outside.

Firewalls can be constructed in quite a variety of ways. The most sophisticated arrangement involves a number of separate machines and is known as a perimeter network. Two machines act as filters, called chokes, to allow only certain types of network traffic to pass, and between these chokes reside network servers such as a mail gateway or a World Wide Web proxy server. This configuration can be very safe and allows a great range of control over who can connect, both from the inside to the outside and from the outside to the inside. This sort of configuration might be used by large organizations.

To operate at network speeds, a firewall must have hardware and software optimized for the task. Fortunately, most commercial routers include a high-speed filtering mechanism that can be used to perform much of the necessary work.

5.7.3 Packet Level Filters

Many commercial routers offer a mechanism that augments normal routing and permits a manager to further control packet processing. Informally called a packet filter, the mechanism requires the manager to specify how the router should dispose of each datagram. For example, the manager might choose to filter (i.e., block) all datagrams that come from a particular source or those used by a particular application, while choosing to route other datagrams to their destination. The term packet filter arises because the filtering mechanism does not keep a record of interactions or a history of previous datagrams; instead, the filter considers each datagram separately. When a datagram first arrives, the router passes the datagram through its packet filter before performing any other processing. If the filter rejects the datagram, the router drops it immediately.

Because TCP/IP does not dictate a standard for packet filters, each router vendor is free to choose the capabilities of its packet filter as well as the interface a manager uses to configure the filter.

Some routers permit a manager to configure separate filter actions for each interface, while others have a single configuration for all interfaces. Usually, when specifying datagrams that the filter should block, a manager can list any combination of source IP address, destination IP address, protocol, source protocol port number, and destination protocol port number. For example, figure 5.14 illustrates a filter specification. In the example, the manager has chosen to block incoming datagrams destined for a few well-known services and to block one case of outgoing datagrams. The filter blocks all outgoing datagrams that originate from any host address matching the 16-bit prefix 128.5.0.0 and that are destined for a remote e-mail server (TCP port 25). The filter also blocks incoming datagrams destined for FTP (TCP port 21), TELNET (TCP port 23), WHOIS (UDP port 43), TFTP (UDP port 69), or FINGER (TCP port 79).


Arrives On    IP              IP                          Source    Destination
Interface     Source          Destination    Protocol     Port      Port
-------------------------------------------------------------------------------
    2         *               *              TCP          *         21
    2         *               *              TCP          *         23
    1         128.5.0.0/16    *              TCP          *         25
    2         *               *              UDP          *         43
    2         *               *              UDP          *         69
    2         *               *              TCP          *         79

Figure 5.14 A router with two interfaces and an example datagram filter specification
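The filtering logic such a specification implies can be sketched in a few lines of Python; the rule tuples below are transcribed from figure 5.14, while the function and field names are illustrative only.

    import ipaddress

    # Each rule: (arrival interface, source prefix, protocol, destination port).
    # '*' means "match anything"; the tuples mirror the rows of figure 5.14.
    BLOCK_RULES = [
        (2, "*",            "TCP", 21),   # FTP
        (2, "*",            "TCP", 23),   # TELNET
        (1, "128.5.0.0/16", "TCP", 25),   # outgoing e-mail from 128.5.0.0/16
        (2, "*",            "UDP", 43),   # WHOIS
        (2, "*",            "UDP", 69),   # TFTP
        (2, "*",            "TCP", 79),   # FINGER
    ]

    def blocked(iface, src_ip, proto, dst_port):
        """Return True if the datagram matches any blocking rule."""
        for r_iface, r_src, r_proto, r_port in BLOCK_RULES:
            src_ok = (r_src == "*" or
                      ipaddress.ip_address(src_ip) in ipaddress.ip_network(r_src))
            if iface == r_iface and src_ok and proto == r_proto and dst_port == r_port:
                return True
        return False       # note: this is "permit unless listed", not default deny

    print(blocked(2, "10.0.0.7", "TCP", 23))     # True  (TELNET from outside)
    print(blocked(1, "128.5.1.9", "TCP", 25))    # True  (outgoing SMTP)
    print(blocked(2, "10.0.0.7", "TCP", 80))     # False (not in the list)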

5.7.4 Security and Packet Filter Specification

Although the example filter configuration in figure 5.14 specifies a small list of services that should be blocked, such an approach does not work well for an effective firewall. There are three reasons. First, the number of well-known ports is large and growing rapidly. Thus, listing each service requires a manager to update the list continually; an error of omission can leave the firewall vulnerable. Second, much of the traffic on an internet does not travel to or from a well-known port. In addition to programmers who can choose port numbers for their private client-server applications, services like Remote Procedure Call (RPC) assign ports dynamically. Third, listing ports of well-known services leaves the firewall vulnerable to tunneling. Tunneling can circumvent security if a host or router on the inside agrees to accept encapsulated datagrams from an outsider, remove one layer of encapsulation, and forward the datagram on to the service that would otherwise be restricted by the firewall.

255

Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

How can a firewall use a packet filter effectively? The answer lies in reversing the idea of a filter: instead of specifying the datagrams that should be filtered, a firewall should be configured to block all datagrams except those destined for specific networks, hosts, and protocol ports for which external communication has been approved. Thus, a manager begins with the assumption that communication is not allowed, and then must examine the organization's information policy carefully before enabling a port. In fact, many packet filters allow a manager to specify a set of datagrams to admit instead of a set of datagrams to block. To be effective, a firewall that uses datagram filtering should restrict access to all IP sources, IP destinations, protocols, and protocol ports except those computers, networks, and services the organization explicitly decides to make available externally. A packet filter that allows a manager to specify which datagrams to admit instead of which datagrams to block can make such restrictions easy to specify.

5.7.5 The Consequence of Restricted Access for Clients

A blanket prohibition on datagrams arriving for an unknown protocol port seems to solve many potential security problems by preventing outsiders from accessing arbitrary servers in the organization. Such a firewall has an interesting consequence: it also prevents an arbitrary computer inside the firewall from becoming a client that accesses a service outside the firewall. To understand why, recall that although each server operates at a well-known port, a client does not. When a client program begins execution, it requests the operating system to select a protocol port number that is neither among the well-known ports nor currently in use on the client's computer. When it attempts to communicate with a server outside the organization, a client will generate one or more datagrams and send them to the server. Each outgoing datagram has the client's protocol port as the source port and the server's well-known protocol port as the destination port. The firewall will not block such datagrams as they leave. When it generates a response, the server reverses the protocol ports: the client's port becomes the destination port and the server's port becomes the source port. When the datagram carrying the response reaches the firewall, it will be blocked because the destination port is not approved. Thus, we can see an important idea: if an organization's firewall restricts incoming datagrams except for ports that correspond to services the organization makes available externally, an arbitrary application inside the organization cannot become a client of a server outside the organization.

5.7.6 Proxy Access through a Firewall

Of course, not all organizations configure their firewalls to block datagrams destined for unknown protocol ports. In cases where a secure firewall is needed to prevent unwanted access, however, users on the inside need a safe mechanism that provides access to services outside.


That mechanism forms the second major piece of the firewall architecture. In general, an organization can only provide safe access to outside services through a secure computer. Instead of trying to make all computer systems in the organization secure (a daunting task), an organization usually associates one secure computer with each firewall, and installs a set of application gateways on that computer. Because the computer must be strongly fortified to serve as a secure communication channel, it is often called a bastion host. Figure 5.15 illustrates the concept.

As the figure shows, the firewall has two conceptual barriers. The outer barrier blocks all incoming traffic except (1) datagrams destined for services on the bastion host that the organization chooses to make available externally, and (2) datagrams destined for clients on the bastion host. The inner barrier blocks incoming traffic except datagrams that originate on the bastion host. Most firewalls also include a manual bypass that enables managers to temporarily pass some or all traffic between a host inside the organization and a host outside (e.g., for testing or debugging the network).


Figure 5.15 The conceptual organization of a bastion host embedded in a firewall

To understand how a bastion host operates, consider web access. Because the firewall prevents the user's computer from receiving incoming datagrams, the user cannot use a browser for direct access. Instead, the organization arranges a proxy server on the bastion host. Inside the organization, each browser is configured to use the proxy. Whenever a user selects a link or enters a URL, the browser contacts the proxy. The proxy contacts the server, obtains the specified page, and then delivers it internally.


5.7.7 The Details of Firewall Architecture

Now that we understand the basic firewall concept, the implementation should appear straightforward. Conceptually, each of the barriers shown in figure 5.15 requires a router that has a packet filter. Networks interconnect the routers and a bastion host. For example, an organization that connects to the global Internet might choose to implement a firewall as figure 5.16 shows. As figure 5.16 shows, router R2 implements the outer barrier; it filters all traffic except datagrams destined for the bastion host, H. Router R1 implements the inner barrier that isolates the rest of the corporate intranet from outsiders; it blocks all incoming datagrams except those that originate on the bastion host.

Of course, the safety of an entire firewall depends on the safety of the bastion host. If an intruder can gain access to the computer system running on the bastion host, they will gain access to the entire internal network. Moreover, an intruder can exploit security flaws in either the operating system on the bastion host or the network applications it runs. Thus, managers must be particularly careful when choosing and configuring software for a bastion host. Hence, although a bastion host is essential for communication through a firewall, the security of the firewall depends on the safety of the bastion host: an intruder who exploits a security flaw in the bastion host can gain access to hosts inside the firewall.

Figure 5.16 A firewall implemented with two routers and a bastion host.


5.7.8 Stub Network

It may seem that figure 5.16 contains a superfluous network that connects the two routers and the bastion host. Such a network is often called a stub network because it is small (i.e., stubby). The question arises: is the stub network necessary, or could a site place the bastion host on one of its production networks? The answer depends on the traffic expected from the outside. The stub network isolates the organization from incoming datagram traffic. In particular, because router R2 admits all datagrams destined for the bastion host, an outsider can send an arbitrary number of such datagrams across the stub network. If an external connection is slow relative to the capacity of a stub network, a separate physical wire may be unnecessary. However, a stub network is usually an inexpensive way for an organization to protect itself against disruption of service on an internal production network.

5.7.9 An Alternative Firewall Implementation

The firewall implementation in figure 5.16 works well for an organization that has a single serial connection to the rest of the global Internet. Some sites have a different interconnection topology. For example, suppose a company has three or four large customers who each need to deposit or extract large volumes of information. The company wishes to have a single firewall, but allow connections to multiple sites. Figure 5.17 illustrates one possible firewall architecture that accommodates multiple external connections.


Figure 5.17 An alternate firewall architecture that permits multiple external connections through a single firewall. Using one firewall for multiple connections can reduce the cost.

As the figure shows, the alternative extends a firewall by providing an outer network at which external connections terminate. Router R1 acts as in figure 5.16 to protect the site by restricting incoming datagrams to those sent from the bastion host.
259 Anna University Chennai

DIT 116

NETWORK PROTOCOLS

NOTES

The routers labeled R2 each connect one external site to the firewall. To understand why firewalls with multiple connections often use a router per connection, recall that all sites mistrust one another. That is, the organization running the firewall does not trust any of the external organizations completely, and none of the external organizations trust one another completely. The packet filter in a router on a given external connection can be configured to restrict traffic on that particular connection. As a result, the owner of the firewall can guarantee that although all external connections share a single, common network, no datagram from one external connection will pass to another. Thus, the organization running the firewall can assure customers that it is safe to connect. Hence, we can summarize that when multiple external sites connect through a single firewall, an architecture that has a router per external connection can prevent unwanted packet flow from one external site to another.

5.7.10 Monitoring and Logging

Monitoring is one of the most important aspects of firewall design. The network manager responsible for a firewall needs to be aware of attempts to bypass security. Unless a firewall reports incidents, a manager may be unaware of problems. Monitoring can be active or passive. In active monitoring, the firewall notifies a manager whenever an incident occurs. The chief advantage of active monitoring is speed: a manager finds out about a potential problem immediately. The chief disadvantage is that active monitors often produce so much information that a manager cannot comprehend it or notice problems. Thus, most managers prefer passive monitoring, or a combination of passive monitoring with a few high-risk incidents also reported by an active monitor. In passive monitoring, a firewall logs a record of each incident in a file on disk. A passive monitor usually records information about normal traffic (e.g., simple statistics) as well as datagrams that are filtered. A manager can access the log at any time; most managers use a computer program to analyze it. The chief advantage of passive monitoring arises from its record of events: a manager can consult the log to observe trends and, when a security problem does occur, review the history of events that led to the problem. More importantly, a manager can analyze the log periodically (e.g., daily) to determine whether attempts to access the organization increase or decrease over time.

Have you understood?

1. What is the need for firewalls in the network?
2. What is the necessity of a firewall when many other security mechanisms like encryption, authentication, etc. exist?

3. What are the various ways of implementing the firewalls?
4. What is meant by a packet level filter?
5. What are the consequences of restricted access for clients?
6. What are the functions of a bastion host?
7. What is meant by a stub network?
8. What is the role of a router in the configuration of firewalls?
9. Mention the activities involved in monitoring and logging.
10. What are the factors to be considered in the selection of a particular type of firewall?

5.8 THE FUTURE OF TCP/IP


Evolution of TCP/IP technology is intertwined with evolution of the global internet for several reasons. First, the internet is the largest installed TCP/IP internet, so many problems related to scale arise in the internet before they surface on other TCP/IP internets. Second, funding for TCP/IP research and engineering comes from companies and government agencies that use the operational internet, so they tend to fund projects that impact the internet. Third, because most researchers use the global internet daily, they have immediate motivation to solve problems that will improve service and extend functionality.

With millions of users at tens of thousands of sites around the world depending on the global internet as part of their daily work environment, it might appear that the internet is a completely stable production facility. We have passed the early stage of development in which every user was also an expert, and entered a stage in which few users understand the technology. Despite appearances, however, neither the internet nor the TCP/IP protocol suite is static. Groups discover new ways to use the technology. Researchers solve new networking problems, and engineers improve the underlying mechanisms. In short, the technology continues to evolve. The purpose of this chapter is to consider the ongoing evolutionary process and examine one of the most significant engineering efforts: a proposed revision of IP. When the proposal is adopted by vendors, it will have a major impact on TCP/IP and the global internet.

5.8.1 Limitations of IPv4

Version 4 of the Internet Protocol (IPv4) provides the basic communication mechanisms of the TCP/IP suite and the global internet; it has remained almost unchanged since its inception in the late 1970s. The longevity of version 4 shows that the design is flexible and powerful. Since the time IPv4 was designed, processor performance has increased by over two orders of magnitude, typical memory sizes have increased by over a factor of 100, network bandwidth of the internet backbone has risen by a factor of 7000, LAN technologies have emerged, and the number of hosts on the internet has risen from a handful to over 56 million.

Furthermore, because the changes did not occur simultaneously, adapting to them has been a continual process. Despite its sound design, IPv4 must be replaced soon. When IP was designed, a 32-bit address space was more than sufficient. Only a handful of organizations used a LAN; fewer had a corporate WAN. Now, however, most medium-sized corporations have multiple LANs, and most large organizations have a corporate WAN. Consequently, even with careful assignment and NAT technology, the current 32-bit IP address space cannot accommodate the projected growth of the global internet beyond the year 2020. Although the need for a larger address space is the most immediate motivation, other factors contributed to the new design. In particular, to make IP better suited to real-time applications, thought was given to supporting systems that associate a datagram with a pre-assigned resource reservation. To make electronic commerce safer, the next version of IP is designed to include support for security features such as authentication.

5.8.2 Efforts of the IETF

It took many years for the IETF to formulate a new version of IP. Because the IETF produces open standards, it invited the entire community to participate in the process. Computer manufacturers, hardware and software vendors, users, managers, programmers, telephone companies, and the cable television industry all specified their requirements for the next version of IP, and all commented on specific proposals. Many designs were proposed to serve a particular purpose or a particular community. One of the major proposals would have made IP more sophisticated at the cost of increased complexity and processing overhead. Another design proposed retaining most of the ideas in IP, but making simple extensions to accommodate larger addresses. That design, known as SIP (Simple IP), became the basis for an extended proposal that included ideas from other proposals. The extended version was named Simple IP Plus (SIPP), and eventually emerged as the design selected as a basis for the next IP.

Choosing a new version of IP was not easy. The popularity of the internet means that the market for IP products around the world is staggering. Many groups see the economic opportunity, and hope that the new version of IP will help them gain an edge over the competition. In addition, personalities have been involved: some individuals hold strong technical opinions; others see active participation as a path to a promotion. Consequently, the discussions generated heated arguments.


The IETF decided to assign the revision of IP version number 6, and to name it IPv6 to distinguish it from the current IPv4. The choice to skip version number 5 arose after a series of mistakes and misunderstandings. In one mistake, the IAB caused widespread confusion by inadvertently publishing a policy statement that referred to the next version of IP as IP version 7. In a misunderstanding, an experimental protocol led some to conclude that ST had been selected as the replacement for IP. In the end, the IETF chose 6 because doing so eliminated confusion.

5.8.3 Why IPv6?

There are a number of reasons why IPv6 is appropriate for the next generation of the Internet Protocol. It solves the Internet scaling problem, provides a flexible transition mechanism for the current Internet, and was designed to meet the needs of new markets such as nomadic personal computing devices, networked entertainment, and device control. It does this in an evolutionary way which reduces the risk of architectural problems.

Ease of transition is a key point in the design of IPv6. It is not something that was added in at the end. IPv6 is designed to interoperate with IPv4. Specific mechanisms (embedded IPv4 addresses, pseudo-checksum rules, etc.) were built into IPv6 to support transition and compatibility with IPv4. It was designed to permit a gradual and piecemeal deployment with a minimum of dependencies.

IPv6 supports large hierarchical addresses which will allow the Internet to continue to grow and provide new routing capabilities not built into IPv4. It has anycast addresses, which can be used for policy route selection, and scoped multicast addresses, which provide improved scalability over IPv4 multicast. It also has local-use address mechanisms which provide the ability for plug-and-play installation. The address structure of IPv6 was also designed to support carrying the addresses of other internet protocol suites. Space was allocated in the addressing plan for IPX and NSAP addresses. This was done to facilitate migration of these internet protocols to IPv6.

IPv6 provides a platform for new Internet functionality. This includes support for real-time flows, provider selection, host mobility, end-to-end security, auto-configuration, and auto-reconfiguration. In summary, IPv6 is a new version of IP. It can be installed as a normal software upgrade in internet devices. It is interoperable with the current IPv4. Its deployment strategy was designed to not have any flag days. IPv6 is designed to run well on high-performance networks (e.g., ATM) and at the same time remains efficient for low-bandwidth networks (e.g., wireless). In addition, it provides a platform for new internet functionality that will be required in the near future.

5.8.4 Features of IPv6

The proposed IPv6 protocol retains many of the features that contributed to the success of IPv4. In fact, the designers have characterized IPv6 as being basically the same as IPv4 with a few modifications. For example, IPv6 still supports connectionless delivery (i.e., each datagram is routed independently), allows the sender to choose the size of a datagram, and requires the sender to specify the maximum number of hops a datagram can make before being terminated. As we will see, IPv6 also retains most of the concepts provided by IPv4 options, including facilities for fragmentation and source routing.

Despite many conceptual similarities, IPv6 changes most of the protocol details. For example, IPv6 uses larger addresses and adds a few features. More important, IPv6 completely revises the datagram format by replacing IPv4's variable-length options field with a series of fixed-format headers. We will examine the details after considering the major changes and the underlying motivation for each. The changes introduced by IPv6 can be grouped into seven categories:

Larger Addresses: The new address size is the most noticeable change. IPv6 quadruples the size of an IPv4 address from 32 bits to 128 bits. The IPv6 address space is so large that it cannot be exhausted in the foreseeable future.

Extended Address Hierarchy: IPv6 uses the larger address space to create additional levels of addressing hierarchy. In particular, IPv6 can define a hierarchy of ISPs as well as a hierarchical structure within a given site.

Flexible Header Format: IPv6 uses an entirely new and incompatible datagram format. Unlike the IPv4 fixed-format header, IPv6 defines a set of optional headers.

Improved Options: Like IPv4, IPv6 allows a datagram to include optional control information. IPv6 includes new options that provide additional facilities not available in IPv4.

Provision for Protocol Extension: Perhaps the most significant change in IPv6 is a move away from a protocol that fully specifies all details to a protocol that can permit additional features. The extension capability has the potential to allow the IETF to adapt the protocol to changes in underlying network hardware or to new applications.

Support for Autoconfiguration and Renumbering: IPv6 provides facilities that allow computers on an isolated network to assign themselves addresses and begin communicating without depending on a router or manual configuration. The protocol also includes a facility that permits a manager to renumber networks dynamically.


Support for Resource Allocation: IPv6 has two facilities that permit pre-allocation of network resources: a flow abstraction and a differentiated services specification. The latter uses the same approach as IPv4's differentiated services.

5.8.5 General Form of an IPv6 Datagram

IPv6 completely changes the datagram format. As figure 5.18 shows, an IPv6 datagram has a fixed-size base header followed by zero or more extension headers, followed by data.


Figure 5.18 A general format of an IPv6 datagram with multiple headers.
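Conceptually, each header in this chain names the type of the header that follows it, so a receiver can skip over extension headers to reach the transport payload. The short sketch below illustrates that walk under stated assumptions: it presumes the common extension-header convention of a NEXT HEADER octet followed by a length octet counted in additional 8-octet units, and the protocol numbers in the comments are standard IANA values used only as labels for the example.

    import struct

    # 0 = Hop-by-Hop, 43 = Routing, 44 = Fragment, 60 = Destination Options.
    # These extension headers share the "length in extra 8-octet units" layout
    # assumed here; this is a sketch, not a complete IPv6 parser.
    SKIPPABLE_EXTENSIONS = {0, 43, 44, 60}

    def skip_extension_headers(first_type, after_base_header):
        """Walk the header chain and return (upper_layer_type, payload_bytes)."""
        kind, data = first_type, after_base_header
        while kind in SKIPPABLE_EXTENSIONS:
            next_kind, ext_len = data[0], data[1]
            header_bytes = (ext_len + 1) * 8    # first 8 octets are implicit
            kind, data = next_kind, data[header_bytes:]
        return kind, data

    # Example: an 8-octet routing extension header in front of a TCP segment.
    routing_hdr = struct.pack("!BB6x", 6, 0)    # NEXT HEADER = TCP, length = 0
    kind, payload = skip_extension_headers(43, routing_hdr + b"TCP segment bytes")
    print(kind, payload)                        # 6 b'TCP segment bytes'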

Interestingly, although it must accommodate larger addresses, an IPv6 base header contains less information than an IPv4 datagram header. Options and some of the fixed fields that appear in an IPv4 datagram header have been moved to extension headers in IPv6. In general, the changes in the datagram header reflect changes in the protocol:

Alignment has been changed from 32-bit to 64-bit multiples.
The header length field has been eliminated, and the datagram length field has been replaced by a PAYLOAD LENGTH field.
The size of the source and destination address fields has been increased to 16 octets each.
Fragmentation information has been moved out of fixed fields in the base header into an extension header.
The TIME-TO-LIVE field has been replaced by a HOP LIMIT field.
The SERVICE TYPE field is renamed TRAFFIC CLASS, and extended with a FLOW LABEL field.
The PROTOCOL field has been replaced by a field that specifies the type of the next header.

Figure 5.19 shows the contents and format of an IPv6 base header. Several fields in an IPv6 base header correspond directly to fields in an IPv4 header. As in IPv4, the initial 4-bit VERS field specifies the version of the protocol; VERS always contains 6 in an IPv6 datagram.

As in IPv4, the SOURCE ADDRESS and DESTINATION ADDRESS fields specify the addresses of the sender and intended recipient. In IPv6, however, each address requires 16 octets. The HOP LIMIT field corresponds to the IPv4 TIME-TO-LIVE field; unlike IPv4, however, IPv6 interprets the value as giving a strict bound on the maximum number of hops a datagram can make before being discarded.

Figure 5.19 The format of the 40-octet IPv6 base header.
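To see how the fields named in the text fit into exactly 40 octets, the sketch below packs a base header with illustrative values; the link-local style addresses and the TCP next-header value are examples, not requirements.

    import struct

    def ipv6_base_header(traffic_class, flow_label, payload_len,
                         next_header, hop_limit, src16, dst16):
        """Pack the 40-octet IPv6 base header described in figure 5.19."""
        # VERS (4 bits) = 6, TRAFFIC CLASS (8 bits), FLOW LABEL (20 bits)
        first_word = (6 << 28) | (traffic_class << 20) | flow_label
        return struct.pack("!IHBB", first_word, payload_len,
                           next_header, hop_limit) + src16 + dst16

    src = bytes.fromhex("fe80000000000000" + "0000000000000001")   # example
    dst = bytes.fromhex("fe80000000000000" + "0000000000000002")   # example
    hdr = ipv6_base_header(traffic_class=0, flow_label=0,
                           payload_len=20, next_header=6,   # 6 = TCP
                           hop_limit=64, src16=src, dst16=dst)
    print(len(hdr))    # 40 octets, as the text requires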

IPv6 handles datagram length specifications in a new way. First, because the size of the base header is fixed at 40 octets, the base header does not include a field for the header length. Second, IPv6 replaces IPv4's datagram length field by a 16-bit PAYLOAD LENGTH field that specifies the number of octets carried in the datagram excluding the header itself. Thus, an IPv6 datagram can contain 64K octets of data.

Two fields in the base header are used in making forwarding decisions. The IPv4 SERVICE CLASS field has been renamed TRAFFIC CLASS. In addition, a new mechanism in IPv6 supports resource reservation and allows a router to associate each datagram with a given resource allocation. The underlying abstraction, a flow, consists of a path through an internet along which intermediate routers guarantee a specific quality of service. The FLOW LABEL field in the base header contains information that routers use to associate a datagram with a specific flow and priority. For example, two applications that need to send video can establish a flow on which the delay and bandwidth are guaranteed. Alternatively, a network provider may require a subscriber to specify the quality of service desired, and then use a flow to limit the traffic a specific computer or a specific application sends. Note that flows can also be used within a given organization to manage network resources and ensure that all applications receive a fair share. A router uses the combination of the datagram source address and flow identifier when associating a datagram with a specific flow. To summarize:


Each IPv6 datagram begins with a 40-octet base header that includes fields for the source and destination addresses, the maximum hop limit, the traffic class, the flow label, and the type of the next header. Thus, an IPv6 datagram must contain at least 40 octets in addition to the data.

5.8.6 Extension Headers

IPv6 includes an improved option mechanism over IPv4. IPv6 options are placed in separate extension headers that are located between the IPv6 header and the transport-layer header in a packet. Most IPv6 extension headers are not examined or processed by any router along a packet's delivery path until it arrives at its final destination. This facilitates a major improvement in router performance for packets containing options; in IPv4, the presence of any options requires the router to examine all options. The other improvement is that, unlike IPv4 options, IPv6 extension headers can be of arbitrary length and the total amount of options carried in a packet is not limited to 40 bytes. This feature, plus the manner in which they are processed, permits IPv6 options to be used for functions which were not practical in IPv4. A good example of this is the IPv6 Authentication and Security Encapsulation options.

In order to improve performance when handling subsequent option headers and the transport protocol which follows, IPv6 options are always an integer multiple of 8 octets long, in order to retain this alignment for subsequent headers. The IPv6 extension headers which are currently defined are:

Routing: Extended routing (like the IPv4 loose source route).
Fragmentation: Fragmentation and reassembly.
Authentication: Integrity and authentication.
Security Encapsulation: Confidentiality.
Hop-by-Hop Option: Special options which require hop-by-hop processing.
Destination Options: Optional information to be examined by the destination node.

5.8.7 IPv6 Addressing


IPv6 addresses are 128 bits long and are identifiers for individual interfaces and sets of interfaces. IPv6 addresses of all types are assigned to interfaces, not nodes.

Since each interface belongs to a single node, any of that node's unicast addresses may be used as an identifier for the node. A single interface may be assigned multiple IPv6 addresses of any type.

There are three types of IPv6 addresses: unicast, anycast, and multicast. Unicast addresses identify a single interface. Anycast addresses identify a set of interfaces such that a packet sent to an anycast address will be delivered to one member of the set. Multicast addresses identify a group of interfaces, such that a packet sent to a multicast address is delivered to all of the interfaces in the group. There are no broadcast addresses in IPv6; their function is superseded by multicast addresses.

IPv6 supports addresses which are four times the number of bits as IPv4 addresses (128 vs. 32). This is 4 billion times 4 billion times 4 billion (2^96) times the size of the IPv4 address space (2^32). This works out to be 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses, an extremely large address space. In a theoretical sense this is approximately 665,570,793,348,866,943,898,599 addresses per square meter of the surface of the planet Earth (assuming the earth's surface is 511,263,971,197,990 square meters). In more practical terms, the assignment and routing of addresses requires the creation of hierarchies, which reduces the efficiency of the usage of the address space. Christian Huitema performed an analysis which evaluated the efficiency of other addressing architectures (including the French telephone system, USA telephone systems, the current internet using IPv4, and IEEE 802 nodes). He concluded that 128-bit IPv6 addresses could accommodate between 8x10^17 and 2x10^33 nodes, assuming efficiency in the same ranges as the other addressing architectures. Even his most pessimistic estimate would provide 1,564 addresses for each square meter of the surface of the planet Earth; the optimistic estimate would allow for 3,911,873,538,269,506,102 addresses for each square meter.

The specific type of IPv6 address is indicated by the leading bits in the address. The variable-length field comprising these leading bits is called the Format Prefix (FP). The initial allocation of these prefixes is as follows:

Allocation                                                   Prefix (binary)   Fraction of Address Space
Reserved                                                     0000 0000         1/256
Unassigned                                                   0000 0001         1/256
Reserved for NSAP Allocation                                 0000 001          1/128
Reserved for IPX Allocation                                  0000 010          1/128
Unassigned                                                   0000 011          1/128
Unassigned                                                   0000 1            1/32
Unassigned                                                   0001              1/16
Unassigned                                                   001               1/8
Provider-Based Unicast Address                               010               1/8
Unassigned                                                   011               1/8
Reserved for Neutral-Interconnect-Based Unicast Addresses    100               1/8
Unassigned                                                   101               1/8
Unassigned                                                   110               1/8
Unassigned                                                   1110              1/16
Unassigned                                                   1111 0            1/32
Unassigned                                                   1111 10           1/64
Unassigned                                                   1111 110          1/128
Unassigned                                                   1111 1110 0       1/512
Link Local Use Addresses                                     1111 1110 10      1/1024
Site Local Use Addresses                                     1111 1110 11      1/1024
Multicast Addresses                                          1111 1111         1/256

This allocation supports the direct allocation of provider addresses, local-use addresses, and multicast addresses. Space is reserved for NSAP addresses, IPX addresses, and neutral-interconnect addresses. The remainder of the address space is unassigned for future use. This can be used for expansion of existing use (e.g., additional provider addresses) or new uses (e.g., separate locators and identifiers). Note that anycast addresses are not shown here because they are allocated out of the unicast address space. Approximately fifteen percent of the address space is initially allocated; the remaining 85% is reserved for future use.

Unicast Addresses

There are several forms of unicast address assignment in IPv6. These are the global provider-based unicast address, the neutral-interconnect unicast address, the NSAP address, the IPX hierarchical address, the site-local-use address, the link-local-use address, and the IPv4-capable host address. Additional address types can be defined in the future.
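To make the Format Prefix idea concrete, the sketch below classifies an address by its leading bits using a few entries from the allocation table above; the sample addresses and the helper name are illustrative only.

    import ipaddress

    # A few of the Format Prefix entries from the allocation table above,
    # written as (binary prefix, allocation) pairs.
    FORMAT_PREFIXES = [
        ("010",        "Provider-Based Unicast Address"),
        ("1111111010", "Link Local Use Addresses"),
        ("1111111011", "Site Local Use Addresses"),
        ("11111111",   "Multicast Addresses"),
    ]

    def classify(address):
        """Return the allocation category for an IPv6 address, if listed."""
        bits = format(int(ipaddress.IPv6Address(address)), "0128b")
        for prefix, allocation in FORMAT_PREFIXES:
            if bits.startswith(prefix):
                return allocation
        return "Other / Unassigned"

    print(classify("fe80::1"))    # Link Local Use Addresses
    print(classify("ff02::1"))    # Multicast Addresses
    print(classify("fec0::1"))    # Site Local Use Addresses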


Provider-Based Unicast Addresses

Provider-based unicast addresses are used for global communication. They are similar in function to IPv4 addresses under CIDR. The assignment plan for provider-based unicast addresses is as follows: a 3-bit format prefix (010), followed by a REGISTRY ID, a PROVIDER ID, a SUBSCRIBER ID, a SUBNET ID, and an INTERFACE ID.

The first 3 bits identify the address as a provider-oriented unicast address. The next field (REGISTRY ID) identifies the internet address registry which assigns provider identifiers (PROVIDER ID) to internet service providers, which then assign portions of the address space to subscribers. This usage is similar to the assignment of IP addresses under CIDR. The SUBSCRIBER ID distinguishes among multiple subscribers attached to the internet service provider identified by the PROVIDER ID. The SUBNET ID identifies a specific physical link. There can be multiple subnets on the same physical link, but a specific subnet cannot span multiple physical links. The INTERFACE ID identifies a single interface among the group of interfaces identified by the subnet prefix.

Local-Use Addresses

A local-use address is a unicast address that has only local routability scope (within the subnet or within a subscriber network), and may have local or global uniqueness scope. Local-use addresses are intended for use inside a site for plug-and-play local communication and for bootstrapping up to the use of global addresses. There are two types of local-use unicast addresses defined: Link-Local and Site-Local. The Link-Local-Use address is for use on a single link and the Site-Local-Use address is for use in a single site. Link-Local-Use addresses have the following format: the prefix 1111 1110 10, followed by zero bits, followed by an INTERFACE ID.

Link-Local-Use addresses are designed to be used for addressing on a single link for purposes such as auto-address configuration.


Site-Local-Use addresses have a similar format, with a SUBNET ID field in addition to the INTERFACE ID.


For both types of local-use addresses, the INTERFACE ID is an identifier which must be unique in the domain in which it is being used. In most cases these will use a node's IEEE 802 48-bit address. The SUBNET ID identifies a specific subnet in a site. The combination of the SUBNET ID and the INTERFACE ID to form a local-use address allows a large private internet to be constructed without any other address allocation.

Local-use addresses allow organizations that are not (yet) connected to the global Internet to operate without the need to request an address prefix from the global Internet address space; local-use addresses can be used instead. If the organization later connects to the global Internet, it can use its SUBNET ID and INTERFACE ID in combination with a global prefix (e.g., REGISTRY ID + PROVIDER ID + SUBSCRIBER ID) to create a global address. This is a significant improvement over IPv4, which requires sites that use private (non-global) IPv4 addresses to manually renumber when they connect to the Internet; IPv6 does the renumbering automatically.

IPv6 Addresses with Embedded IPv4 Addresses

The IPv6 transition mechanisms include a technique for hosts and routers to dynamically tunnel IPv6 packets over IPv4 routing infrastructure. IPv6 nodes that utilize this technique are assigned special IPv6 unicast addresses that carry an IPv4 address in the low-order 32 bits. This type of address is termed an IPv4-compatible IPv6 address and has the following format: the high-order 96 bits are zero and the low-order 32 bits hold the IPv4 address.
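A short sketch (Python, standard library only; the address value is just an example) confirms that the embedded IPv4 address can be read straight out of the low-order 32 bits.

    import ipaddress

    # An IPv4-compatible IPv6 address: "::" followed by the IPv4 address.
    v6 = ipaddress.IPv6Address("::192.0.2.1")
    embedded = ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)   # low-order 32 bits
    print(v6)         # ::c000:201 (the same address shown in hexadecimal groups)
    print(embedded)   # 192.0.2.1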

Anycast Addresses

An IPv6 anycast address is an address that is assigned to more than one interface (typically belonging to different nodes), with the property that a packet sent to an anycast address is routed to the nearest interface having that address, according to the routing protocol's measure of distance.

Anycast addresses, when used as part of a route sequence, permit a node to select which of several internet service providers it wants to carry its traffic. This capability is sometimes called source-selected policies. It would be implemented by configuring anycast addresses to identify the set of routers belonging to internet service providers (e.g., one anycast address per internet service provider). These anycast addresses can be used as intermediate addresses in an IPv6 routing header, to cause a packet to be delivered via a particular provider or sequence of providers. Other possible uses of anycast addresses are to identify the set of routers attached to a particular subnet, or the set of routers providing entry into a particular routing domain.

Anycast addresses are allocated from the unicast address space, using any of the defined unicast address formats. Thus, anycast addresses are syntactically indistinguishable from unicast addresses. When a unicast address is assigned to more than one interface, thus turning it into an anycast address, the nodes to which the address is assigned must be explicitly configured to know that it is an anycast address.

Multicast Addresses

An IPv6 multicast address is an identifier for a group of interfaces. An interface may belong to any number of multicast groups. Multicast addresses have the following format:

11111111 at the start of the address identifies the address as being a multicast address.

FLGS is a set of 4 flags:

    +-+-+-+-+
    |0|0|0|T|
    +-+-+-+-+

The high-order 3 flags are reserved, and must be initialized to 0. T=0 indicates a permanently assigned (well-known) multicast address, assigned by the global internet numbering authority. T=1 indicates a non-permanently assigned (transient) multicast address.

SCOP is a 4-bit multicast scope value used to limit the scope of the multicast group. The values are:

    0    Reserved
    1    Node-local scope
    2    Link-local scope
    3    (unassigned)
    4    (unassigned)
    5    Site-local scope
    6    (unassigned)
    7    (unassigned)
    8    Organization-local scope
    9    (unassigned)
    A    (unassigned)
    B    (unassigned)
    C    (unassigned)
    D    (unassigned)
    E    Global scope
    F    Reserved

GROUP ID identifies the multicast group, either permanent or transient, within the given scope.
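To illustrate this layout, the sketch below (Python, standard library only; the addresses are arbitrary examples) extracts the FLGS, SCOP and GROUP ID fields from a multicast address.

    import ipaddress

    def multicast_info(addr_str):
        # Decode the FLGS, SCOP and GROUP ID fields of an IPv6 multicast address.
        a = int(ipaddress.IPv6Address(addr_str))
        if (a >> 120) != 0xFF:
            raise ValueError("not a multicast address")
        flags = (a >> 116) & 0xF            # 4-bit FLGS field
        scope = (a >> 112) & 0xF            # 4-bit SCOP field
        group_id = a & ((1 << 112) - 1)     # remaining 112 bits
        kind = "transient" if flags & 0x1 else "well-known"
        return kind, scope, group_id

    print(multicast_info("ff02::1"))    # ('well-known', 2, 1)  -> link-local scope
    print(multicast_info("ff15::42"))   # ('transient', 5, 66)  -> site-local scope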

5.8.8 Transition Mechanisms

The key transition objective is to allow IPv6 and IPv4 hosts to interoperate. A second objective is to allow IPv6 hosts and routers to be deployed in the Internet in a highly diffuse and incremental fashion, with few interdependencies. A third objective is that the transition should be as easy as possible for end users, system administrators, and network operators to understand and carry out. The IPv6 transition mechanisms are a set of protocol mechanisms implemented in hosts and routers, along with some operational guidelines for addressing and deployment, designed to make the transition of the Internet to IPv6 work with as little disruption as possible. The IPv6 transition mechanisms provide a number of features, including:

Incremental upgrade and deployment. Individual IPv4 hosts and routers may be upgraded to IPv6 one at a time without requiring any other hosts or routers to be upgraded at the same time. New IPv6 hosts and routers can be installed one by one.

Minimal upgrade dependencies. The only prerequisite to upgrading hosts to IPv6 is that the DNS server must first be upgraded to handle IPv6 address records. There are no prerequisites to upgrading routers.

Easy addressing. When existing installed IPv4 hosts or routers are upgraded to IPv6, they may continue to use their existing addresses. They do not need to be assigned new addresses, and administrators do not need to draft new addressing plans.

Low start-up costs. Little or no preparation work is needed in order to upgrade existing IPv4 systems to IPv6, or to deploy new IPv6 systems.

The mechanisms employed by the IPv6 transition mechanisms include:

An IPv6 addressing structure that embeds IPv4 addresses within IPv6 addresses, and encodes other information used by the transition mechanisms.


A model of deployment where all hosts and routers upgraded to IPv6 in the early transition phase are dual capable (i.e., implement complete IPv4 and IPv6 protocol stacks).

The technique of encapsulating IPv6 packets within IPv4 headers to carry them over segments of the end-to-end path where the routers have not yet been upgraded to IPv6.

The header translation technique, to allow the eventual introduction of routing topologies that route only IPv6 traffic, and the deployment of hosts that support only IPv6. Use of this technique is optional, and would be used in the later phase of transition if it is used at all.

The IPv6 transition mechanisms ensure that IPv6 hosts can interoperate with IPv4 hosts anywhere in the Internet up until the time when IPv4 addresses run out, and allow IPv6 and IPv4 hosts within a limited scope to interoperate indefinitely after that. This feature protects the huge investment users have made in IPv4 and ensures that IPv6 does not render IPv4 obsolete. Hosts that need only a limited connectivity range (e.g., printers) need never be upgraded to IPv6. The incremental upgrade features of the IPv6 transition mechanisms allow the host and router vendors to integrate IPv6 into their product lines at their own pace, and allow the end users and network operators to deploy IPv6 on their own schedules.
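The encapsulation mechanism listed above simply places a complete IPv6 packet in the payload of an IPv4 datagram whose protocol field is 41. The sketch below (Python, standard library only; the addresses and payload are placeholders, and the IPv4 header checksum is left at zero rather than computed) shows the idea.

    import struct

    def encapsulate_ipv6_in_ipv4(src_v4, dst_v4, ipv6_packet):
        # Build a minimal 20-byte IPv4 header with protocol = 41 (IPv6)
        # and append the IPv6 packet as payload.
        version_ihl = (4 << 4) | 5
        total_length = 20 + len(ipv6_packet)
        header = struct.pack("!BBHHHBBH4s4s",
                             version_ihl, 0, total_length,
                             0, 0,              # identification, flags/fragment offset
                             64, 41, 0,         # TTL, protocol 41, checksum (omitted)
                             bytes(map(int, src_v4.split("."))),
                             bytes(map(int, dst_v4.split("."))))
        return header + ipv6_packet

    fake_ipv6_packet = b"\x60" + b"\x00" * 39   # stand-in for a real IPv6 packet
    frame = encapsulate_ipv6_in_ipv4("192.0.2.1", "198.51.100.2", fake_ipv6_packet)
    print(len(frame), frame[9])                 # 60 41  -> protocol field is 41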


5.8.9 Issues

Deering's original proposal used 8-byte addresses, but during the review process many people felt that with 8-byte addresses IPv6 would run out of addresses within a few decades, whereas with 16-byte addresses it would never run out. Other people argued that 16 bytes was too lengthy, whereas still others favoured using 20-byte addresses to be compatible with the OSI datagram protocol. Still another faction wanted variable-sized addresses. After much debate, it was decided that fixed-length 16-byte addresses were the best compromise.

Another issue was the length of the HOP LIMIT field. One camp felt strongly that limiting the maximum number of hops to 255 was a gross mistake. After all, paths of 32 hops are common now, and ten years from now much longer paths may be common. These people argued that using a huge address size was far-sighted but using a tiny hop count was short-sighted.

Another hot issue was the maximum packet size. The supercomputer community wanted packets in excess of 64 KB. When a supercomputer gets started transferring, it really means business and does not want to be interrupted every 64 KB. The argument against large packets is that if a 1-MB packet hits a 1.5-Mbps T1 line, that packet will tie the line up for over 5 seconds, producing a very noticeable delay for interactive users sharing the line. A compromise was reached such that normal packets are limited to 64 KB, but the hop-by-hop extension header can be used to permit jumbograms.

One more hot topic was removing the IPv4 CHECKSUM. Some people likened this move to removing the brakes from a car: doing so makes the car lighter, so it can go faster, but if an unexpected event happens, you have a problem. The argument against the CHECKSUM was that any application that really cares about data integrity has to have a transport-layer checksum anyway, so having another one in IP is overkill. Furthermore, experience showed that computing the IP CHECKSUM was the major expense in IPv4. The anti-checksum faction won this one, and IPv6 does not have a checksum.

Mobile hosts were also a point of contention. If a portable computer flies halfway around the world, can it continue operating at the destination with the same IPv6 address, or does it have to use a scheme with home agents and foreign agents? Mobile hosts also introduce asymmetries into the routing system. It may well be the case that a small mobile computer can easily hear the powerful signal put out by a large stationary router, but the stationary router cannot hear the feeble signal put out by the mobile host. Consequently, some people wanted to build explicit support for mobile hosts into IPv6. That effort failed when no consensus could be found for any specific proposal.

Probably the biggest battle was about security. Everyone agreed that it was essential; the war was about where and how. The argument for putting it in the network layer is that it can become a standard service that all applications can use without any advance planning. The argument against it is that really secure applications generally want nothing less than end-to-end encryption, where the source application does the encryption and the destination undoes it. With anything less, the user is at the mercy of potentially buggy network layer implementations over which he has no control. The response to this argument is that these applications can simply refrain from using the IP security features and do the job themselves. The rejoinder to that is that people who do not trust the network to do it right do not want to pay the price of slow, bulky IP implementations that have this capability, even if it is disabled.

Have you understood?

1. What are the limitations of IPv4?
2. IPv6 is an upgraded version of IPv4. Justify this statement.
3. What are the features of IPv4 retained by IPv6?
4. Mention the features of IPv4 changed in IPv6.

5. Mention the seven categories of changes made in IPv6.
6. What are the fields present in the base header of IPv6?
7. What is the purpose of the extension headers of IPv6?
8. What are the features provided by the IPv6 transition mechanisms?
9. What are the types of addressing schemes supported by IPv6?
10. Mention the debatable features of IPv6.

Summary

1. Network management refers to the maintenance and administration of large-scale computer networks and telecommunications networks at the top level. Network management is the execution of the set of functions required for controlling, planning, allocating, deploying, coordinating, and monitoring the resources of a network.
2. Network management means different things to different people. In some cases, it involves a solitary network consultant monitoring network activity with an outdated protocol analyzer. In other cases, network management involves a distributed database, autopolling of network devices, and high-end workstations generating real-time graphical views of network topology changes and traffic.
3. Any Network Management System (NMS) has two basic entities, namely the managing device and the managed device. Managed devices, such as computer systems and other network devices, run software that enables them to send alerts to the managing devices when they recognize problems. Upon receiving these alerts, management entities are programmed to react by executing one, several, or a group of actions, including operator notification, event logging, system shutdown, and automatic attempts at system repair.
4. The ISO has contributed a great deal to network standardization. Its network management model is the primary means for understanding the major functions of network management systems. This model consists of the following five conceptual areas: performance management, configuration management, accounting management, fault management and security management.
5. In a TCP/IP internet, a manager needs to examine and control routers and other network devices. Because such devices attach to arbitrary networks, protocols for internet management operate at the application level and communicate using TCP/IP transport-level protocols.
6. An SNMP-managed network consists of five key components: managed devices, agents, network-management systems (NMSs) (or management stations), management information and a management protocol.
7. In SNMP, each device maintains one or more variables that describe its state. The heart of the SNMP model is the set of objects managed by the agents and read and written by the management station. To make multi-vendor communication possible, it is essential that these objects be defined in a standard and vendor-neutral way. The standard for object definition language chosen by SNMP is Abstract Syntax Notation One (ASN.1).
8. One of the main reasons for the success of ASN.1 is that this notation is associated with several standardized encoding rules such as the BER or, more recently, the PER. These encoding rules describe how the values defined in ASN.1 should be encoded for transmission regardless of machine, programming language, or how they are represented in an application program.
9. The Structure of Management Information (SMI) is both a subset and an extension of ASN.1 that is used to define the data structures used in the Simple Network Management Protocol. This is the relationship between ASN.1 and SMI.
10. A Management Information Base (MIB) is a type of database used to manage the devices in a communication network. It comprises a collection of objects in a database used to manage entities in a network. Objects in the MIB are defined using a subset of Abstract Syntax Notation One called the Structure of Management Information.
11. SNMP is a simple request/response protocol. The network-management system issues a request, and managed devices return responses. This behavior is implemented by using one of four protocol operations: Get, GetNext, Set, and Trap.
12. RMON stands for Remote Monitoring. It is a standard used in telecommunications equipment, e.g. in routers, which implement a MIB (Management Information Base) that allows for remote monitoring and management of network equipment. RMON uses an agent running on the device being monitored to supply information over SNMP to a management workstation (or some other system).
13. The terms network security and information security refer in a broad sense to confidence that information and services available on a network cannot be accessed by unauthorized users. Security implies safety, including assurance of data integrity, freedom from unauthorized access of computational resources, freedom from snooping or wiretapping, and freedom from disruption of service.
14. The security policy specifies who will be granted access to each piece of information, the rules an individual must follow in disseminating the information to others, and a statement of how the organization will react to violations.
15. Two possibilities for providing security in networks are the end-to-end approach and the network-layer solution. Initially everyone thought of leaving security to the application layer. The limitation of this approach is that all the applications must be security aware. The network-layer solution eliminates this constraint.
16. The IETF has devised a set of protocols that provide secure Internet communication. This set of protocols is collectively known as IPsec (short for IP security); these protocols offer authentication and privacy services at the IP layer, and can be used with both IPv4 and IPv6.
17. The major advantage of IPsec is that it does not restrict the users to a particular encryption or authentication algorithm. Instead, it provides a general framework that allows each pair of communicating endpoints to choose algorithms and parameters.
18. IPsec is not a single security protocol. Instead, IPsec provides a set of security algorithms plus a general framework that allows a pair of communicating entities to use whichever algorithms provide security appropriate for the communication.
19. A destination uses the security parameters index to identify the security association for a packet. The values are not global; a combination of destination address and security parameters index is needed to identify an SA.
20. An organization that has multiple external connections must install a firewall on each external connection and must coordinate all firewalls. Failure to restrict access identically on all firewalls can leave the organization vulnerable.
21. To be effective, a firewall that uses datagram filtering should restrict access to all IP sources, IP destinations, protocols, and protocol ports except those computers, networks and services the organization explicitly decides to make available externally. A packet filter that allows a manager to specify which datagrams to admit instead of which datagrams to block can make such restrictions easy to specify.
22. If an organization's firewall restricts incoming datagrams except for ports that correspond to services the organization makes available externally, an arbitrary application inside the organization cannot become a client of a server outside the organization.
23. Although a bastion host is essential for communication through a firewall, the security of the firewall depends on the safety of the bastion host. An intruder who exploits a security flaw in the bastion host operating system can gain access to hosts inside the firewall.
24. When multiple external sites connect through a single firewall, an architecture that has a router per external connection can prevent unwanted packet flow from one external site to another.
25. When IP was designed, a 32-bit address space was more than sufficient. Only a handful of organizations used a LAN; fewer had a corporate WAN. Now, most medium-sized corporations have multiple LANs, and most large corporations have a corporate WAN. Consequently, even with careful assignment and NAT technology, the current 32-bit IP address space cannot accommodate the projected growth of the global Internet beyond the year 2020.
26. Each IPv6 datagram begins with a 40-octet base header that includes fields for the source and destination addresses, the maximum hop limit, the traffic class, the flow label, and the type of the next header. Thus, an IPv6 datagram must contain at least 40 octets in addition to the data.
27. IPv6 extension headers are similar to IPv4 options. Each datagram includes extension headers for only those facilities that the datagram uses.
28. An internet protocol that uses end-to-end fragmentation requires a sender to discover the path MTU to each destination, and to fragment any outgoing datagram that is larger than the path MTU. End-to-end fragmentation does not accommodate route changes.
29. In addition to choosing technical details of a new Internet Protocol, the IETF work on IPv6 has focused on finding a way to transition from the current protocol to the new protocol. In particular, the current proposal for IPv6 allows one to encode an IPv4 address inside an IPv6 address such that address translation does not change the pseudo-header checksum.
30. The IPv6 transition mechanisms are a set of protocol mechanisms implemented in hosts and routers, along with some operational guidelines for addressing and deployment, designed to make the transition of the Internet to IPv6 work with as little disruption as possible.

Exercises

1. What is the role of SNMP in network management?
2. What is the role of SMI in network management?
3. What is the role of MIB in network management?
4. Explain the encoding format of BER.
5. Define INTEGER 14, OCTET STRING "HI", ObjectIdentifier 1.3.6.1, and IPAddress 131.21.14.8.
6. Draw the format of the various SNMP messages.
7. In SNMP, using two different port numbers allows a system to run both a manager and an agent. What would happen if the same port number were used for both?
8. How would you list an entire routing table using get-next?
9. How can a client search the ipAddrTable without knowing which IP addresses are in the table on a given router?
10. What is the difference in the way programmers use arrays and the way network management software uses tables in MIB?

Answers

1. SNMP defines the format of packets exchanged between a manager and an agent. It reads and changes the status (values) of objects (variables) in SNMP packets.
2. SMI defines the general rules for naming objects, defining object types (including range and length), and showing how to encode objects and values. SMI neither defines the number of objects an entity should manage, nor names the objects to be managed, nor defines the association between the objects and their values.
3. MIB creates a collection of named objects, their types, and their relationships to each other in an entity to be managed.
4. Tag is a 1-byte field that defines the type of data. It is composed of three subfields: class (2 bits), format (1 bit), and number (5 bits). The class subfield defines the scope of the data. Four classes are defined: universal (00), application-wide (01), context-specific (10), and private (11). The universal data types are those taken from ASN.1 (INTEGER, OCTET STRING and ObjectIdentifier). The application-wide data types are those added by SMI (IPAddress, Counter, Gauge and TimeTicks). The five context-specific data types have meanings that may change from one protocol to another. The private data types are vendor specific. The format subfield indicates whether the data is simple (0) or structured (1). The number subfield further divides simple or structured data into subgroups. The following table shows the codes for data types.
Data Type                Class    Format    Number    Tag (Binary)    Tag (Hex)
Integer                  00       0         00010     00000010        02
Octet String             00       0         00100     00000100        04
Object Identifier        00       0         00110     00000110        06
Null                     00       0         00101     00000101        05
Sequence, Sequence of    00       1         10000     00110000        30
IPAddress                01       0         00000     01000000        40
Counter                  01       0         00001     01000001        41
Gauge                    01       0         00010     01000010        42
TimeTicks                01       0         00011     01000011        43
Opaque                   01       0         00100     01000100        44

The length field is 1 or more bytes. If it is 1 byte, the most significant bit must be 0. The other 7 bits define the length of the data. If it is more than 1 byte, the most significant bit of the first byte must be 1. The other 7 bits of the first byte define the number of bytes needed to define the length.
Each encoded value has the overall structure

    Tag | Length | Value

and the tag byte itself is subdivided as

    Class (2 bits) | Format (1 bit) | Number (5 bits)

The value field codes the value of the data according to the rules defined in BER.
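The tag subfields can be pulled apart programmatically; the short sketch below (Python; the helper name decode_tag is ours) checks two tags against the table above.

    def decode_tag(tag_byte):
        # Split a BER/SMI tag byte into its class, format and number subfields.
        cls = (tag_byte >> 6) & 0b11      # 2-bit class
        fmt = (tag_byte >> 5) & 0b1       # 1-bit format (0 simple, 1 structured)
        num = tag_byte & 0b11111          # 5-bit number
        return cls, fmt, num

    # 0x40 is the IPAddress tag from the table: class 01, format 0, number 00000
    print(decode_tag(0x40))   # (1, 0, 0)
    # 0x30 is Sequence: class 00, format 1 (structured), number 10000
    print(decode_tag(0x30))   # (0, 1, 16)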


5. Each value is encoded as a tag, length, value triple, using the tags from the table in answer 4:
   INTEGER 14                -> 02 01 0E
   OCTET STRING "HI"         -> 04 02 48 49
   ObjectIdentifier 1.3.6.1  -> 06 03 2B 06 01
   IPAddress 131.21.14.8     -> 40 04 83 15 0E 08
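These encodings can be reproduced with the sketch below (Python; the helper functions are ours, and the length and object-identifier handling cover only the simple cases needed here).

    def tlv(tag, value):
        # Short-form length only (value shorter than 128 bytes), which is
        # enough for the examples in question 5.
        return bytes([tag, len(value)]) + value

    def encode_oid(*arcs):
        # The first two arcs share one byte (40 * first + second); the
        # remaining arcs here are all < 128, so no continuation bytes are needed.
        body = bytes([40 * arcs[0] + arcs[1]]) + bytes(arcs[2:])
        return tlv(0x06, body)

    print(tlv(0x02, bytes([14])).hex())              # 02010e
    print(tlv(0x04, b"HI").hex())                    # 04024849
    print(encode_oid(1, 3, 6, 1).hex())              # 06032b0601
    print(tlv(0x40, bytes([131, 21, 14, 8])).hex())  # 400483150e08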

6. (Figure in the original: the formats of the GetRequest, GetNextRequest, SetRequest, GetResponse and Trap messages.)

7. If a system is running both a manager and an agent, they are probably different processes. The manager listens on UDP port 162 for traps, and the agent listens on UDP port 161 for requests. If the same port were used for both traps and requests, separating the manager from the agent would be hard.
8. The operation of the get-next operator is based on the lexicographic ordering of the MIB. Using the get-next operator in this fashion, one could imagine a manager with a loop that starts at the beginning of the MIB and queries the agent for every variable that the agent maintains. Another use of the operator is to iterate through tables.
9. A client that does not know which IP addresses identify entries in the table on a given router cannot form a complete object identifier. However, the client can still use the get-next-request operation to search the table by sending the prefix.
10. Programmers think of an array as a set of elements that have an index used to select a specific element. For example, the programmer might write xyz[3] to select the third element from array xyz. ASN.1 syntax does not use integer indices. Instead, MIB tables append a suffix onto the name to select a specific element in the table. For our example of an IP address table, the standard specifies that the suffix used to select an item consists of an IP address. Syntactically, the IP address (in dotted decimal notation) is concatenated onto the end of the object name to form the reference. Thus, to specify the network mask field in the IP address table entry corresponding to address 128.10.2.3, one uses the name:

iso.org.dod.internet.mgmt.mib.ip.ipAddrTable.ipAddrEntry.ipAdEntNetMask.128.10.2.3

which, in numeric form, becomes:

1.3.6.1.2.1.4.20.1.3.128.10.2.3

Although concatenating an index to the end of a name may seem awkward, it provides a powerful tool that allows clients to search tables without knowing the number of items or the type of data used as an index.
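To illustrate the table search described in answers 8-10, the toy sketch below (Python; the MIB contents and helpers are invented for illustration) orders object identifiers lexicographically and returns the entry that follows a given name, which is essentially what repeated get-next requests do.

    def oid_key(oid):
        # Lexicographic ordering of an OID, component by component.
        return tuple(int(part) for part in oid.split("."))

    def get_next(mib, oid):
        # Return the first (oid, value) pair that follows 'oid' in MIB order.
        for candidate in sorted(mib, key=oid_key):
            if oid_key(candidate) > oid_key(oid):
                return candidate, mib[candidate]
        return None

    # A tiny invented slice of ipAdEntNetMask (1.3.6.1.2.1.4.20.1.3) entries.
    mib = {
        "1.3.6.1.2.1.4.20.1.3.128.10.2.3": "255.255.255.0",
        "1.3.6.1.2.1.4.20.1.3.192.5.48.3": "255.255.255.0",
    }

    # Walk the table by starting from the column prefix and chaining requests.
    prefix = "1.3.6.1.2.1.4.20.1.3"
    oid = prefix
    while True:
        result = get_next(mib, oid)
        if result is None or not result[0].startswith(prefix + "."):
            break
        print(result)          # each table entry in turn
        oid = result[0]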

References
1) Douglas E. Comer, Internetworking with TCP/IP: Principles, Protocols, and Architectures, Fourth Edition, Prentice Hall of India, 2002.
2) Behrouz A. Forouzan, TCP/IP Protocol Suite, Third Edition, Tata McGraw-Hill, 2006.
3) Uyless Black, Computer Networks: Protocols, Standards and Interfaces, Second Edition, Prentice Hall of India, 2002.
