Computer Science Department: Can We Trust ICMP-based Measurements?
Technical Report
NWU-CS-04-48
November 8th, 2004
Abstract
ICMP-based measurements (e.g. ping) are often criticized as unrepresentative
of applications' experienced performance, since applications are typically
built on the TCP/UDP protocols and there is a well-accepted conjecture that
routers are often configured to treat ICMP differently from TCP and UDP.
However, to the best of our knowledge, this assumption has not been validated.
With this in mind, we conducted extensive Internet end-to-end path measurements
of these three protocols on PlanetLab over two weeks, spanning more than 90
sites (from both commercial and academic networks), over 6,000 paths and more
than 28 million probes.
Our results show that ICMP performance is a good estimator of TCP/UDP
performance for the majority of the paths. However, for nearly 0.5% of the
paths we found persistent RTT differences between UDP and ICMP greater than
50%, while for TCP the difference exceeds 10% for 0.27% of the paths. Thus,
although ICMP-based measurements can be trusted as predictors of TCP/UDP
performance, distributed systems and network researchers should be aware of
some scenarios where these measurements can be heavily misleading; this paper
also provides some hints that can help in identifying those situations.
Keywords: ICMP, TCP, UDP, routers, protocols, trace, end-to-end measurements, PlanetLab.
Can we trust ICMP-based measurements?
Stefan Birrer Fabián E. Bustamante
Yan Chen
Department of Computer Science
Northwestern University, Evanston IL 60201, USA,
{sbirrer,fabianb,ychen}@cs.northwestern.edu
1 Introduction
Measuring the behavior of network path characteristics is critical for the diagnosis,
optimization and development of distributed services. Useful tools of this sort find
application in a variety of contexts, from server selection [3] to the weighting of al-
ternative paths in overlay networks [4]. Unfortunately, performance measurement
was not a design goal when the Internet was originally architectured [6] and thus
there is limited support available to the system designer.
Over the last few years, renewed interest in measurement techniques [16, 18,
17, 8] has pushed functionality beyond the useful, but rather limited, set
offered by tools such as ping and traceroute [9]. Today's growing toolset
includes ICMP-, TCP- and UDP-based instruments such as pathchar [10],
sting [17], iperf [19] and pathload [16], as well as ping and traceroute [9].
There is, however, a potential dissonance between the application's experience
and the view portrayed by the measurement tool. In particular, ICMP-based
measurements have often been criticized as unrepresentative of application
performance, as applications often employ TCP or UDP as their transport
protocol and there is a well-accepted conjecture that routers are often
configured to treat packets from these different protocols differently [17, 7].
While a quick look at the documentation of some of the most popular routers1
reveals that routers do indeed support protocol-based Quality of Service (QoS)
policies [5, 15, 11, 14], our research explores how often network
administrators make use of this functionality.
In this paper we investigate the dependence of network characteristics on the
higher-level protocol (ICMP, UDP, TCP). This involves identifying anomalies in
the measurements with regard to fairness. We find that for most of the paths,
ICMP performance is a good estimator of UDP and TCP round trip time. However,
the average loss rate for ICMP is higher than for UDP and TCP. The hypothesis
that UDP traffic has a persistent round trip time penalty of more than 50%
holds for 0.45% of all measured paths. We also found 1.76% of the paths to have
persistent loss anomalies of more than one packet loss per 100 packets.
2 Related Work
Several comparative studies [16, 18, 13] have evaluated existing measurement
tools [9, 10, 17, 19], but no work has addressed the effect of the layer-4
protocol on the measured network characteristics.
RON [1] monitors end-to-end paths connecting dedicated routers at the entry
points of private networks, and it uses these measurements for reactive routing
on an overlay. That work presents a relevant, detailed evaluation of loss
probability. Goyal et al. [7] argue that ICMP-based probes may not be a good
estimator of TCP latency and loss rate, since the two protocols do not sample
network queues in the same way. Our work is complementary, in that we focus on
network-path
1 From brands such as Cisco, Nortel Networks, Juniper Networks and Netgear.
[Figure 1: Path probing. The source sends probes, spaced 100 ms apart, through
the router(s) along the path to the destination.]
behavior as experienced by different protocols. Zhang and Duffield [20] look at
the constancy over time of Internet path properties and report a loss rate of
0.6-0.9% for TCP (consistent with our findings). Our work, on the other hand,
focuses on exploring the constancy of anomalies across protocols.
We borrow the concepts of Global Research and Education Network from
Banerjee et al. [2], where the authors look at the interdomain connectivity of Plan-
etLab nodes. We plan to validate our site classification with theirs (once this be-
comes available) as part of our future work.
3 Evaluation
3.1 Measurement Methodology
We deployed a ping client/server pair on PlanetLab, a wide-area test
environment. Our measurement client uses a raw IP socket and assembles its own
ICMP, UDP and TCP packets, without using any TCP features such as
retransmission. In essence, we compare the network behavior of IP packets that
differ in protocol type and payload (where each payload conforms to the
appropriate standard, e.g. a valid TCP header).
Figure 1 illustrates our method of path probing: the client sends 100 probes
to the server with 100 ms spacing between them. A probe consists of three
packets, one for each of the protocols studied (ICMP, UDP, TCP). These packets
are interleaved in random order with no spacing in between. Packets may get
lost anywhere on the path from source to destination; we account for these
losses at the destination. Loss rate is then computed as the fraction of
packets sent that are not received.
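The probing scheme above can be sketched as follows; the 100-probe count and
100 ms spacing come from the text, while the function names and schedule
representation are our own illustration, not the authors' implementation:

```python
import random

PROBES_PER_PATH = 100   # probes per path (from the text)
SPACING_S = 0.1         # 100 ms spacing between probes
PROTOCOLS = ["ICMP", "UDP", "TCP"]

def probe_schedule(seed=None):
    """Yield (send-time offset, packet order) per probe; the three packets
    of a probe are sent back-to-back, in random protocol order."""
    rng = random.Random(seed)
    for i in range(PROBES_PER_PATH):
        order = PROTOCOLS[:]
        rng.shuffle(order)          # random interleaving within a probe
        yield i * SPACING_S, order

def loss_rate(sent, received):
    """Fraction of sent packets that never arrived at the destination."""
    return (sent - received) / sent

# Example: 100 packets of one protocol sent, 97 received at the destination.
print(loss_rate(100, 97))  # 0.03
```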
Every packet sent by our client includes a timestamp used to compute an
estimate of the round trip time (RTT). The server replies immediately to every
received packet, including in its reply the client's timestamp. An estimate of
the RTT is then computed by the client as the difference between the current
time and the packet's original timestamp.
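The timestamp-echo RTT estimate can be sketched as below; this is our own
minimal illustration (using Python's monotonic clock), not the authors' code:

```python
import time

def make_payload(now=None):
    """Client side: timestamp placed in the outgoing packet."""
    return time.monotonic() if now is None else now

def rtt_from_echo(echoed_stamp, now=None):
    """Client side, on receiving the server's echo of its own stamp:
    RTT = receive time - original send timestamp."""
    recv_time = time.monotonic() if now is None else now
    return recv_time - echoed_stamp

# With explicit clock values: sent at t=1.000 s, echo received at t=1.042 s,
# giving an RTT of about 42 ms.
rtt = rtt_from_echo(1.000, now=1.042)
```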
To remove any possible bias of packet size on loss probability and queueing
delay, we ensure all probes are exactly 100 bytes long (plus IP header). Since
the protocols' headers are of different sizes, we pad each packet to a 100-byte
IP payload.
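As a rough sketch of the padding computation: the 100-byte IP-payload target is
from the text, the minimum layer-4 header sizes are the standard ones, and the
helper name is our own:

```python
# Minimum layer-4 header sizes in bytes (standard values).
HEADER_BYTES = {"ICMP": 8, "UDP": 8, "TCP": 20}
TARGET_IP_PAYLOAD = 100  # fixed probe size, from the text

def padding_needed(protocol, data_len):
    """Bytes of padding so header + data + padding == 100-byte IP payload."""
    pad = TARGET_IP_PAYLOAD - HEADER_BYTES[protocol] - data_len
    if pad < 0:
        raise ValueError("payload too large for 100-byte target")
    return pad

# A TCP probe carrying 16 bytes of data needs 64 bytes of padding.
print(padding_needed("TCP", 16))  # 64
```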
3.2 Outliers
Outliers are a general problem in real-world measurements. The unpredictable
nature of the test environment introduces measurements that lie beyond
reasonable boundaries. In a first step, we eliminate RTT outliers for each of
our RTT path measurements. We define outliers as round trip times that differ
by more than two standard deviations from the mean of all round trip times for
a given protocol. As a few outliers have a strong influence on the mean, we
first transform the RTTs into log-space before proceeding with outlier
elimination.
RTT  = {t0, t1, ..., tn}                          (1)
RTT' = {log t0, log t1, ..., log tn}              (2)
t'x  = log tx                                     (3)
mean' = mean(RTT')                                (4)
std'  = std(RTT')                                 (5)
outlier = {t'x | t'x > mean' + 2 · std'
                 ∨ t'x < mean' − 2 · std'}        (6)
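Equations (1)-(6) translate directly into code; the following is our own
transcription of the 2-sigma log-space filter:

```python
import math
import statistics

def eliminate_outliers(rtts):
    """Return the RTT samples that survive the 2-sigma log-space filter
    of Eqs. (1)-(6)."""
    logs = [math.log(t) for t in rtts]     # Eqs. (2)-(3): log-space RTTs
    mean = statistics.mean(logs)           # Eq. (4)
    std = statistics.pstdev(logs)          # Eq. (5)
    lo, hi = mean - 2 * std, mean + 2 * std
    # Eq. (6): keep samples within two standard deviations of the mean.
    return [t for t, lt in zip(rtts, logs) if lo <= lt <= hi]

# A single huge RTT (e.g. 5000 ms among ~40 ms samples) is removed:
samples = [40, 41, 39, 42, 40, 38, 41, 5000]
print(eliminate_outliers(samples))  # [40, 41, 39, 42, 40, 38, 41]
```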
3.3 Hypothesis
We employ traditional hypothesis testing techniques from statistics [12]. We
validate the well-accepted conjecture that packets from different protocols,
due in part to router configurations, experience different QoS. We use
hypothesis testing to estimate the number of paths for which we can conclude,
with reasonable confidence, that the hypothesis is true. We employ a 95%
confidence level for all tests.
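As one possible reading of such a test (the paper does not give the exact
statistic, so the one-sided z-test below is our own illustrative formulation),
a per-path check that the mean UDP-over-ICMP RTT penalty exceeds a threshold
at 95% confidence might look like:

```python
import statistics

def penalty_exceeds(penalties, threshold=0.5, confidence=0.95):
    """One-sided z-test. H0: mean relative penalty <= threshold.
    Returns True if H0 is rejected at the given confidence level."""
    n = len(penalties)
    mean = statistics.mean(penalties)
    se = statistics.stdev(penalties) / n ** 0.5   # standard error of the mean
    z = (mean - threshold) / se
    z_crit = statistics.NormalDist().inv_cdf(confidence)  # ~1.645 for 95%
    return z > z_crit

# A path whose UDP RTTs run ~60% above ICMP, with small per-probe noise,
# is flagged as exceeding the 50% penalty threshold:
penalties = [0.60, 0.58, 0.62, 0.61, 0.59, 0.60, 0.63, 0.57, 0.61, 0.60]
print(penalty_exceeds(penalties))  # True
```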
Table 1: Traces Summary

Time                              05/02/04 - 05/14/04
Data Sets                         23
Packets                           28,741,800
Unique Paths       Total          6,197
Path Measurements  Total          95,806
                   Good           69,534
                   Bad            22,713
Sites              Total          92
                   North America  64
                   International  28
                   Europe         22
                   GREN           71
                   Commercial     4
tool allows us to estimate the relative one-way delay of TCP and UDP compared
to ICMP.
4 Experimental Results
In this section we present our findings based on more than 28 million packets.
After outlier elimination we are left with over 20 million packets that we employ
for our analysis. Table 1 summarizes the traces we used for the analysis. Note that
the European sites are also part of the international sites. The Global Research and
Education Network (GREN) combines the academic sites in North America and
Europe.
4.1 Connectivity
We eliminated bad path probes by applying the outlier elimination technique
described above. However, the majority of bad path probes were caused by a
complete outage of one of the three protocols, i.e. only one of the protocols
has a loss rate of 100%. Table 2 summarizes the bad path measurements; it is
possible for a bad path to miss all probes for one or more of the protocols,
which we call an outage. Some of the bad paths are caused by infrastructure
problems, as PlanetLab nodes may crash or reboot during the probing interval of
about 2 hours. TCP outages dominate the bad paths, indicating the deployment of
firewalls in the PlanetLab testbed. The negative impact on TCP cannot simply be
explained by our measurement technique: even though we send duplicate SYN and
SYN-ACK packets, the first of these packets is generally expected to pass. We
also see that the number of UDP outages is larger than for ICMP, indicating
that UDP connectivity is slightly worse than that of ICMP.
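The outage classification described above can be sketched as a small helper
(illustrative only; flagging a protocol when it loses all of its probes follows
the text):

```python
def outages(received_counts):
    """Given per-protocol received-packet counts for one path measurement,
    return the protocols with a complete outage (no packet received)."""
    return [proto for proto, n in received_counts.items() if n == 0]

# A TCP-filtered path (e.g. behind a firewall) shows a TCP-only outage:
print(outages({"ICMP": 98, "UDP": 97, "TCP": 0}))  # ['TCP']
```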
Table 2: Bad Paths: The table summarizes the bad paths by outages per protocol.
A complete outage means that no packet of the given type is received at the
destination.

          Bad Paths   Fraction
ICMP      1,765       0.07
UDP       3,694       0.14
TCP       22,713      0.86
Total     26,272      1.00
Table 5: Domain Characterization of RTT Anomalies: The characterization of the
anomalies by domain: World (*), GREN (GREN) and Commercial (COM). The values
give the percentage of anomalous paths (UDP/TCP) with more than 10% penalty.

                        Destination
Source    *           GREN        COM
*         1.71/0.27   0.50/0.19   1.18/0.79
GREN      1.32/0.21   0.05/0.11   1.08/0.00
COM       2.32/1.16   1.05/0.52   8.33/16.7
the identified anomalies, suggesting that the commercial Internet may suffer from
a significant amount of anomalies.
Besides domain and geographical dependencies, the time of day influences the
characteristics of the network. We use three time intervals: night (12am-8am),
day (8am-5pm) and evening (5pm-12am). To give more weight to the time-of-day
pattern, we reduced the set of sites to North America. This analysis is based
on 2,259 paths for the night trace, 2,170 paths for the day trace and 1,956
paths for the evening trace. Table 6 shows the anomalies present during the
different time intervals. The number of persistent anomalies is substantially
reduced during night hours. During this time there is considerably less traffic
on the Internet, buffers along the routes are generally less full, and thus
probes are more likely to take the same time independent of their payload.
During the day, routers experiencing congestion may prioritize different IP
packet payloads differently, thus delaying some packet types more than others.
Table 8: Geographical Characterization of Loss Anomalies: The characterization
of the anomalies by geographical regions: World (*), North America (NA),
International (INTL) and Europe (EU). The values give the percentage of
anomalous paths (UDP/TCP) with less than -1.0% difference.

                        Destination
Source    *           NA          INTL        EU
*         1.76/1.74   2.13/2.10   1.07/1.02   1.04/0.98
NA        0.81/0.69   1.18/1.04   0.07/0.00   0.09/0.00
INTL      3.82/3.96   4.08/4.30   3.29/3.29   3.16/3.16
EU        4.70/4.76   5.02/5.11   4.04/4.04   3.92/3.92
Table 10: Temporal Characterization of Loss Anomalies: The characterization of
the anomalies by time (CDT/GMT-5h). The values give the percentage of anomalous
paths with an absolute decrease in loss rate of 1.0%, that is, one fewer loss
per hundred packets.

Protocol     night    day      evening
UDP          0.53     0.41     0.20
TCP          0.58     0.65     0.40
4.4 Validation
We used a simple TTL-based traceroute client2 to validate some of the
identified anomalies. We successfully validated that either the router
cs.radio-msu.net or the router NPI-4700-F0-2.radio-msu.net must employ
different QoS for UDP packets, by tracing the path from northwestern.edu and
fh-aargau.ch3 to planetlab2.cs.msu.su. These two routers represent the first
two hops of the site with the most anomalies (msu.su). Since we only probe the
incoming path to the site, we cannot conclude whether it is the outgoing
interface of NPI-4700-F0-2.radio-msu.net or the incoming interface of
cs.radio-msu.net that causes the extra delays of UDP packets.
After ruling out msu.su from the traces, we are still left with about 10 path
anomalies. Validating these remains part of our future work.
5 Conclusions
ICMP-based measurements are used to estimate TCP and UDP performance. However,
there is the possibility of a dissonance between the application's performance
and the view portrayed by the measurement tool. This paper addressed the
question of whether ICMP-based measurements can be trusted. Our
measurement-based analysis indicates that for the majority of the paths, ICMP
is a good performance indicator for UDP and TCP RTT. However, over 1.7% of the
paths experience UDP RTT penalties larger than 10%, while 0.27% of the paths
suffer similar penalties for TCP. Further, ICMP has a much higher loss rate
than UDP and TCP. The results also indicate significant geographical and
temporal differences.
Our results seem to argue in favor of ICMP-based measurements as predictors of
TCP and UDP performance, provided protocol differences are taken into
consideration. However, this estimation has some inherent limitations, as the
network can
2 Due to technical limitations, we can only trace from nodes outside of PlanetLab.
3 FH Aargau is a university in Switzerland (Central Europe).
and sometimes does treat the three protocols differently.
Acknowledgments
We would like to thank Hans-Peter Oser, who kindly loaned us his equipment for
some of our validation experiments. We are also grateful to Tamara Teslovich
and Kate Solinger for their assistance in evaluating the router documentation
and their help in implementing our validation tool. Further, we would like to
thank Yi Qiao and Ananth Sundararaj for their helpful comments on early drafts
of this paper.
References
[1] Andersen, D., Balakrishnan, H., Kaashoek, F., and Morris, R. Resilient
overlay networks. In Proc. of the 18th ACM SOSP (October 2001).
[2] Banerjee, S., Griffin, T. G., and Pias, M. The interdomain connectivity of
PlanetLab nodes. In Proc. of PAM (April 2004).
[4] Chu, Y.-H., Rao, S. G., and Zhang, H. A case for end system multicast. In
Proc. of ACM SIGMETRICS (June 2000).
[5] Cisco Systems. (TCP) and (UDP) ports used by Cisco Unity version 4.0.
www.cisco.com, March 2004. White Paper.
[7] Goyal, M., Guerin, R., and Rajan, R. Predicting TCP throughput from
non-invasive network sampling. In Proc. of IEEE INFOCOM (June 2002).
[9] Jacobson, V. Traceroute: A tool to trace the path of packets in the
Internet. ftp://ftp.ee.lbl.gov/traceroute.tar.gz, 1989.
[10] Jacobson, V. Pathchar: A tool to infer characteristics of Internet paths.
ftp://ftp.ee.lbl.gov/pathchar, 1997. A tool to analyze bandwidth, delay and
loss rate of every hop between two end hosts.
[16] Prasad, R. S., Dovrolis, C., Murray, M., and Claffy, K. C. Bandwidth
estimation: Metrics, measurement techniques, and tools. IEEE Network 17, 6
(2003).
[20] Zhang, Y., and Duffield, N. On the constancy of Internet path properties.
In ACM SIGCOMM Internet Measurement Workshop (August 2001).