USENIX Association
Proceedings of the First Symposium on Networked Systems Design and Implementation
San Francisco, CA, USA, March 29–31, 2004

© 2004 by The USENIX Association. All Rights Reserved. For more information about the USENIX Association: Phone: 1 510 528 8649, FAX: 1 510 548 5738, Email: office@usenix.org, WWW: http://www.usenix.org. Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.

Listen and Whisper: Security Mechanisms for BGP

Lakshminarayanan Subramanian, Volker Roth, Ion Stoica, Scott Shenker, Randy H. Katz

University of California, Berkeley: {lakme,istoica,randy}@cs.berkeley.edu
Fraunhofer Institute, Germany: vroth@igd.fhg.de
ICSI, Berkeley: shenker@icsi.berkeley.edu

Abstract

BGP, the current inter-domain routing protocol, assumes that the routing information propagated by authenticated routers is correct. This assumption renders the current infrastructure vulnerable to both accidental misconfigurations and deliberate attacks. To reduce this vulnerability, we present a combination of two mechanisms: Listen and Whisper. Listen passively probes the data plane and checks whether the underlying routes to different destinations work. Whisper uses cryptographic functions along with routing redundancy to detect bogus route advertisements in the control plane. These mechanisms are easily deployable, and do not rely on either a public key infrastructure or a central authority like ICANN. The combination of Listen and Whisper eliminates a large number of problems due to router misconfigurations, and restricts (though does not eliminate) the damage that deliberate attackers can cause. Moreover, these mechanisms can detect and contain isolated adversaries that propagate even a few invalid route announcements.
Colluding adversaries pose a more stringent challenge, and we propose simple changes to the BGP policy mechanism to limit the damage colluding adversaries can cause. We demonstrate the utility of Listen and Whisper through real-world deployment, measurements and empirical analysis. For example, a randomly placed isolated adversary can, in the worst case, affect reachability to only 1% of the nodes.

1 Introduction

The Internet is a collection of autonomous systems (AS's), numbering more than 14,000 in a recent count. The interdomain routing protocol, BGP, knits these autonomous systems together into a coherent whole. Therefore, BGP's resilience against attack is essential for the security of the Internet.

BGP currently enables peers to transmit route announcements over authenticated channels, so adversaries cannot impersonate the legitimate sender of a route announcement. This approach, which verifies who is speaking but not what they say, leaves the current infrastructure extremely vulnerable to both unintentional misconfigurations and deliberate attacks. For example, in 1997 a simple misconfiguration in a customer router caused it to advertise a short path to a large number of network prefixes, and this resulted in a massive black hole that disconnected significant portions of the Internet [14].

To eliminate this vulnerability, several sophisticated BGP security measures have been proposed, most notably S-BGP [24]. However, these approaches typically require an extensive cryptographic key distribution infrastructure and/or a trusted central database (e.g., ICANN [3]). Neither of these two crucial ingredients is currently available, and so these security proposals have not moved forward towards adoption.¹

In this paper we abandon the goal of "perfect security" and instead seek "significantly improved security" through more easily deployable mechanisms.
To this end we propose two measures, Listen and Whisper, that require neither a public key distribution infrastructure nor a trusted centralized database. We first describe the threat model we address and then summarize the extent to which these mechanisms can defend against those threats.

¹ There is much debate about whether their failure is due to the lack of a PKI and trusted database, or onerous processing overheads, or other reasons. However, the fact remains that neither of these infrastructures is available, and any design that requires them faces a much higher deployment barrier.

1.1 Threat Model

The primary underlying vulnerability in BGP that we address in this paper is the ability of an AS to create invalid routes. There are two types of invalid routes:

Invalid routes in the control plane: This occurs when an AS propagates an advertisement with a fake AS path (i.e., one that does not exist in the Internet topology), causing other AS's to choose this route over genuine routes. A single malicious adversary can divert traffic to pass through it and then cause havoc by, for example, dropping packets (rendering destinations unreachable), eavesdropping (violating privacy), or impersonating end-hosts within the destination network (such as Web servers).

Invalid routes in the data plane: This occurs when a router forwards packets in a manner inconsistent with the routing advertisements it has received or propagated; in short, the routing path in the data plane does not match the corresponding routing path advertised in the control plane. Mao et al. [26] show that for a sizable fraction of Internet paths, the control plane and data plane paths do not match.

Two primary sources of invalid routes are misconfigurations and deliberate attacks. While these are the only sources of invalid routes in the control plane, data plane invalidity can additionally occur for genuine reasons (e.g., intra/inter-domain routing dynamics [26]).
The fact that a sizable fraction of Internet routes are invalid in the data plane motivates the need for separately verifying the correctness of routes in the data plane and not merely focusing on the control plane. Prior works on securing BGP focus primarily on the control plane.

Misconfigurations occur in several forms ranging from buggy configuration scripts to human errors. In the control plane, Mahajan et al. [25] infer that misconfigurations produce invalid route announcements to roughly 200–1200 prefixes every day (roughly 0.2–1% of the prefix entries in a typical routing table). Stale routes (not propagating new announcements) and forwarding errors at a router (e.g., lack of a forwarding entry) are two other data plane misconfigurations causing invalid routes.

While AS's might act in malicious ways on their own, the biggest worry about deliberate attacks comes from adversaries who break into routers. Routers are surprisingly vulnerable; some have default passwords [10, 33], others use standard interfaces like telnet and SSH, and so routers share all their known vulnerabilities. For our purposes in this paper, the only difference between a misconfiguration and an attack is that attackers can take active countermeasures (by, for instance, spoofing responses to various probes) while misconfigured routers do not.

Deliberate attacks can involve an isolated adversary (i.e., a single compromised router) or colluding adversaries (i.e., a set of compromised routers). Colluding adversaries have the additional ability to tunnel route advertisements and fake additional links in the topology.

The spectrum of problems we address in this paper can be described, in order of increasing difficulty, as misconfigurations, isolated adversaries and colluding adversaries. We now describe the extent to which Listen and Whisper provide protection against these threats.

1.2 Level of Protection

Listen detects invalid routes in the data plane by checking whether data sent along routes reaches the intended destination. Whisper checks for consistency in the control plane. While both these techniques can be used in isolation, they are more useful when applied in conjunction. The extent to which they provide protection against the three threat scenarios can be summarized as follows:

Misconfigurations and Isolated Adversaries: Whisper guarantees path integrity for route advertisements in the presence of misconfigurations or isolated adversaries; i.e., any invalid route advertisement due to a misconfiguration or isolated adversary with either a fake AS path or with any of the fields of the AS path being tampered with (e.g., addition, modification or deletion of AS's) will be detected. Path integrity also implies that an isolated adversary cannot exploit BGP policies to create favorable invalid routes. In addition, Whisper can identify the offending router if it is propagating a significant number of invalid routes. Listen detects reachability problems caused by errors in the data plane, but is only applicable for destination prefixes that observe TCP traffic. However, none of our solutions can prevent malicious nodes already on the path to a particular destination from eavesdropping, impersonating, or dropping packets. In particular, countermeasures (from isolated adversaries already along the path) can defeat Listen's attempts to detect problems on the data path.

Colluding Adversaries: Two colluding nodes can always pretend the existence of a direct link between them by tunneling packets/advertisements. In the absence of complete knowledge of the Internet topology, these fake links cannot be detected even using heavy-weight security solutions like Secure BGP [23]. While these fake links enable colluding adversaries to propagate invalid routes without being detected, we show that if BGP employs shortest-path routing then a large fraction of the paths with fake links can be avoided. In contrast, colluding adversaries can exploit the current application of BGP policies to mount a large scale attack. To deal with this problem and yet support policy-based routing, we suggest simple modifications to the BGP policy engine which in combination with Whisper can largely restrict the damage that colluding adversaries can cause.

The rest of the paper is organized as follows. In Section 2, we discuss related work. In Sections 3 and 4, we describe the Whisper and the Listen protocols. In Section 5, we present our implementation of Listen and Whisper. In Section 6, we evaluate several aspects of Listen and Whisper using real-world deployment and security analysis. In Section 7, we discuss the case of colluding adversaries and finally present our conclusions in Section 9.

2 Related Work

In this section, we present related work and motivate our work in comparison to previous approaches to this problem. We classify related work based on the threat model.

2.1 Misconfigurations

Traditional approaches to detecting misconfigurations involve correlating route advertisements in the control plane from several vantage points [25, 34]. While these works identify two forms of misconfigurations (origin and export misconfigurations), a fundamental limitation of analyzing BGP streams is the lack of knowledge of the Internet topology. Since the topology is not known, these techniques can pinpoint invalid routes only when the destination AS is wrongly specified but not when the path is modified. Mao et al. [26] build an AS-traceroute tool to detect the AS path in the data plane which can be used for data-plane verification. While this tool can detect several forms of invalid routes in the data plane, it is useful for diagnostic purposes only once a problem is detected. Padmanabhan et al. [29] propose a secure variant of traceroute to test the correctness of a route.
However, this mechanism requires a prior distribution of cryptographic keys to the participating AS's to ascertain the integrity and authenticity of traceroute packets. In the context of feedback based routing, Zhu et al. [35] proposed a data plane technique based on passive and active probing. The passive probing aspect of this work shares some similarities with our Listen method.

2.2 Dealing with Adversaries

Techniques dealing with adversaries can be classified as key-distribution based or non-PKI based.

Key-distribution based: One class of mechanisms builds on cryptographic enhancements of the BGP protocol, for instance the security mechanisms proposed by Smith et al. [31], Murphy et al. [27], Kent et al. [24], and recent work on Secure Origin BGP [28]. All these protocols make extensive use of digital signatures and public key certification. More lightweight approaches based on cryptographic hash functions have been proposed, e.g., by Hu et al. [20, 22] in the context of secure routing in ad hoc networks. However, these mechanisms require prior secure distribution of hash chain elements.

Why not use a PKI-based infrastructure? Public key infrastructures impose a heavy technological and management burden, and have received a fair share of criticism, e.g., by Davis [16] and by Ellison and Schneier [17]. The PKI model has been criticized on technical grounds, on grounds of a lack of trust and privacy, as well as on principle [16, 17, 15]. Building an Internet-wide PKI incurs huge costs and has a high risk of failure. Secure-BGP, despite the push by a tier-1 ISP, has been deployed only by a very small number of ISPs after 5 years (though an IETF working group on Secure-BGP exists).

Non-PKI approaches: Non-PKI based solutions offer far less security in the face of deliberate attacks.
Some of these mechanisms assume the existence of databases with up-to-date authoritative route information against which routers verify the route announcements that they receive. The Internet Routing Registry [4] and the Inter-domain Route Validation Service proposed by Goodell et al. [19] belong to this category. Here, the problem is to ascertain the authenticity, completeness, and availability of the information in such a database. First, ISPs only reluctantly submit routing information because this may disclose local policies that the ISPs regard as confidential. Second, the origin authentication of the database contents again demands a public key infrastructure [28]. Third, access to such databases relies on the very infrastructure that it is meant to protect, which is hardly an ideal situation.

Figure 1: Comparison of the security approach of Whisper protocols with Secure BGP. Case (i): Secure-BGP model. Case (ii): Whisper protocol model.

3 Whisper: Control Plane Verification

In this section, we describe the Whisper protocol, a control plane verification technique that proposes minor modifications to BGP to aid in detecting invalid routes from misconfigured or malicious routers. We restrict our discussion here to the case where an isolated adversary or a single misconfigured router propagates invalid routes. We discuss colluding adversaries in Section 7. The Whisper protocol provides the following properties in the presence of isolated adversaries:

1. Any misconfigured or malicious router propagating an invalid route will always trigger an alarm.
2. A single malicious router advertising more than a few invalid routes will be detected and the effects of these spurious routes will be contained.
3.1 Triggering Alarms vs Identification

The main distinction between our approach and a PKI-based approach is the concept of triggering alarms as opposed to identifying the source of problems. In Secure-BGP, a router can verify the correctness of a single route advertisement by contacting a PKI and a central authority to test the validity of the signatures embedded in the advertisement. For example, in Figure 1 (Case (i)), each AS $A$ appends to an advertisement a signature $S_A$ generated using its private key. Another AS can use a PKI to check whether $S_A$ is the correct signature of $A$. In this case, any misconfigured/malicious AS propagating an invalid route will not be able to append the correct signatures of other AS's and can be identified.

Without either of these two infrastructural pieces, a router cannot verify a single route advertisement in isolation. The Whisper model is to consider two different route advertisements to the same destination and check whether they are consistent with each other. For example, in Figure 1 (Case (ii)), each route advertisement is associated with a signature of an AS path. AS $D$ receives two advertisements to destination $A$ and can compare the signatures $h_{ABC}$ and $h_{AXY}$ to check whether the routes $(C, B, A)$ and $(Y, X, A)$ are consistent. When two routes are detected as inconsistent, the Whisper protocol can determine that at least one of the routes is invalid. However, it cannot clearly pinpoint the source of the invalid route. Upon detecting inconsistencies, the Whisper protocol can trigger alarms notifying operators about the existence of a problem. This method is based on the composition of well-known principles of weak authentication as discussed by Arkko and Nikander [11].
Figure 2: Different outcomes for a route consistency test (True, False, and True or False), drawn with uncompromised nodes, compromised nodes and imaginary paths. In all these scenarios, the verifying node is $V$. The verifying node checks whether the two routes it receives to destination $P$ are consistent with each other.

Whisper does not require the underlying Internet topology to have multiple disjoint paths to every destination AS. As long as an adversary propagating an invalid route is not on every path to the destination, Whisper will have two routes to check for consistency: (a) the genuine route to the destination; (b) the invalid path through the adversary.

3.2 Route Consistency Testing

A route consistency test takes two different route advertisements to the same destination as input and outputs true if the routes are consistent and outputs false otherwise. Consistency is abstractly defined as follows:

1. If both route announcements are valid then the output is true.
2. If one route announcement is valid and the other one is invalid then the output is false.
3. If both route announcements are invalid then the output is true or false.

The key output from a route consistency test is false. This output unambiguously signals that at least one of the two route announcements is invalid. In this case, our protocols can raise an alarm and flag both the suspicious routes as potential candidates for invalid routes. If the consistency test outputs true, both the routes could either be valid or invalid. Figure 2 depicts the outcomes of a route consistency test for various examples of network configurations.
We now describe two Whisper consistency tests, namely Weak Split Whisper and Strong Split Whisper (SSW), of increasing complexity and offering different security guarantees. We primarily use Weak Split, a simple hash chain based construction, to motivate the construction of SSW. SSW offers path integrity in the presence of misconfigurations or isolated adversaries and all the results in the paper are based on SSW.

Conceptually, both these constructions introduce a signature field in every BGP UPDATE message which is used for performing the route consistency test. There are three basic operations that are allowed on the signature field:

1. Generate-Signature: The origin AS (the originator of a route announcement) of a destination prefix generates a signature, initializes this field in the BGP UPDATE message and forwards it to its neighbor. The origin AS uses different initial signatures for every prefix it owns.
2. Update-Signature: Every intermediary AS that is not the origin of a destination prefix is required to update the signature field using a cryptographic hash function. This operation is only performed by one router in every AS (typically at the entry point of an AS).
3. Verify-Signature: Any intermediary router that receives two different routes (with different AS paths) can check whether the signatures in the two different routes are consistent with each other.

The path integrity property requires the Whisper protocol to satisfy two properties: (a) a malicious adversary should not be able to reverse engineer the signature field of an AS path; (b) any modification to the AS path or signature field in an advertisement should be detected as an inconsistency when tested with a valid route to the same destination.

3.2.1 Weak Split Whisper

Figure 3 illustrates the weak-split construction using a simple example topology. Weak-Split Whisper is motivated by the hash-chain construction used by Hu et al. [21, 20] in the context of ad-hoc networks.
The key idea is as follows: The origin AS generates a secret $x$ and propagates $h(x)$ to its neighbors, where $h()$ is a globally known one-way hash function. Every intermediary AS in the path repeatedly hashes the signature field. An AS that receives two routes $r$ and $s$ of AS hop lengths $k$ and $l$ with signatures $y_r$ and $y_s$ can check for consistency by testing whether $h^{k-l}(y_s) = y_r$.

Figure 3: Weak-Split construction using a globally known hash function $h()$.

The security property that weak-split Whisper guarantees is: An independent adversary that is $N$ AS hops away from an origin AS can propagate invalid routes of a minimum length of $N$ without being detected as inconsistent. An AS that is $N$ hops away from the origin knows the value $h^N(x)$ but cannot compute $h^k(x)$ for any $k < N$ since $h()$ is a one-way hash function. Such an AS is also not supposed to reveal its hash value to other nodes (unless the AS colludes with other AS's). However, the adversary can forward any fake path of length $N$. Hence, weak-split Whisper does not provide strong forms of security guarantees. In particular, it cannot ensure path integrity, i.e., a malicious AS could modify the AS numbers of a path without affecting the AS path length.

3.2.2 Strong Split Whisper

The strong split Whisper protocol uses a more sophisticated cryptographic check and can provide path integrity in the presence of independent adversaries, i.e., if an adversary removes or changes any entry in the AS path, strong split Whisper will always detect an inconsistency. Figure 4 shows a construction of the basic SSW using the RSA mechanism. We use a minor modification of the illustrated example. We elaborate the three basic operations for this protocol:

Figure 4: Basic Strong-Split construction using exponentiation under modulo $N$ where $N = p \cdot q$, a product of two large primes.

generate-signature: The origin AS computes three basic parameters: $N$, $g$, $z$. $N$ is chosen as $p \times q$ where $p$ and $q$ are two large primes of the form $2p' + 1$ and $2q' + 1$ where $p'$ and $q'$ are also prime. It then computes a generator $g$ in the prime groups $Z_p$ and $Z_q$. Finally, it chooses a random number $z$ and computes $g^z \bmod N$. The signature generated is a tuple $(N, g^z \bmod N)$. While the origin AS publicly announces $N$, only it knows the prime factors of $N$. Similar to RSA, we rely on the fact that an adversary cannot factor $N$ to determine its prime factors.

update-signature: Every AS is associated with a unique AS number which is specified in the path. Let AS $A$ receive an advertisement from a neighboring AS with a signature $(N, y)$ where $y$ is of the form $g^x \bmod N$ for some value of $x$. AS $A$ updates this signature to $(N, y^A \bmod N)$. In other words, the AS exponentiates using its AS number. In Figure 4, the route announcement contains an AS path $P, A, B, C$; the corresponding signature of the route is $(N, g^{z \cdot P \cdot A \cdot B \cdot C} \bmod N)$.

verify-signature: We describe verify-signature using the example in Figure 4. The verifier, $V$, receives two signatures $(N, s_1)$ and $(N, s_2)$ where $s_1 = g^{z \cdot P \cdot A \cdot B \cdot C} \bmod N$ and $s_2 = g^{z \cdot P \cdot X \cdot Y} \bmod N$. Given these values and the corresponding AS paths, the verifier outputs the routes to be consistent if:

$$ s_1^{X \cdot Y} \equiv s_2^{A \cdot B \cdot C} \pmod{N} $$

SSW is similar to the MuHASH construction proposed by Bellare et al. [12] for incrementally hashing signatures. A formal proof of the security guarantees offered by MuHASH is also applicable in our context to show that SSW offers path integrity. The key observation with our construction is: given $N$ and given $g^x \bmod N$, an adversary cannot compute $x^{-1} \bmod \phi(N)$ (where $\phi()$ is the Euler function on natural numbers; given $N = p \times q$, $\phi(N) = (p - 1) \times (q - 1)$) and hence cannot remove the signature of previous nodes in the AS path.

This construction has three problems: (a) an adversary can permute entries in a path due to the commutative property of multiplication, i.e., $A \cdot B = B \cdot A$; (b) the factoring property, i.e., $a = b \times c$ implies that an AS path $(b, c)$ can be replaced by $(a)$; (c) more importantly, an adversary can add AS's to the AS path without being detected.

Preventing commutativity and factoring: To prevent commutativity and factoring problems, we define a pseudo-AS number for every AS which depends on the position of the AS in a given AS path. If an AS $A$ appears in position $p$ in the AS path, the function $f(A, p) = 2^{16} \times p + A$ will produce unique values for all AS's in different positions in an AS path (since 16 bits are sufficient to express AS numbers). To avoid the problem of commutativity, an AS $A$ updates a signature using $f(A, p)$ instead of using its AS number $A$.

To avoid the factoring problem, we use prime numbers. Given a number $y$, one can determine $q(y)$ as the $y$-th lowest prime number. Prime numbers are not factorable and these numbers can be precomputed. Hence, given an AS $A$ appearing at position $p$, we use the exponent $A' = q(f(A, p))$ to avoid both commutativity and factoring problems. We refer to $A'$ as the pseudo-AS number of AS $A$ when it appears in position $p$. The pseudo-AS numbers for a given AS are computable by other routers as well. Hence, we only use pseudo-AS numbers for computing the signature but do not change AS numbers in the AS path.
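As a minimal sketch of the basic construction in Figure 4 (before pseudo-AS numbers and link identifiers are applied), the Python below mirrors the generate, update and verify operations. The primes, base, secret and AS numbers are toy values assumed purely for illustration, and the function names are hypothetical rather than taken from the authors' implementation.

```python
from math import prod
from typing import Sequence

# Toy parameters, assumed for illustration only (far too small for real use).
PRIME_P, PRIME_Q = 1019, 1187   # safe primes: 2*509 + 1 and 2*593 + 1
N = PRIME_P * PRIME_Q           # public modulus; only the origin knows its factors
G = 4                           # public base
Z = 2024                        # origin's secret exponent


def generate_signature(origin_as: int) -> int:
    """Origin AS: g^(z * origin) mod N, as drawn in Figure 4."""
    return pow(G, Z * origin_as, N)


def update_signature(sig: int, as_number: int) -> int:
    """Intermediary AS: raise the received signature to its own AS number."""
    return pow(sig, as_number, N)


def advertise(origin_as: int, hops: Sequence[int]) -> int:
    """Signature that reaches the verifier after every hop has updated it."""
    sig = generate_signature(origin_as)
    for as_number in hops:
        sig = update_signature(sig, as_number)
    return sig


def consistent(sig1: int, hops1: Sequence[int],
               sig2: int, hops2: Sequence[int]) -> bool:
    """Verifier: s1^(product of hops2) == s2^(product of hops1) (mod N)."""
    return pow(sig1, prod(hops2), N) == pow(sig2, prod(hops1), N)


ORIGIN = 7                      # stands in for AS P
ROUTE_ABC = [11, 13, 17]        # hops A, B, C
ROUTE_XY = [19, 23]             # hops X, Y
s1 = advertise(ORIGIN, ROUTE_ABC)
s2 = advertise(ORIGIN, ROUTE_XY)

honest = consistent(s1, ROUTE_ABC, s2, ROUTE_XY)
# Y drops X from the advertised path but cannot strip X from the signature.
dropped = consistent(s1, ROUTE_ABC, s2, [23])
# Reordering A and B leaves the product unchanged, so it goes unnoticed.
permuted = consistent(s1, [13, 11, 17], s2, ROUTE_XY)
```

With these toy values the honest pair of routes passes, a path from which an AS has been dropped fails, and a path whose entries are merely reordered still passes. The last case is the commutativity weakness that position-dependent pseudo-AS numbers are meant to close.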
Preventing addition of new AS numbers: The key to preventing an adversary from adding AS numbers is to associate a link identifier to represent an AS link between two AS's. If AS $A$ forwards a route to AS $B$, let $\mathrm{link}(A, B)$ be a uniquely computable identifier which is a function of the AS numbers $A$ and $B$. An AS $A$ that received an advertisement $(N, y)$ should propagate the advertisement with the signature:

$$ (N,\ y^{A' \cdot \mathrm{link}(A, B)} \bmod N) $$

where $A'$ is the pseudo-AS number of $A$. Since the identifier $\mathrm{link}(A, B)$ is added to the signature by $A$, $B$ cannot remove this portion from the signature. This implies $B$ cannot convert an AS path $(B, A)$ to $(B, C, A)$. However, if $B$ adds an AS at the end of a path (e.g., $(C, B, A)$), then the neighbor receiving the advertisement will notice that the neighbor it received the announcement from (i.e., $B$) does not match the first AS in the path (i.e., $C$). Hence it will not accept the announcement. One simple way to define a link identifier is:

$$ \mathrm{link}(A, B) = 2^{32} \times A' + B' $$

where $A'$ and $B'$ are the pseudo-AS numbers of $A$ and $B$. $\mathrm{link}(A, B)$ will be unique for all AS pairs $A, B$. Note that pseudo-AS numbers are always less than $2^{32}$ since $f(A, p) < 2^{21}$ for all AS paths less than 32 hops in length.

Generalized SSW construction: In this section, we only described the SSW construction using the basic RSA group structure. Alternatively, one can build SSW using elliptic curve cryptography (ECC) [13]. The main distinction between RSA and ECC is the number of bits necessary for the signature field: ECC needs a considerably shorter signature than RSA to provide the same level of security.

3.3 Containment: Penalty Based Route Selection

Route consistency testing only provides the ability to trigger alarms whenever a node propagates invalid route announcements.
We augment consistency testing with penalty based route selection, a simple containment strategy that attempts to identify suspicious candidates and avoid routes propagated by them. The strategy works as follows: A router counts across destinations how often an AS appears on an invalid route, and assigns this count as a penalty value for the AS. The more destinations an adversary affects, the higher its penalty becomes and the more clearly it stands out from the rest. The route selection strategy is to choose the route to a destination with the lowest penalty value.

Figure 5: Detecting suspicious AS's. In this example, $M$ is a malicious AS that propagates 3 invalid routes to 3 different destinations $A$, $B$, $C$. The AS paths in the routes propagated are indicated along the links. The verifier $V$ assigns penalty values of 3, 1, 1, 1 to $M$, $A$, $B$, $C$ respectively.

Consider the topology in Figure 5, where $M$ is a malicious node that propagates 3 invalid route announcements with AS paths $MA$, $MB$, $MC$. By choosing the minimum penalty route, the verifier $V$ can avoid the invalid routes through $M$ since they have a higher penalty value.

One key assumption used in this technique is: The identity of an AS propagating invalid routes is always present in the AS path attribute of the routes. The identity of every AS is verified by the neighboring AS which receives the advertisement. For example, Zebra's BGP implementation [2] explicitly checks this constraint for every announcement it receives. BGP should use shared keys across peering links to avoid man-in-the-middle attacks.

Penalties should primarily be viewed as a reasonable first response to detect suspicious candidates and not as a foolproof mechanism. In the presence of an isolated adversary, penalty based filtering can ensure that the effects of the adversary are contained.
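The penalty bookkeeping can be sketched as follows, assuming the router already holds the list of routes it considers invalid. The text does not spell out how per-AS penalties combine into a route penalty; taking the maximum over the path is an assumption made here, as are all function names.

```python
from collections import Counter


def as_penalties(invalid_routes):
    """Count, per AS, the number of invalid routes on which it appears."""
    penalties = Counter()
    for path in invalid_routes:
        for asn in set(path):          # count an AS once per invalid route
            penalties[asn] += 1
    return penalties


def route_penalty(path, penalties):
    """Assumed rule: a route is as suspicious as its most penalized AS."""
    return max(penalties[asn] for asn in path)


def select_route(candidates, penalties):
    """Pick the candidate route with the lowest penalty value."""
    return min(candidates, key=lambda path: route_penalty(path, penalties))


# Figure 5 scenario: M advertises invalid paths MA, MB, MC.
invalid = [("M", "A"), ("M", "B"), ("M", "C")]
penalties = as_penalties(invalid)

# Two candidate routes to destination A: via M (invalid) and via P (genuine).
best = select_route([("M", "A"), ("P", "A")], penalties)
```

In this scenario `M` accumulates a penalty of 3 while each destination gets 1, so the route via `P` is preferred over the one via `M`.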
We believe that penalties are a good mechanism to detect malicious adversaries in customer AS's, but they should be applied with caution when they involve AS's in the Internet core. In particular, penalties are not a good security measure in the presence of colluding adversaries or when the number of independent adversaries is large. For example, multiple adversaries can artificially raise the penalty of an innocent AS by including its AS number in their invalid routes.

4 Listen: Data Plane Verification

In this section, we present the Listen protocol, a data plane verification technique that detects reachability problems in the data plane. Reachability problems can occur for a variety of reasons, ranging from routing problems to misconfigurations to link failures. Listen primarily signals the existence of such problems, as opposed to identifying the source or type of a problem.

Data plane verification mechanisms are necessary in two contexts: (a) connectivity problems due to stale routes or forwarding problems, which are detectable only by data plane solutions like Listen; (b) blackhole attacks by malicious adversaries already present along a path to a destination. However, proactive malicious nodes can defeat any data plane solution by impersonating the behavior of genuine end-hosts.

The attractive features of Listen are: (a) it is passive; (b) it is a standalone solution that can be incrementally deployed without any modifications to BGP; (c) it quickly detects reachability problems for popular prefixes; (d) it has low overhead.

4.1 Listening to TCP flows

The general idea of Listen is to monitor TCP flows and to draw conclusions about the state of a route from this information. The forward and reverse routing paths between two end-hosts can be different, so we may observe packets that flow in only one direction. We say that a TCP flow is complete if we observe a SYN packet followed by a DATA packet, and we say that it is incomplete if we observe only a SYN packet and no DATA packet over a period of 2 minutes (which is longer than the SYN timeout period).

Consider a router that receives a route announcement for a prefix P and wishes to verify whether P is reachable via the advertised route. In the simplest case, the router concludes that P is reachable if it observes at least one complete TCP flow. On the other hand, the router cannot blindly conclude that a route is unreachable if it does not observe any complete connection. Incomplete connections can arise for reasons other than reachability problems. These include: (a) non-live destination hosts; (b) route changes during the connection setup of a single flow, i.e., the SYN and DATA packets traverse different routes; (c) port scanners generating SYN packets.

Under the assumption that port scanners are not present, detecting reachability problems would be easy. To deal with non-live destinations, a router should notice multiple incomplete connections to N distinct destination addresses (for a reasonable choice of N). The problem of route changes can be avoided by observing flows over a minimum time period T. Hence, a router can conclude that a prefix P is unreachable if during a period t it does not observe a complete TCP flow, where t is defined as the maximum of: (a) the time taken to observe N or more incomplete TCP flows with different destinations within prefix P; (b) a predefined time period T.

The basic probing mechanism described above suffers from two forms of classification errors: (a) false negatives; (b) false positives. A false negative arises when a router infers a reachable prefix as being unreachable due to incomplete connections. A false positive arises when an unreachable prefix is inferred as being reachable. A malicious end-host can create false positives by generating bogus TCP connections with SYN and DATA packets without receiving ACKs. In Section 6.2, we show how to choose the parameters N and T to reduce the chances of incomplete connections causing false negatives. The basic form of the protocol described in this section is vulnerable to port scanners generating many incomplete connections. In Section 6.2, we propose defensive measures against port scanners and motivate them using real-world measurements.

4.1.1 Dealing with False Positives

Malicious end-hosts can create false positives by opening bogus TCP connections to keep a router from detecting that a particular route is stale or invalid. Adversaries observing route advertisements from multiple vantage points (e.g., Routeviews [8]) can potentially notice misconfigurations before routers notice reachability problems. Such adversaries can exploit the situation and open bogus TCP connections. We propose a combination of active dropping and retransmission checks as a countermeasure to reduce the probability of false positives.

1. Active dropping: Choose a random subset of $n_d$ packets within a completed connection (or across connections), drop them, and raise an alarm if these packets are not retransmitted. Alternatively, one can just delay packets at the router instead of dropping them.

2. Retransmission check: Sample a different random subset of $n_r$ packets and raise an alarm if more than 50% of the packets are retransmitted.

An adversary generating a bogus connection cannot decide which packets to retransmit without receiving ACKs. If the adversary blindly retransmits many packets to avoid being detected by active dropping, the retransmission check notices a problem. We set a threshold of 50% for retransmission checks, assuming that most genuine TCP connections will not experience a loss rate close to 50%.
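The interplay of the two checks can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the router code: packets are plain integers, the sender is modelled as a hypothetical callback that returns the set of packets it retransmits given the set the router dropped, and the subset sizes are passed as n_d and n_r. Only the 50% threshold is taken from the text.

```python
import random


def active_drop_alarm(dropped, retransmitted):
    """Alarm if any deliberately dropped packet was never retransmitted."""
    return not set(dropped) <= set(retransmitted)


def retransmission_alarm(sampled, retransmitted, threshold=0.5):
    """Alarm if more than `threshold` of the sampled packets were retransmitted."""
    sampled = list(sampled)
    hits = sum(1 for pkt in sampled if pkt in retransmitted)
    return hits > threshold * len(sampled)


def looks_bogus(packet_ids, sender, n_d, n_r, rng):
    """Run both checks on one flow; True means the flow looks bogus."""
    dropped = rng.sample(packet_ids, n_d)        # packets the router drops
    retransmitted = sender(dropped)              # sender's reaction (toy model)
    rest = [p for p in packet_ids if p not in dropped]
    sampled = rng.sample(rest, n_r)              # a different random subset
    return (active_drop_alarm(dropped, retransmitted)
            or retransmission_alarm(sampled, retransmitted))


ids = list(range(100))
rng = random.Random(7)
genuine = lambda dropped: set(dropped)   # retransmits exactly what was lost
silent = lambda dropped: set()           # bogus sender that never retransmits
blind = lambda dropped: set(ids)         # bogus sender that retransmits everything
```

In this model a genuine sender passes both checks, a sender that never retransmits trips the active dropping check, and a sender that blindly retransmits everything trips the retransmission check.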
Consider an adversary that has transmitted $m$ packets in a TCP connection without receiving ACKs and that retransmits a fraction, $q$, of these packets. Let $\binom{a}{b} = \frac{a!}{b!\,(a-b)!}$ represent the binomial coefficient for two values $a$ and $b$. The probability with which the adversary is able to mislead the active dropping test (which drops $n_d$ packets) is given by $\binom{qm}{n_d} / \binom{m}{n_d}$. The probability with which the retransmission check (which samples $n_r$ packets) cannot detect an adversary is given by the tail of the binomial distribution, $\sum_{i=0}^{\lfloor n_r/2 \rfloor} \binom{n_r}{i} q^i (1-q)^{n_r-i}$. Hence the overall probability, $p_f$, that our algorithm does not detect an adversary is:

$$p_f = \max_{q} \left[ \frac{\binom{qm}{n_d}}{\binom{m}{n_d}} \cdot \sum_{i=0}^{\lfloor n_r/2 \rfloor} \binom{n_r}{i} q^i (1-q)^{n_r-i} \right]$$

For a given prefix, the overhead of active dropping can be made very small. By choosing $n_d = 6$ and dropping only 6 packets across different TCP flows, we can keep the probability of a false positive, $p_f$, small.

This countermeasure is applied only when we notice a discrepancy across different TCP connections to the same destination prefix, i.e., the numbers of incomplete and complete connections are roughly the same. In this case, we sample and test whether a few complete connections are indeed bogus.

4.1.2 Detailed Algorithm

Figure 6 presents the pseudo-code for the Listen algorithm. The algorithm takes a conservative approach towards determining whether a route is verifiable. Since false positive tests can impact the performance of a few flows, the algorithm uses the constants $c_h$ and $c_l$ to decide when to test for false positives. When the test is not applied, we use the fraction of complete connections as the only metric to determine whether the route works.

    procedure LISTEN(P, T, N)
    Require: Prefix P, time period T, number of unique destinations N
        t_0 = time at which first SYN packet observed
        wait until N flows with distinct dest. in P observed
        wait till clock time > t_0 + T
        {Clean the data-set}
        For every pair of IP addresses (src, dst) observed
            if at least a single connection has completed then
                Add sample (src, dst, complete)
            else
                Add sample (src, dst, incomplete)
            end if
        {Constants c_h, c_l must be determined in practice}
        if fraction of complete connections > c_h then
            return "route is verifiable"
        end if
        if at least one connection completes then
            if fraction of complete connections < c_l then
                {Test for false positive}
                sample 2 future complete TCP flows towards P
                apply active dropping and retransmission checks
                if test is successful then
                    return "route is verifiable"
                else
                    return "route is not verifiable"
                end if
            end if
        end if

Figure 6: Pseudo-code for the probing algorithm.

The setting of $c_h$ and $c_l$ depends on the popularity of the prefixes. Firstly, we apply the false positive tests only for popular prefixes, i.e., $c_l = 0$ for non-popular prefixes. For a popular prefix, we choose a conservative estimate of $c_h$ (closer to 1), i.e., a large fraction of the connections have to complete in order to conclude that the route is verifiable. On the other hand, if we observe a reasonable fraction of incomplete connections, we apply the false positive test to sampled complete connections. The user has a choice in tuning $c_l$ based on the total number of false positive tests that need to be performed. For non-popular prefixes, the statistical sample of connections is small. For such prefixes, we set the value of $c_h$ to be small.

5 Implementation

In this section, we describe the implementation of Listen and Whisper and their overhead characteristics.

5.1 Whisper Implementation

In this section, we focus only on the implementation of the strong split whisper protocol. The whisper implementation contains two basic components: (a) a stand-alone whisper library which performs the cryptographic operations used in the protocol; (b) a Whisper-BGP interface which integrates the whisper functions into a BGP implementation. We implemented the Whisper library on top of the crypto library supported by OpenSSL development version 0.9.6b-33. We integrated this library with the Zebra BGP router implementation version 0.93b [2]. Our Whisper implementation works on Linux and FreeBSD platforms.

5.1.1 Whisper Library

The structure of a basic Whisper signature is:

    typedef struct {
        BIGNUM *seed;
        BIGNUM *N;
    } Signature;

BIGNUM is a basic data structure used within the OpenSSL crypto library to represent large numbers. The whisper library supports these three functions using the Signature data structure:

    generate_signature(Signature *sg);
    update_signature(Signature *sg, int asnumber, int position);
    verify_signatures(Signature *r, Signature *s, int *aspath_r, int *aspath_s);

These functions map exactly to the three whisper operations described earlier in Section 3.2.2. The main advantage of separating the whisper library from the whisper-BGP interface is modularity. The whisper library can be used in isolation with any other BGP implementation sufficiently different from the Zebra version.

    Operation             512-bit      1024-bit     2048-bit
    update_signature      0.18 msec    0.45 msec    1.42 msec
    verify_signatures     0.25 msec    0.6 msec     1.94 msec
    generate_signature    0.4 sec      8.0 sec      68 sec

Table 1: Processing overhead of the Whisper operations on a 1.5 GHz Pentium IV with 512 MB RAM.

5.1.2 Integration with BGP

The Whisper protocol can be integrated with BGP without changing the basic packet format of BGP. Specifically, we do not need any additional field for the Whisper signature. BGP uses community attributes within UPDATE messages that can be leveraged for embedding the signature attributes.
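To make the embedding idea concrete, the following Python sketch splits one signature field across 32-bit community values. The layout is an assumption for illustration only: the top 8 bits of each value carry a tag and the low 24 bits carry payload, and the tag values TAG_SEED and TAG_N are made up rather than taken from the implementation. The sketch also assumes that the order of community values is preserved between sender and receiver.

```python
import math

TAG_SEED = 0xF1     # hypothetical tag marking fragments of the seed field
TAG_N = 0xF2        # hypothetical tag marking fragments of the N field
PAYLOAD_BITS = 24   # payload bits per 32-bit community value (assumed layout)
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1


def pack_field(value, field_bits, tag):
    """Split a field_bits-wide integer into tagged 32-bit community values."""
    count = math.ceil(field_bits / PAYLOAD_BITS)
    communities = []
    for i in range(count):
        # Fragment 0 carries the least significant 24 bits.
        payload = (value >> (i * PAYLOAD_BITS)) & PAYLOAD_MASK
        communities.append((tag << PAYLOAD_BITS) | payload)
    return communities


def unpack_field(communities, tag):
    """Reassemble a field from the community values carrying the given tag."""
    value = 0
    index = 0
    for community in communities:
        if community >> PAYLOAD_BITS == tag:
            value |= (community & PAYLOAD_MASK) << (index * PAYLOAD_BITS)
            index += 1
    return value


seed = (1 << 1023) | 0xABCDEF            # toy 1024-bit seed value
attrs = pack_field(seed, 1024, TAG_SEED) + pack_field(12345, 1024, TAG_N)
```

Under this assumed layout a 1024-bit field needs ceil(1024 / 24) = 43 community values, so a signature with a 1024-bit seed and a 1024-bit N occupies 86 values in a single UPDATE message.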
Community attributes are 32-bit values; they are optional BGP attributes that are mainly used for community-based routing, primarily by multi-homing ISPs.

This design offers us many advantages over updating a version of BGP. First, a single update message can have several community attributes and one can split a signature among multiple community attributes. Second, a community attribute can be set using the BGP configuration script to allow operators the flexibility to insert their own community attribute values. In a similar vein, one can imagine a stand-alone whisper library computing the signatures and a simple interface to insert these signatures within the community attributes. Third, one can reserve a portion of the community attribute space for whisper signatures. In today's BGP, community values can be set to any value as long as they are interpreted correctly by other routers.

Our implementation uses the following semantics for the community attribute: if the first 8 bits of an attribute are set to one of two reserved values, then the remaining 24 bits refer to a portion of the seed and N attributes in the signature, respectively. An RSA-based Whisper signature uses 2048 bits per signature field: 1024 bits for the seed and 1024 bits for N. Such a signature is split across multiple community attributes. An ECC-based Whisper implementation uses fewer bits per signature and hence fewer community attributes.

5.2 Listen Implementation

We implemented the passive probing component of Listen (i.e., without active dropping) in about 2000 lines of C code and have ported the code to the Linux and FreeBSD operating systems. The current prototype uses the libpcap utility [5] to capture all the packets off the network. This form of implementation has two advantages: (a) it is stand-alone and can run on any machine (which need not be a router) that can sniff network traffic; (b) it does not require any support from router vendors. Additionally, one can execute bgpd (Zebra's BGP daemon [2]) to receive live BGP updates from a network router. For faster line rates (e.g., links in ISPs), Listen should be integrated with hardware or packet probing software like Cisco's Netflow [1]. The current implementation cannot support false positive tests since the code can only passively observe the traffic but cannot actively drop packets (since it does not perform the routing functionality).

In our implementation, the complexity of listening to a TCP flow is of the same order as a route lookup operation. Additionally, the state requirement is O(1) for every prefix. We maintain a small hash table for every prefix entry, keyed by the (src, dst) IP addresses of a TCP flow and holding a time stamp. While a SYN packet sets a bit in the hash table, the DATA packet clears the bit and records a complete connection for the prefix. Using a small hash table, we can crudely estimate the number of complete and incomplete connections within a time period T. Additionally, we sample flows to reduce the possibility of hash conflicts. This implementation uses simple statistical counter estimation techniques of the kind used to efficiently maintain statistics in routers. Hence, the basic form of Listen can be efficiently implemented in the fast path of today's routers.

Deployment: We deployed our Listen prototype to sniff TCP traffic to and from a /24 prefix within our university. Additionally, we received BGP updates from the university campus router and constructed the list of prefixes in the routing table used by the edge router. The tool only needs to know the list of prefixes in the routing table and assumes a virtual route for every prefix. The Listen tool can report the list of verifiable and non-verifiable prefixes in real time. Additionally, the Listen algorithm is applied only by observing traffic in one direction (either outbound or inbound).

5.3 Overhead Characteristics

Overhead of Whisper: One of the important requirements of any cryptography-based solution is low complexity. We performed benchmarks to determine the processing overhead of the Whisper operations. Table 1 summarizes the average time required to perform the whisper operations for different key sizes: 512-bit, 1024-bit and 2048-bit. As the key size increases, the RSA-based operations offer better security. Security experts recommend a minimum size of 1024-bit keys for better long-term security.

We make two observations about the overhead characteristics. First, the processing overhead for all these key sizes is well within the limits of the maximum load observed at routers. For 2048-bit keys, a node can process more than 42,000 route advertisements within 1 minute, which exceeds the maximum per-minute update rate observed at a Sprint router [9]. For 1024-bit keys, Whisper can update and verify over 100,000 route advertisements per minute. Second, generate_signature() is an expensive operation that can take from 0.4 sec to over a minute depending on the key size (Table 1). However, this operation is performed only once over many days.

Overhead of Listen: By analyzing route updates collected over multiple days in Routeviews [8], we observed that most of the routes in a routing table are stable for at least 1 hour. Based on data from a tier-1 ISP, we find that a router typically observes a maximum of 20,000 active prefixes over a period of 1 hour, i.e., only 20,000 prefixes observe any traffic. If the probing mechanism uses a statistical sample of 10 flows per prefix, the overhead of probing at the router is negligible. Essentially, the router needs to process 200,000 flows in 3600 sec, which translates to monitoring under 60 flows every second (equivalent to 60 routing lookups). Even if the number of active prefixes scales by a factor of 10, current router implementations can easily implement the passive probing aspect of Listen.

Active dropping and retransmission checks are applied only in the IP slow path and are invoked only when a prefix observes a combination of both incomplete and complete connections. To minimize the additional overhead of these operations, we restrict these checks to a few prefixes.

6 Evaluation

In this section, we evaluate the key properties of Listen and Whisper. Our evaluation is targeted at answering specific questions about Listen and Whisper:

1. How much security can Whisper provide in the face of isolated adversaries?

2. How useful is Listen in the real world? In particular, can it detect reachability problems?

3. How does Listen react in the presence of port scanners? How does one adapt to such port scanners?

We answer question (1) in Section 6.1 and questions (2) and (3) in Section 6.2. Our evaluation methodology is two-fold: (a) empirically evaluate the security properties of Whisper; (b) use a real-world deployment to determine the usefulness of Listen. To evaluate the security properties of Whisper, it is necessary to determine the effects of the worst-case scenario, which is better quantified using an empirical evaluation. We collected the Internet AS topology data based on BGP advertisements observed over multiple days from several different vantage points, including Routeviews [8] and RIPE [7]. The policy-based routing path between a pair of AS's is determined using customer-provider and peer-peer relationships, which have been inferred based on the technique used in [32].

                                       Outbound    Inbound
    Number of Reachability Problems    235         343
    Probability of False Negatives     0.93%       0.37%

Table 2: Listen: Summary of Results

6.1 Whisper: Security Properties against Isolated Adversaries

In this section, we quantify the maximum damage an isolated adversary can inflict on the Internet given that Strong Split Whisper (SSW) is deployed. Since SSW offers path integrity, an isolated adversary cannot propagate invalid routes without raising alarms unless there exists no alternate route from the origin to the verifier (i.e., the adversary is present in all paths from the origin to the Internet). Given an adversary that is willing to raise alarms, we analyzed how many AS's one such adversary can affect. In this analysis, we exclude cases where the adversary is already present in the only routing path to a destination AS. We use penalty based route selection as the main defense to contain the effects of such invalid routes. We assume that in the worst case, an adversary compromising a single router in an AS is equivalent to compromising the entire AS, especially if all routers within the AS choose the invalid route propagated by the compromised router.

Let M represent an isolated adversary propagating an invalid route claiming direct connectivity to an origin AS O. An AS V is said to be affected by the invalid route if V chooses the route through M rather than a genuine route to O, either due to BGP policies or due to shorter hop length. Based on common practices, we associate all AS's with a simple policy where customer routes have the highest preference, followed by peers and providers [18]. Given all these relationships, we define the vulnerability of an origin AS O, $\mathcal{V}(O, M)$, to be the maximum fraction of AS's that M can affect. Given an isolated adversary M, we can quantify the worst-case effect that M can have on the Internet using the cumulative distribution of $\mathcal{V}(O, M)$ across all origin AS's O in the Internet.

With AS's deploying penalty based route selection as a defense, we expect the vulnerability $\mathcal{V}(O, M)$ to reduce. We study how the cumulative distribution of $\mathcal{V}(O, M)$ for a single adversary M varies as a function of how many AS's deploy penalty based route selection. We consider the scenario where the top n ISPs (based on AS degree) deploy penalty based route selection. Figure 7 shows this cumulative distribution for n = 100, 300, 500 and 1000. These distributions are averaged across all possible choices for M.

Figure 7: Effects of penalty based route selection. (Cumulative distribution versus the fraction of nodes vulnerable to attack (%), for the top 100, 300, 500 and 1000 ISPs.)

We make the following observations. First, a median value of 1% for n = 1000 indicates that a randomly located adversary can affect at most 1% of destination AS's by propagating bogus advertisements, assuming that the top 1000 ISPs use penalties. This is orders of magnitude better than what the current Internet can offer, where a randomly located adversary can on average affect nearly 30% of the routes to a randomly chosen destination AS (obtained by repeating the same analysis without SSW). Second, in the worst case, the fraction of destination AS's that a single AS can affect for n = 1000 is limited by the structure of the Internet topology: it represents the size of the largest connected component without the top 1000 ISPs. One malicious AS in this component can potentially affect other AS's within the same component. Third, if all provider AS's use penalties for route selection, the worst-case behavior can be brought to a much smaller value than this limit. Additionally, there is very little benefit in deploying penalty based route selection in end-host networks since they are not transit networks and typically are sources and sinks of route advertisements. Hence, any filtering at these end-hosts only protects themselves but not other AS's.

To summarize, the Whisper protocol in conjunction with penalty based route selection can guarantee that a randomly placed isolated adversary propagating invalid routes can affect at most 1% of the AS's in the Internet topology.
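The vulnerability metric of Section 6.1 can be sketched as follows. This is a simplified illustration rather than the evaluation code: each AS is assumed to hold exactly one genuine and one bogus candidate route, a route is reduced to a (relationship, AS path length) pair, ties are assumed to favour the genuine route, and all function names and toy inputs are ours.

```python
# Lower rank is preferred: customer routes first, then peers, then providers.
PREFERENCE = {"customer": 0, "peer": 1, "provider": 2}


def is_affected(genuine, bogus):
    """True if an AS would pick the bogus route over the genuine one.

    Each route is a (relationship, as_path_length) pair. Policy preference is
    compared first and path length second; ties go to the genuine route.
    """
    def rank(route):
        relationship, length = route
        return (PREFERENCE[relationship], length)
    return rank(bogus) < rank(genuine)


def vulnerability(routes_per_as):
    """Fraction of AS's affected for one (origin, adversary) pair."""
    affected = sum(1 for genuine, bogus in routes_per_as
                   if is_affected(genuine, bogus))
    return affected / len(routes_per_as)


def empirical_cdf(values):
    """Sorted (value, cumulative fraction) pairs, as plotted in a CDF."""
    ordered = sorted(values)
    return [(v, (i + 1) / len(ordered)) for i, v in enumerate(ordered)]


# Toy input: (genuine route, bogus route) as seen by four AS's.
toy = [
    (("peer", 2), ("customer", 5)),      # affected: policy prefers customer
    (("customer", 3), ("provider", 1)),  # not affected
    (("peer", 4), ("peer", 2)),          # affected: shorter path
    (("customer", 1), ("customer", 2)),  # not affected
]
```

Computing `vulnerability` for every origin and passing the results to `empirical_cdf` yields the kind of curve that the text reads its median and worst-case values from.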
6.2 Listen: Experimental Evaluation

In this section, we describe our real-world experiences using the Listen protocol. We make two important observations from our analysis. First, we found that a large fraction of incomplete TCP connections are spurious, i.e., not indicative of a reachability problem. We show that by adaptively setting the parameters N and T of our Listen algorithm we can drastically reduce the probability of false negatives due to such connections. Second, we are able to detect several reachability problems using Listen, including specific misconfiguration-related problems like forwarding errors.

    Number of end-hosts behind /24 network         28
    Number of days                                 40
    Total No. of TCP connections                   994234
    No. of complete connections                    894897
    No. of incomplete connections                  99337
    Average Routing Table Size                     123482
    Total No. of Active Prefixes                   11141
    Average No. of Active Prefixes per hour        141
    Average No. of Active Prefixes per day         2500-3000
    Verifiable Prefixes                            9711
    Prefixes with perennial problems               42

Table 3: Aggregate characteristics of Listen from the deployment

Table 2 presents a concise summary of the results obtained from our deployment. We were able to detect reachability problems to 578 different prefixes from our testbed, with very low false negative probabilities of 0.93% and 0.37% due to spurious outbound and inbound connections, respectively.

We will now describe our deployment experience in greater detail. In our testbed, we use three active probing tests to verify the correctness of results obtained using Listen: (a) ping the destination; (b) traceroute and check whether any IP address along the path is in the same prefix as the destination; (c) perform a port 80 scan on the destination IP address. These tests are activated for every incomplete connection. We classify an incomplete connection as having a reachability problem only if all three probing tests fail.
We classify an incomplete connection as a spurious connection if one of the probing techniques is able to detect that the route to a destination prefix works. A spurious TCP connection is an incomplete connection that is not indicative of a reachability problem.

Table 3 presents the aggregate characteristics of the traffic we observed from a /24 network for over 40 days. We found that nearly 10% of the connections are incomplete, and that a large fraction of these are spurious for both inbound and outbound traffic (63% for outbound connections). A more careful look at the spurious connections showed that a large share of the spurious inbound connections are due to port scanners and worms, the most prominent being the Microsoft NetBIOS worm and the SQL server worms [6]. Spurious outbound connections occur primarily due to failed connection attempts to non-live hosts and attempts to access disabled ports of other end-hosts (e.g., the telnet port being disabled in a destination end-host). Given this alarmingly high number of spurious connections, we propose defensive measures to reduce the probability of false negatives due to such connections.

6.2.1 Defensive Measures to reduce False Negatives

In this section, we show that one can adaptively set the parameters N and T in the Listen algorithm to drastically reduce the probability of false negatives due to spurious TCP connections. In particular, we show that by adaptively tuning the minimum time period, T, one can reduce false negatives due to port scanners, and by tuning the number of distinct destinations, N, one can deal with non-live hosts. Given the nature of incomplete connections in our testbed, we use outbound incomplete connections as a test sample for non-live hosts and inbound connections as the test sample for port scanners and worms. In both the inbound and outbound cases, we restricted our samples to only those connections which are known to be false negatives.
Setting T: One possibility is to choose an interval T large enough such that the router will notice at least one genuine TCP flow during the interval. Such a value of T will depend on the popularity of a prefix. The popularity of a prefix P, $M_t(P)$, is defined as the mean time between two complete TCP connections to prefix P. We can model the arrival of TCP connections as a Poisson process with a mean arrival rate of $1/M_t(P)$ [30]. Given this, we can set $T = 4.6 \times M_t(P)$ to be 99% certain that one would experience at least one genuine connection within the period T. To have a 99.9% certainty, one needs to set $T = 6.9 \times M_t(P)$. For prefixes that hardly observe any traffic, the value of T will be very high, implying that port scanners generating incomplete connections to such prefixes will not generate any false alarms.

From our testbed, we determine the mean separation time between the arrival of two incoming connections to be $M_t(P) \approx 34$ sec. By merely setting $T \approx 156$ sec to achieve 99% certainty, we could reduce the probability of false negatives in Listen to 0.37%. Over the entire measurement period, the only incorrect conclusions that the local prefix was unreachable occurred during individual periods of 156 seconds each.

Setting N: The choice of an appropriate value of N trades off between minimizing the false negative ratio due to non-live hosts and the number of reachability problems detected. In our testbed, we noticed that by merely setting N = 2, we can significantly reduce the false negative ratio in outbound connections from 63% to less than 1%. However, Listen then reported only 235 out of 663 potential prefixes to have routing problems. For several /24 prefixes, we observed TCP connections to only a single host, and by setting N = 2 we tend to omit these cases. In practice, the value of N is dependent on the diversity of traffic to a destination prefix and the traffic concentration at a router. For many /24 prefixes, we need to set N = 1. For /8 and /16 prefixes, one can choose larger values of N = 4 or N = 5 provided the prefix observes diversity in the traffic.

6.2.2 Detected Reachability Problems

Among the reachability problems detected by Listen, two specific types of routing problems (as detected by active probing) are routing loops and forwarding errors due to unknown IP addresses. Using traceroute, we were able to detect routing loops, and we inferred forwarding errors using the routing table entries at the University exit router. A forwarding error arises when the destination IP address in a packet is a genuine one but the router has no next hop forwarding entry for the IP address. This can potentially arise due to staleness of routes. Table 4 summarizes the number of prefixes which are affected by each type of problem. In particular, we observe routing loops to 51 different prefixes and forwarding errors to 64 different prefixes. Additionally, Listen detected 463 other prefixes having other forms of reachability problems.

    Type of problem              Number of Prefixes
    Routing Loops                51
    Forwarding Errors            64
    Generic (forward path)       146
    Generic (reverse path)       317

Table 4: The number of prefixes affected by different types of reachability problems.

To cite a few examples of reachability problems we observed: (a) A BGP daemon within our network attempted to connect to another such daemon within the destination prefix 193.148.15.0/24. The route to this prefix was perennially unreachable due to a routing loop. (b) The route to the Yahoo-NET prefix 207.126.224.0/20 was fluctuating. During many periods, the route was detected as unavailable.
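Returning to the choice of the minimum observation period T in Section 6.2.1: under the Poisson arrival model, the probability of seeing at least one genuine connection within T is 1 - exp(-T / mean gap), so the smallest T for a target confidence follows directly. The Python sketch below works this out; the function name and the 60-second example gap are ours and purely illustrative.

```python
import math


def min_interval(mean_gap_seconds, confidence):
    """Smallest T such that P(at least one arrival within T) >= confidence.

    Assumes Poisson arrivals with mean inter-arrival time mean_gap_seconds,
    so P(at least one arrival in T) = 1 - exp(-T / mean_gap_seconds).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if mean_gap_seconds <= 0:
        raise ValueError("mean gap must be positive")
    return -math.log(1.0 - confidence) * mean_gap_seconds


# Multipliers of the mean gap for two confidence levels.
factor_99 = min_interval(1.0, 0.99)     # about 4.6
factor_999 = min_interval(1.0, 0.999)   # about 6.9

# Illustrative prefix with one complete connection per minute on average.
t_example = min_interval(60.0, 0.99)
```

The two multipliers, ln(100) and ln(1000), correspond to the 99% and 99.9% settings: an unpopular prefix with a long mean gap gets a proportionally long T, which is what keeps port scans against rarely used prefixes from raising false alarms.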
For many ✞ ✝ 7 Colluding Adversaries Additional to acting as a group of isolated adversaries, colluding adversaries can tunnel advertisements and secrets between them and create invalid routes with fake AS links without being detected by the Whisper protocols. These invalid routes are not detectable even with a PKI unless the complete topology is known and enforced. Despite the limitation, we can provide protective measures for avoiding these invalid routes. ✞ Given the hierarchical nature and the skewed structure of the Internet topology, the invalid paths from colluding adversaries not detectable by the Whisper tend to be longer in AS path length. This is because, a normal route would traverse the Internet core (tier-1 + tier-2 ISPs) once while a consistent invalid route through ✁ colluding adversaries 1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.5 0.4 0.3 2 Tier−1 ASes 2 Tier−2 ASes 12 Customer ASes 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Cumulative Distribution Cumulative Distribution Cumulative Distribution 1 0.6 0.5 0.4 0.3 2 Tier−1 ASes 2 Tier−2 ASes 12 Customer ASes 0.2 0.8 0.6 0.4 2 Tier−1 ASes 2 Tier−2 ASes 12 Customer ASes 0.2 0.1 0.9 1 Percentage of affected ASes 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Percentage of affected ASes 0 −1 10 0 10 1 2 10 10 Percentage of affected ASes Figure 8: The effects of colluding adver- Figure 9: Effects of colluding adversaries Figure 10: Effect of colluding adversaries saries in the current Internet. with whisper + policy routing. with whisper + shortest path routing traverses the Internet core twice (since the adversary cannot remove any AS from the path). Hence, by choosing the shortest path we have a better chance of avoiding the invalid route. Figures 8, 9 and 10, illustrates this effect of colluding adversaries for scenarios: (a) the current Internet with no protection; (b) whisper protocols with policy routing; (c) whisper protocols with shortest path routing. 
All these graphs show the cumulative distribution of the vulnerability metric (defined in Section 6.1) for a set of colluding malicious adversaries. We specifically consider three cases: (a) 2 colluding tier-1 AS's; (b) 2 colluding tier-2 AS's; (c) 12 colluding customer AS's.

We make two observations. First, 12 randomly compromised customer routers can inflict the same magnitude of damage as two tier-1 nodes, illustrating the effect of colluding adversaries in the current Internet. Typically, customer AS's are easier to compromise since many of them are unmanaged. Second, whisper protocols with shortest path routing drastically reduce the possibility of colluding adversaries (in comparison to policy routing) propagating invalid routes without triggering alarms. In particular, even when 12 customer AS's are compromised, the effect on Internet routing is negligible.

Whisper protocols with policy routing offer much less protection since BGP tends to choose routes based on local preference. The typical policy convention, based on stable routing and economic constraints, is to prefer customer routes over peer and provider routes [18]. This preference rule increases the vulnerability of BGP to picking consistent invalid routes from customers over potentially shorter routes through peers or providers. In principle, this problem also exists in S-BGP. To strike a middle ground between the flexibility of policy routing and this vulnerability, we propose a simple modification to the policy engine: Do not associate any local preference with customer routes that have an AS path length greater than 2 (any route from a pair of colluding adversaries should have a minimum path length of 3). We believe that this modification to BGP policies should have little impact on current operation since most customer routes today have a path length less than 3.

To summarize, whisper protocols in combination with the modified policies (emulating shortest path routing) can largely restrict the damage of colluding adversaries.

8 Discussion

We now discuss a few important aspects of Listen and Whisper not covered earlier.

Hijacking unallocated prefixes: With the deployment of Whisper, a malicious adversary can still claim ownership over unallocated address spaces without triggering alarms by propagating bogus announcements. One way of dealing with this problem is to request ICANN [3] to specifically advertise unallocated address spaces with its own corresponding Whisper signatures whenever it notices an advertisement for an unallocated prefix. Additionally, to avoid a DoS attack on ICANN for such prefixes, routers should not maintain forwarding entries for these prefixes.

Route Aggregation: Whenever an AS aggregates several route advertisements into one, it is required to perform one of the following operations to maintain the consistency of the aggregated route: (a) Append the individual signatures corresponding to each advertisement so that an upstream AS can match at least one of the signatures with the whisper signatures for alternate routes to sub-prefixes. (b) If the AS owns the entire aggregated prefix (a common form of aggregation in BGP), ignore the whisper signatures in the sub-prefixes and append its own whisper signature.

Other types of security attacks: Other than propagation of invalid routes, one can imagine other forms of routing attacks or misconfiguration errors which may result in routing loops, persistent route oscillations or convergence problems. Such problems are out of the scope of this paper.

9 Conclusions

In this paper we consider the problem of reducing the vulnerability of BGP in the face of misconfigurations and malicious attacks. To address this problem we propose two techniques: Listen and Whisper.
Used together, these techniques can detect and contain invalid routes propagated by isolated adversaries, as well as a large number of problems due to misconfigurations. To demonstrate the utility of Listen and Whisper, we use a combination of real-world deployment and empirical analysis. In particular, we show that Listen can detect unreachable prefixes with a low probability of false negatives, and that Whisper can limit the percentage of nodes affected by a randomly placed isolated adversary to less than . Finally, we show that both Listen and Whisper are easy to implement and deploy. Listen is incrementally deployable and does not require any changes to BGP, while Whisper can be integrated with BGP without changing the packet format.

Acknowledgments

The anonymous reviewers and Amin Vahdat, our shepherd, provided us with invaluable feedback which helped substantially in improving the quality of the paper. Tom Anderson, Anand Desai, Nick Feamster, Mark Handley, Chris Karlof, Ratul Mahajan, Satomi Okazaki, Vern Paxson, Adrian Perrig, Jennifer Rexford, Dawn Song, Doug Tygar and David Wagner provided several technical comments on this work. Krishna Gummadi and Konstantina Papagiannaki provided us with valuable data for empirically evaluating our Listen algorithm. Several students at Berkeley read earlier drafts of this paper and provided useful feedback. The authors would like to thank them all.

References

[1] Cisco IOS NetFlow. http://www.cisco.com/warp/public/732/Tech/nmp/netflow/index.shtml.
[2] GNU Zebra router implementation. http://www.zebra.org/.
[3] Internet Corporation for Assigned Names and Numbers. http://www.icann.org/.
[4] Internet routing registry. http://www.irr.net/. Version current January 2003.
[5] libpcap utility. http://sourceforge.net/projects/libpcap.
[6] Microsoft port 1433 vulnerability. http://lists.insecure.org/lists/vuln-dev/2002/Aug/0073.html.
[7] RIPE NCC. http://www.ripe.net.
[8] Routeviews. http://www.routeviews.org/.
[9] Sprint IPMON project. http://ipmon.sprint.com/.
[10] Trends in DoS attack technology. http://www.cert.org/archive/pdf/DoS_trends.pdf.
[11] J. Arkko and P. Nikander. How to authenticate unknown principals without trusted parties. In Proc. Security Protocols Workshop 2002, April 2002.
[12] M. Bellare and D. Micciancio. A new paradigm for collision-free hashing: Incrementality at reduced cost. Volume 1223 of Lecture Notes in Computer Science. Springer Verlag, 1997.
[13] I. Blake, G. Seroussi, and N. Smart. Elliptic Curves in Cryptography. Cambridge University Press, 2000.
[14] V. J. Bono. 7007 explanation and apology. http://www.merit.edu/mail.archives/nanog/1997-04/msg00444.html.
[15] R. Clarke. Conventional public key infrastructure: An artefact ill-fitted to the needs of the information society. Technical report. http://www.anu.edu.au/people/Roger.Clarke/II/PKIMisFit.html.
[16] D. Davis. Compliance defects in public key cryptography. In Proc. 6th USENIX Security Symposium, 1996.
[17] C. Ellison and B. Schneier. Ten risks of PKI: What you’re not being told about public key infrastructure. Computer Security Journal, 16(1):1–7, 2000. Available online at http://www.counterpane.com/pki-risks.html.
[18] L. Gao and J. Rexford. Stable internet routing without global coordination. In IEEE/ACM Transactions on Networking, 2001.
[19] G. Goodell, W. Aiello, T. Griffin, J. Ioannidis, P. McDaniel, and A. Rubin. Working around BGP: An incremental approach to improving security and accuracy of interdomain routing. In Proc. of NDSS, San Diego, CA, USA, Feb. 2003.
[20] Y. Hu, D. B. Johnson, and A. Perrig. SEAD: Secure efficient distance vector routing for mobile wireless ad hoc networks. In Proc. of WMCSA, June 2002.
[21] Y. Hu, A. Perrig, and D. B. Johnson. Wormhole detection in wireless ad hoc networks. Technical Report TR01-384, Department of Computer Science, Rice University, Dec. 2001.
[22] Y. Hu, A. Perrig, and D. B. Johnson. Efficient security mechanisms for routing protocols. In Proc. of NDSS’03, February 2003.
[23] S. Kent, C. Lynn, and K. Seo. Design and analysis of the Secure Border Gateway Protocol (S-BGP). In Proc. of DISCEX ’00.
[24] S. Kent, C. Lynn, and K. Seo. Secure Border Gateway Protocol (Secure-BGP). IEEE Journal on Selected Areas of Communications, 18(4):582–592, Apr. 2000.
[25] R. Mahajan, D. Wetherall, and T. Anderson. Understanding BGP misconfigurations. In Proc. ACM SIGCOMM Conference, Pittsburgh, Aug. 2002.
[26] Z. Mao, J. Rexford, J. Wang, and R. H. Katz. Towards an accurate AS-level traceroute tool. In ACM SIGCOMM, 2003.
[27] S. Murphy, O. Gudmundsson, R. Mundy, and B. Wellington. Retrofitting security into Internet infrastructure protocols. In Proc. of DISCEX ’00, volume 1, pages 3–17, 1999.
[28] J. Ng. Extensions to BGP to support Secure Origin BGP (sobgp). Internet Draft draft-ng-sobgp-bgp-extensions-00, Oct. 2002.
[29] V. N. Padmanabhan and D. R. Simon. Secure traceroute to detect faulty or malicious routing. In Proc. HotNets-I, 2002.
[30] V. Paxson and S. Floyd. Wide area traffic: Failure of Poisson modeling. In Proc. ACM SIGCOMM, 1994.
[31] B. Smith and J. Garcia-Luna-Aceves. Securing the Border Gateway Routing Protocol. In Proc. Global Internet ’96, London, UK, November 1996.
[32] L. Subramanian, S. Agarwal, J. Rexford, and R. H. Katz. Characterizing the Internet hierarchy from multiple vantage points. In IEEE INFOCOM, New York, 2002.
[33] R. Thomas. http://www.cmyru.com.
[34] X. Zhao, D. Pei, L. Wang, D. Massey, A. Mankin, S. F. Wu, and L. Zhang. An analysis of BGP multiple origin AS (MOAS) conflicts. In ACM SIGCOMM IMW, 2001.
[35] D. Zhu, M. Gritter, and D. Cheriton. Feedback based routing. In Proc. of HotNets-I, October 2002.