Sec18-Chen 0
Table 1: Summary of Different Off-Path TCP Side Channel Attacks including the one we propose in this paper
packets have been sent during a time interval, through diffing the queried IPIDs of a Windows machine. This is leveraged in several off-path TCP attacks [1, 25]. Using IP spoofing, an off-path attacker can tell whether the guess is correct based on whether a response is triggered.
However, at the time of writing, we experimentally verify that Windows 10 has finally eliminated this side channel by adopting a safer IPID generation algorithm similar to that used in Linux [33], where connections destined for different IP addresses will no longer share the same IPID counter.

• Browser page read. In this attack [27], the shared state is a browser page where an attacker runs malicious JavaScript and attempts to inject data into connections to a benign website (both the benign connection and malicious script run under the same page). The successful guess of the TCP sequence number results in direct feedback from the browser page load. There are three main culprits of the attack: (1) older operating systems follow an earlier standard, RFC 793, that considers half of the ACK number space valid. An off-path attacker only needs to guess two ACK values with every guessed sequence number to inject data successfully. Therefore, the feedback about when the injection succeeds is when the malicious payload gets loaded and rendered by the browser. (2) modern browsers are tolerant of response data: if the HTTP response header is missing, the browser simply attaches one automatically. This frees the attacker from having to prepare the header at an exact sequence number (otherwise the browser considers the response invalid and closes the connection). (3) HTTP pipelining is required so that a response that arrives ahead of time will be deemed valid.
This attack no longer works because the first culprit is eliminated by most modern operating systems (including Windows, Linux, Android), which adopted a more stringent check on ACK numbers as defined in RFC 5961 where only a much smaller window is considered valid. In addition, from our testing, HTTP pipelining is disabled or not implemented in all modern browsers, eliminating the third culprit as well.

• Global challenge ACK rate limit. The Linux kernel first implemented all the features suggested in RFC 5961 in version 3.6 and its TCP packet validation logic closely matches the one shown in Fig. 2. Notably, it implements the recommended ACK throttling feature by introducing a global system variable to control the maximum number of challenge ACKs generated per second. As this limit is shared across all connections, the shared state can be exploited as a side channel. For instance, to infer if an ongoing connection exists, an off-path attacker can initially send a spoofed packet with one guessed port number and SYN bit set; after the attacker sends another 100 (footnote 4) non-spoofed in-window RST packets to exhaust the challenge ACK count, it can then observe the number of responses to tell whether its initial spoofed packet matches the four tuples of an ongoing connection and hence triggers a challenge ACK.
Since the shared rate limit is a simple software artifact, shortly after the vulnerability was reported, it was eliminated in a patch introduced in Linux 4.6 [8, 42] where a per-socket rate limit is used instead.

• System-wide packet counter. Packet counters report aggregated statistics across all connections and are reliable side channels demonstrated in recent off-path attacks [40, 39]. These attacks require a piece of unprivileged malware to run on the client machine that can access these packet counters and use them as feedback for spoofed packets sent by the off-path attacker. Because these counters are internal to TCP implementations, they may leak more diverse and fine-grained information (more than what the standard packet validation logic can leak). In the extreme case, for example, a Linux/Android TCP packet counter named DelayedACKLost is incremented only when it receives a packet with a sequence number smaller than the expected one. This allows an attacker to conduct a binary search on the expected sequence number. Similar dangerous packet counters exist on macOS as well [40].
These packet counters are being mitigated in a number of ways. Linux introduced the namespace mechanism so that sensitive apps and untrusted apps can run in separate namespaces with isolated counters. For macOS, the side channel vulnerability has recently been assigned CVE-2017-13810 and patches have been pushed out to zero the sensitive counters [9, 10].

4 It's the default threshold in Linux version 3.6.
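As a rough illustration of the global challenge ACK rate limit side channel discussed in this section, the following Python sketch models a host with a single shared challenge ACK budget. The class `ToyHost`, the function `connection_exists`, and the example addresses are hypothetical names introduced only for this sketch; the model assumes every packet that matches an existing connection costs one challenge ACK and that all packets arrive within the same one-second window.

```python
from dataclasses import dataclass, field

# Per-second global budget; the text gives 100 as the Linux 3.6 default.
CHALLENGE_ACK_LIMIT = 100


@dataclass
class ToyHost:
    """Hypothetical host with ONE challenge-ACK budget shared by all connections."""
    connections: set = field(default_factory=set)  # known four-tuples
    budget: int = CHALLENGE_ACK_LIMIT

    def receive(self, four_tuple):
        """Return True if the packet triggers a challenge ACK."""
        if four_tuple not in self.connections:
            return False          # no matching connection: nothing is charged
        if self.budget == 0:
            return False          # shared budget exhausted for this second
        self.budget -= 1
        return True


def connection_exists(host, guessed_tuple, attacker_tuple):
    """Infer whether guessed_tuple is a live connection via the shared budget."""
    # Step 1: spoofed SYN carrying the guessed four-tuple. Any challenge ACK
    # goes to the spoofed address, so the attacker never sees it directly.
    host.receive(guessed_tuple)
    # Step 2: 100 non-spoofed in-window RSTs on the attacker's own connection.
    observed = sum(host.receive(attacker_tuple) for _ in range(CHALLENGE_ACK_LIMIT))
    # Step 3: one missing response means the spoofed packet consumed budget.
    return observed < CHALLENGE_ACK_LIMIT
```

With a live victim connection the attacker observes 99 responses instead of 100, which is the one-bit leak. A per-socket limit removes the shared state that this sketch relies on.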
[Figure residue: probing timelines between Attacker and Server for the "Not Trigger ACK" and "Trigger ACK" cases. Recoverable labels: Pre-Probe, Probe, Post-Probe, Query, RTT_1, RTT_2, Gap_1, Gap_2, Full Duplex.]
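The binary search enabled by a counter such as DelayedACKLost, as described for the system-wide packet counter side channel, can be sketched as follows. This is a toy model under stated assumptions: `ToyClient` and `infer_rcv_nxt` are hypothetical names, the counter is read directly (standing in for the unprivileged malware), and sequence number wraparound is ignored, whereas a real TCP stack compares sequence numbers modulo 2^32.

```python
SEQ_SPACE = 2 ** 32


class ToyClient:
    """Hypothetical client exposing a DelayedACKLost-style counter."""

    def __init__(self, rcv_nxt):
        self.rcv_nxt = rcv_nxt        # expected sequence number (the secret)
        self.delayed_ack_lost = 0     # counter readable by the malware

    def receive(self, seq):
        # The counter moves only for sequence numbers below the expected one.
        if seq < self.rcv_nxt:
            self.delayed_ack_lost += 1


def infer_rcv_nxt(client):
    """Binary-search the expected sequence number; returns (guess, probes)."""
    lo, hi = 0, SEQ_SPACE - 1
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        before = client.delayed_ack_lost
        client.receive(mid)           # spoofed packet sent by the off-path attacker
        probes += 1
        if client.delayed_ack_lost > before:
            lo = mid + 1              # counter moved: mid is below rcv_nxt
        else:
            hi = mid                  # counter unchanged: mid is at or above rcv_nxt
    return lo, probes
```

Under these assumptions, the search needs at most 32 probes to cover the full 32-bit sequence space, which is why such a counter is a particularly strong side channel.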
Summary. Overall, we listed four different types of software-based side channels that have been exploited to launch off-path TCP attacks. We summarize them in Table 1 for reference. In short, only the packet counter side channels still exist (validated on Linux and Android 8.0). In any event, this side channel requires a high bar to launch because of the malware requirement. In the next section, we describe our newly discovered side channel in detail.

3 Wi-Fi Timing Channel

Fundamentally, the half-duplex nature of Wi-Fi creates a “shared resource” among uplink and downlink traffic, a prerequisite of any side channel. By sharing the same set of frequency bands with both directions, Wi-Fi relies on carrier-sense multiple access (i.e., CSMA) to share/divide the channel over time. This means that a node transmits only when the channel is sensed to be idle and thus it has the exclusive right to transmit. This effectively creates a timing channel that delays the local transmission if the opposite direction is transmitting at the same time.
Even worse, this timing difference becomes more visible due to retransmissions caused by contention (collision). Specifically, the protocol starts by listening on the channel and immediately sends the first frame to the transmit queue if the channel is found to be idle; however, this leads to wasted transmissions if a collision occurs. If the channel is subsequently sensed to be busy, it waits for a period of time (e.g., usually random or exponential backoff [17]) attempting to avoid collision. Although it might benefit the performance when many nodes are active, it creates a significant overhead when only one is present (plus the AP). In addition to backoffs, Request to Send/Clear to Send (RTS/CTS) [16] may optionally be used to mediate access to the shared medium to solve the hidden-terminal problem [46] where multiple stations can see the Access Point but not each other. Unfortunately, in the same scenario where there is only one node, it introduces unnecessary traffic to the network, slowing everything down. Finally, it is important to note that the latency is amplified further when more contention is present (e.g., more frames to be transmitted in either direction).
Exploiting the timing channel. To demonstrate the timing channel, we create a probing strategy to measure the delay effects. As we can see in Fig. 3a, we simulate an off-path TCP attack where the attacker sends a spoofed probing packet, along with a pre-probe query and post-probe query to measure the RTT before and after. If the spoofed packet does not trigger an ACK on the client, e.g., because the guessed sequence number is in-window (left half of the figure), then the post-probe query arrives at the client faster and gets back sooner (smaller RTT). On the other hand, if the spoofed packet triggers an ACK on the client, e.g., because the guessed sequence number is out-of-window (right half of the figure), then the post-probe query experiences contention with the ACK from
fies the connection inference, as the attacker can simply [...] sponse, it must be the case that no connection exists and packet is dropped by a NAT.
For the second problem of real operating system implementations, we survey the latest Linux, macOS, and Windows in terms of their packet validation logic. Our methodology is to inspect the kernel source code of [...] Finally, we apply the same test program to measure the behavior of Windows. We summarize our findings in Table 2.
The result is, for the most part, consistent with the standard (except Windows which we talk about later). Linux is the one that most closely follows the standard (also observed previously in [18]). It has implemented the challenge ACKs and the rate limit as suggested by RFC 5961. MacOS is similar to Linux except that it does not implement rate limit and is in general weaker in its validation logic. For instance, even if an incoming packet has no flag bit set, it still checks the sequence number of the packet instead of dropping it without any processing.
Based on the concrete testing results, we conclude that all three operating systems have packet validation logic that can be exploited via the Wi-Fi timing channel. We describe how to leverage their specifics to conduct the attack:
Connection (Port Number) Inference. This attack breaches the user privacy because knowing the websites a user visits often reveals a user’s medical condition and sexual orientation [36]. As with previous off-path TCP exploits [25, 18], the first step is to infer whether an ongoing connection with a particular target (server IP and server port are given) exists. We know that NAT drops incoming packets that do not match any ongoing connections. All we need to make sure is that all operating systems do generate outgoing ACKs otherwise. Indeed, from the table, an incoming ACK matching an ongoing connection with an out-of-window sequence number is guaranteed to trigger an ACK on all operating systems (row no. 1, 10, and 17). Fig. 7a depicts the sequence of packets that an off-path attacker can send to differentiate between the cases of (i) the presence or (ii) the absence of an ongoing connection. In both cases, the attacker sends the same sequence of packets, leveraging the probing strategy described in §3 to measure the delay effects.
Sequence Number Inference. Assuming the attacker has already identified the four-tuple connection, the off-path attacker now needs to guess a valid sequence number. By continuously tracking how the sequence number progresses, the attacker can effectively count the number of bytes received by the client (and the reverse direction can be monitored similarly through ACK number inference). We label the sequence number inference opportunities in Table 2 by combining two rows with different outcomes (w/ or w/o responses) when the same sequence of packets is processed. For Linux, if 10 incoming ACK packets with just one-byte payload are received, depending on their sequence numbers, 10 responses are triggered (out-of-window), or at most 1 (in-window) due to rate limiting (row no. 1, 2, and 3). For macOS, if an incoming packet with no flags is received, a response is triggered for the out-of-window case; otherwise no response is triggered (row no. 10 and 11). Interestingly, if the ACK flag is on, macOS only generates ACKs half of the time (row no. 12 and 13). Windows is similar and requires only the regular ACK packets (row no. 17 and 18); SYN packets can do the trick as well (row no. 17 and 19). Fig. 7b demonstrates the sequence of packets that an attacker can send to distinguish between the cases of in-window and out-of-window sequence number.

[Figure 7 panels: (a) Connection (four-tuple) test; (b) Sequence number test, in-window seq versus out-of-window seq. Parties shown: Mallory, Router, Client, Server. Other labels: probes, Probe, Response, RTT.]
Figure 7: Infer port and sequence number by exploiting the timing side channel. Note that these diagrams are simplified for clarity. In reality, packets belonging to different sockets can be processed simultaneously, and uplink and downlink should have equal access to the wireless channel rather than uplink waiting for downlink.

ACK Number Inference. Finally, knowing the four tuples and the expected sequence number, the attacker now needs to learn the correct ACK number to successfully inject malicious payload. According to the standard behavior earlier in §2.2, an attacker can infer whether a guessed ACK number is in-window or not by sending a pure ACK (no payload) assuming its sequence number is already in-window. If its ACK number is out-of-window, a response is triggered and otherwise no response. Surprisingly, from our analysis and experiments, we conclude that no operating system is fully compliant with the standard. Their own variants have often allowed simpler strategies to conduct the ACK number inference.
Linux. As shown in Table 2, instead of always triggering an ACK packet for out-of-window ACK numbers, when the ACK number is too old (smaller than SND.UNA - MAX.SND.WND), Linux responds with an ACK (with rate limit); when the ACK number is too new (larger than SND.NXT), Linux incorrectly drops the packet without any reply (row no. 2 and 3). Had there been no rate limit, an attacker could infer the correct ACK number via binary search. With rate limit, however, one response versus zero cannot create significant enough of a timing channel. In addition, if a packet with in-window ACK number has no payload, Linux also ignores the packet with no response (row no. 6), which leaves no opportunity to differentiate the in-window and out-of-window cases (result similar to row no. 2 and 3). However, it does correctly handle packets with payload; a response is triggered only when the ACK number is in window (row no. 5). The issue is that when an ACK number is inferred, the client buffers the payload in its receive window, which is undesirable for two reasons: (1) it may cause future responses from the server to be corrupted; (2) if selective ACK (SACK) is enabled, the client selectively acknowledges the data which has not actually been sent by the server, causing the server to ignore future packets from the client, effectively de-synchronizing the client and server. Interestingly, Linux has a special edge case that allows us to infer ACK number without the hassle. According to the specification, if the sequence number of an incoming packet is equal to RCV.NXT-1 (indicating a keep-alive message), it should trigger an ACK. Interest-

: Although the client replies to such packets, it would also cause de-synchronization leading to the victim connection to be closed during the keep-alive procedure, if the SACK option is enabled.
† : Typically, ACK number window refers to the range [SND.UNA-MAX.SND.WND, SND.NXT], but Windows deploys a more stringent check if the connection is idle, requiring a valid ACK to equal SND.NXT.
a : RCV.NXT = next sequence number expected on an incoming segments, and is the left or lower edge of the receive
[Figure residue: (a) RTT measurement for 5GHz of a Huawei router; (b) Gap measurement for 5GHz of a Huawei router. Recoverable axis labels: Time(us), Number of Packets.]

[Figure residue: two panels with axis labels Time(ms) and Number of Packets.]
Figure 10: RTT measurement of macOS using 5GHz network of a Xiaomi router at two different locations with RTTs over 20ms

Firefox
MacOS Chrome 10/10 48.91 Timing Side Channel
Firefox
Windows Chrome 10/10 43.42 Timing Side Channel
Firefox & Direct Page Read
Linux Firefox 9/10 103.53 Timing Side Channel
& Blind Data Injection
Table 3: Summary of attacks in a local setup

Result   Time cost (s)  #FN    Result   Time cost (s)  #FN
success  25.66          0      success  23.08          0
success  286.31         0      success  580.32         1
success  549.15         1      success  195.03         0
success  335.10         0      success  227.43         0
failure  634.03         2      success  185.74         0
FN: False Negative (i.e., Missing correct SEQ number)
Table 4: 10 trials of remote attacks against macOS

7 Discussion

As discussed in §3, the timing side channel results from the half-duplex nature of wireless networks. It is further magnified due to the collision and backoff inherent in wireless protocols. As we demonstrated, a full-duplex system does not exhibit any timing channel (see §3) as no collision will occur when uplink and downlink traffic happen at the same time. Finally, as confirmed in our test routers, modern wireless routers all support CSMA/CA and RTS/CTS as they are part of the 802.11 standards [31], and the principle is unlikely to change any time soon.
Although we only discuss the threat model where connections originated from a victim client are targeted, the attack actually also applies to connections originated from other clients connected through the same wireless router. This is because all these clients (e.g., behind the same NAT) share the same collision domain and therefore suffer from the same timing channel. Responses triggered on any client by probing packets will effectively delay the post-probe query. In this case, the victim connection (opened through puppet) simply opens up opportunities for an off-path attacker to measure collision.
In addition, we can expand the threat model to consider servers that are wirelessly connected, e.g., IoT devices. It has been shown that millions of IoT devices are reachable through public IP addresses and open ports [14]. In such cases, a completely off-path attack can be launched against a connection on such IoT devices, e.g., counting bytes exchanged on the connection, terminating its connection with another host, injecting malicious command on an ongoing telnet connection (similar to the capability [...]

8 Defenses

After we discovered the timing side channel issue, we disclosed it to the working group in February 2018. They quickly acknowledged this weakness and became highly engaged in discussion of the matter. However, due to the expected challenges in changing the half-duplex design, we are yet to see an appropriate solution at the 802.11 level. Therefore, the immediate mitigations are expected to be at higher levels. We’ve also disclosed it to vendors of the routers that we tested, among whom only one replied and actively discussed it with us. Though the company employees acknowledged this weakness, they decided to submit this security issue to Wi-Fi Alliance, hoping that this would be fixed in the protocol standard. In the remainder of this section, mitigations/patches at different layers are offered and thoroughly discussed.
Defenses in Wi-Fi technology. Unlike the previous software-induced side channels, the timing channel introduced by Wi-Fi is inherently difficult to eliminate or mitigate (just as the recent Meltdown and Spectre vulnerabilities in CPUs [35, 34]). One straightforward defense would be to make the Wi-Fi channel full-duplex. For instance, with frequency-division duplexing, different frequency sub-bands can be used for uplink and downlink traffic. However, this can potentially introduce low bandwidth utilization as separate dedicated sub-bands have to be pre-allocated (and real-world Internet traffic volume is not symmetric). Even though the IEEE 802.11ax working group has been considering the possibility of supporting in-band full-duplex communication [2], research still needs to be done to make sure the real-world challenges such as backward compatibility are carefully considered and addressed [12, 30]. At this point though, it is unclear when the technology will be widely deployed in practice, according to our conversation with the 802.11 working group.
Defenses in the TCP stacks. As described in §2.2, the packet validation logic of the latest TCP specification inherently treats valid and invalid incoming packets differently, in terms of whether a response should be generated. One solution is to revisit the specification and look for alternatives. A good hint is that all three modern operating systems implement the ACK number validation differently, yet they have co-existed without any major issues for a long time now. This leaves some flexibility in the ACK number validation logic. Ideally, no matter what ACK number an incoming packet has, it should either consistently respond or never respond. Assuming an incoming packet already has a valid sequence number, the only constraints we have here are: