Missing the Point(er): On the Effectiveness of Code Pointer Integrity

Abstract—Memory corruption attacks continue to be a major vector of attack for compromising modern systems. Numerous defenses have been proposed against memory corruption attacks, but they all have their limitations and weaknesses. Stronger defenses such as complete memory safety for legacy languages (C/C++) incur a large overhead, while weaker ones such as practical control flow integrity have been shown to be ineffective. A recent technique called code pointer integrity (CPI) promises to balance security and performance by focusing memory safety on code pointers, thus preventing most control-hijacking attacks while maintaining low overhead. CPI protects access to code pointers by storing them in a safe region that is protected by instruction level isolation. On x86-32, this isolation is enforced by hardware; on x86-64 and ARM, isolation is enforced by information hiding. We show that, for architectures that do not support segmentation, in which CPI relies on information hiding, CPI's safe region can be leaked and then maliciously modified by using data pointer overwrites. We implement a proof-of-concept exploit against Nginx and successfully bypass CPI implementations that rely on information hiding in 6 seconds with 13 observed crashes. We also present an attack that generates no crashes and is able to bypass CPI in 98 hours. Our attack demonstrates the importance of adequately protecting secrets in security mechanisms and the dangers of relying on the difficulty of guessing without guaranteeing the absence of memory leaks.

1 This work is sponsored by the Assistant Secretary of Defense for Research & Engineering under Air Force Contract #FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.

I. INTRODUCTION

Despite considerable effort, memory corruption bugs and the subsequent security vulnerabilities that they enable remain a significant concern for unmanaged languages such as C/C++. They form the basis for attacks [14] on modern systems in the form of code injection [40] and code reuse [49, 14].

The power that unmanaged languages provide, such as low-level memory control, explicit memory management and direct access to the underlying hardware, makes them ideal for systems development. However, this level of control comes at a significant cost, namely lack of memory safety. Rewriting systems code with managed languages has had limited success [24] due to the perceived loss of control that mechanisms such as garbage collection may impose, and the fact that millions of lines of existing C/C++ code would need to be ported to provide similar functionality. Unfortunately, retrofitting memory safety into C/C++ applications can cause significant overhead (up to 4x slowdown) [36] or may require annotations [37, 28].

In response to these perceived shortcomings, research has focused on alternative techniques that can reduce the risk of code injection and code reuse attacks without significant performance overhead and usability constraints. One such technique is Data Execution Prevention (DEP). DEP enables a system to use memory protection to mark pages as non-executable, which can limit the introduction of new executable code during execution. Unfortunately, DEP can be defeated using code reuse attacks such as return-oriented programming [11, 17], jump-oriented programming [10] and return-into-libc attacks [56].

Randomization-based techniques, such as Address Space Layout Randomization (ASLR) [43] and its medium- [30] and fine-grained variants [57], randomize the location of code and data segments, thus providing probabilistic guarantees against code reuse attacks. Unfortunately, recent attacks demonstrate that even fine-grained memory randomization techniques may be vulnerable to memory disclosure attacks [52]. Memory disclosure may take the form of direct memory leakage [53] (i.e., as part of the system output), or it can take the form of indirect memory leakage, where fault or timing side-channel analysis attacks are used to leak the contents of memory [9, 47]. Other forms of randomization-based techniques include instruction set randomization (ISR) [8] or the multicompiler techniques [26]. Unfortunately, they are also vulnerable to information leakage attacks [53, 47].

Control flow integrity (CFI) is a widely researched runtime enforcement technique that can provide practical protection against code injection and code reuse attacks [3, 61, 62]. CFI provides runtime enforcement of the intended control flow transfers by disallowing transfers that are not present in the application's control flow graph (CFG). However, precise enforcement of CFI can have a large overhead [3]. This has motivated the development of more practical variants of CFI that have lower performance overhead but enforce weaker restrictions [61, 62]. For example, control transfer checks are relaxed to allow transfers to any valid jump targets as opposed to the correct target. Unfortunately, these implementations have been shown to be ineffective because they allow enough
that an attack can be completed without any crashes in ∼98 hours for the most performant and complete implementation of CPI. This implementation relies on ASLR support from the operating system.

A. Contributions

This paper makes the following contributions:
• Attack on CPI: We show that an attacker can defeat CPI, on x86-64 architectures, assuming only control of the stack. Specifically, we show how to reveal the location of the safe region using a data-pointer overwrite without causing any crashes, which was assumed to be impossible by the CPI authors.
• Proof of Concept Attack on Nginx: We implement a proof-of-concept attack on a CPI-protected version of the popular Nginx web server. We demonstrate that our attack is accurate and efficient (it takes 6 seconds to complete with only 13 crashes).
• Experimental Results: We present experimental results that demonstrate the ability of our attack to leak the safe region using a timing side-channel attack.
• Countermeasures: We present several possible improvements to CPI and analyze their susceptibility to different types of attacks.

Next, Section II describes our threat model, which is consistent with CPI's threat model. Section III provides a brief background on CPI and the side-channel attacks necessary for understanding the rest of the paper. Section IV describes our attack procedure and its details. Section V presents the results of our attack. Section VI describes a few of CPI's implementation flaws. Section VII provides some insights into the root cause of the problems in CPI and discusses possible patch fixes and their implications. Section VIII describes the possible countermeasures against our attack. Section IX reviews the related work and Section X concludes the paper.

II. THREAT MODEL

In this paper, we assume a realistic threat model that is both consistent with prior work and the threat model assumed by CPI [31]. For the attacker, we assume that there exists a vulnerability that provides control of the stack (i.e., the attacker can create and modify arbitrary values on the stack). We also assume that the attacker cannot modify code in memory (e.g., memory is protected by DEP [41]). We also assume the presence of ASLR [43]. As the above assumptions prevent code injection, the attacker would be required to construct a code reuse attack to be successful.

We also assume that CPI is properly configured and correctly implemented. As we will discuss later, CPI has other implementation flaws that make it more vulnerable to attack, but for this paper we focus on its design decision to use information hiding to protect the safe region.

III. BACKGROUND

This section presents the necessary background information required to understand our attack on CPI. Specifically, the section begins with an overview of CPI and continues with information about remote leakage attacks. For additional information, we refer the reader to the CPI paper [31] and a recent remote leakage attack paper [47].

A. CPI Overview

CPI consists of three major components: static analysis, instrumentation, and safe region isolation.

1) Static Analysis: CPI uses type-based static analysis to determine the set of sensitive pointers to be protected. CPI treats all pointers to functions, composite types (e.g., arrays or structs containing sensitive types), universal pointers (e.g., void* and char*), and pointers to sensitive types as sensitive types (note the recursive definition). CPI protects against the redirection of sensitive pointers that can result in control-hijacking attacks. The notion of sensitivity is dynamic in nature: at runtime, a pointer may point to a benign integer value (non-sensitive) and it may also point to a function pointer (sensitive) at some other part of the execution. Using the results of the static analysis, CPI stores the metadata for checking the validity of code pointers in its safe region. The metadata includes the value of the pointer and its lower and upper thresholds. An identifier is also stored to check for temporal safety, but this feature is not used in the current implementation of CPI. Note that static analysis has its own limitations and inaccuracies [33], the discussion of which is beyond the scope of this paper.

2) Instrumentation: CPI adds instrumentation that propagates metadata along pointer operations (e.g., pointer arithmetic or assignment). Instrumentation is also used to ensure that only CPI intrinsic instructions can manipulate the safe region and that no pointer in the code can directly reference the safe region. This is to prevent any code pointers from disclosing the location of the safe region using a memory disclosure attack (on code pointers).

3) Safe Region Isolation: On the x86-32 architecture, CPI relies on segmentation protection to isolate the safe region. On architectures that do not support segmentation protection, such as x86-64 and ARM, CPI uses information hiding to protect the safe region. There are two major weaknesses in CPI's approach to safe region isolation on x86. First, the x86-32 architecture is slowly being phased out as systems migrate to 64-bit architectures and mobile architectures. Second, as we show in our evaluation, weaknesses in the implementation of the segmentation protection in CPI make it bypassable. For protection on the x86-64 architecture, CPI relies on the size of the safe region (2^42 bytes), randomization and sparsity of its safe region, and the fact that there are no direct pointers to its safe region. We show that these are weak assumptions at best.

The CPI authors also present a weaker but more efficient version of CPI called Code Pointer Separation (CPS). CPS enforces safety for code pointers, but not pointers to code pointers. Because the CPI authors present CPI as providing the strongest security guarantees, we do not discuss CPS and the additional safe stack feature further. Interested readers can refer to the
original publication for a more in-depth description of these features.

B. Side Channels via Memory Corruption

Side channel attacks using memory corruption come in two broad flavors: fault and timing analysis. They typically use a memory corruption vulnerability (e.g., a buffer overflow) as the basis from which to leak information about the contents of memory. They are significantly more versatile than traditional memory disclosure attacks [54] as they can limit crashes, they can disclose information about a large section of memory, and they only require a single exploit to defeat code-reuse protection mechanisms.

Blind ROP (BROP) [9] is an example of a fault analysis attack that uses the fault output of the application to leak information about memory content (i.e., using application crashes or freezes). BROP intentionally uses crashes to leak information and can therefore be potentially detected by mechanisms that monitor for an abnormal number of program crashes.

Seibert et al. [47] describe a variety of timing- and fault-analysis attacks. In this paper, we focus on using timing channel attacks via data-pointer overwrites. This type of timing attack can prevent unwanted crashes by focusing timing analysis on allocated pages (e.g., the large memory region allocated as part of the safe region).

Consider the code sequence below. If ptr can be overwritten by an attacker to point to a location in memory, the execution time of the while loop will be correlated with the byte value to which ptr is pointing. For example, if ptr is stored on the stack, a simple buffer overflow can corrupt its value to point to an arbitrary location in memory. This delay is small (on the order of nanoseconds); however, by making numerous queries over the network and keeping the fastest samples (cumulative delay analysis), an attacker can get an accurate estimate of the byte values [47, 16]. In our attack, we show that this type of attack is a practical technique for disclosing CPI's safe region.

i = 0;
while (i < ptr->value)
    i++;

CPI safe region, an attacker can land inside an allocated mmap page with high probability. In our evaluation we show that this probability is as high as 1 for the average case. In other words, since the size of the mmap region is much larger than the entropy in its start address, an attacker can effectively land in a valid location inside mmap without causing crashes.

IV. ATTACK METHODOLOGY

This section presents a methodology for performing attacks on applications protected with CPI. As outlined in Section II, the attacks on CPI assume an attacker with identical capabilities as outlined in the CPI paper [31]. The section begins with a high-level description of the attack methodology and then proceeds to describe a detailed attack against Nginx [45] using the approach.

At a high level, our attack takes advantage of two design weaknesses in CPI. First, on architectures that do not support segmentation protection, such as x86-64 and ARM, CPI uses information hiding to protect the safe region. Second, to achieve low performance overhead, CPI focuses protection on code pointers (i.e., it does not protect data pointers). This section demonstrates that these design decisions can be exploited to bypass CPI.

Intuitively, our attack exploits the lack of data pointer protection in CPI to perform a timing side channel attack that can leak the location of the safe region. Once the location of a code pointer in the safe region is known, the code pointer, along with its metadata, is modified to point to a ROP chain that completes the hijacking attack. We note that using a data-pointer overwrite to launch a timing channel to leak the safe region location can be completely transparent to CPI and may avoid any detectable side-effects (i.e., it does not cause the application to crash).

The attack performs the following steps:
1) Find data pointer vulnerability
2) Gather data
   • Identify statistically unique memory sequences
   • Collect timing data on data pointer vulnerability
3) Locate safe region
4) Attack safe region

Next, we describe each of these steps in detail.
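The cumulative delay analysis described in Section III-B can be sketched in a few lines. The following is our own minimal illustration (the function names and synthetic constants are ours, not the paper's), assuming the RTT samples have already been collected:

```c
#include <stdlib.h>
#include <stddef.h>

/* Sketch of cumulative delay analysis: keep only the fastest RTT samples
   to filter network jitter, then estimate a byte from the cumulative
   differential delay, byte ~ (sum(d_i) - s*baseline) / (c*s), where
   baseline is the average RTT for a zero byte and c is the per-unit
   delay introduced by one iteration of the leaked loop. */

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Keep the k fastest of n samples (sorts d in place); return their sum. */
double fastest_sum(double *d, size_t n, size_t k) {
    qsort(d, n, sizeof *d, cmp_double);
    double sum = 0.0;
    for (size_t i = 0; i < k && i < n; i++)
        sum += d[i];
    return sum;
}

/* Estimate a byte from s filtered delay samples, given baseline and c. */
double estimate_byte(const double *d, size_t s, double baseline, double c) {
    double sum = 0.0;
    for (size_t i = 0; i < s; i++)
        sum += d[i];
    return (sum - (double)s * baseline) / (c * (double)s);
}
```

Keeping only the fastest samples before computing the differential delay is what makes the nanosecond-scale loop delay observable over a noisy network.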
stack, it can be overwritten using a stack overflow attack; if it is stored in the heap, it can be overwritten via a heap corruption attack.

In the absence of complete memory safety, we assume that such vulnerabilities will exist. This assumption is consistent with the related work in the area [50, 12]. In our proof-of-concept exploit, we use a stack buffer overflow vulnerability similar to previous vulnerabilities [1] to redirect a data pointer in Nginx.

B. Data Collection

byte = (Σ_{i=1}^{s} d_i − s · baseline) / (c · s)    (1)

In the above equation, baseline represents the average round trip time (RTT) that the server takes to process requests for a byte value of zero. d_i represents the delay sample RTT for a nonzero byte value, and s represents the number of samples taken.

Once we set byte = 0, the above equation simplifies to:

baseline = (Σ_{i=1}^{s} d_i) / s

Due to additional delays introduced by networking conditions, it is important to establish an accurate baseline. In a sense, the baseline acts as a band-pass filter. In other words, we subtract the baseline from d_i in Eq. 1 so that we are only measuring the cumulative differential delay caused by our chosen loop.

We then use the set of delay samples collected for byte 255 to calculate the constant c. Once we set byte = 255, the equation is as follows:

c = (Σ_{i=1}^{s} d_i − s · baseline) / (255 · s)

Once we obtain c, which provides the ratio between the byte value and the cumulative differential delay, we are able to estimate byte values.

Fig. 1. Safe Region Memory Layout. [Figure: from higher to lower memory addresses: stack; stack gap (at least 128MB); max mmap_base; random mmap_base; linked libraries; min mmap_base = max mmap_base − 2^28 × PAGE_SIZE.]

C. Locate Safe Region

Figure 1 illustrates the memory layout of a CPI-protected application on the x86-64 architecture. The stack is located at the top of the virtual address space and grows downwards (towards lower memory addresses), and it is followed by the stack gap. Following the stack gap is the base of the mmap region (mmap_base), where shared libraries (e.g., libc) and other regions created by the mmap() system call reside. In systems protected by ASLR, the location of mmap_base is randomly selected to be between max_mmap_base (located immediately after the stack gap) and min_mmap_base. min_mmap_base is computed as:

min_mmap_base = max_mmap_base − aslr_entropy × page_size

where aslr_entropy is 2^28 in 64-bit systems, and the page_size is specified as an operating system parameter (typically 4KB). The safe region is allocated directly after any linked libraries are loaded on mmap_base and is 2^42 bytes. Immediately after the safe region lies the region in memory where any dynamically loaded libraries and any mmap-based heap allocations are made.

Given that the safe region is allocated directly after all linked libraries are loaded, and that the linked libraries are linked deterministically, the location of the safe region can be computed by discovering a known location in the linked libraries (e.g., the base of libc) and subtracting the size of the safe region (2^42) from the address of the linked library. A disclosure of any libc address or an address in another linked library trivially reveals the location of the safe region in the current CPI implementation. Our attack works even if countermeasures are employed to allocate the safe region in a randomized location, as we discuss later.

To discover the location of a known library, such as the base of libc, the attack needs to scan every address starting at min_mmap_base and, using the timing channel attack described above, search for a signature of bytes that uniquely identifies the location.

The space of possible locations to search may require aslr_entropy × page_size scans in the worst case. As the base address of mmap is page aligned, one obvious optimization is to scan addresses that are multiples of page_size, thus greatly reducing the number of addresses that need to be scanned to:

(aslr_entropy × page_size) / page_size

In fact, this attack can be made even more efficient. In the x86-64 architecture, CPI protects the safe region by allocating a large region (2^42 bytes) that is very sparsely populated with pointer metadata. As a result, the vast majority of bytes inside the safe region are zero bytes. This enables us to determine with high probability whether we are inside the safe region or a linked library by sampling bytes for zero/nonzero values (i.e., without requiring accurate byte estimation). Since we start in the safe region and libc is allocated before the safe region, if we go back in memory by the size of libc, we can avoid crashing the application. This is because any location inside the safe region has at least the size of libc of allocated memory on top of it. As a result, the improved attack procedure is as follows:

1) Redirect a data pointer into the always allocated part of the safe region (see Fig. 1).
2) Go back in memory by the size of libc.
3) Scan some bytes. If the bytes are all zero, go to step 2. Else, scan more bytes to decide where we are in libc.
4) Done.

Note that discovery of a page that resides in libc directly reveals the location of the safe region.

Using this procedure, the number of scans can be reduced to:

(aslr_entropy × page_size) / libc_size

Here libc_size, in our experiments, is approximately 2^21. In other words, the estimated number of memory scans is: 2^28 × 2^12 / 2^21 = 2^19. This non-crashing scan strategy is depicted on the left side of Fig. 2.

We can further reduce the number of memory scans if we are willing to tolerate crashes due to dereferencing an address not mapped to a readable page. Because the pages above mmap_base are not mapped, dereferencing an address above mmap_base may crash the application. If the application restarts after a crash without rerandomizing its address space, then we can use this information to perform a search with the goal of finding an address x such that x can be dereferenced safely but x + libc_size causes a crash. This implies that x lies inside the linked library region; thus if we subtract the size of all linked libraries from x, we will obtain an address in the safe region that is near libc, and we can reduce to the case above. Note that it is not guaranteed that x is located at the top of the linked library region: within this region there are pages which are not allocated and there are also pages which do not have read permissions, which would cause crashes if dereferenced.

To find such an address x, the binary search proceeds as follows: if we crash, our guessed address was too high; otherwise, our guess was too low. Put another way, we maintain the invariant that the high address in our range will cause a crash while the lower address is safe, and we terminate when the difference reaches the threshold of libc_size. This approach would only require at most log2(2^19) = 19 reads and will crash at most 19 times (9.5 times on average).

Fig. 3. Tolerated Number of Crashes.

More generally, given that T crashes are allowed for our scanning, we would like to characterize the minimum number of page reads needed to locate a crashing boundary under the optimum scanning strategy. A reason for doing that is that when T < 19, our binary search method is not guaranteed to find a crashing boundary in the worst case.

We use dynamic programming to find the optimum scanning strategy for a given T. Let f(i, j) be the maximum amount of memory an optimum scanning strategy can cover, incurring up to i crashes and performing j page reads. Note that to cause a crash, you need to perform a read. Thus, we have the recursion:

f(i, j) = f(i, j − 1) + f(i − 1, j − 1) + 1

This recursion holds because in the optimum strategy for f(i, j), the first page read will either cause a crash or not.
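The recursion above can be tabulated directly. The following is a minimal sketch (our code, not the authors') of computing f(i, j) by dynamic programming; note that f(j, j) = 2^j − 1, which matches the binary-search bound of 19 reads over a 2^19-page search space:

```c
#include <stdint.h>

/* f(i, j): the maximum amount of memory (in pages) an optimal scanning
   strategy can cover with at most i tolerated crashes and j page reads.
   Recurrence from the text: f(i, j) = f(i, j-1) + f(i-1, j-1) + 1,
   with f(0, j) = f(i, 0) = 0 (same structure as the classic egg-drop
   problem: the first read either crashes or it does not). */
uint64_t coverage(int crashes, int reads) {
    if (crashes <= 0 || reads <= 0)
        return 0;
    uint64_t f[crashes + 1][reads + 1];
    for (int j = 0; j <= reads; j++) f[0][j] = 0;
    for (int i = 0; i <= crashes; i++) f[i][0] = 0;
    for (int i = 1; i <= crashes; i++)
        for (int j = 1; j <= reads; j++)
            f[i][j] = f[i][j - 1] + f[i - 1][j - 1] + 1;
    return f[crashes][reads];
}
```

With one tolerated crash the strategy degenerates to a linear scan (f(1, j) = j), while with crashes as plentiful as reads it becomes binary search (f(j, j) = 2^j − 1).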
Fig. 2. Non-crashing scan strategy (left) and crashing scan strategy (right); in the crashing strategy, the second page scan triggers a crash.
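The crashing scan strategy on the right of Fig. 2 reduces to the binary search described above. Below is our own illustrative sketch; probe() is a hypothetical oracle ("does dereferencing this address crash a worker?"), and the demo harness merely simulates it:

```c
#include <stdint.h>

/* Sketch of the crash-driven binary search from Section IV (our code,
   not the authors'). probe(addr) is a hypothetical oracle: returns 1 if
   dereferencing addr crashes a worker, 0 otherwise. Invariant: lo is
   always safe, hi always crashes; stop when the gap is <= libc_size. */
uint64_t find_boundary(uint64_t lo, uint64_t hi, uint64_t libc_size,
                       int (*probe)(uint64_t)) {
    while (hi - lo > libc_size) {
        uint64_t mid = lo + (hi - lo) / 2;
        if (probe(mid))
            hi = mid;   /* crashed: boundary is at or below mid */
        else
            lo = mid;   /* safe: boundary is above mid */
    }
    return lo;          /* safe address within libc_size of the boundary */
}

/* Simulated demo: everything at or above g_boundary "crashes". */
static uint64_t g_boundary;
static int sim_probe(uint64_t a) { return a >= g_boundary; }

uint64_t find_boundary_demo(uint64_t boundary, uint64_t libc_size) {
    g_boundary = boundary;
    return find_boundary(0, 2 * boundary, libc_size, sim_probe);
}
```

Each iteration halves the range, so localizing the boundary within one libc-sized window over a 2^19-page range costs at most 19 probes, matching the bound in the text.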
Armed with the exact address of a code pointer in the safe region, the value of that pointer can be hijacked to point to a library function or the start of a ROP chain to complete the attack.

E. Attack Optimizations

A stronger implementation of CPI might pick an arbitrary address for its safe region, chosen randomly between the bottom of the linked libraries and the end of the mmap region. Our attack still works against such an implementation and can be further optimized.

We know that the safe region has a size of 2^42 bytes. Therefore, there are 2^48/2^42 = 2^6 = 64 possibilities for where we need to search. In fact, in a real-world system like Ubuntu 14.04 there are only 2^46.5 addresses available to mmap on x86-64; thus there is a 1/2^4.5 chance of getting the right one, even with the most extreme randomization assumptions. Furthermore, heap and dynamic library address disclosures will increase this chance. We note that CPI has a unique signature of a pointer value followed by an empty slot, followed by the lower and upper bounds, which will make it simple for an attacker to verify that the address they have reached is indeed in the safe region. Once an address within the safe region has been identified, it is merely a matter of time before the attacker is able to identify the offset of the safe address relative to the table base. There are many options to dramatically decrease the number of reads needed to identify exactly where in the safe region we have landed. For instance, we might profile a local application's safe region and find the most frequently populated addresses modulo the system's page size (since the base of the safe region must be page-aligned), then search across the safe region in intervals of the page size at that offset. Additionally, we can immediately find the offset if we land on any value that is unique within the safe region by comparing it to our local reference copy.

We can now make some general observations about choosing the variable of interest to target during the search. We would be able to search the fastest if we could choose a pointer from the largest set of pointers in a program that have the same address modulo the page size. For instance, if there are 100 pointers in the program that have addresses that are 1 modulo the page size, we greatly increase our chances of finding one of them early during the scan of the safe region.

Additionally, the leakage of the locations of any other libraries (making the strong randomization assumption) will help identify the location of the safe region. Note that leaking all other libraries is within the threat model of CPI.

A. Vulnerability

We patch Nginx to introduce a stack buffer overflow vulnerability allowing the user to gain control of a parameter used as the upper loop bound in the Nginx logging system. This is similar to the effect that an attacker can achieve with CVE-2013-2028, seen in previous Nginx versions [1]. The vulnerability enables an attacker to place arbitrary values on the stack, in line with the threat model assumed by CPI (see Section II). We launch the vulnerability over a wired LAN connection, but as shown in prior work, the attack is also possible over wireless networks [47].

Using the vulnerability, we modify a data pointer in the Nginx logging module to point to a carefully chosen address. The relevant loop can be found in the source code in nginx_http_parse.c.

for (i = 0; i < headers->nelts; i++)

The data pointer vulnerability enables control over the number of iterations executed in the loop. Using the timing analysis presented in Section IV, we can distinguish between zero pages and nonzero pages. This optimization enables the attack to efficiently identify the end of the safe region, where nonzero pages indicate the start of the linked library region.

B. Timing Attack

We begin the timing side channel attack by measuring the HTTP request round trip time (RTT) for a static web page (0.6 KB) using Nginx. We collect 10,000 samples to establish the average baseline delay. For our experiments, the average RTT is 3.2ms. Figures 4 and 5 show the results of our byte estimation experiments. The figures show that byte estimation using cumulative differential delay is accurate to within 2% (±20).
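The metadata signature described in Section IV-E (a pointer value, an empty slot, then lower and upper bounds) suggests a simple recognizer. The struct layout and field names below are our own illustration of that description, not CPI's actual definitions:

```c
#include <stdint.h>

/* Hypothetical layout of a safe-region entry, following the signature
   described in the text: pointer value, empty slot, lower bound, upper
   bound. Field names are ours, not CPI's. */
typedef struct {
    uint64_t value;   /* protected code pointer */
    uint64_t unused;  /* empty slot (temporal id, unused in current CPI) */
    uint64_t lower;   /* lower bound */
    uint64_t upper;   /* upper bound */
} sr_entry;

/* Heuristic recognizer: an entry "looks like" CPI metadata if the empty
   slot is zero and the pointer value lies within its own bounds. */
int looks_like_cpi_entry(const sr_entry *e) {
    return e->value != 0 && e->unused == 0 &&
           e->lower <= e->value && e->value < e->upper;
}
```

A scan that lands on four consecutive quadwords matching this pattern gives the attacker strong evidence of being inside the safe region, as the text argues.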
our false positive rate at the cost of a factor of 2 in speed.
In our experiments, we found that scanning 5 extra bytes in
addition to the two signature bytes can yield 100% accuracy
using 30 samples per byte and considering the error in byte
estimation. Figure 6 illustrates the sum of the chosen offsets
for our scan of zero pages leading up to libc. Note that we
jump by the size of libc until we hit a non-zero page. The dot
on the upper-right corner of the figure shows the first non-zero
page.
In short, we scan 30∗7 = 210 bytes per size of libc to decide
whether we are in libc or the safe region. Table I summarizes
the number of false positives, i.e. the number of pages we
estimate as nonzero, which are in fact 0. The number of data
samples and estimation samples, and their respective fastest
percentile used for calculation all affect the accuracy. Scanning
5 extra bytes (in addition to the two signature bytes for a page)
and sampling 30 times per bytes yields an accuracy of 100% in
Fig. 5. Observed Byte Estimation our setup. As a result, the attack requires (2 + 5) ∗ 219 ∗ 30 =
7 ∗ 219 ∗ 30 = 110, 100, 480 scans on average, which takes
about 97 hours with our attack setup.
C. Locate Safe Region Once we have a pointer to a nonzero page in libc, we send
more requests to read additional bytes with high accuracy
After we determine the average baseline delay, we redi- to determine which page of libc we have found. Figure 7
rect the nelts pointer to the region between address illustrates that we can achieve high accuracy by sending
0x7bfff73b9000 and 0x7efff73b9000. As mentioned 10, 000 samples per byte.
in the memory analysis, this is the range of the CPI safe region Despite the high accuracy, we have to account for errors
we know is guaranteed to be allocated, despite ASLR being in estimation. For this, we have developed a fuzzy n−gram
enabled. We pick the the top of this region as the first value matching algorithm that, given a sequence of noisy bytes, tells
of our pointer. us the libc offset at which those bytes are located by comparing
A key component of our attack is the ability to quickly the estimated bytes with a local copy of libc. In determining
determine whether a given page lies inside the safe region or zero and nonzero pages, we only collect 30 samples per byte as
inside the linked libraries by sampling the page for zero bytes. we do not need very accurate measurements. After landing in a
Even if we hit a nonzero address inside the safe region, which nonzero page in libc, we do need more accurate measurements
will trigger the search for a known signature within libc, the nearby bytes we scan will not yield a valid libc signature and we can identify the false positive. In our tests, every byte read from the high address space of the safe region yielded zero. In other words, we observed no false positives.

One problematic scenario occurs if we sample zero byte values while inside libc. In this case, if we mistakenly interpret this address as part of the safe region, we will skip over libc and the attack will fail. We can mitigate this probability by choosing the byte offset per page we scan intelligently. Because we know the memory layout of libc in advance, we can identify page offsets that have a large proportion of nonzero bytes, so if we choose a random page of libc and read the byte at that offset, we will likely read a nonzero value. In our experiments, page offset 4048 yielded the highest proportion of nonzero values, with 414 out of the 443 pages of libc having a nonzero byte at that offset. This would give our strategy an error rate of 1 − 414/443 = 6.5%. We note that we can reduce this number to 0 by scanning two bytes per page instead, at offsets of our choice. In our experiments, if we scan the bytes at offsets 1272 and 1672 in any page of libc, one of these values is guaranteed to be nonzero. This reduces the error rate to 0.

Once we land inside libc, we use the bytes we leak to identify our likely location. Our measurements show that 10,000 samples are necessary to estimate each byte to within 20. We also determine that reading 70 bytes starting at page offset 3333 is reliably enough for the fuzzy n-gram matching algorithm to determine where exactly we are in libc. This offset was computed by looking at all contiguous byte sequences for every page of libc and choosing the one which required the fewest bytes to guarantee a unique match. This orientation inside libc incurs an additional 70 ∗ 10,000 = 700,000 requests, which adds another hour to the total time of the attack, for a total of 98 hours.

After identifying our exact location in libc, we know the exact base address of the safe region:

safe region address = libc base − 2^42

D. Fast Attack with Crashes

We can make the above attack faster by tolerating 12 crashes on average. The improved attack uses binary search, as opposed to linear search, to find libc after landing in the safe region, as described in Section IV-C. We also use an alternative strategy for discovering the base of libc. Instead of sampling individual pages, we continue the traversal until we observe a crash that
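The per-page zero test described above can be sketched as follows. Here `leak_byte` is a stand-in for the timing side channel that discloses one byte of memory (simulated with a dictionary-backed address space), and 1272 and 1672 are the offsets our libc measurements identified; this is a simplified illustration, not the actual attack code:

```python
# Sketch of the zero-page vs. libc-page classification described above.
# leak_byte(addr) abstracts the timing side channel that discloses one
# byte of memory; here it is simulated with a dict-backed address space.

PROBE_OFFSETS = (1272, 1672)  # offsets at which every libc page has a nonzero byte

def classify_page(page_base, leak_byte):
    """Return 'safe-region' if both probed bytes read as zero, else 'libc'."""
    if all(leak_byte(page_base + off) == 0 for off in PROBE_OFFSETS):
        return "safe-region"
    return "libc"

# --- toy demonstration with a simulated address space ---
memory = {}
libc_base = 0x7F0000000000
memory[libc_base + 1272] = 0x55  # libc page: nonzero byte at offset 1272
safe_base = 0x40000000000        # safe-region pages read as all zeros

def leak_byte(addr):
    return memory.get(addr, 0)

print(classify_page(libc_base, leak_byte))  # libc
print(classify_page(safe_base, leak_byte))  # safe-region
```

The key property is that a false "safe-region" verdict requires both probed bytes of a libc page to be zero, which the offset selection above rules out.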
TABLE I
ERROR RATIO IN ESTIMATION OF 100 ZERO PAGES USING OFFSETS 1, 2, 3, 4, 5, 1272, 1672
F. Summary
In summary, we show a practical attack on a version of
Nginx protected with CPI, ASLR and DEP. The attack uses a
data pointer overwrite vulnerability to launch a timing side
channel attack that can leak the safe region in 6 seconds
with 13 observed crashes. Alternatively, this attack can be
completed in 98 hours without any crashes.
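Several of the figures in this summary, and in the discussion that follows, reduce to simple arithmetic. The sketch below recomputes them under the quantities reported in this paper (a 2^42-byte safe region inside a roughly 2^46-byte mmap space, 10,000 timing samples per leaked byte, and 70 bytes needed to orient inside libc); the sizes are assumptions taken from the text, not measured here:

```python
# Back-of-the-envelope arithmetic behind the attack figures, using
# quantities reported in the paper (assumed sizes, not measured here).

MMAP_SPACE = 2**46    # searchable mmap address space
SAFE_REGION = 2**42   # size of CPI's simpletable safe region

# A single random probe lands inside the safe region with probability 1/16,
# which is why scanning for a region this large rarely crashes.
p_land = SAFE_REGION / MMAP_SPACE
print(p_land)  # 0.0625

# Orienting inside libc costs 70 leaked bytes at 10,000 samples each.
orientation_requests = 70 * 10_000
print(orientation_requests)  # 700000

# By contrast, finding a much smaller 2**28-byte region (e.g., a shrunken
# safe region) by random probing takes 2**46 / 2**28 = 2**18 probes in the
# worst case, 2**17 in expectation -- each missed probe a potential crash.
expected_crashes = (MMAP_SPACE // 2**28) // 2
print(expected_crashes == 2**17)  # True
```

This is the asymmetry the attack exploits: the safe region's sheer size makes it easy to land in without crashing, while shrinking it drives the expected crash count up only polynomially.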
(__llvm__cpi_table) which is not zeroed after its value is moved into the segment register. Thus, it is trivially available to an attacker by reading a fixed offset in the data segment. In the two alternate implementations, the location of the table is not zeroed because it is never protected by storage in the segment registers at all. Instead, it is stored as a local variable. Once again, this is trivially vulnerable to an attacker who can read process memory, and once disclosed it will immediately compromise the CPI guarantees. Note that zeroing memory or registers is often difficult to perform correctly in C in the presence of optimizing compilers [44].

We note that CPI's performance numbers rely on support for superpages (referred to as huge pages on Linux). In the configurations used for performance evaluation, ASLR was not enabled (FreeBSD does not currently have support for ASLR, and as of Linux kernel 3.13, the base for huge table allocations in mmap is not randomized, although a patch adding support has since been added). We note this to point out a difference between CPI performance tests and a real-world environment, although we have no immediate reason to suspect a large performance penalty from ASLR being enabled.

It is unclear exactly how the published CPI implementation intends to use the segment registers on 32-bit systems. The simpletable implementation, which uses the %gs register, warns that it is not supported on x86, although it compiles. We note that using the segment registers may conflict in Linux with thread-local storage (TLS), which uses the %gs register on x86-32 and the %fs register on x86-64 [18]. As mentioned, the default implementation, simpletable, does not support 32-bit systems, and the other implementations do not use the segment registers at all (a flaw noted previously), so currently this flaw is not easily exposed. A quick search of 32-bit libc, however, found almost 3000 instructions using the %gs register. Presumably this could be fixed by using the %fs register on 32-bit systems; however, we note that this may cause compatibility issues with applications expecting the %fs register to be free, such as Wine (which is explicitly noted in the Linux kernel source) [2].

Additionally, the usage of the %gs and %fs segment registers might cause conflicts if CPI were applied to protect kernel-mode code, a stated goal of the CPI approach. The Linux and Windows kernels both have special usages for these registers.

VII. DISCUSSION

In this section we discuss some of the problematic CPI design assumptions and possible fixes.

A. Design Assumptions

1) Enforcement Mechanisms: First, the authors of CPI focus on extraction and enforcement of safety checks, but they do not provide enough protection for their enforcement mechanisms. This is arguably a hard problem in security, but the effectiveness of a defense relies on such protections. In the published CPI implementation, protection of the safe region is very basic, relying on segmentation in the 32-bit architecture and on the size of the safe region in the 64-bit one. However, since the safe region is stored in the same address space to avoid performance-expensive context switches, these protections are not enough, and as illustrated by our attacks they are easy to bypass. Note that the motivation for techniques such as CPI is the fact that existing memory protection defenses such as ASLR are broken. Ironically, CPI itself relies on these defenses to protect its enforcement. For example, relying on randomization of locations to hide the safe region has many of the weaknesses of ASLR that we have illustrated.

2) Detecting Crashes: Second, it is assumed that leaking large parts of memory requires causing numerous crashes, which can be detected using other mechanisms. This is in fact not correct. Although attacks such as Blind ROP [9] and brute force [51] do cause numerous crashes, it is also possible on current CPI implementations to avoid such crashes using side-channel attacks. The main reason for this is that in practice a large number of pages are allocated and, in fact, the entropy in the start address of a region is much smaller than its size. This allows an attacker to land correctly inside allocated space, which makes the attack non-crashing. In fact, CPI's implementation exacerbates this problem by allocating a very large mmap region.

3) Memory Disclosure: Third, it is also implicitly assumed that large parts of memory cannot leak. Direct memory disclosure techniques may have some limitations. For example, they may be terminated by zero bytes or may be limited to areas adjacent to a buffer [54]. However, indirect leaks using dangling data pointers and timing or fault analysis attacks do not have these limitations, and they can leak large parts of memory.

4) Memory Isolation: Fourth, the assumption that the safe region cannot leak because there is no pointer to it is incorrect. As we show in our attacks, random searching of the mmap region can be used to leak the safe region without requiring an explicit pointer into that region.

To summarize, the main weakness of CPI is its reliance on secrets which are kept in the same space as the process being protected. Arguably, this problem has contributed to the weaknesses of many other defenses as well [59, 51, 54, 47].

B. Patching CPI

Our attacks may immediately bring to mind a number of possible fixes to improve CPI. We consider several of these fixes here and discuss their effectiveness and limitations. Such fixes will increase the number of crashes necessary for successful attacks, but they cannot completely prevent attacks on architectures lacking segmentation (x86-64 and ARM).

1) Increase Safe Region Size: The first immediate idea is to randomize the location of the safe region base within an even larger mmap-allocated region. However, this provides no benefit: the safe region base address must be strictly greater than the beginning of the returned mmap region, effectively increasing the amount of wasted data in the large region but not preventing our side channel attack from simply continuing to scan until it finds the safe region. Moreover, an additional
register must be used to hide the offset, and then additional instructions must be used to load the value from that register, add it to the safe region segment register, and then add the actual table offset. This can negatively impact performance.

2) Randomize Safe Region Location: The second fix can be to specify a fixed random address for the mmap allocation using mmap_fixed. This has the advantage that there will be much larger portions of non-mapped memory, raising the probability that an attack might scan through one of these regions and trigger a crash. However, without changing the size of the safe region, an attacker will only need a small number of crashes in order to discover the randomized location. Moreover, this approach may pose portability problems; as the mmap man page states, "the availability of a specific address range cannot be guaranteed, in general." Platform-dependent ASLR techniques could exacerbate these problems. There are a number of other plausible attacks on this countermeasure:

• Unless the table spans a smaller range of virtual memory, attacks are still possible based on leaking the offsets and knowing the absolute minimum and maximum possible mmap_fixed addresses, which decrease the entropy of the safe region.
• Induce numerous heap allocations (at the threshold causing them to be backed by mmap) and leak their addresses. When the addresses jump by the size of the safe region, there is a high probability it has been found. This is similar to heap spraying techniques and would be particularly effective on systems employing strong heap randomization.
• Leak the addresses of any dynamically loaded libraries. If the new dynamically loaded library address increases over the previous dynamic library address by the size of the safe region, there is a high probability the region has been found.

3) Use Hash Function for Safe Region: The third fix can be to use the segment register as a key for a hash function into the safe region. This could introduce prohibitive performance penalties. It is also still vulnerable to attack, as a fast hash function will not be cryptographically secure. This idea is similar to using cryptographic mechanisms to secure CFI [35].

4) Reduce Safe Region Size: The fourth fix can be to make the safe region smaller. This is plausible, but note that if mmap is still contiguous, an attacker can start from a mapped library and scan until they find the safe region, so this fix must be combined with a non-contiguous mmap. Moreover, making the safe region compact will also result in additional performance overhead (for example, if a hashtable is being used, there will be more hashtable collisions). A smaller safe region also runs a higher risk of running out of space to store "sensitive" pointers.

In order to evaluate the viability of this proposed fix, we compiled and ran the C and C++ SPECint and SPECfp 2006 benchmarks [22] with several sizes of CPI hashtables on an Ubuntu 14.04.1 machine with 4GB RAM. All C benchmarks were compiled using the -std=gnu89 flag (clang requires this flag for 400.perlbench to run). In our setup, no benchmark compiled with the CPI hashtable produced correct output on 400.perlbench, 403.gcc and 483.xalancbmk.

Table II lists the overhead results for SPECint. NT in the table denotes "Not terminated after 8 hours". In this table, we have listed the performance of the default CPI hashtable size (2^33). Using a hashtable size of 2^26, CPI reports that it has run out of space in its hashtable (i.e., it has exceeded a linear probing maximum limit) for 471.omnetpp and 473.astar. Using a hashtable size of 2^20, CPI runs out of space in the safe region for those tests, as well as 445.gobmk and 464.h264ref. The other tests incurred an average overhead of 17%, with the worst case overhead of 131% for 471.omnetpp. While in general decreasing the CPI hashtable size leads to a small performance increase, these performance overheads can still be impractically high for some real-world applications, particularly C++ applications like 471.omnetpp.

Table III lists the overhead results for SPECfp. IR in the table denotes "Incorrect results." For SPECfp and a CPI hashtable size of 2^26, two benchmarks run out of space: 433.milc and 447.dealII. In addition, two other benchmarks return incorrect results: 450.soplex and 453.povray. The 453.povray benchmark also returns incorrect results with CPI's default hashtable size.

TABLE II
SPECINT 2006 BENCHMARK PERFORMANCE BY CPI FLAVOR

Benchmark        No CPI    CPI simpletable    CPI hashtable
401.bzip2        848 sec   860 (1.42%)        845 (-0.35%)
429.mcf          519 sec   485 (-6.55%)       501 (-3.47%)
445.gobmk        712 sec   730 (2.53%)        722 (1.40%)
456.hmmer        673 sec   687 (2.08%)        680 (1.04%)
458.sjeng        808 sec   850 (5.20%)        811 (0.37%)
462.libquantum   636 sec   713 (12.11%)       706 (11.01%)
464.h264ref      830 sec   963 (16.02%)       950 (14.46%)
471.omnetpp      582 sec   1133 (94.67%)      1345 (131.10%)
473.astar        632 sec   685 (8.39%)        636 (0.63%)
400.perlbench    570 sec   NT                 NT
403.gcc          485 sec   830 (5.99%)        NT
483.xalancbmk    423 sec   709 (67.61%)       NT

TABLE III
SPECFP 2006 BENCHMARK PERFORMANCE BY CPI FLAVOR

Benchmark        No CPI    CPI simpletable    CPI hashtable
433.milc         696 sec   695 (-0.14%)       786 (12.9%)
444.namd         557 sec   571 (2.51%)        574 (3.05%)
447.dealII       435 sec   539 (23.9%)        540 (24.1%)
450.soplex       394 sec   403 (2.28%)        419 (6.34%)
453.povray       250 sec   IR                 IR
470.lbm          668 sec   708 (5.98%)        705 (5.53%)
482.sphinx3      863 sec   832 (-3.59%)       852 (-1.27%)

To evaluate the effectiveness of a scheme which might dynamically expand and reduce the hashtable size to reduce the attack surface, at the cost of an unknown performance penalty and the loss of some real-time guarantees, we also ran the SPEC benchmarks over an instrumented hashtable implementation to discover the maximum number of keys concurrently
resident in the hashtable; our analysis showed this number to be 2^23 entries, consuming 2^28 bytes. However, some tests did not complete correctly unless the hashtable size was at least 2^28 entries, consuming 2^33 bytes. Without any other mmap allocations claiming address space, we expect at most 2^46/2^28 = 2^18 crashes, with an expectation of 2^17, or at most 2^46/2^33 = 2^13 crashes, with an expectation of 2^12. This seems to be a weak guarantee of the security of CPI on programs with large numbers of code pointers. For instance, a program with 2GB of memory in which only 10% of pointers are found to be sensitive, using a CPI hashtable with a load factor of 25%, would have a safe region of size (2 ∗ 10^9/8 ∗ 10% ∗ 4 ∗ 32) bytes. The expected number of crashes before identifying this region would be only slightly more than 2^14. This number means that the hashtable implementation of CPI is not effective for protecting against a local attacker, and it puts into question the guarantees it provides on any remote system that is not monitored by non-local logging. As a comparison, it is within an order of magnitude of the number of crashes incurred in the Blind ROP [9] attack.

5) Use Non-Contiguous Randomized mmap: Finally, the fifth fix can be to use a non-contiguous, per-allocation randomized mmap. Such non-contiguous allocations are currently only available using customized kernels such as PaX [43]. However, even with non-contiguous allocations, the use of superpages for virtual memory can still create weaknesses. An attacker can force heap allocation of large objects, which use mmap directly, to generate entries that reduce total entropy. Moreover, knowing the location of other libraries further reduces the entropy of the safe region because of its large size. As a result, such a technique must be combined with a reduction in safe region size to be viable. A more accurate evaluation of the security and performance of such a fix would require an actual implementation, which we leave to future work.

The lookuptable implementation of CPI (which was non-functional at the time of our evaluation) could support this approach by a design which randomly allocates the address of each subtable at runtime. This would result in a randomized scattering of the much smaller subtables across memory. There are, however, only 2^46/(32 ∗ 2^22 entries) = 2^19 slots for the lookup table's subtable locations. The expectation for finding one of these is 2^19/2K crashes, where K is the number of new code pointers introduced that cause a separate subtable to be allocated. If there are 2^5 such pointers (which would be the case for a 1GB process with at least one pointer across the address space), that number goes to 2^13 crashes in expectation, which as previously argued does not provide strong security guarantees.

We argue that we can identify a subtable because of the recognizable CPI structure, and search for it via direct/side-channel attacks. While we cannot modify any arbitrary code pointer, we believe that it is only a matter of time until an attacker discovers a code pointer that enables remote code execution.

VIII. POSSIBLE COUNTERMEASURES

In this section we discuss possible countermeasures against control hijacking attacks that use timing side channels for memory disclosure.

a) Memory Safety: Complete memory safety can defend against all control hijacking attacks, including the attack outlined in this paper. SoftBound with the CETS extensions [36] enforces complete spatial and temporal pointer safety, albeit at a significant cost (up to 4x slowdown).

On the other hand, experience has shown that low-overhead mechanisms that trade off security guarantees for performance (e.g., approximate [48] or partial [5] memory safety) eventually get bypassed [9, 52, 21, 11, 17].

Fortunately, hardware support can make complete memory safety practical. For instance, Intel memory protection extensions (MPX) [25] can facilitate better enforcement of memory safety checks. Secondly, the fat-pointer scheme shows that hardware-based approaches can enforce spatial memory safety at very low overhead [32]. Tagged architectures and capability-based systems can also provide a possible direction for mitigating such attacks [58].

b) Randomization: One possible defense against timing channel attacks, such as the one outlined in this paper, is to continuously rerandomize the safe region and ASLR before an attacker can disclose enough information about the memory layout to make an attack practical. One simple strategy is to use a worker pool model that is periodically re-randomized (i.e., not just on crashes) by restarting worker processes. Another approach is to perform runtime rerandomization [20] by migrating running process state.

Randomization techniques provide probabilistic guarantees that are significantly weaker than complete memory safety, at low overhead. We note that any security mechanism that trades security guarantees for performance may be vulnerable to future attacks. This short-term optimization for the sake of practicality is one reason for the numerous attacks on security systems [9, 52, 21, 11, 17].

c) Timing Side Channel Defense: One way to defeat attacks that use side channels to disclose memory is to remove execution timing differences. For example, timing channels can be removed by causing every execution (or path) to take the same amount of time. The obvious disadvantage of this approach is that average-case execution time now becomes worst-case execution time. This change in expected latency might be too costly for many systems. We note here that adding random delays to program execution cannot effectively protect against side channel attacks [19].

IX. RELATED WORK

Memory corruption attacks have been used since the early 70's [6], and they still pose significant threats in modern environments [14]. Memory-unsafe languages such as C/C++ are vulnerable to such attacks.

Complete memory safety techniques such as the SoftBound technique with its CETS extension [36] can mitigate memory corruption attacks, but they incur large overhead to the
execution (up to 4x slowdown). "Fat-pointer" techniques such as CCured [37] and Cyclone [28] have also been proposed to provide spatial pointer safety, but they are not compatible with existing C codebases. Other efforts such as Cling [4], Memcheck [38], and AddressSanitizer [48] only provide temporal pointer safety to prevent dangling pointer bugs such as use-after-free. A number of hardware-enforced memory safety techniques have also been proposed, including the Low-Fat pointer technique [32] and CHERI [58], which minimize the overhead of memory safety checks.

The high overhead of software-based complete memory safety has motivated weaker memory defenses, which can be categorized into enforcement-based and randomization-based defenses. In enforcement-based defenses, certain correct code behavior that is usually extracted at compile time is enforced at runtime to prevent memory corruption. In randomization-based defenses, different aspects of the code or the execution environment are randomized to make successful attacks more difficult.

The randomization-based category includes address space layout randomization (ASLR) [43] and its medium-grained [30] and fine-grained variants [57]. Different ASLR implementations randomize the location of a subset of the stack, heap, executable, and linked libraries at load time. Medium-grained ASLR techniques such as Address Space Layout Permutation [30] permute the location of functions within libraries as well. Fine-grained forms of ASLR such as Binary Stirring [57] randomize the location of basic blocks within code. Other randomization-based defenses include in-place instruction rewriting such as ILR [23] and code diversification using a randomizing compiler, such as the multi-compiler technique [27] or the Smashing the Gadgets technique [42]. Unfortunately, these defenses are vulnerable to information leakage (memory disclosure) attacks [54]. It has been shown that even one such vulnerability can be used repeatedly by an attacker to bypass even fine-grained forms of randomization [52]. Other randomization-based techniques such as Genesis [60], Minestrone [29], and RISE [8] implement instruction set randomization using an emulation, instrumentation, or binary translation layer such as Valgrind [38], Strata [46], or Intel PIN [34], which in itself incurs a large overhead, sometimes slowing the application down by multiple times.

In the enforcement-based category, control flow integrity (CFI) [3] techniques are the most prominent. They enforce a compile-time extracted control flow graph (CFG) at runtime to prevent control hijacking attacks. Weaker forms of CFI have been implemented in CCFIR [61] and bin-CFI [62], which allow control transfers to any valid target as opposed to the exact ones, but such defenses have been shown to be vulnerable to carefully crafted control hijacking attacks that use those targets to implement their malicious intent [21]. The technique proposed by Backes et al. [7] prevents memory disclosure attacks by marking executable pages as non-readable. A recent technique [15] combines aspects of enforcement (non-readable memory) and randomization (fine-grained code randomization) to prevent memory disclosure attacks.

On the attack side, direct memory disclosure attacks have been known for many years [54]. Indirect memory leakage, such as fault analysis attacks (using crash/non-crash signals) [9] or, in general, other forms of fault and timing analysis attacks [47], has more recently been studied.

Non-control data attacks [13], not prevented by CPI, can also be very strong in violating many security properties; however, since they are not within the threat model of CPI, we leave their evaluation to future work.

X. CONCLUSION

We present an attack on the recently proposed CPI technique. We show that the use of information hiding to protect the safe region is problematic and can be used to violate the security of CPI. Specifically, we show how a data pointer overwrite attack can be used to launch a timing side channel attack that discloses the location of the safe region on x86-64. We evaluate the attack using a proof-of-concept exploit on a version of the Nginx web server that is protected with CPI, ASLR and DEP. We show that the most performant and complete implementation of CPI (simpletable) can be bypassed in 98 hours without crashes, and in 6 seconds if a small number of crashes (13) can be tolerated. We also evaluate the work factor required to bypass other implementations of CPI, including a number of possible fixes to the initial implementation. We show that information hiding is a weak paradigm that often leads to vulnerable defenses.

XI. ACKNOWLEDGMENT

This work is sponsored by the Office of Naval Research under Award #N00014-14-1-0006, entitled Defeating Code Reuse Attacks Using Minimal Hardware Modifications, and by DARPA (Grant FA8650-11-C-7192). The opinions, interpretations, conclusions and recommendations are those of the authors and do not reflect the official policy or position of the Office of Naval Research or the United States Government. The authors would like to sincerely thank Dr. William Streilein, Fan Long, the CPI team, Prof. David Evans, and Prof. Greg Morrisett for their support and insightful comments and suggestions.

REFERENCES

[1] Vulnerability summary for CVE-2013-2028, 2013.
[2] Linux cross reference, 2014.
[3] M. Abadi, M. Budiu, U. Erlingsson, and J. Ligatti. Control-flow integrity. In Proceedings of the 12th ACM Conference on Computer and Communications Security, pages 340–353. ACM, 2005.
[4] P. Akritidis. Cling: A memory allocator to mitigate dangling pointers. In USENIX Security Symposium, pages 177–192, 2010.
[5] P. Akritidis, C. Cadar, C. Raiciu, M. Costa, and M. Castro. Preventing memory error exploits with WIT. In Security and Privacy, 2008. SP 2008. IEEE Symposium on, pages 263–277. IEEE, 2008.
[6] J. P. Anderson. Computer security technology planning study. Volume 2. Technical report, DTIC Document, 1972.
[7] M. Backes, T. Holz, B. Kollenda, P. Koppe, S. Nürnberger, and J. Pewny. You can run but you can't read: Preventing disclosure exploits in executable code. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 1342–1353. ACM, 2014.
[8] E. G. Barrantes, D. H. Ackley, T. S. Palmer, D. Stefanovic, and D. D. Zovi. Randomized instruction set emulation to disrupt binary code injection attacks. In Proceedings of the 10th ACM Conference on Computer and Communications Security, CCS '03, pages 281–289, New York, NY, USA, 2003. ACM.
[9] A. Bittau, A. Belay, A. Mashtizadeh, D. Mazieres, and D. Boneh. Hacking blind. In Proceedings of the 35th IEEE Symposium on Security and Privacy, 2014.
[10] T. Bletsch, X. Jiang, V. Freeh, and Z. Liang. Jump-oriented programming: A new class of code-reuse attack. In Proc. of the 6th ACM Symposium on Info., Computer and Comm. Security, pages 30–40, 2011.
[11] N. Carlini and D. Wagner. ROP is still dangerous: Breaking modern defenses. In USENIX Security Symposium, 2014.
[12] S. Checkoway, L. Davi, A. Dmitrienko, A. Sadeghi, H. Shacham, and M. Winandy. Return-oriented programming without returns. In Proc. of the 17th ACM CCS, pages 559–572, 2010.
[13] S. Chen, J. Xu, E. C. Sezer, P. Gauriar, and R. K. Iyer. Non-control-data attacks are realistic threats. In USENIX Security, volume 5, 2005.
[14] X. Chen, D. Caselden, and M. Scott. New zero-day exploit targeting Internet Explorer versions 9 through 11 identified in targeted attacks, 2014.
[15] S. Crane, C. Liebchen, A. Homescu, L. Davi, P. Larsen, A.-R. Sadeghi, S. Brunthaler, and M. Franz. Readactor: Practical code randomization resilient to memory disclosure. In IEEE Symposium on Security and Privacy, 2015.
[16] S. A. Crosby, D. S. Wallach, and R. H. Riedi. Opportunities and limits of remote timing attacks. ACM Transactions on Information and System Security (TISSEC), 12(3):17, 2009.
[17] L. Davi, D. Lehmann, A.-R. Sadeghi, and F. Monrose. Stitching the gadgets: On the ineffectiveness of coarse-grained control-flow integrity protection. In USENIX Security Symposium, 2014.
[18] U. Drepper. ELF handling for thread-local storage, 2013.
[19] F. Durvaux, M. Renauld, F.-X. Standaert, L. v. O. tot Oldenzeel, and N. Veyrat-Charvillon. Efficient removal of random delays from embedded software implementations using hidden Markov models. Springer, 2013.
[20] C. Giuffrida, A. Kuijsten, and A. S. Tanenbaum. Enhanced operating system security through efficient and fine-grained address space randomization. In USENIX Security Symposium, pages 475–490, 2012.
[21] E. Göktas, E. Athanasopoulos, H. Bos, and G. Portokalidis. Out of control: Overcoming control-flow integrity. In IEEE S&P, 2014.
[22] J. L. Henning. SPEC CPU2006 benchmark descriptions. SIGARCH Comput. Archit. News, 34(4):1–17, Sept. 2006.
[23] J. Hiser, A. Nguyen, M. Co, M. Hall, and J. Davidson. ILR: Where'd my gadgets go. In IEEE Symposium on Security and Privacy, 2012.
[24] G. Hunt, J. Larus, M. Abadi, M. Aiken, P. Barham, M. Fähndrich, C. Hawblitzel, O. Hodson, S. Levi, N. Murphy, et al. An overview of the Singularity project. 2005.
[25] Intel. Introduction to Intel memory protection extensions, 2013.
[26] T. Jackson, A. Homescu, S. Crane, P. Larsen, S. Brunthaler, and M. Franz. Diversifying the software stack using randomized NOP insertion. In Moving Target Defense, pages 151–173. 2013.
[27] T. Jackson, B. Salamat, A. Homescu, K. Manivannan, G. Wagner, A. Gal, S. Brunthaler, C. Wimmer, and M. Franz. Compiler-generated software diversity. Moving Target Defense, pages 77–98, 2011.
[28] T. Jim, J. G. Morrisett, D. Grossman, M. W. Hicks, J. Cheney, and Y. Wang. Cyclone: A safe dialect of C. In USENIX Annual Technical Conference, General Track, pages 275–288, 2002.
[29] A. D. Keromytis, S. J. Stolfo, J. Yang, A. Stavrou, A. Ghosh, D. Engler, M. Dacier, M. Elder, and D. Kienzle. The Minestrone architecture: Combining static and dynamic analysis techniques for software security. In SysSec Workshop (SysSec), 2011 First, pages 53–56. IEEE, 2011.
[30] C. Kil, J. Jun, C. Bookholt, J. Xu, and P. Ning. Address space layout permutation (ASLP): Towards fine-grained randomization of commodity software. In Proc. of ACSAC'06, pages 339–348. IEEE, 2006.
[31] V. Kuznetsov, L. Szekeres, M. Payer, G. Candea, R. Sekar, and D. Song. Code-pointer integrity. 2014.
[32] A. Kwon, U. Dhawan, J. Smith, T. Knight, and A. Dehon. Low-fat pointers: Compact encoding and efficient gate-level implementation of fat pointers for spatial safety and capability-based security. In Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, pages 721–732. ACM, 2013.
[33] W. Landi. Undecidability of static analysis. ACM Letters on Programming Languages and Systems (LOPLAS), 1(4):323–337, 1992.
[34] C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: Building customized program analysis tools with dynamic instrumentation. ACM SIGPLAN Notices, 40(6):190–200, 2005.
[35] A. J. Mashtizadeh, A. Bittau, D. Mazieres, and D. Boneh. Cryptographically enforced control flow integrity. arXiv preprint arXiv:1408.1451, 2014.
[36] S. Nagarakatte, J. Zhao, M. M. Martin, and S. Zdancewic. CETS: Compiler enforced temporal safety for C. In ACM SIGPLAN Notices, volume 45, pages 31–40. ACM, 2010.
[37] G. C. Necula, S. McPeak, and W. Weimer. CCured: Type-safe retrofitting of legacy code. ACM SIGPLAN Notices, 37(1):128–139, 2002.
[38] N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In ACM SIGPLAN Notices, volume 42, pages 89–100. ACM, 2007.
[39] H. Okhravi, T. Hobson, D. Bigelow, and W. Streilein. Finding focus in the blur of moving-target techniques. IEEE Security & Privacy, 12(2):16–26, Mar 2014.
[40] A. One. Smashing the stack for fun and profit. Phrack magazine, 7(49):14–16, 1996.
[41] OpenBSD. OpenBSD 3.3, 2003.
[42] V. Pappas, M. Polychronakis, and A. D. Keromytis. Smashing the gadgets: Hindering return-oriented programming using in-place code randomization. In IEEE Symposium on Security and Privacy, 2012.
[43] PaX. PaX address space layout randomization, 2003.
[44] C. Percival. How to zero a buffer, Sept. 2014.
[45] W. Reese. Nginx: The high-performance web server and reverse proxy. Linux Journal, 2008(173):2, 2008.
[46] K. Scott, N. Kumar, S. Velusamy, B. Childers, J. W. Davidson, and M. L. Soffa. Retargetable and reconfigurable software dynamic translation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, pages 36–47. IEEE Computer Society, 2003.
[47] J. Seibert, H. Okhravi, and E. Soderstrom. Information leaks without memory disclosures: Remote side channel attacks on diversified code. In Proceedings of the 21st ACM Conference on Computer and Communications Security (CCS), Nov 2014.
[48] K. Serebryany, D. Bruening, A. Potapenko, and D. Vyukov. AddressSanitizer: A fast address sanity checker. In USENIX Annual Technical Conference, pages 309–318, 2012.
[49] H. Shacham. The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86). In Proceedings of the 14th ACM Conference on Computer and Communications Security, pages 552–561. ACM, 2007.
[53] R. Strackx, Y. Younan, P. Philippaerts, F. Piessens, S. Lachmund, and T. Walter. Breaking the memory secrecy assumption. In Proc. of EuroSec'09, pages 1–8, 2009.
[54] R. Strackx, Y. Younan, P. Philippaerts, F. Piessens, S. Lachmund, and T. Walter. Breaking the memory secrecy assumption. In Proceedings of EuroSec '09, 2009.
[55] L. Szekeres, M. Payer, T. Wei, and D. Song. SoK: Eternal war in memory. In Proc. of IEEE Symposium on Security and Privacy, 2013.
[56] M. Tran, M. Etheridge, T. Bletsch, X. Jiang, V. Freeh, and P. Ning. On the expressiveness of return-into-libc attacks. In Proc. of RAID'11, pages 121–141, 2011.
[57] R. Wartell, V. Mohan, K. W. Hamlen, and Z. Lin. Binary stirring: Self-randomizing instruction addresses of legacy x86 binary code. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, pages 157–168. ACM, 2012.
[58] R. N. Watson, J. Woodruff, P. G. Neumann, S. W. Moore, J. Anderson, D. Chisnall, N. Dave, B. Davis, B. Laurie, S. J. Murdoch, R. Norton, M. Roe, S. Son, M. Vadera, and K. Gudka. CHERI: A hybrid capability-system architecture for scalable software compartmentalization. In IEEE Symposium on Security and Privacy, 2015.
[59] Y. Weiss and E. G. Barrantes. Known/chosen key attacks against software instruction set randomization. In Computer Security Applications Conference, 2006. ACSAC'06. 22nd Annual, pages 349–360. IEEE, 2006.
[60] D. Williams, W. Hu, J. W. Davidson, J. D. Hiser, J. C. Knight, and A. Nguyen-Tuong. Security through diversity: Leveraging virtual machine technology. Security & Privacy, IEEE, 7(1):26–33, 2009.
[61] C. Zhang, T. Wei, Z. Chen, L. Duan, L. Szekeres, S. McCamant, D. Song, and W. Zou. Practical control flow integrity and randomization for binary executables. In Security and Privacy (SP), 2013 IEEE Symposium on, pages 559–573. IEEE, 2013.
[62] M. Zhang and R. Sekar. Control flow integrity for COTS binaries. In USENIX Security, pages 337–352, 2013.
bone: Return-into-libc without function calls (on the
x86). In Proceedings of the 14th ACM conference on
Computer and communications security, pages 552–561.
ACM, 2007.
[50] H. Shacham. The geometry of innocent flesh on the bone:
Return-into-libc without function calls (on the x86). In
Proc. of ACM CCS, pages 552–561, 2007.
[51] H. Shacham, M. Page, B. Pfaff, E.-J. Goh, N. Modadugu,
and D. Boneh. On the effectiveness of address-space
randomization. In Proc. of ACM CCS, pages 298–307,
2004.
[52] K. Z. Snow, F. Monrose, L. Davi, A. Dmitrienko,
C. Liebchen, and A.-R. Sadeghi. Just-in-time code reuse:
On the effectiveness of fine-grained address space layout
randomization. In Security and Privacy (SP), 2013 IEEE
Symposium on, pages 574–588. IEEE, 2013.
[53] R. Strackx, Y. Younan, P. Philippaerts, F. Piessens,
796