Bulk Collection of Signals Intelligence
ISBN 978-0-309-32520-2
80 pages, 6 x 9, Paperback (2015)
Distribution, posting, or copying of this PDF is strictly prohibited without written permission of the National Academies Press.
Unless otherwise indicated, all materials in this PDF are copyrighted by the National Academy of Sciences.
BULK COLLECTION OF SIGNALS INTELLIGENCE
TECHNICAL OPTIONS
NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from
the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible
for the report were chosen for their special competences and with regard for
appropriate balance.
Support for this project was provided by the Office of the Director of National Intelligence, Contract Number 2014-14041100003-001. Any opinions, findings,
conclusions, or recommendations expressed in this publication are those of the
author(s) and do not necessarily reflect the views of the organizations or agencies
that provided support for the project.
International Standard Book Number 13: 978-0-309-32520-2
International Standard Book Number 10: 0-309-32520-X
Library of Congress Control Number: 2015933164
This report is available from
Computer Science and Telecommunications Board
National Research Council
500 Fifth Street, NW
Washington, DC 20001
Additional copies of this report are available from the National Academies Press,
500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202)
334-3313; http://www.nap.edu.
Copyright 2015 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
Preface
In January 2014, the President addressed the nation and the broader global community to explain U.S. policy regarding the collection of foreign intelligence. Shortly thereafter, the White House released Presidential Policy Directive 28 (PPD-28), in which Section 5(d) requested the Director of National Intelligence (DNI) to assess "the feasibility of creating software that would allow the IC more easily to conduct targeted information acquisition [of signals intelligence] rather than bulk collection."1
The Office of the Director of National Intelligence (ODNI) then asked
the National Academies to form a committee to study this question, and
discussions led to the charge to the committee shown in Box P.1. Note that
the charge does not request recommendations, and the analysis and conclusions of the Committee on Responding to Section 5(d) of Presidential
Policy Directive 28: The Feasibility of Software to Provide Alternatives to
Bulk Signals Intelligence Collection are made with this in mind.
The committee assembled for this study included individuals with
expertise in national security law; counterterrorism operations; privacy
and civil liberties as they relate to electronic communications; data mining; large-scale systems development; software development; Intelligence
Community (IC) needs as they relate to research and development; and networking and social media. See Appendix C for biographical information.
BOX P.1
The Charge to the Committee
A committee appointed by the National Research Council will assess the
feasibility of creating software that would allow the U.S. intelligence community
more easily to conduct targeted information acquisition rather than bulk collection,
as called for in section 5(d) of Presidential Policy Directive 28. To the extent possible, it will consider the efficacy, practicality, and privacy implications of alternative
software architectures and uses of information technology, and explore tradeoffs
among these aspects in the context of representative use cases. The study will
consider a broad array of communications modalities, e.g., phone, email, instant
message, and so on. It will not address the legality or value of signals intelligence
collection. The study will identify and assess options and alternatives but will not
issue recommendations.
Specifically, the committee will address the following:
1. What are a small set of representative use cases within which one can
explore alternative software architectures and uses of information technology, and
consider trade-offs?
2. What is the current state of the software technology to support targeted
information acquisition? What are feasible and likely trajectories for future relevant
software development; near, mid, and far term? What are possible technology
alternatives to bulk collection in the context of the use cases?
3. What are relevant criteria or metrics for comparing bulk collection to
targeted collection (e.g., effectiveness, response time, cost, efficacy, practicality,
privacy impacts)?
4. What tradeoffs arise with the technology alternatives analyzed in the
context of the use cases and criteria/metrics?
5. How might requirements for information collection be altered in light of this
analysis?
6. What uncertainties are associated with the assumptions and analyses,
and how might they affect the basis for decisions?
With only 5 months from study inception to delivery, the study committee did not have the luxury of time. The committee sought to be
responsive to the context in which the report was requested. In general
terms, the committee saw its mission as exploring whether technological
software-based alternatives to bulk collection might be identified in order
to retain, to the extent possible, current intelligence capabilities while
intruding less on parties that are not of known or potential interest to the
IC. The legal protections provided by the Fourth Amendment and legislation such as the Foreign Intelligence Surveillance Act distinguish between
foreign and U.S. persons, a factor that informed the committee's thinking.
unclassified report suffices to answer its charge to the best of its ability.
One consequence of this approach is that some details must be omitted to
protect sources and methods that the IC rightly guards with care.
An unclassified report risks being overtaken by newly declassified
material. As this report was being finalized, documents were being declassified by the IC (see http://icontherecord.tumblr.com/) and released as a
result of Freedom of Information Act requests. Consequently, numerous omissions are bound to appear in the report; these omissions are not expected to change the committee's fundamental arguments, although new information may change details along the way.
The committee met six times in person, with the first meeting in mid-June 2014, and held numerous conference calls. Open sessions during
its meetings were devoted to briefings from outside parties, and closed
sessions were devoted to committee deliberations.
ACKNOWLEDGMENTS
The complexity and classified aspects of the issues explored in this
report meant that the committee had much to learn from its briefers. The
committee is grateful to many parties for presentations on:
June 30-July 2, 2014. Joel Brenner (Joel Brenner LLC, the Chertoff
Group, and former Inspector General, National Security Agency [NSA]),
Carmen Medina (Deloitte Consulting LLP and former Deputy Director for Intelligence, Central Intelligence Agency [CIA]), Mark Maybury
(The MITRE Corporation), General Keith B. Alexander (retired), Chris
Inglis (former Deputy Director, National Security Agency), Wesley Wilson
(ODNI/National Counterterrorism Center), Robert Brose (ODNI), William
Crowell (Alsop-Louie Partners), Stephanie O'Sullivan (ODNI), David Honey (ODNI), and Marjory Blumenthal (Office of Science and Technology Policy).
August 4-6, 2014. Jeff Jonas (IBM), Mark Lowenthal (Intelligence
and Security Academy), and Philip Mudd (New America Foundation,
Mudd Management, and former Deputy Director, CIA Counterterrorism
Center).
August 27-29, 2014. David Grannis (Senate Select Committee on
Intelligence) and Kate Martin (Center for National Security Studies).
September 8-10, 2014. Alexander Joel (ODNI), J.C. Smart (Georgetown
University), Peter Highnam (Intelligence Advanced Research Projects
Activity), and members of the Privacy and Civil Liberties Oversight
Board.
The committee requested but did not receive comments from the
American Civil Liberties Union, the Electronic Frontier Foundation, and
the Electronic Privacy Information Center.
The committee appreciates the support of David Honey (Assistant
Deputy Director of National Intelligence for Science and Technology
[ADDNI/S&T]), Steven D. Thompson (Senior S&T Advisor), John C.
Granger (Senior Advisor to the ADDNI/S&T), and their colleagues from
ODNI who helped make this study possible and the many officials
of ODNI and NSA who briefed the committee or answered its questions.
In addition, the committee acknowledges the intellectual contributions
of its staff, Alan Shaw (Study Director, Air Force Studies Board), Herbert
S. Lin (Chief Scientist, Computer Science and Telecommunications Board
[CSTB]), and Jon Eisenberg (Director, CSTB); consultants Alex Gliksman
(AGI Consulting, LLC), M. Anthony Fainberg (Institute for Defense
Analyses), and Allan Friedman (George Washington University); and
Eric Whitaker (Senior Program Assistant, CSTB), who provided administrative support.
THE COMMITTEE'S PERSPECTIVE ON ITS CHARGE
This report is part of the national discussion about the balance between
the powers of government and the rights of the governed, as the government tries to carry out its constitutionally mandated responsibilities. As
indicated above, the committee was asked a question about technology.
Accordingly, this report emphasizes technology but also attends to the
need for effective and trustworthy processes, even as more sophisticated
technologies are developed. But neither technology nor process, alone or together, can guarantee the proper balance between collective and individual security.
Acknowledgment of Reviewers
Although the reviewers listed above have provided many constructive comments and suggestions, they were not asked to endorse the
report's conclusions, nor did they see the final draft of the report before
its release. The review of this report was overseen by Samuel H. Fuller,
Analog Devices, Inc., and William H. Press, University of Texas, Austin.
Appointed by the National Research Council, they were responsible for
making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review
comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring committee and the
institution.
Summary
This report of the Committee on Responding to Section 5(d) of Presidential Policy Directive 28: The Feasibility of Software to Provide Alternatives to Bulk Signals Intelligence Collection responds to a request to the National Academies from the Office of the Director of National Intelligence (ODNI). That request, in turn, was occasioned by Presidential Policy Directive 28 (PPD-28) Section 5(d), which had asked the Director of National Intelligence for a report assessing "the feasibility of creating software that would allow the Intelligence Community (IC) more easily to conduct targeted information acquisition rather than bulk collection [of signals intelligence]."1 This study is among several of the administration's responses to heightened public concern about U.S. intelligence agency surveillance programs that followed Edward Snowden's disclosure of numerous internal National Security Agency (NSA) documents beginning in mid-2013. These responses include other activities called for in PPD-28 as well as a study of big data and privacy by the President's Council of Advisors on Science and Technology that is largely focused on civilian applications.2
6 In the case of telephone communications, metadata include the calling and called telephone numbers and the time and duration of a call, but not its content. For email, metadata have been interpreted to exclude the subject line. Other types of communications have different metadata elements.
7 For example, FISA and Foreign Intelligence Surveillance Court (FISC) orders restrict bulk
collection of domestic telephony records to querying targets with reasonable and articulable
suspicion (RAS) that they belong to a foreign terrorist organization. For another example,
PPD-28 restricts collection to six specific purposes.
8 For recent reports that deal with policy associated with signals collection, see two reports from the Privacy and Civil Liberties Oversight Board: Report on the Telephone Records
Program Conducted under Section 215 of the USA Patriot Act and on the Operations of the Foreign
Intelligence Surveillance Court, January 23, 2014, http://www.pclob.gov/library/215-Report_
on_the_Telephone_Records_Program.pdf, and Report on the Surveillance Program Operated
Pursuant to Section 702 of the Foreign Intelligence Surveillance Act, July 2, 2014, http://www.
pclob.gov/library/702-Report.pdf. See also President's Review Group on Intelligence and Communications Technologies, Liberty and Security in a Changing World, December 12, 2013, http://www.whitehouse.gov/sites/default/files/docs/2013-12-12_rg_final_report.pdf.
9 Indeed, as this study was under way, the President announced he would seek legislation to end bulk collection of domestic telephony metadata (The White House, "The Administration's Proposal for Ending the Section 215 Bulk Telephony Metadata Program," Fact Sheet, March 27, 2014, Office of the Press Secretary, Washington, D.C.), and legislation was proposed.
10 The sources of the signals are a separate topic that the committee did not consider,
although some examples are given later in the report.
FIGURE S.1 A conceptual model for the signals intelligence process. [Figure not reproduced; its labeled elements include Signal, Discriminant, Extract, Filter, Store, Query, Analyze, Disseminate, and Collection.]
A broader set of use cases, such as ones involving collection of communications content, detection of suspicious foreign communications patterns, or detection of suspicious queries to Internet search engines, might point to other possibilities for alternatives to bulk collection.
BULK COLLECTION AND INFORMATION ABOUT PAST EVENTS
A common aspect of the categories of use cases above is that they rely
in part on information from the past to link or connect identifiers. If past events become interesting in the present (because of new circumstances such as identifying a new target, a nonnuclear nation that is now pursuing the development of nuclear weapons, an individual who is found to be a terrorist, or new intelligence-gathering priorities), then historical events and the data they provide will be available for analysis only if they were previously collected. If it is possible to do targeted collection of similar
previously collected. If it is possible to do targeted collection of similar
events in the future, and if they happen soon enough, then the past events
might not be needed. If the past events are unique or if delay in obtaining
results is unacceptable (because of an imminent threat or perhaps because
of press coverage or public demand), then the intelligence will not be as
complete. So restricting bulk collection will make intelligence less effective, and technology cannot do anything about this; whether the gain in
privacy is worth the loss of information is a policy question that the committee does not address.
CONTROLLING USAGE
Controls on usage can help reduce the conflicts between collection
and privacy. Other entities also collect highly sensitive data and use it for purposes that the people who provide it might not like: companies that provide cloud services such as email and social media, and data brokers that collect and correlate data from a wide variety of public and proprietary sources and sell it to support decisions about extending credit or for marketing purposes. It is worth comparing how society controls these activities with how it controls the IC. The accepted control paradigm is notice and consent: the terms of service that almost no one reads. Although today people are more tolerant of private data collection than of government data collection, this may change as the collection of private data grows. The 2014 report on privacy and big data from the President's Council of Advisors on Science and Technology proposes
instead that people should have control over how their data are used.11
Controls on use thus offer an alternative to controls on collection as a way
of protecting privacy.
There are two ways to control usage: manually and automatically.
NSA already has both automated and strong manual controls in place.
Despite rigorous auditing and oversight processes, however, it is hard
to convince outside parties of their strength, because necessary secrecy
prevents the public from observing the controls in action, and because
popular descriptions of the controls are imprecise and sometimes wrong.
Technical means can isolate collected data and automatically restrict
queries that analysts make, and the way these means work can be public
without revealing sensitive sources and methods. Then people outside the
IC concerned about privacy and civil liberties would have new ways to
verify that the IC has adequate procedures and follows them. Enhanced
automated controls also offer the promise of reduced burdens on analysts
because they can be more efficient than manual controls. Some manual
controls would still be necessary to ensure that the automatic controls are
actually imposed and that they are configured according to the rules, and
to decide cases that are too complex to be automated.
Automated controls and audits require expressing, in software, the
rules embodied in laws, policies, regulations, and directives that constrain
how intelligence is collected, analyzed, and disseminated. The current
rules form a complex network that has grown with changes in technology and in the national security environment. They contain conflicting
definitions and inconsistencies. An expression of the rules in a concise, consistent, machine-processable form, derived from their legislative and administrative sources, would not only simplify automation software but also make the rules more understandable to the public.
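To make the idea concrete, a rule of this kind can be written directly as code. The following sketch is purely illustrative: the rule fields, the purpose and query-type names, and the two-hop limit are hypothetical stand-ins loosely patterned on publicly described restrictions, not an encoding of any actual directive.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QueryRule:
    """One machine-processable rule: which purpose justifies which
    query types, and how many contact 'hops' a search may traverse."""
    purpose: str
    query_types: frozenset
    max_hops: int

# Hypothetical rule set for illustration only.
RULES = [
    QueryRule("counterterrorism", frozenset({"telephony_metadata"}), max_hops=2),
]

def is_authorized(purpose, query_type, hops):
    """A query is authorized only if some rule covers its purpose,
    its query type, and its hop count."""
    return any(
        r.purpose == purpose
        and query_type in r.query_types
        and hops <= r.max_hops
        for r in RULES
    )
```

Even a toy encoding like this makes conflicts mechanically detectable: two rules assigning the same purpose different hop limits can be flagged automatically, which is far harder when the rules exist only in prose.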
The next section outlines the key technical elements required to control and automate usage.
TECHNICAL ELEMENTS OF AUTOMATED CONTROLS
An automated system for controlling usage of bulk data with high
assurance has three parts: isolating bulk data so that it can be accessed
only in specific ways, restricting the queries that can be made against
it, and auditing the queries that have been done. In each of these areas,
there are opportunities for automated control; some of them are already
11 President's Council of Advisors on Science and Technology (PCAST), Big Data and
Privacy: A Technological Perspective, Executive Office of the President, May 2014, http://
www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_
privacy_-_may_2014.pdf.
[Figure not reproduced: an architecture for automated usage control, with labeled elements including Analysts, Query, Control, Guard, Result, Policy, Isolation, Bulk data, and Audit log.]
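In outline, the three elements (isolation of bulk data, restriction of queries, and auditing) can be combined in a single mediating guard. The sketch below is a minimal illustration under assumed names, record formats, and policy structure; it is not a description of any deployed IC system.

```python
from datetime import datetime, timezone

class QueryGuard:
    """Mediates all access to an isolated bulk-data store: queries must
    satisfy policy, and every attempt, permitted or not, is audited."""

    def __init__(self, records, allowed_query_types):
        self._records = records              # isolated: no direct external access
        self._allowed = set(allowed_query_types)
        self.audit_log = []                  # append-only record of all attempts

    def run_query(self, analyst, query_type, selector):
        permitted = query_type in self._allowed
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "analyst": analyst,
            "query": (query_type, selector),
            "permitted": permitted,
        })
        if not permitted:
            raise PermissionError(f"query type {query_type!r} not allowed by policy")
        # Restricted query: only matching records cross the isolation boundary.
        return [r for r in self._records if r.get(query_type) == selector]

guard = QueryGuard(
    records=[{"caller": "A", "callee": "B"}, {"caller": "C", "callee": "A"}],
    allowed_query_types={"caller"},          # policy permits caller lookups only
)
hits = guard.run_query("analyst-1", "caller", "A")   # permitted and audited
```

Because the guard's checking logic contains no sensitive data, its design could in principle be published for outside scrutiny while the data it protects remain classified.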
no software technique that will fully substitute for bulk collection; there
is no technological magic.
Conclusion 1.1. Other sources of information might provide a partial substitute for bulk collection in some circumstances.
Data retained from targeted SIGINT collection is a partial substitute if
the needed information was in fact collected. Bulk data held by other parties, such as communications service providers, might substitute to some
extent, but this relies on those parties retaining the information until it is
needed, as well as the ability of intelligence agencies to collect or access it
in an efficient and timely fashion. Other intelligence sources and methods
might also be able to supply some of the lost information, but the committee was not charged to and did not investigate the full range of such
alternatives. Note that these alternatives may introduce their own privacy
and civil liberties concerns.
Conclusion 1.2. New approaches to targeting might improve the
relevance of the collected information to future use and would rely
on capabilities such as creating and using profiles of potentially
relevant targets, possibly by using other sources of information.
Because bulk collection cannot for practical reasons be truly comprehensive, it is itself inherently selective and unable to capture all relevant
history.13 It may be possible to improve targeted collection to the point
where it provides a viable substitute for bulk collection in at least some
cases, using profiles of potential targets that are compiled from a wide
range of information. This might reduce collection against persons who
are not targets, but it might also introduce new privacy and civil liberties
concerns about how such profiles are developed and used.
Rapidly updating discriminants of ongoing collections to include
new targets as they are discovered will collect data that would otherwise
be lost. If targeted collection can be done quickly and well enough, bulk
information about past events may not be needed. Targeted collection
cannot be a substitute if the past events were unique or if the delay
incurred to collect new information would be unacceptable.
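The trade-off described above, in which data arriving before a discriminant is updated are lost to targeted collection, can be shown with a toy model. Everything here, including the event format and identifiers, is hypothetical.

```python
# Toy contrast between bulk collection and targeted collection whose
# discriminants (the target set) are updated as targets are discovered.

events = [  # (time, identifier) pairs arriving on a stream
    (1, "X"), (2, "Y"), (3, "X"), (4, "Z"),
]

bulk_store = list(events)  # bulk: everything is retained

targets = set()            # discriminants for targeted collection
targeted_store = []
for t, ident in events:
    if t == 3:
        targets.add("X")   # "X" is discovered to be of interest at time 3
    if ident in targets:
        targeted_store.append((t, ident))

# Targeted collection captures X's activity only from the moment X became
# a target; the earlier event (1, "X") survives only in the bulk store.
```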
Conclusion 2. Automatic controls on the usage of data collected in
bulk can help to enforce privacy protections.
13 The FISA Section 215 program collects only a small percentage of the total telephony metadata held by service providers (President's Review Group on Intelligence and Communications Technologies, Liberty and Security in a Changing World, 2013, p. 97).
14 This conclusion is consistent with Recommendation 2 in PCAST, Big Data and Privacy: A Technological Perspective, 2014.
15 See also Ibid., Recommendation 3.
16 Examples of manual procedures for target approval are in National Security Agency, NSA's Civil Liberties and Privacy Protections for Targeted SIGINT Activities Under Executive Order 12333, NSA Director of Civil Liberties and Privacy Office Report, October 7, 2014, https://www.nsa.gov/civil_liberties/_files/nsa_clpo_report_targeted_EO12333.pdf.
17 PCAST, Big Data and Privacy: A Technological Perspective, 2014, Recommendation 1, Sections 4 and 4.5.2.
1
Introduction and Background
1 The White House, Remarks by the President on Review of Signals Intelligence, Office of the Press Secretary, January 17, 2014, http://www.whitehouse.gov/the-press-office/2014/01/17/remarks-president-review-signals-intelligence.
suppress any political activity. He clarified that the use of any bulk collection of SIGINT was even more limited, explicitly stating that it could
be used only for six specific security requirements: counterintelligence;
counterterrorism; counterproliferation; cybersecurity; force protection for
our troops and our allies; and combating transnational crime, including
sanctions evasion.
While defending the nature of American collection and use of bulk
data to support national security, the President also acknowledged how
many in America and around the world might still be concerned. He
declared an interest in exploring how the United States can preserve
current intelligence capabilities but with less government collection and
storage of bulk data. He conceded that it would not be easy to match the capabilities and fill the gaps that the [metadata2 collection program] was designed to address, but he committed to exploring several options that might enhance protections of privacy, including decreasing the number of hops in a contact network search from three to two, having the Foreign Intelligence Surveillance Court (FISC) review reasonable and articulable suspicion (RAS) selectors, and identifying a means to have the storage of the bulk metadata occur outside the federal government.
Shortly after the President's speech, the White House released Presidential Policy Directive 28 (PPD-28),3 the topic of which was U.S. policy
on SIGINT. PPD-28 both laid out the principles that govern how the U.S.
collects SIGINT and strengthened executive branch oversight of SIGINT
activities. PPD-28 seeks to ensure that U.S. policy takes into account security requirements, alliances, trade and investment relationships (including
the concerns of U.S. companies), and the U.S. commitment to privacy and
basic rights and liberties. The document also promised review of U.S.
decisions about intelligence priorities and sensitive targets by the President's senior national security team on an annual basis.
Of most importance to this report, PPD-28 requested the Office of the
Director of National Intelligence (ODNI) to assess "the feasibility of creating software that would allow the Intelligence Community more easily to conduct targeted information acquisition rather than bulk collection."
In turn, ODNI asked the National Academies to study and report on this
question. The Committee on Responding to Section 5(d) of Presidential
Policy Directive 28: The Feasibility of Software to Provide Alternatives to
Bulk Signals Intelligence Collection was formed in response.
2 The term metadata is defined in Section 2.3. Loosely, for telephone calls it includes calling and called number, and time and duration of call, but not any content of the call.
3 See The White House, Presidential Policy Directive/PPD-28, Signals Intelligence Activities, Office of the Press Secretary, January 17, 2014, http://www.whitehouse.gov/
the-press-office/2014/01/17/presidential-policy-directive-signals-intelligence-activities.
4 Whether this reason is in some sense sincere or a cover for protectionism is unclear. But it may not matter. Whether perception or reality, U.S. leadership has concluded that the upset created is sufficient to require a response.
5 For example, a New York Times article of March 2014 reports on estimates of losses to U.S. technology companies ranging from $35 billion to $180 billion by 2016. See Claire Cain Miller, "Revelations of N.S.A. Spying Cost U.S. Tech Companies," The New York Times, March 21, 2014, http://www.nytimes.com/2014/03/22/business/fallout-from-snowden-hurting-bottom-line-of-tech-companies.html?_r=0.
6 Pew Research Center, Global Opposition to U.S. Surveillance and Drones, but Limited Harm to America's Image, Washington, D.C., July 14, 2014, http://www.pewglobal.org/2014/07/14/global-opposition-to-u-s-surveillance-and-drones-but-limited-harm-to-americas-image/.
7 Josh Levs and Catherine E. Shoichet, "Europe furious, shocked by report of U.S. spying," CNN, July 1, 2013, http://www.cnn.com/2013/06/30/world/europe/eu-nsa/.
ligence Directives (USSID), the most important of which for this report is USSID 18, which has been declassified in substantial part.13
The original enactment of FISA responded to significant contemporary political pressures, which resulted from abuses revealed in a series
of congressional hearings in the 1970s, and demanded greater control of
foreign intelligence collection by SIGINT methods when an activity occurs
in the United States or involves U.S. persons. The level of statutory and regulatory control responds to political pressures that ebb and flow over time, as will be seen. The 9/11 attacks caused an adjustment in this balance to respond to foreign attacks in domestic space.
At its initial enactment, FISA was not without controversy. Although some argued that there was a critical need for the oversight that FISA provided through a specially created court, others held (and continue to hold today) the long-standing view that foreign intelligence, as a core presidential function, could not constitutionally be regulated by congressional statute.14 Nonetheless, passage of FISA, which introduced court
approval of intelligence collection for the first time, was encouraged by
a contemporaneous decision of the U.S. Supreme Court, intimating that
much of such domestic national security collection might be subject to
Fourth Amendment requirements for prior judicial approval through a
warrant application process.15 In response, FISA created a unique procedural approval process overseen by a new Article III court, the FISC,
which was designed to authorize electronic intelligence surveillance in the
13 National Security Agency, United States Signals Intelligence Directive USSID SP0018, (U) Legal Compliance and U.S. Persons Minimization Procedures, Issue Date January 25, 2011, approved for release on November 13, 2013, referred to as USSID 18, http://www.dni.gov/files/documents/1118/CLEANEDFinal USSID SP0018.pdf.
14 A recently released May 6, 2004, Memorandum for the Attorney General authored
by Professor Jack L. Goldsmith, then Assistant Attorney General, Department of Justice,
Office of Legal Counsel, describes this view. See Jack L. Goldsmith, Review of the Legality of
the STELLAR WIND Program, Office of the Assistant Attorney General, Washington, D.C.,
May 6, 2004, http://www.justice.gov/sites/default/files/pages/attachments/2014/09/19/
may_6_2004_goldsmith_opinion.pdf.
15 Although Title III of the Omnibus Crime Control and Safe Streets Act of 1968, 18 U.S.C.
2510-2520, 1968, authorizes electronic surveillance for specifically limited crimes with a
prior court order, a proviso at 18 U.S.C. 2511(3) protected the Presidents long-standing
right to conduct surveillance for national security purposes. Nonetheless, Justice Lewis
Powells language in the majority decision of U.S. v. United States District Court (Keith), 407
U.S. 297, 1972, had made clear that this exception would be narrowly construed in cases of
domestic security. FISA responded to indications of the direction of Supreme Court decisions. In the Keith decision, it was argued that the defendants, U.S. citizens who had acted
only domestically, constituted national security threats by bombing a government facility
and so the warrant requirement of the Fourth Amendment did not apply. The Supreme
Court rejected this contention, but left open the possibility that the executive branch might
not be so limited if national security threats involved foreign powers.
United States by NSA and the Federal Bureau of Investigation upon application to, and approval by, the court. The FISC decisions have remained
largely classified throughout much of the court's history. This proved controversial to some, who questioned the independence of a judicial body that operated largely out of the public eye to authorize intrusive
surveillance that, unlike warrants in criminal matters, would likely never
be publicly available, lacked any adversarial process, and limited the right
of appeal to the government applicant alone. These questions remain and
provide part of the backdrop to this report.
As originally enacted, FISA governed electronic surveillance for
foreign intelligence or counterintelligence information when collection
would occur within the United States. To collect such information, a
showing must be made to the FISC establishing probable cause that the
target is either a foreign power or an agent of a foreign power. Where the
target is a U.S. person, a showing based solely on First Amendment activities is not sufficient. Collection is subject to minimization protections, procedures designed to limit the acquisition and retention, and prohibit the
dissemination, of nonpublicly available information concerning unconsenting United States persons, but in ways nonetheless consistent with
the need for foreign intelligence.16 As a practical matter, minimization
involves removing the names of and references to U.S. persons with these
exceptions: the information is necessary to assess the value of the foreign
intelligence or the targeting of a U.S. person was approved by the FISC.
FISA was amended following the collection of domestic communications metadata that began in 2001. This was done initially at presidential
direction outside normal FISA processes, a decision that proved controversial.17 It was subsequently brought within the FISA process in 2006
through the business records provision of Section 215 of the USA Patriot
Act.18 This allowed the FISC to require production of documents and
other tangible things determined relevant to national security investigations, much as other courts do in criminal and grand jury investigations.
This provision has served as the authority under which the U.S. government has requested telecommunications providers to produce telephony
metadata, when relevant to a national security investigation.19 This provision, approved in the course of several reviews by the FISC since 2006, was also reauthorized by Congress in 2009 and again in 2011. It should be noted that the interpretation of Section 215 permitting bulk collection of such business records, although provided to Congress and relevant committees, was not publicly acknowledged by the U.S. government until after the Snowden disclosures.20
16 See Foreign Intelligence Surveillance Act of 1978, 50 U.S.C. 1801(h)(1) and 1821(4)(A), 1978.
17 See footnote 14.
18 USA Patriot Act 2001, http://www.gpo.gov/fdsys/pkg/PLAW-107publ56/pdf/PLAW-107publ56.pdf.
19 Standards of relevance vary according to context. What is relevant for a criminal investigation will differ from the far broader standard for civil discovery or a grand jury subpoena. The FISC has acceded to the government's argument that for national security investigations, relevance must be broadly construed. See Robert S. Litt, "Privacy, Technology and National Security," 2013, p. 6.
A third provision was added when Section 702 was passed as part of
the FISA Amendments Act of 2008 and reauthorized in 2012.21 The Section 702 amendment brought all communications, whether by satellite,
radio, wire, etc., acquired with the assistance of electronic communication
service providers under FISC oversight and supervision, even though
these communications were occurring overseas. Section 702 allows the
targeting of non-U.S. persons who are reasonably believed to be outside
the United States and expected to possess, receive, and/or communicate
foreign intelligence information, consistent with the Fourth Amendment.
Although full communications content, not just metadata, can be
collected under this authority, only non-U.S. persons may be targeted for
approved foreign intelligence purposes. To ensure that these limitations
are followed while preserving the flexibility and nimbleness needed for
effective foreign intelligence collection, annual certifications by the U.S.
Attorney General are presented to the FISC for approval, rather than specific prior judicial approval on a case-by-case basis.
The foregoing FISA provisions do not fully describe NSA's collection authority. To ensure that all collection was consistent with constitutional requirements, a broad operational charter, Executive Order 12333,
United States Intelligence Activities, was promulgated in 1981 by the
Reagan Administration; it has continued without significant change
in collection authorities to the present. This executive order provides
the basic authorities and principles under which all national security
agencies must operate.22 Importantly, at 2.8, "Consistency with Other
Laws," it provides: "Nothing in this Order shall be construed to authorize any activity in violation of the Constitution or statutes of the United
States." The provisions of Executive Order 12333 are further supported
by detailed operating regulations applicable to each individual agency;
in the case of NSA, Department of Defense Regulation 5240.1-R, its classified annex, and USSID 18, approved by the Attorney General, provide the specific implementation guidance for all authorized activities.
20 David S. Kris, "On the bulk collection of tangible things," Journal of National Security Law and Policy 7:209, 2014.
21 FISA Amendments Act of 2008, http://www.gpo.gov/fdsys/pkg/BILLS-110hr6304enr/pdf/BILLS-110hr6304enr.pdf.
22 Executive Order 12333, http://www.archives.gov/federal-register/codification/executive-order/12333.html. NSA's 13 specified responsibilities are defined at Executive Order No. 12333 1.12(b), 3 C.F.R. 200, 1981, "Intelligence Components Utilized by the Secretary of Defense."
USSID 18 offers an important window into the detailed operational
authorities that govern NSA activities.23 It begins by observing that all
NSA activities must be consistent with the Constitution's provisions, as
interpreted by the U.S. Supreme Court. Annex A to USSID 18 sets forth
minimization procedures approved by the Attorney General that govern
the handling of information under FISA authority that may relate to U.S.
persons. The procedures limit the retention and dissemination of information about U.S. persons, whether or not the information is pertinent.
Incidental collection of data about individuals who are not themselves
subjects of interest is common to all forms of collection, and the concept
of minimization is thus one of long standing in law enforcement activities.
1.4.2 Policy and Practical Controls
Responding to the legal framework described above, NSA has developed a system of internal compliance and oversight. All parts of the foreign intelligence collection system are involved: access, storage, analysis,
and dissemination.
Both manual and automated controls are used to implement the legal
search framework that governs foreign intelligence information. Access
controls and secure databases then protect the subsequent storage of foreign intelligence information. All actions are subject to extensive subsequent review, by way of an automatically generated audit trail and both internal
and external human review, and all NSA employees receive extensive training.
An example of how policy and practical controls work together to
protect privacy in the case of data gathered under Section 215 authority
is provided in Box 1.1.
1.4.3 Legal Authorities for Collection and Use of Information
The legal authorities under which NSA operates are described in a
public document entitled NSA Missions, Authorities, Oversight and Partnerships.24 As noted above, these authorities include Executive Order 12333
and the Foreign Intelligence Surveillance Act of 1978, as amended. Executive Order 12333 is the foundational authority on which NSA relies to
collect, retain, analyze, and disseminate foreign SIGINT information.
BOX 1.1
Privacy Protections for Phone Metadata
Collected Under Section 215
Privacy protections for telephone metadata collected under Section 215
authority were described in a speech by Office of the Director of National Intelligence (ODNI) General Counsel Robert Litt on July 18, 2013.a,b He noted that
before reports from queries are returned to analysts, the queries themselves must
be approved to ensure compliance with legal and policy rules. These rules may
stem from law (e.g., Section 215 restrictions on surveillance of U.S. persons) or
from internal controls (e.g., that an analyst must be trained on the proper use of
the returned data). All queries must meet a "reasonable and articulable suspicion"
test. These rules seek to ensure that there can be no domestic "fishing expeditions" in which queries seek information about parties unrelated to an intelligence
investigation.
Litt also reported on other measures that are applied to protect privacy of
Section 215 telephone metadata:
- The information is stored in secure databases.
- The only intelligence purpose for which the information can be used is counterterrorism.
- Only a limited number of analysts may search these databases.
- A search is allowed only when there is already a reasonable and articulable suspicion that the telephone number is associated with a terrorist organization that has been identified by the FISC.
- The data may be used only to map a network of telephone numbers calling other telephone numbers.
- If an analyst finds a previously unknown (domestic) telephone number that warrants further investigation, that number may be disseminated only in a way that avoids identifying a person associated with the number. Further investigation may be done only by other lawful means, including other FISA provisions and law enforcement authority.
- The telephony metadata is destroyed after 5 years.
- Audit records are kept for all database queries, and a set of auditing and compliance-checking procedures applies, implemented not only by NSA but also by ODNI and the Department of Justice.
In addition, only a limited number of NSA officials (22) are designated to make
a determination that a telephone number satisfies the reasonable and articulable
suspicion (RAS) criteria.c
a Robert S. Litt, Privacy, Technology and National Security: An Overview of Intelligence
Collection, speech, Washington, D.C., July 18, 2013, http://www.dni.gov/index.php/newsroom/
speeches-and-interviews/174-speeches-interviews-2009.
b PPD-28 added two additional restrictions: a requirement that the FISC approve the RAS
and a reduction in the number of hops that can be followed from three to two (The White
House, Presidential Policy Directive/PPD-28, Signals Intelligence Activities, Office of the
Press Secretary, January 17, 2014, http://www.whitehouse.gov/the-press-office/2014/01/17/
presidential-policy-directive-signals-intelligence-activities).
c Testimony of Chris Inglis, Statement, House Permanent Select Committee on Intelligence, Hearing on How Disclosed NSA Programs Protect Americans, and Why Disclosure
Aids Our Enemies, June 18, 2013, http://icontherecord.tumblr.com/post/57812486681/
hearing-of-the-house-permanent-select-committee-on.
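The layered controls described in Box 1.1 can be pictured as a gate in front of the query interface. The following Python sketch is purely illustrative: the selector values, analyst names, and data layout are invented, and the real compliance system is far more elaborate than a pair of set lookups.

```python
# Illustrative sketch (not any agency's actual code) of the layered checks in
# Box 1.1: a query against Section 215 metadata is answered only if the
# selector has a current RAS determination and the analyst is one of the
# limited set authorized to search the database. Every attempt is audited.

RAS_APPROVED = {"+1-202-555-0100"}      # hypothetical RAS-approved selectors
AUTHORIZED_ANALYSTS = {"analyst07"}     # hypothetical set of cleared analysts
AUDIT_LOG = []                          # audit record of every query attempt

def run_query(analyst, selector, database):
    """Return records matching `selector`, or raise if any control fails."""
    AUDIT_LOG.append((analyst, selector))           # logged even if refused
    if analyst not in AUTHORIZED_ANALYSTS:
        raise PermissionError("analyst not authorized to search this database")
    if selector not in RAS_APPROVED:
        raise PermissionError("selector lacks a RAS determination")
    return [r for r in database if selector in (r["caller"], r["called"])]

db = [{"caller": "+1-202-555-0100", "called": "+1-703-555-0198"}]
print(len(run_query("analyst07", "+1-202-555-0100", db)))  # prints 1
```

Note that a refused query still leaves an entry in AUDIT_LOG, mirroring the report's point that audit records are kept for all database queries, not just successful ones.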
There have been two important changes to the Section 215 program
as a result of the current public debate. A January 2014 presidential statement announced that the number of hops would be reduced from three
to two and that the FISC would be tasked with approving RAS selectors.27
27 See The White House, "Remarks by the President on Review of Signals Intelligence," 2014, and U.S. Foreign Intelligence Surveillance Court, In Re Application of the Federal Bureau of Investigation for an Order Requiring the Production of Tangible Things, Order Granting the Government's Motion to Amend the Court's Primary Order Dated January 3, 2014, Docket No. BR 14-01, Washington, D.C., http://www.uscourts.gov/uscourts/courts/fisc/br14-01-order.pdf.
2
Basic Concepts
FIGURE 2.1 A conceptual model of signals intelligence, showing a signal passing through extract, filter, and store steps (with a discriminant controlling the filter and other intelligence sources as additional inputs), followed by query, analyze, and disseminate steps; in this model, collection comprises the steps through storage.
The committee's definition of collection differs from that used by NSA in certain ways. See, for example, NSA, "NSA's Civil Liberties and Privacy Protections for Targeted SIGINT Activities Under Executive Order 12333," NSA Director of Civil Liberties and Privacy Office Report, October 7, 2014, https://www.nsa.gov/civil_liberties/_files/nsa_clpo_report_targeted_EO12333.pdf. See also footnote 3 in this chapter.
Extract. The first step is to obtain the signal from a source, convert it into a digital stream, and parse the stream to extract the kind of
information being sought, such as an email message or the digital audio
of a telephone call. Extraction interprets layers of communications and
Internet protocols, such as Optical Transport Network (OTN), Synchronous Digital Hierarchy (SDH), Ethernet, Internet Protocol (IP), Transmission Control Protocol (TCP), Simple Mail Transfer Protocol (SMTP), or
Hypertext Transfer Protocol (HTTP). In cases where business records
are sought, this step extracts and reformats relevant SIGINT data from a
business record format used by the business.
Filter. This step selects, from all the items extracted, items of interest that should be retained. It is sometimes controlled by a discriminant,
which the IC agency running the collection provides to describe in precise
terms the properties of an item that should be retained. For example,
a discriminant might specify "all telephone calls from 301-555-1212 to
Somalia," "all telephone calls from France to Yemen," or "all search-engine queries containing the word 'sarin'." If there is no discriminant,
then all extracted items are retained.
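To make the filter step concrete, a discriminant such as "all telephone calls from France to Yemen" can be thought of as a predicate applied to each extracted item. The sketch below is illustrative only; the record layout and the country-prefix test are simplifications invented for this example, not a description of any deployed system.

```python
# A discriminant expressed as a predicate over extracted call records.
# The country-prefix test (+33 for France, +967 for Yemen) is a deliberate
# simplification for illustration; real discriminants are far richer.

def france_to_yemen(call):
    """Retain calls placed from a French number to a Yemeni number."""
    return call["caller"].startswith("+33") and call["called"].startswith("+967")

extracted = [
    {"caller": "+33 1 5555 0100", "called": "+967 1 555101"},    # France -> Yemen
    {"caller": "+1 301 555 1212", "called": "+252 61 5550102"},  # U.S. -> Somalia
]

# The filter step: with a discriminant, only matching items are retained;
# with no discriminant, every extracted item would be retained.
retained = [c for c in extracted if france_to_yemen(c)]
print(len(retained))  # prints 1
```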
Store. Retained items are stored in a database operated by the U.S.
government. This is the point at which collection is deemed in this model
to occur for the retained data.3 By contrast, the previous steps are fleeting, with data processed in near real time (keeping data only for short
periods of time, minutes to hours, for technical reasons) as fast as it is
supplied, with all but the items to be retained discarded. Items collected
from separate sources are usually combined into a modest number of
large databases to facilitate searching and analysis.
In modern communication systems, traffic from many sources and
destinations is aggregated into a single channel. For example, the radio
signals to and from a base station serving all mobile phones in a cell are
all on the same radio channels, and all of the IP packets between two
routers may be carried on the same fiber. With rare exceptions, there is
no single physical access point comparable to the central office connection of a landline telephone at which to observe only the items of interest
and nothing more. Reflecting this reality, the committees definition of
collection says that SIGINT data is "collected" only when it is stored,
not when it is extracted. Put another way, every piece of data that passes
by a potential monitoring point must be machine-filtered as part of the
3 Not everyone agrees on a definition of the word "collection," which is widely used in policy,
law, and regulation pertaining to SIGINT. This lack of collective agreement extends to entities within the IC itself. Moreover, subtle distinctions among the definitions lead to different
views on certain SIGINT properties, especially its intrusion on privacy.
Caller            Called             Date and Time         Call Duration
+1-617-555-0131   +1-703-555-0198    2014:10:3:15:45:10    3:41
+1-703-555-0198   +1-703-555-0013    2014:10:3:15:49:10    1:10
+1-415-555-0103   +963 99 2210403    2014:10:3:16:01:43    73:43
+1-603-555-0141   +1-603-555-0152    2014:10:3:22:10:03    3:01
+1-617-555-0183   +1-413-555-0137    2014:10:3:22:33:48    7:03
+1-802-555-0141   +1-802-555-0108    2014:10:3:22:41:17    3:02
NOTE: In this hypothetical example of call detail records as they might appear in a signals
intelligence database, the call shown in the first line might be relaying a message through
an intermediary at +1-703-555-0198. The call on the third line is to an international number,
which might belong to a foreign national or a U.S. person. The call in the fourth line was
probably ordering a pizza, since a directory of telephone numbers reveals that the called
number is a pizza shop.
The committee's understanding, based on the briefings it received, is that most data incidentally collected about U.S. persons are never examined, because U.S. person data is not
returned in response to analyst queries for foreign intelligence information.
well. In this way, analysts can build a network that depicts how parties of
interest relate to one another and characterize the activities of each of the
parties in a network or more formally structured enterprise.
Analysts use a variety of software tools as they work with SIGINT
data. They may use tools to formulate queries or display the results (e.g.,
see Figure 3.1). They may set up standing queries (which need special
approval) that run each day to report new events associated with their
active targets. Using results of queries of the data, they build a record of
data and evidence for investigations in a working store, a set of digital
files separate from the SIGINT databases.
2.1.3 Dissemination
The last step in the SIGINT process is dissemination. SIGINT analysts
will routinely disseminate the results of their work to others, both inside
and outside the IC. For example, NSA analysts working on a specific
terrorism investigation might disseminate their findings to other analysts
and collectors who are working on related issues or directly to policy
makers who may choose to take action based on the SIGINT.
Like the initial collection, SIGINT dissemination is governed by various laws and regulations designed to protect the sources and methods
involved in the collection as well as the privacy and civil liberties of the
subjects of the collection, especially if the intelligence involves U.S. persons.5 With respect to the latter, and pursuant to U.S. Signals Intelligence
Directive (USSID) 18,6 such reports will normally cloak the identity of U.S.
persons until a reader of the report specifically asks for the identity to be
disclosed and provides a valid reason for the release, such as initiating
a further investigation. This process is designed to ensure that both the
requesting agency and NSA, as the disseminator of the information, can
verify that disclosing this sensitive information is appropriate and necessary to understand the foreign intelligence value of the report.
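The cloaking of U.S.-person identities described above can be pictured as a substitution pass over a report before dissemination. The following sketch is illustrative only (the names, report text, and data structure are invented); actual minimization procedures under USSID 18 are far more involved.

```python
# Illustrative minimization at dissemination time: U.S.-person names in a
# report are replaced with generic labels, and the label-to-identity mapping
# is kept back so that an identity can be released later only on a valid
# request. The names and report text here are invented.

def minimize(report_text, us_persons):
    """Return (cloaked report, retained mapping from label to identity)."""
    mapping = {}
    for i, name in enumerate(us_persons, start=1):
        label = f"U.S. Person {i}"
        mapping[label] = name                  # retained, not disseminated
        report_text = report_text.replace(name, label)
    return report_text, mapping

masked, mapping = minimize("John Doe called Jane Roe twice.",
                           ["John Doe", "Jane Roe"])
print(masked)  # prints: U.S. Person 1 called U.S. Person 2 twice.
```

Only the cloaked text is disseminated; the mapping stays with the disseminating agency, which mirrors the two-party check described above before an identity is released.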
2.2 BULK AND TARGETED COLLECTION
Presidential Policy Directive 28 (PPD-28) asks whether it is feasible
to create software that could replace bulk collection with targeted
5 Section 4 of PPD-28 indicates that the IC should endeavor to give the same protections
to foreign persons as to U.S. persons with regard to the retention and dissemination of
identifying information.
6 National Security Agency, United States Signals Intelligence Directive USSID SP0018,
(U) Legal Compliance and U.S. Persons Minimization Procedures, Issue Date January 25,
2011, approved for release on November 13, 2013, referred to as USSID 18, http://www.dni.
gov/files/documents/1118/CLEANEDFinal USSID SP0018.pdf.
collection.7 This section attempts to explain this distinction, which is, unfortunately, far from sharp. The question itself is addressed in Chapter 4.
Bulk collection results in a database in which a significant portion of
the data pertains to identifiers not relevant to current targets. Such items
usually refer to parties that have not been, are not now, and will not become
subjects of interest. Moreover, they are not closely linked to anyone of that
sort: knowing to whom these parties talk will not help locate threats or
develop more information about threats. Bulk collection occurs because
it is usually impossible to determine at the time of filtering and collection
that a party will have no intelligence value. Although the amount of information retained from bulk collection is often large, and often larger than
the amount retained from targeted collection, it is not the size of a collection
that makes it bulk. Rather, it is the (larger) proportion of extra
data, beyond currently known targets, that defines bulk collection.
Targeted collection tries to reduce, insofar as possible, items about
parties with no past, present, or future intelligence value. This is achieved
by using discriminants that narrowly select relevant items to store. For
example, if the email address hardcase45@example.com was obtained from
a terrorist's smartphone when he was arrested, using a discriminant to
instruct the filter to save only email to or from hardcase45@example.com
would result in a targeted collection. Some or many of the people communicating with this person might turn out to have no intelligence value, but
the collection is far more selective than, say, collecting all email to or from
anyone with an email address served by aol.com. A discriminant could be a
top-level Internet domain, a country code (e.g., .cn for China, .fr for France),
a date on which communication occurred, a device type, and so on. A discriminant could even refer to the content of a communication, such as all
email with the word "nuclear" in it. Note that if a discriminant is broadly
crafted, the filter may retain such a large proportion of data on people of no
intelligence value that the collection cannot be called targeted.
PPD-28 seeks ways to reduce or avoid bulk collection in order to
increase privacy and civil liberty protections for those not relevant to the
intelligence collection purposes. Note that there is no precise threshold in
collecting data on such harmless persons that will distinguish between
bulk and targeted; it's a matter of degree. Also note that the bulk/targeted
distinction applies broadly to different data types: telephony content,
metadata, business records, Internet searches, and so on.
The fundamental trade-off, which can be seen in the Chapter 3 use
cases and is explored further in Chapter 4, is between more intrusive
7 John DeLong testimony to committee; see also Memorandum of the United States in Response to the Court's Order Dated January [sic] 28, 2009 at 11, In re Production of Tangible Things From [REDACTED], No. BR 08-13 (FISA Ct. February 17, 2009), http://www.dni.gov/files/documents/section/pub_Dec%2012%202008%20Supplemental%20Opinions%20from%20the%20FISC.pdf.
FIGURE 2.2 Classification of identifiers used in signals intelligence analysis. Within the universe of identifiers, the figure distinguishes identifiers that are ruled out, unknowns, and subjects of interest, including targets, RAS targets, and seeds.
RAS is a term of art used in the context of Section 215 collection. See David Kris, "On the bulk collection of tangible things," Journal of National Security Law and Policy 7:209, 2014.
10 National Security Agency, USSID 18, approved for release in 2013.
BOX 2.1
Working Definitions in Signals Intelligence and Technology

identifier: A label, such as a telephone number, email address, or Internet Protocol (IP) address, that refers to a party appearing in SIGINT data.

subject of interest: A party whose communications or activities may have foreign intelligence value.

target (n, adj): A party that is the object of foreign intelligence collection or analysis.

seed (target): A target used as the starting point for contact chaining.

RAS target: A target for which a reasonable and articulable suspicion (RAS) determination has been made.

query: A request for information from a database of stored SIGINT data.

discriminant: A description, in precise terms, of the properties of an item that should be retained by a filter.

selector: An identifier used to specify the items to be collected or queried.

collection (of SIGINT data): The storage of SIGINT data in a database operated by the U.S. government; in this report's model, data is deemed collected only when it is stored.

bulk collection: Collection in which a significant portion of the retained data pertains to identifiers that are not relevant to current targets.

targeted collection: Collection that stores only the SIGINT data that remains after a filter discriminant removes most non-target data.

minimization: Procedures designed to limit the acquisition and retention, and prohibit the dissemination, of nonpublicly available information concerning unconsenting U.S. persons.

call detail record (CDR): A business record describing a telephone call, typically including the calling and called numbers, the date and time, and the duration of the call.

business records: Records created and held by a business, such as CDRs, whose production can be compelled under the Section 215 authority.

foreign intelligence information: Information relating to the capabilities, intentions, or activities of foreign powers, organizations, or persons.

U.S. person: A U.S. citizen or lawful permanent resident, or an entity substantially composed of such persons.
1 Administration White Paper: Bulk Collection of Telephony Metadata Under Section 215 of the USA Patriot Act, August 9, 2013, p. 4. Found various places online, including http://big.assets.huffingtonpost.com/Section215.pdf.
2 National Security Agency, United States Signals Intelligence Directive USSID SP0018, "(U) Legal Compliance and U.S. Persons Minimization Procedures," Issue Date January 25, 2011, approved for release on November 13, 2013, referred to as USSID 18, http://www.dni.gov/files/documents/1118/CLEANEDFinal USSID SP0018.pdf.
3 See U.S. Department of Justice, Justice News, "Acting Assistant Attorney General Elana Tyrangiel Testifies Before the U.S. House Judiciary Subcommittee on Crime, Terrorism, Homeland Security, and Investigations," March 19, 2013, http://www.justice.gov/iso/opa/doj/speeches/2013/olp-speech-1303191.html.
4 Administration White Paper, 2014.
5 FISC ruling, U.S. Foreign Intelligence Surveillance Court, In Re Motion of ProPublica, Inc. for the Release of Court Records, Docket No. Misc. 13-09, The United States' Opposition to the Motion of ProPublica, Inc. for the Release of Court Records, http://www.dni.gov/files/documents/1118/CLEANEDPRTT%201.pdf, p. 11.
6 USA Patriot Act 2001, http://www.gpo.gov/fdsys/pkg/PLAW-107publ56/pdf/PLAW-107publ56.pdf, p. 17.
7 Administration White Paper, 2014.
8 Ibid., p. 2, item (e).
9 Ibid., p. 4, item (i).
3
Use Cases and Use Case Categories
1 For scenarios of four counterterrorism investigations studied by the Privacy and Civil
Liberties Oversight Board, see Report on the Telephone Records Program Conducted under Section 215 of the USA PATRIOT Act and on the Operations of the Foreign Intelligence Surveillance
Court, January 23, 2014, http://www.pclob.gov/library/215-Report_on_the_Telephone_
Records_Program.pdf, p. 144 ff.
2 National Security Agency, presentation to the committee on August 28, 2014.
BOX 3.1
Some Specific Cases of Signals Intelligence in Use
Very little has been made public about actual cases where U.S. signals intelligence has contributed to counterterrorism. A principal reason is that the Intelligence Community (IC) carefully protects information about sources and methods
from adversaries. Nevertheless, information on some cases can be found in public
speeches and testimony to Congress by IC leaders and in two reports prepared
by the Privacy and Civil Liberties Oversight Board.
The accounts of these cases are incomplete and possibly inconsistent. The
selection of the cases that were made public, the details of the accounts, and their
significance have all been controversial.
Pointers to some of this public information are provided below, not because
the committee endorses the views of its authors, but simply to supplement the abstract use case categories presented in this chapter with some concrete examples:
- Testimony by Gen. Keith Alexander and others before the House Permanent Select Committee on Intelligence, June 18, 2013, http://icontherecord.tumblr.com/post/57812486681/hearing-of-the-house-permanent-select-committee-on.
- Four cases using Foreign Intelligence Surveillance Act (FISA) Section 215 authority:
  - Basaaly Moalin, financial support of Al Shabab.
  - Najibullah Zazi, plotted to bomb the New York subway system.
  - David Coleman Headley, helped plan the 2008 Mumbai attack.
  - Khalid Ouazzani, suspected of plotting to bomb the New York Stock Exchange.
  Described in Privacy and Civil Liberties Oversight Board, Report on the Telephone Records Program Conducted under Section 215 of the USA Patriot Act and on the Operations of the Foreign Intelligence Surveillance Court, http://www.pclob.gov/library/215-Report_on_the_Telephone_Records_Program.pdf, p. 144 ff.
- Some uses of FISA Section 702 authority are described in Privacy and Civil Liberties Oversight Board, Report on the Surveillance Program Operated Pursuant to Section 702 of the Foreign Intelligence Surveillance Act, http://www.pclob.gov/library/702-Report.pdf, p. 104 ff.
elements used are the "to" and "from" identifiers in the form of telephone
numbers or email addresses, or the Internet Protocol (IP) address of a
computer used for communication. Collection methods are not described,
and it is assumed that the data are collected in such a way that they contain the entries that are required to satisfy the scenario. The Intelligence
Community (IC) may collect additional kinds of SIGINT metadata.
3.1 CONTACT CHAINING
Communications metadata, domestic and foreign, are used to develop
contact chains by starting with a target and using metadata records to
indicate who has communicated directly with the target (one hop), who
has in turn communicated with those people (two hops), and so on.
Studying contact chains can help identify members of a network of people
who may be working together; if one is known or suspected to be a terrorist, it becomes important to inspect others with whom that individual is in
contact who may be members of a terrorist network. Similarly, studying
contact chains can help analysts to understand the structure of an organization under investigation.
3.1.1 Use Case 1
In Use Case 1, the U.S. government has identified a Somali pirate
network that includes target A. An analyst queries and displays all the
call contacts to or from A's telephone number in the last 18 days. Some
contacts are identified as already known targets; others are undetermined.
The analyst invokes a similar query and display for target B, who has communicated frequently with A, and notes that there are three people, not
yet determined to be targets, who have been in contact with both A and B.
The analyst can see this relationship immediately, because the contact sets
of A and B are displayed as a network, with contacts as nodes, linked by
lines to indicate calls. The analyst invokes the query-and-display function
again on one of these three, C, and discovers this person is in contact not
only with targets A and B but also with other known pirates. Perhaps C is
a missing link between the networks in which A and B are operating.3
Many contacts uncovered this way are ruled out as having no intelligence value. Calls to a car mechanic, an IT help desk, or an automated
3 "Inside the NSA," 60 Minutes, CBS News, video segment, December 15, 2013, 3:40-4:45, http://www.cbsnews.com/videos/inside-the-nsa/. The transcript for the 60 Minutes
segment is at CBS News, "NSA Speaks out on Snowden, Spying," December 15, 2013,
http://www.cbsnews.com/news/nsa-speaks-out-on-snowden-spying/. Note that the video
that plays on the page with the transcript is not guaranteed to be the correct segment of
60 Minutes; the URL for the correct video segment is given above.
weather report are likely to be ruled out, although perhaps some may
later be found to have intelligence value. Further, laws or regulations
restrict what an analyst is allowed to do. For instance, there are special
rules applied to subjects of interest who are or might be U.S. persons
and various (and differing) sets of rules depending on which authority
allowed the collection of the underlying information (see Section 1.4).
3.1.2 How Metadata Are Used in Contact Chaining
Either bulk or targeted collection can lead to the result in Figure 3.1.
Since A and B are targets, targeted collection using a discriminant that
specifies "collect all calls to or from A or B" would collect all the contacts
and subjects shown in the figure. However, if all calls between A or B
and C occurred before either A or B was identified as a target, later collection targeted on A or B will not find C by way of A or B, but might
find C because of communication with some other target. Bulk collection
provides useful history, because it does not limit collection to only the
targets known at the time of collection.
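The value of this history can be illustrated with a toy timeline, in which a call from C to A occurs before A is identified as a target. All records and dates below are invented; the point is only that targeted collection begun after the fact cannot recover the earlier link, whereas bulk collection retains it.

```python
# Toy timeline for the history argument above. Each record is
# (day, caller, called). A is identified as a target only on day 10;
# targeted collection on A begins then, while bulk collection has been
# storing everything all along.

calls = [(3, "C", "A"),    # C contacts A before A is a known target
         (12, "B", "A")]   # B contacts A after targeting begins

target_identified_day = 10
bulk = list(calls)                                        # everything retained
targeted = [c for c in calls
            if c[0] >= target_identified_day and "A" in c[1:]]

print(any("C" in c[1:] for c in targeted))  # prints False
print(any("C" in c[1:] for c in bulk))      # prints True
```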
FIGURE 3.1 A network of contacts among identifiers. The legend distinguishes seeds, targets, identifiers of interest, unknown identifiers, and contacts.
4
Bulk Collection
fiers. Collection software could be designed to chain targets this way only
if such chaining is pre-approved. While this approach may collect a few
more rapidly unfolding scenarios, it does not provide the complete view
of past events afforded by bulk collection.
Big data analytics. It may be possible to use big data analytics to
help narrow collection, even if the results from such analytical tools are
not sufficiently precise to identify individual targets. That is, the government may be able to rely on the power of large private-sector databases,
analytics, and machine learning to shape data collection constraints to
data predicted to have high value. But even if the government collection
becomes more narrowly targeted through the use of such analytic tools to
develop the targeting, this is not necessarily a win for privacy. Depending on what aggregate data is used to determine the targeted government collection, use of such techniques may well raise privacy concerns.
There will also be concerns that the methods used for targeting are akin
to socially unacceptable profiling (e.g., targeting purchases of camping
goods, males, ages 15 to 30). Thus, the use of big data analytics to provide
better targeting may not be acceptable from a policy point of view, even if
such techniques were to ultimately result in a more narrow government
collection.
Cascaded filtering. Some of these methods may benefit from the use of cascaded filtering. One benefit of this approach is that it reduces the computing burden: cheap tests are applied first, followed by more expensive filters only if the earlier filters warrant. For example, if metadata indicates a civilian telephone call to a military unit under surveillance, speech recognition and subsequent semantic analysis might be applied to the voice signal, resulting in an ultimate collection decision. Richer targeting may require enhancing the ability of collection hardware and software to apply complex discriminants to real-time signal feeds. Another benefit is that fast, early filtering tends to reduce the amount of data that ends up being collected.
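A cascade of this kind can be sketched as follows; the filter predicates, watch list, and keywords are invented placeholders for the classified discriminants the text describes:

```python
# Illustrative sketch of cascaded filtering: a cheap metadata test runs first,
# and an item reaches the expensive content test only if the cheap test passes.
# All data values here are hypothetical.

WATCHED_UNIT_NUMBERS = {"555-0100"}       # numbers of a unit under surveillance
KEYWORDS = {"shipment", "rendezvous"}     # terms the semantic analysis looks for

def metadata_filter(item):
    # Cheap test: does the call reach a watched number?
    return item["callee"] in WATCHED_UNIT_NUMBERS

def speech_filter(item):
    # Expensive test (stands in for speech recognition + semantic analysis).
    transcript = item.get("transcript", "")
    return any(kw in transcript for kw in KEYWORDS)

def cascade(items, filters):
    """Keep an item only if every filter passes; all() short-circuits,
    so later (expensive) filters run only when earlier ones succeed."""
    return [item for item in items if all(f(item) for f in filters)]

calls = [
    {"caller": "555-0199", "callee": "555-0100", "transcript": "shipment at dawn"},
    {"caller": "555-0123", "callee": "555-0777", "transcript": "happy birthday"},
]
print(cascade(calls, [metadata_filter, speech_filter]))  # only the first call survives
```

The ordering matters only for cost, not for the result: each stage narrows the stream so the expensive stage sees as little data as possible.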
4.3 CONCLUSION
There is no doubt that bulk collection of SIGINT leaves many uncomfortable. Various courts have indeed questioned whether such collection
is constitutional. This discomfort arises for many reasons. Some find the
idea that the U.S. government collects vast amounts of communications
signals information about unsuspected U.S. persons abhorrent to the
very notion of democracy, while others object to this decision being made
under the cover of secrecy.
This chapter has explored uses of bulk collection and technical alternatives the committee uncovered during its work that might mitigate
some of the privacy and civil liberties concerns of that collection. None
of these alternatives changes a fundamental point: A key value of bulk
collection is its record of past SIGINT that may be relevant to subsequent
investigations. If past events become interesting in the present because of new circumstances (such as the identification of a new target, indications that a nonnuclear nation is now pursuing the development of nuclear weapons, discovery that an individual is a terrorist, or the emergence of new intelligence-gathering priorities), historical events and the data they provide will be available for analysis only if they were previously collected.
Conclusion 1. There is no software technique that will fully substitute for bulk collection where it is relied on to answer queries about
the past after new targets become known.
This conclusion does not mean that all current bulk collection must
continue. What it does mean is that a choice to eliminate all forms of bulk
collection would have costs in intelligence capabilities. The analysis in this
report provides a partial basis from which to make such policy choices.
Other groups, such as the President's Review Group on Intelligence and Communications Technologies and the Privacy and Civil Liberties Oversight Board, have said that bulk collection of telephone metadata is not valuable enough to justify the loss in privacy.3 This is a policy judgment, which is not in conflict with the committee's conclusion that there are no technical alternatives that can accomplish the same functions as bulk collection and serve as a complete substitute for it; there is no technological magic.
The committee was not asked to and did not consider whether the
loss of effectiveness from reducing bulk collection would be too great,
or whether the potential gain in privacy from adopting an alternative is
worth the potential loss of intelligence information. Nor was it able to identify broad categories of use where substitution of alternatives might be possible, or metrics that would inform such decisions. The Office
of the Director of National Intelligence may wish to study these questions
further.
Data retained from targeted SIGINT collection might be a partial
substitute if the needed information was in fact collected. Bulk data held
by other parties might substitute to some extent, but this relies on those
3 President's Review Group on Intelligence and Communications Technologies, Liberty and Security in a Changing World, http://www.whitehouse.gov/sites/default/files/docs/2013-12-12_rg_final_report.pdf, and Privacy and Civil Liberties Oversight Board, Report on the Telephone Records Program Conducted under Section 215 of the USA PATRIOT Act and on the Operations of the Foreign Intelligence Surveillance Court, January 23, 2014, http://www.pclob.gov/SiteAssets/Pages/default/PCLOB-Report-on-the-Telephone-Records-Program.pdf.
5
Controlling Usage of Collected Data
The last two are the main threats for most people concerned about
privacy and civil liberties. Hence, the emphasis of this report is on controls, oversight, and transparency, which are the principal ways to address
these threats.
5.2 CONTROLLING USAGE
Chapter 4 states the committee's conclusion that refraining entirely from bulk collection will reduce the nation's intelligence capability and that there is no kind of targeted collection that can fully substitute for all of today's bulk collection. However, the committee believes that controlling the usage of data collected in bulk (and indeed all data) is another
way to protect the privacy of people who are not targets (see Figure 5.1).
Controls on usage can help reduce the conflicts between collection and
privacy. There are two ways to control usage: manually and automatically.
NSA automates some of its controls and plans additional automation.
Despite rigorous auditing and oversight processes, however, it is hard
to convince outside parties of their strength because necessary secrecy
prevents them from observing the controls in action, and because popular
descriptions of the controls are imprecise and sometimes wrong.1 Examples of usage controls in place today are minimization (Section 1.4.1) and
restricting queries to targets with "reasonable and articulable suspicion"
(Section 1.4.3).
Technical means can isolate collected data and restrict queries that
analysts can make, and the way these means work can be made public
without revealing sources and methods.
This is similar to the well-established doctrine in cryptography2 that
the security of the system should depend only on keeping the cryptographic key secret, not on keeping the cryptographic algorithm secret. The
main reason for this is that the algorithm exists in many more places than the key (with every sender or receiver of messages that uses the cryptosystem), so it is much harder to keep the algorithm secret and to change
it if it is compromised. In contrast, a key is usually used only between a
single sender and receiver, or at most a few of them, and only for a limited
time, so it is much easier to keep it secret and to change it if it is compromised. In addition, a public algorithm may be more secure because many
people can scrutinize it for weaknesses.
1 See, for example, this newspaper account of President Obama's description of NSA practices: Washington Post, January 17, 2014, http://www.washingtonpost.com/world/national-security/obamas-restrictions-on-nsa-surveillance-rely-on-narrow-definition-of-spying/2014/01/17/2478cc02-7fcb-11e3-93c1-0e888170b723_story.html.
2 First formulated by Auguste Kerckhoffs in 1883. See Fabien Petitcolas's electronic version and English translation of La cryptographie militaire.
[Figure 5.1: the collection pipeline. A discriminant drives extraction and filtering of the signal (collection); the stored data is then queried, analyzed, and disseminated.]
In the same way, the specifics of actual use cases would be kept secret
while the rules and the usage controls that enforce them are made public.
This transparency makes the control of usage more credible.
Implementing usage controls in technology also forces those specifying the rules to be much more explicit than if they are providing instructions for human analysts to follow. Today, many of the descriptions of what is and is not allowed are in some respects imprecise and ambiguous.
Such ambiguities can lead to confusion and differing interpretations of
the same rule. Furthermore, automatic controls may reduce the need for
human labor implementing manual controls. Thus, technology may make
the control more reliable and economical as well as more transparent.
It is impossible, however, for technical means to guarantee that information is not misused, because someone with properly authorized access
can always misuse the information they obtain. This is like the "analog hole" in digital media; there are many ways to prevent digital copying,
but when a human views or hears information, that information can be
copied with a camera or sound recorder. Similarly, when an analyst sees
information, he or she can misuse it. Thus misuse can only be deterred by
the threat of punishment. Deterrence requires technical capabilities to
detect access, to identify the (authorized) accessing party, and to audit
records of access to spot suspicious patterns of access.3 In addition, both
3 Note that, to date, the only allegations that information collected in bulk has been used for an unauthorized purpose concern the so-called "LOVEINT" incidents, in which some NSA analysts inappropriately used this data to track the activities of significant others.
manual and automatic controls are primarily aimed at analysts and others
not in positions of authority. Detecting bad behavior by people in positions of authority needs multiple independent audit paths and oversight.
Lastly, it may be true that manual controls can be overridden more
easily than automatic controls, because a technical change is usually
more difficult than a procedural change. Changes are sometimes necessary to fix problems that arise, but whether it is good or bad for changes
to be easily made is a policy judgment.
Manual and automatic methods can control usage in many ways,
including the following:
Constraining the selectors associated with targets to those that are
approved in some way (e.g., analysts may target only those parties for
which they have reasonable and articulable suspicion of involvement
with terrorism);
Limiting the time period for which data are accessible;
Limiting the kinds of algorithms that are applied to data (e.g.,
algorithms that look for patterns, or various statistical techniques); and
Using advanced information technology techniques to limit risk of
disclosure, as described below.
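As a hedged illustration, the kinds of rules listed above might be expressed in machine-checkable form along the following lines; every rule value, field name, and function here is invented for the sketch, not drawn from any deployed system:

```python
from datetime import datetime, timedelta, timezone

# Sketch of automatic usage controls: an approved-selector list (standing in
# for reasonable-and-articulable-suspicion approval), a time-window limit on
# accessible data, and a whitelist of permitted algorithms. All values invented.

APPROVED_SELECTORS = {"selector-123"}
RETENTION = timedelta(days=180)
ALLOWED_ALGORITHMS = {"contact_chain", "statistics"}

def authorize(query, now=None):
    """Return (ok, reason); a query runs only if every rule passes."""
    now = now or datetime.now(timezone.utc)
    if query["selector"] not in APPROVED_SELECTORS:
        return False, "selector lacks required approval"
    if now - query["oldest_data"] > RETENTION:
        return False, "query reaches beyond the retention window"
    if query["algorithm"] not in ALLOWED_ALGORITHMS:
        return False, "algorithm not permitted on this data"
    return True, "authorized"

q = {"selector": "selector-123",
     "oldest_data": datetime.now(timezone.utc) - timedelta(days=30),
     "algorithm": "contact_chain"}
print(authorize(q))  # -> (True, 'authorized')
```

Because the check is code rather than prose, each rule is unambiguous, applied uniformly, and publishable without revealing the classified values plugged into it.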
The bulk of this chapter discusses how to control queries that analysts
make against collected data. Controlling the use of such a large amount
of data is critical, which is why the committee has emphasized it. When
the data are queried, rules are applied about what uses of collected data
are allowed. If a policy decision is made to continue bulk collection, protection of privacy and civil liberties will necessarily rely on these rules.
Once the results of a query are delivered to an analyst, other means
must be used to control proper use of the data between queries and disseminated intelligence reports. These other means must be matched to
what analysts actually do and to the tools they use. This cannot be done
in the same way that queries on the collection database are controlled, for
several reasons:
1. To do their jobs, analysts need flexibility to use the query results in many ways, such as combining them with other data or processing them with other programs, some perhaps written specifically for the current purpose. These uses are much less standardized than the collection database and the ways of querying it, and it is not practical to control them in detail. The reason is that in order to construct software that tracks in detail the way that the inputs of a program affect its outputs, it is first necessary to formalize how the program works. This is usually much more difficult than writing the program in the first place.
[Footnote 3, continued: These involved very few incidents (around a dozen). A letter from NSA to Senator Charles Grassley on September 11, 2013, details these incidents (see https://www.nsa.gov/public_info/press_room/2013/grassley_letter.pdf). According to testimony to the committee on August 23, 2014, by the NSA Director of Compliance, the activities were uncovered through internal investigations.]
2. Analysts share their work in progress with other analysts, so that
even if the queries made by a single analyst return only 50 items, the
queries made by 200 analysts may return 10,000 items altogether, and a
single analyst or systems administrator may end up with all of these items.
3. In some cases, analysts import query results into commercial applications such as a spreadsheet like Excel or a statistical analysis system
like SAS/STAT. It is not practical to modify these applications to track the
way that their inputs affect their outputs, and it is impractical for the IC
to develop its own substitutes.
4. Analysts do their work and store their data on workstations and
servers that run commercial off-the-shelf operating systems because it is
neither economical nor efficient for the IC to build its own operating systems and applications. Furthermore, there are many versions of these
systems in use at any given time, as is normal for any large organization.
It is not practical to use these systems for fine-grained control of data.
It would be naive for the committee to claim that it understands what
happens today, and presumptuous to pretend to design an ideal system
for NSAs use. Furthermore, it is not enough to understand the normal
information flow; possible changes, mistakes, and errors also need to be
dealt with. For instance, something might change that would make yesterday's legitimate query unacceptable today. A target might have become
a non-target, or an error might have been found in the rules governing
queries.
It is possible, however, to have very coarse-grained controls on the
data held by analysts, controls that implement the existing U.S. government information classification system. Indeed, the IC has supported
research on such controls since the 1970s, under the rubric of "multi-level security." More recent academic work calls it "information flow control."
It is quite well understood in theory, and several systems have been built
that enforce the rules for handling classified data. Unfortunately, attempts
to use these systems in practice have been unsuccessful, and almost none
are deployed. Information flow control cannot do the kind of queryspecific control that is described in this chapter; instead, it tends to push
computed outputs to the highest level of classification, which is not useful
in practice. However, it is the best technique known at present.
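The label-propagation behavior just described can be sketched as follows; the level names and the join rule are a simplified illustration of information flow control, not any deployed system:

```python
# Toy sketch of information flow control: each datum carries a classification
# label, and any computed output is labeled with the highest (the "join") of
# its inputs' labels, which is why outputs drift toward the top of the lattice.

LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]

def join(*labels):
    """The output label is the maximum of the input labels."""
    return max(labels, key=LEVELS.index)

def combine(a, b):
    """Combine two labeled values; the result inherits the joined label."""
    value_a, label_a = a
    value_b, label_b = b
    return (value_a + value_b, join(label_a, label_b))

out = combine((1, "UNCLASSIFIED"), (2, "SECRET"))
print(out)  # -> (3, 'SECRET')
```

The sketch makes the limitation visible: one SECRET input is enough to classify the whole output, regardless of how little of it that input influenced.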
a privacy and civil liberties standpoint. None has found any deliberate
attempts to circumvent or defeat these procedures, although there have
been documented incidents of error.
Purely automatic control of usage would mean that the rules would
be enforced automatically using published mechanisms. Then people
outside the IC concerned about privacy and civil liberties would not have
to trust that the IC has adequate procedures and follows them, which
many of them are reluctant to do. Such purity is not possible, however; it is thus necessary to independently audit the IC's procedures to some
extent. The impractical alternative is to make every step on the path from
raw data to query results secure from any possible tampering; this would
be a rigid and unworkable system. Some manual controls are necessary
to ensure that the automatic controls are actually imposed and that they
are configured according to the rules, and to decide cases that are too
complex to be automated.
Thus, the goal of reassuring the public by the exclusive use of transparent automatic controls is elusive. Those who do not trust the power
of government, both its elected officials and the IC, will argue that its
technical expertise could be misused to override automatic controls, and
no amount of manual or automatic oversight is likely to reassure them. In
short, perfect controls are impossible. The goal should be to balance controls against practicality, recognizing that some amount of risk, tempered
by trust in those who manage the system, will always remain.
5.4 AUTOMATIC CONTROLS
A technical system for controlling usage of bulk data has three parts:
isolating the bulk data so that it can only be accessed in specific ways,
restricting the queries that can be made against it, and auditing the queries
that have been done. All three parts are equally important, although isolation is most fully developed and hence has the fullest description, and
auditing is the least developed. This chapter gives brief descriptions of
each of these parts. It emphasizes the architecture of the possible systems,
giving only a sketch of the technical details; consult the references for the
full story.
Note that any technical mechanism must be tested under realistic
conditions to establish confidence that it actually works. This is especially
important for mechanisms that are intended to handle rare events, like the
ones described here. The only practical way to do this is to deliberately
inject disallowed queries into the running system and verify that they are
detected and handled correctly.
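Such injection testing might look like the following sketch, in which the guard, the approved-target list, and the audit log are all hypothetical stand-ins for the real mechanisms:

```python
# Sketch of verifying controls by injecting a known-disallowed query and
# checking that it is both rejected and recorded. Everything here is invented
# for illustration; a real system would inject into the production pipeline.

audit_log = []

def guard(query, approved=frozenset({"target-A"})):
    """Allow only approved queries, recording every decision."""
    allowed = query in approved
    audit_log.append((query, allowed))
    return allowed

def self_test():
    """Inject a canary query that must be rejected and must appear in the log."""
    canary = "canary-disallowed-target"
    rejected = not guard(canary)
    logged = any(q == canary and not ok for q, ok in audit_log)
    return rejected and logged

print(self_test())  # -> True
```

A canary that slips through, or that fails to appear in the log, is direct evidence that the control or the audit trail is broken, which is exactly the rare-event failure the text warns about.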
Some of the methods described here are in widespread use commercially, and perhaps within NSA. Others have been demonstrated in the
[Figure 5.2: analysts submit queries through a guard that enforces policy; the bulk data sits behind an isolation boundary, and all accesses are recorded in an audit log.]
Dirk A.D. Smith, "Exclusive: Inside the NSA's private cloud," Network World, September 29, 2014, http://www.networkworld.com/article/2687084/security0/exclusive-inside-the-nsa-s-private-cloud.html.
9 This is the process used in wiretap investigations authorized by the 1986 Pen Register
Act (Title III of the Electronic Communications Privacy Act).
[Figure 5.3: an aggregator distributes an analyst's query across several isolated bulk data stores, each behind its own guard, policy, and audit log.]
While an air gap is a very good isolation boundary, the isolation also depends on the guard that is supposed to check all inputs, as discussed below.10
Hypervisor. A cheaper host is a hypervisor that implements separate virtual machines instead of separate physical ones. Currently, the hypervisor is part of the trusted computing base (TCB), and, unfortunately, commercial hypervisors are rather complicated because their main selling point is performance rather than security. But this approach is cheaper than the air gap because there is only one
physical machine, and the bandwidth of communication between the
virtual machines can be close to the full memory bandwidth. There are
many variations on the hypervisor idea, with different costs and security
considerations.11
Enclaves. In between separate physical machines and separate virtual machines is a fairly new way of doing isolation, called an enclave in the implementation developed by Intel. This is like a virtual machine, but its isolation is provided directly by the central processing unit (CPU).
Because this mechanism is tightly integrated into the CPU and the memory system, it can provide good performance much more simply than a
hypervisor.12
Language virtual machines. Programs written in languages intended
for web pages, such as Java and JavaScript, are usually executed inside
isolation boundaries with names like Java Virtual Machine. In this case,
the main purpose of the isolation is to protect the rest of the system from
the untrusted web program rather than the other way around.
5.4.1.3 The Guard
If the isolation mechanism is sound, the guard is the main weak point;
if the guard makes the wrong decisions about what to allow through, the
system inside the isolation boundary can be completely compromised,
and this has happened many times in practice with every kind of isolation boundary, including airgaps. For example, executable malware
included in an email message can infect an isolated system. The same
thing can happen with a USB flash drive, which can contain malware that
is executed automatically. The guard needs to block all executable content
that has not been properly vetted. As with every aspect of security, the
10 Although the air gap is a good example of an isolation technique, any of the technology alternatives listed in this subsection can be engineered to provide adequate isolation for this application.
11 M. Pearce, S. Zeadally, and R. Hunt, Virtualization: Issues, security threats, and solutions, ACM Computing Surveys 45(2), Article No. 17, 2013.
12 F. McKeen, I. Alexandrovich, A. Berenzon, C. Rozas, H. Shafi, V. Shanbhogue, and
U. Savagaonkar, Innovative instructions and software model for isolated execution, in Proceedings of the Second International Workshop on Hardware and Architectural Support for Security
and Privacy, Association of Computing Machinery, New York, N.Y., 2013.
71
only practical approach today is to keep both the specification of what the
guard has to do and the code that does it as simple as possible.
If each item of bulk data is tagged with access control information
that specifies which analysts are allowed to see it, the job of the guard
is easier. The Apache Accumulo open-source database, for example, has
this feature; it was originally developed by NSA, which transferred it
to Apache, an organization that develops open-source software for the
Internet. This kind of tagging is the standard way of doing access control in computer security; it is helpful for controlling usage of collected data, but not sufficient for enforcing a rule such as "trace contacts for at most two hops," which restricts the algorithm that processes the data rather than access to the data itself.
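A minimal sketch of cell-level tagging in the spirit of Accumulo's visibility labels follows. The data layout and the subset rule are simplifications invented for illustration, not Accumulo's actual API (whose labels are boolean expressions over authorization tokens):

```python
# Sketch of cell-level access control: each cell carries a visibility label,
# and the guard returns only cells whose label is satisfied by the analyst's
# authorizations. Here a cell is visible if the analyst holds every token in
# its label; the cells and tokens are hypothetical.

CELLS = [
    {"row": "id-1", "value": "call record", "visibility": {"SIGINT"}},
    {"row": "id-2", "value": "report",      "visibility": {"SIGINT", "CT"}},
]

def visible(cell, authorizations):
    """A cell is visible when its label is a subset of the analyst's tokens."""
    return cell["visibility"] <= authorizations

def query(cells, authorizations):
    return [c["row"] for c in cells if visible(c, authorizations)]

print(query(CELLS, {"SIGINT"}))        # -> ['id-1']
print(query(CELLS, {"SIGINT", "CT"}))  # -> ['id-1', 'id-2']
```

As the text notes, this controls who may see each cell, but it cannot express a rule about the algorithm applied to the cells, such as a two-hop limit on contact chaining.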
5.4.1.4 Bulk Data Processing
In general, there are a lot of bulk data, so that simply storing the
bits securely and reliably is complex, and the data are processed by a
general-purpose database system that is even more complex, usually tens
of millions of lines of code. Much of this code might not be needed for
a particular application, but it is likely to be impractical to separate the
parts that are needed from the rest. Thus, it is highly desirable to keep
as much of this storage and processing out of the TCB, which should be
small and simple.
There has been a lot of work on isolation to protect a cloud client from
its cloud service provider, because there is a big market for cloud computing, and many customers care about the security of their data and do not
want to trust the service provider. This is the most important application
for the enclaves described above. Figure 5.4 shows another way for a
client to store and process data in the cloud without trusting the cloud
provider. The idea is to do everything in the cloud in encrypted form, so
that the result appears in encrypted form as well. Only the client holds the
key, so only the client can see anything about the data or the result except
its size, and perhaps something about the shape of the query. This gives
no guarantee that the result is correct or that it reads only the data actually
needed for the query, but it does guarantee that only the client sees any
data, and since the client is entitled to see all the data, the clients secrecy
is maintained. It is not obvious how to actually implement this scheme,
but in some cases it is possible, and ways to do it are explained below.
Unfortunately, although the architecture shown in Figure 5.4 serves
the needs of the cloud client, it is not enough for automatic control of
access to bulk data. Unlike the cloud client, the analyst is not entitled to
see all of the data. How can the guard enforce the policy about what the
analyst is allowed to see? This takes a proof, or perhaps some convincing
FIGURE 5.4 Smaller trusted computing base by processing encrypted data for a client.
evidence, provided by the untrusted cloud side of the picture, that the
result is correct or at least that it does not reveal any more data than
the query demands. Figure 5.5 illustrates this approach; the parts that are
unchanged from Figure 5.2 are dimmed.
What would such a proof look like? That depends on the query. For example, if the query is "Return all the endpoints of communications with this target," a proof would be a list of all the database entries that yielded the result; recall that these are all encrypted, so the untrusted side cannot make them up. If the target is X, and each database entry represents
a call detail record with a triple <from, to, time>, verifying the proof
means checking that every result endpoint Y is in an entry <X, Y, time>
or <Y, X, time>. Note that this does not prove that the result is correct, but
it does prove that no extra information is disclosed. For another example,
see the next section.
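The verification step just described can be sketched directly, with plaintext tuples standing in for the encrypted call detail records (decryption is omitted from the sketch):

```python
# Verify the proof for the query "return all endpoints of communications with
# target X": every returned endpoint Y must appear in a supplied entry
# <X, Y, time> or <Y, X, time>. Passing shows no extra information was
# disclosed; as the text notes, it does NOT show the result is complete.

def verify(target, result_endpoints, proof_entries):
    for y in result_endpoints:
        justified = any(
            (frm == target and to == y) or (frm == y and to == target)
            for frm, to, _time in proof_entries
        )
        if not justified:
            return False  # an endpoint with no supporting entry
    return True

entries = [("X", "Y1", 100), ("Y2", "X", 105), ("A", "B", 110)]
print(verify("X", ["Y1", "Y2"], entries))  # -> True: both endpoints justified
print(verify("X", ["B"], entries))         # -> False: B never communicated with X
```

The asymmetry in the sketch mirrors the text: a forged extra endpoint is caught, but an omitted endpoint is not, so the guarantee is confidentiality of undisclosed data, not correctness of the answer.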
5.4.1.5 Encrypted Data at Rest
The simplest example of the idea in Figure 5.5 uses the untrusted side
only to store data, not to do any computing on it, as shown in Figure 5.6
(where the unchanging left side of the figure has been cut off). This means that each data block is encrypted before being handed over to untrusted storage by the collection system and decrypted when it is read back.
[Figure 5.5: on the trusted side, analysts, guard, policy, trusted processing, and audit log operate as in conventional NSA systems; queries go to untrusted storage holding encrypted bulk data, which returns a result plus a proof that the trusted computing base verifies and decrypts.]
FIGURE 5.6 Smaller trusted computing base by encrypting bulk data at rest.
Ken Beer and Ryan Holland, Securing Data at Rest with Encryption, Amazon Web Services white paper, November 2013, http://media.amazonwebservices.com/AWS_Securing_Data_at_Rest_with_Encryption.pdf.
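The encrypt-before-store idea of Figure 5.6 can be sketched as follows. The XOR keystream below is a toy stand-in for a real authenticated cipher such as AES-GCM, and a real design would derive separate encryption and MAC keys; this illustrates only the structure (seal on write, verify and decrypt on read):

```python
import hashlib
import hmac
import os

# Sketch of encrypted data at rest: each block is encrypted and MAC-protected
# before being handed to untrusted storage, then verified and decrypted on
# read-back. The SHA-256 counter keystream is a toy illustration only; do not
# use this construction for real protection.

KEY = os.urandom(32)  # held only on the trusted side

def keystream(key, nonce, length):
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(block):
    """Encrypt-then-MAC; the return value is all that untrusted storage sees."""
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(block, keystream(KEY, nonce, len(block))))
    mac = hmac.new(KEY, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + mac

def open_block(sealed):
    """Verify the MAC before decrypting; reject any tampered block."""
    nonce, ct, mac = sealed[:16], sealed[16:-32], sealed[-32:]
    expected = hmac.new(KEY, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("block tampered with in untrusted storage")
    return bytes(a ^ b for a, b in zip(ct, keystream(KEY, nonce, len(ct))))

stored = seal(b"call detail record")
print(open_block(stored))  # -> b'call detail record'
```

Because only the trusted side holds the key, the storage provider learns nothing but block sizes, and any modification of a stored block is detected on read.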
[Figure: a variant using simulated homomorphic cryptography. Conventional untrusted processing operates on encrypted bulk data, with a simple trusted processing step that decrypts and re-encrypts intermediate values inside the trusted computing base, alongside the guard, policy, trusted processing, and audit log of conventional NSA systems.]
Doing this manually is feasible and is, indeed, NSA's current practice. Although it is thorough, it is expensive and not transparent: outsiders must rely on the agency's assurance that it is being done properly, because the queries are usually highly classified. Automation of auditing, a direction NSA is pursuing, could both streamline audits and provide assurance to outside inspectors, who can then examine the auditing technology.
The resulting ability to inspect the privacy-protecting mechanisms
of the SIGINT process on an unclassified basis may help allay privacy
and civil liberty concerns. The inspection would focus on the automation
software and the usage rules it enforces, rather than on the data, which
must remain classified.
Automation of auditing is an area that has been largely neglected by government, industry, and academia; for example, operating systems write voluminous logs of security-relevant events, but these logs are seldom examined, and when they are, a great deal of manual effort is required. Chapter 6 discusses some possible improvements.
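One simple form such automated log review might take is a scan for anomalous access patterns; the record format, pattern, and threshold here are invented for illustration:

```python
# Sketch of automated audit-log review: scan access records and flag analysts
# whose behavior warrants a human look, e.g. querying an unusually large
# number of distinct selectors. The threshold is a hypothetical placeholder.

THRESHOLD = 3  # maximum distinct selectors before a human reviews the activity

def flag_suspicious(log):
    """log: iterable of (analyst, selector) access records.
    Returns the analysts who queried more distinct selectors than allowed."""
    per_analyst = {}
    for analyst, selector in log:
        per_analyst.setdefault(analyst, set()).add(selector)
    return [a for a, sels in per_analyst.items() if len(sels) > THRESHOLD]

log = [("alice", f"sel-{i}") for i in range(5)] + [("bob", "sel-1")]
print(flag_suspicious(log))  # -> ['alice']
```

The point of automating even a crude rule like this is that the rule itself, unlike the classified log contents, can be published and inspected by outsiders.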
5.5 CONCLUSION
This chapter has reviewed a variety of feasible mechanisms, both
manual and automatic, for controlling the way that collected data is used.
Some of these are deployed in the IC. Others may be deployed, but the
committee was not told about them in briefings. All of these mechanisms
are feasible to deploy within the next 5 years. Opportunities to introduce
enhancements to such capabilities are expected to arise as the information technology systems used for collection and analysis are refreshed
and modernized.
Automation of usage controls may simultaneously allow a more
nuanced set of usage rules, facilitate compliance auditing, and reduce
the burden of controls on analysts. Similarly, there are opportunities to
automate the various audit mechanisms to verify that rules are followed.
These techniques may permit more of the use controls and audit mechanisms to be explained clearly to the public. It may be possible to express a large fraction of the rules required by law and policy in a machine-processable form that can be rapidly and consistently applied during collection, analysis, and dissemination.
Conclusion 2. Automatic controls on the usage of data collected in
bulk can help to enforce privacy protections.
Conclusion 2.1. It will be easier to automate controls if the rules governing collection and use are technology-neutral (i.e., not tied to specific, rapidly changing information and communications technologies).
6
Looking to the Future
Closed-circuit TV surveillance, a form of bulk collection, has been practiced for years
with relatively little complaint, despite its privacy invasion.
that many more details of everyday life are recorded in this way. However, businesses that wish to minimize surveillance of their customers
can arrange to reduce or eliminate the intelligence value of their records.
For example, if a telephone company bills a flat monthly rate, it need
not keep a record of each call, so no call data records would be available
for intelligence purposes.2 Communications providers today are acutely aware of their customers' concerns about surveillance,3 a fact that gives providers an additional incentive to refrain from keeping records that might be used against them.
Services that hold data for customers may find ways to encrypt the
data with a key known only to the customer so as to evade surveillance.
This technique could be used by email providers and social-networking
services, among others. Some businesses are being established with
exactly this objective. But today, the ability to examine customer data and use it for marketing purposes is an essential part of the hosting company's business model, so customers are unlikely to have email that is both free and surveillance-proof.
Attempts to evade surveillance are unlikely to slow the big data trend.
Businesses collect huge amounts of data not associated with individuals,
which may not cause privacy concerns, and are sure to collect still more.
Some of this data has a large public benefit, such as for weather prediction, crop management, or public health monitoring. Businesses may
implement different levels of protection for different business records, so that customer-sensitive data is not commingled with data that has benign uses, both public and private.
6.1.3 Encryption
One of the most imminent threats to SIGINT collection is the increasing use of strong encryption for signals in transmission. Website servers increasingly encrypt traffic to and from browser clients as a matter of routine. To a lesser extent, data at rest is being encrypted. The cybersecurity vulnerabilities of the endpoints (browser, server) are becoming much greater than the vulnerability of the communications between them, a point suggesting that access may still be possible (although more difficult) even when transmission links are encrypted.
2 Other business records of such a company, however, linking customer name, address, and telephone number, might still be very valuable for intelligence purposes.
3 See, for example, Vodafone Group, Law Enforcement Disclosure Report, 2014, http://
www.vodafone.com/content/sustainabilityreport/2014/index/operating_responsibly/
privacy_and_security/law_enforcement.html, accessed January 16, 2015.
Assurance Research (SPAR) program, which addressed topics of particular relevance to implementing secure SIGINT systems of the sort described in Chapter 5.
This section does not delve into the many technologies that NSA and
other IC organizations use to operate large, complex IT operations. It does
not cover network security, operating-system security, physical security of
computer systems, authentication of users, or a host of other areas that are
part of making SIGINT technologies trustworthy. Research in these and
other areas that affect the general state of complex IT will help the IC too.
6.3.1 Technologies for Isolation
The approaches described in Section 5.4 are not in widespread use,
but they are not unexplored either. Their successful use will depend not only on choosing a sound architecture but also on developing a careful implementation: the trustworthiness of key components depends on
keeping them simple to avoid mistakes that lead to vulnerabilities. And
system-wide properties, such as security, will depend on many details,
such as managing cryptographic keys properly, distributing them securely,
changing them occasionally, ensuring that no single system administrator
can penetrate security, and so on. These are not simple systems to engineer and operate.
Variants of the systems described in Chapter 5 often involve executing separate components on separate computers (often under control of
separate organizations) and protecting the communications among the
components. Techniques for doing this, usually based on encryption, are
the topic of a research area dubbed secure multi-party computation,
which was investigated by the IARPA SPAR program. For example, recent
research shows how to protect data and communications in a three-part
system: one issues queries, a second authorizes queries, and a third holds
data and performs searches specified by authorized queries.7
6.3.2 Other Technologies for Protecting Data Privacy
Although the focus of this report is signals intelligence that provides
data about individual people and groups, signals intelligence can also
8 See PCAST, Big Data, 2014, p. 38. A good view of anonymization and reidentification is in
Sections 3 and 4 of Opinion 05/2014 on Anonymisation Techniques (European Commission, Article 29 Data Protection Working Party, adopted April 10, 2014, http://ec.europa.eu/
justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/
wp216_en.pdf).
9 Cynthia Dwork and Aaron Roth, The Algorithmic Foundations of Differential Privacy, Now
Publishers, Boston, Mass., 2014.
10 C. Task and C. Clifton, A guide to differential privacy theory in social network analysis,
pp. 411-417 in Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social
Networks Analysis and Mining (ASONAM), IEEE Computer Society, Washington, D.C.
NSA has developed such a database and donated it to the Apache open-source community. Accumulo is a scalable key/value store that allows access labels to be attached to
each cell, enabling low-level query authorization checks (Apache Software Foundation,
Apache Accumulo, https://accumulo.apache.org/, accessed January 16, 2015).
12 A.C. Myers and B. Liskov, A decentralized model for information flow control, pp. 129-142 in Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP), 1997,
Association for Computing Machinery, New York, N.Y.
suspicious patterns, filter out the great majority of queries that do not
raise any issues or that were vetted by automatic query approval, and
present the remainder for manual review.
Automating the audit or oversight process has much in common
with automating query authorization. Because there is a lot of audit data,
machine learning can also play a role, although it would probably require
introducing a lot of synthetic misbehavior (that is, deliberately introduced
misbehavior) to get enough true positives into the training set.
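The role of deliberately injected misbehavior can be sketched with a simple statistical detector. Everything here is synthetic and illustrative (analyst names, the per-day selector counts, and the 4-standard-deviation threshold are all assumptions); a real system would use far richer features and learned models.

```python
import random, statistics

random.seed(0)
# Synthetic audit log: (analyst, number of distinct selectors queried per day).
log = [("analyst%02d" % i, random.randint(3, 8)) for i in range(50)]
# Deliberately injected misbehavior, to supply a true positive for evaluation.
log.append(("synthetic_rogue", 40))

counts = [n for _, n in log]
mean, stdev = statistics.mean(counts), statistics.stdev(counts)

# Flag analysts far above the norm for manual review; vet the rest automatically.
flagged = [who for who, n in log if n > mean + 4 * stdev]
assert flagged == ["synthetic_rogue"]
```

The point of the injection step is visible even in this toy: without the synthetic record, the training and evaluation data would contain no positives at all, and no detector could be tuned or validated against it.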
6.3.5 Formal Expression of Laws and Regulations
If it were possible to express the laws, policies, and rules governing
SIGINT in a machine-understandable form, it might be possible to generate tools that do automatic approval and oversight for a portion of the
queries. One approach would be to develop formal policy languages to
represent the precise meanings of policies. These could serve as an intermediate language between the output of lawyers and the technological
control of processes and computer programs. The process of formulating
them would likely reveal many anomalies, ranging from ambiguities to
misinterpretations to inconsistencies. NSA reported that it had looked into
deontic description logic for this purpose. To the extent that the field of
computational law thrives, its results would be relevant. Projects around
this area would seem to be an ideal unclassified research topic, appropriate for an interdisciplinary team of experts in law, policy, and computer
science.
Basing automation on formal definitions has another advantage: if the
rules must change, the automation will change as a direct consequence.
Formal rule expressions will change due to new laws, policies, and regulations, or to adapt to emergencies. Of course, the rule expressions and the process for changing them must be controlled carefully to
ensure compliance with the governing documents.
Advances in this area might lead outside organizations to gain confidence that the rules for handling personal data are being followed. If
these techniques are not being used today, how might they be applied to
reassure overseers that what they see is a full report of what happened?
Can zero-knowledge proofs be used in some way to reassure members
of the public who wish to monitor operations? Are there general ways of
scanning logs and reliably picking out transactions that need to be looked
at? Cybersecurity defense tries to do this, but even with specialized logs,
it is an incompletely solved problem.
do not have to be collected. As the number of data sources grows, especially from public information, it may become important to routinely
assess the value of these sources. And such analysis would provide, at
least in classified form to the IC, an answer to a question that Presidential
Policy Directive 28,16 in effect asks: How valuable is bulk collection of
domestic telephone metadata?
6.4 ENGAGEMENT WITH THE RESEARCH COMMUNITY
As the committee did its work, it noted an evolving relationship
between NSA and the academic research community on problems such as
those addressed in this report. For many years, NSA has formally funded
unclassified, basic research in mathematics (algebra, number theory, discrete mathematics, probability, and statistics) in the United States in its
Mathematical Sciences Program.17 According to NSA, this program was
initiated in response to a need to support mathematics research in the
United States and recognizes the benefits both to academia and NSA
accruing through a vigorous relationship with the academic community.
Further developing a similarly vigorous and sustained relationship
between NSA and the academic computer science community could have
similar benefits. Mechanisms would have to be found to translate classified problems into unclassified ones that researchers could tackle without
being subject to security review; doing so would improve the coupling of
the research mission with the operational mission. The IC has two mechanisms that help bridge the classification chasm. IARPA funds research
relevant to the IC, some of which targets the future of SIGINT. Many of
its research programs are predominantly unclassified, and it is working
to develop unclassified proxies for research problems of more direct
applicability to the IC. The firm In-Q-Tel acts somewhat like a venture
fund for innovative technology potentially useful to the IC, supporting
commercially viable technologies that might serve IC needs. Both appear
to be effective, but their structures and policies are not primarily intended
to build long-term and vigorous relationships with academic disciplines.
Bridging the chasm would benefit both communities.
Even in a report that was intended to address primarily technical
issues, the committee found it necessary to engage with a number of
legal and policy issues. This point underscores the fact that it is often
16 The White House, Presidential Policy Directive/PPD-28, Signals Intelligence Activities, Office of the Press Secretary, January 17, 2014, http://www.whitehouse.gov/sites/
default/files/docs/2014sigint_mem_ppd_rel.pdf.
17 National Security Agency, Mathematical Sciences Program, last modified August 30,
2013, https://www.nsa.gov/research/math_research/.
the voice signal, resulting in an ultimate collection decision. Richer targeting may require enhancing the ability of collection hardware and software
to apply complex discriminants to real-time signals feeds.
Conclusion 3.2. More powerful automation could improve the
precision, robustness, efficiency, and transparency of the controls,
while also reducing the burden of controls on analysts.
Some of the necessary technologies exist today, although they may
need further development for use in intelligence applications; others will
require research and development work. This approach and others for
privacy protection of data held by the private sector can be exploited by
the IC. Research could also advance the ability to systematically encode
laws, regulations, and policies in a machine-processable form that would
directly configure the rule automation.
Appendixes
A
Observations about the
Charge to the Committee
B
Acronyms
CDR
CIA
FISA
FISC
http
IARPA
IC
IP
ISP
IT
MAC
NSA
ODNI
PPD
RAS
SIGINT    signals intelligence
SMTP      Simple Mail Transport Protocol
SPAR      Security and Privacy Assurance Research
SQL       Structured Query Language
TCB
TCP
USB
USSID
VoIP
WMD
C
Biographical Information for Committee
Members, Consultants, and Staff
COMMITTEE
ROBERT F. SPROULL, Chair, is an adjunct professor of computer science
at the University of Massachusetts, Amherst. Dr. Sproull retired in 2011
as vice president and director of Oracle Labs, an applied research group
that originated at Sun Microsystems (acquired by Oracle in 2010). Before
joining Sun in 1990, he was a principal with Sutherland, Sproull, and
Associates; an associate professor at Carnegie Mellon University; and a
member of the Xerox Palo Alto Research Center. He has served as chair
of the National Research Council's (NRC's) Computer Science and Telecommunications Board (CSTB) since 2009. He is also on the Computing
Community Consortium (CCC) Council. In June, Dr. Sproull completed
a 6-year term on the National Academy of Engineering (NAE) Council.
He is a member of the NAE and a fellow of the American Association for
the Advancement of Science (AAAS) and the American Academy of Arts
and Sciences. Dr. Sproull received his M.S. and Ph.D. in computer science
from Stanford University and an A.B. in physics from Harvard College.
FREDERICK R. CHANG is the director of the Darwin Deason Institute
for Cyber Security, the Bobby B. Lyle Endowed Centennial Distinguished
Chair in Cyber Security, and a professor in the Department of Computer
Science and Engineering in Southern Methodist University's (SMU's)
Lyle School of Engineering. Dr. Chang is also a senior fellow in the John
Goodwin Tower Center for Political Studies in SMU's Dedman College.
He has been professor and AT&T Distinguished Chair in Infrastructure
guished engineer at Sun Microsystems, a faculty member at the University of Massachusetts, Amherst, and at Wesleyan University. She has held
visiting positions at Harvard University, Cornell University, and Yale
University, and the Mathematical Sciences Research Institute. Dr. Landau
is the author of Surveillance or Security? The Risks Posed by New Wiretapping
Technologies (2011) and co-author, with Whitfield Diffie, of Privacy on the
Line: The Politics of Wiretapping and Encryption (1998, rev. ed. 2007). She
has written numerous computer science and public policy papers and
op-eds on cybersecurity and encryption policy and testified in Congress
on the security risks of wiretapping and on cybersecurity activities at the
National Institute of Standards and Technology's Information Technology Laboratory. Dr. Landau currently serves on the NRC's CSTB. A 2012
Guggenheim fellow, she was a 2010-2011 fellow at the Radcliffe Institute
for Advanced Study, the recipient of the 2008 Women of Vision Social
Impact Award, and also a fellow of AAAS and ACM. She received her
B.A. from Princeton University, her M.S. from Cornell University, and
her Ph.D. from MIT.
MICHAEL E. LEITER is executive vice president for business development, strategy, and mergers and acquisitions at Leidos. Prior to taking
on his current role at Leidos, Mr. Leiter was a senior counselor at Palantir
Technologies. Before that, he was the director of the National Counterterrorism Center (NCTC). He was sworn in as the Director of NCTC on
June 12, 2008, upon his confirmation by the U.S. Senate and after serving as the acting director since November 2007. Before joining NCTC,
Mr. Leiter served as the deputy chief of staff for the Office of the Director of
National Intelligence (ODNI). In this role, he assisted in the establishment
of the ODNI and coordinated all internal and external operations for the
ODNI, to include relationships with the White House, the Departments
of Defense, State, Justice, and Homeland Security, the Central Intelligence
Agency, and the Congress. He was also involved in the development of
national intelligence centers, including NCTC and the National Counter
proliferation Center, and their integration into the larger Intelligence Community. In addition, Mr. Leiter served as an intelligence and policy advisor
to the Director and the Principal Deputy Director of National Intelligence.
Prior to his service with the ODNI, Mr. Leiter served as the deputy general counsel and assistant director of the President's Commission on the
Intelligence Capabilities of the United States Regarding Weapons of Mass
Destruction (the Robb-Silberman Commission). While with the Robb-Silberman Commission, Mr. Leiter focused on reforms of the U.S. Intelligence Community, in particular the development of what is now the
National Security Branch of the Federal Bureau of Investigation. From
2002 until 2005, he served with the Department of Justice as an Assistant
United States Attorney for the Eastern District of Virginia. At the Justice
Department, Mr. Leiter prosecuted a variety of federal crimes, including
narcotics offenses, organized crime and racketeering, capital murder, and
money laundering. Immediately prior to his Justice Department service, he
served as a law clerk to Associate Justice Stephen G. Breyer of the Supreme
Court of the United States and to Chief Judge Michael Boudin of the U.S.
Court of Appeals for the First Circuit. From 1991 until 1997, he served as a
Naval Flight Officer flying EA-6B Prowlers in the U.S. Navy, participating
in U.S., NATO, and United Nations operations in the former Yugoslavia
and Iraq. Mr. Leiter received his J.D. from Harvard Law School, where he
graduated magna cum laude and was president of the Harvard Law Review,
and his B.A. from Columbia University.
ELIZABETH RINDSKOPF PARKER is dean emerita at the University of
the Pacific, McGeorge School of Law. A noted expert on national security
law and terrorism, Ms. Parker served 11 years in key federal government
positions, most notably as general counsel for NSA; principal deputy
legal adviser, Department of State; and general counsel for the Central
Intelligence Agency. In private practice, she has advised clients on public
policy and international trade issues, particularly in the areas of encryption and advanced technology. Ms. Parker began her career as a Reginald
Heber Smith Fellow at Emory University School of Law and later served
as the director, New Haven Legal Assistance Association, Inc. Early in her
career, she was active in litigating civil rights and civil liberties matters,
with two successful arguments before the U.S. Supreme Court while a
cooperating attorney for the NAACP Legal Defense and Education Fund.
Immediately before her arrival at McGeorge, Ms. Parker served as general
counsel for the 26-campus University of Wisconsin System. She is a member of the Security Advisory Group of the DNI, the board of directors of
the MITRE Corporation, the American Bar Foundation, and the Council
on Foreign Relations, and she is a frequent speaker and lecturer. Her academic background includes teaching at Pacific McGeorge, Case Western
Reserve Law School, and Cleveland-Marshall State School of Law. From
2006 to 2013, she held a presidential appointment to the Public Interest
Declassification Board. Ms. Parker received her B.A. and J.D. from the
University of Michigan.
PETER J. WEINBERGER has been a software engineer at Google, Inc.,
since 2003. After teaching mathematics at the University of Michigan,
Ann Arbor, he moved to Bell Laboratories. At Bell Labs, he worked on
Unix and did research on topics including operating systems, compilers,
network file systems, and security. He then moved into research management, ending up as Information Sciences Research vice president, respon-
sible for computer science research, math and statistics, and speech. His
organization included productive new initiatives, one using all call detail
to detect fraud and another doing applied software engineering research
to support building software for the main electronic switching systems
for central offices. After Lucent and AT&T split, he moved to Renaissance Technologies, a technical trading hedge fund, as head of technology, responsible for computing and security. He is a former member of
the NRC's CSTB, current co-chair of an NRC committee on cybersecurity
research, and served on several other NRC studies. He serves in a variety
of other advisory roles related to science, technology, and national security. He has a Ph.D. in mathematics (number theory) from the University
of California, Berkeley.
CONSULTANTS
M. ANTHONY FAINBERG became a research staff member at the Institute
for Defense Analyses, where he focuses on risk assessment methodologies,
countering nuclear terrorism, and nuclear non-proliferation issues, upon
retiring from federal service after 20 years. At retirement, Dr. Fainberg
was director of the Office of Transformational Research and Development
of the Domestic Nuclear Detection Office of the Department of Homeland
Security. Previously, he had been division chief at the Advanced Systems
and Concepts Office, Defense Threat Reduction Agency, Department of
Defense; before that, he directed the Office of Policy and Planning for Aviation Security in the Federal Aviation Administration. He also is a senior scientific advisor to the Pacific Basin Development Council, an organization
comprising the governors of the U.S. Pacific island territories and Hawaii.
He holds a Ph.D. in physics.
ALLAN FRIEDMAN is a research scientist at the Cyber Security Policy
Research Institute (CSPRI) in the School of Engineering and Applied Sciences at George Washington University, where he works on cybersecurity
policy. Wearing the hats of both a technologist and a policy scholar, his
work spans computer science, public policy, and the social sciences, and
has addressed a wide range of policy issues, from privacy to telecommunications. Dr. Friedman has over a decade of experience in cybersecurity
research, with a particular focus on economic, market, and trade issues.
He is the coauthor of Cybersecurity and Cyberwar: What Everyone Needs to
Know (2014). Prior to joining CSPRI, Dr. Friedman was a fellow at the
Brookings Institution and the research director for the Center for Technology Innovation. Before moving to Washington, D.C., he was a postdoctoral
fellow at the Harvard University Computer Science Department, where
he worked on cybersecurity policy, privacy-enhancing technologies, and