Introducing Accountability to Anonymity Networks

Michael Backes, Jeremy Clark, Peter Druschel, Aniket Kate, and Milivoj Simeonovski

Saarland University, Germany
{backes,simeonovski}@cs.uni-saarland.de

MMCI, Saarland University, Germany
aniket@mmci.uni-saarland.de

Concordia University, Canada
clark@ciise.concordia.ca

MPI-SWS, Germany
druschel@mpi-sws.org
Abstract: Many anonymous communication (AC) networks rely on routing traffic through proxy nodes to obfuscate the originator of the traffic. Without an accountability mechanism, exit proxy nodes risk sanctions by law enforcement if users commit illegal actions through the AC network. We present BACKREF, a generic mechanism for AC networks that provides practical repudiation for the proxy nodes by tracing back the selected outbound traffic to the predecessor node (but not in the forward direction) through a cryptographically verifiable chain. It also provides an option for full (or partial) traceability back to the entry node or even to the corresponding user when all intermediate nodes are cooperating. Moreover, to maintain a good balance between anonymity and accountability, the protocol incorporates whitelist directories at exit proxy nodes. BACKREF offers improved deployability over the related work, and introduces a novel concept of pseudonymous signatures that may be of independent interest.

We exemplify the utility of BACKREF by integrating it into the onion routing (OR) protocol, and examine its deployability by considering several system-level aspects. We also present the security definitions for the BACKREF system (namely, anonymity, backward traceability, no forward traceability, and no false accusation) and conduct a formal security analysis of the OR protocol with BACKREF using ProVerif, an automated cryptographic protocol verifier, establishing the aforementioned security properties against a strong adversarial model.
I. INTRODUCTION
Anonymous communication networks are designed to hide the originator of each message within a larger set of users. In some systems, like DC-Nets [1] and Dissent [2], the message emerges from aggregating all participants' messages. In other systems, like onion routing [3], mix networks [4], and peer-to-peer anonymous communication networks [5], messages are routed through volunteer nodes that act as privacy-preserving proxies for the users' messages. We call this latter class proxy-based anonymous communication (AC) networks and concentrate on it henceforth.
Proxy-based AC networks provide a powerful service to their users, and correspondingly they have been the most successful AC networks so far [6], [7]. However, the very properties of the technology can sometimes be harmful for the nodes serving as proxies. If a network user's online communication results in a criminal investigation or a cause of action, the last entity to forward the traffic may become embroiled in the proceedings [8], [9], [10], whether as the suspect/defendant or as a third party with evidence. While repudiation in the form of partial or full traceability has never been a component of any widely-deployed AC network, it may become the case that new anonymity networks, or a changing political climate, initiate an interest in providing a verifiable trace to users who misuse anonymity networks according to laws or terms of service. While several proposals [11], [12], [13], [14], [15], [16], [17] have been made to tackle or at least mitigate this problem under the umbrella term of accountable anonymity, as we discuss in the next section some of them are broken, while others are not scalable enough for deployment in low-latency AC networks.
Contributions. In this work, we design BACKREF, a novel practical repudiation mechanism for anonymous communication, which has advantages in terms of deployability and efficiency over the literature. To assist in the design of BACKREF, we propose a concept of pseudonymous signatures, which employ the pseudonyms (or half Diffie-Hellman exponents) found in almost all AC networks as temporary public keys (with corresponding temporary secrets) for signing messages. These pseudonym signatures are used to create a verifiable pseudonym-linkability mechanism where any proxy node within the route or path, when required, can verifiably reveal its predecessor in a time-bound manner. We use this property to design a novel repudiation mechanism, which allows each proxy node, in cooperation with the network, to issue a cryptographic guarantee that a
selected traffic flow can be traced back to its originator (i.e., predecessor node) while maintaining the eventual forward secrecy of the system.

Unlike the related work, which largely relies on group signatures and/or anonymous credentials, BACKREF avoids the logistical difficulties of organizing users into groups and arranging a shared group key, and does not require access to a trusted party to issue credentials. While BACKREF is applicable to all proxy-based AC networks, we illustrate its utility by applying it to the onion routing (OR) protocol. We observe that it introduces a small computational overhead and does not affect the performance of the underlying OR protocol. BACKREF also includes a whitelisting option; i.e., if an exit node considers traceability to one or more web services unnecessary, then it can include those services in a whitelist directory such that accesses to them are not logged.

We formally define the important properties of the BACKREF network. In particular, we formalize anonymity and no forward traceability as observational equivalence relations, and backward traceability and no false accusation as trace properties. We conduct a formal security analysis of BACKREF using ProVerif, an automated cryptographic protocol verifier, establishing the aforementioned security and privacy properties against a strong adversarial model. We believe both the definitions and the security analysis are of independent interest, since they are the first for the OR protocol.
Organization. In Section II, we discuss anonymous communication networks and consider the related work. In Section III, we describe our threat model and system goals and present our key idea, while in Section IV, we incorporate the BACKREF mechanism into the OR protocol. We discuss important systems issues in Section V, and we briefly analyze the security and privacy properties of the BACKREF mechanism in Section VI.
II. BACKGROUND AND RELATED WORK
Anonymous communication (AC) networks aim at protecting personally identifiable information (PII), in particular the network addresses of the communicating parties, by hiding the correlation between input and output messages at one or more network entities. For this purpose, AC protocols employ techniques such as using a series of intermediate routers and layered encryption to obfuscate the source of a communication, and adding fake traffic to make the real communication difficult to extract.
Anonymous Communication Protocols. Single-hop proxy servers, which relay traffic flows, enable a simple form of anonymous communication. However, anonymity in this case requires, at a minimum, that the proxy is trustworthy and not compromised, and this approach does not protect the anonymity of senders if the adversary inspects traffic through the proxy [18]. Even with the use of encryption between the sender and the proxy server, timing attacks can be used to correlate flows.
Starting with Chaum [4], several AC technologies have been developed in the last thirty years to provide stronger anonymity that does not depend on a single entity [6], [3], [7], [19], [2], [1], [20], [21], [22], [23], [24], [25]. Among these, mix networks [4], [7] and onion routing [6] have arguably been the most successful. Both offer user anonymity, relationship anonymity, and unlinkability [26], but they obtain these properties through differing assumptions and techniques.
An onion routing (OR) infrastructure involves a set of routers (or OR nodes) that relay traffic, a directory service providing status information for OR nodes, and users. Users benefit from anonymous access by constructing a circuit (a small ordered subset of OR nodes) and routing traffic through it sequentially. The crucial property for anonymity is that an OR node within the built circuit is not able to identify any portion of the circuit other than its predecessor and successor. The user sends messages (to the first OR node in the circuit) in the form of an onion, a data structure multiply encrypted with symmetric session keys (one encryption layer per node in the circuit). The symmetric keys are negotiated during an initial circuit construction phase. This is followed by a second phase of low-latency communication (opening and closing streams) through the constructed circuit for the session duration. An OR network does not aim at providing anonymity and unlinkability against a global passive observer, which in theory can analyze end-to-end traffic flows. Instead, it assumes an adversary that adaptively compromises a small fraction of OR nodes and controls a small fraction of the network.
A mix network achieves anonymity by relaying messages through a path of mix nodes. The user encrypts a message to be partially decrypted by each mix along the path. Mix nodes accept a batch of encrypted messages, which are partially decrypted, randomly reordered, and forwarded. Unlike onion routing, an observer is unable to link incoming and outgoing messages at a mix node; thus, mix networks provide anonymity against a powerful global passive adversary. In fact, as long as a single mix node in the user's path remains uncompromised, the message will maintain some anonymity. However, batching of messages at a mix node introduces inherent delays, making mix networks unsuitable for low-latency, interactive applications (e.g., web browsing, instant messaging). In practice, they are used for latency-tolerant applications like anonymous email.
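To make the batching step concrete, here is a minimal Python sketch of a single mix node that collects a batch, strips one encryption layer, reorders it, and flushes; the symmetric Fernet cipher stands in for the public-key decryption a real mix performs, and the class and parameter names are ours rather than part of any deployed system.

    import secrets
    from cryptography.fernet import Fernet  # pip install cryptography

    class MixNode:
        """Toy mix: collect a batch, strip one layer, shuffle, flush."""
        def __init__(self, key: bytes, batch_size: int = 3):
            self.cipher = Fernet(key)        # stand-in for the mix's private key
            self.batch_size = batch_size
            self.pool: list[bytes] = []

        def accept(self, onion: bytes) -> list[bytes]:
            self.pool.append(onion)
            if len(self.pool) < self.batch_size:
                return []                    # keep batching; nothing leaves yet
            batch = [self.cipher.decrypt(o) for o in self.pool]  # remove one layer
            self.pool.clear()
            # random reordering breaks the input/output correspondence
            return sorted(batch, key=lambda _: secrets.randbits(64))

    # usage: the sender pre-wraps the payload once per mix, in reverse path order
    key = Fernet.generate_key()
    node = MixNode(key)
    onions = [Fernet(key).encrypt(m) for m in (b"msg-a", b"msg-b", b"msg-c")]
    out = []
    for o in onions:
        out.extend(node.accept(o))
    print(out)  # the three plaintexts, in an order unlinkable to arrival order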
A. Accountable Anonymity Mechanisms
The literature has examined several approaches for adding accountability to AC technologies, allowing: misbehaving users to be selectively traced [11], [12], [13], exit nodes to deny originating the traffic they forward [14], [15], misbehaving users to be banned [16], [17], and misbehaving participants to be discovered [2], [27], [28]. All of these approaches either require users to obtain credentials or do not extend to interactive, low-latency, internet-scale AC networks. A number also partition users into subgroups, which reduces anonymity and requires a group manager. BACKREF does not require credentials or subgroups, and is compatible with low-latency AC networks like onion routing, adding minimal overhead.
Kopsell et al. [11] propose traceability through threshold group signatures. A user logs into the system to join a group, signs messages with a group signature, and a group manager is empowered to revoke anonymity. The system also introduces an external proxy to inspect all outbound traffic for correct signatures and protocol compliance. The inspector has been criticized for centralizing traffic flows, which enables DoS and censorship, and increases observability [29].
Von Ahn et al. [12] also use group signatures as the basis
for a general transformation for traceability in AC networks
and illustrate it with DC networks. Users are required to
register as members of a group capable of sending messages
through the network. Our solution can be viewed as a follow-
up to this paper, with a concentration on deployability: we
do not require users to be organized into groups or introduce
new entities, and we concentrate on onion routing.
Diaz and Preneel [13] propose traceability through issuing anonymous credentials to users and utilizing a traitor tracing scheme to revoke anonymity. It is tailored to high-latency mix networks and requires a trusted authority to issue credentials; both impede deployability. Danezis and Sassaman [29] demonstrate a bypass attack on this and the Kopsell et al. scheme [11]. The attack is based on the protocols' assumption that there can be no leakage of information from inside the channel to the world unless it passes through the verification step. This attack is only applicable to the family of protocols where the traceability property is ensured. In our protocol we do not claim ensured traceability; therefore, this attack is out of the scope of BACKREF.
Short of revoking the anonymity of misbehaving users, techniques have been proposed to at least allow exit nodes to deny originating the traffic. Golle [14] and Clark et al. [15] pursue this goal, with the former being specific to high-latency mix networks and the latter requiring anonymous credentials. Tor offers a service called ExoneraTor that provides a record of which nodes were online at a given time, but it does not explicitly prove that a given traffic flow originated from Tor. Other techniques, such as Nymble [16] and its successors (see a survey [17]), enable users to be banned. However, these systems inherently require some form of credential or pseudonym infrastructure for the users, and also mandate that web servers verify user requests. Finally, Dissent [2] and its successors [27], [28] present an interesting approach for accountable anonymous communication for DC-Nets [1]; however, even when highly optimized [27], DC-Nets are not competitive for internet-scale applications.
III. DESIGN OVERVIEW
In this section we describe our threat model and system
goals, and present our key idea and design rationale.
A. Threat Model and System Goals
We consider the same threat model as the underlying AC protocol in which we wish to incorporate the BACKREF mechanism. Our active adversary A aims at breaking some anonymity property by determining the ultimate source and/or destination of a communication stream, or at breaking unlinkability by linking two communication streams of the same user. We assume that some, but not all, of the nodes in the path of the communication stream are compromised by the adversary A, who knows all their secret values and is able to fully control their functionality. For high-latency AC networks like mix networks, we assume that the adversary can also observe all traffic in the network, as well as intercept and inject arbitrary messages, while for low-latency AC networks like onion routing, we assume the adversary can observe, intercept, and inject traffic in some parts of the network.

While maintaining the anonymity and unlinkability properties of the AC network, we wish to achieve the following goals when incorporating BACKREF in an AC network:

Repudiation: For a communication stream flowing through a node, the node operator should be able to prove that the stream is coming from another predecessor node or user.

Backward traceability: Starting from an exit node of a path (or circuit), it should be possible to trace the source of a communication stream when all nodes in the path verifiably reveal their predecessors.

No forward traceability: For a compromised node, it should not be possible for the adversary A to use BACKREF to verifiably trace its successor in any completed anonymous communication session through it.

No false accusation: It should not be possible for a compromised node to corrupt the BACKREF mechanism to trace a communication stream:
1) to a path different from the path employed for the stream, and
2) to a node other than its predecessor in the path.

Fig. 1: Backward Traceability Verification (the operator of exit node N3 uses its evidence, together with records at N2, N1, and the ISP, to link an observed illegal activity back toward the originating IP)
Non-Goals. We expect our accountability notion to be reactive in nature. We do not aim at proactive accountability and do not try to stop an illegal activity in an AC network in a proactive manner, as we believe perfect white- or blacklisting of web URLs and content to be an infeasible task. Moreover, some nodes may choose not to follow the BACKREF mechanism locally (e.g., they may not maintain or share the required evidence logs), and backward traceability to the user cannot be ensured in those situations; nevertheless, the cooperating nodes can still prove their innocence in a verifiable manner.

Due to its reactive nature, our repudiation mechanism inherently requires evidence logs containing verifiable routing information. Encrypting these logs and regularly rotating the corresponding keys can provide eventual forward secrecy [30]. However, we cannot aim for immediate forward secrecy due to the inherently eventual forward-secret nature of the encryption mechanism.
B. Design Rationale and Key Idea

Fig. 1 presents the general architecture expected to achieve the above-mentioned goals. It is clear that neither network-level logs nor the cryptographic mechanisms currently used in AC networks can serve for verifiable backward traceability, as they cannot stop false accusations (or false traceability) by compromised nodes: a compromised node can tamper with its logs to intermix two different paths, as there is no cryptographic association between different parts of an AC path.

We observe that almost all OR circuit construction protocols [21], [31], [32], [33], [34], [30] (except TAP) and mix network protocols [35], [22], [36], [7], [24], [37] employ (or can employ¹) an element of a cyclic group of prime order satisfying some (version of the) Diffie-Hellman assumption as an authentication challenge or randomization element per node in the path. In particular, this element can be represented as X = g^x, where g is a generator of a cyclic group G of prime order p (with security parameter κ) and x ←R Z_p is a random secret value known only to the user. This element is used by each node on the path to derive a secret that is shared with the user and is used to extract a set of (session) keys for encryption and integrity protection. In the literature, these authentication challenges X are known as user pseudonyms.
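For concreteness, the following Python sketch shows how such a pseudonym is formed and how a node folds its own secret into it to obtain a shared value; the tiny demonstration group and the function names are ours, and a real deployment would use a standardized large group or an elliptic curve together with the full ntor key derivation.

    import secrets

    # Toy prime-order subgroup: q = 11 divides p - 1 = 22, and g = 4 generates the
    # order-q subgroup of Z_p*. Real deployments use 2048-bit+ groups or curves.
    p, q, g = 23, 11, 4

    def new_pseudonym() -> tuple[int, int]:
        """User side: pick x <-R Z_q and publish X = g^x as the pseudonym."""
        x = secrets.randbelow(q - 1) + 1
        return x, pow(g, x, p)

    def node_share(pseudonym: int, node_secret: int) -> int:
        """Node side: fold its own DH secret into the pseudonym (yields g^{xy})."""
        return pow(pseudonym, node_secret, p)

    x, X = new_pseudonym()            # X travels in the circuit-extension request
    y = secrets.randbelow(q - 1) + 1  # node's ephemeral secret
    assert node_share(X, y) == pow(pow(g, y, p), x, p)  # both ends agree on g^{xy}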
The key idea of our BACKREF mechanism is to use these pseudonyms X = g^x and the corresponding secret keys x as signing key pairs: to sign pseudonyms for successor nodes at entry and middle nodes, and to sign the communication stream headers at the exit nodes. Signatures that use (x, g^x) as the signing key pair are referred to as pseudonym signatures. As pseudonyms are generated independently for every single node, and the corresponding secret exponents are random elements of Z_p, they do not reveal the user's identity. Moreover, it is also not possible to link two or more pseudonyms to a single identity. Therefore, pseudonym signatures become particularly useful in our BACKREF mechanism, where users utilize them to sign messages without being identified by the verifier.

We can employ a CMA-secure [38] signature scheme against a computationally bounded adversary (with security parameter κ) such that, along with the usual existential unforgeability, the resulting pseudonym signature scheme satisfies the following property:

Unconditional signer anonymity: The adversary cannot determine a signer's identity, even if it is allowed to obtain signatures on an unbounded number of messages of its choice.
We use such temporary signing key pairs (or pseudonym signatures) to sign consecutively employed pseudonyms in an AC path and the web communication requests leaving the AC path. Pseudonym signatures provide linkability between the employed pseudonyms and the communicated message on an AC path. However, these pseudonyms are not sufficient to link the nodes employed in the AC path: for a pseudonym received by a node, its predecessor node can always deny sending the pseudonym in the first place. We solve this problem by introducing endorsement signatures: we assume that every node signs the pseudonym while sending it to the successor, so that it cannot plausibly deny this transfer during backward tracing.

¹Although some of these have been defined using RSA encryption, as discussed in [22] they can be modified to work in the discrete logarithm (DL) setting.
C. Scope of Solution
To understand the scope of BACKREF, first consider traceability in the context of the simplest AC network: a single-hop proxy. Any traceability mechanism from the literature implicitly assumes a solution to the problem of how users can be traced through a simple proxy. We dub this the last mile problem. The proxy can keep logs, but this requires a trusted proxy. Alternatively, the ISP could observe and log relevant details about traffic to the proxy, requiring trust in the ISP. The solution more typically used in the literature is to assume individual users have digital credentials or signing keys; essentially, some form of PKI is in place to certify the keys of individual users [11], [12], [13], [14], [15].

None of these last mile solutions is particularly attractive. The assumption of a PKI provides the best distribution of trust, but short-term deployment appears infeasible. We believe the involvement of ISPs is the most readily deployable. Such a solution involves an ISP with a packet attestation mechanism [39], which acts as a trusted party capable of proving the existence of a particular communication. We discuss the packet attestation mechanism further in Section V.

For selected traffic flows, BACKREF provides traceability to the entrance node. This is effectively equivalent to reducing the strong anonymity of a distributed cryptographic AC network to the weak anonymity of a single-hop proxy. For full traceability, we then must address the last mile problem: tracing the flow back to the individual sender. Thus BACKREF is not a full traceability mechanism, but rather an essential component that can be composed with any solution to the last mile problem. While we later discuss a solution that involves ISPs, we emphasize that BACKREF itself is concentrated on, arguably, the more difficult problem of offering ensured traceability within the AC network.
IV. REPUDIATION (OR TRACEABILITY)
In this section, we present our BACKREF repudiation scheme. For ease of exposition, we include our scheme in an OR protocol instead of including it in a generic AC protocol. Nevertheless, our scheme is applicable to almost all AC protocols mentioned in Section III-B. We start our discussion with a brief overview of the OR protocol using Tor's notation [40]. We then discuss the protocol flow for BACKREF, describe our cryptographic components, and present formal pseudocode.
A. The OR Protocol: Overview
The OR protocol is defined in two phases: circuit construction and stream relay.

OR Circuit Construction. The circuit construction phase involves the user's onion proxy (OP) randomly selecting a short circuit of (e.g., 3) OR nodes, and negotiating a session key with each selected OR node using a one-way authenticated key exchange (1W-AKE) [34] such as the ntor protocol. (We refer the readers to Appendix C for more details.) When a user wants to create a circuit with an OR node N1, she runs the Initiate procedure of the ntor protocol to generate and send an authentication challenge to N1. Node N1 then runs the Respond procedure and returns the authentication response. Finally, the user uses the ComputeKey procedure of ntor along with the response to authenticate N1 and to compute a session key with it. To extend the circuit further, the user sends an extend request to N1 specifying the address of the next node N2 and a new ntor authentication challenge for N2. The process continues until the user exchanges a key with the exit node N3.
Relaying Streams. Once a circuit (denoted as U ⇄ N1 ⇄ N2 ⇄ N3) has been constructed through N1, N2, and N3, the user-client U routes traffic through the circuit using the onion-wrapping WrOn and onion-unwrapping UnwrOn procedures. WrOn creates a layered encryption of a payload (plaintext or onion) given an ordered list of (three) session keys. UnwrOn removes one or more layers of encryption from an onion to output a plaintext or an onion, given an input onion and an ordered list of one or more session keys. To reduce latency, many of the user's communication streams employ the same circuit [6].
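A rough sketch of what WrOn and UnwrOn do is given below; it is not Tor's actual cell format (Tor uses AES-CTR with per-hop digests), and the Fernet layers and function names are only illustrative.

    from cryptography.fernet import Fernet  # pip install cryptography

    def wr_on(payload: bytes, keys: list[bytes]) -> bytes:
        """Wrap: the innermost layer belongs to the last (exit) key, so encrypt in reverse."""
        onion = payload
        for k in reversed(keys):
            onion = Fernet(k).encrypt(onion)
        return onion

    def unwr_on(onion: bytes, keys: list[bytes]) -> bytes:
        """Unwrap one layer per given key, in path order."""
        for k in keys:
            onion = Fernet(k).decrypt(onion)
        return onion

    # Usage: the OP wraps with all three session keys; each node strips its own layer.
    k1, k2, k3 = (Fernet.generate_key() for _ in range(3))
    onion = wr_on(b"relay begin example.com:443", [k1, k2, k3])
    onion = unwr_on(onion, [k1])   # at N1
    onion = unwr_on(onion, [k2])   # at N2
    print(unwr_on(onion, [k3]))    # at N3: the plaintext stream request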
The structure and components of communication streams
may vary with the network protocol. For ease of exposition,
we assume the OR network uses TCP-based communication
in the same way as Tor, but our schemes can easily be
adapted for other types of communication streams.
In Tor, the communication between the user's TCP-based application and her Tor proxy takes place via SOCKS. To open a communication stream (i.e., to start a TCP connection to some web server and port), the user proxy sends a relay begin cell (or packet) over the circuit to the exit node N3. When N3 receives the TCP request, it performs a standard TCP handshake with the web server. Once the connection is established, N3 responds to the user with a relay connected cell. The user then forwards all TCP stream requests for the server as relay data cells over the circuit. (See [6], [40] for a detailed explanation.)
B. The BackRef Protocol Flow
Consider a user U who wishes to construct an OR circuit U ⇄ N1 ⇄ N2 ⇄ N3 and use it to send a communication stream m. BACKREF adds the repudiation mechanism as a layer on top of the existing OR protocol. We assume that every OR node possesses a signing (private) key for which the corresponding verification (public) key is publicly available through the OR directory service.

The corresponding OR protocol with the BACKREF scheme works according to the following five steps:

1. Circuit construction with an entry node: The user U creates a circuit with the entry node N1 using the ntor protocol. If the user is an OR node, then it endorses its pseudonym X1 by signing it with its private key and sending the signature along with X1.

However, if the user U is not an OR node, it cannot endorse the pseudonym X1, as no public-key infrastructure (PKI) or credential system is available to it. We solve this endorsement problem by entrusting the ISP with a packet attestation mechanism [39] such that the ISP can prove that a pseudonym was sent by U to N1. We discuss the packet attestation mechanism in Section V.
2. Circuit extension: To extend the circuit to N2, U generates a new pseudonym X2 of an ntor instance, signs X2 and the current timestamp with the secret value x1 associated with X1, and sends an extend request to N1 along with the identifier for N2, the signed pseudonym ⟨X2||ts_x2⟩_X1, and the timestamp ts_x2. Notice that the extension request is encrypted with the symmetric session key negotiated between U and N1.

Upon receiving the message, N1 decrypts and verifies ⟨X2||ts_x2⟩_X1 using the previously received pseudonym X1 and the timestamp. We call this verification pseudonym-linkability verification. If the signature is valid, N1 creates an evidence record as discussed in Step 4, signs X2 using its private key to generate ⟨X2||ts_2⟩_sk1, and sends a circuit create request to the node N2 with ⟨X2||ts_2⟩_sk1.

Node N2, upon receiving a circuit creation request along with ⟨X2||ts_2⟩_sk1, verifies the signature. Upon a successful verification, it replies to N1 with an ntor authentication response for the OR key agreement and generates the OR session key for its session with the (unknown) user U. N1 sends the authentication response back to U over their OR session; U then computes the session key with N2 and continues to build its circuit to N3 in a similar fashion.

Notice that we carefully avoid any conceptual modification of the OR circuit construction protocol; the above signature generation and verification steps are the only adjustments that BACKREF makes to this protocol.
3. Stream verification: Once a circuit U ⇄ N1 ⇄ N2 ⇄ N3 has been established, the user U can utilize it to send her web stream requests. To open a TCP connection, the user sends a relay begin cell to the exit node N3 through the circuit. The user U includes a pseudonym signature (or stream request signature) on the cell contents, signed with the secret exponent x3 of X3. The user also includes a timestamp in her stream request.

When the relay cell reaches the exit node N3, the exit node verifies the pseudonym signature with X3. Once the verification is successful and the timestamp is current, N3 creates the evidence log (Step 4) and proceeds with the TCP handshake to the destination server. Otherwise, the relay stream request is discarded. This stream verification helps N3 prove linkability between its handshake with the destination server and the pseudonym X3 it received from N2.

When a whitelist directory exists, the exit node first consults the directory and, if the request (i.e., web stream request) is whitelisted, the exit node simply forwards it to the destination server. In such a case, the exit node does not require any signature verification and also does not create an evidence log. We further discuss server whitelisting in Section IV-D.
4. Log generation: After every successful pseudonym-linkability or stream verification, an evidence record is created. A pseudonym-linkability verification evidence record associates linkability between two pseudonyms X_i and X_{i+1} and an endorsement signature on X_i, while a stream verification evidence record associates a stream verification with an endorsement signature on X3 for N3.
5. Repudiation or traceability: The verifier contacts the exit node N3 with the request information (e.g., IP address, port number, and timestamp) for a malicious stream coming out of the exit node N3. The operator of N3 can locate an evidence record using the stream request information. This evidence record verifiably reveals the identity of the middle node N2. As an optional next step, using the evidence records, it is possible for N2 to verifiably reveal the identity of the predecessor node N1. Then, the last mile of a full traceability is to reach from N1 to the user U in a verifiable manner using the evidence record at N1 and the request information at the ISP [39]. When the user U is an OR node, a record at N1 is sufficient and the last mile problem does not exist.
C. Cryptographic Details
For pseudonym and endorsement signatures, we use the short signature scheme of Boneh, Lynn, and Shacham (BLS) [41]. We recall the BLS signature scheme in Appendix B. We choose the BLS signature scheme due to the short size of its signatures; however, if signing and verification efficiency is more important, we can choose faster signature schemes such as [42].
Circuit Extension. To extend the circuit U ⇄ N1 to the next hop N2, the user U chooses x2 ←R Z_p and generates a pseudonym X2 = g2^x2, where g2 is a generator of G2. U then signs the pseudonym X2 with the pseudonym X1 as the public key; the current timestamp value ts_x2 is also included in the signature:

    σ_X1 = H(X2||ts_x2)^x1.

Upon receiving the signed pseudonym ⟨X2||ts_x2⟩_X1 along with the timestamp ts_x2, the node N1 checks that the timestamp is current and verifies the signature as follows:

    e(H(X2||ts_x2), X1) = e(σ_X1, g2).
Pseudonym endorsement. After successful verification, N1 creates an endorsement signature σ1 = H(X2||ts_2)^sk1 for the pseudonym X2 and the current timestamp ts_2 using its signing key sk1, and sends it along with X2 and ts_2 to N2. The node N2 then follows the pseudonym endorsement step. Upon receiving the signed pseudonym ⟨X2||ts_2⟩_σ1, the node N2 verifies it as follows:

    e(H(X2||ts_2), pk1) = e(σ1, g2).

On a successful verification, N2 continues with the OR protocol.
Stream verification. To generate a stream request signature, the user signs the stream request (i.e., selected contents of the relay begin cell) using the pseudonym X3 = g2^x3, where x3 is the secret corresponding to X3. For the contents of the relay cell m = address||port||ts_xm, the stream request signature σ_X3 is defined as

    σ_X3 = H(m)^x3.

The user sends the signature along with the relay cell and the current timestamp ts_xm to the exit node through the already-built circuit.

Once the signed stream request reaches N3, it verifies the signature as follows:

    e(H(m), X3) = e(σ_X3, g2).    (1)

Upon a successful verification, the exit node N3 proceeds with the TCP handshake. A verified request allows the node to link X3 and the request.
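The sketch below restates these pseudonym and endorsement signature equations as code against an assumed type-3 pairing interface; the Pairing protocol, its method names, and the surrounding classes are placeholders of ours rather than any particular library's API, and any BLS-capable pairing library could instantiate them.

    from dataclasses import dataclass
    from typing import Any, Protocol

    class Pairing(Protocol):
        """Assumed interface of a type-3 pairing library (all names are ours)."""
        def hash_to_g1(self, msg: bytes) -> Any: ...       # H : {0,1}* -> G1
        def g1_exp(self, elem: Any, exp: int) -> Any: ...  # elem^exp in G1
        def g2_exp(self, exp: int) -> Any: ...             # g2^exp in G2
        def pair(self, a_g1: Any, b_g2: Any) -> Any: ...   # e : G1 x G2 -> GT

    @dataclass
    class PseudonymSigner:
        """BLS-style pseudonym signature: the circuit pseudonym X = g2^x acts as
        a temporary public key and its exponent x as the temporary signing key."""
        grp: Pairing
        x: int                                 # secret exponent, known only to the user

        def pseudonym(self) -> Any:            # X = g2^x, sent during circuit extension
            return self.grp.g2_exp(self.x)

        def sign(self, msg: bytes) -> Any:     # sigma = H(msg)^x
            return self.grp.g1_exp(self.grp.hash_to_g1(msg), self.x)

    def verify(grp: Pairing, pub: Any, msg: bytes, sigma: Any) -> bool:
        """Check e(H(msg), pub) == e(sigma, g2). The same equation covers the
        endorsement signature, with the node's long-term pk in place of the pseudonym."""
        return grp.pair(grp.hash_to_g1(msg), pub) == grp.pair(sigma, grp.g2_exp(1))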
Log generation. After every successful pseudonym or stream verification, an evidence record is added to the evidence log. The evidence records differ with the nodes' positions within a circuit, and we define two types of evidence logs.

Exit node log: For every successful stream verification, an evidence record is added to the evidence log at the exit node. A single evidence record consists of the endorsement signature on X3 (i.e., ⟨X3||ts_3⟩_σ2), and the stream request (m = address||port||ts_xm) coupled with the pseudonym signature ⟨m⟩_X3 and the timestamp ts_xm.

Middle and entry node log: The middle and entry node evidence record comprises two pseudonyms X_i and X_{i+1}, and a timestamp value ts_{x_{i+1}}, coupled with the appropriate signatures and the IP address of N_{i-1}. The pseudonym X_i is coupled with an endorsement signature ⟨X_i||ts_i⟩_σ_{i-1} from node N_{i-1}, and the pseudonym X_{i+1} is coupled with a pseudonym signature ⟨X_{i+1}||ts_{x_{i+1}}⟩_X_i.

When the user is not an OR node and does not possess a verifiable signature key pair, the corresponding record at N1 consists of the signed pseudonym ⟨X2||ts_x2⟩_X1, the pseudonym X1, the timestamp value ts_x2, and the IP of the user.
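For illustration, the two record formats could be carried by data structures like the following Python dataclasses; the field names and types are ours, and a real implementation would additionally fix encodings and signature formats.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ExitNodeRecord:
        """Stored at N3 after a successful stream verification."""
        endorsed_pseudonym: bytes   # <X3 || ts_3> signed by N2 (endorsement sigma_2)
        stream_request: bytes       # m = address || port || ts_xm
        stream_signature: bytes     # pseudonym signature on m under X3
        ts_xm: int                  # timestamp carried in the relay begin cell

    @dataclass(frozen=True)
    class RelayNodeRecord:
        """Stored at an entry/middle node N_i after pseudonym-linkability verification."""
        pseudonym_in: bytes         # X_i, endorsed by N_{i-1} (or attested via the ISP)
        endorsement: bytes          # <X_i || ts_i> signed with sk_{i-1}
        pseudonym_out: bytes        # X_{i+1}, received in the extend request
        pseudonym_signature: bytes  # <X_{i+1} || ts_{x_{i+1}}> signed under X_i
        predecessor_ip: str         # IP address of N_{i-1} (or of the user at N1)
        ts: int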
Repudiation or traceability. Given the server logs of a stream request, an evidence record corresponding to the stream request can be obtained. In the first step, it is checked whether the timestamp matches the stream request under observation. In the next step, the association between the stream request and the pseudonym of the exit node, X3, is verified using the pseudonym signature. Then, the association of the pseudonym X3 and N2 is checked using the pseudonym endorsement signature.

Given the pseudonym X3 and a timestamp ts_xm, the backward traceability verification at node N2 is carried out as follows:

1) Do a lookup in the evidence log to locate the signed pseudonym ⟨X3||ts_x3⟩_X2 and the timestamp ts_x3, where X3 is the lookup index.
2) Compare the timestamps (ts_xm and ts_x3) under observation and prove the linkability between X2 and X3 by verifying the signature ⟨X3||ts_x3⟩_X2.
3) If the verification succeeds, reveal the IP address of the node N1 who has forwarded X2, and verify ⟨X2||ts_2⟩_σ1 with pk1.

The above three steps can be used repeatedly to reach the entry node. However, they cannot be used to verifiably reach the user if we do not assume any public key or credential infrastructure for the users. Instead, our protocol relies on the ISP between user U and N1 to use packet attestation [39] to prove that the pseudonym X1 was sent from U to N1.
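A minimal sketch of this per-hop lookup is shown below, assuming the evidence log is a dictionary keyed by the successor pseudonym and that verify_sig(message, signature, key) checks the corresponding BLS equation; both of these conventions, and the timestamp-skew bound, are assumptions of ours.

    from typing import Callable, Optional

    # Each log entry (a plain dict here) holds: "pseudonym_in" (X_i),
    # "pseudonym_signature" (<X_{i+1} || ts> under X_i), "predecessor_ip", and "ts".
    def trace_one_hop(evidence_log: dict[bytes, dict],
                      successor_pseudonym: bytes,
                      ts_observed: int,
                      verify_sig: Callable[[bytes, bytes, bytes], bool],
                      max_skew: int = 60) -> Optional[tuple[bytes, str]]:
        """Steps 1-3 at node N_i: look up X_{i+1}, check the timestamps, verify the
        pseudonym signature linking X_i to X_{i+1}, then name the predecessor."""
        rec = evidence_log.get(successor_pseudonym)          # step 1: lookup by X_{i+1}
        if rec is None or abs(rec["ts"] - ts_observed) > max_skew:
            return None                                      # step 2: timestamp mismatch
        signed = successor_pseudonym + rec["ts"].to_bytes(8, "big")
        if not verify_sig(signed, rec["pseudonym_signature"], rec["pseudonym_in"]):
            return None                                      # step 2: linkability fails
        # step 3: return the predecessor pseudonym and IP; the caller then checks the
        # endorsement signature on it against the predecessor's long-term public key.
        return rec["pseudonym_in"], rec["predecessor_ip"]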
D. Exit Node Whitelisting Policies
To provide a good balance between anonymity and accountability, we include a whitelisting option for exit nodes. This option allows a user to avoid the complete verification and logging mechanisms if her destination is in the whitelist directory of her exit node. In particular, we categorize destinations into two groups:

Whitelisted destinations: For several destinations, such as educational .edu websites, an exit node may find traceability to be unnecessary. The exit node includes such destinations in a whitelist directory such that, for these destinations, the employed circuit nodes do not demand any endorsement and pseudonym signatures. Traffic sent to these whitelisted destinations through the circuit remains anonymous in the current AC-network sense.

Non-listed destinations: For destinations that are not listed in the exit node's whitelist directory, the user has to use BACKREF while building the circuit; otherwise, the exit node will drop her requests to the non-listed destinations.

We emphasize that BACKREF is not an all-or-nothing design alternative: it allows an AC network to conveniently disable the complete verification and logging mechanisms for some pre-selected destinations. In particular, an exit node with a "Sorry, it is an anonymity network, no logs" stance can still whitelist the whole Internet, while others employ BACKREF for non-whitelisted sites. The use of BACKREF is transparent, and users can choose whether they wish to use a BACKREF node for their circuits.
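A sketch of how an exit node might apply this policy before doing any cryptographic work is shown below; the whitelist representation, the suffix-matching rule, and the function names are ours and not part of Tor or the BACKREF specification.

    def is_whitelisted(dest_host: str, whitelist: set[str]) -> bool:
        """A destination matches if it, or any parent domain, is whitelisted
        (so adding "edu" whitelists every .edu site)."""
        parts = dest_host.split(".")
        return any(".".join(parts[i:]) in whitelist for i in range(len(parts)))

    def handle_relay_begin(dest_host: str, dest_port: int, cell: bytes,
                           whitelist: set[str], verify_and_log) -> bool:
        """Exit-node policy: whitelisted destinations are forwarded with no signature
        check and no evidence record; everything else goes through BACKREF's stream
        verification (verify_and_log) or is dropped."""
        if is_whitelisted(dest_host, whitelist):
            return True
        return verify_and_log(dest_host, dest_port, cell)

    # usage: whitelist all .edu sites plus one specific service
    allowed = {"edu", "example.org"}
    print(handle_relay_begin("cs.university.edu", 443, b"relay begin ...",
                             allowed, lambda *args: False))  # True, nothing is logged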
E. Pseudocode
In this subsection, we present pseudocode for the OR protocol with BACKREF, extending the OR pseudocode developed by Backes et al. [43] following the Tor specification [40]. We highlight our changes to their original Π_OR protocol pseudocode from [43] by underlining them. Our pseudocode formalism demonstrates that our modifications to the original OR protocol are minimal. It also forms the basis for our applied pi calculus [44] based OR model in Section VI. In the pseudocode, an OR node maintains a state for every protocol execution and responds (changes the state and/or sends a message) upon receiving a message. There are two types of messages that the protocol employs: the first type contains input and output actions, which carry, respectively, the user inputs to the protocol and the protocol outputs to the user; the second message type is a network message (a cell in the OR literature), which is to be delivered by one protocol node to another.
In onion routing, a directory server maintains the list of valid OR nodes and their respective public keys. A functionality F_REG^N abstracts this directory server.
    upon an input (setup):
        generate an asymmetric key pair (sk, pk) ← G
        send a cell (register, N, pk) to the F_REG^N functionality
        wait for a cell (registered, ⟨N_j, pk_j⟩_{j=1..n}) from F_REG^N
        output (ready, N = ⟨N_j⟩_{j=1..n})

    upon an input (createcircuit, N = ⟨N, N_j⟩_{j=1..ℓ}):
        store N and C ← ⟨N⟩; call ExtendCircuit(N, C)

    upon an input (send, C = ⟨N ⇄(cid1) N1 ⇄ ··· ⇄ Nℓ⟩, m):
        look up the keys (k_j)_{j=1..ℓ} for cid1
        O ← WrOn(⟨m, σ_{X_ℓ}, ts⟩, (k_j)_{j=1..ℓ}); Used(cid1)++
        send a cell (cid1, relay, O) to N1 over F_SCS

    upon receiving a cell (cid, create, X, σ_i, ts) from N_i over F_SCS:
        if Verify(σ_i, pk_{N_i}) then
            ⟨Y, k_new⟩ ← Respond(pk_N, sk_N, X)
            store C ← ⟨N_i ⇄(cid, k_new) N⟩
            store Log ← ⟨H(X), IP_{N_i}, X, σ_i, ts⟩
            send a cell (cid, created, Y, t) to N_i over F_SCS

    upon receiving a cell (cid, created, Y, t) from N_i over F_SCS:
        if prev(cid) = (N′, cid′, k′) then
            O ← WrOn(⟨extended, Y, t⟩, k′)
            send a cell (cid′, relay, O) to N′ over F_SCS
        else if prev(cid) = ⊥ then
            k_new ← ComputeKey(pk_i, Y, t)
            update C with k_new; call ExtendCircuit(N, C)

    upon receiving a cell (cid, relay, O) from N_i over F_SCS:
        if prev(cid) = ⊥ then
            if getkey(cid) = (k_j)_{j=1..ℓ} then
                (type, m) or O ← UnwrOn(O, (k_j)_{j=1..ℓ})
                (N′, cid′) ← next(cid)
        else if prev(cid) = (N′, cid′, k′) then
            O ← WrOn(O, k′)    /* a backward onion */
        switch (type)
            case extend:
                get N_next, X, σ_{X_i}, ts from m; cid_next ←$ {0,1}^κ
                if Verify(σ_{X_i}, X_i) then
                    update C ← ⟨N_i ⇄(cid, k) N ⇄(cid_next) N_next⟩
                    store Log ← ⟨H(X), IP_{N_i}, X, σ_{X_i}, ts⟩
                    send a cell (cid_next, create, X, σ, ts) to N_next over F_SCS
            case extended:
                get Y, t from m; get N_ex from (N, C)
                k_ex ← ComputeKey(pk_ex, Y, t)
                update C with (k_ex); call ExtendCircuit(N, C)
            case data:
                if (N = OP) then output (received, C, m)
                else if m = (S, m′, σ_X, ts) then
                    store Log ← ⟨H(m), IP_{N_i}, X, σ_X, ts⟩
                    generate or look up the unique sid for cid
                    send (N, S, sid, m′) to the network
            case default:    /* encrypted forward/backward onion */
                send a cell (cid′, relay, O) to N′ over F_SCS

    upon receiving a msg (sid, m) from F_NET^q:
        get C ← ⟨N′ ⇄(cid, k) N⟩ for sid; O ← WrOn(m, k)
        send a cell (cid, relay, O) to N′ over F_SCS

    † Without circuit destruction.

Fig. 2: Π_OR with BACKREF for Party N
    ExtendCircuit(N = ⟨N_j⟩_{j=1..ℓ}, C = ⟨N ⇄(cid1, k1) N1 ⇄(k2) ···⟩):
        determine the next node N_{ℓ′+1} from N and C
        if N_{ℓ′+1} = ⊥ then
            output (created, ⟨N ⇄(cid1) N1 ⇄ ··· ⇄ Nℓ⟩)
        else
            X ← Initiate(pk_{N_{ℓ′+1}}, N_{ℓ′+1})
            if N_{ℓ′+1} = N1 then
                cid1 ←$ {0,1}^κ
                send a cell (cid1, create, X, σ, ts) to N1 over F_SCS
            else
                O ← WrOn({extend, N_{ℓ′+1}, X, σ_{X_{ℓ′}}, ts}, (k_j)_{j=1..ℓ′})
                send a cell (cid1, relay, O) to N1 over F_SCS

Fig. 3: Subroutine for Π_OR with BACKREF for N
    upon a verification request (m):
        if LookupLog(H(m)) = ⊥ then
            TraceFail(m)
        else
            get Log ⟨H(m), N_prev, X, σ, ts⟩ for H(m)
            if (N = N1) & Verify(σ, X) then
                output (X, N_prev)
            else
                get Log ⟨H(X), N_prev, pk_{N_prev}, σ′, ts⟩ for H(X)
                if Verify(σ, X) & Verify(σ′, pk_{N_prev}) then
                    output (X, N_prev)
                else
                    TraceFail(m)

Fig. 4: Backward Traceability Verification
Each OR node initially computes its long-term keys (sk, pk) (for both the 1W-AKE and signature schemes) and registers the public part at F_REG^N.
For ease of exposition, only the cryptographically important Tor cells are considered in the protocol. This includes create,
created and destroy cells among control cells, and data,
extend and extended cells among relay cells. There are two
input messages createcircuit and send, where the user uses
createcircuit to create OR circuits and uses send to send
messages m over already-created circuits.
The ExtendCircuit function defined in Figure 3 presents the circuit construction description from Section IV-A in pseudocode form. Circuit IDs (cid ∈ {0,1}^κ) associate two consecutive circuit nodes in a circuit. The notation C = ⟨N_{i-1} ⇄(cid_i, k_i) N_i ⇄(cid_{i+1}) N_{i+1}⟩ says that N_{i-1} and N_{i+1} are, respectively, the predecessor and successor of N_i in a circuit C. Here k_i is a session key between N_i and the OP, while the absence of k_{i+1} indicates that a session key between N_{i+1} and the OP is not known to N_i; analogously, the absence of a circuit id cid in that notation means that only the first circuit id is known, as for the OP, for example. The functions prev and next on cid correspondingly return information about the predecessor or successor of the current node with respect to cid; e.g., next(cid_i) returns (N_{i+1}, cid_{i+1}) and next(cid_{i+1}) returns ⊥. The OP passes on to the user ⟨N ⇄(cid1) N1 ⇄ ··· ⇄ Nℓ⟩.
Within a circuit, a user's OP (onion proxy) and the exit node use relay cells, created with the wrapping algorithm WrOn, to tunnel end-to-end commands and communication. The exit nodes use streams to synchronize communication between the network and a circuit C; a stream is represented as sid in the pseudocode. End-to-end communication between the OP and the exit node happens with a WrOn call with multiple session keys and a series of UnwrOn calls with individual session keys. Cells are exchanged between OR nodes over secure and authenticated channels, e.g., a TLS connection, and these are modeled as a secure channel functionality F_SCS [45]. Circuit destruction remains exactly the same in our case, and we omit it in our pseudocode and refer the readers to [43] for details.

In Figure 4, we formalize the backward traceability verification of BACKREF. Here, the function LookupLog retrieves an entry from the log, indexed by its input. The function Verify performs signature verification, while the function TraceFail outputs that a valid log entry does not exist at node N.
V. SYSTEMS ASPECTS AND DISCUSSION
Communication overhead. The communication overhead of BACKREF is minimal: every circuit creation, circuit extension, and stream request carries a 32-byte BLS signature and an additional 4-byte timestamp.

Computation overhead. In a system with BACKREF, every node has to verify a signature and generate another. Using the pairing-based cryptography (PBC) library, a BLS signature generation takes less than 1 ms while a verification takes nearly 3 ms at 128-bit security on a commodity PC with an Intel i5 quad-core processor at 3.3 GHz and 8 GB RAM. Signing and verification time (and correspondingly system load) can be further reduced using faster signature schemes (e.g., [42]).
Log storage. BACKREF requires nodes to maintain logs of cryptographic information for potential use by law enforcement. These logs are not innocuous, and the implications of publicly disclosing a record need to be considered. The specificity of the logs should be carefully designed to balance minimal disclosure of side information (such as specific timings) while allowing flows to be uniquely identified. It must also be possible to reconstruct the logged data from the types of information available to law enforcement. The simplest entry would contain the destination IP, the source (exit node) IP, a coarse timestamp, as well as the signature. Logs should be maintained for a pre-defined period and then erased.
No single party can hold the logs without entrusting the anonymity of all users to this entity. The OR nodes can retain the logs themselves. This, however, would require law enforcement to acquire the logs from every such node and consequently involve the nodes in the investigation, a scenario that may not be desirable. Furthermore, traceability exposes nodes of all types, not just exit nodes, to investigation. We are aware of a number of entities who deliberately run middle nodes in Tor to avoid this exposure. An alternative is to publish encrypted logs, where a distributed set of trustees share a decryption key and act as a liaison to law enforcement, while holding each other accountable by refusing to decrypt logs of users who have not violated the traceability policy. Such an entity acts in a similar fashion to the group manager in schemes based on group signatures [12].
Non-cooperating nodes. Given the geographic diversity of
the AC networks, it is always possible that some proxy nodes
cooperate with the BACKREF mechanism, while others do
not. The repudiation property of BACKREF ensures that a
cooperating node can always at least correctly shift liability
to a non-cooperating node. Furthermore, such a cooperating
node may also reactively decide to block any future com-
munication from the non-cooperating node as a policy.
ISP as a trusted party. In the absence of a PKI for users, to solve the last mile problem our protocol has to rely on some trust mechanism to prove the linkability between the IP address of the user and the entry node pseudonym. For this purpose, we consider an ISP with a packet attestation mechanism [39] to be a proper solution that adds a small overhead to the existing ISP infrastructure and at the same time does not harm any of the properties provided by the anonymity network. In some countries there is an obligation for ISPs to retain data that identify the user; in others, where ISPs are not obligated by law, it is nevertheless a common practice. The protocol is designed so that the ISP has to attest only to the ClientKeyExchange message (this message is part of the TLS establishment procedure and is public, i.e., not encrypted), which is used to establish the initial TLS communication. This message does not reveal any sensitive information related to the identity of the user. By design, we reuse this message as a pseudonym for the entry OR node.
VI. SECURITY ANALYSIS
In this section we present a formal security analysis of BACKREF. We model our protocol from the previous section (in a restricted form) in the applied pi calculus [44] and verify the important properties of anonymity, backward traceability, no forward traceability, and no false accusation with ProVerif [46], a state-of-the-art automated theorem prover that provides security guarantees for an unbounded number of protocol sessions. We model backward traceability and no false accusation as trace properties, and anonymity and no forward traceability as observational equivalence relations. The ProVerif scripts used in the analyses are publicly available [47].
Basic Model. We model the OR protocol in the applied pi calculus with circuits of length three (i.e., one user and three nodes); the extension to additional nodes is straightforward. To prove the different security properties we augment the model with additional processes and events. The events used to decorate the various steps in the OR protocol as well as the BACKREF mechanism follow the pseudocode from the previous section. We also involve an ISP between the user and the entry node, which participates in the protocol as a trusted party. The ISP is honest and can prove the existence of a communication channel between the user and the entry node. This channel is modeled as private, preventing any ISP log forgeries. The cryptographic log collection model is designed in a decentralized way such that nodes retain the logs themselves in a table that is inaccessible to the adversary.

We model the flow of the pseudonyms and the onion, together with the corresponding verification. However, we do not model the underlying, cryptographically verified 1W-AKE ntor protocol, and assume that the session key between the user and the selected OR process is exchanged securely. The attacker is a standard Dolev-Yao active adversary with full control over the public channels: it learns everything ever sent on the network, and can create and insert messages on the public channels. It also controls network scheduling.
Backward Traceability. The essential goal of our protocol is to trace the source of a communication stream starting from an exit node. We verify that the property of backward traceability follows from the correctness of the (backward) traceability verification mechanism. The correctness property can be formalized in ProVerif notation as follows:
    TraceUser(IP) ⟹ (LookupISP(X1, IP) ⟹ (RevealPredecessorU(IP)
        ⟹ (RevealPredecessor(ipN1) ⟹ (RevealPredecessor(ipN2)
        ⟹ (CheckSignature ⟹ LookupN3(m))))))    (2)

Fig. 5: No False Accusation adversarial model (users U1 and U2 reach the entry node N1 through an honest ISP and send m1 and m2 over N1, N2, and the compromised exit node N3; the ISP log table maps each user's IP address to its entry pseudonym, and each node's log table stores the hashed pseudonym, the predecessor's IP address, the pseudonym, and the signed pseudonym)
where the notation A ⟹ B denotes the requirement that the event A must be preceded by the event B. In our protocol, the property says that the user is traced if and only if all nodes in the circuit verifiably trace their predecessors. The traceability protocol P starts with the event LookupN3(m), which means that for a given message m (stream request) the verifier consults the log and, if such a request exists, checks the signature (CheckSignature). Finally, when all these conditions are fulfilled, the verifier reveals the identity of the predecessor node, RevealPredecessor(ipN2) (i.e., the middle node). This completes the nested correspondence RevealPredecessor(ipN2) ⟹ CheckSignature ⟹ LookupN3(m), which verifiably traces N2. In a similar fashion, after all conditions are fulfilled, the verifier traces N1 and the user U. After the identity of U is revealed, the verifier looks up the evidence table of the ISP (LookupISP) to prove the connection between the identity of the user, IP, and the pseudonym of the entry node, X1. If such a record exists in the table, the address of the user is revealed and the event TraceUser(IP) is executed.

Theorem: The trace property defined in equation (2) holds true for all possible executions of process P.

Proof: Automatically proven by ProVerif.
No false accusation. There are two aspects associated with false accusations:

1) It should not be possible for a malicious node N_A to trace a communication stream to an OR node N_C other than its predecessor in the corresponding circuit. Informally, to break this property, N_A has to obtain a signature of N_C on a particular pseudonym associated with the circuit. This requires N_A to forge a signature for N_C, which is not possible due to the unforgeability property of the signature scheme.

2) It should not be possible for a malicious node N_A to trace a communication stream to a circuit C1 other than the circuit C2 employed for the communication stream. Consider a scenario where two concurrent circuits (C1 and C2), established by two different users U1 and U2, pass through a malicious node N_A. Suppose that N_A collaborates with U2, who is misbehaving and has used the OR network for criminal activities. To help U2 by falsely accusing a different predecessor, N_A must forge two signatures: to link a pseudonym X_{i-1} from circuit C1 with a pseudonym X_i from circuit C2, N_A has to forge the pseudonym signature on X_i with X_{i-1} as the public key, or it has to know the temporary signing key pair of the predecessor in C1.
Intuitively, the first case is ruled out by the unforgeability property of the signature scheme. We model the latter case as a trace property. Here, even when N_A collaborates with U2, it cannot forge the signed pseudonym received from its predecessor. The property remains intact as long as one of the nodes on C1 and the packet-attesting ISP [39] remain uncompromised. In the absence of a PKI or credential system for users, the last condition is unavoidable.

Fig. 6: Anonymity Game
We formalize and verify the latter case of the property in an adversarial model where the attacker has compromised one user (U1 or U2). Figure 5 provides a graphical representation of the protocol P. We upgrade the basic model with an additional user U2 who sends an additional message m2. As mentioned before, to simulate the packet attestation mechanism [39] we involve an honest ISP between the user and the entry node. The ISP only collects data that identifies the user (the IP address of the user) and the pseudonym for the entry node (X1), which is sent in plain text. The adversary does not have access to the log stored by the ISP, i.e., it cannot read or write anything in the log table. We want to verify that for all protocol executions the request m_i cannot be associated with any user U_i other than the originator.
To formalize the no false accusation property in ProVerif, we model security-related protocol events with logical predicates. The events CorrN1, CorrN2, CorrN3 in the protocol occur only when the OR nodes N1, N2, N3, respectively, are corrupted. The event CorrISP defines the point of the protocol where the ISP is corrupted. The no false accusation property is formalized as the following policy:

    Accuse(IP, m) ⟹ (CorrN1 ∧ CorrN2 ∧ CorrN3 ∧ CorrISP).    (3)

This policy says that if a user with address IP is falsely accused of a message m, i.e., Accuse(IP, m), then indeed all of the parties in the protocol have to be corrupted.

Theorem: The trace property defined in equation (3) holds true for all possible executions of process P.

Proof: Automatically proven by ProVerif.
Anonymity. We model this property as an observational equivalence relation between two processes that are replicated an unbounded number of times and execute in parallel. In the first process P, users U1 and U2 send two messages, m1 and m2, respectively, while in the second process Q the two messages are swapped. If the two defined processes are observationally equivalent (P ≈ Q), then we say that the attacker cannot distinguish between m1 and m2, i.e., it cannot learn which message is sent by which user. In our scenario we assume that the attacker can compromise some fraction of the OR nodes, but not all. Figure 6 provides a graphical representation of the anonymity game, where the exit node N3 is honest. The game works as follows:
1) U1 and U2 create onion data structures O1 and O2, respectively, intended for N3 and send them via the previously built circuits C1 (U1 ⇄ N1 ⇄ N2 ⇄ N3) and C2 (U2 ⇄ N1 ⇄ N2 ⇄ N3). Nodes communicate with each other through a public channel.
2) Two of the intermediate nodes are corrupted and the attacker has full control over them. The compromised intermediate nodes (in our case N1 and N2) remove one layer of encryption from O1 and O2 and send the onions to the exit node N3.
3) After receiving these two onions from the users U1 and U2, and possibly other onions from compromised users, the exit OR node N3 removes the last layer of the encryption and publishes the messages on a public channel.
Fig. 7: No Forward Traceability (users U1 and U2 send m1 and m2 as nested encryptions over two circuits that share the entry node N1 and the exit node N3 but use different middle nodes; all cells travel over public channels)
Note that the ISP does not affect the anonymity game and only acts as a proxy between the users and the outside world. For the anonymity verification, we assume that user U1 and user U2 are honest and follow the protocol. Nevertheless, the actions of any compromised users and honest users can be interleaved in any order.

Theorem: The observational equivalence relation P ≈ Q holds true.

Proof: Automatically proven by ProVerif.
Notice that the evidence records here inherently break anonymity: anybody with access to the logs of the entry, middle, and exit nodes of a circuit can break user anonymity. Therefore, traceability logs have to be indexed and individually encrypted using an appropriate trust-enforcing mechanism. In Section V, we discuss possible solutions.
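As a rough illustration of this requirement (and not of the concrete mechanisms discussed in Section V), the sketch below stores each evidence record under a pseudonym-derived index and encrypts it with a per-entry key that would only be released through a trust-enforcing escrow; the index and key-handling choices are assumptions made purely for illustration.

# Individually encrypted, pseudonym-indexed evidence entries: holding the log
# alone does not reveal entry contents or link them across nodes.
import hashlib
from cryptography.fernet import Fernet

evidence_log = {}

def store_entry(successor_pseudonym, record):
    """Index by a hash of the successor pseudonym; encrypt under a fresh key."""
    entry_key = Fernet.generate_key()  # assumed to be escrowed with a trusted party
    index = hashlib.sha256(successor_pseudonym).hexdigest()
    evidence_log[index] = Fernet(entry_key).encrypt(record)
    return entry_key  # released only when a trace request is authorized

def open_entry(successor_pseudonym, entry_key):
    index = hashlib.sha256(successor_pseudonym).hexdigest()
    return Fernet(entry_key).decrypt(evidence_log[index])

key = store_entry(b"X2", b"evidence record for the successor hop")
assert open_entry(b"X2", key).startswith(b"evidence")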
No forward traceability. The evidence log of the backward traceability protocol in BACKREF does not store any information (i.e., IP addresses) that can identify or verifiably reveal the identity of a node's successor. The log contains only the pseudonym of the successor node, which does not reveal anything about the node's identity.
We formalize this property as an observational equivalence relation between two distinct processes and verify that an adversary cannot distinguish them. Figure 7 provides a graphical representation of the game. To prove the observational equivalence, we model a scenario with concurrent circuit executions. In this game, the adversary can corrupt parties and extract their secrets only after the message transmission over the circuit has completed. For this game, our model involves an additional middle node and an additional user U_2. The two users U_1 and U_2 send two different messages m_1 and m_2 via two circuits. We verify that it is impossible for an attacker to deduce any meaningful information about the successor node for a particular request. Our game works as follows:
1) U_1 and U_2 start the protocol and construct two different circuits C_1 (U_1 → N_1 → N_2 → N_3) and C_2 (U_2 → N_1 → N_2' → N_3), respectively, with adequate values (x_1, x_2, x_3) for circuit C_1 and (x_1', x_2', x_3') for C_2.
2) U_1 and U_2 create onion data structures O_1 and O_2 and send them to the exit node N_3 via the previously built circuits C_1 and C_2. Nodes communicate with each other over public channels.
3) After receiving the two onions from the users, and possibly other onions from compromised users, N_3 removes the last layer of encryption and publishes the messages on a public channel.
4) After protocol completion, the entry node N_1 is compromised and the adversary obtains its evidence log.
In the first process P, U_1 sends m_1 and U_2 sends m_2, while in the second process Q the messages are swapped. For the no forward traceability verification, we assume that all other parties in the protocol remain honest, except the compromised entry node N_1. This restriction is necessary: for example, if two neighboring nodes are compromised, no forward traceability can easily be broken by activating the backward traceability mechanism.
Theorem: The observational equivalence relation P ≈ Q holds true.
Proof: Automatically proven by ProVerif.
Finally, to the best of our knowledge, our formal analysis is the first ProVerif-based analysis of the OR protocol; it can be of independent interest towards formalizing and verifying other properties of the OR protocol.
VII. CONCLUSIONS
In this paper, we presented BACKREF, an accountability
mechanism for AC networks that provides practical repu-
diation for the proxy nodes, allowing selected outbound
traffic flows to be traced back to the predecessor node. It
also provides a full traceability option when all intermediate
nodes are cooperating. While traceability mechanisms have
been proposed in the past, BACKREF is the first that is both
compatible with low-latency, interactive applications (such
as anonymous web browsing) and does not introduce new
trusted entities (like group managers or credential issuers).
BACKREF is provably secure, requires little overhead, and
can be adapted to a wide range of anonymity systems.
REFERENCES
[1] D. Chaum, The dining cryptographers problem: Unconditional
sender and recipient untraceability, J. Cryptology, vol. 1, no. 1,
1988.
[2] H. Corrigan-Gibbs and B. Ford, Dissent: accountable anonymous
group messaging, in CCS, 2010, pp. 340–350.
[3] P. F. Syverson, D. M. Goldschlag, and M. G. Reed, Anonymous
connections and onion routing, in IEEE Symposium on Security and
Privacy, 1997.
[4] D. Chaum, Untraceable electronic mail, return addresses, and digital
pseudonyms, CACM, vol. 24, no. 2, 1981.
[5] P. Mittal and N. Borisov, Shadowwalker: peer-to-peer anonymous
communication using redundant structured topologies, in CCS,
2009, pp. 161–172.
[6] R. Dingledine, N. Mathewson, and P. Syverson, Tor: the second-
generation onion router, in USENIX Security, 2004.
[7] U. Möller, L. Cottrell, P. Palfrader, and L. Sassaman, Mix-
master Protocol Version 2, IETF Internet Draft, 2003,
http://mixmaster.sourceforge.net/.
[8] A. W. Janssen, Tor madness reloaded, Online: http://itnomad.
wordpress.com/2007/09/16/tor-madness-reloaded/, 2007.
[9] AccusedOperator, Raided for operating a Tor exit node, Online:
http://raided4tor.cryto.net/, 2012.
[10] A. W. Janssen, The onion router: A brief introduction and legal as-
pects, http://yalla.ynfonatic.de/media/lbw2007/tor talk-LBW2007.
pdf, 2007, online; accessed October-2013.
[11] S. Köpsell, R. Wendolsky, and H. Federrath, Revocable anonymity,
in ETRICS, 2006, pp. 206–220.
[12] L. V. Ahn, A. Bortz, N. J. Hopper, and K. O'Neill, Selectively
traceable anonymity, in PET, 2006.
[13] C. Diaz and B. Preneel, Accountable anonymous communication,
in Security, Privacy, and Trust in Modern Data Management.
Springer, 2007.
[14] P. Golle, Reputable mix networks, in PET, 2004.
[15] J. Clark, P. Gauvin, and C. Adams, Exit node repudiation for
anonymity networks, in On the Identity Trail: Privacy, Anonymity
and Identity in a Networked Society. Oxford University Press, 2009.
[16] P. C. Johnson, A. Kapadia, P. P. Tsang, and S. W. Smith, Nymble:
Anonymous ip-address blocking, in PETS, 2007.
[17] R. Henry and I. Goldberg, Formalizing anonymous blacklisting
systems, in IEEE Symposium on Security and Privacy, 2011, pp.
81–95.
[18] I. Goldberg, D. Wagner, and E. Brewer, Privacy-enhancing tech-
nologies for the internet, in IEEE Compcon, 1997.
[19] M. K. Reiter and A. D. Rubin, Crowds: anonymity for web
transactions, ACM Trans. Inf. Syst. Secur., vol. 1, no. 1, 1998.
[20] I. Goldberg and A. Shostack, Freedom network 1.0 architecture and
protocols, Zero-Knowledge Systems, Tech. Rep., 2001.
[21] A. Kate, G. M. Zaverucha, and I. Goldberg, Pairing-based onion
routing with improved forward secrecy, ACM Trans. Inf. Syst. Secur.,
vol. 13, no. 4, 2010.
[22] G. Danezis and I. Goldberg, Sphinx: A compact and provably secure
mix format, in IEEE Symposium on Security and Privacy, 2009.
[23] O. Berthold, H. Federrath, and S. Köpsell, Web MIXes: A system
for anonymous and unobservable internet access, in PET, 2001.
[24] G. Danezis, R. Dingledine, and N. Mathewson, Mixminion: design
of a type iii anonymous remailer protocol, in IEEE Symposium on
Security and Privacy, 2003.
[25] M. J. Freedman and R. Morris, Tarzan: a peer-to-peer anonymizing
network layer, in CCS, 2002.
[26] A. Pfitzmann and M. Hansen, A terminology for talking about
privacy by data minimization: Anonymity, unlinkability, unde-
tectability, unobservability, pseudonymity, and identity management,
http://dud.inf.tu-dresden.de/literatur/Anon Terminology v0.34.pdf,
Aug. 2010, v0.34.
[27] D. I. Wolinsky, H. Corrigan-Gibbs, B. Ford, and A. Johnson, Dissent
in numbers: making strong anonymity scale, in OSDI, 2012.
[28] H. Corrigan-Gibbs, D. I. Wolinsky, and B. Ford, Proactively ac-
countable anonymous messaging in verdict, in USENIX Security,
2013.
[29] G. Danezis and L. Sassaman, How to bypass two anonymity
revocation schemes, in PETS, 2008.
[30] L. Øverlier and P. F. Syverson, Improving efficiency and simplicity
of tor circuit establishment and hidden services, in PETS, 2007.
[31] A. Kate and I. Goldberg, Using sphinx to improve onion routing
circuit construction, in FC, 2010.
[32] M. Backes, A. Kate, and E. Mohammadi, Ace: an efficient key-
exchange protocol for onion routing, in WPES, 2012.
[33] D. Catalano, D. Fiore, and R. Gennaro, Certificateless onion rout-
ing, in CCS, 2009.
[34] I. Goldberg, D. Stebila, and B. Ustaoglu, Anonymity and one-
way authentication in key exchange protocols, Designs, Codes and
Cryptography, 2012.
[35] J. Camenisch and A. Lysyanskaya, A formal treatment of onion
routing, in CRYPTO, 2005, pp. 169–187.
[36] G. Danezis, C. Díaz, C. Troncoso, and B. Laurie, Drac: An
architecture for anonymous low-volume communications, in PETS,
2010, pp. 202–219.
[37] E. Shimshock, M. Staats, and N. Hopper, Breaking and provably
fixing Minx, in PETS, 2008, pp. 99–114.
[38] S. Goldwasser, S. Micali, and R. L. Rivest, A digital signature
scheme secure against adaptive chosen-message attacks, SIAM J.
Comput., vol. 17, no. 2, pp. 281–308, 1988.
[39] A. Haeberlen, P. Fonseca, R. Rodrigues, and P. Druschel, Fighting
cybercrime with packet attestation, MPI-SWS, Tech. Rep., 2011,
http://www.mpi-sws.org/tr/2011-002.pdf.
[40] R. Dingledine and N. Mathewson, Tor Protocol Specication, https:
//gitweb.torproject.org/torspec.git/tree/HEAD, 2008, accessed May
2013.
[41] D. Boneh, B. Lynn, and H. Shacham, Short signatures from the Weil
pairing, in ASIACRYPT, 2001.
[42] D. J. Bernstein, N. Duif, T. Lange, P. Schwabe, and B.-Y. Yang,
High-speed high-security signatures, in CHES, 2011.
[43] M. Backes, I. Goldberg, A. Kate, and E. Mohammadi, Provably
secure and practical onion routing, in CSF, 2012, pp. 369–385.
[44] M. Abadi and C. Fournet, Mobile values, new names, and secure
communication, in POPL, 2001.
[45] R. Canetti, Universally composable security: A new paradigm for
cryptographic protocols, in FOCS, 2001.
[46] B. Blanchet, An efficient cryptographic protocol verifier based on
prolog rules, in CSFW, 2001.
[47] BackRef, Introducing accountability to anonymity networks
(extended version), http://crypsys.mmci.uni-saarland.de/projects/
BackRef/.
[48] I. Blake, G. Seroussi, N. Smart, and J. W. S. Cassels, Advances in
Elliptic Curve Cryptography. Cambridge University Press, 2005.
APPENDIX A
BILINEAR PAIRINGS
In this section, we briefly review bilinear pairings. For more detail see [48] and the references therein.
Consider two additive cyclic groups G_1 and G_2 and a multiplicative cyclic group G_T, all of the same prime order p. A bilinear map e is a map e : G_1 × G_2 → G_T with the following properties.
Bilinearity: For all P ∈ G_1, Q ∈ G_2 and a, b ∈ Z_p, e(P^a, Q^b) = e(P, Q)^ab.
Non-degeneracy: The map does not send all pairs in G_1 × G_2 to unity in G_T.
Computability: There is an efficient algorithm to compute e(P, Q) for any P ∈ G_1 and Q ∈ G_2.
APPENDIX B
BLS SIGNATURES
In this section, we briefly review BLS signatures. For more detail see [41] and the references therein.
Consider two Gap co-Diffie-Hellman groups (co-GDH groups) G_1 and G_2 and a multiplicative cyclic group G_T, all of the same prime order p, associated by a bilinear map [48] e : G_1 × G_2 → G_T. Let g_1, g_2, and g_T be generators of G_1, G_2, and G_T, respectively, and let H : {0, 1}* → G_1 be a full-domain hash function. The BLS signature scheme [41] comprises three algorithms, Key Generation, Signing, and Verification, defined as follows:
Key Generation: Choose a random sk ∈_R Z_p and compute pk = g_2^sk. The private key is sk, and the public key is pk.
Signing: Given a private key sk ∈ Z_p and a message m ∈ {0, 1}*, compute h = H(m) ∈ G_1 and the signature σ = h^sk, where σ ∈ G_1.
Verification: Given a public key pk ∈ G_2, a message m ∈ {0, 1}*, and a signature σ ∈ G_1, compute h = H(m) ∈ G_1 and verify that (g_2, pk, h, σ) is a valid co-Diffie-Hellman tuple.
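Concretely, the co-Diffie-Hellman check can be carried out with the pairing itself: (g_2, pk, h, σ) is a valid tuple exactly when

e(h, pk) = e(σ, g_2),

which holds for an honestly generated signature because e(h, pk) = e(h, g_2^sk) = e(h, g_2)^sk = e(h^sk, g_2) = e(σ, g_2).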
APPENDIX C
1W-AKE PROTOCOL
Until recently, Tor has been using an authenticated Diffie-Hellman (DH) key agreement protocol called the Tor authentication protocol (TAP), where the users' authentication challenges are encrypted with the RSA public keys of OR nodes. However, this atypical use of RSA encryption has been found to be inefficient in practice, and several different interactive and non-interactive (one-way authenticated) key agreement (1W-AKE) protocols have been proposed in the literature [30], [34], [33], [21], [31], [32]. TAP has recently been replaced by the ntor protocol by Goldberg, Stebila, and Ustaoglu [34].
Initiate(pk_Q, Q):
1) Generate an ephemeral key pair (x, X ← g^x).
2) Set session id Ψ_P ← H_st(X).
3) Update st(Ψ_P) ← (ntor, Q, x, X).
4) Set m_P ← (ntor, Q, X).
5) Output m_P.
Respond(pk_Q, sk_Q, X):
1) Verify that X ∈ G∗.
2) Generate an ephemeral key pair (y, Y ← g^y).
3) Set session id Ψ_Q ← H_st(Y).
4) Compute (k', k) ← H(X^y, X^sk_Q, Q, X, Y, ntor).
5) Compute t_Q ← Hmac(k', Q, Y, X, ntor, server).
6) Set m_Q ← (ntor, Y, t_Q).
7) Set out ← (k, ∗, X, Y, pk_Q), where ∗ is the anonymous party symbol.
8) Delete y and output m_Q.
ComputeKey(pk_Q, Ψ_P, t_Q, Y):
1) Retrieve (Q, x, X) from st(Ψ_P) if it exists.
2) Verify that Y ∈ G∗.
3) Compute (k', k) ← H(Y^x, pk_Q^x, Q, X, Y, ntor).
4) Verify t_Q = Hmac(k', Q, Y, X, ntor, server).
5) Delete st(Ψ_P) and output k.
If any verification fails, the party erases all session-specific information and aborts the session.
Fig. 8: The ntor protocol
The ntor protocol is in turn derived from a protocol by Øverlier and Syverson [30].
The ntor protocol [34] is a 1W-AKE protocol between two parties P (client) and Q (server), where the client P authenticates the server Q. Let (pk_Q, sk_Q) be the static key pair of Q. We assume that P holds Q's certificate (Q, pk_Q). P initiates an ntor session by calling the Initiate function and sending the output message m_P to Q. Upon receiving a message m'_P, the server Q calls the Respond function and sends the output message m_Q to P. Party P then calls the ComputeKey function with parameters from the received message m'_Q, and completes the ntor protocol. We assume a unique mapping between the ntor session ids Ψ_P and the circuit ids (cid) used in the OR protocol.
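To make the message flow of Fig. 8 concrete, the following minimal sketch instantiates the three ntor functions with X25519 from the 'cryptography' package standing in for the abstract group and SHA-256/HMAC standing in for H and Hmac; the labels and key-derivation split are illustrative assumptions and deliberately do not match the exact Tor specification, and group-membership checks, session-id bookkeeping, and key erasure are omitted.

# Simplified ntor sketch: client P authenticates server Q and both derive k.
import hashlib
import hmac

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)

def pub_bytes(priv):
    return priv.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def kdf(*parts):
    digest = hashlib.sha256(b"|".join(parts)).digest()
    return digest[:16], digest[16:]  # (k', k): MAC key and session key

# Server Q's static key pair; the client is assumed to know pk_Q in advance.
sk_Q = X25519PrivateKey.generate()
pk_Q = pub_bytes(sk_Q)

# Initiate (client P): ephemeral x, send X = g^x.
x = X25519PrivateKey.generate()
X = pub_bytes(x)

# Respond (server Q): ephemeral y, keys from X^y and X^sk_Q, authenticator t_Q.
y = X25519PrivateKey.generate()
Y = pub_bytes(y)
X_pub = X25519PublicKey.from_public_bytes(X)
k1_srv, k_srv = kdf(y.exchange(X_pub), sk_Q.exchange(X_pub), pk_Q, X, Y, b"ntor")
t_Q = hmac.new(k1_srv, pk_Q + Y + X + b"ntor|server", hashlib.sha256).digest()

# ComputeKey (client P): recompute the keys from Y^x and pk_Q^x, then verify t_Q.
Y_pub = X25519PublicKey.from_public_bytes(Y)
pkQ_pub = X25519PublicKey.from_public_bytes(pk_Q)
k1_cli, k_cli = kdf(x.exchange(Y_pub), x.exchange(pkQ_pub), pk_Q, X, Y, b"ntor")
assert hmac.compare_digest(
    t_Q, hmac.new(k1_cli, pk_Q + Y + X + b"ntor|server", hashlib.sha256).digest())
assert k_srv == k_cli  # both sides now hold the same session key k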