Abstract of “Cryptography for Efficiency: New Directions in Authenticated Data Structures” by Charalampos Papamanthou, Ph.D., Brown University, May 2011.

Cloud computing has emerged as an important new computational and storage medium and is increasingly being adopted both by companies and individuals as a means of reducing operational and maintenance costs. However, remotely-stored sensitive data may be lost or modified, and third-party computations may not be performed correctly due to errors, opportunistic behavior, or malicious attacks. Thus, while the cloud is an attractive alternative to local trusted computational resources, users need integrity guarantees in order to fully adopt this new paradigm. Specifically, they need to be assured that uploaded data has not been altered and outsourced computations have been performed correctly. Tackling the above problems requires the design of protocols that are provably secure and at the same time remain highly efficient; otherwise the main purpose of adopting cloud computing, namely efficiency and scalability, is defeated. It is therefore essential that expertise in cryptography and efficient algorithmics be combined to achieve these goals. This thesis studies techniques that allow the efficient verification of data integrity and computation correctness in such adversarial environments. Towards this end, several new authenticated data structures for fundamental algorithmic and computational problems, e.g., hash table queries and set operations, are proposed. The main novelty of this work lies in employing advanced cryptography, such as lattices and bilinear maps, towards achieving high efficiency, departing from traditional hash-based primitives. As such, the proposed techniques lead to efficient solutions that introduce minimal asymptotic overhead and at the same time enable highly desirable features such as optimal verification mechanisms and parallel authenticated data structure algorithms. The small asymptotic overhead does translate into significant practical savings, yielding efficient protocols and system prototypes.

Cryptography for Efficiency: New Directions in Authenticated Data Structures

by Charalampos Papamanthou
B.Sc., Applied Informatics, University of Macedonia, 2003
M.Sc., Computer Science, University of Crete, 2005
M.Sc., Computer Science, Brown University, 2007

A dissertation submitted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in the Department of Computer Science at Brown University.

Providence, Rhode Island, May 2011

© Copyright 2011 by Charalampos Papamanthou

This dissertation by Charalampos Papamanthou is accepted in its present form by the Department of Computer Science as satisfying the dissertation requirement for the degree of Doctor of Philosophy.

Date ________  Roberto Tamassia, Director

Recommended to the Graduate Council

Date ________  Michael T. Goodrich, Reader (University of California, Irvine)

Date ________  Anna Lysyanskaya, Reader

Date ________  Franco P. Preparata, Reader

Approved by the Graduate Council

Date ________  Peter M. Weber, Dean of the Graduate School

Vita

Charalampos Papamanthou was born in Trikala, Greece, 29 years ago. Right after graduating from high school, he began his college studies in Thessaloniki, Greece, receiving his bachelor’s degree in Applied Informatics from the University of Macedonia in 2003. He then traveled south to pursue a master’s degree in Computer Science on the beautiful island of Crete.
There, under the Mediterranean sun, he also did research at the Foundation for Research and Technology Hellas. Upon completion of his studies at the University of Crete in 2005, he decided to cross the Atlantic and move to Providence, Rhode Island, in order to attend Brown University for graduate school. At Brown, he received his master’s and doctoral degrees in Computer Science in 2007 and 2011 respectively. He was also the recipient of the Kanellakis and van Dam fellowships. While in graduate school, he spent two summers on the West Coast, interning at Intel Research and Microsoft Research. His research interests are in computer security, applied cryptography, and the design and analysis of algorithms. Beginning in summer 2011, he will be joining the University of California at Berkeley as a postdoctoral researcher in the Computer Science Division.

Acknowledgments

Many individuals contributed to the outcome of this beautiful educational journey at Brown University. First and foremost, I deeply thank my thesis advisor, Roberto Tamassia, who guided me through the challenging path of graduate school. Roberto’s vast experience in research, combined with his kindness, smile and sincerity, taught me how to produce high-quality work with a positive attitude, always being precise, objective and very self-critical. His efficient quest for perfection, his work ethic, as well as his constructive feedback were vital in shaping not only my research philosophy, but also my daily presence and interactions in an academic environment.
Finally, Roberto’s advice on personal and academic matters has been truly invaluable and was always promptly and generously provided whenever needed. I could not have hoped for a better advisor. Second, I am grateful to Franco P. Preparata, with whom I closely collaborated during my first two years at Brown. Franco was the first faculty member I met when I arrived in Providence, back in 2005. Having known Franco for six years now, I am still amazed by his seemingly endless knowledge of Computer Science, his high integrity, and his loyalty to his colleagues. I thank him for the many technical and political discussions we had, his meticulously prepared lectures on parallel algorithms and computational biology, and for the provably correct advice he would always provide at the right time. I would also like to thank his wife, Rosa Maria, for inviting me multiple times for dinner at their place. Admittedly, these have been the most original and tasty Italian dinners ever!

I would also like to thank the other members of my committee, Michael T. Goodrich and Anna Lysyanskaya. Michael has been a great collaborator, always encouraging new ideas and a diverse research agenda. He provided excellent feedback on the final text of this thesis. Anna taught me the foundations of cryptography, through an engaging introductory class and through the crypto reading group. Her presence in the department and my interactions with her greatly influenced the research path of this dissertation. Also, many thanks to Nikos Triandopoulos, who, apart from being a close friend, has been a reliable colleague, always eager to carefully listen to all my ideas and concerns. Many results in this dissertation have been the outcome of a great deal of fruitful discussions and long technical meetings with him. Finally, I would like to thank Alptekin Küpçü, C. Chris Erway, Bernardo Palazzi, Alexander Heitzmann, Olya Ohrimenko and Danfeng Yao for the work we did together on topics related to this thesis, as well as Petros Maniatis and Seny Kamara for being my internship mentors at Intel Research and Microsoft Research respectively.

The Brown CS faculty, and in particular the professors I interacted with most, namely John Savage, Claire Mathieu, Philip Klein, Tom Doeppner, Pascal Van Hentenryck, Rodrigo Fonseca and Eli Upfal, have been extraordinary. Their persistent dedication to high-quality research and teaching nurtured an inspiring and challenging environment for every graduate student in the department. Also, everyday life in the CIT would not have been as easy had it not been for the Brown CS Astaff and Tstaff. Thank you Janet, Lauren, Jane, Genie, Dorinda, Max, Phirum and Jeff!

Back in Greece, my professors and friends at UoM and UoC provided just the right academic environment. My undergraduate mentors, Konstantinos Paparrizos and Nikolaos Samaras, introduced me to research. My advisor at the University of Crete, Ioannis (Yanni) G. Tollis, helped me transform from an excited undergraduate student into an ambitious and focused graduate student. Working with Yanni was a great experience, and I am grateful to him for inspiring me to do research on exciting problems. Finally, many thanks to my friends and colleagues from my university years in Greece, Dimitris Xinidis, Manos Papaggelis, Adam Arvelakis, Alexandros Stamatakis, Pavlos Pavlidis, Christos Sgaras and Kostas Tzouvaras, for keeping in touch and for sharing their exciting stories with me while I was far away.
Life in Providence would not have been as fun without the good times spent with the following people: Socrate, Dimitri, Yanni, Ari, Pari, Misha, Foteini, Yorgo, Maria, Anastasia, Aggeliki, Olga, Panagioti, Katerina, Basili, Saki, Aparna, Menia, Sophocle, Wenjin, Radu and Doria. Thank you all! The Papavasiliou family in Attleboro made sure America felt like home, and of course, a big thank you belongs to my childhood buddies Pete, Achilleas, Ilias and Manos, as well as to all the members of my extended family, for being a continuous source of encouragement. Finally, many thanks to Vasili, Petro and Michali, for an amazing summer in Seattle (and for taking care of me while on crutches).

The research performed in this thesis was supported by the Center for Geometric Computing, the Plastech Professorship of Computer Science, the van Dam fellowship and the Kanellakis fellowship at Brown University, the National Science Foundation, NetApp and IAM Technology. It has been an honor for me to be a Kanellakis fellow and I would like to wholeheartedly thank the Kanellakis family for their generous support.

Last but not least, I am grateful to my parents Yianni and Gioula and to my brother Christo, for their unconditional love and support during all these years, as well as for everything they sacrificed for my upbringing and education. Finally, to honor her memory, I dedicate this thesis to my late grandmother Artemisia, for the values she imparted to me about life with her simple, but deep in meaning, sayings.

Στη μνήμη της γιαγιάς μου, Αρτεμισίας Χριστοδούλου.
To the memory of my grandmother, Artemisia Christodoulou.

Contents

List of Tables
List of Figures

1 Introduction
  1.1 Thesis motivation
  1.2 Thesis outline

2 Preliminaries and related work
  2.1 Definitions
    2.1.1 Data structures and authenticated data structures
    2.1.2 Complexity model
    2.1.3 Optimality and public verifiability
  2.2 Protocols and applications
    2.2.1 Three-party authenticated data structures protocol
    2.2.2 Two-party authenticated data structures protocol
  2.3 Related work
    2.3.1 Generic collision-resistant hashing
    2.3.2 More advanced cryptography
    2.3.3 Relation to memory checking

3 Accumulators for authenticated hash tables
  3.1 Preliminaries
    3.1.1 Hash tables
    3.1.2 The RSA accumulator
    3.1.3 The bilinear-map accumulator
    3.1.4 The accumulation tree
  3.2 Scheme based on the RSA accumulator
    3.2.1 Main authenticated data structure
    3.2.2 Updates
    3.2.3 Queries and verification
    3.2.4 Correctness and security
    3.2.5 A more practical scheme
    3.2.6 Protocols
  3.3 Scheme based on the bilinear-map accumulator
    3.3.1 Queries and verification
    3.3.2 Protocols
  3.4 Complexity limitations

4 Authenticated structures based on lattices
  4.1 Lattice definitions
    4.1.1 What is a lattice?
    4.1.2 Reductions
    4.1.3 Lattice-based hash function
    4.1.4 Parallel models of computation
  4.2 Main construction
    4.2.1 Algebraic tools
    4.2.2 Algorithms of the scheme
    4.2.3 Partial digests
    4.2.4 Correctness and security
    4.2.5 A note on repeated linearity
  4.3 Authenticated bloom filters
  4.4 Parallel online memory checking
  4.5 Protocols

5 Authenticated sets operations with bilinear maps
  5.1 Preliminaries
    5.1.1 Sets collection data structure scheme
    5.1.2 Subset witnesses
  5.2 Construction and algorithms
  5.3 Queries and verification
    5.3.1 Intersection query
    5.3.2 Union query
    5.3.3 Subset query
    5.3.4 Set difference query
  5.4 Complexity
  5.5 Proof of correctness
  5.6 Proof of security
    5.6.1 Protocols
  5.7 Applications
    5.7.1 Keyword-search
    5.7.2 Timestamped keyword-search
  5.8 Analysis
    5.8.1 System setup
    5.8.2 Communication cost
    5.8.3 Verification cost

6 Optimality with multilinear forms
  6.1 Dictionary data structure
    6.1.1 Non-optimal authenticated dictionaries
  6.2 Multilinear forms
  6.3 An optimal authenticated dictionary
    6.3.1 Dictionary queries and verification
    6.3.2 Main results
    6.3.3 Application in the two-party protocol
  6.4 Summary

7 Conclusions
  7.1 Overview of thesis results and discussion
  7.2 Future work

List of Tables

3.1 In this table, we exhibit a detailed comparison of the asymptotic access and group complexities of various authenticated data structure schemes in the literature with the complexities of our schemes. The underlying data structure scheme is for a hash table storing n elements. All the authenticated data structure schemes compared are defined by algorithms {genkey, setup, update, refresh, query, verify} (see Definition 2.3). Parameter 0 < ε < 1 is a constant, “D. Log” stands for “Discrete Logarithm”, “Generic CR” stands for “Generic Collision Resistance” and “B. q-DH” stands for “Bilinear q-strong Diffie-Hellman”. In all constructions the authenticated data structure has group complexity (i.e., size) O(n) and genkey() has O(1) complexity. Π(q) denotes the proof for a query q and upd is the update information output by algorithm update(). Our schemes are denoted with RHT (RSA-based authenticated hash table) and BHT (bilinear-map-based authenticated hash table). The “one-star” notation (∗) denotes an expected complexity, the “two-star” notation (∗∗) denotes an expected amortized complexity, whereas the “plus” notation (+) denotes an amortized complexity. All schemes in the table are publicly verifiable.

4.1 Asymptotic access and group complexities of various authenticated data structure schemes (see Definition 2.3) for a dynamic table of n entries. Parameter 0 < ε < 1 is a constant and GAPSVP is the gap shortest vector problem in lattices (Definition 4.1). In all schemes, the authenticated structure has group complexity O(n) and genkey() has O(1) complexity. Note that [90] is the published conference version of Chapter 3. The acronyms of the other assumptions can be found in Table 3.1. All schemes presented in the table are publicly verifiable.

5.1 Asymptotic access and group complexities of various authenticated data structure schemes defined by algorithms {genkey, setup, update, refresh, query, verify}, for a sets collection data structure of m sets: The sum of the sizes of all the sets is M and 0 < ε < 1 is a constant. FHE stands for fully-homomorphic encryption, the security of which is based on lattice assumptions, such as the bounded distance decoding and the SplitKey distinguishing problems—see [43].
We note that the scheme based on FHE is not publicly verifiable; it does, however, provide privacy on top of integrity of computations. We show complexities for an intersection query on t = O(1) sets, outputting an intersection of δ elements. All sizes of the intersected and updated sets are Θ(n).

5.2 Comparison of the 2-intersection communication overhead (proof size) of the scheme presented by Morselli et al. [79] with our scheme. Here n1 and n2 are the sizes of the sets that are intersected and δ is the size of the intersection.

6.1 Asymptotic access and group complexities of various authenticated data structure schemes for a dynamic dictionary storing n elements, compared with the optimal authenticated dictionary MFD based on multilinear forms and derived in this chapter. Parameter 0 < ε < 1 is a constant and “M. q-DH” stands for “Multilinear q-strong Diffie-Hellman”. The various acronyms used for variables and assumptions have all been defined in Table 3.1. Note that our construction requires two assumptions, namely M. q-DH and Generic CR.

7.1 Asymptotic access and group complexities of the authenticated data structure schemes presented in this thesis, applied to the fundamental problem of verifying read/write operations on an array of n entries, and compared with the first result on dynamic authenticated data structures by Naor and Nissim [81]. We note that, since all complexities for the plain table data structure are constant, no authenticated data structure scheme presented is optimal. Moreover, based on the recent lower bound for memory checking by Dwork et al. [35], it seems unlikely that such a scheme could be derived.

List of Figures

2.1 The three-party authenticated data structures protocol. During the update phase, the source sends an update u ∈ U to the server along with the respective update information upd output by update(). During the query phase, the client sends a query q ∈ Q to the server and the server runs algorithm query() to output the proof Π(q) for the respective answer.

2.2 The two-party authenticated data structures protocol. During the query phase, the client sends a query q ∈ Q to the server and the server runs algorithm query() to output the proof Π(q) for the answer. During the update phase, the client sends to the server an update u ∈ U, which relates to a certain set of queries Qu ⊆ Q. Then the server computes the set of proofs Π(Qu). This set of proofs will be used by function z(.) of Assumption 2.1, which will output δu(Dh) and δu(auth(Dh)), which are subsequently input to algorithm update().

3.1 The accumulation tree of a set of 64 elements for ε = 1/3: every internal node has 4 = 64^ε children, there are 3 = 1/ε levels in total, and there are 64^(1−i/3) nodes at level i = 0, 1, 2, 3.

4.1 Tree T built on top of a table with 8 values x1, x2, . . . , x8. After producing an n-admissible radix-2 g(.) representation of the children digests, we multiply with either U or D, then we add the two resulting digests and we compute the hash function on them by multiplying with M.
At the leaves of the tree we show the terms that correspond to each index, as computed by Theorem 4.3 (i.e., the partial digests of the root r with reference to every value in the table). The g(.) representations of the internal nodes are indicated with dashed lines (see Definition 4.9). Note that the g(.) representations of the internal nodes are the sums of specific f(.) representations of the leaves; for example, g(d(r12)) = f(Lf(Lf(x5))) + f(Lf(Rf(x6))) + f(Rf(Lf(x7))) + f(Rf(Rf(x8))), where MU = L and MD = R.

Chapter 1

Introduction

During the last few years, cloud computing has emerged as an important new computational and storage medium [55]. In fact, remote data storage (e.g., Amazon S3) and outsourcing of computation (e.g., Google Docs) have become a major everyday phenomenon. An increasingly large number of companies and individuals have adopted cloud computing as a means of reducing operational and maintenance costs. However, the cloud is not a panacea. Quoting from an article published in 2009 at techcrunch.com [2]: “...T-Mobile and Danger, the Microsoft-owned subsidiary that makes the Sidekick, has just announced that they’ve likely lost all user data that was being stored on Microsoft’s servers due to a server failure...”. Therefore, and beyond the control of its owner, remotely stored sensitive data may be lost, modified or accessed by unauthorized entities. Additionally, third-party (i.e., cloud) computations may not be performed correctly, due to errors, opportunistic behavior or malicious attacks. All these cases imply that, while the cloud is an attractive alternative to local trusted computational resources, there is a series of security threats that needs to be addressed in order for this paradigm to be fully adopted by users. For example, integrity and privacy guarantees are critically needed: Specifically, users need to be assured that remote data and computations have not been altered and that no cloud data has leaked.

Tackling the above problems requires the design of protocols and the development of prototypes that, on the one hand, are provably secure and at the same time remain highly efficient; otherwise the main purpose of adopting cloud computing, i.e., efficiency and scalability, is defeated. In other words, the provable security added to a cloud service should not lead to major performance penalties, and the induced overhead should be negligible compared to the actual computational resources needed by the application when executing in an insecure environment. It is essential that expertise in cryptography and efficient algorithmics be combined to achieve these goals.

This thesis addresses the first aspect of cloud security, namely the verification of cloud data integrity and the correctness of cloud computations. It is an extended and formal study of authenticated data structures [105], which comprise systematic and efficient methods for cryptographically checking the integrity of dynamic structured data—stored in adversarial environments—and of queries executed on it.
Namely, given that a data structure is stored at some untrusted entity and some computation can be performed over it by this untrusted entity (e.g., output yes if x is contained in a dictionary D, or output the shortest path from node v to node u in a graph G), an authenticated data structure provides the cryptographic machinery and the algorithms for deciding—without access to the data structure itself but only to some constant-size reliable (and possibly secret) space—whether the returned answer is correct or not. First, an authenticated data structure should be secure: A computationally-bounded adversary should not be able to produce a valid proof for a false answer, under a well-accepted computational assumption. Second, it should be efficient: Its algorithms should have low complexity, ideally not adding too much overhead. Successfully combining both these goals is a challenging task, depending substantially on the underlying cryptography. Under this premise, the main novelty of this thesis lies in employing advanced cryptography for constructing highly efficient authenticated data structures, departing in this way from widely employed hash-based constructions (e.g., Merkle trees [77]), which traditionally use collision-resistant hash functions (e.g., SHA-2 [85]) as a black box and thereby inherit certain complexity lower bounds [106]. Specifically, the coupling of certain cryptographic primitives, such as accumulators [13], lattices [3] and bilinear maps [60], with suitable data structuring and algorithmic techniques, such as hash tables [29] and search trees [47], is explored and exploited, leading to efficient solutions that introduce minimal (and sometimes zero) asymptotic overhead. The small asymptotic overhead does translate into significant practical savings, yielding protocols that compare favorably in practice with existing work. The security of our constructions is based on computational assumptions widely established and accepted by the cryptography community.

1.1 Thesis motivation

The problem of efficiently verifying the integrity of structured data stored at untrusted resources has been an active research area for almost two decades, beginning with the seminal paper of Merkle [77], where the well-known—and since then widely used in practice—hash trees were introduced. Being an alternative to plain digital signatures, hash trees comprise an efficient way to provide integrity proofs for structured data (e.g., an array or a dictionary) stored at computationally-bounded adversarial sites—certain costs, compared to digital signature techniques, decrease from linear to logarithmic. The problem was later formalized by Blum et al. [15], who introduced memory checking, i.e., mechanisms for reliably reading and writing memory cells when the memory is not to be trusted. Later on, and after Naor and Nissim dynamized Merkle’s solution [81], it became clear that verification mechanisms for more complicated structures (e.g., supporting dynamic operations) and more general query types (e.g., verifying the output of an algorithm) were needed. Both these verification tasks could be achieved via memory checking techniques, since every data update or computation may be viewed as writing and reading bits from memory, respectively. However, as
usually happens when directly employing fundamental primitives (e.g., zero-knowledge [45], oblivious RAM [86]) in cryptography for solving more complicated problems, inefficiency is a major issue. This motivated the study of authenticated data structures [105], a model where untrusted parties answer queries on a dynamic data structure, providing a proof of validity of each answer to the user in an efficient way. So far in the literature, the vast majority of algorithms and techniques for authenticated data structures (e.g., [4, 6, 48, 53, 72, 74, 75, 77, 81, 88, 104, 112]) have traditionally relied on cryptographic hashing. In particular, collision resistance, a fundamental property required for data integrity, is achieved through the use of black-box generic functions, such as SHA-2 (these functions are believed to be collision-resistant in practice, but there exists no formal proof of this). However, this black-box property imposes certain complexity limitations through existing lower bounds [106]. For instance, for a dynamic dictionary of size n, Ω(log n) proof complexities are inevitable, since the internal structure of these primitives is not exploitable in any meaningful way. Aiming at more efficient solutions, this thesis studies the effect of cryptography on the efficiency of various authenticated data structures, by exploring certain algebraic properties of advanced cryptographic primitives that traditional black-box generic hash functions lack. Combined with suitable data structures and algorithms, these properties comprise a deciding factor in deriving asymptotically better (even optimal) authenticated data structures, resulting in more scalable protocols and applications.

1.2 Thesis outline

The thesis outline is as follows. We begin with Chapter 2, where we present basic definitions that we use extensively in the thesis, such as the definition of a data structure scheme (Definition 2.2) and of an authenticated data structure scheme (Definition 2.3), along with its correctness and security definitions (Definitions 2.4 and 2.5, respectively). Moreover, in Chapter 2 we show how, given any authenticated data structure scheme as a black box, we can derive a three-party authenticated data structures protocol (Theorem 2.1) or a two-party authenticated data structures protocol (Theorem 2.2), both traditionally used to describe authenticated data structure solutions in the literature (e.g., see [75] for a three-party protocol and [92] for a two-party protocol). This black-box approach not only eases the presentation of our results in subsequent chapters of the thesis, but also helps us avoid repeating protocol-related notions in each chapter. Each of the remaining chapters of the thesis (Chapters 3 through 6) comprises a study of the application of a certain cryptographic primitive toward solving a specific authenticated data structure problem efficiently.

In Chapter 3, we design authenticated data structures for set membership queries on hash tables, using cryptographic primitives called accumulators [23, 83] and applying them in a novel hierarchical way over the stored data. We provide the first construction for authenticating a hash table with constant query cost and sublinear update cost, strictly improving upon previous methods and answering an open problem posed in [81].
The algebraic property we take advantage of in this construction is the commutativity of the RSA exponentiation function, which enables fast updates of cryptographic digests whenever partial information included in the digest changes, offering incrementality at no cost and at the same time allowing for succinct, constant-size proofs—notice that not all functions achieving incrementality at no cost [11] offer efficient proof complexity. The main results of this chapter are given in Theorems 3.2 and 3.4. A preliminary version of this chapter appears in [90].

In Chapter 4, we first design a new authenticated data structure for a dynamic table with n entries. We present the first dynamic authenticated table that is update-optimal and is based on lattices, a mathematical object that has found many applications in cryptography during the last decade. In particular, the update complexity of the authenticated table we design is O(1), improving the “a priori” O(log n) update bounds of previous constructions, such as the Merkle tree. To achieve this result, we establish and exploit a property that we call repeated linearity of lattice-based hash functions. Second, we observe that the repeated linearity of the used lattice-based cryptographic primitive lends itself to a natural notion of parallelism: As such, we describe parallel versions of our authenticated data structure algorithms, yielding the first parallel online memory checker [15] with O(1) query complexity using O(log n) checkers in the CREW model and without using a secret key setting, i.e., only small reliable—but not secret—memory is needed. Theorem 4.4 describes the basic (parallel) authenticated data structure and Theorem 4.6 gives the application of the presented lattice-based authenticated table to parallel online memory checking. A preliminary version of this chapter appears in [93].

In Chapter 5, we study the problem of verifying outsourced set operations over a dynamic collection of sets. Based on the convenient use of the bilinear-map primitive, which has proved to be a very useful tool in cryptography since its first appearance in the literature within a cryptographic context [60], we are able to construct the first operation-sensitive scheme for verifying set operations, such as union and intersection (see Theorem 5.1). Operation-sensitivity is a strong property that enables us to achieve verification costs (proof and verification complexity) proportional to the size of the answer, and not to the time taken to produce the answer—a property that could not be achieved otherwise (e.g., with traditional hash-based techniques) and is obviously highly desirable. In this chapter, we also address applications of our techniques to the verification of keyword searches on outsourced document collections (e.g., inverted-index queries) and of queries in outsourced databases (e.g., equi-join queries). Since set intersection is heavily used in these applications, we obtain new authenticated data structures that compare favorably to existing approaches. This chapter closes an open problem, that of operation-sensitive verification of set operations, posed in [33]. A preliminary version of this chapter appears in [94].

In Chapter 6, we observe that no optimal authenticated data structure (i.e., an authenticated data structure that adds no extra asymptotic overhead to the respective plain data structure) is known to date.¹
However, by assuming the existence of multilinear form generators [91], a cryptographic primitive proposed by Boneh and Silverberg in 2003 [19], the construction of which, however, remains an open problem, we introduce the first optimal authenticated dictionary data structure. The presented authenticated dictionary, described in Theorem 6.1, enjoys proofs of constant size, i.e., asymptotically equal to the size of the answer. We close this chapter with Theorem 6.2, showing a reduction connecting the existence of optimal authenticated dictionaries with the existence of multilinear form generators. A preliminary version of this chapter appears in [91].

We conclude in Chapter 7 by commenting on some open problems and future work.

¹ Note that the lattice-based authenticated structure of Chapter 4 is only update-optimal, whereas the authenticated data structure of Chapter 5 achieves optimal verification costs only.

Chapter 2

Preliminaries and related work

This chapter presents preliminary definitions and results that we use in the rest of the thesis, as well as an extended study of related work on authenticated data structures.

2.1 Definitions

In the remainder of the thesis, we denote with k ∈ N the security parameter. We begin with the definition of a negligible function, used extensively to express our security arguments:

Definition 2.1 (Negligible function) Let f : N → R. We say that f(k) is neg(k) iff for any nonzero polynomial p(k) there exists N such that for all k > N it is f(k) < 1/p(k).

Typical examples of functions that are neg(k) are the functions 1/2^k and poly(k)/2^k.

2.1.1 Data structures and authenticated data structures

To formally describe our authenticated data structure solutions in Chapters 3, 4, 5 and 6, we give definitions for a data structure scheme and the respective authenticated data structure scheme. Similar definitions have appeared in the work of Tamassia and Triandopoulos [107]. To avoid unnecessary complications, we do not define an abstract data type and instead define a data structure scheme directly. We use the notation

{O1, O2, . . . , Oo} ← alg(I1, I2, . . . , Ii)

to denote that algorithm alg has i inputs I1, I2, . . . , Ii and o outputs O1, O2, . . . , Oo. Whenever an input I or an output O appears as (I)∗ or (O)∗ (e.g., algorithms update() and verify() in Definition 2.3), this means that I or O is not required as an input or output of the algorithm but might appear depending on the implemented scheme.

Definition 2.2 (Data structure scheme) Let D be any data structure supporting a set of updates U and a set of queries Q. Denote with Dh the state of the data structure D at time h, where h ≥ 0 is an integer and D0 is the initial state of the data structure D. A data structure scheme D(U, Q) is a collection of the following three polynomial-time algorithms {update, query, check}:

1. Dh+1 ← update(u, Dh): On input an update u ∈ U and the data structure Dh, this algorithm outputs the updated data structure Dh+1;

2. α(q) ← query(q, Dh): On input a query q ∈ Q and the data structure Dh, this algorithm outputs the answer α(q) to query q;

3. {accept, reject} ← check(q, α, Dh): On input a query q ∈ Q, an answer α and the data structure Dh, this algorithm outputs accept if α is a correct answer for query q on data structure Dh; else it outputs reject.

For example, consider a data structure scheme for the dictionary data structure (see Chapter 6), implemented with a red-black tree [29]. The query() algorithm performs a binary search, while the update() algorithm performs the rotations needed for re-balancing the structure.
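The following is a minimal sketch (in Python, with hypothetical names) of such a data structure scheme for a dictionary: updates insert or delete an element, queries test membership, and check() simply re-executes the query on a trusted copy. For brevity, Python's built-in set stands in for the red-black tree of the example; only the asymptotics, not the interface, would change.

```python
# A minimal sketch of Definition 2.2 for a dictionary data structure.
# Illustrative only: the interface {update, query, check} is the point.

def update(u, Dh):
    """u = ('insert', x) or ('delete', x); returns the new state D_{h+1}."""
    op, x = u
    Dh1 = set(Dh)
    if op == 'insert':
        Dh1.add(x)
    else:
        Dh1.discard(x)
    return Dh1

def query(q, Dh):
    """q is an element; the answer alpha(q) is a membership bit."""
    return q in Dh

def check(q, alpha, Dh):
    """Accept iff alpha is the correct answer for query q on Dh."""
    return 'accept' if alpha == query(q, Dh) else 'reject'

# Example: D0 = {2, 5}; insert 7, then query for 7.
D0 = {2, 5}
D1 = update(('insert', 7), D0)
assert check(7, query(7, D1), D1) == 'accept'
```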
We note here that in Definition 2.2, the script letter D in the notation D(U, Q) indicates that D(U, Q) is a data structure scheme for the data structure D (non-script letter), which supports the set of updates U and the set of queries Q.

Definition 2.3 (Authenticated data structure scheme) Let D(U, Q) be a data structure scheme defined by the collection of algorithms {update, query, check}. An authenticated data structure scheme A for the data structure scheme D(U, Q) is a collection of the following six polynomial-time algorithms {genkey, setup, update, refresh, query, verify}:

1. {sk, pk} ← genkey(1^k): This algorithm outputs the secret key sk and the public key pk, given the security parameter k;

2. {auth(D0), d0} ← setup(D0, sk, pk): This algorithm computes the authenticated data structure auth(D0) and the respective digest d0 of auth(D0), given a data structure D0, the secret key sk and the public key pk;¹

3. {Dh+1, (auth(Dh+1))∗, dh+1, upd} ← update(u, Dh, (auth(Dh))∗, dh, sk, pk): This algorithm takes as input an update u ∈ U, a data structure Dh, possibly an authenticated data structure auth(Dh), the digest dh of auth(Dh) and both the secret and the public keys sk and pk. It outputs the data structure Dh+1 ← update(u, Dh), possibly the authenticated data structure auth(Dh+1), the digest dh+1 of auth(Dh+1) and some relative information upd;²

4. {Dh+1, auth(Dh+1), dh+1} ← refresh(u, Dh, auth(Dh), dh, upd, pk): This algorithm takes as input an update u ∈ U, a data structure Dh, an authenticated data structure auth(Dh), the digest dh of auth(Dh), the information upd computed by algorithm update() and only the public key pk. It outputs the data structure Dh+1 ← update(u, Dh), the authenticated data structure auth(Dh+1) and the digest dh+1 of auth(Dh+1);³

5. {Π(q), α(q)} ← query(q, Dh, auth(Dh), pk): On input a query q ∈ Q, a data structure Dh, an authenticated data structure auth(Dh) and the public key pk, this algorithm returns the answer α(q) ← query(q, Dh) to the query q, along with a respective proof Π(q);

6. {accept, reject} ← verify(q, α, Π, dh, (sk)∗, pk): This algorithm takes as input a query q ∈ Q, an answer α, a proof Π, the digest dh of auth(Dh), possibly the secret key sk and the public key pk, and outputs either accept or reject.

¹ The digest d0 of the authenticated data structure auth(D0) is a collision-resistant representation of D0, e.g., the root hash of a Merkle tree [77]. It is usually of constant size.
² Note that this algorithm is only required to output the new digest dh+1 and the new data structure Dh+1. Outputting the new authenticated data structure auth(Dh+1) is not a requirement of the algorithm—this will be important in improving the complexity of this algorithm in some schemes. Also, the secret key is required for execution.
³ Note here that the secret key is not used for execution. However, for correct inputs, the output digest dh+1 is the same as in algorithm update().
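To make the six algorithms concrete, here is a minimal sketch (in Python, with illustrative names, not the thesis's constructions) instantiating Definition 2.3 with a Merkle hash tree [77] over a fixed-size table whose length is a power of two: auth(D) is the full hash tree and the digest d is the root hash. Since no trapdoor is involved, genkey() outputs an empty secret key and update() simply runs refresh(); the schemes of Chapters 3 through 6 differ exactly in how these two algorithms are split and in their complexities.

```python
# A minimal Merkle-tree instance of Definition 2.3 (illustrative sketch).

import hashlib

def H(*parts):
    """Generic collision-resistant hash (SHA-256), used as a black box."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def genkey(k):
    """Merkle trees need no trapdoor: sk is empty, pk names the hash."""
    return None, 'sha256'

def setup(D0, sk, pk):
    """auth(D0) is the hash tree stored level by level; d0 is the root."""
    level = [H(x) for x in D0]
    auth = [level]
    while len(level) > 1:
        level = [H(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        auth.append(level)
    return auth, auth[-1][0]

def refresh(u, Dh, authDh, dh, upd, pk):
    """u = (i, x): write x at index i, rehashing the O(log n) path cells."""
    i, x = u
    Dh1 = list(Dh)
    Dh1[i] = x
    authDh[0][i] = H(x)
    for lvl in range(1, len(authDh)):
        i //= 2
        authDh[lvl][i] = H(authDh[lvl - 1][2 * i], authDh[lvl - 1][2 * i + 1])
    return Dh1, authDh, authDh[-1][0]

def update(u, Dh, authDh, dh, sk, pk):
    """No secret key is exploitable here, so update() just runs refresh()
    and upd is empty; in later chapters the two algorithms truly differ."""
    Dh1, authDh1, dh1 = refresh(u, Dh, authDh, dh, None, pk)
    return Dh1, authDh1, dh1, None

def query(q, Dh, authDh, pk):
    """q is an index; the proof is the sibling path from leaf q to the root."""
    i, proof = q, []
    for lvl in range(len(authDh) - 1):
        proof.append(authDh[lvl][i ^ 1])
        i //= 2
    return proof, Dh[q]

def verify(q, alpha, proof, dh, sk, pk):
    """Recompute the root from alpha and the sibling path; compare with dh."""
    h, i = H(alpha), q
    for sib in proof:
        h = H(h, sib) if i % 2 == 0 else H(sib, h)
        i //= 2
    return 'accept' if h == dh else 'reject'
```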
There are two properties that an authenticated data structure scheme should satisfy, namely correctness and security (the intuition follows from definitions of signature schemes, e.g., see Camenisch and Lysyanskaya [24]): Roughly speaking, the correctness property requires that, for every query q ∈ Q, if a proof Π(q) is computed by algorithm query() (i.e., faithfully), then verify(), on input Π(q) and a correct answer α(q), should always accept, as long as the digest d is updated through algorithm refresh() (i.e., it is the correct one). The security property requires that a computationally-bounded adversary, i.e., an adversary that has access to polynomially-bounded resources (time and space) in the security parameter k, should not be able (except with negligible probability) to produce verifying proofs Π for incorrect answers α corresponding to queries q ∈ Q on an authenticated data structure auth(D) whose digest is updated through adversarially chosen oracle calls to algorithm update()—this is why we require that update() have access to the secret key.

Definition 2.4 (Correctness of authenticated data structure scheme) Let D(U, Q) be a data structure scheme defined by the collection of algorithms {update, query, check} and let A be an authenticated data structure scheme for D(U, Q) defined by the collection of algorithms {genkey, setup, update, refresh, query, verify}. We say that the authenticated data structure scheme A is correct if, for all k ∈ N, for all {sk, pk} output by algorithm genkey(), for all Dh, auth(Dh), dh output by one invocation of algorithm setup() followed by polynomially-many invocations of algorithm refresh(), where h ≥ 0, for all queries q ∈ Q and for all Π(q), α(q) output by algorithm query(q, Dh, auth(Dh), pk), with all but negligible probability, whenever algorithm check(q, α(q), Dh) accepts, so does algorithm verify(q, α(q), Π(q), dh, (sk)∗, pk).

Definition 2.5 (Security of authenticated data structure scheme) Let D(U, Q) be a data structure scheme defined by the collection of algorithms {update, query, check}, let A be an authenticated data structure scheme for D(U, Q) defined by the collection of algorithms {genkey, setup, update, refresh, query, verify}, let k be the security parameter and let {sk, pk} ← genkey(1^k). Denote with Adv a polynomially-bounded adversary that is only given the public key pk. The adversary has unlimited access to all algorithms of A, except for algorithms setup(), update() and possibly algorithm verify(), to which he has only oracle access. The adversary picks an initial state of the data structure D0 and computes auth(D0), d0 through oracle access to algorithm setup(). Then, for i = 0, . . . , h = poly(k), Adv issues an update ui ∈ U on the data structure Di and outputs Di+1, possibly the authenticated data structure auth(Di+1), and di+1 through oracle access to algorithm update(). Finally, the adversary enters the attack stage, where he picks an index 0 ≤ t ≤ h + 1, a query q ∈ Q, an answer α and a proof Π. We say that the authenticated data structure scheme A is secure if, for all k ∈ N, for all {sk, pk} output by algorithm genkey(), and for all polynomially-bounded adversaries Adv, it is

Pr[ {q, Π, α, t} ← Adv(1^k, pk);  accept ← verify(q, α, Π, dt, (sk)∗, pk);  reject ← check(q, α, Dt) ] ≤ ν(k),

where ν(k) is neg(k).
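Continuing the Merkle-tree sketch above (all names come from that illustrative code, not from the thesis), the following snippet exercises the two properties: a faithfully computed proof verifies against the digest kept up to date via refresh() (correctness), while a tampered answer makes verify() reject—the event whose probability Definition 2.5 bounds, resting here on the collision resistance of SHA-256.

```python
# Correctness: honest proofs verify against the refreshed digest.
# Security (informally): a false answer with the same proof is rejected.

sk, pk = genkey(128)
D = [b'a', b'b', b'c', b'd']
auth, d = setup(D, sk, pk)
D, auth, d = refresh((2, b'z'), D, auth, d, None, pk)   # write 'z' at index 2

proof, alpha = query(2, D, auth, pk)
assert verify(2, alpha, proof, d, sk, pk) == 'accept'   # correctness
assert verify(2, b'c', proof, d, sk, pk) == 'reject'    # stale/false answer
```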
2.1.2 Complexity model

To explicitly measure the complexity of various algorithms with respect to the number of primitive cryptographic operations, without considering the dependency on the security parameter, we adopt the complexity model used in memory checking [15, 35], which has only been used implicitly in the literature on authenticated data structures:

Definition 2.6 (Access complexity) The access complexity of an algorithm is defined as the number of memory accesses this algorithm performs on the (authenticated) data structure stored in an indexed memory of n cells, in order for the algorithm to complete its execution.

Here, “access complexity” is used instead of the “query complexity” of memory checking [15, 35] to avoid ambiguity when referring to algorithm query() of the authenticated data structure scheme. We also require that each memory cell can store up to O(poly(log n)) bits, a word size used in Blum’s original memory checking work [15] as well as in subsequent work [35]. For example, a Merkle tree [77] has O(log n) update access complexity, since the update algorithm needs to read and write O(log n) memory cells of the authenticated data structure, each cell storing exactly one hash value.

Definition 2.7 (Group complexity) The group complexity of a data collection (e.g., proof group complexity or authenticated data structure group complexity) is defined as the number of elementary data objects (e.g., hash values or elements in Zp) contained in that collection.

Whenever it is clear from the context, we omit the terms “access” and “group”. Concerning the above definitions, we note first that, since the size of the problem n, in a cryptographic setting, has to be polynomially-bounded by the security parameter k, i.e., n = o(2^k), the O(.) notation appearing in the following chapters expresses asymptotic results for values of n → 2^k, and not for values of n → ∞, as is mathematically implied by the O(.) notation [29]. Second, we observe that access complexity captures the notion of “time” and group complexity captures the notion of “space”. However, access and group complexities are in principle smaller than time and space complexities. This is because time and space complexities count numbers of bits and are always functions of the security parameter k. Being in a cryptographic setting, the security parameter is always Ω(log n), and therefore time and space complexities are always Ω(log n)—whereas access and group complexities can be O(1).

2.1.3 Optimality and public verifiability

We now give the definition of an optimal authenticated data structure scheme. Intuitively, an optimal authenticated data structure scheme A for a data structure scheme D(U, Q) should not add any asymptotic overhead to the complexity of the algorithms of the data structure scheme D(U, Q). In the following definition, algorithms of A and algorithms of D(U, Q) that share a name (e.g., update()) are distinguished by context: the plain data structure algorithms always appear on the right-hand sides of the bounds.

Definition 2.8 (Optimal authenticated data structure scheme) Suppose D(U, Q) is a data structure scheme defined by the collection of algorithms {update, query, check} and A is a correct and secure authenticated data structure scheme for D(U, Q), defined by the collection of algorithms {genkey, setup, update, refresh, query, verify}. The authenticated data structure scheme A is optimal if all of the following hold:

1. (Space optimality) For all integers h ≥ 0, the group complexity of the authenticated data structure auth(Dh) is no more than the group complexity of the data structure Dh, i.e., |auth(Dh)| = O(|Dh|);
2. (Update optimality) For all updates u ∈ U, the sum of the access complexity of the scheme's update() plus the group complexity of upd (output by update()) plus the access complexity of refresh() is no more than the access complexity of the plain algorithm update() of D(U, Q), i.e., |update()| + |upd| + |refresh()| = O(|update()|);

3. (Query optimality) For all queries q ∈ Q, the access complexity of the scheme's query() is no more than the access complexity of the plain algorithm query() of D(U, Q), i.e., |query()| = O(|query()|);

4. (Proof and verification optimality) Let α(q) and Π(q) be the answer and the proof output by query() on input a query q ∈ Q. Then, for all queries q ∈ Q:

• The group complexity of the proof Π(q) is no more than the group complexity of the query q plus the group complexity of the answer α(q), i.e., |Π(q)| = O(|q| + |α(q)|);

• The access complexity of verify() is no more than the group complexity of the query q plus the group complexity of the answer α(q), i.e., |verify()| = O(|q| + |α(q)|).

We note here that constructing an optimal authenticated data structure scheme appears to be a difficult task. In Chapter 5 we construct an authenticated data structure scheme that is almost optimal (update costs are increased by a logarithmic factor), whereas in Chapter 6 we construct an optimal authenticated data structure scheme for a dictionary data structure, which is, however, based on a cryptographic primitive that is not known to exist yet.

Definition 2.9 (Publicly-verifiable authenticated data structure scheme) Suppose D(U, Q) is a data structure scheme and A is an authenticated data structure scheme for D(U, Q) defined by the collection of algorithms {genkey, setup, update, refresh, query, verify}. Let also k be the security parameter and {sk, pk} ← genkey(1^k). The authenticated data structure scheme A is publicly-verifiable if algorithm verify() does not require the secret key sk as an input.

2.2 Protocols and applications

Let A be a (publicly-verifiable) authenticated data structure scheme for a data structure scheme D(U, Q), and let SIG be a correct and secure public-key signature scheme (e.g., see Camenisch and Lysyanskaya [24]); we write SIG rather than a single letter to avoid confusion with the set of queries Q. We now describe how an authenticated data structure scheme A may be used in various protocols, such as a three-party authenticated data structures protocol (e.g., see Tamassia [105] and Martel et al. [75]) or a two-party authenticated data structures protocol (e.g., see Papamanthou and Tamassia [92]). For the protocol descriptions we use the following notation:

• [R]t : program. Program program is executed by party R at time t;

• [R → G]t : data. Data data is sent from party R to party G at time t.

2.2.1 Three-party authenticated data structures protocol

A typical setting where an authenticated data structure scheme can be employed involves three participating entities, usually referred to as the three-party model [105] (see Figure 2.1): A trusted party, called the source, owns a data structure that is replicated, along with some cryptographic information, to one or more untrusted parties, called servers. Clients issue data structure queries to the servers and wish to publicly verify the answers received from the servers, based only on the trust they have in the source. This trust is conveyed through a time-stamped signature on the digest of the data structure (a collision-resistant succinct representation of the data structure, e.g., the root hash of a Merkle tree—see Definition 2.3), which is made available by the source.
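As a minimal illustration of this trust mechanism (illustrative code; HMAC stands in here for the public-key signature scheme SIG of Protocol 2.1 below, with which verification would need only the source's public key rather than a shared secret), the source periodically authenticates d||t, and a client accepts a digest only if the tag verifies and the timestamp is fresh:

```python
# Sketch of the timestamped-digest mechanism: the source authenticates
# d||t; the client checks authenticity and freshness (T - t < dt).
# The HMAC key is a stand-in for SIG's signing/verification key pair.

import hmac, hashlib, time

DT = 60.0                                   # signing period, in seconds

def sign_digest(key, d, t):
    msg = d + str(t).encode()
    return hmac.new(key, msg, hashlib.sha256).digest()

def client_accepts(key, d, t, sgn, T):
    fresh = (T - t) < DT                    # cf. step 4(d)i of Protocol 2.1
    authentic = hmac.compare_digest(sgn, sign_digest(key, d, t))  # cf. 4(d)ii
    return fresh and authentic

key = b'source-signing-key'
d, t = b'\x12' * 32, time.time()            # current digest and timestamp
sgn = sign_digest(key, d, t)
assert client_accepts(key, d, t, sgn, time.time())
```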
During an update, the source needs just to compute the new digest, whereas the server needs to update the authenticated data structure as a whole.

Figure 2.1: The three-party authenticated data structures protocol. During the update phase, the source sends an update u ∈ U to the server along with the respective update information upd output by update(). During the query phase, the client sends a query q ∈ Q to the server and the server runs algorithm query() to output the proof Π(q) for the respective answer.

We describe this protocol formally below:

Protocol 2.1 A three-party authenticated data structures protocol involves three participating entities: a trusted source, T, that has access to both the public and the secret keys; an untrusted server, S, that has access only to the public keys; and a client, C, that has access only to the public keys. Let k be the security parameter, let A = {genkey, setup, update, refresh, query, verify} be a correct and secure publicly-verifiable authenticated data structure scheme for a data structure scheme D(U, Q), and let SIG = {GENKEY, SIGN, VERIFY} be a correct and secure public-key signature scheme. The protocol has four phases:

1. Setup phase: Source T owns the data structure D (described by data structure scheme D(U, Q)) at time t0. The setup phase consists of the following steps:

(a) [T]t0 : {SK, PK} ← GENKEY(1^k); (signature scheme key generation)
(b) [T]t0 : {sk, pk} ← genkey(1^k); (authenticated data structure scheme key generation)
(c) [T]t0 : {auth(D), d} ← setup(D, sk, pk); (computing the authenticated data structure)
(d) [T]t0 : Store D, auth(D), d; (source stores necessary information)
(e) [T → S]t0 : D, auth(D), d; (outsourcing)
(f) [S]t0 : Store D, auth(D), d. (server stores necessary information)

2. Update phase: Let u ∈ U be an update issued by source T on data structure D at time tτ, and let D′ be the updated data structure. The update phase consists of the following steps:

(a) [T]tτ : {D′, (auth(D′))∗, d′, upd} ← update(u, D, (auth(D))∗, d, sk, pk); set D = D′, (auth(D))∗ = (auth(D′))∗, d = d′; (new digest generation)
(b) [T → S]tτ : u, upd; (sending relative update information)
ACCEPT ← VERIFY(sgn, d||t, PK); ( d is the most recent digest); iii. accept ← verify(q, α(q), Π(q), d, pk). else output REJECT. We can now state the main theorem of this section: Theorem 2.1 Let A = {genkey, setup, update, refresh, query, verify} be a correct and secure publicly-verifiable authenticated data structure scheme for a data structure scheme D(U, Q) 19 and k be the security parameter. Let g(n), s(n), u(n), r(n), q(n) and v(n) be the access complexities of algorithms genkey, setup, update, refresh, query, verify respectively. Let also i(n), p(n) and α(n) be the group complexity of the information upd (output by algorithm update()), of the proof Π(q) (output by algorithm query()) and of the answer α(q) (output by algorithm query()) respectively and f (n) be the group complexity of the authenticated data structure auth(D) (output by algorithm setup()). Then there exists a three-party authenticated data structures protocol involving a trusted source T , an untrusted server S and a client C for verifying queries q ∈ Q and at the same time supporting updates u ∈ U such that: 1. The setup at the source T has O(s(n)) access complexity; 2. The update at the source T has O(u(n)) access complexity; 3. The space needed at the source T has O(f (n)) group complexity; 4. The communication between the source T and the server S has O(i(n)) group complexity; 5. The update at the server S has O(r(n)) access complexity; 6. The query at the server S has O(q(n)) access complexity; 7. The space needed at the server S has O(f (n)) group complexity; 8. The communication between the server S and the client C has O(p(n) + α(n)) group complexity; 9. The verification at the client C has O(v(n)) access complexity; 10. For a query q ∈ Q sent by the client C to the server S at any time (even after updates), let α be an answer and let π be a proof returned by the server S. With probability Ω(1 − neg(k)), the client C accepts the answer α if and only if α is correct. 20 Proof: (Complexity) The protocol in question is Protocol 2.1. In the setup phase the access complexity of Item 1a and Item 1b is O(1) (they are not accessing the data structure at all). The access complexity of Item 1c is O(s(n)) since it involves one call to algorithm setup(). Therefore the total setup access complexity at the source is O(s(n)). The update at the source involves one call to algorithm update() (see Item 2a), therefore it has O(u(n)) access complexity. Both the source and the server store the authenticated data structure auth(D), therefore their space has O(f (n)) group complexity. The communication between the source and the server has O(i(n)) group complexity, since information u and upd are sent (see Item 2b) and also periodically (i.e., every ∆t time units) a signature on the timestamped digest is sent (see Item 3c), which has O(1) group complexity. The update at the server has O(r(n)) access complexity since it involves one call to algorithm refresh() (see Item 2c). Computing a proof at the server (i.e., query at the server) has O(q(n)) access complexity since it involves one call to algorithm query() (see Item 4b). The communication between the server and the client has O(p(n) + α(n)) group complexity since it involves sending off the proof and the answer, a digest, a signature and a timestamp (see Item 4c). 
Finally, the verification at the client has O(v(n)) access complexity since it involves checking Relations 4(d)i (O(1) complexity), 4(d)ii (O(1) complexity) and 4(d)iii (O(v(n)) complexity, since it involves one call to algorithm verify()).

(Security) Let now T be the time of a query q ∈ Q. The client sends query q to the server (Item 4a). The server replies with an answer α, a proof π, a digest d, a timestamp t such that T − t < ∆t, and a signature sgn (Item 4c), computed by the trusted source (Item 3a). If server S follows the protocol, it will output a correct answer α and a correct digest d (output by refresh()). Since a correct signature scheme and a correct authenticated data structure scheme are used, it follows that the client will accept with all but negligible probability (see Definition 2.4). Suppose now that the server does not follow the protocol and that the client accepts α, while α is an incorrect answer. In this case, the following three events have to be true:

• E_1: ACCEPT ← VERIFY(sgn, d||t, PK). This event can be partitioned into the following events:

1. E_{1,0}: Digest d is not the correct digest, i.e., not the one signed by the source at time t;

2. E_{1,1}: Digest d is the correct digest, i.e., the one signed by the source at time t;

• E_2: accept ← verify(q, α, π, d, pk);

• F: α is an incorrect answer to query q.

Therefore the probability in question is the probability Pr[E_1 ∩ E_2 ∩ F]. Splitting E_1 into E_{1,0} and E_{1,1} and upper-bounding each term, this is bounded by Pr[E_2 | E_{1,1} ∩ F] + Pr[E_{1,0}] = P_1 + P_2. However, since a secure authenticated data structure scheme A is used, P_1 is neg(k). Finally, since a secure (i.e., unforgeable) signature scheme is used, P_2 is also neg(k). This concludes the security proof for the three-party protocol. □

2.2.2 Two-party authenticated data structures protocol

We now continue with the description of the two-party authenticated data structures protocol. This protocol involves a trusted client and an untrusted server (see Figure 2.2). The client can store only a constant amount of data and usually cannot perform expensive computations (e.g., it may be a mobile device such as an iPhone). This model is close to the model of outsourced verifiable computation [5, 28, 41], which has recently appeared in the literature. The main differences with the three-party protocol are the following:

1. The client performs the updates, sends the queries to the server and verifies the answers;

2. The client stores some state of constant size only and cannot store the authenticated data structure;

3. The authenticated data structure scheme used need not be publicly-verifiable.

Figure 2.2: The two-party authenticated data structures protocol. During the query phase, the client sends a query q ∈ Q to the server and the server runs algorithm query() to output the proof Π(q) for the answer. During the update phase, the client sends to the server an update u ∈ U, which relates to a certain set of queries Q_u ⊆ Q. Then the server computes the set of proofs Π(Q_u). This set of proofs will be used by function z(.) of Assumption 2.1, which will output δ_u(D_h) and δ_u(auth(D_h)), which are subsequently input to algorithm update().

Before we continue with the description of the protocol, we give a necessary definition, inspired by the definition of t-heavy locations in the work on memory checking lower bounds of Dwork et al. [35].
This definition will allow us to formally characterize the parts of the authenticated data structure that are accessed during an update. This characterization will be important for formalizing the two-party protocol:

Definition 2.10 (Heavy locations of an update) Let A be an authenticated data structure scheme for a data structure scheme D(U, Q) defined by the collection of algorithms {genkey, setup, update, refresh, query, verify}, k be the security parameter and {sk, pk} ← genkey(1^k). Let u ∈ U be an update that updates the data structure from D to D′ and the authenticated data structure from auth(D) to auth(D′). We denote with δ_u(D) ⊆ D and δ_u(auth(D)) ⊆ auth(D) the sets of memory locations of D and auth(D) respectively that are accessed by algorithm update() during update u. Analogously, we denote with δ_u(D′) ⊆ D′ and δ_u(auth(D′)) ⊆ auth(D′) the sets of memory locations of D′ and auth(D′) respectively that are altered by algorithm update() during update u. Namely, we have the following equivalence:

{D′, auth(D′), d′, upd} ← update(u, D, auth(D), d, sk, pk) ⇔ {δ_u(D′), δ_u(auth(D′)), d′, upd} ← update(u, δ_u(D), δ_u(auth(D)), d, sk, pk) .

The above definition implies that, in order for update() to execute, it needs to have access only to the parts of the (authenticated) data structure required for execution, and not to the whole (authenticated) data structure. This is essential for the two-party model, since the client cannot store all the data locally.

An assumption for formalizing the two-party protocol

Let A = {genkey, setup, update, refresh, query, verify} be an authenticated data structure scheme for a data structure scheme D(U, Q). Let Q′ ⊆ Q be a set of queries and Π(Q′) and α(Q′) be the sets of proofs and answers respectively output by query(q′, D_h, auth(D_h), pk) for all queries q′ ∈ Q′. We make the following assumption in order to achieve a generalized application of an authenticated data structure scheme in the two-party protocol: For every update u ∈ U, there exists a set of data structure queries Q_u ⊆ Q such that the set of memory locations of D_h and the set of memory locations of auth(D_h) accessed by algorithm update() during update u (i.e., δ_u(D_h) and δ_u(auth(D_h))) are a function of Q_u, α(Q_u), Π(Q_u). We state this formally now:

Assumption 2.1 (Update query) Let A = {genkey, setup, update, refresh, query, verify} be an authenticated data structure scheme for a data structure scheme D(U, Q). For every update u ∈ U on data structure D_h, there exists a set of queries Q_u ⊆ Q such that if {Π(Q_u), α(Q_u)} ← query(Q_u, D_h, auth(D_h), pk), then it is

{δ_u(auth(D_h)), δ_u(D_h)} = z(Q_u, α(Q_u), Π(Q_u)) ,

for some well-defined function z(.), computable with complexity O(|Q_u| v(n)), where v(n) is the access complexity of verify().

The following example gives some intuition about Assumption 2.1.

Example 2.1 Let us consider the case of a Merkle tree [77] that is used to verify the contents of an n-index array A. Let also u be the update "set A[i] = x" and let A[i] = y before the update u takes place. The respective set of queries Q_u consists of one query q_u, namely the query "return the contents of cell i". Let Π(q_u) be a verifying proof (a logarithmic-sized chain of hash values) and let α(q_u) be the correct answer, i.e., α(q_u) = y. Note that δ_u(D_h) = A[i] = y and δ_u(auth(D_h)) is the path of hash values that is computed during the verification. Namely, function z(.) in this case is the Merkle tree verification algorithm. Moreover, z(.) has O(log n) complexity, equal to the complexity of the verification algorithm.
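To make the example concrete, here is a minimal Python sketch of this instantiation of z(.): it re-runs the Merkle verification on the proof Π(q_u) and, if the recomputed root matches the digest, returns the leaf (δ_u(D_h)) together with the chain of hashes recomputed along the way (δ_u(auth(D_h))). The function name merkle_z and the proof encoding (a leaf-to-root list of sibling hashes) are illustrative choices, not notation from the thesis.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_z(i, value, proof, digest):
    """Verify that A[i] = value against `digest`; on success return
    (delta_u(D), delta_u(auth(D))), otherwise None."""
    node, touched = _h(value), []
    idx = i
    for sibling in proof:              # one sibling hash per tree level
        touched.append(node)
        # the parity of idx decides whether node is a left or right child
        node = _h(node + sibling) if idx % 2 == 0 else _h(sibling + node)
        idx //= 2
    touched.append(node)               # the recomputed root
    if node != digest:
        return None                    # proof rejected
    return value, touched
```

Both the verification and the extraction of the touched locations cost O(log n), matching the complexity claim above.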
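The update phase of Protocol 2.2 can be summarized from the client's side as below. This is a schematic sketch in which the server and the scheme's algorithms are passed in as callables; server, verify, z and update are placeholders and error handling is elided.

```python
def client_update(server, u, Q_u, d, sk, pk, verify, z, update):
    """Sketch of Items 2a-2e of Protocol 2.2, run by the client."""
    proofs, answers = server.send_update(u)        # Items 2a-2c
    if not verify(Q_u, answers, proofs, d, sk, pk):
        raise ValueError("REJECT")                 # Item 2(d)i
    delta_auth, delta_D = z(Q_u, answers, proofs)  # Item 2(d)ii
    # Item 2(d)iii: update() runs on the heavy locations only, so the
    # client never materializes the full (authenticated) data structure.
    _, _, d_new, upd = update(u, delta_D, delta_auth, d, sk, pk)
    if upd:                                        # Items 2(d)iv and 2e
        server.send_upd(upd)                       # the one extra round
    # the server now runs refresh() (Item 2f) to update its own copy
    return d_new                                   # new constant-size state
```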
Theorem 2.2 Let A = {genkey, setup, update, refresh, query, verify} be a correct and secure authenticated data structure scheme (not necessarily publicly-verifiable) for a data structure scheme D(U, Q) and k be the security parameter. Suppose Assumption 2.1 holds for A, such that for each update u ∈ U there exists a respective set of queries Q_u, as defined in Assumption 2.1. Let g(n), s(n), u(n), r(n), q(n) and v(n) be the access complexities of algorithms genkey, setup, update, refresh, query, verify respectively. Let also i(n), p(n) and α(n) be the group complexities of the information upd (output by algorithm update()), of the proof Π(q) (output by algorithm query()) and of the answer α(q) (output by algorithm query()) respectively, and let f(n) be the group complexity of the authenticated data structure auth(D) (output by algorithm setup()). Then there exists a two-party authenticated data structures protocol involving a trusted client C and an untrusted server S for verifying queries q ∈ Q and at the same time supporting updates u ∈ U such that:

1. The protocol is non-interactive if the information upd output by update() is empty; otherwise it requires one round of interaction;

2. The setup at the client C has O(s(n)) access complexity;

3. The update at the client C has O(|Q_u| v(n) + u(n)) access complexity;

4. The verification at the client C has O(v(n)) access complexity;

5. The space needed at the client C has O(1) group complexity;

6. The communication between the client C and the server S has O(|Q_u|(p(n) + α(n)) + i(n)) group complexity during updates and O(p(n) + α(n)) group complexity during queries;

7. The update at the server S has O(|Q_u| q(n) + r(n)) access complexity;

8. The query at the server S has O(q(n)) access complexity;

9. The space needed at the server S has O(f(n)) group complexity;

10. For a query q ∈ Q sent by the client C to the server S at any time (even after updates), let α be an answer and let π be a proof returned by the server S. With probability Ω(1 − neg(k)), the client C accepts the answer α if and only if α is correct.

Proof: Note that Item 2e, which introduces one round of interaction, is executed only when upd ≠ ∅. Therefore Item 1 of Theorem 2.2 holds.

(Complexity) The protocol in question is Protocol 2.2. In the setup phase the access complexity of Item 1a is O(1) (it is not accessing the data structure at all). The access complexity of Item 1b is O(s(n)) since it involves one call to algorithm setup(). Therefore the total setup access complexity at the client is O(s(n)). The update at the client involves one verification of α(Q_u) (see Item 2(d)i, which has O(|Q_u| v(n)) complexity), the application of function z(.) to output δ_u(D_h) and δ_u(auth(D_h)) (see Item 2(d)ii), which, by Assumption 2.1, has O(|Q_u| v(n)) complexity, and one call to algorithm update() (see Item 2(d)iii). Therefore the total complexity is O(|Q_u| v(n) + u(n)). The server stores the authenticated data structure auth(D), therefore its space has O(f(n)) group complexity. The client stores only the digest d, therefore it needs space of O(1) group complexity. The communication between the client and the server has

• O(|Q_u|(p(n) + α(n)) + i(n)) group complexity during an update u, since Π(Q_u) (Item 2c), α(Q_u) (Item 2c) and the information upd (Item 2e) are exchanged between the two parties;

• O(p(n) + α(n)) group complexity during queries, since it involves sending a proof and an answer for one query (see Item 3c).

The update at the server has O(|Q_u| q(n) + r(n)) access complexity since it involves computing α(Q_u) and Π(Q_u) (Item 2b) and one call to algorithm refresh() (see Item 2f). Computing a proof at the server (i.e., the query at the server) has O(q(n)) access complexity since it involves one call to algorithm query() (see Item 3b). Finally, the verification at the client has O(v(n)) access complexity since it involves one call to algorithm verify() (Item 3d), which has O(v(n)) complexity.

(Security) Let now T be the time of a query q ∈ Q.
The client sends query q to the server (Item 3a). The server replies with an answer α and a proof π (Item 3b). If server S follows the protocol, it will output a correct answer α. Since a correct authenticated data structure scheme is used, it follows that the client will accept with all but negligible probability (see Definition 2.4). Suppose now that the server does not follow the protocol and that the client accepts α, while α is an incorrect answer. Let E be that event. Note that E can be written as

E = E ∩ [(digest d is correct) ∪ (digest d is not correct)] .

Therefore

Pr[E] ≤ Pr[E ∩ (digest d is correct)] + Pr[E ∩ (digest d is not correct)]
≤ Pr[E ∩ (digest d is correct)] + Pr[digest d is not correct]
≤ Pr[E | (digest d is correct)] + Pr[digest d is not correct]
= P_1 + P_2 .

However, since a secure authenticated data structure scheme A is used, P_1 is neg(k). By Lemma 2.1, P_2 is also neg(k). This completes the proof. □

Lemma 2.1 Let k be the security parameter. The digest d used by every verification at Item 3d of Protocol 2.2 is correct with probability Ω(1 − neg(k)), even in the presence of updates.

Proof: Let u_i, for some i ≥ 0, be the first update where the protocol rejects at Item 2(d)i (if there is such an update). We prove the lemma by induction on the number of updates before update u_i. Before the execution of the first update u_0 issued by the client, the lemma holds since the digest d_0 is output by setup() in Item 1b, run by the trusted client. Therefore d_0 is correct with probability 1. Suppose the lemma holds for the time period before the execution of the update u_{i−1} at Item 2(d)iii, namely the digest d_{i−1}, used by every verification before update u_{i−1}, is correct with probability Ω(1 − neg(k)). Let α(Q_{u_{i−1}}) and Π(Q_{u_{i−1}}) be the answers and the proofs related to the update u_{i−1}, derived from Assumption 2.1. Since the protocol does not reject during update u_{i−1}, the verification of α(Q_{u_{i−1}}) and Π(Q_{u_{i−1}}) at Item 2(d)i is successful. Therefore, since d_{i−1} is correct with probability Ω(1 − neg(k)) (by the inductive hypothesis), the values α(Q_{u_{i−1}}) and Π(Q_{u_{i−1}}) are also correct with probability Ω(1 − neg(k)), since a secure authenticated data structure scheme is used (see Definition 2.5). This implies that δ_{u_{i−1}}(auth(D_{i−1})) and δ_{u_{i−1}}(D_{i−1}), output by z(Q_{u_{i−1}}, α(Q_{u_{i−1}}), Π(Q_{u_{i−1}})) at Item 2(d)ii, are correct as well with overwhelming probability. Therefore update() outputs the correct digest d_i with probability Ω(1 − neg(k)), since it executes on correct data δ_{u_{i−1}}(auth(D_{i−1})) and δ_{u_{i−1}}(D_{i−1}) at Item 2(d)iii. Thus the digest d_i, used by every verification before update u_i, is correct with probability Ω(1 − neg(k)). This completes the proof. □

2.3 Related work

In this section we present a general literature review related to authenticated data structures [105] and, more broadly, to methods developed for checking the integrity of data and computations stored and executed by adversarial parties. In the description of the related work, we denote with n the size of the data structure (e.g., the number of elements stored in the dictionary). Note that the subsequent chapters of the thesis contain references to literature that is more closely related to the algorithmic or cryptographic problem studied in each specific chapter.
2.3.1 Generic collision-resistant hashing

Since the appearance of Merkle's seminal paper on hash trees [77], many works in the authenticated data structures literature have used generic collision-resistant hashing to realize efficient integrity checking mechanisms. Generic collision-resistant hashing refers to the use of certain cryptographic hash functions, such as MD5 and SHA-2, as a black box, i.e., without exploiting the internal (algebraic) structure of the algorithms implementing them. These constructions have been known to be very efficient in practice. However, due to the heuristic nature of their security arguments, various attacks on them (e.g., the attacks on SHA-1 [111] and MD5 [103]) have appeared over the years. Nevertheless, several constructions based on generic collision-resistant hashing have been developed for the verification of various queries on dynamic data structures. After Naor and Nissim dynamized Merkle's solution [81], providing the first dynamic authenticated dictionary, Goodrich and Tamassia [48] presented another efficient realization of a dynamic authenticated dictionary with skip lists, which was enhanced with persistence by Anagnostopoulos et al. [4] and was subsequently tested by Goodrich et al. [50] and implemented in a two-party protocol by Papamanthou and Tamassia [92], finding many applications in authenticated file systems (e.g., see Goodrich et al. [46]) and authenticated outsourced storage (e.g., see Heitzmann et al. [56]). Martel et al. [75] presented several new authenticated data structures for more complicated computations (e.g., database queries), supporting efficient I/O algorithms as well. Distributed authenticated hash tables were introduced by Tamassia and Triandopoulos [104] and the verification of more complicated queries, such as connectivity queries on graphs and fractional cascading, is presented by Goodrich et al. [53]. All the above authenticated data structures achieve O(log n) complexity costs, which are shown to be optimal [106] (only for methods using generic collision-resistant hashing as a black box). The database community has also employed generic collision-resistant hashing extensively in order to verify various structures and queries related to systems and applications such as I/O-efficient search trees [66], queries on streams [67], database join queries [112], shortest path computations [72] and set operations [33]. Other types of queries, such as 2-dimensional range search, have also been investigated [6].

2.3.2 More advanced cryptography

Although the majority of authenticated data structures developed in the literature are based on generic collision-resistant hashing, there have been some solutions for verifying queries in various settings using other cryptographic primitives, such as one-way accumulators. One-way accumulators, introduced by Benaloh and de Mare [13], are based on the RSA exponentiation function and comprise an efficient way of securely compressing multiple inputs into one succinct representation, so that efficient proofs of membership can be computed. Implemented with an RSA accumulator, they satisfy quasi-commutativity, a useful property that common generic collision-resistant functions lack, which allows for efficient updates and convenient preprocessing. Refinements of the RSA accumulator are also given by Baric and Pfitzmann [10], where, beyond one-wayness, collision resistance is achieved, and also by Gennaro et al. [42] and by Sander et al. [101].
Dynamic accumulators (along with protocols for zero-knowledge proofs) were introduced by Camenisch and Lysyanskaya [23]. A first application of accumulators in the authenticated data structures model was made by Goodrich et al. [51]; in this work, and in favor of constant-complexity proofs, general O(n^ε) bounds are derived for various complexity measures such as query and update complexity (as opposed to the logarithmic bounds of methods using generic collision-resistant hashing). An authenticated data structure [52] that combines hierarchical hashing with the scheme of [51] appeared later, and a similar hybrid authentication scheme was developed by Nuckolls [84]. Accumulators using other cryptographic primitives (groups admitting bilinear pairings), the security of which is based on other assumptions (hardness of the strong Diffie-Hellman problem), are presented by Nguyen [83] and Camenisch et al. [22]. However, updates in the work of Nguyen [83] are inefficient when the trapdoor information is not known: individual precomputed witnesses can be updated with constant complexity, thus incurring a linear total cost for updating all the witnesses after an update in the set. Also, the accumulator by Camenisch et al. [22] requires space proportional to the number of elements ever accumulated in the set (book-keeping information of considerable size is needed), or otherwise important constraints on the range of the accumulated values are required. Efficient dynamic accumulators for non-membership proofs are presented by Li et al. [68]. Accumulators for batch updates are presented by Wang et al. [110] and accumulator-like expressions to authenticate static sets under the provable data possession model are presented by Ateniese et al. [7, 37]. The work by Sander et al. [100] studies efficient algorithms for accumulators with unknown trapdoor information. Finally, in the work on lower bounds by Dwork et al. [35], and simultaneously with work performed by Papamanthou et al. [90], logarithmic lower bounds as well as constructions achieving query-update cost trade-offs have been studied in the memory-checking model. Tree hierarchies are authenticated (with access control enabled) by Atallah et al. [6], using bilinear pairings. A study of an extensive suite of various authenticated data structures problems can be found in the PhD theses of Triandopoulos [108] and Crosby [30]. Applications of authenticated data structures in distributed systems integrity are presented in the PhD thesis of Maniatis [73].

2.3.3 Relation to memory checking

Authenticated data structures are closely related to the memory checking model, which was originally defined by Blum et al. [15] and has subsequently been studied in several works [35, 82]. Both authenticated data structures and memory checking have as a goal to verify the operation of some functionality that is offered by an untrusted and possibly malicious server, i.e., to design cryptographic protocols that can efficiently verify the correctness of the corresponding provided functionality. In memory checking, the functionality offered by a memory array is being verified, namely read and write operations in a one-dimensional table with indices 1, . . . , n.
A read operation returns the value that is stored at a given index j ∈ [1, n] and a write operation involves changing the content of a given index j ∈ [1, n] to a new given value (similar to the authenticated data structure introduced in Chapter 4); a memory-checking protocol verifies that a value that is read from any given index j is the last value that was written to that index j. The celebrated result by Blum et al. [15] states that this fundamental read-write functionality on n memory cells can be verified by reading O(log n) special values that are stored at some additional unreliable memory cells of total size O(n). In authenticated data structures, the functionality offered by a data structure (e.g., heap, dynamic trees, hash table, dictionary, inverted index, fractional cascading, etc.) is being verified, namely query and update operations defined over a structured and dynamic data collection. For example, for the case of dynamic trees [53], a query can be "is node v a child of u?" and an update can be "move subtree T to node v". It is important to note that since any ordinary data structure (of size n) is implemented in the RAM model and since every operation is simulated by reading and writing bits in memory, it is consequently true that every data structure can be authenticated using the memory checking model, by individually verifying (with O(log n) complexity) every elementary read or write operation needed during a query or update in the data structure. This reduction immediately gives rise to a very reasonable question: Is it useful to work with the more abstract model of authenticated data structures? The answer is yes, and the main reason is related to efficiency. The reduction of authenticated data structures to memory checking is in practice highly inefficient. Firstly, the verification of any read or write operation introduces a logarithmic (in the size of the data structure) multiplicative factor in the complexity of the verification protocol. Secondly, and more importantly, verifying the functionality of a data structure through memory checking corresponds to verifying the entire execution of the query or the update algorithm that is defined for the data structure in study, which is in general unnecessary, because what needs to be verified is the result of such a query or update algorithm and not its entire execution. The best way to demonstrate how inefficient such a reduction to memory checking can be is through a concrete and very illustrative example. Consider the verification of range search queries. To verify a range search query by solely relying on the known memory-checking techniques, we need to verify the entire search procedure in an underlying data structure for range searching, e.g., the range search tree, a procedure which requires O(log n + k) reads of memory locations, where n is the size of the data set and k is the number of the elements belonging to the queried range. Given the logarithmic overhead introduced by the memory checking (by using for example Blum's memory checker [15]), the total verification cost is O(log² n + k log n). Instead, by using an implementation of an authenticated dictionary (e.g., a Merkle tree), the complexity to authenticate range search queries is only O(log n + k).
Even better, as has been shown by Tamassia and Triandopoulos [107], by using optimal data certification techniques in combination with optimal authenticated dictionaries, it is possible to authenticate range search queries optimally, in O(k) complexity, using proofs of size only O(log k). Therefore, authenticated data structures can significantly reduce the complexity of authenticating complicated queries by combining cryptography with algorithmics, which is the basis of constructing efficient authenticated data structures solutions. Additionally, apart from efficiency, in the authenticated data structures model communication complexity does matter: we are interested in minimizing the size of the proof that the untrusted server computes for the verification of a query. In memory checking, however, bandwidth does not come into play, since query complexity is the main complexity measure that is studied. Overall, authenticated data structures provide a more powerful, more refined and more expressive model for studying the verification of computations that take place during data management and data querying. Finally, we note that memory checking solutions have been traditionally constructed with cryptographic primitives that bear very weak assumptions (e.g., the existence of one-way functions [15]). However, authenticated data structures do employ stronger assumptions (e.g., the strong RSA assumption [51]), which are nevertheless widely accepted and widely used by the cryptography community.

Chapter 3

Accumulators for authenticated hash tables

In this chapter we describe our first authenticated data structure scheme, for a hash table data structure. We use cryptographic accumulators [23] as our basic cryptographic primitive to verify standard hash table queries. Specifically, our main results (Theorem 3.2 and Theorem 3.4) show how to use two different accumulator schemes [23, 83] in a hierarchical way (see Figure 3.1) over the set and the underlying hash table, in order to achieve the verification of both membership and non-membership queries. In the presented fully-dynamic schemes, the communication and verification complexities are constant, the query complexity is constant and the update complexity is sublinear, realizing the first authenticated hash table with this performance. Our schemes (denoted with RHT and BHT in Table 3.1) strictly improve, in terms of complexity, upon previous schemes based on accumulators and other cryptographic primitives (for a detailed comparison, see Table 3.1). Their security is based on two widely accepted assumptions, the strong RSA assumption [10] and the bilinear q-strong Diffie-Hellman assumption [16]. Finally, to meet the needs of different data-access patterns, we extend our schemes to achieve a reverse performance, i.e., sublinear query cost but constant update cost.

             [15,48,75,81]  [11]    [23,101]    [51]        RHT                      [83]     BHT
setup()      n              n       n           n           n                        n        n
update()     log n          1       1           n^ε         1 **                     1        1 **
refresh()    log n          1       n log n     n^ε         (n^ε log n) ** / 1 *     n^ε +    n^ε ** / 1
query()      log n          n       1           n^ε         1 * / (n^ε) *            1        1 / n^ε log n
verify()     log n          n       1           1           1                        1        1
proof Π(q)   log n          n       1           1           1                        1        1
info. upd    1              1       1           n^ε         1                        1        1
assumption   Generic CR     D. Log  Strong RSA  Strong RSA  Strong RSA               B. q-DH  B. q-DH

Table 3.1: In this table, we exhibit a detailed comparison of asymptotic access and group complexities of various authenticated data structure schemes in the literature with the complexities of our schemes. The underlying data structure scheme is for a hash table storing n elements.
All the authenticated data structure schemes compared are defined by algorithms {genkey, setup, update, refresh, query, verify} (see Definition 2.3). Parameter 0 < ε < 1 is a constant, "D. Log" stands for "Discrete Logarithm", "Generic CR" stands for "Generic Collision Resistance" and "B. q-DH" stands for "Bilinear q-strong Diffie-Hellman". In all constructions the authenticated data structure has group complexity (i.e., size) O(n) and genkey() has O(1) complexity. Π(q) denotes the proof for a query q and upd is the update information output by algorithm update(). Our schemes are denoted with RHT (RSA-based authenticated hash table) and BHT (bilinear-map-based authenticated hash table). The "one-star" notation ∗ denotes an expected complexity, the "two-star" notation ∗∗ denotes an expected amortized complexity, whereas the "plus" notation + denotes an amortized complexity. All schemes in the table are publicly-verifiable.

3.1 Preliminaries

In this section we describe some algorithmic and cryptographic primitives and other useful concepts that are used in our approach.

3.1.1 Hash tables

The main functionality of a hash table data structure T(X) is to support optimal-complexity look-ups of elements that belong to a general dynamic set X (i.e., not necessarily ordered). Elements can be inserted into or deleted from X. The elements in X are drawn from a universe U.

The data structure scheme. The data structure scheme {query(), update(), check()}, as defined in Definition 2.2, for a hash table T(X) is as follows:

1. {true, false} ← query(x, T(X)): Given an element x ∈ U, return true if x ∈ X or false otherwise;

2. T(X′) ← update(x, T(X)): Given an element x ∈ U such that x ∉ X, insert element x into X and output T(X′); given an element x ∈ U such that x ∈ X, delete element x from X and output T(X′);

3. {accept, reject} ← check(x, b ∈ {true, false}, T(X)): If x ∈ X (or x ∉ X) and b = false (or b = true), return reject. Else return accept.

Note that answering a hash table query can be implemented to have O(1) expected complexity (see Theorem 3.1) and that both insertions and deletions in a hash table can be implemented to have O(1) expected amortized complexity (see Theorem 3.1).

Implementation. Different ways of implementing hash tables have been extensively studied (e.g., [34, 58, 62, 69, 80]). Here we use a simple approach for the implementation of the plain hash table: Suppose we wish to store n elements from a universe U in a data structure so that we can have expected constant look-up complexity. For totally ordered universes and for searching based on comparisons, it is well known that an Ω(log n) lower bound holds. Essential for the construction of a hash table, and for beating the Ω(log n) bound, is a two-universal hash function:

Definition 3.1 (Two-universal hash function [26]) A two-universal hash function H : U → {1, . . . , m}, randomly selected from a family of two-universal hash functions H, is a function such that for any two elements e_1, e_2 ∈ U it is

Pr[H(e_1) = H(e_2)] ≤ 1/m . (3.1)

By using a two-universal hash function, hash tables can be constructed as follows (a concrete instantiation is sketched after this list):

• Set up a one-dimensional table T[1 . . . m] where m = O(n);

• Pick a two-universal hash function H : U → {1, . . . , m} as defined in Definition 3.1;

• Store element e in slot T[H(e)] of the table.

The probabilistic property that holds for the hash function H implies that, for any slot of the table, the expected number of elements mapped to it is O(1). Also, if H can be computed in O(1) time, looking up an element has expected constant complexity.
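One standard two-universal family (due to Carter and Wegman [26]) hashes an integer e as ((a·e + b) mod p) mod m for a prime p larger than the universe. The sketch below, including the choice p = 2^61 − 1, is an illustrative instantiation and not the specific function used in the thesis.

```python
import random

def sample_two_universal(m, p=2**61 - 1):
    """Return a random H: {0,...,p-1} -> {1,...,m} with
    Pr[H(e1) = H(e2)] <= 1/m for distinct e1, e2."""
    a = random.randrange(1, p)          # a != 0
    b = random.randrange(0, p)
    return lambda e: (a * e + b) % p % m + 1

H = sample_two_universal(m=8)
buckets = {j: [] for j in range(1, 9)}  # the table T[1..m]
for e in (7, 2, 9, 3, 42, 1000):
    buckets[H(e)].append(e)             # store e in slot T[H(e)]
```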
But the above property of hash tables comes at some cost. The expected constant-complexity look-up holds when the number of elements stored in the hash table does not change, i.e., when the hash table is static. In particular, because of insertions, the number of elements stored in a slot may grow and we cannot assume anymore that it is expected to be constant. A different problem arises in the presence of deletions, as the number n of elements may become much smaller than the size m of the hash table. Thus, we may no longer assume that the hash table uses O(n) space. In order to deal with updates, we periodically update the size of the hash table by a constant factor (e.g., doubling or halving its size). This is an expensive operation since we have to rehash all the elements. Therefore, there might be one update (over a course of O(n) updates) that has O(n) rather than O(1) complexity. Thus, hash tables for dynamic sets typically have O(1) expected query complexity and O(1) expected amortized update complexity. Methods that vary the size of the hash table for the sake of maintaining O(1) expected query complexity fall into the general category of dynamic hashing. The above discussion is summarized in the following theorem:

Theorem 3.1 (Dynamic hashing [29]) For a set of size n, dynamic hashing can be implemented to use O(n) space and have O(1) expected query complexity for (non-)membership queries and O(1) expected amortized complexity for element insertions or deletions.

3.1.2 The RSA accumulator

We now give an overview of the RSA accumulator, which will be used for the construction of our first solution, i.e., the construction of the authenticated data structure scheme RHT.

Prime representatives. For security and correctness reasons that will soon become clear, in our construction we extensively use the notion of prime representatives of elements. Initially introduced by Baric and Pfitzmann [10], prime representatives provide a solution whenever it is necessary to map general elements to prime numbers. In particular, one can map a k-bit element e_i to a 3k-bit prime x_i using a two-universal hash function (see Definition 3.1). In our context, we are using a two-universal hash function h : A → B, which is different from the one (i.e., H(.)) we use to map elements to buckets, and where set A is the set of 3k-bit boolean vectors and B is the set of k-bit boolean vectors. Specifically, we use the two-universal hash function h(x) = Fx, where F is a k × 3k boolean matrix. Since the linear system h(x) = Fx has more than one solution, one k-bit element is mapped to more than one 3k-bit element. We are interested in finding only one such solution which is prime; this can be computed efficiently according to the following result:

Lemma 3.1 (Prime representatives [42, 51]) Let H be a two-universal family of functions mapping {0, 1}^{3k} to {0, 1}^k and let h ∈ H. For any element e_i ∈ {0, 1}^k, we can compute with high probability a prime x_i ∈ {0, 1}^{3k} such that h(x_i) = e_i by sampling O(k²) times from the set of inverses h^{−1}(e_i).

By Lemma 3.1, computing prime representatives has expected constant complexity, i.e., independent of n. Also, solving the k × 3k linear system in order to compute the set of inverses has polynomial complexity in k by using standard methods (e.g., Gaussian elimination).
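A sketch of this sampling process is given below, under one simplifying assumption: F is taken in the block form [I_k | R] for a random k × 2k boolean matrix R, so that a uniformly random preimage of e is obtained by choosing a random 2k-bit tail t and setting the head to e XOR Rt. Candidates are then tested with Miller-Rabin until a prime is hit; names such as prime_representative are illustrative.

```python
import random

def is_probable_prime(n, rounds=40):          # Miller-Rabin primality test
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def prime_representative(e, k, R):
    """Sample from h^{-1}(e) until a prime is found; for F = [I_k | R],
    every (e XOR R*t, t) is a preimage of e under h(x) = F x."""
    while True:
        t = random.getrandbits(2 * k)         # random 2k-bit tail
        head = e
        for col in range(2 * k):              # head = e XOR R*t over GF(2)
            if (t >> col) & 1:
                head ^= R[col]
        x = (head << (2 * k)) | t             # 3k-bit candidate with h(x) = e
        if is_probable_prime(x):
            return x

k = 32
R = [random.getrandbits(k) for _ in range(2 * k)]   # the columns of R
r_e = prime_representative(e=0xDEADBEEF, k=k, R=R)
```

By the density of primes, a prime among the 3k-bit candidates is expected after O(k) trials, consistent with the O(k²) sampling bound of Lemma 3.1.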
Finally, we note that, in our context, prime representatives are computed and stored only once. Indeed, using the above method multiple times for computing the prime representative of the same element will not yield the same prime as output, for Lemma 3.1 describes a randomized process. From now on, given a k-bit element x, we denote with r(x) the 3k-bit prime representative that is computed as described by Lemma 3.1.

Description of the RSA accumulator. We now give an overview of the RSA accumulator [10, 13, 23, 68], which provides an efficient technique to produce a short (computational) proof that a certain element is (or is not) a member of a set. The RSA accumulator works as follows. Suppose we have the set of k-bit elements X = {x_1, x_2, . . . , x_n}. Let N be a k′-bit RSA modulus (k′ > 3k), namely N = pq, where p, q are strong primes [23]. We can represent X compactly and securely with an accumulation value acc(X), which is a k′-bit integer, as follows:

acc(X) = g^{r(x_1) r(x_2) ... r(x_n)} mod N ,

where g ∈ QR_N and r(x_i) is a 3k-bit prime representative, computed using a two-universal hash function h(.). Note that the RSA modulus N, the exponentiation base g and the two-universal hash function comprise the public key pk, i.e., information that is available to the adversary (the factorization of N is kept secret). Subject to the accumulation value acc(X), every element x in set X has a membership witness (W_x, r(x), x), where

W_x = g^{∏_{x_j ∈ X : x_j ≠ x} r(x_j)} mod N . (3.2)

Membership of x in X is verified by means of the following tests:

1. Checking that r(x) is a prime number;

2. Checking that h(r(x)) = x;

3. Computing W_x^{r(x)} mod N and checking that this equals acc(X).

Moreover, subject to the accumulation value acc(X), every element x ∉ X has a non-membership witness as well [68], namely the integer values (A_x, B_x, r(x), x) such that

(∏_{i=1}^{n} r(x_i)) A_x + r(x) B_x = 1 . (3.3)

Note that A_x and B_x can be computed by running the extended Euclidean algorithm [102] on r(x_1) r(x_2) . . . r(x_n) and r(x). Given the accumulation value acc(X) and the non-membership witness (A_x, B_x, r(x), x), non-membership of x in X can be verified by means of the following tests:

1. Checking that r(x) is a prime number;

2. Checking that h(r(x)) = x;

3. Computing acc(X)^{A_x} g^{r(x) B_x} mod N and checking that this equals g.
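The following toy Python instantiation exercises both verification equations, with a deliberately small (and thus completely insecure) modulus, and assuming elements are already given by their prime representatives. It relies on Python 3.8+ for modular inverses via pow() with negative exponents.

```python
from math import prod

N = 1009 * 2003          # toy RSA modulus with known factorization; NOT secure
g = 4                    # a quadratic residue mod N (4 = 2^2), coprime to N

def acc(primes):                       # acc(X) = g^{prod r(x_i)} mod N
    return pow(g, prod(primes), N)

def mem_witness(primes, r):            # W_x = g^{prod of the other primes}
    return pow(g, prod(p for p in primes if p != r), N)

def verify_membership(a, W, r):        # check W_x^{r(x)} == acc(X) mod N
    return pow(W, r, N) == a

def ext_gcd(a, b):                     # extended Euclidean algorithm
    if b == 0:
        return a, 1, 0
    d, x, y = ext_gcd(b, a % b)
    return d, y, x - (a // b) * y

def nonmem_witness(primes, r):         # A, B with (prod r(x_i))A + r(x)B = 1
    d, A, B = ext_gcd(prod(primes), r)
    assert d == 1                      # holds since r is a fresh prime
    return A, B

def verify_nonmembership(a, A, B, r):  # check acc(X)^A g^{rB} == g (mod N)
    return pow(a, A, N) * pow(g, r * B, N) % N == g

X = [11, 13, 17]                       # prime representatives of the set
a = acc(X)
assert verify_membership(a, mem_witness(X, 13), 13)
A, B = nonmem_witness(X, 7)            # 7 is not accumulated
assert verify_nonmembership(a, A, B, 7)
```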
We finally note that the representation acc(X) has the crucial property that any computationally bounded adversary Adv who does not know φ(N) cannot find another set of elements X′ ≠ X such that acc(X′) = acc(X), unless Adv breaks the factoring assumption [10]. However, in order to achieve some more advanced security goals we need, we are going to use a stronger assumption:

Assumption 3.1 (Strong RSA assumption) Let k be the security parameter. Given a k-bit RSA modulus N and a random element x ∈ Z*_N, there is no polynomial-time algorithm that outputs y > 1 and a such that a^y = x mod N, except with negligible probability neg(k).

The security of our RSA-based solution is based on the following result. To assist the reader, we also recall the proof of the security results for membership proofs [10] and for non-membership proofs [68].

Lemma 3.2 (Security of the RSA accumulator [10, 68]) Let k be the security parameter, h be a two-universal hash function mapping 3k-bit integers to k-bit integers, N be a (3k + 1)-bit RSA modulus and g ∈ QR_N. Given N, g, a set of k-bit elements X and h, suppose there is a polynomial-time algorithm for one of the tasks below (or both):

• It outputs x ∉ X, W and a prime r such that h(r) = x and W^r = acc(X) mod N;

• It outputs x ∈ X, A, B and a prime r such that h(r) = x and acc(X)^A g^{rB} = g mod N.

Then there is a polynomial-time algorithm for breaking the strong RSA assumption.

Proof: Let X = {x_1, x_2, . . . , x_n} and let x ∉ X. For the membership proof, suppose there is an algorithm that finds W, r and x such that r is a prime number, h(r) = x and W^r = g^{r(x_1) r(x_2) ... r(x_n)} mod N. Since x ∉ X, by construction of the prime representatives it is r ∉ {r(x_1), r(x_2), . . . , r(x_n)} (recall that h(r) = x). Let now e = r and R = r(x_1) r(x_2) . . . r(x_n). The algorithm can now compute the e-th root of g as follows: It computes a, b ∈ Z such that aR + br = 1 by using the extended Euclidean algorithm, since r is a prime and r ∉ {r(x_1), r(x_2), . . . , r(x_n)}. Let now y = W^a g^b mod N. It is y^e = W^{ar} g^{br} = g^{aR + br} = g mod N. Therefore the algorithm can be used for breaking the strong RSA assumption. For the non-membership proof case, since x ∈ X the algorithm can output the e-th root of g as y = W_x^A g^B, where W_x is the membership witness defined in Relation 3.2. Then y^e = W_x^{eA} g^{eB} = acc(X)^A g^{rB} = g mod N. This completes the proof. □

3.1.3 The bilinear-map accumulator

We next give an overview of the bilinear-map accumulator [83], which will be used for the construction of our second solution, i.e., the construction of the authenticated data structure scheme BHT.

Bilinear pairings. Before presenting the bilinear-map accumulator we describe some basic terminology and definitions about bilinear pairings. Let G_1, G_2 be two cyclic multiplicative groups of prime order p, generated by g_1 and g_2 and for which there exists an isomorphism ψ : G_2 → G_1 such that ψ(g_2) = g_1. Let also 𝒢 be a cyclic multiplicative group with the same order p and e : G_1 × G_2 → 𝒢 be a bilinear pairing with the following properties:

1. Bilinearity: e(P^a, Q^b) = e(P, Q)^{ab} for all P ∈ G_1, Q ∈ G_2 and a, b ∈ Z_p;

2. Non-degeneracy: e(g_1, g_2) ≠ 1;

3. Computability: There is an efficient algorithm to compute e(P, Q) for all P ∈ G_1 and Q ∈ G_2.

In our setting we have G_1 = G_2 = G and g_1 = g_2 = g. A bilinear pairing instance generator is a probabilistic polynomial-time algorithm that takes as input the security parameter 1^k and outputs a uniformly random tuple t = (p, G, 𝒢, e, g) of bilinear pairings parameters. Here we have to make an important observation: Groups G and 𝒢 are generic. That is, their elements are not simple integers and doing operations between elements can be complicated. E.g., group elements of G and 𝒢 (for which there exist efficient constructions of a bilinear map e(., .)) are usually points on an elliptic curve. Also, the operations in the exponent of elements of G and 𝒢 are performed modulo p, since this is the order of both groups G and 𝒢. A simplified exposition of these groups and their arithmetic is given in the book of Katz and Lindell [61].

Description of the bilinear-map accumulator. Similarly to the RSA accumulator, the bilinear-map accumulator [32, 83] comprises an efficient way to provide short proofs of (non-)membership for elements that (do not) belong to a set. The bilinear-map accumulator works as follows. Let s ∈ Z*_p be a randomly chosen value that constitutes the trapdoor in the scheme (in the same way that φ(N) was the trapdoor in the RSA accumulator).
The accumulator accumulates elements in Z*_p − {−s} (where p is a k-bit prime) and the accumulated value is an element in G. Given a set of n elements X = {x_1, x_2, . . . , x_n}, the accumulation value acc(X) is defined as

acc(X) = g^{(x_1 + s)(x_2 + s) ... (x_n + s)} ,

where g is a generator of group G of prime order p. We note here that acc(X) can be constructed by using only X and g, g^s, g^{s²}, . . . , g^{s^q}, where q ≥ |X|, by means of polynomial interpolation (see Lemma 3.15; a short sketch of this computation appears after the proof of Lemma 3.3 below). The proof of membership for an element x that belongs to set X will be the witness (W_x, x), where

W_x = g^{∏_{x_j ∈ X : x_j ≠ x} (x_j + s)} . (3.4)

Accordingly, a verifier can test set membership for x by computing e(W_x, g^s g^x) and checking that this equals e(acc(X), g). Moreover, subject to the accumulation value acc(X), every element x ∉ X has a non-membership witness [32], namely the elements (A_x = g^{α(s)}, B_x = g^{β(s)}, x) such that

[∏_{i=1}^{n} (x_i + s)] α(s) + (x + s) β(s) = 1 . (3.5)

Note that α(s) and β(s) are polynomials that can be computed by running the extended Euclidean algorithm on the polynomials (x_1 + s)(x_2 + s) . . . (x_n + s) and (x + s). Given the accumulation value acc(X) and the witnesses A_x and B_x, non-membership of x in X can be verified by computing e(acc(X), A_x) e(g^s g^x, B_x) and checking that this equals e(g, g).

Proving the security of the bilinear-map accumulator requires the bilinear q-strong Diffie-Hellman assumption, a slightly stronger assumption than the q-strong Diffie-Hellman assumption [16] (proving just collision resistance of the accumulator requires only the plain q-strong Diffie-Hellman assumption [83]), that can be stated as follows:

Assumption 3.2 (Bilinear q-strong Diffie-Hellman assumption) Let k be the security parameter and let (p, G, 𝒢, e, g) be a uniformly randomly generated tuple of bilinear pairings parameters. Given the elements g, g^s, . . . , g^{s^q} ∈ G for some s chosen at random from Z*_p, where q = poly(k), there is no polynomial-time algorithm that can output the pair (a, e(g, g)^{1/(s+a)}) ∈ Z_p × 𝒢 except with negligible probability neg(k).

Lemma 3.3 (Security of the bilinear-map accumulator [32, 83]) Let k be the security parameter and let t = (p, G, 𝒢, e, g) be a uniformly randomly generated tuple of bilinear pairings parameters. Given the elements g, g^s, . . . , g^{s^q} ∈ G for some s chosen at random from Z*_p and a set of k-bit elements X (q ≥ |X|), suppose there is a polynomial-time algorithm for one of the tasks below (or both):

• It outputs x ∉ X and W such that e(W, g^s g^x) = e(acc(X), g);

• It outputs x ∈ X, A and B such that e(acc(X), A) e(g^s g^x, B) = e(g, g).

Then there is a polynomial-time algorithm for breaking the bilinear q-strong Diffie-Hellman assumption.

Proof: Let X = {x_1, x_2, . . . , x_n} and let x ∉ X. Suppose there is an algorithm that finds W such that e(W, g^s g^x) = e(acc(X), g). This implies

e(W, g)^{s+x} = e(g, g)^{(s+x_1)(s+x_2)...(s+x_n)} .

Note now that the quantity Π_n = (s + x_1)(s + x_2) . . . (s + x_n) can be viewed as a polynomial in s of degree n. Since x ∉ X, we have that (s + x) does not divide Π_n and therefore values c and P can be computed such that Π_n = c + P(s + x). Therefore the algorithm can output (x, e(g, g)^{1/(s+x)}) as

( x, [e(W, g) e(g, g)^{−P}]^{c^{−1}} ) .

For a non-membership proof, note that we can output e(g, g)^{1/(s+x)} as

e(W_x, A) e(g, B) ,

since x ∈ X and e(acc(X), A) e(g^s g^x, B) = e(g, g), where W_x is a membership witness given in Relation 3.4. Therefore the bilinear q-strong Diffie-Hellman assumption can be broken in both cases. □
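As a sanity check of the interpolation claim above, the sketch below expands ∏(x_i + s) into coefficients modulo p and recombines the published powers g^{s^i}. The tiny order-11 subgroup of Z*_23 stands in for a pairing-friendly group G, and the trapdoor s appears only when generating the public powers and in the final assertion; all parameter choices are illustrative.

```python
p, q_mod = 11, 23      # group order p; G is the order-11 subgroup of Z_23^*
g, s = 2, 7            # generator of that subgroup; trapdoor s in Z_p^*

def poly_coeffs(xs):   # coefficients of prod (X + x_i) mod p, low degree first
    coeffs = [1]
    for x in xs:
        coeffs = [(a * x + b) % p
                  for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

X = [3, 5, 6]
# the powers g, g^s, ..., g^{s^q} published once by the trusted party
powers = [pow(g, pow(s, i, p), q_mod) for i in range(len(X) + 1)]

cs = poly_coeffs(X)
acc = 1
for c, gi in zip(cs, powers):
    acc = acc * pow(gi, c, q_mod) % q_mod      # acc = prod (g^{s^i})^{c_i}

# sanity check against the direct definition (this line uses the trapdoor)
exp = 1
for x in X:
    exp = exp * ((x + s) % p) % p
assert acc == pow(g, exp, q_mod)
```

Witnesses W_x are computed the same way, by expanding the product over X − {x}, which is exactly why knowledge of the trapdoor s is not needed to run the scheme.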
Size of non-membership witnesses. We note here that although non-membership witnesses are constructed in the same fashion in both instantiations of the accumulator schemes, their sizes differ considerably. In the RSA accumulator case, the integers A_x and B_x (see Relation 3.3) can have size proportional to the number of the elements in X. In the bilinear-map accumulator case, A_x and B_x (see Relation 3.5) are always two group elements in G (this takes advantage of the bilinear map e(., .), which is not known to exist for RSA groups), therefore their size never depends on the elements collection X. This observation is very important and will contribute to significant complexity improvements in Chapter 5. We now continue with some necessary algorithmic and definitional framework. We present an algorithmic construction called the accumulation tree that will be used in both our constructions.

3.1.4 The accumulation tree

Let X = {x_1, x_2, . . . , x_n} be a set of elements. Given a constant ε such that 0 < ε < 1, the accumulation tree of X, denoted with T(ε), is a rooted tree with n leaves defined as follows:

1. The leaves of T(ε) store the elements x_1, x_2, . . . , x_n;

2. T(ε) consists of exactly l = ⌈1/ε⌉ levels;

3. All the leaves are at the same level;

4. Every node of T(ε) has O(n^ε) children;

5. Level i in the tree contains O(n^{1−iε}) nodes, where the leaves are at level 0 and the root is at level l.

Figure 3.1: The accumulation tree of a set of 64 elements for ε = 1/3: every internal node has 4 = 64^{1/3} children, there are 3 = 1/ε levels in total, and there are 64^{1−i/3} nodes at level i = 0, 1, 2, 3.

We note that the levels of the accumulation tree are numbered from the leaves to the root of the tree, i.e., the leaves have level 0, their parents level 1 and finally the root has level l. The structure of the accumulation tree, which for a set of 64 elements is shown in Figure 3.1, resembles that of normal "flat" search trees, in particular the structure of a B-tree [29]. However, there are some differences: First, every internal node of the accumulation tree, instead of having a constant upper bound on its degree, has a bound that is a function of the number of its leaves, n; also, its depth is always maintained to be constant, namely O(1/ε). Note that it is simple to construct the accumulation tree when n^ε is an integer (see Figure 3.1). Else, we define the accumulation tree to be the unique tree of degree ⌈n^ε⌉ (by assuming a certain ordering of the leaves). This maintains the degree of internal nodes to be O(n^ε); a small sketch of these shape parameters is given at the end of this subsection. Using the accumulation tree and search keys stored at the internal nodes, one can search for an element in O(n^ε) time and perform updates in O(n^ε) amortized time. Indeed, as the depth of the tree is not allowed to vary, one should periodically (e.g., when the number of elements of the tree doubles) rebuild the tree, spending O(n) time. Actually, by using individual binary trees to index the search keys within each internal node, queries could be answered in O(log n) time and updates could be processed in O(log n) amortized time. Yet, the reason we build this flat tree is not to use it as a search structure, but rather to design an authentication structure for defining the digest of X that matches the optimal querying performance of hash tables.
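The following small sketch (function name illustrative) computes the shape parameters of T(ε) for given n and ε, reproducing the numbers of Figure 3.1.

```python
import math

def tree_shape(n, eps):
    """Levels, node degree and nodes-per-level of the accumulation tree."""
    l = math.ceil(1 / eps)                 # constant number of levels
    degree = math.ceil(n ** eps)           # O(n^eps) children per node
    nodes = [math.ceil(n / degree ** i) for i in range(l + 1)]
    return l, degree, nodes

print(tree_shape(64, 1/3))                 # -> (3, 4, [64, 16, 4, 1])
```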
The idea is as follows: we wish to hierarchically employ an accumulator over the subsets (of accumulation values) defined by each internal node in the accumulation tree, so that (non-)membership proofs of size proportional to the depth of the tree (hence of constant size) are defined with respect to the root digest (the accumulation value of the entire set).

3.2 Scheme based on the RSA accumulator

In this section we present a secure authenticated data structure scheme for an authenticated hash table, RHT = {genkey, setup, update, refresh, query, verify}, and prove that it satisfies the complexities of Table 3.1. For each algorithm, we are going to describe two constructions, i.e., the "plain" construction and the one with "precomputed witnesses".

Algorithm {sk, pk} ← genkey(1^k): The algorithm picks a constant 0 < ε < 1 and ⌈1/ε⌉ + 1 RSA moduli N_i = p_i q_i (i = 0, . . . , l), where p_i, q_i are strong primes [23] and l = ⌈1/ε⌉. The lengths of the RSA moduli are defined by the recursive relation

|N_{i+1}| = 3|N_i| + 1 ,

where |N_0| = 3k + 1 and i = 0, . . . , l − 1. The algorithm also picks l + 1 public bases g_i ∈ QR_{N_i} to be used for exponentiation. Finally, given l + 1 families of two-universal hash functions H_0, H_1, . . . , H_l, the algorithm randomly picks one function h_i ∈ H_i, for i = 0, . . . , l (h_i will be used for computing prime representatives). The function h_i is such that it maps (|N_i| − 1)-bit primes to ((|N_i| − 1)/3)-bit integers. (The choice of the domains and ranges of the functions h_i and of the lengths of the moduli N_i is due to the requirement that prime representatives should be smaller numbers than the respective moduli; see [101]. As we will see in Section 3.2.5, using ideas from [10] it is possible to avoid the increasing size of the RSA moduli and instead use only one size for all N_i's. By doing so, however, we are forced to prove security in the random oracle model (using cryptographic hash functions), which is fine for practical applications.) The algorithm sets sk = {φ(N_i) = (p_i − 1)(q_i − 1) : i = 0, . . . , l} and pk = {N_i, g_i, h_i : i = 0, . . . , l}. Note that since l is constant, all RSA moduli have size that depends only on the security parameter k. Also, since 1/ε is constant, the algorithm has O(1) access complexity.

3.2.1 Main authenticated data structure

Let X = {x_1, x_2, . . . , x_n} be a collection of n elements. X is stored in a dynamic hash table D_0 by using a two-universal hash function H(.) that maps each element to a certain bucket (see Theorem 3.1). Specifically, the hash table D_0 has m = O(n) buckets L_1, L_2, . . . , L_m, where each bucket contains O(1) elements in expectation (by the property of the two-universal hash function; see Relation 3.1). Let now 0 < ε < 1 be the fixed constant chosen by algorithm genkey(). We build the accumulation tree T(ε) on top of the buckets, i.e., every leaf of the tree corresponds to a specific bucket and not to an element within the bucket. Since the number of buckets is m = O(n), the internal nodes of the accumulation tree have O(n^ε) children. Our authenticated data structure is defined with respect to the accumulation tree as follows. We hierarchically employ the RSA accumulator over the buckets of the hash table, so as to augment the accumulation tree with a collection of corresponding accumulation values.
That is, assuming the setup parameters are in place, for any node v in the accumulation tree we define its accumulation value χ(v) recursively along the tree structure, as a function of the accumulation values of its children (in a similar way as in a Merkle tree). We describe algorithm setup() in detail below:

Algorithm {auth(D_0), d_0} ← setup(D_0, sk, pk): The algorithm builds the accumulation tree T(ε) on top of the m buckets L_1, L_2, ..., L_m. For every leaf node v in tree T(ε) that lies at level 0 and corresponds to a bucket L_j, the algorithm sets

    χ(v) = g_0^{∏_{x∈L_j} r_0(x)} mod N_0 ∈ Z*_{N_0}.   (3.6)

For every non-leaf node v in T(ε) that lies at level 1 ≤ i ≤ l, the algorithm sets

    χ(v) = g_i^{∏_{u∈N(v)} r_i(χ(u))} mod N_i ∈ Z*_{N_i},   (3.7)

where r_i(a) is a prime representative of a computed using function h_i, N(v) is the set of children of node v (when node v refers to a bucket, i.e., it is a leaf, we define v's children to be the elements contained in the bucket) and g_i ∈ QR_{N_i}. The authenticated data structure auth(D_0) output by the algorithm consists of the following components:

1. The accumulation tree T(ε);
2. The prime representatives r_i(χ(v)) that correspond to the values χ(v), such that h_i(r_i(χ(v))) = χ(v)—as used in Relations 3.6 and 3.7, for all nodes v ∈ T(ε) (at some level i).

Let r be the root of the tree T(ε). The algorithm also outputs d_0 = χ(r), i.e., the digest of the authenticated data structure is the χ(·) value of the root of the accumulation tree.

Precomputed witnesses. In order to achieve constant-complexity queries, algorithm setup() can also compute precomputed witnesses. Namely, for every node v of the accumulation tree that lies at level 0 ≤ i ≤ l, let N(v) be the set of its children (for a leaf node, we consider as "children" the elements in the respective bucket). For every j ∈ N(v) the algorithm computes

    W_{j(v)} = χ(v)^{r_i(χ(j))^{−1}} = g_i^{∏_{u∈N(v)−{j}} r_i(χ(u))} mod N_i,   (3.8)

and stores W_{j(v)} at v. When the construction with precomputed witnesses is used, auth(D_0) also includes W_{j(v)}, for all v ∈ T(ε) and all j ∈ N(v), along with T(ε) and the r_i(χ(v)).

Lemma 3.4 Algorithm setup() of the authenticated data structure scheme RHT has O(n) access complexity both with and without precomputed witnesses. Moreover, the authenticated data structure auth(D_0) output by setup() has O(n) group complexity.

Proof: For a node v that has degree d, computing χ(v) from Relation 3.7 has O(d) access complexity. At level i ≥ 1, there are O(m^{1−iε}) such nodes, of degree O(m^ε), where m is the number of buckets (at level 0 there are m nodes of constant degree). Since m = O(n) and T(ε) has O(1) levels, the access complexity without precomputed witnesses is O(n). For a node v at level i that has degree d, computing W_{j(v)} for all j ∈ N(v) from Relation 3.8 has O(d) access complexity: Compute χ(v) first, and then set W_{j(v)} = χ(v)^{r_i(χ(j))^{−1}} mod N_i. Note that the computation of the inverse in the exponent is feasible because setup() has access to the secret key, which contains the factorizations and hence the values φ(N_i) (we will see that this computational task requires more work when the factorization is not available). Therefore the access complexity of setup() with precomputed witnesses is also O(n), since computing one such witness requires O(1) work and there are O(n) such witnesses. Finally, every node of T(ε) stores one group element (and two group elements in the precomputed witnesses case). Since the tree T(ε) has O(n) nodes, the group complexity of auth(D_0) is O(n). □
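As an illustration of Relations 3.6-3.8, here is a minimal sketch of the accumulation and witness computation with the trapdoor available; prime_rep is a hypothetical stand-in for the two-universal prime representatives r_i(·) of [101], and the toy modulus is far too small for any real use:

def next_prime(n):                      # tiny trial-division helper (sketch only)
    n += 1
    while any(n % d == 0 for d in range(2, int(n ** 0.5) + 1)):
        n += 1
    return n

def prime_rep(x):                       # hypothetical stand-in for r_i(.)
    return next_prime(2 * x + 1)

def accumulate(elements, g, N, phi):    # chi(v) of Relations 3.6/3.7
    exp = 1
    for x in elements:
        exp = (exp * prime_rep(x)) % phi   # trapdoor: reduce mod phi(N)
    return pow(g, exp, N)

def witness(elements, j, g, N, phi):    # W_{j(v)} of Relation 3.8
    exp = 1
    for x in elements:
        if x != j:
            exp = (exp * prime_rep(x)) % phi
    return pow(g, exp, N)

p, q = 1019, 1031                        # toy stand-ins for strong primes
N, phi = p * q, (p - 1) * (q - 1)
A = accumulate([2, 3, 7, 9], 3, N, phi)  # g = 3 stands in for a base in QR_N
W = witness([2, 3, 7, 9], 7, 3, N, phi)
assert pow(W, prime_rep(7), N) == A      # W^{r(7)} recovers the bucket value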
3.2.2 Updates

We now describe how updates can be efficiently supported in the authenticated hash table scheme by using a rebuilding technique, appropriately adjusted from the book of Cormen et al. [29]: Since a hash table with m = O(n) buckets is used, we should expect that at some point the update algorithms will need to rebuild the table (i.e., rehash all the elements and reinsert them in a bigger or smaller hash table) and the related authenticated data structures. This is done according to the following definition:

Definition 3.2 Let m be the current number of buckets of the authenticated hash table and n be the number of elements contained in the authenticated hash table after an update has been performed. Define α = n/m to be the load factor of the authenticated hash table after the update. If α = 1 (full table), the capacity of the hash table is doubled. If α = 1/4 (near-empty table), the capacity of the hash table is halved.

The rebuilding method of Definition 3.2, adjusted to our authenticated hash table construction, is essential for obtaining the amortized results of Lemmata 3.5 and 3.7, which constitute the main complexity results of this work (for similar methods see the book of Cormen et al. [29]). A minimal sketch of this policy is given below.
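The sketch (illustrative only; Definition 3.2 is the authoritative statement):

def buckets_after_update(m, n):
    """m: current bucket count; n: number of stored elements after the update."""
    alpha = n / m                   # load factor of Definition 3.2
    if alpha == 1:                  # full table: double the capacity
        return 2 * m
    if alpha == 1 / 4:              # near-empty table: halve the capacity
        return m // 2
    return m                        # otherwise the table is left as is

assert buckets_after_update(8, 8) == 16   # insertion filling the table
assert buckets_after_update(16, 4) == 8   # deletion emptying the table
assert buckets_after_update(8, 5) == 8    # no rebuild in between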
We now describe algorithms update() and refresh() in detail.

Algorithm {D_{h+1}, auth(D_{h+1}), d_{h+1}, upd} ← update(u, D_h, auth(D_h), d_h, sk, pk): Let m be the current number of buckets of D_h and n be the number of elements stored in D_h, after the update has been performed. We distinguish two cases:

Case 1. m/4 < n < m: In this case there is no need to rebuild the table and the update is performed as follows. Suppose the update is "insert element e". The algorithm computes the bucket j = H(e) (see Relation 3.1) and inserts e in bucket j. Let v_0 be the node of T(ε) referring to bucket j and r_0(e) be a new prime representative for element e computed using function h_0, i.e., h_0(r_0(e)) = e. Let v_0, v_1, ..., v_l be the path in T(ε) from node v_0 to the root of the tree. The algorithm initially sets χ'(v_0) = χ(v_0)^{r_0(e)} mod N_0, i.e., it updates the accumulation value that corresponds to the updated bucket. Note that if the update is "delete element e", the algorithm sets

    χ'(v_0) = χ(v_0)^{r_0(e)^{−1}} mod N_0.   (3.9)

Subsequently, for j = 1, ..., l the algorithm sets

    χ'(v_j) = χ(v_j)^{r_j(χ'(v_{j−1})) · r_j(χ(v_{j−1}))^{−1}} mod N_j,   (3.10)

where r_j(χ(v_{j−1})) is the prime representative of accumulation value χ(v_{j−1}) and r_j(χ'(v_{j−1})) is a new prime representative for the updated accumulation value χ'(v_{j−1}), such that h_j(r_j(χ'(v_{j−1}))) = χ'(v_{j−1}). All these values are stored by the algorithm after they have been computed. The algorithm also outputs the new prime representatives r_j(χ'(v_{j−1})) (j = 1, ..., l) as the information upd along the path from the updated bucket to the root of the tree. Information upd also includes r_0(e) and χ'(v_l). It also sets d_{h+1} = χ'(v_l), i.e., the updated digest is the updated χ(·) value of the root of T(ε). Finally, the new authenticated data structure auth(D_{h+1}) is computed as follows. Let auth(D_h) be the previous authenticated data structure that is input to the algorithm. Overwrite the values r_j(χ(v_{j−1})) (j = 1, ..., l) with the new values r_j(χ'(v_{j−1})) (j = 1, ..., l) and output the updated structure. The behavior of the algorithm in the precomputed witnesses case is the same, with the difference that upd = ∅.

Case 2. n = m/4 or n = m: In this case the hash table is rebuilt according to Definition 3.2: If n = m/4, the algorithm builds a data structure D_{h+1} with m/2 buckets. Otherwise, i.e., when n = m, the algorithm builds a data structure D_{h+1} with 2m buckets. Subsequently, it outputs auth(D_{h+1}) and d_{h+1} by calling algorithm setup(D_{h+1}, sk, pk) and sets upd = ∅.

Lemma 3.5 By using the rebuilding policy of Definition 3.2, algorithm update() of the authenticated data structure scheme RHT has O(1) expected amortized access complexity. Moreover, the update information upd output by update() has O(1) group complexity.

Proof: The O(1) expected complexity bound comes from the fact that the number of operations (as well as the number of group elements contained in upd) that update() performs is always a function of l = 1/ε = O(1)—also the actual hash table update is performed, which has expected O(1) complexity. Note that the complexity of the operations in Relations 3.9 and 3.10 is constant, since sk contains the factorizations, hence the values φ(N_i), and therefore inverses can be computed with the extended Euclidean algorithm. When the hash table has to be rebuilt, algorithm setup() is called, which has O(n) access complexity (see Lemma 3.4). Therefore the result is expected amortized, since we can use the rebuilding strategy of Definition 3.2 and follow the same amortized analysis as in [29] (i.e., the cost of rebuilding in [29] does not increase due to the O(n) complexity of setup()). □

Algorithm {D_{h+1}, auth(D_{h+1}), d_{h+1}} ← refresh(u, D_h, auth(D_h), d_h, upd, pk): Let m be the current number of buckets of D_h and n be the number of elements stored in D_h, after the update has been performed. We distinguish two cases:

Case 1. m/4 < n < m: Suppose the update is "insert element e". The algorithm computes the bucket j = H(e) (see Relation 3.1) and inserts e in bucket j. Let v_0 be the node of T(ε) referring to bucket j. Let v_0, v_1, ..., v_l be the path in T(ε) from node v_0 to the root of the tree. The algorithm, for j = 0, ..., l, sets r_j(χ(v_j)) = r_j(χ'(v_j)), i.e., it updates the prime representatives that correspond to the updated path by using the information upd. [Footnote: Note that information upd is not required for refresh() to perform this task; algorithm refresh() uses upd for efficiency. Namely, algorithm refresh() could compute the updated values r_j(χ(v_j)) by performing explicit exponentiations, which would have O(n^ε) complexity.] Finally it outputs the updated hash table as D_{h+1}, the updated prime representatives r_j(χ(v_j)) (along with the ones that belong to the nodes that are not updated) as auth(D_{h+1}), and χ'(v_l) (contained in upd) as d_{h+1}.

Precomputed witnesses. When precomputed witnesses are used, the algorithm must update W_{j(v)} for v = v_0, v_1, ..., v_l and for all j ∈ N(v) (see Relation 3.8). To achieve this efficiently, the following result from Sander et al. [101] for maintaining updated precomputed witnesses is used (a sketch of the underlying idea follows the lemma):

Lemma 3.6 (Computing witnesses [101]) Suppose we are given the elements collection X = {x_1, x_2, ..., x_n}, an RSA modulus N and g ∈ QR_N. Without the knowledge of φ(N), the witnesses W_i = g^{∏_{j≠i} x_j} mod N, for i = 1, ..., n, can be computed with O(n log n) complexity.
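The bound of Lemma 3.6 can be met with a divide-and-conquer recursion that pushes the product of one half of the elements into the exponentiation base handed to the other half. The sketch below illustrates this idea; it may differ in details from the algorithm of [101]:

from math import prod

def all_witnesses(g, xs, N):
    """Return [g^(prod of xs except xs[i]) mod N for each i], without phi(N)."""
    if len(xs) == 1:
        return [g]                            # empty product: witness is g
    mid = len(xs) // 2
    left, right = xs[:mid], xs[mid:]
    g_left = g
    for x in right:                           # base for the left half carries
        g_left = pow(g_left, x, N)            # the product of the right half
    g_right = g
    for x in left:                            # and symmetrically for the right
        g_right = pow(g_right, x, N)
    return all_witnesses(g_left, left, N) + all_witnesses(g_right, right, N)

# sanity check against direct exponentiation on a toy modulus
N, g, xs = 1050589, 3, [7, 11, 17, 23]
assert all(pow(g, prod(xs) // x, N) == w
           for x, w in zip(xs, all_witnesses(g, xs, N)))

Each of the O(log n) recursion levels performs O(n) exponentiations, giving the O(n log n) total of Lemma 3.6.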
In order to compute the updated witnesses, the algorithm applies the result of Lemma 3.6 at every node v_i, 0 ≤ i ≤ l, and for all j ∈ N(v_i), as follows. For each v_i, it uses the result of Lemma 3.6 with inputs the updated elements {r_i(χ(j)) : j ∈ N(v_i)}, the RSA modulus N_i and the exponentiation base g_i. In this computation the updated prime representative r_i(χ(v_{i−1})), computed with O(n^ε) exponentiations, is used (note that O(n^ε) exponentiations are required since sk is not available). This computation outputs the witnesses W_{j(v_i)} for j ∈ N(v_i) (note that the witness W_{v_{i−1}(v_i)}, for i > 0, remains the same). Also, since the algorithm for updating witnesses [101] is run on O(1/ε) nodes v with |N(v)| = O(n^ε), we have, by Lemma 3.6, that the witness update complexity is O(n^ε log n) (for the complete result see Lemma 3.7).

Case 2. n = m/4 or n = m: In this case the hash table is rebuilt according to Definition 3.2: If n = m/4, the algorithm builds a data structure D_{h+1} with m/2 buckets. Otherwise, i.e., when n = m, the algorithm builds a data structure D_{h+1} with 2m buckets. Subsequently, it outputs auth(D_{h+1}) and d_{h+1} by using Relations 3.6 and 3.7. In the case of precomputed witnesses, it computes the new witnesses to be included in auth(D_{h+1}) by using Lemma 3.6 (note that refresh() cannot call setup() directly, since it does not have access to the secret key sk, and that is why it has to use Relations 3.6 and 3.7 and Lemma 3.6).

Lemma 3.7 By using the rebuilding policy of Definition 3.2, algorithm refresh() of the authenticated data structure scheme RHT has O(1) expected amortized access complexity without precomputed witnesses. With precomputed witnesses, algorithm refresh() has O(n^ε log n) expected amortized access complexity.

Proof: For the case when no precomputed witnesses are used, the argument is the same as in Lemma 3.5. For the case of precomputed witnesses, suppose there are currently n elements in the hash table and that the capacity of the table (i.e., the number of buckets) is m. Note that, by the rebuilding policy of Definition 3.2, it is m/4 < n < m. As we know, each one of the m buckets stores O(1) elements in expectation. When an update takes place and no rebuilding of the table is triggered, all the witnesses along the path of the update in the accumulation tree have to be updated. By using the algorithm described in Lemma 3.6, the witnesses within the bucket can be updated in expected complexity O(1), since the size of the bucket is an expected value. The witnesses of the internal nodes can be updated in O(m^ε log m) complexity, and therefore the overall complexity is O(m^ε log m) in expectation. When a rebuilding of the table is triggered, the total complexity is O(m log m), since there is a constant number of levels in the accumulation tree, processing each node has complexity O(m^ε log m) (since the degree of any internal node is O(m^ε)) and the maximum number of nodes that lie in any level is O(m^{1−ε}). Therefore, the actual complexity of an update is expected O(m^ε log m) when no rebuilding is triggered, and O(m log m) otherwise. We are interested in the expected value of the amortized complexity (expected amortized complexity) of an update. Let n_i be the number of elements contained in the hash table after update i and m_i be the number of buckets after update i. We do the analysis by defining the following potential function, where α_i = n_i/m_i:

    F_i = c(2n_i − m_i) log m_i,    if α_i ≥ 1/2;
    F_i = c(m_i/2 − n_i) log m_i,   if α_i < 1/2.

The amortized complexity of update i equals γ̂_i = γ_i + F_i − F_{i−1}. Therefore E[γ̂_i] = E[γ_i] + F_i − F_{i−1}, since F_i is a deterministic function.
To perform the analysis more precisely, we define some constants. Let c_1 be the constant such that, if the update complexity C is O(m_i^ε log m_i), then

    C ≤ c_1 m_i^ε log m_i.   (3.11)

Also, let r_1 be the constant such that, if the rebuilding complexity R is O(n_i log n_i), then

    R ≤ r_1 n_i log n_i.   (3.12)

We also note that in all cases it holds that

    m_i/4 ≤ n_i ≤ m_i.   (3.13)

We perform the analysis by distinguishing the following cases:

1. α_{i−1} ≥ 1/2 (insertion). For this case, we examine whether the hash table is rebuilt or not. In case the hash table is not rebuilt, we have m_{i−1} = m_i and n_i = n_{i−1} + 1. Therefore the amortized complexity will be

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ c_1 m_i^ε log m_i + c(2n_i − m_i − 2n_{i−1} + m_{i−1}) log m_i
           = c_1 m_i^ε log m_i + 2c log m_i.

In case the hash table is rebuilt (which takes O(n log n) complexity in total), we have m_i = 2m_{i−1}, n_i = n_{i−1} + 1 and n_{i−1} = m_{i−1} (which gives n_i = m_i/2 + 1), and the amortized complexity will be

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ r_1 n_i log n_i + c(2n_i − m_i) log m_i − c(2n_{i−1} − m_{i−1}) log m_{i−1}
           = r_1 n_i log n_i + c(2n_i − m_i) log m_i − c(m_i/2) log(m_i/2)
           ≤ r_1 (m_i/2) log(m_i/2) + 2c log m_i − c(m_i/2) log(m_i/2)
           ≤ 2c log m_i,

for a constant c of the potential function such that c > r_1.

2. α_{i−1} < 1/2 (insertion). Note that there is no way the hash table is rebuilt in this case. Therefore m_{i−1} = m_i and n_i = n_{i−1} + 1. If now α_i < 1/2, the amortized complexity will be

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ c_1 m_i^ε log m_i + c(m_i/2 − n_i) log m_i − c(m_{i−1}/2 − n_{i−1}) log m_{i−1}
           = c_1 m_i^ε log m_i + c(m_i/2 − n_i − m_i/2 + n_{i−1}) log m_i
           = c_1 m_i^ε log m_i − c log m_i.

If now α_i ≥ 1/2, the amortized complexity will be

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ c_1 m_i^ε log m_i + c(2n_i − m_i) log m_i − c(m_{i−1}/2 − n_{i−1}) log m_{i−1}
           = c_1 m_i^ε log m_i + c(2(n_{i−1} + 1) − m_{i−1} − m_{i−1}/2 + n_{i−1}) log m_i
           = c_1 m_i^ε log m_i + c(3n_{i−1} − 3m_{i−1}/2 + 2) log m_i
           = c_1 m_i^ε log m_i + c(3α_{i−1}m_{i−1} − 3m_{i−1}/2 + 2) log m_i
           < c_1 m_i^ε log m_i + c(3m_{i−1}/2 − 3m_{i−1}/2 + 2) log m_i
           = c_1 m_i^ε log m_i + 2c log m_i.

3. α_{i−1} < 1/2 (deletion). Here we have n_i = n_{i−1} − 1. In case the hash table does not have to be rebuilt (i.e., 1/4 < α_i < 1/2 and m_i = m_{i−1}), the amortized complexity of the deletion is going to be

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ c_1 m_i^ε log m_i + c(m_i/2 − n_i) log m_i − c(m_{i−1}/2 − n_{i−1}) log m_{i−1}
           = c_1 m_i^ε log m_i + c(m_i/2 − n_i − m_i/2 + n_{i−1}) log m_i
           = c_1 m_i^ε log m_i + c log m_i.

In case now the hash table has to be rebuilt (which has O(n_i log n_i) complexity), we have m_i = m_{i−1}/2 and m_i = 4n_i, and therefore the amortized complexity is

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ r_1 n_i log n_i + c(m_i/2 − n_i) log m_i − c(m_{i−1}/2 − n_{i−1}) log m_{i−1}
           ≤ r_1 n_i log n_i + c(m_i/2 − n_i) log m_i − c(m_i − (n_i + 1)) log 2m_i
           ≤ r_1 n_i log n_i − c(m_i/2 − 1) log m_i − c(3n_i − 1)
           ≤ r_1 n_i log n_i − c(m_i/2) log m_i + c log m_i
           ≤ r_1 m_i log m_i − (c/2)m_i log m_i + c log m_i
           ≤ c log m_i,

where c must also be chosen to satisfy c > 2r_1.

4. α_{i−1} ≥ 1/2 (deletion). In this case we have m_{i−1} = m_i. If α_i ≥ 1/2, the amortized complexity will be

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ c_1 m_i^ε log m_i + c(2n_i − m_i − 2n_{i−1} + m_{i−1}) log m_i
           ≤ c_1 m_i^ε log m_i − 2c log m_i.

Finally, for the case α_i < 1/2, we have

    E[γ̂_i] = E[γ_i] + F_i − F_{i−1}
           ≤ c_1 m_i^ε log m_i + c(m_{i−1}/2 − n_i − 2n_{i−1} + m_{i−1}) log m_i
           = c_1 m_i^ε log m_i + c(3m_{i−1}/2 − (n_{i−1} − 1) − 2n_{i−1}) log m_i
           = c_1 m_i^ε log m_i + c(3m_{i−1}/2 − 3n_{i−1} + 1) log m_i
           = c_1 m_i^ε log m_i + c(3n_{i−1}/(2α_{i−1}) − 3n_{i−1} + 1) log m_i
           ≤ c_1 m_i^ε log m_i + c log m_i.
Therefore we conclude that, for all constants c > 2r_1 of the potential function, the expected value of the amortized complexity of any operation is bounded by E[γ̂_i] ≤ c_1 m_i^ε log m_i + 2c log m_i. By using now Relation 3.13, there is a constant r such that E[γ̂_i] ≤ r n_i^ε log n_i, which implies that the expected value of the amortized complexity of any update (insertion/deletion) in an authenticated hash table containing n elements is O(n^ε log n), for 0 < ε < 1. □

3.2.3 Queries and verification

We now show how a proof for an element e ∈ X (or an element e ∉ X) can be constructed by using the authenticated data structure presented in the previous section. Let H(e) = j, i.e., the bucket that corresponds to element e is j. Let v_0, v_1, ..., v_l be the path from the node that corresponds to bucket j to the root of T(ε). We add a fictitious node v_{−1} that stores element e within bucket j, such that v_{−1}, v_0, v_1, ..., v_l is the path in T(ε) from the node that corresponds to element e to the root of T(ε). We consider two cases, i.e., membership and non-membership proofs:

• Element e is contained in the hash table. The proof is the ordered sequence π_0, π_1, ..., π_l, where π_i is a tuple of a prime representative and a witness that authenticates every node of the path v_{−1}, v_0, ..., v_l from the element in question e to the root of the tree v_l. Thus, item π_i of proof Π(e) (i = 0, ..., l) is defined as

    π_i = (r_i(χ(v_{i−1})), W_{v_{i−1}(v_i)}),   (3.14)

where W_{v_{i−1}(v_i)} is defined in Relation 3.8 and χ(v_{−1}) = e. For simplicity, we set

    α_i = r_i(χ(v_{i−1})) and β_i = W_{v_{i−1}(v_i)}.   (3.15)

For example, in Figure 3.1, the proof for an element that belongs to the bucket of node a (e.g., element 2) consists of the following tuples:

    π_0 = (r_0(2), g_0^{r_0(3)r_0(7)r_0(9)} mod N_0),
    π_1 = (r_1(χ(a)), g_1^{r_1(χ(b))r_1(χ(c))r_1(χ(d))} mod N_1),
    π_2 = (r_2(χ(f)), g_2^{r_2(χ(e))r_2(χ(g))r_2(χ(p))} mod N_2).

• Element e is not contained in the hash table. Let y_1, y_2, ..., y_u be the elements contained in bucket j (all different from e). First, output a membership proof (as above) for an element y_i in bucket j (note that H(y_i) = H(e)). Then, by running the extended Euclidean algorithm, output a non-membership witness

    π_ν = (A_e, B_e, r_0(e), e),   (3.16)

where A_e, B_e and r_0(e) are defined in Relation 3.3. Note that A_e, B_e are integer values proving non-membership of e in the set {y_1, y_2, ..., y_u} (a sketch of this computation follows the algorithm description below).

We now describe the algorithm formally:

Algorithm {Π(q), α(q)} ← query(q, D_h, auth(D_h), pk): Let e = q be the queried element. If e is contained in D_h, set Π(q) = (π_0, π_1, ..., π_l), as in Relation 3.14, and output α(q) = true. If e is not contained in D_h, output a membership proof for some other element y_i in bucket j, such that H(e) = H(y_i). Then output a non-membership proof π_ν for e in bucket j, as defined in Relation 3.16. Set Π(q) = (Π(y_i), π_ν) and output α(q) = false.
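The pair (A_e, B_e) of Relation 3.16 consists of Bezout coefficients returned by the extended Euclidean algorithm; a minimal sketch, with toy primes standing in for the representatives r_0(·):

from math import prod

def ext_gcd(a, b):                       # returns (g, x, y) with a*x + b*y = g
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def non_membership_pair(bucket_reps, e_rep):
    g, A, B = ext_gcd(prod(bucket_reps), e_rep)
    assert g == 1                        # holds: distinct primes are coprime
    return A, B                          # the (A_e, B_e) of Relation 3.3

reps = [7, 11, 23]                       # stand-ins for r_0(y_1), r_0(y_2), r_0(y_3)
A, B = non_membership_pair(reps, 17)     # r_0(e) = 17 for an absent element e
assert prod(reps) * A + 17 * B == 1      # the identity verify() relies on in 2d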
Lemma 3.8 Algorithm query() of the authenticated data structure scheme RHT has O(n^ε) expected access complexity without precomputed witnesses. With precomputed witnesses, algorithm query() has O(1) expected access complexity. Moreover, it outputs a proof Π(q) of O(1) expected group complexity.

Proof: (a) Membership proof: Without precomputed witnesses, the construction of π_0 always has O(1) expected access complexity, since each bucket contains O(1) elements in expectation. Also, the construction of each element π_i (i = 1, ..., l) has O(n^ε) access complexity, due to the degree bound of the nodes in T(ε). Therefore the total complexity is expected O(n^ε). With precomputed witnesses, each π_i can be "read" directly from memory with O(1) access complexity (not expected). Finally, the group complexity of Π(q) for a membership proof is O(1) (not expected), since one witness for each level of T(ε) must be provided. (b) Non-membership proof: A non-membership proof consists of a membership proof (thus the above arguments apply here as well) plus the proof π_ν (see Relation 3.16). Therefore, in both cases (without and with precomputed witnesses), π_ν turns the complexities into expected ones, since the complexity of A_e and B_e depends on the number of elements in the bucket in question (see the observation before Section 3.1.4), which is expected O(1). This completes the proof. □

We now formally describe the verification algorithm. The verification algorithm takes as input a proof and an answer and either accepts or rejects the answer.

Algorithm {accept, reject} ← verify(q, α, Π, d_h, pk): Let the query q refer to element e, i.e., q = e. We distinguish two cases:

1. Membership proof: In this case α = true. The proof Π contains a membership proof for e, denoted Π(e) = π_0, π_1, ..., π_l, where π_i = (α_i, β_i) for i = 0, ..., l, and where the α_i are all primes. The algorithm outputs reject if one of the following is true:

(a) h_0(α_0) ≠ e (the prime representative of element e is not correct);
(b) h_i(α_i) ≠ β_{i−1}^{α_{i−1}} mod N_{i−1} for some 1 ≤ i ≤ l (false witness);
(c) d_h ≠ β_l^{α_l} mod N_l (final digest mismatch).

2. Non-membership proof: In this case α = false. We recall that in this case the proof Π contains (a) a membership proof Π(y) = π_0, π_1, ..., π_l for an element y ≠ e such that H(y) = H(e), where π_i = (α_i, β_i) for i = 0, ..., l (the α_i are all primes); and (b) a non-membership proof for e, denoted π_ν = (A, B, r, e), where r is a prime. The algorithm outputs reject if one of the following is true:

(a) H(e) ≠ H(y) (e and y do not belong to the same bucket);
(b) The membership proof for y does not verify, i.e., reject ← verify(y, true, Π(y), d_h, pk);
(c) h_0(r) ≠ e (the prime representative r contained in π_ν for element e is not correct);
(d) α_1^A g_0^{rB} ≠ g_0 mod N_0 (the verification test for the non-membership proof of e in the corresponding bucket does not succeed; see Lemma 3.2).

If all the above tests are successful, the algorithm outputs accept. A sketch of the membership checks is given after the following lemma.

Lemma 3.9 Algorithm verify() of the authenticated data structure scheme RHT has O(1) expected access complexity.

Proof: Processing the membership proof has O(1) access complexity, since it requires processing l = O(1) pairs of witnesses and prime representatives. Moreover, processing the non-membership proof has O(1) expected access complexity, due to the size of the non-membership witness, which depends on the number of elements in the bucket in question. Therefore, the expected access complexity of verify() is O(1). □
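For concreteness, the membership checks 1a-1c amount to a chain of modular exponentiations; a minimal sketch, where the two-argument h below is a hypothetical stand-in for the level functions h_i chosen by genkey():

def verify_membership(e, proof, digest, moduli, h):
    """proof: [(alpha_0, beta_0), ..., (alpha_l, beta_l)]; moduli: [N_0..N_l];
    h(i, a): hypothetical stand-in for the two-universal functions h_i."""
    alphas = [a for a, _ in proof]
    betas = [b for _, b in proof]
    if h(0, alphas[0]) != e:                                  # check 1a
        return False
    for i in range(1, len(proof)):                            # check 1b
        if h(i, alphas[i]) != pow(betas[i - 1], alphas[i - 1], moduli[i - 1]):
            return False
    return digest == pow(betas[-1], alphas[-1], moduli[-1])   # check 1c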
3.2.4 Correctness and security

The following lemmata describe the correctness and the security of our new construction, according to Definitions 2.4 and 2.5. The security of our scheme is based on the strong RSA assumption.

Lemma 3.10 The authenticated data structure scheme RHT = {genkey, setup, update, refresh, query, verify} is correct according to Definition 2.4.

Proof: Let D_0 be any hash table containing n elements and having m = O(n) buckets. Fix the security parameter k and output pk = {N_i, g_i, h_i : i = 0, ..., l} and sk = {φ(N_i) : i = 0, ..., l} by calling algorithm genkey(). Then output an authenticated data structure auth(D_0) and the respective digest d_0 by calling algorithm setup(). Pick a polynomial number of updates—namely, pick a polynomial number of elements for insertion or deletion—and update auth(D_0) and d_0 by calling algorithm refresh(). Let D_h be the final hash table, auth(D_h) be the produced authenticated data structure and d_h be the final digest. Let e be an element that belongs (or should belong) to bucket j (i.e., H(e) = j). Output a proof Π(e) and an answer by calling query(). We distinguish two cases:

1. Element e is contained in the hash table. Then Π(e) is a membership proof as defined in Relation 3.14. Note that π_0 contains the prime representative of e with the respective witness; therefore verify() does not reject at Item 1a. By the definitions of the accumulation values output by setup() and maintained under updates by refresh() (see Relations 3.6 and 3.7) and by the definition of proof element π_i in Relation 3.14 for i = 1, ..., l, verify() does not reject at Items 1b and 1c;

2. Element e is not contained in the hash table. Let y_1, y_2, ..., y_u be the elements in bucket j, where H(e) = j. In this case, the non-membership proof consists of (a) a membership proof π_i = (α_i, β_i) for an element y contained in bucket j (which verifies due to Item 1) such that H(e) = H(y) = j, therefore verify() does not reject at either Item 2a or Item 2b; and (b) a non-membership proof (A_e, B_e, r_0(e), e) for element e that should belong to bucket j. Therefore verify() does not reject at Item 2c, since h_0(r_0(e)) = e. Also it does not reject at Item 2d, since

    α_1^{A_e} g_0^{r_0(e)B_e} = g_0^{(∏_{j=1}^{u} r_0(y_j))A_e + r_0(e)B_e} = g_0 mod N_0,

since, by construction, α_1 = g_0^{∏_{j=1}^{u} r_0(y_j)} mod N_0, and, by Relation 3.3, A_e and B_e are computed to satisfy

    (∏_{j=1}^{u} r_0(y_j)) A_e + r_0(e) B_e = 1.

This completes the proof. □

Lemma 3.11 The authenticated data structure scheme RHT = {genkey, setup, update, refresh, query, verify} is secure according to Definition 2.5 and under the strong RSA assumption.

Proof: Let k be the security parameter. Output pk = {N_i, g_i, h_i : i = 0, ..., l} and sk = {φ(N_i) : i = 0, ..., l} by calling algorithm genkey(). Let Adv be a polynomially-bounded adversary. Adv picks an initial collection of n elements X, stored in a hash table D_0. Adv outputs an authenticated data structure auth(D_0) by calling algorithm setup() through oracle access. Then Adv picks a polynomial number of updates—namely, he picks a polynomial number of elements for insertion or deletion. Let D_h be the final hash table, let the updated final element collection be X, and let d_h be the final digest as produced by the adversary through oracle access to algorithm update(). We will compute the probability that check() rejects while verify() accepts, as required by Definition 2.5. We distinguish two cases:

1. Membership proof: The adversary outputs a membership proof Π(e) = (π_0, π_1, ..., π_l) (l = ⌈1/ε⌉), where π_i = (α_i, β_i) (see algorithm query()), for an element e ∉ X (thus, a proof for an incorrect answer). Let v_0, v_1, ..., v_l be a path of nodes in T(ε) from the bucket referring to e to the root of the tree. We now define the following events, related to the adversary's choice of proof above. Our goal is to express the probability that verify(e, true, Π(e), d_h, pk) accepts and e ∉ X as a function of the following events.
Note that d_h is the correct digest of the authenticated data structure:

(a) E_{0,0}: The values e and α_0 picked by Adv are such that e ∉ X, α_0 is prime and h_0(α_0) = e;

(b) E_j: For j = 1, ..., l, the values α_j, α_{j−1} and β_{j−1} picked by Adv are such that both α_j and α_{j−1} are primes and

    h_j(α_j) = β_{j−1}^{α_{j−1}} mod N_{j−1} for all 1 ≤ j ≤ l.

This event can be partitioned into two mutually exclusive events, i.e., E_j = E_{j,0} ∪ E_{j,1}, such that

• E_{j,0}: Value h_j(α_j) is not the correctly formed digest (i.e., an accumulation of the digests of its children) of some node v_{j−1} ∈ N(v_j), as defined in Relation 3.7;
• E_{j,1}: Value h_j(α_j) is the correctly formed digest of a node v_{j−1} ∈ N(v_j), as defined in Relation 3.7.

(c) E_{l+1,1}: The values α_l and β_l picked by Adv are such that β_l^{α_l} = d_h mod N_l.

The probability that verify() accepts while e ∉ X is the probability

    Pr[E_{0,0} ∩ E_1 ∩ E_2 ∩ ... ∩ E_{l+1,1}]
      = Pr[E_{0,0} ∩ (E_{1,0} ∪ E_{1,1}) ∩ (E_{2,0} ∪ E_{2,1}) ∩ ... ∩ E_{l+1,1}]
      ≤ Pr[E_{1,1}|E_{0,0}] + Pr[E_{2,1}|E_{1,0}] + Pr[E_{3,1}|E_{2,0}] + ... + Pr[E_{l+1,1}|E_{l,0}]
      = Pr[E_{1,1}|E_{0,0}] + Σ_{j=2}^{l+1} Pr[E_{j,1}|E_{j−1,0}].   (3.17)

First we examine the event E_{1,1}|E_{0,0}. This event implies that the adversary has found a value e ∉ X, a prime α_0 such that h_0(α_0) = e, and a value β_0 such that

    β_0^{α_0} = g_0^{∏_{t=1,...,l_0} r_0(x_t)} mod N_0,

where x_1, x_2, ..., x_{l_0} is a subset of the set X. Since e ∉ X, it is e ∉ {x_1, x_2, ..., x_{l_0}}. Also, since every prime representative is mapped to a unique element through function h_0, we conclude that it must be α_0 ∉ {r_0(x_1), r_0(x_2), ..., r_0(x_{l_0})}. By Lemma 3.2 and Assumption 3.1, this probability is neg(k). Therefore Pr[E_{1,1}|E_{0,0}] ≤ neg(k). For the remaining events E_{j,1}|E_{j−1,0} (2 ≤ j ≤ l + 1), we have:

• By the one-to-one property of the function h_{j−1}(·), E_{j−1,0} implies that value α_{j−1} is not the prime representative of the correctly formed digest of some node v_{j−2} ∈ N(v_{j−1}), as defined in Relation 3.7, namely that

    α_{j−1} ∉ {r_{j−1}(χ(v_t)) : v_t ∈ N(v_{j−1})};

• However, the event E_{j,1} implies that (1) the digest h_j(α_j) (for j = l + 1 this is just d_h) is the correctly formed digest of node v_{j−1}; and (2)

    β_{j−1}^{α_{j−1}} = g_{j−1}^{∏_{v_t∈N(v_{j−1})} r_{j−1}(χ(v_t))} mod N_{j−1},

where the r_{j−1}(χ(v_t)) are the prime representatives of the correctly formed digests of the set of children of v_{j−1}. Since α_{j−1} ∉ {r_{j−1}(χ(v_t)) : v_t ∈ N(v_{j−1})}, by Lemma 3.2 and Assumption 3.1, this probability is neg(k).

Therefore, for all j = 1, ..., l + 1, Pr[E_{j,1}|E_{j−1,0}] is neg(k). Since l = O(1), the total probability is also neg(k). This concludes the proof for the membership case.

2. Non-membership proof: For this case, we define the events:

(a) B_0: Adv finds e ∈ X and y such that H(e) = H(y) = j;

(b) B_1: Adv finds a membership proof for y, namely the proof π = π_0, π_1, ..., π_l, where π_i = (α_i, β_i) (the α_i being prime numbers) and accept ← verify(y, true, π, d_h, pk);

(c) B_2: Adv finds α_1 and a non-membership proof (A, B, r, e) for e such that r is a prime number, h_0(r) = e and α_1^A g_0^{rB} = g_0 mod N_0. This event is partitioned into two events:

    i. B_{20}: α_1 ≠ acc(L_j) = g_0^{∏_{x∈L_j} r_0(x)} mod N_0;
    ii. B_{21}: α_1 = acc(L_j) = g_0^{∏_{x∈L_j} r_0(x)} mod N_0.

We need to compute the probability Pr[B_0 ∩ B_1 ∩ B_2] = Pr[B_0 ∩ B_1 ∩ (B_{20} ∪ B_{21})] ≤ Pr[B_{20}|B_1|B_0] + Pr[B_{21}|B_0]. By Lemma 3.2 and Assumption 3.1, it is Pr[B_{21}|B_0] ≤ ν(k), where ν(k) is the appropriate negligible function. Note also that we can express B_{20}|B_1|B_0 as a function of the events E of the membership proof case.
Specifically, the event B_{20}|B_1|B_0 implies the event E_{1,0} ∩ E_2 ∩ E_3 ∩ ... ∩ E_{l+1,1}, whose probability, following the same reasoning as in Relation 3.17, is bounded by neg(k). This concludes the proof for both the membership and the non-membership cases. □

We can now present the main result of this section.

Theorem 3.2 Let k be the security parameter and 0 < ε < 1. Then there exists a publicly-verifiable authenticated data structure scheme RHT = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a dynamic hash table D storing n elements such that:

1. It is correct according to Definition 2.4 and secure according to Definition 2.5 under the strong RSA assumption;
2. The access complexity of setup() is O(n), outputting an authenticated data structure auth(D) of O(n) group complexity;
3. The expected amortized access complexity of update() is O(1), outputting update information upd of O(1) group complexity;
4. The expected amortized access complexity of refresh() is O(n^ε log n) (or O(1));
5. The expected access complexity of query() is O(1) (or O(n^ε)), outputting a proof Π(q) for a query q of O(1) expected group complexity;
6. The expected access complexity of verify() is O(1).

Proof: This result follows directly from Lemmata 3.4, 3.5, 3.7, 3.8, 3.9, 3.10 and 3.11. The complexities in brackets (O(1) for refresh() and O(n^ε) for query()) refer to the case when no precomputed witnesses are used. Note that the presented scheme is publicly verifiable, since verify() does not take the secret key as an input. □

3.2.5 A more practical scheme

The construction we have presented (the RHT authenticated data structure scheme) uses different RSA moduli for each level of the tree, and each new RSA modulus has a bit-length that is three times longer than the bit-length of the previous-level RSA modulus. Therefore, computations corresponding to higher levels in the accumulation tree are more expensive, since they involve modular arithmetic operations over longer elements. This increase in the lengths of the RSA moduli is due to the need to compute, for the elements stored at every level in the tree, prime representatives of size that is three times as large as the size of the elements (see Lemma 3.1). Although from a theoretical point of view this is not a concern, as the number of levels of the tree is constant (i.e., 1/ε), from a practical point of view it can be prohibitive for efficiently implementing our schemes. To overcome this complexity overhead, we want to use the same RSA modulus at every level of the tree, and to achieve this, we present a heuristic inspired by a similar method originally used in the work of Baric and Pfitzmann [10]. Instead of using two-universal hash functions to map (general) integers to primes of increased size, the idea is to employ random oracles [12] for consistently computing primes of relatively small size. In particular, given a k-bit integer x, instead of mapping it to a 3k-bit prime, we can map it to the value 2^t·2^b·g(x) + d, where g(x) is the b-bit output of a random oracle (which in practice is the output of a cryptographic hash function), at the end of which we append b zeros so as to make this number large enough; t is the number of bits by which we shift 2^b·g(x) to the left; and d = 1, 3, ..., 2^t − 1 is a number we add so that 2^t·2^b·g(x) + d is a prime. Note that we require that t be related to b according to Relation 3.18 of Theorem 3.3.
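A minimal sketch of this prime-representative computation (the parameters b and t are illustrative, and the Miller-Rabin test below is probabilistic, which suffices for a sketch):

import hashlib

def is_prime(n, _bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    # Miller-Rabin with fixed bases; adequate for a sketch
    if n < 2:
        return False
    for p in _bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2; s += 1
    for a in _bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def q_rep(x, b=256, t=9):
    gx = int.from_bytes(hashlib.sha256(str(x).encode()).digest(), 'big')
    a = gx << b                       # append b zeros: a = 2^b * g(x)
    base = a << t                     # shift by t: 2^t * a
    for d in range(1, 1 << t, 2):     # odd offsets within [2^t a, 2^t a + 2^t - 1]
        if is_prime(base + d):
            return base + d
    raise ValueError("no prime in interval (probability at most 2^-b)")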
In the following, we denote by q(x) a prime representative of x computed by the above procedure, i.e., the output of a procedure that transforms a k-bit integer into a k'-bit prime, where k' < k. Note that the above procedure (i.e., the computation of q(x)) cannot map two different integers to the same prime. This follows from the random oracle property, namely that for x_1 ≠ x_2, w.h.p. it is g(x_1) ≠ g(x_2), which implies that the intervals [2^t·2^b·g(x_1), 2^t·2^b·g(x_1) + 2^t − 1] and [2^t·2^b·g(x_2), 2^t·2^b·g(x_2) + 2^t − 1] are disjoint. Finally, we show that we can make sure that, with high probability, we will always be able to find a prime within the specified interval.

Theorem 3.3 Let x be a k-bit integer and let a = 2^b·g(x) be the output of a b-bit random oracle with b zeros appended at the end. The interval [2^t a, 2^t a + 2^t − 1] contains a prime with probability at least 1 − 2^{−b}, provided

    b ≤ ⌊log(1 + √(2^t + 4e^{2^t − 1}))⌋ − 1.   (3.18)

Proof: By the prime distribution theorem, the number of primes less than n is approximately n/ln n. Therefore, we want to compute the probability

    Pr[(2^t a + 2^t − 1)/ln(2^t a + 2^t − 1) − 2^t a/ln(2^t a) ≥ 1] = Pr[a ≤ e^{2^t − 1}/2^t],

by assuming ln(2^t a + 2^t − 1) ≈ ln(2^t a), since a > 2^b ≫ 2^t. By the random oracle property we have that

    Pr[a ≤ e^{2^t−1}/2^t] = Pr[2^b g(x) ≤ e^{2^t−1}/2^t] = (e^{2^t−1}/2^{b+t}) · (1/2^b).

Note that

    (e^{2^t−1}/2^{b+t}) · (1/2^b) ≥ 1 − 1/2^b ⟺ (1 − √(2^t + 4e^{2^t−1}))/2 ≤ 2^b ≤ (1 + √(2^t + 4e^{2^t−1}))/2,

which gives b ≤ ⌊log(1 + √(2^t + 4e^{2^t−1}))⌋ − 1, since b is a positive integer. This completes the proof. □

Using Theorem 3.3, we can pick the length of the output of the random oracle so as to ensure hitting a prime with high probability. For example, for t = 9 we get b ≤ 368, which holds for most practical hash functions used today (e.g., SHA-256). Using the above method, we can still accumulate primes in the exponent, but this time without having to increase the size of the RSA moduli at any level of the tree. The only conditions we need in order to securely use the RSA accumulator are:

1. the safe accumulation of primes that map to unique integers (i.e., each accumulated prime can represent only one integer); and
2. the bit-length of the accumulated primes is smaller than the bit-length of the RSA modulus used.

Thus, we can apply our new procedure for computing prime representatives to all of the constructions of Section 3.2, with one important efficiency improvement: the same RSA modulus and exponentiation bases are used at all levels of the accumulation tree. With this heuristic we obtain overall the same security and complexity results as before, but now we have a more practical accumulator whose security is based on both the strong RSA and the random oracle assumptions.

3.2.6 Protocols

Three-party protocol. By using Theorem 2.1, we can easily derive the following corollary, which describes the use of the authenticated data structure scheme RHT of Theorem 3.2 by Protocol 2.1.

Corollary 3.1 Let k be the security parameter and assume that the strong RSA assumption holds. Then there exists a three-party authenticated data structures protocol (see Protocol 2.1) for verifying (non-)membership queries q on a dynamic hash table storing n elements such that:

1. The setup at the source has O(n) access complexity;
2. The update at the source has O(1) expected amortized access complexity;
3. The space needed at the source has O(n) group complexity;
4. The communication between the source and the server has O(1) group complexity;
5. The update at the server has O(n^ε log n) (or O(1)) expected amortized access complexity;
6. The query at the server has O(1) (or O(n^ε)) expected access complexity;
7. The space needed at the server has O(n) group complexity;
8. The communication between the server and the client has O(1) expected group complexity;
9. The verification at the client has O(1) expected access complexity;
10. For a query q sent by the client to the server at any time (even after updates), let α be an answer and let π be a proof returned by the server. With probability Ω(1 − neg(k)), the client accepts the answer α if and only if α is correct.

Two-party protocol. In order to use the authenticated data structure scheme RHT of Theorem 3.2 in a black-box way with Theorem 2.2—and derive a two-party authenticated data structures protocol—we have to ensure that Assumption 2.1 holds for the authenticated data structure scheme RHT:

Lemma 3.12 Assumption 2.1 is true for the authenticated data structure scheme RHT. Moreover, for every update u, |Q_u| has O(1) amortized complexity.

Proof: Let an update u refer to element e, i.e., either insert element e into the hash table or delete element e from the hash table. We distinguish two cases:

1. The hash table is not rebuilt (see Definition 3.2) due to update u. In this case, the respective set of queries Q_u required for Assumption 2.1 simply contains one query for element e, i.e., q_u = e. Let {Π(e), α(e)} ← query(e, D_h, auth(D_h), pk). We now describe the function z(·) from Assumption 2.1. Let q_u = e. Function z(·) first computes δ_u(D_h) as H(e) = j, since update() needs to access bucket j in order to perform any operation on element e, where H(e) = j. [Footnote: We recall that update() does not update the bucket j itself, but receives the updated bucket from update()—see Definition 2.3.] To compute δ_u(auth(D_h)), z(·) processes the proof Π(q_u) as follows. We recall that both a membership and a non-membership proof contain the ordered sequence π_0, π_1, ..., π_l, where π_i is a tuple of a prime representative and a witness that authenticates every digest of the path v_0, v_1, ..., v_l from the bucket in question, H(e) = j, to the root of the tree v_l. Thus, item π_i of proof Π(e) (i = 0, 1, ..., l) is defined as (see Relation 3.14)

    π_i = (r_i(χ(v_{i−1})), W_{v_{i−1}(v_i)}).

Note now that update() needs to access χ(v_i), for i = 0, 1, ..., l, in order to perform the update (see Relation 3.10). All these values are easily computed from the π_i: Just set χ(v_i) = W_{v_{i−1}(v_i)}^{r_i(χ(v_{i−1}))}—see Relation 3.7. Therefore the function z(·) computes such exponentiations and outputs δ_u(auth(D_h)) with O(1) complexity, the same as the complexity of algorithm verify().

2. The hash table is rebuilt (see Definition 3.2) due to update u. Then the set of queries Q_u consists of queries for all the elements contained in the hash table; therefore its size is O(n), where n is the number of elements stored in the table. In this case z(·) is just a call to algorithm setup(). Because the hash table has to be rebuilt, and in this case |Q_u| = O(n), it follows (with a similar analysis as in Lemma 3.5) that |Q_u| has O(1) amortized complexity. This completes the proof. □

By Theorems 2.2 and 3.2 and Lemma 3.12, we can now state the final result for the two-party model:

Corollary 3.2 Let k be the security parameter and assume that the strong RSA assumption holds. Then there exists a two-party authenticated data structures protocol (see Protocol 2.2)
for verifying (non-)membership queries q on a dynamic hash table storing n elements such that:

1. When precomputed witnesses are used, the protocol is non-interactive; otherwise, it requires one round of interaction during updates;
2. The setup at the client has O(n) access complexity;
3. The update at the client has O(1) expected amortized access complexity;
4. The verification at the client has O(1) expected access complexity;
5. The space needed at the client has O(1) group complexity;
6. The communication between the client and the server has O(1) expected amortized group complexity during updates and O(1) expected group complexity during queries;
7. The update at the server has O(n^ε log n) (or O(n^ε)) expected amortized access complexity;
8. The query at the server has O(1) (or O(n^ε)) expected access complexity;
9. The space needed at the server has O(n) group complexity;
10. For a query q sent by the client to the server at any time (even after updates), let α be an answer and let π be a proof returned by the server. With probability Ω(1 − neg(k)), the client accepts the answer α if and only if α is correct.

3.3 Scheme based on the bilinear-map accumulator

In this section we use the bilinear-map accumulator and present a new authenticated data structure scheme BHT = {genkey, setup, update, refresh, query, verify} for dynamic hash tables. We use exactly the same methodology as in Section 3.2, that is, nested invocations of accumulators in a constant-depth tree, to obtain overall similar complexity and security results to the solution presented before. Accordingly, we use the same structure in presenting and proving our results. Note, however, that there are significant differences (both in complexity and in cryptography) imposed by the use of the different cryptographic primitive. For example, the underlying algebraic groups are fundamentally different (the RSA accumulator uses Z_N, while the bilinear-map accumulator uses groups defined over elliptic curves). This imposes certain differences in the complexity of some algorithms, which will be discussed in the following sections. We begin with algorithms genkey() and setup(). Again, the underlying (plain) data structure is a dynamic hash table T(X) storing n elements X = {x_1, x_2, ..., x_n}. As in the RSA accumulator case, the elements are distributed into m buckets L_1, L_2, ..., L_m (using a two-universal hash function H), where m = O(n).

Algorithm {sk, pk} ← genkey(1^k): The algorithm chooses a k-bit prime p and an exponentiation base g that is a generator of a multiplicative cyclic group G of prime order p, for which there is a bilinear map e(·,·): G × G → 𝒢. [Footnote: The generator g is used as the exponentiation base at all levels of the accumulation tree T(ε).] All of the above are chosen uniformly at random, as indicated by Assumption 3.2; basically the algorithm has to generate the tuple t = (p, G, 𝒢, e, g). Then it randomly picks a number s ∈ Z*_p (s is the trapdoor). An upper bound q on the total number of elements that will be accumulated is decided, and the algorithm also computes the elements g^s, g^{s^2}, ..., g^{s^q} of G. Finally, a function h: G → Z_p that outputs the bit description of the elements of G is used. [Footnote: In this way we make sure that the accumulation value output at some level can be used as input to the next level of accumulation, since we can only accumulate elements of Z_p and not elements of G.] Note that since G has exactly p elements, the function maps each element of G to an integer in Z_p. In order not to overload notation, we assume that when the input to the function h(·) is an element x ∈ Z_p, it just outputs x. The algorithm outputs s ∈ Z*_p as sk and everything else as pk.
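With the trapdoor s, an accumulation value g^{(x_1+s)(x_2+s)···(x_n+s)} costs a single exponentiation, since the exponent can be reduced modulo p first. A minimal sketch in a toy prime-order subgroup (the pairing e(·,·) needed for verification is not modelled; a real implementation would use a pairing-friendly curve library):

p, q, g = 11, 23, 2                 # toy: <g> has prime order p = 11 in Z_23*

def g_pow(e):                       # model of exponentiation in G
    return pow(g, e % p, q)

def acc(elements, s):               # accumulation value, with the trapdoor s
    exp = 1
    for x in elements:
        exp = (exp * (x + s)) % p   # one pass of Z_p arithmetic...
    return g_pow(exp)               # ...then a single exponentiation

def witness(elements, j, s):        # the leaf-level analogue of Relation 3.21
    exp = 1
    for x in elements:
        if x != j:
            exp = (exp * (x + s)) % p
    return g_pow(exp)

s = 7
A = acc([2, 3, 9], s)
W = witness([2, 3, 9], 3, s)
assert pow(W, (3 + s) % p, q) == A  # W^{x_j + s} recovers the accumulation value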
Algorithm {auth(D_0), d_0} ← setup(D_0, sk, pk): The algorithm builds the accumulation tree T(ε) on top of the m buckets L_1, L_2, ..., L_m. For every leaf node v in tree T(ε) that lies at level 0 and corresponds to a bucket L_j, the algorithm sets

    χ(v) = g^{∏_{x∈L_j}(x+s)} ∈ G,   (3.19)

while for every non-leaf node v in T(ε) that lies at level 1 ≤ i ≤ l, the algorithm sets

    χ(v) = g^{∏_{u∈N(v)}(h(χ(u))+s)} ∈ G,   (3.20)

where h(χ(u)) is an element of Z_p, computed using function h(·). The authenticated data structure auth(D_0) output by the algorithm consists of the following components:

1. The accumulation tree T(ε);
2. For every node v ∈ T(ε) (at some level i), the accumulation value χ(v).

Let r be the root of the tree T(ε). The algorithm also outputs d_0 = χ(r), i.e., the digest of the authenticated data structure is the χ(·) value of the root of the accumulation tree.

Precomputed witnesses. The precomputed witnesses are defined in this case as follows: For every j ∈ N(v) we store at node v the witness

    W_{j(v)} = g^{∏_{u∈N(v)−{j}}(h(χ(u))+s)}.   (3.21)

When the construction with precomputed witnesses is used, auth(D_0) also includes W_{j(v)}, for all v ∈ T(ε) and all j ∈ N(v).

Lemma 3.13 Algorithm setup() of the authenticated data structure scheme BHT has O(n) access complexity both with and without precomputed witnesses. Moreover, the authenticated data structure auth(D_0) output by setup() always has O(n) group complexity.

Proof: Same as Lemma 3.4, with the difference that the efficient computation of the exponent expressions is now feasible because sk contains the trapdoor s. □

We continue with the algorithms used for updates:

Algorithm {D_{h+1}, auth(D_{h+1}), d_{h+1}, upd} ← update(u, D_h, auth(D_h), d_h, sk, pk): Let m be the current number of buckets of D_h and n be the number of elements stored in D_h, after the update has been performed. We distinguish two cases:

Case 1. m/4 < n < m: In this case there is no need to rebuild the table and the update is performed as follows. Suppose the update we consider is the insertion of an element e ∈ Z_p. The algorithm computes the bucket j = H(e) (see Relation 3.1) and inserts e in bucket j. Let v_0 be the node of T(ε) referring to bucket j. Let v_0, v_1, ..., v_l be the path in T(ε) from node v_0 to the root of the tree. The algorithm initially sets χ'(v_0) = χ(v_0)^{e+s}, i.e., it updates the accumulation value that corresponds to the updated bucket. Note that if the update we consider is the deletion of an element e, the algorithm sets

    χ'(v_0) = χ(v_0)^{(e+s)^{−1}}.   (3.22)

Subsequently, for j = 1, ..., l the algorithm sets

    χ'(v_j) = χ(v_j)^{(h(χ'(v_{j−1}))+s)(h(χ(v_{j−1}))+s)^{−1}},   (3.23)

where χ(v_{j−1}) is the previous accumulation value and χ'(v_{j−1}) is the updated accumulation value. All these values are stored by the algorithm after they have been computed. The algorithm also outputs the new accumulation values χ'(v_{j−1}) (j = 1, ..., l) as the information upd along the path from the updated bucket to the root of the tree. Information upd also includes e and χ'(v_l). It also sets d_{h+1} = χ'(v_l), i.e., the updated digest is the updated χ(·) value of the root of T(ε).
Finally, the new authenticated data structure auth(D_{h+1}) is computed as follows. Let auth(D_h) be the previous authenticated data structure that is input to the algorithm: Overwrite the values χ(v_{j−1}) (j = 1, ..., l) with the new values χ'(v_{j−1}) (j = 1, ..., l) and output the updated structure. The behavior of the algorithm in the precomputed witnesses case is the same, with the difference that upd = ∅.

Case 2. n = m/4 or n = m: In this case the hash table is rebuilt according to Definition 3.2: If n = m/4, the algorithm builds a data structure D_{h+1} with m/2 buckets. Otherwise, i.e., when n = m, the algorithm builds a data structure D_{h+1} with 2m buckets. Subsequently, it outputs auth(D_{h+1}) and d_{h+1} by calling algorithm setup(D_{h+1}, sk, pk). However, instead of setting upd = ∅, it sets upd = {auth(D_{h+1}), d_{h+1}}.

Lemma 3.14 By using the rebuilding policy of Definition 3.2, algorithm update() of the authenticated data structure scheme BHT has O(1) expected amortized access complexity. Moreover, the update information upd output by update() has O(1) amortized group complexity.

Proof: Same as Lemma 3.5. However, the group complexity of upd is amortized, because when the hash table is rebuilt, upd contains the new authenticated data structure (of group complexity O(n)). □

Before presenting the remaining algorithms, we provide some necessary complexity results. The following result is derived by using an FFT algorithm (e.g., see Preparata and Sarwate [96]) that computes the DFT in a finite field (e.g., Z_p), for arbitrary n, with O(n log n) field operations. Note that the algorithm does not require the existence of an n-th root of unity in Z_p.

Lemma 3.15 (Polynomial interpolation with FFT [96]) Let ∏_{i=1}^{n}(s + x_i) = Σ_{i=0}^{n} a_i s^i be a degree-n polynomial. The coefficients a_n, a_{n−1}, ..., a_0 can be computed with O(n log n) complexity, given x_1, x_2, ..., x_n.

Algorithm {D_{h+1}, auth(D_{h+1}), d_{h+1}} ← refresh(u, D_h, auth(D_h), d_h, upd, pk): Let m be the current number of buckets of D_h and n be the number of elements stored in D_h, after the update has been performed. We distinguish two cases:

Case 1. m/4 < n < m: Suppose the update is the insertion of an element e. The algorithm computes the bucket j = H(e) (see Relation 3.1) and inserts e in bucket j. Let v_0 be the node of T(ε) referring to bucket j. Let v_0, v_1, ..., v_l be the path in T(ε) from node v_0 to the root of the tree. The algorithm, for j = 0, ..., l, sets χ(v_j) = χ'(v_j), i.e., it updates the accumulation values that correspond to the updated path by using the information upd. [Footnote: Note that information upd is not required for refresh() to perform this task; algorithm refresh() uses upd for efficiency. Namely, algorithm refresh() could compute the updated values χ(v_j) by doing polynomial interpolation, which would have O(n^ε log n) complexity (see Lemma 3.15).] Finally it outputs the updated hash table as D_{h+1}, the updated accumulation values χ(v_j) (along with the ones that belong to the nodes that are not updated) as auth(D_{h+1}), and χ'(v_l) (contained in upd) as d_{h+1}.

Precomputed witnesses. When precomputed witnesses are used, the algorithm must update W_{j(v)} for v = v_0, v_1, ..., v_l and for all j ∈ N(v) (see Relation 3.21). To achieve this efficiently, the following result, derived in part from [83], is used (a sketch of the modification formula is given at the end of this subsection):

Lemma 3.16 (Witness update formulas) Suppose we are given the elements collection X = {x_1, x_2, ..., x_n}. Let W_i be the witness of x_i, i.e., W_i = g^{∏_{j≠i}(x_j+s)}. Then the following hold:

1. (Element addition) If X' = X ∪ {x_{n+1}}, then for all i = 1, ..., n + 1 it is

    W'_i = acc(X) · W_i^{x_{n+1}−x_i}.   (3.24)

2. (Element deletion) If X' = X − {x_j}, then for all i ≠ j it is

    W'_i = (W_i / W_j)^{1/(x_j−x_i)}.   (3.25)
3. (Element modification) If X' = X − {x_j} ∪ {x'_j}, then for all i ≠ j it is

    W'_i = W_j · (W_i / W_j)^{(x'_j−x_i)/(x_j−x_i)}.   (3.26)

For i = j, it is W'_i = W_i.

Proof: Relations 3.24 and 3.25 are given in the original work of Nguyen [83]. Relation 3.26 is derived as a corollary of Relations 3.24 and 3.25. Indeed, for all i ≠ j:

    W_j · (W_i/W_j)^{(x'_j−x_i)/(x_j−x_i)}
      = g^{∏_{x∈X−{x_j}}(x+s)} · ( g^{∏_{x∈X−{x_i}}(x+s)} / g^{∏_{x∈X−{x_j}}(x+s)} )^{(x'_j−x_i)/(x_j−x_i)}
      = g^{∏_{x∈X−{x_j}}(x+s)} · ( g^{(x_j−x_i)∏_{x∈X−{x_j},x_i}}(x+s)} )^{(x'_j−x_i)/(x_j−x_i)}
      = g^{∏_{x∈X−{x_j}}(x+s)} · g^{(x'_j−x_i)∏_{x∈X−{x_j,x_i}}(x+s)}
      = g^{(s+x_i)∏_{x∈X−{x_j,x_i}}(x+s)} · g^{(x'_j−x_i)∏_{x∈X−{x_j,x_i}}(x+s)}
      = g^{(s+x'_j)∏_{x∈X−{x_j,x_i}}(x+s)}
      = g^{∏_{x∈X'−{x_i}}(x+s)} = W'_i.

For i = j, the witness W_j does not change since, by definition, W_j is not a function of the value of j (x_j or x'_j). This completes the proof. □

Corollary 3.3 (Updating precomputed witnesses) Given the collection of elements X = {x_1, x_2, ..., x_n} and the witnesses W_i for all i = 1, ..., n, computing the updated witnesses W'_i of either X ∪ {x_{n+1}} or X − {x_j} or X − {x_j} ∪ {x'_j}, without the knowledge of the trapdoor s, has O(n) complexity.

Algorithm refresh() computes the updated witnesses as follows: Since the previous witnesses W_i are stored at each node v_i, for i = 1, ..., l, the algorithm uses Relations 3.24 and 3.25 to update the witnesses within the bucket of the update (depending on whether an element is added or deleted) and Relation 3.26 to update the witnesses that correspond to every internal node of the tree. Specifically, for an internal node v that has children v_1, v_2, ..., v_t, suppose the accumulation value χ(v_j) of v_j is modified. Then the element collections X and X' used in Relation 3.26 are

    X = {h(χ(v_1)), h(χ(v_2)), ..., h(χ(v_t))}

and

    X' = {h(χ(v_1)), h(χ(v_2)), ..., h(χ(v_{j−1})), h(χ'(v_j)), h(χ(v_{j+1})), ..., h(χ(v_t))}.

Case 2. n = m/4 or n = m: In this case the hash table is rebuilt according to Definition 3.2: If n = m/4, the algorithm builds a data structure D_{h+1} with m/2 buckets. Otherwise, i.e., when n = m, the algorithm builds a data structure D_{h+1} with 2m buckets. Subsequently, it outputs auth(D_{h+1}) and d_{h+1} from the information upd output by update(). We recall that upd includes the new witnesses. By using the same amortized analysis as in Lemma 3.7 (noting that the work refresh() does when rebuilding the hash table is O(m)—copying information from upd—and not O(m log m)), but with Corollary 3.3 in place of Lemma 3.6 in the proof, we can derive the following result:

Lemma 3.17 By using the rebuilding policy of Definition 3.2, algorithm refresh() of the authenticated data structure scheme BHT has O(1) expected amortized access complexity without precomputed witnesses. With precomputed witnesses, algorithm refresh() has O(n^ε) expected amortized access complexity.
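Before turning to queries, the modification formula (Relation 3.26) can be checked numerically in the same toy prime-order subgroup used earlier; note that the update itself uses only public values and x_j, x'_j, never the trapdoor s:

p, q, g = 11, 23, 2                 # toy: <g> has prime order p = 11 in Z_23*
s = 7                               # trapdoor: used only for the initial setup

def witness(xs, i):                 # W_i = g^{prod_{t != i}(x_t + s)} (setup side)
    exp = 1
    for t, x in enumerate(xs):
        if t != i:
            exp = (exp * (x + s)) % p
    return pow(g, exp, q)

xs, xs_new = [2, 3, 9], [2, 5, 9]   # modify x_j = 3 into x'_j = 5
Wi, Wj = witness(xs, 0), witness(xs, 1)
# Relation 3.26 WITHOUT s: W'_i = W_j * (W_i / W_j)^{(x'_j - x_i)/(x_j - x_i)}
expo = ((5 - 2) * pow((3 - 2) % p, p - 2, p)) % p   # division in the exponent
Wi_new = (Wj * pow(Wi * pow(Wj, q - 2, q) % q, expo, q)) % q
assert Wi_new == witness(xs_new, 0)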
3.3.1 Queries and verification

We now show how a proof for an element e ∈ X (or an element e ∉ X) can be constructed. As in the RSA accumulator case, let H(e) = j (the bucket assignment for e) and let v_0, v_1, ..., v_l be the path from the node that corresponds to bucket j to the root of T(ε). We recall that v_{−1} is a fictitious node that stores element e within bucket j, such that v_{−1}, v_0, v_1, ..., v_l is the path in T(ε) from the node that corresponds to element e to the root of T(ε). We consider two cases, i.e., membership and non-membership proofs:

• Element e is contained in the hash table. The proof is the ordered sequence π_0, π_1, ..., π_l, where π_i is a tuple of an accumulation value χ(·) and a witness that authenticates every node of the path v_{−1}, v_0, ..., v_l from the element in question e to the root of the tree v_l. Thus, item π_i of proof Π(e) (i = 0, ..., l) is defined as

    π_i = (χ(v_{i−1}), W_{v_{i−1}(v_i)}),   (3.27)

where W_{v_{i−1}(v_i)} is defined in Relation 3.21. For simplicity, we set

    α_i = χ(v_{i−1}) (note that χ(v_{−1}) = e) and β_i = W_{v_{i−1}(v_i)}.   (3.28)

For example, in Figure 3.1, the proof for an element that belongs to the bucket of node a (e.g., element 2) consists of the following tuples:

    π_0 = (2, g^{(s+3)(s+7)(s+9)}),
    π_1 = (χ(a), g^{(h(χ(b))+s)(h(χ(c))+s)(h(χ(d))+s)}),
    π_2 = (χ(f), g^{(h(χ(e))+s)(h(χ(g))+s)(h(χ(p))+s)}).

• Element e is not contained in the hash table. Let y_1, y_2, ..., y_u be the elements contained in bucket j (all different from e). First, output a membership proof (as above) for an element y_i in bucket j (note that H(y_i) = H(e)). Then, by running the extended Euclidean algorithm for polynomials, output a non-membership witness

    π_ν = (A_e, B_e, e),   (3.29)

where A_e, B_e are elements of G defined in Relation 3.5. Note that A_e, B_e have group complexity O(1) (and not expected O(1) as in the RSA accumulator case—see Relation 3.3—since they consist of just one group element each), and they are used to prove non-membership of e in the set {y_1, y_2, ..., y_u}. We now describe the algorithm formally:

Algorithm {Π(q), α(q)} ← query(q, D_h, auth(D_h), pk): Let e = q be the queried element. If e is contained in D_h, set Π(q) = (π_0, π_1, ..., π_l), as in Relation 3.27, and output α(q) = true. If e is not contained in D_h, output a membership proof for some other element y_i in bucket j, such that H(e) = H(y_i). Then output a non-membership proof π_ν for e in bucket j, as defined in Relation 3.29. Set Π(q) = (Π(y_i), π_ν) and output α(q) = false.

Lemma 3.18 Without precomputed witnesses, algorithm query() of the authenticated data structure scheme BHT has O(n^ε log n) expected access complexity. With precomputed witnesses, algorithm query() has O(1) expected access complexity. Moreover, it outputs a proof Π(q) of O(1) group complexity.

Proof: The proof is the same as that of Lemma 3.8, with the following differences:

1. Without precomputed witnesses, a witness cannot be constructed by direct exponentiation, since the trapdoor s is not known. It is constructed with polynomial interpolation as follows: Suppose the witness is g^{(y_1+s)(y_2+s)···(y_t+s)} (where t = O(n^ε)). Compute a_0, a_1, ..., a_t by using Lemma 3.15 (O(n^ε log n) complexity). Then output the witness as

    g^{a_0} × (g^s)^{a_1} × (g^{s^2})^{a_2} × ... × (g^{s^t})^{a_t},

where g, g^s, ..., g^{s^t} are contained in the public key. This final task has O(n^ε) access complexity. A sketch of this computation is given after the proof.

2. The proof has group complexity O(1), and not expected O(1), due to the compactness of the non-membership proof in the bilinear-map accumulator construction (see Relation 3.5).

This completes the proof. □
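A minimal sketch of this interpolation-based witness construction in the toy subgroup used earlier (naive O(t^2) coefficient expansion stands in for the FFT of Lemma 3.15):

p, q, g = 11, 23, 2                 # toy: <g> has prime order p = 11 in Z_23*
s = 7                               # trapdoor, used ONLY to publish the powers
powers = [pow(g, pow(s, i, p), q) for i in range(8)]   # g^{s^i}: public key

def poly_coeffs(ys):
    """Coefficients a_0..a_t of prod_i (s + y_i), as a polynomial in s over Z_p."""
    coeffs = [1]
    for y in ys:
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] = (new[i] + y * c) % p      # the y * s^i term
            new[i + 1] = (new[i + 1] + c) % p  # the s^{i+1} term
        coeffs = new
    return coeffs

def witness_from_powers(ys):
    w = 1
    for i, a in enumerate(poly_coeffs(ys)):    # g^{sum a_i s^i} = g^{prod (y_i+s)}
        w = (w * pow(powers[i], a, q)) % q
    return w

exp = 1
for y in (3, 9):
    exp = (exp * (y + s)) % p
assert witness_from_powers([3, 9]) == pow(g, exp, q)   # matches trapdoor route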
The algorithm outputs reject if one of the following is true (note that the verification algorithm uses the bilinear map function $e(\cdot,\cdot)$):

(a) $\alpha_0 \neq e$ (element $\alpha_0$ is not correct);
(b) $e(\alpha_i, g) \neq e\left(\beta_{i-1}, g^s g^{h(\alpha_{i-1})}\right)$ for some $1 \leq i \leq l$ (false witness);
(c) $e(d_h, g) \neq e\left(\beta_l, g^s g^{h(\alpha_l)}\right)$ (final digest mismatch).

2. Non-membership proof: In this case it is $\alpha = $ false. The proof $\Pi$ in this case contains $\Pi(y) = (\pi_0, \pi_1, \ldots, \pi_l)$, i.e., the membership proof for an element $y \neq e$, where $\pi_i = (\alpha_i, \beta_i)$ for $i = 0, \ldots, l$. It also contains $\pi_\nu = (A, B, r)$, the non-membership proof for $e$. The algorithm outputs reject if one of the following is true:

(a) $H(e) \neq H(y)$ ($e$ and $y$ do not belong in the same bucket);
(b) The membership proof for $y$ does not verify, i.e., it is reject $\leftarrow$ verify($y$, true, $\Pi(y)$, $d_h$, pk);
(c) $r \neq e$ (the data element contained in $\pi_\nu$ for element $e$ is not correct);
(d) $e(\alpha_1, A)\, e(g^s g^r, B) \neq e(g, g)$ (the verification test for the non-membership proof of $e$ does not succeed; see Lemma 3.3).

If all the above tests are successful, the algorithm outputs accept.

Lemma 3.19 Algorithm verify() of the authenticated data structure scheme BHT has $O(1)$ access complexity.

Proof: Same as in Lemma 3.9, with the difference that the complexity is not expected any more, due to the compactness of the non-membership proof. □

We finally give the results for correctness and security of the scheme BHT:

Lemma 3.20 The authenticated data structure scheme BHT = {genkey, setup, update, refresh, query, verify} is correct according to Definition 2.4.

Proof: The proof follows the same logic as the proof of Lemma 3.10. As such, we only show correctness for the non-membership proof case. Let $y_1, y_2, \ldots, y_u$ be the elements contained in the bucket where $e$ should belong. The non-membership proof, as computed by query(), that is needed for verification is $(A_e, B_e, e)$. Therefore verify() does not reject at Item 2c, since $r = e$. Also it does not reject at Item 2d, since
\[
e(\alpha_1, A_e)\, e(g^s g^e, B_e) = e\left(g^{\prod_{j=1}^{u}(y_j+s)}, A_e\right) e(g^s g^e, B_e) = e(g, g),
\]
since, by Relation 3.5, $A_e = g^{\alpha(s)}$ and $B_e = g^{\beta(s)}$ such that $\left[\prod_{j=1}^{u}(y_j+s)\right]\alpha(s) + (e+s)\beta(s) = 1$. □

Lemma 3.21 The authenticated data structure scheme BHT = {genkey, setup, update, refresh, query, verify} is secure according to Definition 2.5 and under the q-strong Diffie-Hellman assumption.

Proof: The proof follows exactly the same logic as the proof of Lemma 3.11. Let $k$ be the security parameter. Output $pk = \{h(\cdot), (p, G, \mathcal{G}, e, g), \{g^s, g^{s^2}, \ldots, g^{s^q}\}\}$ and $sk = s \in \mathbb{Z}_p^*$ by calling algorithm genkey(). Let Adv be a polynomially-bounded adversary. Adv picks an initial collection of $n$ elements $X$, stored in hash table $D_0$. Adv outputs an authenticated data structure auth($D_0$), by calling algorithm setup() through oracle access. Then Adv picks a polynomial number of updates, namely a polynomial number of elements for insertion or deletion. Let $D_h$ be the final hash table, let the updated final element collection be $X$, and let $d_h$ be the final digest as produced by the adversary through oracle access to algorithm update(). We will compute the probability that check() rejects while verify() accepts, as required by Definition 2.5. For the case of a membership proof, the adversary Adv outputs an incorrect answer $e \notin X$ and also a proof $\Pi(e) = (\pi_0, \pi_1, \ldots, \pi_l)$ ($l = \lceil 1/\epsilon \rceil$), where $\pi_i = (\alpha_i, \beta_i)$ (see algorithm query()).
Let $v_0, v_1, \ldots, v_l$ be the path of nodes in $T(\epsilon)$ from the bucket referring to $e$ to the root of the tree. We now define the following events, related to the above choice of proof made by the adversary. Our goal will be to express the probability that verify($e$, true, $\Pi(e)$, $d_h$, pk) accepts and $e \notin X$ as a function of the following events. Note that $d_h$ is the correct digest of the authenticated data structure:

1. $E_{0,0}$: The value $\alpha_0$ picked by Adv is such that $\alpha_0 = e \notin X$;

2. $E_j$: For $j = 1, \ldots, l$, the values $\alpha_j, \alpha_{j-1}$ and $\beta_{j-1}$ picked by Adv are such that
\[
e(\alpha_j, g) = e\left(\beta_{j-1}, g^s g^{h(\alpha_{j-1})}\right) \quad \text{for all } 1 \leq j \leq l.
\]
This event can be partitioned into two mutually exclusive events, i.e., $E_j = E_{j,0} \cup E_{j,1}$, such that

• $E_{j,0}$: Value $\alpha_j$ is not the correctly formed digest (i.e., an accumulation of the digests of its children) of some node $v_{j-1} \in N(v_j)$, as defined in Relation 3.20;
• $E_{j,1}$: Value $\alpha_j$ is the correctly formed digest of a node $v_{j-1} \in N(v_j)$, as defined in Relation 3.20.

3. $E_{l+1,1}$: The values $\alpha_l$ and $\beta_l$ picked by Adv are such that $e\left(\beta_l, g^s g^{h(\alpha_l)}\right) = e(d_h, g)$.

The probability that verify() accepts while $e \notin X$ is the probability
\[
\begin{aligned}
\Pr[E_{0,0} \cap E_1 \cap E_2 \cap \ldots \cap E_{l+1,1}]
&= \Pr[E_{0,0} \cap (E_{1,0} \cup E_{1,1}) \cap (E_{2,0} \cup E_{2,1}) \cap \ldots \cap E_{l+1,1}] \\
&\leq \Pr[E_{1,1}\,|\,E_{0,0}] + \Pr[E_{2,1}\,|\,E_{1,0}] + \Pr[E_{3,1}\,|\,E_{2,0}] + \ldots + \Pr[E_{l+1,1}\,|\,E_{l,0}] \\
&= \Pr[E_{1,1}\,|\,E_{0,0}] + \sum_{j=2}^{l+1} \Pr[E_{j,1}\,|\,E_{j-1,0}].
\end{aligned} \tag{3.30}
\]

First we examine the event $E_{1,1}\,|\,E_{0,0}$. This event implies that the adversary has found a value $\alpha_0 = e \notin X$ and a value $\beta_0$ such that
\[
e\left(\beta_0, g^s g^{h(\alpha_0)}\right) = e\left(g^{\prod_{t=1,\ldots,l_0}(s+x_t)}, g\right),
\]
where $x_1, x_2, \ldots, x_{l_0}$ is a subset of the set $X$. Since $e = h(\alpha_0) \notin X$, it is $e \notin \{x_1, x_2, \ldots, x_{l_0}\}$. By Lemma 3.3 and Assumption 3.2, this probability is neg($k$). Therefore $\Pr[E_{1,1}\,|\,E_{0,0}] \leq$ neg($k$). For the remaining events $E_{j,1}\,|\,E_{j-1,0}$ ($2 \leq j \leq l+1$), we have:

• $E_{j-1,0}$ implies that value $\alpha_{j-1}$ is not the correctly formed digest of some node $v_{j-2} \in N(v_{j-1})$, as defined in Relation 3.20, namely that $\alpha_{j-1} \notin \{\chi(v_t) : v_t \in N(v_{j-1})\}$, which gives $h(\alpha_{j-1}) \notin \{h(\chi(v_t)) : v_t \in N(v_{j-1})\}$ by the one-to-one property of $h(\cdot)$;

• However, the event $E_{j,1}$ implies that (1) digest $\alpha_j$ (for $j = l+1$ this is just $d_h$) is the correctly formed digest of node $v_{j-1}$; and (2)
\[
e\left(\beta_{j-1}, g^s g^{h(\alpha_{j-1})}\right) = e\left(g^{\prod_{v_t \in N(v_{j-1})}(s+h(\chi(v_t)))}, g\right),
\]
where the $\chi(v_t)$ are the correctly formed digests of the set of neighbors of $v_{j-1}$. Since $h(\alpha_{j-1}) \notin \{h(\chi(v_t)) : v_t \in N(v_{j-1})\}$, by Lemma 3.3 and Assumption 3.2, this probability is neg($k$).

Therefore for all $j = 1, \ldots, l+1$, $\Pr[E_{j,1}\,|\,E_{j-1,0}]$ is neg($k$). Since $l = O(1)$, the total probability is also neg($k$). This concludes the proof for the membership proof. For the case of a non-membership proof, the proof follows exactly the same logic as Lemma 3.11, so it is omitted. □

We continue with the following corollary, which is useful in Chapter 5:

Corollary 3.4 Let $H(e) = j$ and $\Pi(e) = \{(\alpha_i, \beta_i) : i = 0, \ldots, l\}$ be a membership proof for element $e$. The probability that verify($e$, true, $\Pi(e)$, $d_h$, pk) accepts and $\beta_0 \neq g^{\prod_{x \in L_j - \{e\}}(s+x)}$ is neg($k$).

Proof: The event that $\beta_0 \neq g^{\prod_{x \in L_j - \{e\}}(s+x)}$ and verify() accepts implies the event that $\alpha_1 \neq g^{\prod_{x \in L_j}(s+x)}$ and verify() accepts. Therefore the probability in question is less than or equal to the probability $\Pr[E_{1,0} \cap E_2 \cap \ldots \cap E_{l+1,1}]$, since $E_{1,0}$ is exactly the event $\alpha_1 \neq g^{\prod_{x \in L_j}(s+x)}$. By following the same proof procedure as in Relation 3.30, this can be proved to be neg($k$) as well. □
Theorem 3.4 Let $k$ be the security parameter and $0 < \epsilon < 1$. Then there exists a publicly-verifiable authenticated data structure scheme BHT = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a dynamic hash table $D$ storing $n$ elements such that:

1. It is correct according to Definition 2.4 and secure according to Definition 2.5 and under the bilinear q-strong Diffie-Hellman assumption;
2. The access complexity of setup() is $O(n)$, outputting an authenticated data structure auth($D$) of $O(n)$ group complexity;
3. The expected amortized access complexity of update() is $O(1)$, outputting update information upd of $O(1)$ amortized group complexity;
4. The expected amortized access complexity of refresh() is $O(n^\epsilon)$ (or $O(1)$);
5. The expected access complexity of query() is $O(1)$ (or $O(n^\epsilon \log n)$), outputting a proof $\Pi(q)$ for a query $q$ of $O(1)$ group complexity;
6. The access complexity of verify() is $O(1)$.

Proof: This result follows directly from Lemmata 3.13, 3.14, 3.17, 3.18, 3.19, 3.20 and 3.21. The complexities in brackets ($O(1)$ for refresh() and $O(n^\epsilon \log n)$ for query()) refer to the case when no precomputed witnesses are used. □

Finally, we note here that both constructions (RSA accumulator and bilinear-map accumulator) use the same algorithmic ideas, i.e., the accumulation tree. We could have described a scheme by using an abstract notion of an accumulator and then derived our results by instantiating the abstract solution with the RSA accumulator and the bilinear-map accumulator. However, we chose not to do that because we feel that this would add more complexity to the presentation, for something that can be derived and described a lot more easily with no abstraction.

3.3.2 Protocols

Three-party protocol. By using Theorem 2.1 we can easily derive the following corollary that describes the use of the authenticated data structure scheme BHT of Theorem 3.4 in the three-party model:

Corollary 3.5 Let $k$ be the security parameter and assume that the bilinear q-strong Diffie-Hellman assumption holds. Then there exists a three-party authenticated data structures protocol (see Protocol 2.1) for verifying (non-)membership queries $q$ on a dynamic hash table storing $n$ elements such that:

1. The setup at the source has $O(n)$ access complexity;
2. The update at the source has $O(1)$ expected amortized access complexity;
3. The space needed at the source has $O(n)$ group complexity;
4. The communication between the source and the server has $O(1)$ amortized group complexity;
5. The update at the server has $O(n^\epsilon)$ (or $O(1)$) expected amortized access complexity;
6. The query at the server has $O(1)$ (or $O(n^\epsilon \log n)$) expected access complexity;
7. The space needed at the server has $O(n)$ group complexity;
8. The communication between the server and the client has $O(1)$ group complexity;
9. The verification at the client has $O(1)$ access complexity;
10. For a query $q$ sent by the client to the server at any time (even after updates), let $\alpha$ be an answer and let $\pi$ be a proof returned by the server. With probability $\Omega(1 - \text{neg}(k))$, the client accepts the answer $\alpha$ if and only if $\alpha$ is correct.

Two-party protocol. As a corollary of Lemma 3.12 (the proof follows exactly the same techniques), we can state a similar result for the authenticated data structure scheme BHT:

Corollary 3.6 Assumption 2.1 is true for the authenticated data structure scheme BHT. Moreover, for every update $u$, $|Q_u|$ has $O(1)$ amortized complexity.
By Theorems 2.2 and 3.4 and Corollary 3.6, we can now state the final result for the two-party model:

Corollary 3.7 Let $k$ be the security parameter and assume that the bilinear q-strong Diffie-Hellman assumption holds. Then there exists a two-party authenticated data structures protocol (see Protocol 2.2) for verifying (non-)membership queries $q$ on a dynamic hash table storing $n$ elements such that:

1. When precomputed witnesses are used, the protocol requires one round of interaction during updates that cause the hash table to be rebuilt (see Definition 3.2); when no precomputed witnesses are used, it requires one round of interaction during updates;
2. The setup at the client has $O(n)$ access complexity;
3. The update at the client has $O(1)$ expected amortized access complexity;
4. The verification at the client has $O(1)$ access complexity;
5. The space needed at the client has $O(1)$ group complexity;
6. The communication between the client and the server has $O(1)$ amortized group complexity during updates and $O(1)$ group complexity during queries;
7. The update at the server has $O(n^\epsilon)$ (or $O(n^\epsilon \log n)$) expected amortized access complexity;
8. The query at the server has $O(1)$ (or $O(n^\epsilon \log n)$) expected access complexity;
9. The space needed at the server has $O(n)$ group complexity;
10. For a query $q$ sent by the client to the server at any time (even after updates), let $\alpha$ be an answer and let $\pi$ be a proof returned by the server. With probability $\Omega(1 - \text{neg}(k))$, the client accepts the answer $\alpha$ if and only if $\alpha$ is correct.

3.4 Complexity limitations

In this chapter, we proposed a new, provably secure, cryptographic construction for verifying hash table queries over a dynamic set. We use nested cryptographic accumulators on a tree of constant depth to achieve constant query and verification costs and sublinear update costs. Our results are applicable to both the two-party and three-party data authentication models. We use our method to authenticate general set-membership queries and overall improve over previous techniques that use cryptographic accumulators, reducing the main complexity measures to constant, yet keeping sublinear update complexity.

An important open problem is whether one can achieve logarithmic update cost and still keep the communication complexity constant. There has been no such solution to date. In particular, no method is known that can construct constant-size accumulator proofs (witnesses) in logarithmic time. In Chapter 6, however, which is the full version of the work of Papamanthou et al. [91], we describe a solution for this problem that uses a cryptographic primitive that, unfortunately, is not known to exist yet. On the other hand, we believe that doing even better, i.e., achieving constant complexity for all the complexity measures, seems to be infeasible due to the $\Omega(\log n / \log\log n)$ memory checking lower bound [35] on query complexity (the sum of read and write complexity). This result, however, motivates seeking more general lower bounds for authenticated data structures (similar directions have been followed in the lower bound works of Dwork et al. [35] and Tamassia and Triandopoulos [106]): given any cryptographic primitive, what is the best we can do in terms of complexity? Finally, it would be interesting to modify our schemes to obtain non-amortized bounds for updates, using for example Overmars' global rebuilding technique [87].
Chapter 4

Authenticated structures based on lattices

Lattices, infinite sets of specially constructed vectors, are a mathematical tool that made its first appearance in cryptography with Ajtai's seminal result [3], showing the construction of one-way functions based on hard lattice problems. Since then, lattices have been proven to enjoy appealing properties that have made their application to cryptography very promising. Such properties include the apparent resistance of lattice-based assumptions to quantum algorithms [98] (as opposed to other assumptions, such as factoring), as well as their worst-case to average-case reductions [99], namely the existence of polynomial-time algorithms that can transform a solution to a random instance of a certain problem into a solution to any (worst-case) instance of another lattice problem. As such, many cryptographic primitives based on lattice assumptions have been derived during the last decade, such as public-key encryption schemes (e.g., see the work of Peikert [95]) and collision-resistant hash functions (e.g., see the work of Lyubashevsky and Micciancio [71]). Even more significantly, the long-standing open problem of fully-homomorphic encryption was settled with a lattice-based construction in 2009 by Gentry [43]. Finally, lattice-based constructions appear to be efficient in practice due to the extensive use of linear algebra (and are therefore also easily parallelizable), and have also led to the deployment of lattice-based cryptographic systems (e.g., see the NTRU system by Hoffstein et al. [57]).

In this chapter we present the first authenticated data structure based on lattices, and specifically a lattice-based authenticated table with highly desirable complexity features, such as update optimality and parallelism (i.e., the constructed authenticated table admits parallel algorithms). Specifically, we design the first authenticated data structure based on lattices the update complexity of which is $O(1)$, improving in this way the $O(\log n)$ update bounds of previous constructions, such as the Merkle tree, while retaining efficient $O(\log n)$ proof complexity. Moreover, the used lattice-based cryptographic primitive lends itself to a natural notion of parallelism: as such, we describe parallel versions of our authenticated data structure algorithms, yielding the first parallel online memory checker [15] with $O(1)$ query complexity using $O(\log n)$ checkers in the CREW model (a parallel model of computation where processors can read concurrently but can write only exclusively) and without using a secret key setting, i.e., there is only need for small reliable but not secret memory (as opposed to [54]). We base the security of our constructions on the difficulty of approximating the gap version of the shortest vector problem in lattices (GAPSVP) within polynomial factors.

The key idea used here is to combine the simplicity of a Merkle tree [77] with a special property of lattice-based hash functions, which we establish and call repeated linearity. Roughly speaking, this property allows using the output of one invocation of the hash function as an input to another invocation of the function, without losing "structure". This observation, in the authenticated data structures setting, turns out to be crucial in achieving constant update complexity (as well as parallel algorithms), while keeping all the remaining complexity bounds logarithmic.
This is a trade-off that, to the best of our knowledge, has not been achieved so far in the literature, and it is feasible due to the use of lattices: for example, for a table data structure of $n$ entries, the constructions of Bellare and Micciancio [11], the authenticated data structure of Papamanthou et al. [90] and the memory checker of Dwork et al. [35] have $O(1)$ update but $\Omega(n^\epsilon)$ proof (or query) complexity, whereas hierarchical hashing constructions such as the one of Blum et al. [15] and the one of Goodrich and Tamassia [48] impose $O(\log n)$ bounds on all the complexity measures, which is to be expected given the lower bound for hash-based authenticated data structures by Tamassia and Triandopoulos [106].

The data structure we are considering in this chapter is a dynamic table of size $n$, read and written through indices $0, \ldots, n-1$. We base the security of our construction on the hardness of the GAPSVP problem in lattices [78], which has its own significance given recent attacks on collision-resistant functions such as MD-5 [103]. We note that our construction requires a one-time $O(n \log n)$ preprocessing, which is however amortized, in comparison with other works (see Table 4.1), after $\Omega(n \log n)$ updates.

Overview of the solution. Our authenticated data structure scheme, denoted with LBT in Table 4.1, can be seen as a generalization of the Merkle tree and related hierarchical hashing constructions [15, 48, 81]. By exploiting a property of lattice-based hash functions (which we call repeated linearity) over a typical Merkle tree, we depart from black-box use of generic collision-resistant hash functions (e.g., MD-5 or SHA-256) in the authenticated data structures setting. As a consequence, and in the Merkle tree paradigm, the digest of a tree node $v$ can be expressed as the "sum" of well-defined functions (called partial digests) applied to the data stored at the leaves of $v$'s subtree (Theorem 4.3). Exploiting this property enables constant update complexity as well as the derivation of parallel algorithms. It may also be of general interest and have other applications. A comparison of our solution with existing work is given in Table 4.1.

             [15,48,75,81]  [11]     [83]      [23,101]    [51]        [90]     LBT
setup()      n              n        n         n           n           n        n log n
update()     log n          1        1         1           n^ε         1        1
refresh()    log n          1        n         n log n     n^ε         1        log n
query()      log n          n^ε      1         1           n^ε         n^ε      log n
verify()     log n          n^ε      1         1           1           1        log n
proof Π(q)   log n          n^ε      1         1           1           1        log n
info. upd    1              1        1         1           n^ε         1        1
assumption   Generic CR     D. Log   B. q-DH   Strong RSA  Strong RSA  B. q-DH  GAPSVP

Table 4.1: Asymptotic access and group complexities of various authenticated data structure schemes (see Definition 2.3) for a dynamic table of $n$ entries. Parameter $0 < \epsilon < 1$ is a constant and GAPSVP is the gap shortest vector problem in lattices (Definition 4.1). In all schemes, the authenticated structure has group complexity $O(n)$ and genkey() has $O(1)$ complexity. Note that [90] is the published conference version of Chapter 3. The acronyms of the other assumptions can be found in Table 3.1. All schemes presented in the table are publicly verifiable.

We now give the formal definition of the underlying data structure scheme for which the authenticated data structure scheme LBT is designed.

The data structure scheme. Let T be a dynamic table of $n$ indices, storing values T[1], T[2], ..., T[$n$]. The data structure scheme {query(), update(), check()} (Definition 2.2) for a dynamic table T is as follows (a small sketch of this interface is given below):

1. T[$i$] $\leftarrow$ query($i$, T): Given an index $1 \leq i \leq n$, return T[$i$]. Answering this query has $O(1)$ complexity;
2. T$'$ $\leftarrow$ update($i, y$, T): Given an index $1 \leq i \leq n$, set T[$i$] := $y$. The complexity for this task is $O(1)$;
3. {accept, reject} $\leftarrow$ check($i, y$, T): If T[$i$] $\neq y$, return reject; else return accept.
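As a reference point for the algorithms that follow, here is a minimal Python transcription of this (unauthenticated) data structure scheme; the class name and the 0-based indexing are our own choices for the sketch.

```python
# Minimal sketch of the plain table data structure scheme {query, update, check};
# the scheme LBT of Section 4.2 authenticates exactly this interface.
class Table:
    def __init__(self, n):
        self.T = [0] * n          # n entries (0-indexed in this sketch)

    def query(self, i):           # T[i] <- query(i, T): O(1)
        return self.T[i]

    def update(self, i, y):       # T' <- update(i, y, T): O(1)
        self.T[i] = y

    def check(self, i, y):        # accept iff T[i] = y
        return "accept" if self.T[i] == y else "reject"

t = Table(8)
t.update(3, 42)
assert t.query(3) == 42 and t.check(3, 42) == "accept"
```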
4.1 Lattice definitions

We start with some basic definitions related to lattices. We use upper-case bold letters to denote matrices, e.g., $\mathbf{B}$, lower-case bold letters to denote vectors, e.g., $\mathbf{b}$, and lower-case italic letters to denote scalars. Finally, for a vector $\mathbf{x} = [x_1\ x_2\ \ldots\ x_k]^T$ (note that $T$ as an exponent in the vector notation denotes the transpose), $\|\mathbf{x}\|$ denotes the Euclidean norm of $\mathbf{x}$, i.e., $\|\mathbf{x}\| = (x_1^2 + x_2^2 + \ldots + x_k^2)^{1/2}$.

4.1.1 What is a lattice?

Given the security parameter $k$, a full-rank $k$-dimensional lattice is defined as the infinite-sized set of all vectors produced as the integer combinations
\[
\left\{\sum_{i=1}^{k} x_i \mathbf{b}_i : x_i \in \mathbb{Z},\ 1 \leq i \leq k\right\},
\]
where $B = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_k\}$ is the basis of the lattice and $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_k$ are linearly independent, all belonging to $\mathbb{R}^k$. We denote the lattice produced by $B$ (i.e., the set of vectors) with $L(B)$.

A well-known difficult problem in lattices is the approximation within a polynomial factor of the shortest vector in a lattice (the SVP problem): given a lattice $L(B)$ produced by a basis $B$, approximate up to a polynomial factor in $k$ the shortest (in the Euclidean sense) vector in $L(B)$, the length of which we denote with $\lambda(B)$. A similar problem in lattices is the "gap" version of the shortest vector problem (GAPSVP$_\gamma$), the difficulty of which is useful in our context:

Definition 4.1 (Problem GAPSVP$_\gamma$) An input to GAPSVP$_\gamma$ is a $k$-dimensional lattice basis $B$ and a number $d$, where $k$ is the security parameter. In YES inputs $\lambda(B) \leq d$ and in NO inputs $\lambda(B) > \gamma \times d$, where $\gamma \geq 1$.

We note that, for exponential values of $\gamma$, i.e., $\gamma = 2^{O(k)}$, one can use the LLL algorithm [65] and decide the above problem in polynomial time. The difficult version of the problem arises for polynomial $\gamma$, for which no efficient algorithm is known to date, even for factors slightly smaller than exponential [99], i.e., very big polynomials. Moreover, for polynomial factors, there is no proof that this problem is NP-hard (specifically, as outlined in [99], the current state of knowledge indicates that for $\gamma > \sqrt{k/\log k}$ it is unlikely that this problem is NP-hard, and no efficient algorithm is known to date), which makes the polynomial approximation cryptographically interesting as well. Therefore, a well-accepted assumption on which the security of our scheme is based is the following:

Assumption 4.1 (Hardness of GAPSVP$_\gamma$) Let GAPSVP$_\gamma$ be an instance of the gap version of the shortest vector problem in lattices, as defined in Definition 4.1, and let $k$ be the security parameter. There is no polynomial-time algorithm for solving GAPSVP$_\gamma$ for $\gamma = \text{poly}(k)$, except with negligible probability neg($k$).

4.1.2 Reductions

After Ajtai's seminal work [3], where a one-way function based on hard lattice problems is presented, Goldreich et al. [44] presented a variation of the function, providing at the same time collision resistance. Based on this collision-resistant hash function, Micciancio and Regev [78] described a generalized version of it, a modification of which we are using in our construction. The security of the hash function is based on the difficulty of the small integer solution problem (SIS):

Definition 4.2 (Problem SIS$_{q,m,\beta}$) Given an integer $q$, a matrix $M \in \mathbb{Z}_q^{k \times m}$ and a real $\beta$, find a non-zero integer vector $\mathbf{z} \in \mathbb{Z}^m \setminus \{0\}$ such that $M\mathbf{z} = 0 \bmod q$ and $\|\mathbf{z}\| \leq \beta$.

We continue with the definition of SIS$'$, where the solution vector is required to have at least one odd coordinate:

Definition 4.3 (Problem SIS$'_{q,m,\beta}$) Given an integer $q$, a matrix $M \in \mathbb{Z}_q^{k \times m}$ and a real $\beta$, find an integer vector $\mathbf{z} \in \mathbb{Z}^m \setminus 2\mathbb{Z}^m$ such that $M\mathbf{z} = 0 \bmod q$ and $\|\mathbf{z}\| \leq \beta$.
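To make Definitions 4.2 and 4.3 concrete, the following Python fragment checks whether a candidate vector solves a toy SIS (or SIS$'$) instance. The dimensions are illustrative and far too small for security; actually finding such a $\mathbf{z}$ for properly chosen parameters is exactly the problem assumed to be hard.

```python
# Toy checker for the SIS and SIS' conditions of Definitions 4.2 and 4.3
# (hypothetical small parameters; real instances use k, m, q as in Section 4.1.3).
import math
import random

k, m, q, beta = 3, 10, 97, 6.0
random.seed(0)
M = [[random.randrange(q) for _ in range(m)] for _ in range(k)]

def is_sis_solution(z):
    nonzero = any(z)
    in_kernel = all(sum(a * b for a, b in zip(row, z)) % q == 0 for row in M)
    short = math.sqrt(sum(t * t for t in z)) <= beta
    return nonzero and in_kernel and short

def is_sis_prime_solution(z):
    # SIS' additionally requires at least one odd coordinate (z not in 2Z^m)
    return is_sis_solution(z) and any(t % 2 for t in z)

print(is_sis_solution([0] * m))   # False: the zero vector never counts
```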
Note that at least one solution to problem SIS$_{q,m,\beta}$ exists when $\beta \geq \sqrt{m}\, q^{k/m}$ and $m > k$ [78]. Moreover, if $q \geq 4\sqrt{m}\, k^{1.5}\beta$, we will see that such a solution is difficult to find. For odd $q$, there is a polynomial-time reduction from SIS$'_{q,m,\beta}$ to SIS$_{q,m,\beta}$ [78]:

Lemma 4.1 (Reduction from SIS$'_{q,m,\beta}$ to SIS$_{q,m,\beta}$ [78]) For any odd integer $q \in 2\mathbb{Z}+1$ and any SIS$'$ instance $I = (q, M, \beta)$, if $I$ has a solution as an instance of SIS, then it has a solution as an instance of SIS$'$. Moreover, there is a polynomial-time algorithm that, on input a solution to a SIS instance $I$, outputs a solution to the same SIS$'$ instance $I$.

As proved by Micciancio and Regev [78], by choosing certain parameters, GAPSVP$_\gamma$ can be reduced to SIS$'$ (derived by combining Lemma 5.22 and Theorem 5.23 from the work of Micciancio and Regev [78]):

Lemma 4.2 (Reduction from GAPSVP$_\gamma$ to SIS$'_{q,m,\beta}$ [78]) Let $\beta, m, q = k^{O(1)}$ be polynomially-bounded values, with $q \geq 4\sqrt{m}\, k^{1.5}\beta$ and $\gamma = 14\pi\sqrt{k}\beta$. Then there is a probabilistic polynomial-time reduction from solving GAPSVP$_\gamma$ in the worst case to solving SIS$'_{q,m,\beta}$ on the average with non-negligible probability.

A direct application of Lemma 4.1 and Lemma 4.2 gives the following result:

Theorem 4.1 Let $q = k^{O(1)}$ be an odd positive integer. For any polynomially-bounded values $\beta, m = k^{O(1)}$, with $q \geq 4\sqrt{m}\, k^{1.5}\beta$ and $\gamma = 14\pi\sqrt{k}\beta$, there is a probabilistic polynomial-time reduction from solving GAPSVP$_\gamma$ in the worst case to solving SIS$_{q,m,\beta}$ on the average with non-negligible probability.

Theorem 4.1 states that if there is an algorithm that solves an average instance of SIS$_{q,m,\beta}$ (i.e., one where $M \in \mathbb{Z}_q^{k \times m}$ is chosen uniformly at random), for an odd $q$ with $q \geq 4\sqrt{m}\, k^{1.5}\beta$ and $\gamma = 14\pi\sqrt{k}\beta$, then this algorithm can be used to solve any instance of GAPSVP$_\gamma$.

4.1.3 Lattice-based hash function

Let $m = 2k\log q$ and $\beta = \delta\sqrt{m}$, where $\delta$ is poly($k$). Note that $\log\delta = O(\log k)$. We also require $q \geq 4\sqrt{m}\, k^{1.5}\beta = 8k^{2.5}\delta\log q$. It is easy to see that, given $k$ and $\delta$, there is always a $q = O(k^{2.5}\delta\log k)$ satisfying the above constraints; since $\delta$ is poly($k$), the bit-size of $q$ is $O(\log k)$. The collision-resistant hash function that we are using is a generalization of the function presented by Micciancio and Regev [78], where $\delta = O(1)$ (in the security parameter) is used instead. In our construction we use bigger values for $\delta$: namely, the value that we use to bound the norm of the solution vector can be up to poly($k$). This was observed in the original definition of Ajtai's one-way function [3], i.e., that the input vector can contain larger values (but not too large), and was also noted in its extension that achieves collision resistance [44]. This remark is very useful in our context and implies that the larger the value one picks for $\beta$, the larger the modulus $q$ should be so that security is guaranteed (still, $q$'s bit size is $O(\log k)$).

Let now $M \in \mathbb{Z}_q^{k \times m}$ be a $k \times m$ matrix chosen uniformly at random. We can define the function $h_M : \mathbb{Z}^m \rightarrow \mathbb{Z}_q^k$ as $h_M(\mathbf{x}) = M\mathbf{x} \bmod q$, where $\|\mathbf{x}\| \leq \beta$ and the modulo operation is taken component-wise.
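The next sketch instantiates $h_M$ with toy parameters (our own illustrative choices) and spells out in a comment why a collision would break SIS, mirroring the proof of Theorem 4.2 below.

```python
# Minimal sketch of the lattice-based hash h_M(x) = Mx mod q (toy parameters,
# nowhere near secure; k, q, delta below are illustrative stand-ins).
import random

k, delta = 4, 4
q = 257                                 # toy odd modulus
log_q = 9                               # bit-size stand-in for log q
m = 2 * k * log_q                       # m = 2k log q
random.seed(1)
M = [[random.randrange(q) for _ in range(m)] for _ in range(k)]

def h(x):
    """h_M(x) = Mx mod q, for inputs with entries in {0, ..., delta}."""
    assert len(x) == m and all(0 <= t <= delta for t in x)
    return tuple(sum(a * b for a, b in zip(row, x)) % q for row in M)

x = [random.randrange(delta + 1) for _ in range(m)]
print(h(x))

# A collision x != y in {0,...,delta}^m with h(x) == h(y) would give the short
# nonzero vector z = x - y with Mz = 0 mod q and ||z|| <= delta*sqrt(m) = beta,
# i.e., a SIS_{q,m,beta} solution; Theorem 4.2 turns that into a GAPSVP solver.
```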
The above function is collision resistant based on the difficulty of GAPSVP$_{14\pi\sqrt{k}\beta}$:

Theorem 4.2 (Strong collision resistance) Let $m = 2k\log q$, $\beta = \delta\sqrt{m}$ and let $q$ be an odd positive integer such that $q \geq 4\sqrt{m}\, k^{1.5}\beta$. Let also $M \in \mathbb{Z}_q^{k \times m}$ be a $k \times m$ matrix chosen uniformly at random. If there is a polynomial-time algorithm that finds two vectors $\mathbf{x}, \mathbf{y} \in \{0, 1, \ldots, \delta\}^m$ with $\mathbf{x} \neq \mathbf{y}$ such that $M\mathbf{x} = M\mathbf{y} \bmod q$, then there is a polynomial-time algorithm to solve any instance of GAPSVP$_{14\pi\delta\sqrt{km}}$.

Proof: Suppose there is an algorithm that finds $\mathbf{x}, \mathbf{y} \in \{0, 1, \ldots, \delta\}^m$ with $\mathbf{x} \neq \mathbf{y}$ such that $M\mathbf{x} = M\mathbf{y} \bmod q$. Then the non-zero vector $\mathbf{z} = \mathbf{x} - \mathbf{y}$, which also has norm $\|\mathbf{z}\| \leq \beta$ since its coordinates are between $-\delta$ and $+\delta$, comprises a solution to the problem SIS$_{q,m,\beta}$ (note that matrix $M$ by construction is chosen uniformly at random). By Theorem 4.1, this can be used to solve GAPSVP$_\gamma$ for $\gamma = 14\pi\sqrt{k}\beta$. Setting $\beta = \delta\sqrt{m}$, we get the desired result. □

Since $\delta = \text{poly}(k)$, $\gamma$ is also poly($k$) and therefore the presented hash function is secure by Assumption 4.1. We can now extend the function to accept two inputs as follows. Denote with $T_{\delta,+}$ the set of all $m \times 1$ ($m = 2k\log q$) vectors such that their last $k\log q$ entries are zero and the remaining entries are in $\{0, 1, \ldots, \delta\}$, and analogously with $T_{\delta,-}$ the set of all $m \times 1$ vectors such that their first $k\log q$ entries are zero and the remaining entries are in $\{0, 1, \ldots, \delta\}$:

Definition 4.4 (Lattice-based hash function with two inputs) We define the function $h_{M,\delta} : T_{\delta,+} \times T_{\delta,-} \rightarrow \mathbb{Z}_q^k$ as $h_{M,\delta}(\mathbf{x}, \mathbf{y}) = M(\mathbf{x} + \mathbf{y}) \bmod q$, where $\mathbf{x}, \mathbf{y} \in \{0, 1, \ldots, \delta\}^m$.

Note that we use both $M$ and $\delta$ as subscripts for the function. Similarly as in Theorem 4.2, this function is strongly collision resistant, i.e., if there is a polynomial-time algorithm that finds $(\mathbf{x}_1, \mathbf{y}_1) \in (T_{\delta,+} \times T_{\delta,-})$ and $(\mathbf{x}_2, \mathbf{y}_2) \in (T_{\delta,+} \times T_{\delta,-})$ with $(\mathbf{x}_1, \mathbf{y}_1) \neq (\mathbf{x}_2, \mathbf{y}_2)$ such that $M(\mathbf{x}_1 + \mathbf{y}_1) = M(\mathbf{x}_2 + \mathbf{y}_2) \bmod q$, then there is a polynomial-time algorithm that solves GAPSVP$_\gamma$ for polynomial $\gamma$. To see that, note that the vector $\mathbf{x}_1 - \mathbf{x}_2 + \mathbf{y}_1 - \mathbf{y}_2$ has coordinates in $\{-\delta, \ldots, \delta\}$, since, by the definition of $T_{\delta,+}$ and $T_{\delta,-}$, the entries of $\mathbf{x}_1 - \mathbf{x}_2$ and $\mathbf{y}_1 - \mathbf{y}_2$ do not overlap.

Time and space complexity of the hash function. In this paragraph we analyze the time and space complexity of the used hash function. Since the modulus $q$ has $O(\log k)$ bits, our hash function is described with a $k \times 2k\log q$ matrix of $O(\log k)$-bit entries. Therefore the space complexity is $O(k^2\log^2 k)$ bits. Given now an input $\mathbf{x} \in \{0, 1, \ldots, \delta\}^{2k\log q}$, we can compute $h_{M,\delta}(\mathbf{x})$ in $O(k^2\log^2 k\,\log^2\log k)$ time. To see that, an application of the hash function requires the computation of $k$ inner products between vectors of $2k\log q$ entries, and each multiplication in the inner product is a multiplication in $\mathbb{Z}_q$, which can be computed in $O(\log k\,\log^2\log k)$ time using FFT [29]. This makes the total time equal to $O(k^2\log^2 k\,\log^2\log k)$.

4.1.4 Parallel models of computation

As we mentioned in the beginning of this chapter, we also give parallel versions of our lattice-based authenticated data structure algorithms. We use the PRAM model (parallel random access machine), and specifically EREW PRAM, CREW PRAM and CRCW PRAM. We recall the definitions of these models below:

1. EREW: This model allows all processors to read and write exclusively at the same time. Therefore no conflicts need to be resolved;
2. CREW: This model allows all processors to read concurrently and write exclusively at the same time. Read conflicts are resolved with $O(1)$ complexity;
3. CRCW: This model allows all processors to read concurrently and write concurrently at the same time. Read and write conflicts are resolved with $O(1)$ complexity.
Note that EREW requires minimal assumptions, CREW requires a stronger assumption (as there is a need to resolve read conflicts) and CRCW requires the strongest assumptions, since both read and write conflicts need to be resolved. Ways to resolve conflicts in the PRAM model have been extensively studied in the literature. A great introduction to the most fundamental results related to the PRAM model of computation, as well as to parallel algorithms, is given in the book of JaJa [59].

4.2 Main construction

In this section we present our update-optimal authenticated data structure scheme for a dynamic table, i.e., the scheme LBT = {genkey, setup, update, refresh, query, verify}. We recall that the data structure for which we describe an authenticated data structure scheme is a table T that consists of $n$ indices $1, 2, \ldots, n$, supporting index queries and index updates. A direct solution for this problem would be to use a Merkle tree with some collision-resistant hash function (e.g., SHA-2; see the first column of Table 4.1), which would bear logarithmic complexities in all the complexity measures, also inherently enforcing sequential computations. Here we build an authenticated structure for this data structure that uses the lattice-based hash function introduced in Section 4.1 and also supports constant-complexity updates, allowing at the same time a great deal of parallelism.

4.2.1 Algebraic tools

We now discuss some algebraic tools to be used in our construction. Without loss of generality, assume that the modulus $q$ is a power of two:

Definition 4.5 (Binary representation) Define $f(x) = [f_0\ f_1\ \ldots\ f_{\log q - 1}]^T \in \{0,1\}^{\log q}$ to be the binary representation of $x \in \mathbb{Z}_q$. Namely, $x = \sum_{i=0}^{\log q - 1} f_i 2^i \bmod q$.

Definition 4.6 (Radix-2 representation) Define $g(x) = [f_0\ f_1\ \ldots\ f_{\log q - 1}]^T \in \mathbb{Z}_q^{\log q}$ to be some radix-2 representation of $x \in \mathbb{Z}_q$. Namely, $x = \sum_{i=0}^{\log q - 1} f_i 2^i \bmod q$.

By "some" radix-2 representation we mean that the function $g : \mathbb{Z}_q \rightarrow \mathbb{Z}_q^{\log q}$ is "one-to-many". For example, for $q = 16$ and $x = 7$, possible values for $g(x)$ are $[0\ 1\ 1\ 1]^T$ (the usual binary representation), $[0\ {-2}\ 0\ {-1}]^T$ or $[{-2}\ 2\ 0\ {-1}]^T$ (and many more). We now give an important result for our construction:

Lemma 4.3 For any $x_1, x_2, \ldots, x_t \in \mathbb{Z}_q$ there exists a radix-2 representation $g(\cdot)$ such that $g(x_1 + x_2 + \ldots + x_t \bmod q) = f(x_1) + f(x_2) + \ldots + f(x_t) \bmod q$. Moreover, it is $g(x_1 + x_2 + \ldots + x_t \bmod q) \in \{0, \ldots, t\}^{\log q}$.

Proof: Let $\mathbf{x}_i = f(x_i)$ be the binary representation of $x_i$ for $i = 1, \ldots, t$. Then
\[
\sum_{i=1}^{t} \mathbf{x}_i = \left[\sum_{i=1}^{t} x_{i0}\ \ \sum_{i=1}^{t} x_{i1}\ \ \ldots\ \ \sum_{i=1}^{t} x_{i(\log q - 1)}\right]^T \bmod q.
\]
The resulting vector is a radix-2 representation of
\[
\left(\sum_{i=1}^{t} x_{i0}\right) \times 2^0 + \left(\sum_{i=1}^{t} x_{i1}\right) \times 2^1 + \ldots + \left(\sum_{i=1}^{t} x_{i(\log q - 1)}\right) \times 2^{\log q - 1} \bmod q,
\]
which can be written as
\[
\sum_{j=0}^{\log q - 1} x_{1j} 2^j + \sum_{j=0}^{\log q - 1} x_{2j} 2^j + \ldots + \sum_{j=0}^{\log q - 1} x_{tj} 2^j = x_1 + x_2 + \ldots + x_t \bmod q.
\]
Therefore there exists a radix-2 representation $g$ such that $g(x_1 + x_2 + \ldots + x_t \bmod q) = f(x_1) + f(x_2) + \ldots + f(x_t) \bmod q$. Finally, note that since $g(\cdot)$ is the sum of $t$ binary representations, it cannot contain an entry that is greater than $t$. □

Lemma 4.3 is useful in the following sense: given two binary representations of $x_1$ and $x_2$, namely $f_1$ and $f_2$, a radix-2 representation of $x_1 + x_2$ is $f_1 + f_2$. Definitions 4.5 and 4.6 and also Lemma 4.3 (see Corollary 4.1) can be naturally extended to vectors $\mathbf{x} \in \mathbb{Z}_q^k$: for $i = 1, \ldots, k$, $x_i$ is mapped to the respective $\log q$ entries $f(x_i)$ (or $g(x_i)$) in the resulting vector $f(\mathbf{x})$ (or $g(\mathbf{x})$). Therefore we have the following:

Corollary 4.1 For any $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_t \in \mathbb{Z}_q^k$ there exists a radix-2 representation $g(\cdot)$ such that $g(\mathbf{x}_1 + \mathbf{x}_2 + \ldots + \mathbf{x}_t \bmod q) = f(\mathbf{x}_1) + f(\mathbf{x}_2) + \ldots + f(\mathbf{x}_t) \bmod q$. Moreover, it is $g(\mathbf{x}_1 + \mathbf{x}_2 + \ldots + \mathbf{x}_t \bmod q) \in \{0, \ldots, t\}^{k\log q}$.

To constrain the inputs to our hash function, we need the following definition:

Definition 4.7 Let $\mathbf{x} \in \mathbb{Z}_q^k$. We say that the radix-2 representation $g(\mathbf{x}) \in \mathbb{Z}_q^{k\log q}$ is $\delta$-admissible if and only if $g(\mathbf{x}) \in \{0, 1, \ldots, \delta\}^{k\log q}$.
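A quick numerical check of Lemma 4.3 follows, with a toy $q$ of our choosing: summing binary representations entrywise yields a radix-2 representation of the sum, with entries bounded by the number of summands, which is exactly what $t$-admissibility (Definition 4.7) captures.

```python
# Toy check of Lemma 4.3: the entrywise sum of the binary representations
# f(x_i) is a radix-2 representation of x_1 + ... + x_t mod q with entries <= t.
q, log_q = 16, 4                                  # toy modulus, a power of two

def f(x):
    """Binary representation of x in Z_q, least-significant bit first."""
    return [(x >> i) & 1 for i in range(log_q)]

def value(rep):
    """The element of Z_q that a radix-2 representation stands for."""
    return sum(r << i for i, r in enumerate(rep)) % q

xs = [7, 9, 13]
g = [sum(col) for col in zip(*map(f, xs))]        # entrywise sum of the f(x_i)
assert value(g) == sum(xs) % q                    # g represents the sum
assert all(0 <= t <= len(xs) for t in g)          # 3-admissible (Definition 4.7)
print(g, "represents", value(g))
```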
4.2.2 Algorithms of the scheme

We now describe the algorithms of the scheme LBT (see Definition 2.3). All expressions below are reduced modulo $q$, i.e., we work in $\mathbb{Z}_q$.

Algorithm $\{sk, pk\} \leftarrow$ genkey($1^k$): On input the security parameter $k$, this algorithm computes an odd number $q = O(k^{2.5}\delta\log k)$, for some $\delta = \text{poly}(k)$; namely, we set $\delta$ equal to the size of the table, $n$. Then it samples $M \in \mathbb{Z}_q^{k \times m}$ uniformly at random, where $m = 2k\log q$. It sets $sk = \emptyset$ and $pk = \{M, q\}$, i.e., there is no secret (trapdoor) information in our scheme. The access complexity of this algorithm is $O(1)$.

Lattice-based digests. Before we describe algorithm setup(), we describe how we define the lattice-based digests on the table T, by using the hash function of Definition 4.4. Let $D_0$ be the initial state of our table, storing values $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n \in \mathbb{Z}_q^k$. Let $T$ be the binary tree of $\ell$ levels on top of the values $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ (recall we have assumed that $n = 2^\ell$), and let $r$ be the root of tree $T$. By convention, the root of the tree lies at level 0 and the leaves of the tree lie at level $\ell$. For every leaf node $v_i$ of the tree, $i = 1, \ldots, n$, the digest $d(v_i)$ is defined as $d(v_i) = \mathbf{x}_i$. Then, for any internal node $u$ with left child $v$ and right child $w$, by using the hash function $h_{M,n}(\mathbf{x}, \mathbf{y})$ given in Definition 4.4 in a recursive way, the digest $d(u)$ of node $u$ can be defined as
\[
d(u) = h_{M,n}(Ug(d(v)), Dg(d(w))) = M\left[Ug(d(v)) + Dg(d(w))\right], \tag{4.1}
\]
where $g(d(v))$ and $g(d(w))$ are some $n$-admissible radix-2 representations of $d(v)$ and $d(w)$, i.e., by Definition 4.4, it must be $g(d(v)), g(d(w)) \in \{0, 1, \ldots, n\}^{k\log q}$.

In the above relation, $U$ and $D$ are special matrices such that multiplying $U$ or $D$ with a vector in $\{0, 1, \ldots, n\}^{k\log q}$ doubles the dimension of the vector by shifting its entries accordingly and filling the vacant entries with zeros. This operation is used to prepare the vectors in the appropriate input format for the hash function. More formally, $U = [I_{k\log q}\ O_{k\log q}]^T$ and $D = [O_{k\log q}\ I_{k\log q}]^T$, where $I_l$ denotes the square identity matrix of dimension $l$ and $O_l$ denotes the square zero matrix of dimension $l$. Indeed, it is easy to see that for all $\mathbf{x} \in \{0, 1, \ldots, n\}^{k\log q}$ it is $U\mathbf{x} \in T_{n,+}$ and $D\mathbf{x} \in T_{n,-}$, where $T_{n,+}$ and $T_{n,-}$ are defined in Section 4.1.

Figure 4.1: Tree $T$ built on top of a table with 8 values $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_8$. After producing an $n$-admissible radix-2 $g(\cdot)$ representation of the children's digests, we multiply with either $U$ or $D$, then we add the two resulting digests and we compute the hash function on them by multiplying with $M$.
At the leaves of the tree, the figure shows the terms that correspond to each index, as computed by Theorem 4.3 (i.e., the partial digests of the root $r$ with reference to every value in the table); e.g., the term for $\mathbf{x}_1$ is $MUf(MUf(MUf(\mathbf{x}_1)))$ and the term for $\mathbf{x}_2$ is $MUf(MUf(MDf(\mathbf{x}_2)))$. The $g(\cdot)$ representations of the internal nodes are indicated with dashed lines (see Definition 4.9). Note that the $g(\cdot)$ representations of the internal nodes are the sums of specific $f(\cdot)$ representations of the leaves; for example, $g(d(r_{12})) = f(Lf(Lf(\mathbf{x}_5))) + f(Lf(Rf(\mathbf{x}_6))) + f(Rf(Lf(\mathbf{x}_7))) + f(Rf(Rf(\mathbf{x}_8)))$, where $MU = L$ and $MD = R$.

The computation in Relation 4.1 proceeds as follows (see Figure 4.1). Suppose a node $u \in T$ has children $v$ and $w$ with digests $d(v), d(w) \in \mathbb{Z}_q^k$. Applying $g(\cdot)$ transforms $d(v), d(w)$ into vectors of $k\log q$ small entries (admissible radix-2 representations). Multiplying with $U$ and $D$ prepares $g(d(v)), g(d(w))$ to be input to the hash function. (The procedure so far is the same as a Merkle tree construction that uses a collision-resistant function such as SHA-2, i.e., recursive computation over the nodes of a tree.)

4.2.3 Partial digests

Here we show how to express the digest $d(u)$ (computed in Relation 4.1) of every node $u \in T$ somewhat differently, which is crucial for deriving our final results. To simplify notation, we set $MU = L$ and $MD = R$ (standing for left/right); note that $L, R \in \mathbb{Z}_q^{k \times k\log q}$. Let also range($u$) be the range of successive indices corresponding to the leaves of the subtree of $T$ rooted at $u$; e.g., in Figure 4.1, it is range($r_{11}$) = $\{1, 2, 3, 4\}$. For every node $u \in T$ and for every $i \in$ range($u$) we define the partial digest of $u$ with reference to $\mathbf{x}_i$:

Definition 4.8 (Partial digest of a node $u$) For a leaf node $u \in T$ storing value $\mathbf{x}_i$, the partial digest of $u$ with reference to $\mathbf{x}_i$ is defined as $d(u, \mathbf{x}_i) = \mathbf{x}_i$. Else, for every other node $u$ of $T$, with left child $v$ and right child $w$, and for every $i \in$ range($u$), the partial digest $d(u, \mathbf{x}_i)$ of $u$ with reference to $\mathbf{x}_i$ is recursively defined as $d(u, \mathbf{x}_i) = Lf(d(v, \mathbf{x}_i))$ if $\mathbf{x}_i$ belongs to the left subtree of $u$; else, $d(u, \mathbf{x}_i) = Rf(d(w, \mathbf{x}_i))$.

E.g., in Figure 4.1, the partial digests of root $r$ with reference to $\mathbf{x}_2$ and $\mathbf{x}_3$ are $d(r, \mathbf{x}_2) = Lf(Lf(Rf(\mathbf{x}_2)))$ and $d(r, \mathbf{x}_3) = Lf(Rf(Lf(\mathbf{x}_3)))$ respectively ($f(\mathbf{z})$ is $\mathbf{z}$'s binary representation). We now give the main result of this section:

Theorem 4.3 The digest $d(u)$ of node $u \in T$ in Relation 4.1 can be expressed as
\[
d(u) = \sum_{i \in \text{range}(u)} d(u, \mathbf{x}_i),
\]
where $d(u, \mathbf{x}_i)$ is the partial digest of node $u$ with reference to $\mathbf{x}_i$.

Proof: We prove the claim by induction on the levels of the tree $T$. For any internal node $u$ that lies at level $\ell - 1$, there are only two nodes in the subtree rooted at $u$ (storing, say, values $\mathbf{x}_i$ (left child) and $\mathbf{x}_j$ (right child), both of which belong to range($u$)). Therefore
\[
d(u, \mathbf{x}_i) + d(u, \mathbf{x}_j) = Lf(\mathbf{x}_i) + Rf(\mathbf{x}_j) = MUf(\mathbf{x}_i) + MDf(\mathbf{x}_j) = M[Ug(\mathbf{x}_i) + Dg(\mathbf{x}_j)] = d(u).
\]
This is due to Relation 4.1 and also due to the fact that $g(\cdot)$ can be picked to be $f(\cdot)$, which is an $n$-admissible radix-2 representation, therefore satisfying the input constraint of Definition 4.4. Hence the base case holds. Assume the theorem holds for any internal node $z$ that lies at level $0 < t + 1 \leq \ell$, i.e.,
\[
d(z) = \sum_{i \in \text{range}(z)} d(z, \mathbf{x}_i).
\]
Let $u$ be an internal node that lies at level $t$ and let $i_1, i_2, \ldots, i_u$ be the indices in range($u$) in sorted order. Let $v$ be the left child of $u$ and $w$ be the right child of $u$.
Then, by the definition of the partial digest of the node $u$ (Definition 4.8), we have
\[
d(u) = \sum_{i \in \text{range}(u)} d(u, \mathbf{x}_i) = \sum_{j=1}^{u/2} Lf(d(v, \mathbf{x}_j)) + \sum_{j=u/2+1}^{u} Rf(d(w, \mathbf{x}_j)) = MU\sum_{j=1}^{u/2} f(d(v, \mathbf{x}_j)) + MD\sum_{j=u/2+1}^{u} f(d(w, \mathbf{x}_j)).
\]
By Corollary 4.1 there exist $g(\cdot)$ representations whose entries are at most $u/2 \leq n$ such that
\[
d(u) = MUg\left(\sum_{j=1}^{u/2} d(v, \mathbf{x}_j)\right) + MDg\left(\sum_{j=u/2+1}^{u} d(w, \mathbf{x}_j)\right).
\]
By the inductive hypothesis this can be written as
\[
d(u) = M[Ug(d(v)) + Dg(d(w))],
\]
where the $g(\cdot)$ are radix-2 representations that are $n$-admissible, since they are sums of at most $u/2 \leq n/2$ binary representations. Therefore this satisfies the input constraint of Definition 4.4, and $d(u)$ is indeed the correct digest of any internal node $u$, as computed by Relation 4.1. This completes the proof. □

Algorithm $\{$auth($D_0$), $d_0\} \leftarrow$ setup($D_0$, sk, pk): Let $D_0$ be the initial table, storing values $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n \in \mathbb{Z}_q^k$. The algorithm computes the digests of the nodes: it sets $d(u) = \mathbf{x}_i$ for all leaf nodes $u$ storing value $\mathbf{x}_i$, and $d(u) = M[Ug(d(v)) + Dg(d(w))]$ (application of the hash function in Definition 4.4) for all internal nodes $u$ with left child $v$ and right child $w$, where $g(d(v))$ and $g(d(w))$, i.e., the radix-2 representations of the children's digests, are computed according to the following definition (note that the binary representations $f(d(v)), f(d(w))$ could be used instead; however, in lieu of achieving our efficiency goals, the algorithm uses Definition 4.9):

Definition 4.9 The radix-2 representation of $d(u)$ of node $u \in T$ is computed as the sum of $|\text{range}(u)|$ binary representations, i.e.,
\[
g(d(u)) = \sum_{i \in \text{range}(u)} f(d(u, \mathbf{x}_i)),
\]
where $d(u, \mathbf{x}_i)$ is the partial digest of node $u$ with reference to $\mathbf{x}_i$.

By combining Theorem 4.3 and Definition 4.9, by Corollary 4.1, we have:

Corollary 4.2 Let $u$ be an internal node of tree $T$. The $g(\cdot)$ representation of $d(u)$ defined in Definition 4.9 is an $n$-admissible radix-2 representation of $d(u)$.

This concludes the description of setup(). The algorithm outputs $d_0 = d(r)$, where $r$ is the root of $T$ (i.e., the digest of the data structure is the digest of the root of the tree), and it also outputs auth($D_0$) as a structure that contains: (a) the tree $T$; (b) $g(d(u))$ for all nodes $u$ of $T$, as computed in Definition 4.9. The complexity of the algorithm is $O(n\log n)$, since the computation of the $g(d(u))$ involves a linear number of operations per tree level, and there are $O(\log n)$ levels in total:

Lemma 4.4 Algorithm setup() of the authenticated data structure scheme LBT has $O(n\log n)$ access complexity, outputting an authenticated data structure auth($D_0$) of $O(n)$ group complexity. Moreover, it is parallelizable with $O(n)$ access complexity using $O(\log n)$ processors in the CREW model.

Proof: The algorithm needs to compute the $n$-admissible radix-2 representations $g(d(u))$ of the digests $d(u)$ for every internal node $u$ of the tree $T$. Note that by Definition 4.9, there are $n/2, n/4, n/8, \ldots, 2$ such representations that need to be computed for levels $\ell-1, \ell-2, \ell-3, \ldots, 1$ respectively, each one being the sum of $2, 4, 8, \ldots, n/2$ binary representations respectively, i.e.,
\[
g(d(u)) = \sum_{i \in \text{range}(u)} f(d(u, \mathbf{x}_i)).
\]
Since computing $f(d(u, \mathbf{x}_i))$ has $O(1)$ access complexity (they are just functions of specific values), it follows that the computation of the $g(\cdot)$ representations for all the internal nodes of the tree requires access complexity
\[
\frac{n}{2} \times 2 + \frac{n}{4} \times 4 + \frac{n}{8} \times 8 + \ldots + 2 \times \frac{n}{2} = O(n\log n).
\]
Note now that in the CREW model we can use $O(\log n)$ processors, i.e., one processor for each level of the tree. By reading the values $\mathbf{x}_i$ concurrently and writing the values $g(d(u))$ at different memory locations, it follows that each processor will have to do $O(n)$ work in the CREW model. Finally, we note that the output authenticated data structure stores with each internal node $u$ of the tree $T$ the respective $n$-admissible radix-2 representation $g(d(u))$. Therefore the group complexity of auth($D$) is $O(n)$. This completes the proof. □
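The following self-contained toy sketch (with illustrative dimensions of our choosing, and $n = 4$) implements Relation 4.1, Definitions 4.8 and 4.9, and numerically confirms Theorem 4.3. The two continuation sketches after refresh() and verify() below reuse these definitions.

```python
# Toy numerical check of Theorem 4.3 (illustrative dimensions, not secure):
# the digest computed via Relation 4.1 with the canonical representations of
# Definition 4.9 equals the sum of the partial digests.
import random

k, LOG_Q = 3, 8
q = 1 << LOG_Q                       # toy modulus; Section 4.2.1 assumes a power of two
half = k * LOG_Q                     # k log q
random.seed(7)
M = [[random.randrange(q) for _ in range(2 * half)] for _ in range(k)]
L = [row[:half] for row in M]        # L = MU: multiplying by U selects M's left half
R = [row[half:] for row in M]        # R = MD: multiplying by D selects M's right half

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) % q for row in A]

def f(v):
    """Binary representation of a vector in Z_q^k: k log q bits, LSB first."""
    return [(c >> i) & 1 for c in v for i in range(LOG_Q)]

def partial_digest(lo, hi, i, x):
    """d(u, x_i) for the node u whose subtree covers leaves lo..hi-1 (Definition 4.8)."""
    if hi - lo == 1:
        return x
    mid = (lo + hi) // 2
    if i < mid:
        return matvec(L, f(partial_digest(lo, mid, i, x)))
    return matvec(R, f(partial_digest(mid, hi, i, x)))

def g_rep(lo, hi, xs):
    """Definition 4.9: g(d(u)) = sum of f(d(u, x_i)) over i in range(u)."""
    out = [0] * half
    for i in range(lo, hi):
        out = [a + b for a, b in zip(out, f(partial_digest(lo, hi, i, xs[i])))]
    return out

def digest_rel41(lo, hi, xs):
    """d(u) via Relation 4.1: M(U g(d(v)) + D g(d(w))) = L g(d(v)) + R g(d(w))."""
    if hi - lo == 1:
        return xs[lo]
    mid = (lo + hi) // 2
    return [(a + b) % q for a, b in
            zip(matvec(L, g_rep(lo, mid, xs)), matvec(R, g_rep(mid, hi, xs)))]

n = 4
xs = [[random.randrange(q) for _ in range(k)] for _ in range(n)]
total = [0] * k
for i in range(n):                   # Theorem 4.3: d(r) = sum over i of d(r, x_i)
    total = [(a + b) % q for a, b in zip(total, partial_digest(0, n, i, xs[i]))]
assert digest_rel41(0, n, xs) == total
print("Theorem 4.3 verified on a toy instance")
```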
2 4 8 2 Note now in the CREW model, we can use O(log n) processors, i.e., one processor for each level of the tree. By reading the values xi concurrently and writing the values g(d(u)) at different memory locations, it follows that each processor will have to do O(n) work in the CREW model. Finally, we note that the output authenticated data structure stores with each internal node u of the tree T the respective n-admissible radix-2 representations g(d(u)). Therefore the group complexity of auth(D) is O(n). This completes the proof. 2 We continue by noting that Theorem 4.3 allows us to express d(r) as a sum of well-defined functions of the leaves, namely the partial digests of the root r with reference to values in the table. This allows us to achieve our desired complexity bounds: Corollary 4.3 Let x1 , x2 , . . . , xn be the values stored in our table. Then the digest d(r) of the root r of the tree T can be expressed as d(r) = n X d(r, xi ) , i=1 where d(r, xi ) is the partial digest of the root r with reference to xi . We observe that computing the partial digest d(r, xi ) requires one query to the authenticated data structure, i.e., a query for value xi , therefore yielding O(1) access complexity. Matrices L and R, both used for its computation (Definition 4.8) are not part of the authenticated 111 data structure (they are fixed by setup() as public information) and accessing them any number of times does not add to the access complexity. We continue with describing the remaining algorithms of our authenticated data structure scheme: Algorithm {dh+1 , Dh+1 , upd} ← update(u, Dh , dh , sk, pk): Let the update u be set T[i] = x0i and let the value of T[i] before the update be xi . Then the algorithm sets dh+1 = dh − d(r, xi ) + d(r, x0i ) , where d(r, xi ) and d(r, x0i ) are the partial digests of r with reference to xi and x0i , defined in Definition 4.8. Due to Corollary 4.3, dh+1 is the correct updated digest. Since the computation of partial digests has constant access complexity, algorithm update() has O(1) access complexity, since it involves two operations in Zkq . The algorithm outputs dh+1 as well as the updated table Dh+1 (note that the algorithm does not need to access the authenticated data structure at all—see Definition 2.3—and does not output anything as upd): Lemma 4.5 Algorithm update() of the authenticated data structure scheme LBT has O(1) access complexity. Moreover, the update information upd output by update() is empty. Algorithm {Dh+1 , auth(Dh+1 ), dh+1 } ← refresh(u, Dh , auth(Dh ), dh , upd, pk): This algorithm should update the authenticated data structure auth(Dh ). Let the update u be set T[i] = x0i and let the value of T[i] before the update be xi . Suppose v` , v`−1 , . . . , v1 is the path from the node of index i to the child v1 of the root of the tree. The algorithm should update the values g(d(vj )) for j = `, ` − 1, . . . , 1. This is achieved via Definition 4.9, by setting g(d0 (vj )) = g(d(vj )) − f (d(vj , xi )) + f (d(vj , x0i )) for j = `, ` − 1, . . . , 1 , namely the invariant of Definition 4.9 must be maintained, and where d(vj , xi ), d(vj , x0i ) are the partial digests of node vj with reference to xi and x0i . The algorithm outputs Dh+1 , the updated g(d0 (.)) representations as auth(Dh+1 ) and dh+1 as in update(), i.e., dh+1 = dh − d(r, xi ) + d(r, x0i ). 112 Lemma 4.6 Algorithm refresh() of the authenticated data structure scheme LBT has O(log n) access complexity. 
Lemma 4.6 Algorithm refresh() of the authenticated data structure scheme LBT has $O(\log n)$ access complexity. Moreover, it is parallelizable with $O(1)$ access complexity using $O(\log n)$ processors in the CREW model.

Proof: For each update from $\mathbf{x}_i$ to $\mathbf{x}'_i$, the algorithm should update the values $g(d(v_j))$ for $j = \ell, \ell-1, \ldots, 1$. This is achieved via Definition 4.9, by setting
\[
g(d'(v_j)) = g(d(v_j)) - f(d(v_j, \mathbf{x}_i)) + f(d(v_j, \mathbf{x}'_i)) \tag{4.2}
\]
(the invariant of Definition 4.9 must be maintained) for $j = \ell, \ell-1, \ldots, 1$, where $d(v_j, \mathbf{x}_i), d(v_j, \mathbf{x}'_i)$ are the partial digests of node $v_j$ with reference to $\mathbf{x}_i$ and $\mathbf{x}'_i$ respectively. Since $\ell = O(\log n)$, the result follows. Note also that the update relations (4.2) are independent from one another. Therefore we can take advantage of that and, in the CREW model, use $O(\log n)$ processors, i.e., one processor for each level of the tree. By reading the values $\mathbf{x}_i$ concurrently and writing the values $g(d(u))$ at different memory locations (as required), it follows that each processor will have to do $O(1)$ work in the CREW model. □

Algorithm $\{\Pi(q), \alpha(q)\} \leftarrow$ query($q, D_h$, auth($D_h$), pk): Let the query $q$ be "return the value stored at index $i$ of table T". Suppose $v_\ell, v_{\ell-1}, \ldots, v_1$ is the path from the node of index $i$ to the child $v_1$ of the root of the tree $T$. The algorithm sets $\alpha(q) = $ T[$i$] and sets the proof $\Pi(q)$ to be the array $\pi$ of $g(\cdot)$ representations such that
\[
\pi_i = (g(d(v_i)), g(d(\text{sib}(v_i)))) \tag{4.3}
\]
for $i = \ell, \ell-1, \ldots, 1$, where sib($u$) denotes the sibling of a node $u$ in tree $T$.

Lemma 4.7 Algorithm query() of the authenticated data structure scheme LBT has $O(\log n)$ access complexity. Moreover, it is parallelizable with $O(1)$ access complexity using $O(\log n)$ processors in the EREW model. Finally, it outputs a proof $\Pi(q)$ of $O(\log n)$ group complexity.

Proof: Since $\ell = O(\log n)$ values have to be collected to construct the proof, the result follows. Moreover, with $O(\log n)$ processors (one processor per node), this algorithm is parallelizable in the EREW model with $O(1)$ complexity: for $p = 1, \ldots, \ell$, processor $p$ outputs $\pi_p = (g(d(v_p)), g(d(\text{sib}(v_p))))$, as defined in Relation 4.3. □

Algorithm $\{$accept, reject$\} \leftarrow$ verify($q, \alpha, \Pi, d_h$, pk): Let the query $q$ be "return the value at index $i$", let $\mathbf{y} = \alpha$, and let $\Pi = \pi$ such that $\pi_j = (\alpha_j, \beta_j)$ ($j = \ell, \ell-1, \ldots, 1$). For $j = \ell, \ell-1, \ldots, 1$ the algorithm performs the following:

1. If $\alpha_j$ is not a $g(\cdot)$ representation of $\mathbf{y}$, or $\alpha_j, \beta_j$ are not $n$-admissible $g(\cdot)$ representations, output reject;
2. Set $\mathbf{y} = M(U\alpha_j + D\beta_j)$ if $v_j$ is $v_{j-1}$'s left child, or $\mathbf{y} = M(D\alpha_j + U\beta_j)$ otherwise.

After the loop terminates, if $\mathbf{y} \neq d_h$, reject is output; else, accept is output.

Lemma 4.8 Algorithm verify() of the authenticated data structure scheme LBT has $O(\log n)$ access complexity. Moreover, it is parallelizable with $O(1)$ access complexity using $O(\log n)$ processors in the CRCW model.

Proof: Since $\ell = O(\log n)$ values have to be processed to perform the verification of the proof, the result follows. The parallel algorithm that a processor $p = \ell, \ell-1, \ldots, 1$ executes is the following (assume that $\alpha_0$ is defined as a $g(\cdot)$ representation of the digest $d_h$): if $p < \ell$ then $\mathbf{y} = M(U\alpha_p + D\beta_p)$ (or $\mathbf{y} = M(D\alpha_p + U\beta_p)$), else $\mathbf{y} = \alpha$; if $\alpha_{p-1}$ is not a $g(\cdot)$ representation of $\mathbf{y}$, or $\alpha_{p-1}, \alpha_p, \beta_p$ are not $n$-admissible, then output reject, else output accept. Note that the algorithm requires concurrent write, since all the processors need to write to a location storing the "reject" bit concurrently. Therefore the algorithm is parallelizable in the CRCW model with $O(1)$ access complexity using $O(\log n)$ processors. □
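Completing the walkthrough (again reusing the definitions and the updated xs from the sketches above), this fragment produces a proof as in Relation 4.3 and runs the two verification steps, including a failing check for a forged answer.

```python
# query(): pairs (g(d(v_j)), g(d(sib(v_j)))) from the leaf up; verify(): check
# that alpha_j represents the running digest, then rehash via Relation 4.1.
def query(i, xs):
    proof, lo, hi = [], 0, len(xs)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if i < mid:                               # queried child is a left child
            proof.append((g_rep(lo, mid, xs), g_rep(mid, hi, xs), "L"))
            hi = mid
        else:
            proof.append((g_rep(mid, hi, xs), g_rep(lo, mid, xs), "R"))
            lo = mid
    return list(reversed(proof))                  # leaf level first

def rep_value(rep):
    """The vector in Z_q^k that a k log q radix-2 representation stands for."""
    return [sum(rep[c * LOG_Q + i] << i for i in range(LOG_Q)) % q
            for c in range(k)]

def verify(i, answer, proof, d_root):
    y = answer                                    # running digest, starts at T[i]
    for alpha, beta, side in proof:
        if rep_value(alpha) != y or not all(0 <= t <= n for t in alpha + beta):
            return False                          # step 1: representation checks
        left, right = (alpha, beta) if side == "L" else (beta, alpha)
        y = [(a + b) % q for a, b in              # step 2: y = M(U. + D.)
             zip(matvec(L, left), matvec(R, right))]
    return y == d_root

d_root = digest_rel41(0, n, xs)
assert verify(2, xs[2], query(2, xs), d_root)
assert not verify(2, [(c + 1) % q for c in xs[2]], query(2, xs), d_root)
print("verification accepts the true answer and rejects a forged one")
```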
4.2.4 Correctness and security

Lemma 4.9 The authenticated data structure scheme LBT = {genkey, setup, update, refresh, query, verify} is correct according to Definition 2.4.

Proof: Let T = $D_0$ be any table of $n$ entries. Fix the security parameter $k$ and output sk and pk = $(M, q)$ by calling algorithm genkey(). Then output an authenticated data structure auth($D_0$) and the respective digest $d_0$ by calling algorithm setup(). Pick a polynomial number of updates, namely a polynomial number of pairs of indices and values to be written at the respective indices, and update auth($D_0$) and $d_0$ by calling algorithm refresh(). Let $D_h$ be the final table T, auth($D_h$) be the produced authenticated data structure and $d_h$ be the final digest. Let $i$ be an index and let $\mathbf{y} = $ T[$i$]. Output a proof $\Pi(q)$ for index $i$ and answer $\mathbf{y}$ by calling query(). $\Pi(q)$ contains pairs $(g(d(v_j)), g(d(\text{sib}(v_j))))$ ($j = \ell, \ell-1, \ldots, 1$) of $n$-admissible representations, where $v_\ell, v_{\ell-1}, \ldots, v_1$ are the nodes on the path from index $i$ (i.e., node $v_\ell$) to the child $v_1$ of the root of the tree $T$. For the elements of the proof, the following are true:

1. $g(d(v_\ell)) = f(\mathbf{y})$ (definition of a leaf digest);
2. $d(v_{j-1}) = M(Ug(d(v_j)) + Dg(d(\text{sib}(v_j))))$ or $d(v_{j-1}) = M(Dg(d(v_j)) + Ug(d(\text{sib}(v_j))))$, according to the left-child or right-child relation, for $j = \ell, \ell-1, \ldots, 1$, where $v_0$ is the root of the tree (by Relation 4.1);
3. The $g(\cdot)$ representations in $\Pi(q)$ are always $n$-admissible, i.e., they are maintained to be $n$-admissible during updates, since refresh() always updates the $g(\cdot)$ representations so that Definition 4.9 is satisfied, which by Corollary 4.2 gives $n$-admissible representations.

Based on the above and the code of verify(), we conclude that verify() always accepts a proof for index $i$ (of answer $\mathbf{y} = $ T[$i$]) computed by query(). □

Lemma 4.10 The authenticated data structure scheme LBT = {genkey, setup, update, refresh, query, verify} is secure according to Definition 2.5 and assuming the hardness of GAPSVP$_\gamma$ for $\gamma = O\!\left(nk\sqrt{\log n + \log k}\right)$.

Proof: Fix the security parameter $k$ and output sk and pk = $(M, q)$ by calling algorithm genkey(). Let Adv be a polynomially-bounded adversary. Adv picks an initial table T = $D_0$ of $n$ entries and outputs the authenticated data structure auth($D_0$), the respective digest $d_0$, and the tree $T$ of $\ell$ levels, by calling algorithm setup() through oracle access. Then Adv picks a polynomial number of updates, namely a polynomial number of pairs of indices and values to be written at the respective indices. Let $D_h$ be the final table T, and let $d_h$ be the final digest as produced by the adversary through oracle access to algorithm update(). Let $q = i$ be the query index picked by Adv, $\mathbf{y} = $ T[$i$] be the value stored at this index and $v_\ell, v_{\ell-1}, \ldots, v_0$ be the path of $T$ from the node referring to index $i$ to the root of $T$. The adversary Adv outputs an incorrect answer $\alpha \neq \mathbf{y}$ and also a proof $\Pi = (\pi_\ell, \pi_{\ell-1}, \ldots, \pi_1)$ ($\ell = O(\log n)$), where $\pi_j = (\alpha_j, \beta_j)$ (see algorithm query()). We now define the following events, related to the above choice of proof made by the adversary. Our goal will be to express the probability that verify($i, \alpha, \Pi, d_h$, pk) accepts while $\alpha \neq \mathbf{y}$ as a function of the following events. Note that $d_h$ is the correct digest of the authenticated data structure:

1. $E_{\ell,0}$: The value $\alpha_\ell$ picked by Adv is such that $\alpha_\ell$ is not an $n$-admissible $g(\cdot)$ representation of $\mathbf{y}$;

2. $E_j$: For $j = \ell-1, \ldots, 1$, the values $\alpha_j$ and $\alpha_{j+1}, \beta_{j+1} \in \{0, 1, \ldots, n\}^{k\log q}$ picked by Adv
are such that $\alpha_j$ is an $n$-admissible $g(\cdot)$ representation of $M(U\alpha_{j+1} + D\beta_{j+1})$. Assume, without loss of generality, that a convenient index $i = 0$ is used, so that the order of $U$ and $D$ is always the same. This event can be partitioned into two mutually exclusive events, i.e., $E_j = E_{j,0} \cup E_{j,1}$, such that

• $E_{j,0}$: Value $\alpha_j$ is not an $n$-admissible $g(\cdot)$ representation of the digest of node $v_j$, as defined in Relation 4.1, i.e., $\alpha_j \neq g(d(v_j))$;
• $E_{j,1}$: Value $\alpha_j$ is an $n$-admissible $g(\cdot)$ representation of the digest of node $v_j$, as defined in Relation 4.1, i.e., $\alpha_j = g(d(v_j))$.

3. $E_{0,1}$: The values $\alpha_1 \in \{0, 1, \ldots, n\}^{k\log q}$ and $\beta_1 \in \{0, 1, \ldots, n\}^{k\log q}$ picked by Adv are such that $d_h = M(U\alpha_1 + D\beta_1)$.

The probability that verify() accepts, while $\alpha_\ell$ is not an $n$-admissible $g(\cdot)$ representation of $\mathbf{y}$, is the probability
\[
\begin{aligned}
\Pr[E_{\ell,0} \cap E_{\ell-1} \cap E_{\ell-2} \cap \ldots \cap E_{0,1}]
&= \Pr[E_{\ell,0} \cap (E_{\ell-1,0} \cup E_{\ell-1,1}) \cap (E_{\ell-2,0} \cup E_{\ell-2,1}) \cap \ldots \cap E_{0,1}] \\
&\leq \Pr[E_{\ell,0}\,|\,E_{\ell-1,1}] + \Pr[E_{\ell-1,0}\,|\,E_{\ell-2,1}] + \Pr[E_{\ell-2,0}\,|\,E_{\ell-3,1}] + \ldots + \Pr[E_{1,0}\,|\,E_{0,1}] \\
&= \sum_{j=1}^{\ell} \Pr[E_{j,0}\,|\,E_{j-1,1}].
\end{aligned}
\]
Note that the event $E_{j,0}\,|\,E_{j-1,1}$ implies the following:

1. $\alpha_j \neq g(d(v_j))$;
2. $\alpha_{j-1} = g(d(v_{j-1}))$, where $d(v_{j-1}) = M(U\alpha_j + D\beta_j)$.

However, from Relation 4.1 it should be that
\[
d(v_{j-1}) = M(Ug(d(v_j)) + Dg(d(\text{sib}(v_j)))),
\]
where $g(d(v_j))$ and $g(d(\text{sib}(v_j)))$ are the representations of the digests of nodes $v_j$ and sib($v_j$) respectively. Therefore $(\alpha_j, \beta_j)$ is a collision with $(g(d(v_j)), g(d(\text{sib}(v_j))))$, since $\alpha_j \neq g(d(v_j))$. Note now that by Theorem 4.2, which gives $\gamma = O\!\left(nk\sqrt{\log n + \log k}\right) = \text{poly}(k)$ since $q = O(k^{2.5}\delta\log k)$ and $\delta = n$, and by Assumption 4.1, $\Pr[E_{j,0}\,|\,E_{j-1,1}]$ is neg($k$) for all $j = \ell, \ell-1, \ldots, 1$. Therefore the sum $\sum_{j=1}^{\ell} \Pr[E_{j,0}\,|\,E_{j-1,1}]$ is also neg($k$), since $\ell = O(\log n) = O(\log k)$. This concludes the proof. □

We can now present the main result of this section.

Theorem 4.4 Let $k$ be the security parameter. Then there exists a publicly-verifiable authenticated data structure scheme LBT = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a dynamic table $D$ of $n$ entries such that:

1. It is correct according to Definition 2.4 and secure according to Definition 2.5, assuming the hardness of GAPSVP$_\gamma$ for $\gamma = O\!\left(nk\sqrt{\log n + \log k}\right)$;
2. The access complexity of setup() is $O(n\log n)$, or $O(n)$ using $O(\log n)$ processors in the CREW model, outputting an authenticated data structure auth($D$) of $O(n)$ group complexity;
3. The access complexity of update() is $O(1)$, outputting update information upd of $O(1)$ group complexity;
4. The access complexity of refresh() is $O(\log n)$, or $O(1)$ using $O(\log n)$ processors in the CREW model;
5. The access complexity of query() is $O(\log n)$, or $O(1)$ using $O(\log n)$ processors in the EREW model, outputting a proof $\Pi(q)$ for a query $q$ of $O(\log n)$ group complexity;
6. The access complexity of verify() is $O(\log n)$, or $O(1)$ using $O(\log n)$ processors in the CRCW model.

Proof: This result follows directly from Lemmata 4.4, 4.5, 4.6, 4.7, 4.8, 4.9 and 4.10. Note that the presented scheme is publicly verifiable, since verify() does not take the secret key as an input. □
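As a sanity check on the bound in Lemma 4.10 and Theorem 4.4, instantiating the collision-resistance parameter of Theorem 4.2 with $\delta = n$ and $m = 2k\log q$ gives
\[
\gamma = 14\pi\,\delta\sqrt{km} = 14\pi\, n\sqrt{2k^{2}\log q} = O\!\left(nk\sqrt{\log q}\right) = O\!\left(nk\sqrt{\log n + \log k}\right),
\]
since $q = O(k^{2.5}\,n\log k)$ implies $\log q = O(\log n + \log k)$. As $n = \text{poly}(k)$, this $\gamma$ is polynomial in $k$, so Assumption 4.1 applies.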
The crucial property we exploit here, which is what we call repeated linearity, is a means of "feeding" the output of the function back in as an input, so that certain homomorphic properties are still satisfied—specifically, the properties of Corollary 4.1. Therefore, it might be the case that other functions could also be used instead, provided they satisfy such a property.

4.3 Authenticated Bloom filters

In this section we show how we can use the lattice-based hash function to verify the Bloom filter functionality, a space-efficient dictionary originally introduced by Bloom [14]. The Bloom filter consists of an array (table) A[0 ... n−1] storing n bits. All the bits are initially set to 0. Suppose one needs to store a set S of r elements. Then K hash functions h_i(.) with range {0, ..., n−1} are used (these are not lattice-based hash functions) and, for each element s ∈ S, we set the bits A[h_i(s)] to 1, for i = 1, ..., K. In this way false positives can occur, i.e., an element that is not present might appear to be represented in A. The probability of a false positive can be shown to be (1 − p)^K, where p = e^{−Kr/n}, which is minimized for K = (n/r) ln 2 [14].

The Bloom filter above supports only insertions, though. A deletion (i.e., setting some bits to 0) can cause the undesired deletion of many elements. To deal with this problem, counting Bloom filters were introduced by Fan et al. [38]. In this solution, by keeping a counter for each index of A (instead of just 0 or 1), we can tolerate deletions by incrementing the counter during insertions and decrementing it during deletions. However, the problem of overflow exists. As observed by Broder and Mitzenmacher [21], an overflow (at least one counter exceeding some value C) occurs with probability n(e ln 2/C)^C, for a certain set of r elements. Setting C = O(1) (e.g., C = 16) is suitable for most applications [21].

By the above description, it is clear that we can use our lattice-based construction to authenticate the Bloom filter functionality: Each update in the Bloom filter corresponds to K updates in table T, and querying one element in the Bloom filter corresponds to K queries to table T. Constant update complexity is very important in this application, given that a Bloom filter is an update-intensive data structure (i.e., an insertion or deletion of an element involves K operations):

Theorem 4.5 Let k be the security parameter. Then there exists a publicly-verifiable authenticated data structure scheme ABF = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a Bloom filter D of n entries, storing r elements and using K hash functions, such that:

1. It is correct according to Definition 2.4 and secure according to Definition 2.5, assuming the hardness of GAPSVP_γ for γ = O(nk√(log n + log k));
2. The access complexity of setup() is O(n log n), or O(n) using O(log n) processors in the CREW model, outputting an authenticated data structure auth(D) of O(n) group complexity;
3. The access complexity of update() is O(K), outputting update information upd of O(1) group complexity;
4. The access complexity of refresh() is O(K log n), or O(K) using O(log n) processors in the CREW model;
5. The access complexity of query() is O(K log n), or O(K) using O(log n) processors in the EREW model, outputting a proof Π(q) for a query q of O(K log n) group complexity;
6. The access complexity of verify() is O(K log n), or O(K) using O(log n) processors in the CRCW model.

Proof: The construction for an authenticated Bloom filter is the same as that of Theorem 4.4. The extra multiplicative factor K in the complexities is due to the fact that one operation in the authenticated Bloom filter (insertion, deletion or query of an element) requires O(K) operations on an authenticated table. This follows from the construction and the definition of the Bloom filter data structure. □
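For concreteness, here is a minimal (unauthenticated) Bloom filter sketch; the hash functions are arbitrary choices for illustration. In the authenticated version, each of the K bit operations below becomes an update or query on the authenticated table T of Theorem 4.4.

```python
import hashlib, math

class BloomFilter:
    def __init__(self, n, K):
        self.n, self.K, self.A = n, K, [0] * n  # table A[0 .. n-1] of bits

    def _indices(self, s):
        # K indices h_1(s), ..., h_K(s); ordinary (non-lattice) hash functions
        return [int(hashlib.sha256(f"{i}:{s}".encode()).hexdigest(), 16) % self.n
                for i in range(self.K)]

    def insert(self, s):                # K bit-writes -> K authenticated updates
        for i in self._indices(s):
            self.A[i] = 1

    def contains(self, s):              # K bit-reads -> K authenticated queries
        return all(self.A[i] for i in self._indices(s))

n, r = 1 << 16, 4096
K = round((n / r) * math.log(2))        # K = (n/r) ln 2 minimizes false positives
bf = BloomFilter(n, K)
bf.insert("brown")
assert bf.contains("brown")
```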
4.4 Parallel online memory checking

In this section, we establish our results concerning parallel online memory checking. The online memory checking model [15] can be (informally) described as follows: Suppose M is an unreliable (malicious) memory of n cells. A user U wants to read (through operation read(i)) or write (through operation write(i, x), where x is the new content) a cell i ∈ {1, 2, ..., n}. However, his requests go through a checker C. The checker is supposed to read cells from the unreliable memory M, and also some reliable (and possibly secret) information s of sublinear size, and output either the correct answer (i.e., the latest content of cell i) or BUGGY, if the content of cell i is corrupted. The probability of returning the corrupted content of a cell as correct should be negligible. The checker is called non-adaptive if, given an index i, the set and the order of the cells accessed in order to output the answer are deterministic. In this section we consider such checkers. In the following, we give the formal definition:

Definition 1 (Online memory checking [35]) Let M be an n-cell unreliable memory. An online non-adaptive memory checker C = (Σ, n, q, s) over an alphabet Σ with reliable (and possibly secret) memory s is a probabilistic Turing machine with five tapes:

• A read-only input tape for receiving read/write requests from the user U to the unreliable memory M of n cells, indexed by 1, 2, ..., n;
• A write-only output tape for sending responses back to the user;
• A read-write work tape, i.e., the (secret) reliable memory s;
• A write-only tape for sending read/write requests to the memory M;
• A read-only input tape for receiving M's responses.

A checker is presented with write(i, x) and read(i) requests made by U to M, where i ∈ {1, 2, ..., n}. After each read request, C returns an answer or outputs that M's operation is BUGGY. C's operation should be both correct and secure:

1. Correctness: For any polynomially-large sequence of user requests, as long as M answers all of C's read requests correctly, C also answers all of the user's read requests correctly;

2. Security: For any polynomially-large sequence of user requests, and for any (even incorrect or malicious) answers returned by M, the probability that C answers a user request incorrectly is neg(k), where k is the security parameter. C may either recover the correct answer independently or answer that M is BUGGY, but it may not answer a request incorrectly (beyond negligible probability).

In online memory checking settings, the complexity measure we are interested in minimizing is the query complexity, defined as the sum of the number of requests that the checker makes to the unreliable memory M during a read(i) operation plus the number of requests that the checker makes to the unreliable memory M during a write(i, x) operation [82]. So far in the literature, and in the computational model, checkers with O(log n) [15] or O(log_d n) [35] query complexity have appeared. Specifically for these checkers, we can distinguish two cases:
1. In the secret-key setting, i.e., when both reliable and secret small memory s are required, these checkers have been shown to be parallelizable; see, e.g., the work of Hall and Jutla [54], as well as the construction based on PRFs [15]—although the latter has not been reported as such in the literature. (The construction based on PRFs appearing in [15] is easily parallelizable since the PRF tag computed on each node of the tree is not a function of the PRF tags of its children.)

2. In the non-secret-key setting, i.e., when only reliable memory is required (e.g., the construction using UOWHFs from [15] and Merkle tree constructions), these checkers have appeared to be inherently sequential.

However, in this section we establish the first parallel online memory checker in the non-secret-key setting:

Theorem 4.6 In the non-secret-key setting and in the CREW model of parallel computation, there is a non-adaptive online memory checker for an unreliable memory of n cells with O(1) query complexity, using O(log n) checkers and O(1) reliable memory.

Proof: Let LBT = {genkey, setup, update, refresh, query, verify} be the authenticated data structure scheme derived in Theorem 4.4. We show how to construct a parallel online memory checker in the non-secret-key setting by using this scheme. Let M be the unreliable memory accessed through indices 1, 2, ..., n. Assume we can use u checkers C_1, C_2, ..., C_u, where u = O(log n). The user U sends his requests to all the checkers simultaneously, and all the checkers have access to the unreliable memory M and to some reliable memory s. We work in the CREW model—i.e., all the checkers can simultaneously read the same value, but writing to the same location simultaneously is not feasible. Let {sk, pk} ← genkey(), where sk = ∅. The checkers run the algorithm {auth(M), d_0} ← setup(M, pk) (since sk = ∅ we do not use the secret key as input from now on) in parallel, requiring O(n) access complexity in the CREW model (Theorem 4.4). The authenticated structure auth(M) is stored in the unreliable memory (all its parts can be uniquely referenced) and d_0 is stored in the small reliable memory, i.e., s = d_0. We have two cases:

1. User U sends the request read(i) to all checkers C_1, C_2, ..., C_u. The checkers run the algorithm query(i, M, auth(M), pk) in parallel and output the answer M[i] and the proof Π(i). This requires O(1) requests to the unreliable memory per checker in the EREW model (Theorem 4.4). Then the algorithm verify(i, M[i], Π(i), s, pk) is run by the checkers (note that running query() and verify() can be combined into one algorithm). The algorithm writes either M[i] (in this case verify() accepts) or BUGGY (in this case verify() rejects) to a location of the reliable memory. User U reads that location and gets the result. We note here that the fact that verify() is parallelizable in the CRCW model does not affect our complexity results, since the write part of the algorithm is performed on the reliable memory—and requests to the reliable memory are not counted in the query complexity (only requests to the unreliable memory are). Therefore the query complexity of the parallel checker due to read operations is O(1) in the EREW model;

2. User U sends the request write(i, x) to all checkers C_1, C_2, ..., C_u. First, the current content of cell i is verified through a read(i) operation. If this verification succeeds, the checkers run algorithm {M′, auth(M′), s′} ← refresh(write(i, x), M, auth(M), s, pk) in parallel.
Note that this algorithm has O(1) access complexity using O(log n) processors in the CREW model, by Theorem 4.4. We need concurrent read because all the checkers should be able to read the same value of the old (verified) content of cell i.

Finally, we note that the correctness and the security of the checker come as a direct result of the correctness and the security of the authenticated data structure scheme LBT. Also, since our lattice-based construction does not use any secret key, it follows that the construction we have described is in the non-secret-key setting. This completes the proof. □
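The checker logic of this proof can be summarized in the following conceptual sketch. The names setup, query, verify and refresh stand for the corresponding LBT algorithms (assumed given); in the actual construction each call is executed by the u = O(log n) checkers in parallel in the CREW model, with only the digest s kept in reliable memory.

```python
class OnlineChecker:
    """Conceptual single-threaded stand-in for the u parallel checkers."""
    def __init__(self, memory, pk, setup):
        self.M, self.pk = memory, pk
        # auth(M) lives in the unreliable memory; s = d_0 is the O(1) reliable memory
        self.auth, self.s = setup(memory, pk)

    def read(self, i, query, verify):
        answer, proof = query(i, self.M, self.auth, self.pk)
        return answer if verify(i, answer, proof, self.s, self.pk) else "BUGGY"

    def write(self, i, x, query, verify, refresh):
        if self.read(i, query, verify) == "BUGGY":   # first verify the old content
            return "BUGGY"
        self.M, self.auth, self.s = refresh((i, x), self.M, self.auth,
                                            self.s, self.pk)
```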
4.5 Protocols

Three-party protocol. By using Theorem 2.1, we can easily derive the following corollary, which describes the use of the authenticated data structure scheme LBT of Theorem 4.4 in the three-party model:

Corollary 4.4 Let k be the security parameter and assume the hardness of GAPSVP_γ for γ = poly(k). Then there exists a three-party authenticated data structures protocol (see Protocol 2.1) for verifying queries q on a dynamic table of n entries such that:

1. The setup at the source has O(n log n) access complexity, or O(n) access complexity using O(log n) processors in the CREW model;
2. The update at the source has O(1) access complexity;
3. The space needed at the source has O(n) group complexity;
4. The communication between the source and the server has O(1) group complexity;
5. The update at the server has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the CREW model;
6. The query at the server has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the EREW model;
7. The space needed at the server has O(n) group complexity;
8. The communication between the server and the client has O(log n) group complexity;
9. The verification at the client has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the CRCW model;
10. For a query q sent by the client to the server at any time (even after updates), let α be an answer and let π be a proof returned by the server. With probability Ω(1 − neg(k)), the client accepts the answer α if and only if α is correct.

Two-party protocol. As a corollary of Example 2.1 (the Merkle tree techniques apply to the lattice-based authenticated table as-is), we can state the following for the authenticated data structure scheme LBT:

Corollary 4.5 Assumption 2.1 is true for the authenticated data structure scheme LBT. Moreover, for every update u, |Q_u| has O(1) complexity.

By Theorems 2.2 and 4.4 and Corollary 4.5, we can now state the final result for the two-party model:

Corollary 4.6 Let k be the security parameter and assume the hardness of GAPSVP_γ for γ = poly(k). Then there exists a two-party authenticated data structures protocol (see Protocol 2.2) for verifying queries q on a dynamic table of n entries such that:

1. The protocol is non-interactive;
2. The setup at the client has O(n log n) access complexity, or O(n) access complexity using O(log n) processors in the CREW model;
3. The update at the client has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the CRCW model;
4. The verification at the client has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the CRCW model;
5. The space needed at the client has O(1) group complexity;
6. The communication between the client and the server has O(log n) group complexity;
7. The update at the server has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the CREW model;
8. The query at the server has O(log n) access complexity, or O(1) access complexity using O(log n) processors in the EREW model;
9. The space needed at the server has O(n) group complexity;
10. For a query q sent by the client to the server at any time (even after updates), let α be an answer and let π be a proof returned by the server. With probability Ω(1 − neg(k)), the client accepts the answer α if and only if α is correct.

Finally, we note that similar protocols can be derived for the authenticated data structure scheme ABF (Theorem 4.5), referring to Bloom filters.

Chapter 5

Authenticated sets operations with bilinear maps

In the previous chapters of this thesis, we mainly studied the verification of fundamental data structure queries, such as hash table queries (Chapter 3) and index queries on tables (Chapter 4). The verification of these queries in the authenticated data structures setting allows us to secure outsourced storage efficiently, i.e., to ensure that data has not been tampered with by the untrusted party that stores it. In this chapter, we follow a different direction, where we are interested in verifying outsourced computation. Namely, how can one verify the outcome of a computation that has been performed by an untrusted entity? Of course, the main challenge in this paradigm is that the verification procedure should not involve executing the computation from scratch: This would defeat the purpose of employing a powerful (but untrusted) machine in the cloud to perform the computation for us.

Motivated mainly by computations performed by search engines (e.g., keyword searches using an inverted index) as well as by database applications, in this chapter we examine a very fundamental class of computations: We study the verification of outsourced operations on general sets, where a dynamic collection of m sets S_1, S_2, ..., S_m is remotely stored at an untrusted server and we wish to publicly verify primitive queries on these sets, such as intersection, union and set difference. For example, for the query requesting the intersection of t sets specified by indices i_1, i_2, ..., i_t between 1 and m, we wish to design techniques that allow any client to cryptographically check the correctness of the returned intersection S_{i_1} ∩ S_{i_2} ∩ ... ∩ S_{i_t}. In addition, we wish the verification of any set operation to be operation-sensitive, meaning that the required complexity depends only on the (description and outcome of the) operation, and not on the sizes of the involved sets. For example, if |S_{i_1} ∩ S_{i_2} ∩ ... ∩ S_{i_t}| = δ, then we would like the verification cost to be proportional to t + δ. This achieves optimality, as the query and the answer require O(t + δ) complexity.

Relation to outsourced verifiable computation. Recent works on outsourced verifiable computation by Gennaro et al. [41], Chung et al. [28] and Applebaum et al. [5] achieve operation-sensitive verification of general functionalities. Although such approaches completely cover set operations as a special case, clearly meeting our goal with respect to optimal verifiability, they are inherently inadequate to meet our other goals with respect to public verifiability and dynamic updates, both important properties in the context of data querying.
Indeed, the works on outsourced verifiable computation [5, 28, 41] are primarily designed to provide secrecy of the outsourced computations, and as such, the client makes use of some secret information to outsource the computation as a circuit and in encrypted form. This secret information is also used in verifying the computation, therefore effectively supporting only one verifier; instead, we seek schemes that allow any client to query the sets collection and verify the returned results. Finally, in the outsourced verifiable computation framework [5, 28, 41], the description of the circuit is fixed at the initialization of the scheme, therefore effectively supporting no updates (or very expensive updates, as shown in Table 5.1 for [41]) in the outsourced data; instead, we seek schemes supporting efficient updates. We accordingly study our problem in the model of authenticated data structures, which provides mechanisms for supporting public verifiability and queries on dynamic data.

Achieving operation-sensitive verification. In this chapter, we design a new authenticated data structure scheme (denoted ASC in Table 5.1) for the verification of set operations in an operation-sensitive manner, that is, with proof and verification complexity depending only on the description and outcome of the operation and not on the size of the sets involved. Conceptually, this property is similar to the property of super-efficient verification that has been studied in certifying algorithms [63] and certification data structures [52, 107] (as well as in the context of outsourced verifiable computation [5, 28, 41]), where an answer can be verified in complexity asymptotically less than the complexity required to produce it. Whether the above optimality property is achievable for set operations (with linear storage) was posed as an open problem by Devanbu et al. [33]. We close this problem in the affirmative.

All existing schemes for verifying outsourced set operations fall into one of the following two rather straightforward and highly inefficient solutions (for a detailed comparison, see Table 5.1): Either short proofs for the answer of every possible set operation query are precomputed, allowing for highly imbalanced schemes (exponential storage is required in order to achieve optimal verification, e.g., see the work by Pang and Tan [89]), or integrity proofs for all the elements of the sets participating in the query are given to the client, who locally verifies the set operation (in this case verification complexity can be linear in the problem size, e.g., see the work by Devanbu et al. [33]).

[Table 5.1: entries garbled in extraction. The table compares the asymptotic access and group complexities of the schemes of [33, 112], [79], [89], [41] and ASC, with one column per algorithm setup(), update(), refresh(), query(), verify(), plus columns for the proof Π(q), the update information upd, public verifiability (yes for [33, 112], [79], [89] and ASC; no for [41]) and the underlying assumption (Generic CR, Strong RSA, D. Log, FHE and B. q-DH, respectively).]

Table 5.1: Asymptotic access and group complexities of various authenticated data structure schemes defined by algorithms {genkey, setup, update, refresh, query, verify}, for a sets collection data structure of m sets, where the sum of the sizes of all the sets is M and 0 < ε < 1 is a constant. FHE stands for fully-homomorphic encryption, the security of which is based on lattice assumptions, such as the bounded distance decoding and the SplitKey distinguishing problems—see [43]. We note that the scheme based on FHE is not publicly verifiable. It does, however, provide privacy on top of integrity of computations. We show complexities for an intersection query on t = O(1) sets, outputting an intersection of δ elements. All sizes of the intersected and updated sets are Θ(n).
Intuition of our construction. We achieve optimal verification complexity by departing from the above approaches as follows. We first reduce the problem of verifying set operations to the problem of verifying the validity of some more primitive relations on sets, namely subset containment and set disjointness. Then, for each such primitive relation, we employ a corresponding cryptographic primitive to optimally verify its validity. In particular, we extend the bilinear-map accumulator to optimally verify subset containment, inspired by [90]. We then employ the extended Euclidean algorithm over polynomials, in combination with subset containment proofs, to provide a novel optimal verification test for set disjointness. The intuition behind our technique is that disjoint sets can be represented by mutually indivisible polynomials, so that there exist other polynomials such that the sum of their pairwise products equals one—this is the test used in the proof. However, transmitting (and processing) these polynomials is bandwidth- (and time-) prohibitive and does not lead to operation-sensitivity. Taking advantage of bilinearity properties, we can compress their coefficients in the exponent and still use them in a meaningful way, i.e., compute an inner product. This is why, although a conceptually simpler RSA accumulator [11] would lead to a mathematically sound solution, a bilinear-map accumulator [83] is essential for achieving the desired complexity goal.

Related work for securing sets operations. Despite the fact that privacy-related problems for set operations have been extensively studied in the cryptographic literature (e.g., see the work by Boneh and Waters [20] and the work by Freedman et al. [39]), existing work on the integrity dimension of set operations appears mostly in the database literature. Devanbu et al. [33] identify the importance of coming up with an operation-sensitive scheme. In the work by Morselli et al. [79], possibly the closest in context to ours, set intersection, union and difference are authenticated with linear verification and proof costs. The same linear asymptotic bounds are achieved by Yang et al. [112]. Pang and Tan [89] take a different approach: In order to achieve operation-sensitivity, expensive pre-processing and exponential space are required (i.e., answers to all possible queries are signed). Finally, related to our work are non-membership proofs, both for the RSA [68] and the bilinear-map [8, 32] accumulators. We note here that the first part of the solution presented in this chapter uses a modification of the authenticated data structure scheme BHT presented in Chapter 3.
5.1 Preliminaries

The data structure for which we design an authenticated data structure scheme is called a sets collection, and it is a generalization of the inverted index [9]. We describe it in detail in the following paragraph.

5.1.1 Sets collection data structure scheme

The sets collection data structure consists of a collection of m sets, denoted with S = {S_1, S_2, ..., S_m}, each containing elements from a universe U. Without loss of generality, we assume that our universe U is the set of nonnegative integers in the interval [m + 1, p − 1], where p is a k-bit prime, m is the number of sets in our collection (which has bit size O(log k)), and k is the security parameter. (As we are going to see later in this chapter, we could have easily set our universe to be Z_p by using CRHFs, but we choose not to do so for the sake of a cleaner presentation. However, even with this constraint, our universe contains O(2^k − poly(k)) = O(2^k) elements, since m is polynomially large.) Every set S_i is maintained to be sorted and does not contain duplicate elements; however, an element x can appear in more than one set. The space usage of the sets collection is O(m + M), where M is the sum of the sizes of the sets.

Let now I_t be a collection of t indices, all between 1 and m. The data structure scheme {query(), update(), check()} (Definition 2.2) for a sets collection data structure T(S) supports various set operations over a collection S of dynamic sets and is defined as follows:

1. answer ← query(I_t, T(S), op): Depending on the input parameter op, a query on the sets collection data structure is one of the following standard set operations:
   • Intersection: Given indices I_t = {i_1, i_2, ..., i_t}, return the set I = S_{i_1} ∩ S_{i_2} ∩ ... ∩ S_{i_t} as answer;
   • Union: Given indices I_t = {i_1, i_2, ..., i_t}, return the set U = S_{i_1} ∪ S_{i_2} ∪ ... ∪ S_{i_t} as answer;
   • Subset: Given indices I_t = {i, j}, return true as answer if S_i ⊆ S_j and false otherwise;
   • Set difference: Given indices I_t = {i, j}, return the set D = S_i − S_j as answer.

2. T(S′) ← update(x, i, T(S)): Given an element x ∈ U and 1 ≤ i ≤ m such that x ∉ S_i, insert element x into S_i and output T(S′); given an element x ∈ U such that x ∈ S_i, delete element x from S_i and output T(S′).

3. {accept, reject} ← check(answer, op, T(S)): Output accept if answer is the correct answer to the query on T(S) defined by op.

Complexity. Let N be the sum of the sizes of the sets participating in the queries defined by algorithm query(). By using a generalized merge, all these queries can be answered with O(N) complexity. Moreover, due to the requirement of keeping the sets sorted, all the updates require O(log N) complexity. Also, for the remainder of the chapter, we denote with δ the size of the answer to a query operation, i.e., δ is equal to the size of I, U, or D. For a subset query, δ is O(1) (true/false).

Sets collection as a hash table. We observe here that the sets collection data structure T(S) for the sets collection S = {S_1, S_2, ..., S_m} can be viewed as a special hash table: Every set S_i refers to a bucket L_i of the hash table data structure scheme in Chapter 3. This construction does not have expected O(1) size for the buckets, since the sets can have arbitrary size. However, viewing the sets collection as a hash table—one that uses a different function for distributing elements into the buckets—will allow us to employ scheme BHT from Chapter 3 as a black box for verifying set operation queries.
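Before the cryptographic layer is added, the plain data structure scheme just defined can be summarized as follows (a sketch; Python's built-in set operations stand in for the O(N) generalized merge over sorted lists):

```python
class SetsCollection:
    def __init__(self, sets):
        self.S = [sorted(set(s)) for s in sets]    # sorted, duplicate-free sets

    def query(self, indices, op):
        sets = [set(self.S[i]) for i in indices]
        if op == "intersection":
            return sorted(set.intersection(*sets))
        if op == "union":
            return sorted(set.union(*sets))
        if op == "subset":                # indices = (i, j): is S_i a subset of S_j?
            return sets[0] <= sets[1]
        if op == "difference":            # indices = (i, j): S_i - S_j
            return sorted(sets[0] - sets[1])

    def update(self, x, i):               # insert x if absent, else delete it
        s = set(self.S[i])
        s.symmetric_difference_update({x})
        self.S[i] = sorted(s)
```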
Assume that the bilinear-map parameters are in place, as described in Section 3.1.3. The proof for subset containment of a set S ⊆ X —for |S| = 1, this is a proof of membership—is the witness (WS,X , S) where WS,X = g Q x∈X −S (x+s) . A verifier can test subset containment for S by checking the relation e(WS,X , g (5.1) Q x∈S (x+s) ? )= e (acc(X ), g). We continue with the proof of security, which is a generalization of the membership proof presented by Nguyen [83]: Lemma 5.1 (Proving subsets) Let k be the security parameter and let (p, G, G, e, g) be a uniformly randomly generated tuple of bilinear pairings parameters. Given the elements q g, g s , . . . , g s ∈ G for some s chosen at random from Z∗p and a set of elements X in Zp (q ≥ |X |), suppose there is a polynomial-time algorithm that finds S and W such that S * X and e(W, g Q x∈S (x+s) ) = e(acc(X ), g). Then there is a polynomial-time algorithm for breaking the bilinear q-strong Diffie-Hellman assumption. Proof: Suppose there is a polynomial-time algorithm that computes such a set S = {y1 , y2 , . . . , y` }. Let X = {x1 , x2 , . . . , xn } and yj ∈ / X for some 1 ≤ j ≤ `. That means that e(W, g) Q y∈S (y+s) = e(g, g)(x1 +s)(x2 +s)...(xn +s) . Note that (yj +s) does not divide (x1 +s)(x2 +s) . . . (xn +s). Therefore there exist polynomial Q(s) of degree n − 1 and constant λ, such that (x1 + s)(x2 + s) . . . (xn + s) = Q(s)(yj + s) + λ. 135 Thus we have e(W, g)(yj +s) Q 1≤i6=j≤` (yi +s) Q 1≤i6=j≤` (yi +s) Q 1≤i6=j≤` (yi +s) e(W, g) e(W, g) 1 e(g, g) yj +s = e(g, g)Q(s)(yj +s)+λ ⇒ = e(g, g) λ j +s) Q(s)+ (y ⇒ λ = e(g, g)Q(s) e(g, g) (yj +s) ⇒ h   iλ−1 Q −Q(s) (yi +s) 1≤i6 = j≤` = e W, g e (g, g) . This means that the algorithm can be used to break the bilinear q-strong Diffie-Hellman assumption. 2 5.2 Construction and algorithms In the following, we recall that m denotes the number of the sets of our sets collection data structure and M denotes the sum of the sizes of the sets in our collections, i.e., M= m X |Si | . i=1 We now describe ASC = {genkey, setup, update, refresh, query, verify}, our authenticated data structure scheme for a sets collection data structure S1 , S2 , . . . , Sm . To do that, we are going to employ an extended version of the authenticated data structure scheme BHT = {genkey, setup, update, refresh, query, verify} described in Chapter 3: Instead of employing BHT on top of some m buckets created by using a two-universal hash function H on an elements collection X , we are going to employ it on top of the sets S1 ∪{1}, S2 ∪{2}, . . . , Sm ∪ {m}. By the constraint of our universe U, note that i ∈ / Si . Algorithm {sk, pk} ← genkey(1k ): The algorithm calls {sk, pk} ← BHT .genkey(). It outputs BHT .sk as sk and BHT .pk as pk. The access complexity is of this algorithm is O(1). 136 Algorithm {auth(D0 ), d0 } ← setup(D0 , sk, pk): The authenticated data structure auth(D0 ) is built as follows: First of all, the accumulation values of sets Si acc(Si ) = g Q x∈Si (s+x) for all i = 1, . . . , m , (5.2) are computed (see Section 3.1). Then the algorithm calls {auth(D0 ), d0 } ← BHT .setup(D0 , sk, pk) , without precomputed witnesses, and where D0 is the collection of m sets S1 ∪ {1}, S2 ∪ {2}, . . . , Sm ∪ {m} . Namely, the “bucket” Li in the scheme BHT is defined as the set Si ∪ {i} in this construction. The algorithm outputs both the authenticated data structure BHT .auth(D0 ) and the accumulation values acc(Si ) for all i = 1, . . . , m as auth(D0 ). Also it sets d0 to be BHT .d0 . 
Lemma 5.2 Algorithm setup() of the authenticated data structure scheme ASC has O(m + M) access complexity. Moreover, the authenticated data structure auth(D_0) output by setup() has O(m + M) group complexity.

Proof: When the scheme BHT is used with buckets of size O(1), the complexity of algorithm setup(), as well as the group complexity of the output authenticated data structure, are both O(m), by Lemma 3.13. However, in our case, since the sizes of the "buckets" sum to M ≥ m (and not to O(m), as happens with the authenticated hash table), both these complexities are O(m + M). □

Algorithm {D_{h+1}, auth(D_{h+1}), d_{h+1}, upd} ← update(u, D_h, auth(D_h), d_h, sk, pk): Suppose the update u involves element x ∈ U and set S_i. If the update is an insertion, the algorithm initially sets

acc(S_i) = acc(S_i)^{x+s}.

If the update is a deletion, the algorithm sets

acc(S_i) = acc(S_i)^{(x+s)^{−1}},

i.e., it updates the accumulation value that corresponds to the updated set. Then it calls {D_{h+1}, auth(D_{h+1}), d_{h+1}, upd} ← BHT.update(u, D_h, auth(D_h), d_h, sk, pk). However, no rebuilding policy is applied here, as is done in BHT (therefore the complexity is not amortized). Information upd and auth(D_{h+1}) are set equal to BHT.upd and BHT.auth(D_{h+1}) respectively, both enhanced with the updated accumulation value acc(S_i).

Lemma 5.3 Algorithm update() of the authenticated data structure scheme ASC has O(1) access complexity. Moreover, the update information upd output by update() has O(1) group complexity.

Proof: The complexity bounds follow from Lemma 3.14 and from the fact that no rebuilding of the sets collection data structure is employed in this case. □

Algorithm {D_{h+1}, auth(D_{h+1}), d_{h+1}} ← refresh(u, D_h, auth(D_h), d_h, upd, pk): Suppose the update u is "insert element x ∈ U into set S_i". The algorithm calls the respective procedure from the authenticated data structure scheme BHT, i.e., it calls {D_{h+1}, auth(D_{h+1}), d_{h+1}} ← BHT.refresh(u, D_h, auth(D_h), d_h, upd, pk) (again, no rebuilding policy is applied) and stores the new accumulation value acc(S_i) contained in upd. The updated authenticated data structure auth(D_{h+1}) is set equal to BHT.auth(D_{h+1}), enhanced with the updated accumulation value acc(S_i), contained in upd.

Lemma 5.4 Algorithm refresh() of the authenticated data structure scheme ASC has O(1) access complexity.

Proof: This follows directly from Lemma 3.17, since we are using neither precomputed witnesses nor any rebuilding of the table. □

5.3 Queries and verification

In this section, we show how compact proofs for the answers to set queries (e.g., intersection, union) can be constructed using the authenticated sets collection data structure presented earlier. The proofs have optimal size O(t + δ), where t is the size of the query parameters (e.g., t = 2 for an intersection of two sets) and δ is the answer size (e.g., δ = 1 if the intersection consists of one element). Our solutions use polynomial arithmetic, since the basis of our construction is the bilinear-map accumulator. We begin with a result, used extensively by our methods, related to certifying algorithms [63]. Lemma 5.5 states that if the vector of coefficients a = [a_n, a_{n−1}, ..., a_0] of a polynomial having roots x = [−x_1, −x_2, ..., −x_n] is claimed to be correct, it can be certified, with high probability, with complexity less than O(n log n), i.e., without a fast Fourier transform (FFT) computation from scratch (see Lemma 3.15 from Chapter 3). This can be achieved with the following algorithm (not part of the ASC scheme):

Algorithm {accept, reject} ← certify(a, x, pk): Pick a random κ ∈ Z*_p. If ∑_{i=0}^{n} a_i κ^i = ∏_{i=1}^{n}(κ + x_i), then the algorithm outputs accept, else it outputs reject.

Lemma 5.5 (Verification of polynomial coefficients) Let a = [a_n, a_{n−1}, ..., a_0] and x = [x_1, x_2, ..., x_n]. If accept ← certify(a, x, pk), then a_n, a_{n−1}, ..., a_0 are the coefficients of the polynomial ∏_{i=1}^{n}(s + x_i) with probability Ω(1 − neg(k)). Moreover, algorithm certify(a, x, pk) has O(n) complexity.

Proof: Algorithm certify() has complexity O(n) since it involves O(n) multiplications, additions and exponentiations. The probability that certify() accepts while a_0, a_1, ..., a_n are not the coefficients of the polynomial that has roots −x_1, −x_2, ..., −x_n is equal to the probability that κ is a root of the polynomial R(κ) = ∑_{i=0}^{n} a_i κ^i − ∏_{i=1}^{n}(κ + x_i). This follows from the polynomial equality that should hold for all κ. Now, polynomial R(κ) has degree n = poly(k) and therefore O(n) roots. Since κ is picked at random from Z*_p, it follows that this probability is bounded by O(poly(k)/2^k), which is neg(k); therefore the validity of the coefficients can be verified with probability Ω(1 − neg(k)) with Θ(n) complexity. □
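A direct transcription of certify() follows (a sketch, using a toy prime of our choosing; in the scheme, p is the k-bit prime from pk):

```python
import random

def certify(a, x, p):
    """a = [a_n, ..., a_0]: claimed coefficients of prod_{i=1}^{n}(s + x_i)."""
    kappa = random.randrange(1, p)       # random kappa in Z_p^*
    lhs = 0
    for coef in a:                       # Horner: sum_i a_i kappa^i mod p
        lhs = (lhs * kappa + coef) % p
    rhs = 1
    for xi in x:                         # prod_i (kappa + x_i) mod p
        rhs = rhs * (kappa + xi) % p
    return lhs == rhs                    # wrong coefficients pass w.p. at most n/(p-1)

# example: (s + 1)(s + 2) = s^2 + 3s + 2
assert certify([1, 3, 2], [1, 2], p=2**61 - 1)
```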
This can be achieved with the following algorithm (not part of the ASC scheme): Algorithm {accept, reject} ← certify(a, x, pk): Pick a random κ ∈ Z∗p . If Qm i=1 (κ + xi ), then the algorithm outputs accept, else it outputs reject. Pn i=0 ai κi = Lemma 5.5 (Verification of polynomial coefficients) Let a = [an , an−1 , . . . , a0 ] and x = [x1 , x2 , . . . , xn ]. If accept ← certify(a, x, pk), then an , an−1 , . . . , a0 are the coefficients of the Q polynomial ni=1 (s + xi ) with probability Ω(1 − neg(k)). Moreover, algorithm certify(a, x, pk) has O(n) complexity. Proof: Algorithm certify() has complexity O(n) since it involves O(n) multiplications, additions and exponentiations. The probability that certify() accepts while a0 , a1 , . . . , an are not the coefficients of the polynomial that has roots −x1 , −x2 , . . . , −xn is equal to the probability P Q of κ being the root of the polynomial R(κ) = ni=0 ai κi − m i=1 (κ + xi ). This follows from polynomial equality that should hold for all κ. Now, polynomial R(κ) has degree n = poly(k) and has O(n) roots. Since κ is picked at random from Z∗p , it follows that this probability is bounded by O(poly(k)/2k ), which is neg(k), and therefore the validity of the coefficients can be verified with probability Ω(1 − neg(k)) with Θ(n) complexity. 2 In the following we describe the algorithms for the queries intersection, union, subset and set difference in detail. The parameters of our queries are t ≥ 2 indices (for subset and set 139 difference queries it is t = 2), namely the indices i1 , i2 , . . . , it , with 1 ≤ t ≤ m. To simplify the notation, we assume without loss of generality that these indices are 1, 2, . . . , t. We P denote with ni the size of set Si (i = 1, 2, . . . , t) and we define N = ti=1 ni . I.e., N is the total size of the sets involved in the execution of our queries. We repeat that δ denotes the size of our answer (e.g., size of the output intersection). Note, that in all cases δ = O(N ) and that performing the actual operations has O(N ) complexity, by using a generalized merge. 5.3.1 Intersection query We begin with the intersection query. Let I = S1 ∩ S2 ∩ . . . ∩ St = {y1 , y2 , . . . , yδ } be the intersection of sets S1 , S2 , . . . , St . We express the correctness of the answer I to the intersection query by means of the following two conditions: Subset condition: I ⊆ S1 ∧ I ⊆ S2 ∧ . . . ∧ I ⊆ St ; (5.3) Completeness condition: (S1 − I) ∩ (S2 − I) ∩ . . . ∩ (St − I) = Ø . (5.4) Note the completeness condition in Equation 5.4 is necessary since I should contain all the common elements. Given an intersection I, and for every set Sj , we define polynomial Q Pj (s) = x∈Sj −I (x + s), of degree nj . We can now state the following lemma: Lemma 5.6 Set I that is a subset of sets S1 , S2 , . . . , St is the intersection of sets S1 , S2 , . . . , St if and only if there exist polynomials q1 (s), q2 (s), . . . , qt (s) such that q1 (s)P1 (s)+q2 (s)P2 (s)+ . . . + qt (s)Pt (s) = 1. Moreover, computing polynomials q1 (s), q2 (s), . . . , qt (s) can be achieved with complexity O(N log2 N log log N ). Proof: (⇒) This direction follows by the fact that we can use the extended Euclidean algorithm and find polynomials q1 (s), . . . , qt (s) such that q1 (s)P1 (s) + . . . + qt (s)Pt (s) = GCD(P1 (s), P2 (s), . . . , Pt (s)). Since P1 (s), P2 (s), . . . , Pt (s) share no common factors, it follows that GCD(P1 (s), P2 (s), . . . , Pt (s)) = 1 . 140 (⇐) Suppose there exist polynomials q1 (s), q2 (s), . . . 
, qt (s) that satisfy relation q1 (s)P1 (s) + q2 (s)P2 (s) + . . . + qt (s)Pt (s) = 1 but I is not the intersection. This means that polynomials P1 (s), P2 (s), . . . , Pt (s) share at least one common factor, e.g., (s + r). Therefore there exists some polynomial A(s) such that (s + r)A(s) = 1, i.e., the polynomials (s + r)A(s) and 1 are equal, which is a contradiction (note that we want the polynomials to be equal for every s ∈ Zp ). In order to compute these coefficients, we use the extended Euclidean algorithm recursively, based on the fact that the greatest common divisor GCD(P1 (s), . . . , Pt (s)) equals GCD(P1 (s), GCD(P2 (s), . . . , Pt (s))). To compute the greatest common divisor of two O(n)degree polynomials, we can use the algorithm described in the book by von zur Gathen and Gerhard [40] that has O(n log2 n log log n) complexity. Since we are using this algorithm t times, the time complexity is bounded by O(tn log2 n log log n). Moreover, by the property that x log x + y log y ≤ (x + y) log(x + y) and since the size of the sets participating in the intersection is N , this equals O(N log2 N log log N ). This algorithm also outputs the required coefficients. If we arrange our data (i.e., t polynomials) on a binary tree, after all the coefficients of the internal nodes have been computed, the final coefficients for all elements at the leaves can be computed in O(t) multiplications (we can avoid the O(t log t) cost) of O(ni ) degree polynomials, where ni are the degrees of the polynomials of the leaves. Therefore the result holds. 2 We use Lemmata 3.15 and 5.6 to construct efficient proofs for both conditions in Relations 5.3 and 5.4: Proof of subset condition. For each set Sj , 1 ≤ j ≤ t, the subset witnesses WI,j = g Pj (s) are computed, as defined in Relation 5.1. =g Q x∈Sj −I (x+s) (5.5) 141 Proof of completeness condition. Suppose q1 (s), q2 (s), . . . , qt (s) are polynomials computed in Lemma 5.6 that satisfy q1 (s)P1 (s)+q2 (s)P2 (s)+. . .+qt (s)Pt (s) = 1. For j = 1, . . . , t, the completeness witnesses FI,j = g qj (s) (5.6) are computed. We can now formally define algorithms query() and verify() of the authenticated data structure scheme ASC and for the intersection query: Algorithm {Π(q), α(q)} ← query(q, Dh , auth(Dh ), pk): (intersection) The query q is the set of indices {1, 2, . . . , t}, requiring the intersection of S1 , S2 , . . . , St . Let α(q) = {y1 , y2 , . . . , yδ } be the intersection I. The proof Π(q) consisting of the following pieces: 1. Coefficients bδ , bδ−1 , . . . , b0 of the polynomial (s + y1 )(s + y2 ) . . . (s + yδ ); 2. Accumulation values proofs Πj = {(αji , βji ) : i = 0, . . . , l}, as defined in Relation 3.27, output by calling algorithm BHT .query(j, Dh , auth(Dh ), pk), for all j = 1, . . . , t; 3. Subset witnesses WI,j , as defined in Relation 5.5, for all j = 1, . . . , t; 4. Completeness witnesses FI,j = g qj (s) , as defined in Relation 5.6, for all j = 1, . . . , t. Algorithm {accept, reject} ← verify(q, α, Π, dh , pk): (intersection) Given a proof Π and an answer α = {y1 , y2 , . . . , yδ }, the verification algorithm for the intersection query S1 ∩ S2 ∩ . . . ∩ St outputs accept if all of the following tests are successful, else it outputs reject: 1. Coefficients test: It is accept ← certify([bδ , bδ−1 , . . . , b0 ], [−y1 , −y2 , . . . , −yδ ], pk);2 2. Accumulation values tests: For all j = 1, . . . , t, it is accept ← BHT .verify(j, true, Πj , dh , pk) ; 2 Algorithm certify() is used to achieve optimal verification complexity. 
142 3. Subset tests: For all j = 1, . . . , t, it is e δ  Y g si b i ! , WI,j = e (βj0 , g) , (5.7) i=0 where βj0 is taken from Πj ; 4. Completeness test: It is t Y e (WI,j , FI,j ) = e(g, g) . (5.8) j=1 5.3.2 Union query The answer to a union query is the set U = S1 ∪ S2 ∪ . . . ∪ St = {y1 , y2 , . . . , yδ }. We express the correctness of the answer U to the union query by means of the following two conditions: Membership condition: ∀yi ∈ U ∃j ∈ {1, 2, . . . , t} : yi ∈ Sj ; Superset condition: (U ⊇ S1 ) ∧ (U ⊇ S2 ) ∧ . . . ∧ (U ⊇ St ) . (5.9) (5.10) Note that the superset condition is needed to make sure that no element has been excluded from the returned answer U. We now formally describe algorithms query() and verify() for the union query. Algorithm {Π(q), α(q)} ← query(q, Dh , auth(Dh ), pk): (union) The query q is the set of indices {1, 2, . . . , t}, requiring the union of S1 , S2 , . . . , St . Let α(q) = {y1 , y2 , . . . , yδ } be the union U. The proof Π(q) consisting of the following pieces: 1. Coefficients bδ , bδ−1 , . . . , b0 of the polynomial (s + y1 )(s + y2 ) . . . (s + yδ ); 2. Accumulation values proofs Πj = {(αji , βji ) : i = 0, . . . , l}, as defined in Relation 3.27, output by calling algorithm BHT .query(j, Dh , auth(Dh ), pk), for all j = 1, . . . , t; 3. Membership witnesses Wyi ,Sk (for some 1 ≤ k ≤ t), as defined in Relation 3.4, for all i = 1, . . . , δ; 4. Subset witnesses WSj ,U , as defined in Relation 5.1, for all j = 1, . . . , t. 143 Algorithm {accept, reject} ← verify(q, α, Π, dh , pk): (union) Given a proof Π and an answer α = {y1 , y2 , . . . , yδ }, the verification algorithm for the union query S1 ∪ S2 ∪ . . . ∪ St outputs accept if all of the following tests are successful, else it outputs reject: 1. Coefficients test: It is accept ← certify([bδ , bδ−1 , . . . , b0 ], [−y1 , −y2 , . . . , −yδ ], pk); 2. Accumulation values tests: For all j = 1, . . . , t, it is accept ← BHT .verify(j, true, Πj , dh , pk) ; 3. Membership tests: For all i = 1, . . . , δ, it is e (Wyi ,Sk , g yi g s ) = e (βk0 , g) , where βk0 is taken from Πk ; 4. Subset tests: For all j = 1, . . . , t, it is e WSj ,U , βj0  δ   Y i bi =e gs ,g ! , i=0 where βj0 is taken from Πj . 5.3.3 Subset query The correctness properties we need for the subset query are expressed with the relations S1 ⊆ S2 ⇔ ∀y ∈ S1 : y ∈ S2 . Algorithm {Π(q), α(q)} ← query(q, Dh , auth(Dh ), pk): (subset) The query q is the set of indices {1, 2} (wlog). Let α(q) = true if S1 ⊆ S2 or α(q) = false otherwise. The proof Π(q) consisting of the following pieces: 1. Accumulation values proofs Πj = {(αji , βji ) : i = 0, . . . , l}, as defined in Relation 3.27, output by calling algorithm BHT .query(j, Dh , auth(Dh ), pk), for j = 1, 2; 144 2. We distinguish two cases: (a) α(q) = true: The proof contains the subset witness WS1 ,S2 as defined in Relation 5.1; (b) α(q) = false: The proof contains a membership witness Wy,S1 (for some y) as defined in Relation 3.4 and a non-membership witness (Ay , By )—that proves that y∈ / S2 — as defined in Relation 3.5. Algorithm {accept, reject} ← verify(q, α, Π, dh , pk): (subset) Given a proof Π and an answer α ∈ {true, false}, the verification algorithm for the subset query S1 ⊆ S2 outputs accept if all of the following tests are successful, else it outputs reject: 1. Accumulation values tests: For j = 1, 2, it is accept ← BHT .verify(j, true, Πj , dh , pk); 2. 
(Non)-membership tests: When α = true, it is e(WS1 ,S2 , β10 ) = e(β20 , g), otherwise (α = false) it is e(Wy,S1 , g y g s ) = e(β10 , g) (verification of membership of y in S1 ) and e(g y g s , Ay )e(β20 , By ) = e(g, g) (verification of non-membership of y in S2 ), where β10 and β20 are taken from Π1 and Π2 . 5.3.4 Set difference query The correctness properties for a set difference query are expressed with the following relations. It is D = S1 − S2 ⇔ D ⊆ S1 ∧ S1 − D = S1 ∩ S2 . Algorithm {Π(q), α(q)} ← query(q, Dh , auth(Dh ), pk): (set difference) The query q is the set of indices {1, 2} (wlog), requiring the difference S1 − S2 . Let α(q) = {y1 , y2 , . . . , yδ } be the difference D. The proof Π(q) consisting of the following pieces: 145 1. Coefficients bδ , bδ−1 , . . . , b0 of the polynomial (s + y1 )(s + y2 ) . . . (s + yδ ); 2. Accumulation values proofs Πj = {(αji , βji ) : i = 0, . . . , l}, as defined in Relation 3.27, output by calling algorithm BHT .query(j, Dh , auth(Dh ), pk), for j = 1, 2; 3. Subset witness WD,S1 , as defined in Relation 5.1; 4. Subset witnesses WS1 −D,1 and WS1 −D,2 as defined in Relation 5.5; 5. Completeness witnesses FS1 −D,1 and FS1 −D,2 as defined in Relation 5.6. Algorithm {accept, reject} ← verify(q, α, Π, dh , pk): (set difference) Given a proof Π and an answer α = {y1 , y2 , . . . , yδ }, the verification algorithm for the difference query S1 − S2 outputs accept if all of the following tests are successful, else it outputs reject: 1. Coefficients test: It is accept ← certify([bδ , bδ−1 , . . . , b0 ], [−y1 , −y2 , . . . , −yδ ], pk); 2. Accumulation values tests: For all j = 1, 2, it is accept ← BHT .verify(j, true, Πj , dh , pk);   Q δ  si b i 3. Subset tests: It is e WD,S1 , i=0 g = e (β10 , g) and (a) e (WS1 −D,1 , WD,S1 ) = e(β10 , g), (b) e (WS1 −D,2 , WD,S1 ) = e(β20 , g), where β10 and β20 are taken from Π1 and Π2 ; 4. Completeness test: It is e (WS1 −D,1 , FS1 −D,1 ) e (WS1 −D,2 , FS1 −D,2 ) = e (g, g). This concludes the description of the verification and the query algorithms for all four set operations supported by the authenticated data structure scheme ASC. 146 5.4 Complexity Let now n1 , n2 , . . . , nt be the sizes of the involved sets in our queries and N = Pt i=1 ni . We have the following result: Lemma 5.7 For all queries q, algorithm query() of the authenticated data structure scheme ASC has O(N log2 N log log N + tm log m) access complexity. Moreover, it outputs a proof Π(q) of O(t + δ) group complexity. Proof: For all queries involving t sets S1 , S2 , . . . , St , accumulation proofs Π1 , Π2 , . . . , Πt have to be constructed, by using the authenticated data structure scheme algorithm BHT .query(). By Lemma 3.18 (no precomputed witnesses), this requires O(tm log m) complexity and the output proofs Π1 , Π2 , . . . , Πt have O(t) group complexity. Moreover: • Queries intersection, union and set difference require the computation of the coefficients bδ , bδ−1 , . . . , b0 of the polynomial that has roots −y1 , −y2 , . . . , −yδ . This task, by Lemma 3.15 has O(δ log δ) = O(N log N ) complexity, since δ ≤ N . Since bδ , bδ−1 , . . . , b0 ∈ Zp , their total group complexity is O(δ); • All queries require computing t subset witnesses (note that for the subset and set difference queries it is t = O(1)). By Lemma 3.15 and by the definition of subset witnesses in Relation 5.1, computing the subset witnesses has ! t X O (ni − δ) log(ni − δ) = O(N log N ) i=1 complexity. 
5.4 Complexity

Let now n_1, n_2, ..., n_t be the sizes of the sets involved in our queries and let N = ∑_{i=1}^{t} n_i. We have the following result:

Lemma 5.7 For all queries q, algorithm query() of the authenticated data structure scheme ASC has O(N log² N log log N + tm^ε log m) access complexity. Moreover, it outputs a proof Π(q) of O(t + δ) group complexity.

Proof: For all queries involving t sets S_1, S_2, ..., S_t, the accumulation proofs Π_1, Π_2, ..., Π_t have to be constructed by using algorithm BHT.query() of the authenticated data structure scheme. By Lemma 3.18 (no precomputed witnesses), this requires O(tm^ε log m) complexity, and the output proofs Π_1, Π_2, ..., Π_t have O(t) group complexity. Moreover:

• The intersection, union and set difference queries require the computation of the coefficients b_δ, b_{δ−1}, ..., b_0 of the polynomial that has roots −y_1, −y_2, ..., −y_δ. This task, by Lemma 3.15, has O(δ log δ) = O(N log N) complexity, since δ ≤ N. Since b_δ, b_{δ−1}, ..., b_0 ∈ Z_p, their total group complexity is O(δ);

• All queries require computing t subset witnesses (note that for the subset and set difference queries it is t = O(1)). By Lemma 3.15 and by the definition of subset witnesses in Relation 5.1, computing the subset witnesses has O(∑_{i=1}^{t} (n_i − δ) log(n_i − δ)) = O(N log N) complexity. Since all subset witnesses are elements in G, their total group complexity is O(t). Moreover, for the union query, δ membership witnesses have to be computed. By Lemma 3.15, this has complexity bounded above by O(N log N). Also, these membership witnesses are elements in G; therefore their total group complexity is O(δ);

• The intersection, subset and set difference queries require computing t completeness (or non-membership) witnesses (note that for the subset and set difference queries it is t = O(1)), which involves running the extended Euclidean algorithm. By Lemma 5.6, this task has O(N log² N log log N) complexity. The group complexity of these witnesses is O(t), since they are elements in G.

Summing up, we conclude that the proof always has group complexity O(t + δ) (hence it is operation-sensitive) and that the complexity to compute it is O(N log² N log log N + tm^ε log m) for all queries, except for the union proof, which requires slightly less complexity, i.e., O(N log N + tm^ε log m). □

Lemma 5.8 Algorithm verify() of the authenticated data structure scheme ASC has O(t + δ) access complexity.

Proof: Algorithm certify() has O(δ) complexity, by Lemma 5.5. Also, the verification algorithm for all queries performs a number of constant-complexity operations—such as verification of the proofs Π_i with BHT.verify() (see Lemma 3.19) and bilinear-map computations—that is proportional to t + δ. Therefore the access complexity of verify() is O(t + δ). □

5.5 Proof of correctness

Lemma 5.9 The authenticated data structure scheme ASC = {genkey, setup, update, refresh, query, verify}, using the correct authenticated data structure scheme BHT from Chapter 3, is correct according to Definition 2.4.

Proof: Let D_0 be any sets collection data structure containing m sets. Fix the security parameter k and output pk = {h(.), (p, G, G, e, g), g^s, g^{s^2}, ..., g^{s^q}} and sk = s by calling algorithm genkey(). Then output an authenticated data structure auth(D_0) and the respective digest d_0 by calling algorithm setup(). Pick a polynomial number of updates—namely, pick a polynomial number of elements for insertion into (or deletion from) a set S_r—and update auth(D_0) and d_0 by calling algorithm refresh(). Let D_h be the final sets collection data structure, auth(D_h) be the produced authenticated data structure and d_h be the final digest. By the way refresh() operates, at every point in time the digest d(v_j) of a leaf node v_j (corresponding to set S_j) of the tree T is

d(v_j) = acc(S_j)^{(s+j)}.   (5.11)

We prove correctness for all four query operations, i.e., for intersection, union, subset and difference.

Intersection. Let our query q be {1, 2, ..., t} (wlog), i.e., a set of indices that refers to the intersection of sets S_1, S_2, ..., S_t. Algorithm query() outputs the proof Π(q) and the correct answer I = {y_1, y_2, ..., y_δ} = S_1 ∩ S_2 ∩ ... ∩ S_t. The proof Π(q) for the intersection contains the following parts:

1. The coefficients b_δ, b_{δ−1}, ..., b_0 of the polynomial (s + y_1)(s + y_2)...(s + y_δ) associated with the intersection I = {y_1, y_2, ..., y_δ}. Since for every κ ∈ Z_p it is ∑_{i=0}^{δ} b_i κ^i = ∏_{i=1}^{δ}(κ + y_i), algorithm certify() accepts;

2. The proofs Π_j, output by BHT.query(j, D_h, auth(D_h), pk), for j = 1, ..., t. We recall that each proof Π_j is the ordered sequence (α_{ji}, β_{ji}) for i = 0, ..., l, as defined in Relation 3.27. Specifically, by Relation 3.27 it should be β_{j0} = d(v_j)^{(s+j)^{−1}}, which by Relation 5.11 gives

β_{j0} = acc(S_j).   (5.12)
Now, by the correctness of the scheme BHT, algorithm BHT.verify(j, true, Π_j, d_h, pk), on input Π_j output by BHT.query(j, D_h, auth(D_h), pk), always accepts (see Lemma 3.20);

3. The subset witnesses W_{I,j} = g^{P_j(s)} = g^{∏_{x∈S_j−I}(x+s)} for j = 1, ..., t. The equality

e(∏_{i=0}^{δ} (g^{s^i})^{b_i}, W_{I,j}) = e(β_{j0}, g)

is always true, by the properties of the bilinear map, by Relation 5.12 and by the fact that ∑_{i=0}^{δ} b_i s^i = ∏_{i=1}^{δ}(s + y_i) (Item 1);

4. The completeness witnesses F_{I,j} = g^{q_j(s)} for j = 1, ..., t. The equality

∏_{j=1}^{t} e(W_{I,j}, F_{I,j}) = e(g, g)^{∑_{j=1}^{t} q_j(s)P_j(s)} = e(g, g)

is always true, since by construction of the completeness witnesses it is ∑_{j=1}^{t} q_j(s)P_j(s) = 1.

This completes the proof of correctness for the case of intersection, since we proved that for every intersection query q and for every correct answer and proof output by query(), verify() always accepts.

Union. Let our query q be {1, 2, ..., t} (wlog), i.e., a set of indices that refers to the union of sets S_1, S_2, ..., S_t. Algorithm query() outputs the proof Π(q) and the correct answer U = {y_1, y_2, ..., y_δ} = S_1 ∪ S_2 ∪ ... ∪ S_t. The proof Π(q) for a union contains the following parts:

1. The coefficients b_δ, b_{δ−1}, ..., b_0 and the proofs Π_j. These are always verified, as in the case of intersection (see Items 1 and 2 above);

2. The membership witnesses W_{y_i,S_k}, for some k ∈ {1, ..., t}, for each element y_i (i = 1, ..., δ). For i = 1, ..., δ it is e(W_{y_i,S_k}, g^{y_i} g^s) = e(β_{k0}, g), since W_{y_i,S_k} is the witness defined in Relation 5.1 and β_{k0} = acc(S_k), by Relation 5.12;

3. The subset witnesses W_{S_j,U}, for all j = 1, ..., t. For all j = 1, ..., t it is

e(W_{S_j,U}, β_{j0}) = e(∏_{i=0}^{δ} (g^{s^i})^{b_i}, g),

where W_{S_j,U} is the subset witness of S_j with respect to U (whose coefficients are b_0, b_1, ..., b_δ), as defined in Relation 5.1, and U ⊇ S_j for all j = 1, ..., t. Therefore this relation also verifies, since β_{j0} = acc(S_j) by Relation 5.12.

This completes the proof of correctness for the case of union, since we proved that for every union query q and for every correct answer and proof output by query(), verify() always accepts.

Subset. Let the query be "is S_1 ⊆ S_2?" (wlog). Algorithm query() outputs the proof Π(q) and the correct answer, i.e., either true or false. The proof Π(q) for a subset query contains the following parts:

1. The accumulation value proofs Π_1 and Π_2. These are always verified, as in the case of intersection (see Item 2 in the proof of correctness of the intersection operation);

2. Depending on whether we have a positive or a negative answer, we distinguish the following cases:

• Positive answer, i.e., S_1 is a subset of S_2. The proof contains the subset witness W_{S_1,S_2}. Then it is e(W_{S_1,S_2}, β_{10}) = e(β_{20}, g), by the definition of W_{S_1,S_2} (see Relation 5.1), since S_1 ⊆ S_2 and since β_{10} = acc(S_1) and β_{20} = acc(S_2), by Relation 5.12;

• Negative answer, i.e., S_1 is not a subset of S_2. The proof contains an element y such that y ∈ S_1 but y ∉ S_2, the respective membership witness W_{y,S_1} and a non-membership proof (A_y, B_y). It is e(W_{y,S_1}, g^y g^s) = e(β_{10}, g), by the definition of W_{y,S_1} in Relation 5.1, since y ∈ S_1 and since β_{10} = acc(S_1), by Relation 5.12. Also, it holds that e(g^y g^s, A_y) e(β_{20}, B_y) = e(g, g), since A_y = g^{q(s)} and B_y = g^{p(s)} are such that (y + s)q(s) + p(s)∏_{x∈S_2}(x + s) = 1, y ∉ S_2, and β_{20} = acc(S_2), by Relation 5.12.
This completes the proof of correctness for the case of the subset query, since we proved that for every subset query q and for every correct answer and proof output by query(), verify() always accepts.

Set difference. Let our query q be S_1 − S_2 (wlog). Algorithm query() outputs the proof Π(q) and the correct answer D = S_1 − S_2 = {y_1, y_2, ..., y_δ}. The proof Π(q) for a difference query contains the following parts:

1. The coefficients b_δ, b_{δ−1}, ..., b_0 (which relate to the difference {y_1, y_2, ..., y_δ}) and the proofs Π_1 and Π_2. These are always verified, as in the case of intersection (see Items 1 and 2 in the proof of correctness of the intersection query);

2. The subset witness W_{D,S_1}. Then it is e(W_{D,S_1}, ∏_{i=0}^{δ} (g^{s^i})^{b_i}) = e(β_{10}, g), by the definition of W_{D,S_1} (see Relation 5.1), since D ⊆ S_1 and since β_{10} = acc(S_1), by Relation 5.12;

3. Note now that W_{D,S_1} = g^{∏_{x∈S_1−D}(x+s)}. The remaining relations, involving the subset witnesses W_{S_1−D,1}, W_{S_1−D,2} and the completeness witnesses F_{S_1−D,1}, F_{S_1−D,2}, always verify, since they comprise an intersection proof, i.e., the proof that S_1 − D = S_1 ∩ S_2, and we have already shown the correctness of the intersection operation.

This completes the proof of correctness for all the queries supported by the sets collection, since we proved that for every query q (intersection/union/subset/difference) and for every correct answer and proof output by query(), verify() always accepts. □

5.6 Proof of security

Lemma 5.10 The authenticated data structure scheme ASC = {genkey, setup, update, refresh, query, verify}, using the secure authenticated data structure scheme BHT from Chapter 3, is secure according to Definition 2.5 under the bilinear q-strong Diffie-Hellman assumption.

Proof: Let Adv be a computationally-bounded adversary, let D_0 be a sets collection data structure consisting of m sets S_1, S_2, ..., S_m, let ASC = {genkey, setup, update, refresh, query, verify} be our authenticated data structure scheme, let k be the security parameter and let {sk, pk} ← genkey(1^k). The adversary Adv is given the public key pk, namely the values {h(.), (p, G, G, e, g), g^s, g^{s^2}, ..., g^{s^q}}, and unlimited access to all the algorithms of ASC, except for setup() and update(), to which he only has oracle access. The adversary initially outputs the authenticated data structure auth(D_0) and the digest d_0, through an oracle call to algorithm setup(). Then the adversary picks a polynomial number of updates (e.g., insert an element x into a set S_r) and eventually outputs the data structure D_h, the authenticated data structure auth(D_h) and the digest d_h, through oracle access to update(). Note that since d_h, the digest of the authenticated data structure, is produced through oracle access to setup() and update(), it follows that it is the correct one. We now prove the security of each operation separately. For each operation, we will express the probability of Definition 2.5 as the intersection of several events that we define precisely below. Then, by using well-accepted assumptions already introduced, we are going to prove that this probability is negligible.

Intersection. Let the intersection query be a set of indices {1, 2, ..., t} (wlog). The adversary Adv outputs an incorrect answer I = {e_1, e_2, ..., e_δ} ≠ S_1 ∩ S_2 ∩ ... ∩ S_t and also a proof that consists of the following elements:

1. Coefficients γ_δ, γ_{δ−1}, ..., γ_0;
2. Proofs Π_1, Π_2, ..., Π_t;
3. Subset witnesses W_1, W_2, ..., W_t;
4. Completeness witnesses F_1, F_2, ..., F_t.
We define now the following events, related to the adversary's choice of proof above. Our goal is to express the probability of the security definition (Definition 2.5) as a function of the following events.

• $E_1$: The values $\gamma = [\gamma_\delta, \gamma_{\delta-1}, \ldots, \gamma_0]$ and the answer $e = \{e_1, e_2, \ldots, e_\delta\}$ picked by Adv are such that accept ← certify($\gamma$, $e$, $pk$). Event $E_1$ can be partitioned into two mutually exclusive events $E_{1,0}$ and $E_{1,1}$, i.e., $E_1 = E_{1,0} \cup E_{1,1}$:
 – $E_{1,0}$: The coefficients $\gamma_\delta, \ldots, \gamma_0$ are not the coefficients of the polynomial $(s+e_1)(s+e_2)\cdots(s+e_\delta)$;
 – $E_{1,1}$: The coefficients $\gamma_\delta, \ldots, \gamma_0$ are the coefficients of the polynomial $(s+e_1)(s+e_2)\cdots(s+e_\delta)$.

• $E_2$: The proofs $\Pi_1, \Pi_2, \ldots, \Pi_t$ picked by Adv are accepted by algorithm BHT.verify(), i.e., it is accept ← BHT.verify($j$, true, $\Pi_j$, $d_h$, $pk$) for all $j = 1, \ldots, t$. Let $(\beta_j', \alpha_j')$ be the first element of the proof $\Pi_j$ (Relation 5.13). Event $E_2$ can be partitioned into two mutually exclusive events $E_{2,0}$ and $E_{2,1}$, i.e., $E_2 = E_{2,0} \cup E_{2,1}$:
 – $E_{2,0}$: There exists $j \in \{1, 2, \ldots, t\}$ such that $\beta_j' \neq \mathrm{acc}(S_j)$;
 – $E_{2,1}$: For all $j = 1, \ldots, t$ it is $\beta_j' = \mathrm{acc}(S_j)$.

• $E_3$: The values $\gamma_\delta, \ldots, \gamma_0$, $W_1, W_2, \ldots, W_t$ and $\beta_1', \beta_2', \ldots, \beta_t'$ (which are contained in $\Pi_1, \Pi_2, \ldots, \Pi_t$) picked by Adv satisfy
\[
e\left(\prod_{i=0}^{\delta}\left(g^{s^i}\right)^{\gamma_i}, W_j\right) = e(\beta_j', g) \quad \text{for } j = 1, \ldots, t.
\]
Event $E_3$ can be partitioned into two mutually exclusive events $E_{3,0}$ and $E_{3,1}$, i.e., $E_3 = E_{3,0} \cup E_{3,1}$:
 – $E_{3,0}$: There exists $j \in \{1, 2, \ldots, t\}$ such that the opposites of the roots of the polynomial $\sum_{i=0}^{\delta} \gamma_i s^i$ are not a subset of $S_j$;
 – $E_{3,1}$: The opposites of the roots of the polynomial $\sum_{i=0}^{\delta} \gamma_i s^i$ are a subset of $S_j$ for all $j = 1, \ldots, t$.

• $E_4$: The values $W_1, \ldots, W_t$ and $F_1, \ldots, F_t$ picked by Adv satisfy $\prod_{j=1}^{t} e(W_j, F_j) = e(g,g)$;

• $F$: The answer (intersection) $I$ picked by Adv is not correct, i.e., $I = \{e_1, e_2, \ldots, e_\delta\} \neq S_1 \cap S_2 \cap \ldots \cap S_t$.

Let now $P$ be the probability of Definition 2.5, i.e.,
\[
P = \Pr\left[\{Q, \Pi, \alpha, h\} \leftarrow \mathrm{Adv}(1^k, pk);\ \mathsf{accept} \leftarrow \mathsf{verify}(Q, \alpha, \Pi, d_h, pk);\ \mathsf{reject} \leftarrow \mathsf{check}(Q, \alpha, D_h)\right].
\]
We recall that the authenticated data structure scheme ASC is secure if $P \leq \nu(k)$, where $\nu(k)$ is neg($k$). We observe that for the case of the intersection query, $P$ can be expressed as the probability of the intersection of the events $E_1, E_2, E_3, E_4, F$. By simple probability calculus, this can be written as
\[
\begin{aligned}
P &= \Pr[E_1 \cap E_2 \cap E_3 \cap E_4 \cap F] \\
  &= \Pr[(E_{1,0} \cup E_{1,1}) \cap (E_{2,0} \cup E_{2,1}) \cap (E_{3,0} \cup E_{3,1}) \cap E_4 \cap F] \\
  &\leq \Pr[E_{1,0}] + \Pr[E_{1,1} \cap (E_{2,0} \cup E_{2,1}) \cap (E_{3,0} \cup E_{3,1}) \cap E_4 \cap F] \\
  &\leq \Pr[E_{1,0}] + \Pr[E_{2,0}] + \Pr[E_{1,1} \cap E_{2,1} \cap (E_{3,0} \cup E_{3,1}) \cap E_4 \cap F] \\
  &\leq \Pr[E_{1,0}] + \Pr[E_{2,0}] + \Pr[E_{3,0} \cap E_{2,1} \cap E_{1,1}] + \Pr[E_{1,1} \cap E_{2,1} \cap E_{3,1} \cap E_4 \cap F] \\
  &\leq \Pr[E_{1,0}] + \Pr[E_{2,0}] + \Pr[E_{3,0} \mid E_{2,1} \cap E_{1,1}] + \Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F].
\end{aligned}
\]
We compute each such probability separately:

1. $\Pr[E_{1,0}]$ is neg($k$) by Lemma 5.5;

2. $\Pr[E_{2,0}]$ is neg($k$) by Corollary 3.4 (security of scheme BHT). (Note that, in order to apply this corollary in the sets collection data structure, we have to consider that the respective "bucket" of the hash table representing the sets collection is $L_j = S_j \cup \{j\}$, and therefore $L_j - \{j\} = S_j$.)

3. $\Pr[E_{3,0} \mid E_{2,1} \cap E_{1,1}]$: Here the event $E_{3,0}$ is conditioned on the event $E_{2,1} \cap E_{1,1}$. This condition allows us to replace $\beta_j'$ with $\mathrm{acc}(S_j)$ (due to $E_{2,1}$) and $\sum_{i=0}^{\delta} \gamma_i s^i$ with $\prod_{x \in I}(x+s)$ (due to $E_{1,1}$) in the event $E_{3,0}$. Therefore the event $E_{3,0} \mid E_{2,1} \cap E_{1,1}$ is the event
\[
e\left(g^{\prod_{x \in I}(x+s)}, W_j\right) = e(\mathrm{acc}(S_j), g) \ \wedge\ I \nsubseteq S_j \ \text{for some } j \in \{1, 2, \ldots, t\}.
\]
This event implies breaking the bilinear q-strong Diffie-Hellman assumption (Assumption 3.2), by Lemma 5.1. Therefore the probability $\Pr[E_{3,0} \mid E_{2,1} \cap E_{1,1}]$ is neg($k$);

4. $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F]$: Here the event $E_4$ is conditioned on the event $E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F$. This condition allows us to replace $\beta_j'$ with $\mathrm{acc}(S_j)$ (due to $E_{2,1}$) and $\sum_{i=0}^{\delta} \gamma_i s^i$ with $\prod_{x\in I}(x+s)$ (due to $E_{1,1}$) in the event $E_{3,1}$. Therefore the event $E_{3,1} \mid E_{2,1} \cap E_{1,1}$ is the event
\[
e\left(g^{\prod_{x\in I}(x+s)}, W_j\right) = e(\mathrm{acc}(S_j), g) \ \wedge\ I \subseteq S_j \ \text{for all } j = 1, 2, \ldots, t.
\]
This is equivalent to writing $W_j$ as the subset witness $W_{I,S_j}$, i.e.,
\[
W_j = g^{\prod_{x \in S_j - I}(x+s)} = g^{P_j(s)}. \tag{5.14}
\]
Note now that $E_4$ is also conditioned on $F$. Therefore $I$ has to be incorrect. Specifically, since $I \subseteq S_j$ for all $j = 1, \ldots, t$ (due to the condition on $E_{3,1}$), it follows that $I$ does not contain all the elements of the intersection, i.e., it is incomplete. Thus the polynomials $P_1(s), P_2(s), \ldots, P_t(s)$ (Relation 5.14) have at least one common factor, say $(s+r)$, and it holds that $P_j(s) = (s+r)Q_j(s)$ for some polynomials $Q_j(s)$—computable in polynomial time—for all $j = 1, \ldots, t$. Therefore the event $E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F$ implies that
\[
e(g,g) = \prod_{j=1}^{t} e(W_j, F_j) = \prod_{j=1}^{t} e\left(g^{P_j(s)}, F_j\right) = \prod_{j=1}^{t} e\left(g^{(s+r)Q_j(s)}, F_j\right) = \left(\prod_{j=1}^{t} e\left(g^{Q_j(s)}, F_j\right)\right)^{(s+r)}.
\]
Therefore we can derive an $(s+r)$-th root of $e(g,g)$ as
\[
e(g,g)^{\frac{1}{s+r}} = \prod_{j=1}^{t} e\left(g^{Q_j(s)}, F_j\right).
\]
This implies breaking the bilinear q-strong Diffie-Hellman assumption for $(p, G, \mathcal{G}, e, g)$ (Assumption 3.2). By Assumption 3.2, this probability is neg($k$), and therefore $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F]$ is neg($k$).

Thus the total probability $P$ is neg($k$). This concludes the proof for the security of an intersection query.
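The common-factor step above is elementary to check numerically: if some element $r$ lies in every $S_j$ but is missing from $I$, then every $P_j(s)$ vanishes at $s = -r$, i.e., $(s+r)$ divides every $P_j$. A small Python sketch with toy values:

```python
# Why an incomplete intersection is caught: if r lies in every S_j but not in
# the claimed answer I, then each P_j(s) = prod_{x in S_j - I} (x + s) has the
# root s = -r, i.e., (s + r) divides every P_j -- exactly the common factor the
# reduction uses to extract an (s+r)-th root of e(g,g). Toy parameters only.
p = 2**31 - 1

S = [{2, 5, 7, 11}, {2, 5, 13}, {2, 5, 19}]
I = {5}                      # incomplete: the true intersection is {2, 5}
r = 2                        # the omitted common element

for Sj in S:
    Pj_at_minus_r = 1
    for x in Sj - I:
        Pj_at_minus_r = Pj_at_minus_r * ((x - r) % p) % p
    assert Pj_at_minus_r == 0  # (s + r) divides P_j(s)
print("(s + r) is a common factor of every P_j(s)")
```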
Union. Let the union query be a set of indices $\{1, 2, \ldots, t\}$ (wlog). The adversary Adv outputs an incorrect answer $U = \{e_1, e_2, \ldots, e_\delta\} \neq S_1 \cup S_2 \cup \ldots \cup S_t$ and also a proof that consists of the following elements:

1. Coefficients $\gamma_\delta, \gamma_{\delta-1}, \ldots, \gamma_0$;
2. Proofs $\Pi_1, \Pi_2, \ldots, \Pi_t$;
3. For each element $e_i \in U$, a membership witness $W_{i,j}$ with reference to some set $S_j$, where $1 \leq j \leq t$;
4. Subset witnesses $W_1, W_2, \ldots, W_t$ that prove that $U$ is a superset of $S_j$, for all $j = 1, 2, \ldots, t$.

We define now the following events, related to the adversary's choice of proof above. Our goal is to express the probability of the security definition as a function of the following events.

• $E_1, E_{1,0}, E_{1,1}$: Same as in intersection;

• $E_2, E_{2,0}, E_{2,1}$: Same as in intersection;

• $E_3$: The values $\{e_1, e_2, \ldots, e_\delta\}$, $W_{1,j_1}, W_{2,j_2}, \ldots, W_{\delta,j_\delta}$ picked by Adv satisfy
\[
e\left(W_{i,j_i}, g^s g^{e_i}\right) = e(\beta_{j_i}', g) \quad \text{for all } i = 1, \ldots, \delta \text{ and } j_i \in \{1, 2, \ldots, t\},
\]
where $\beta_{j_i}'$ is the first element of proof $\Pi_{j_i}$, as defined in Relation 5.13. Event $E_3$ can be partitioned into two mutually exclusive events $E_{3,0}$ and $E_{3,1}$, i.e., $E_3 = E_{3,0} \cup E_{3,1}$:
 – $E_{3,0}$: There exists $i \in \{1, 2, \ldots, \delta\}$ such that $e_i \notin S_{j_i}$;
 – $E_{3,1}$: For all $i = 1, 2, \ldots, \delta$ it is $e_i \in S_{j_i}$.

• $E_4$: The values $W_1, W_2, \ldots, W_t$, $\beta_1', \beta_2', \ldots, \beta_t'$ (contained in $\Pi_1, \Pi_2, \ldots, \Pi_t$), as well as the values $\gamma_\delta, \gamma_{\delta-1}, \ldots, \gamma_0$ picked by Adv, satisfy
\[
e(W_j, \beta_j') = e\left(\prod_{i=0}^{\delta}\left(g^{s^i}\right)^{\gamma_i}, g\right).
\]

• $F$: The answer (union) $U$ picked by Adv is not correct, i.e., $U = \{e_1, e_2, \ldots, e_\delta\} \neq S_1 \cup S_2 \cup \ldots \cup S_t$.

Similarly with the intersection security proof, let $P$ be the probability of Definition 2.5. We observe that for the case of the union query, $P$ can be expressed as the probability of the intersection of the events $E_1, E_2, E_3, E_4, F$. By simple probability calculus (and similarly with the intersection security proof), this can be written as
\[
\begin{aligned}
P &= \Pr[E_1 \cap E_2 \cap E_3 \cap E_4 \cap F] \\
  &= \Pr[(E_{1,0} \cup E_{1,1}) \cap (E_{2,0} \cup E_{2,1}) \cap (E_{3,0} \cup E_{3,1}) \cap E_4 \cap F] \\
  &\leq \Pr[E_{1,0}] + \Pr[E_{2,0}] + \Pr[E_{3,0} \mid E_{2,1}] + \Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F].
\end{aligned}
\]
We compute each such probability separately:

1. $\Pr[E_{1,0}]$ is neg($k$) by Lemma 5.5;

2. $\Pr[E_{2,0}]$ is neg($k$) by Corollary 3.4;

3. $\Pr[E_{3,0} \mid E_{2,1}]$: Here the event $E_{3,0}$ is conditioned on the event $E_{2,1}$. This condition allows us to replace $\beta_{j_i}'$ with $\mathrm{acc}(S_{j_i})$ in the event $E_{3,0}$. Therefore the event $E_{3,0} \mid E_{2,1}$ is the event
\[
e\left(W_{i,j_i}, g^s g^{e_i}\right) = e(\mathrm{acc}(S_{j_i}), g) \ \wedge\ \exists\, i \in \{1, \ldots, \delta\},\ j_i \in \{1, \ldots, t\} : e_i \notin S_{j_i}.
\]
This event implies breaking the bilinear q-strong Diffie-Hellman assumption (Assumption 3.2), by Lemma 5.1. Therefore the probability $\Pr[E_{3,0} \mid E_{2,1}]$ is neg($k$);

4. $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F]$: Here the event $E_4$ is conditioned on the event $E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F$. This condition allows us to replace $\beta_j'$ with $\mathrm{acc}(S_j)$ (due to $E_{2,1}$) and $\sum_{i=0}^{\delta}\gamma_i s^i$ with $\prod_{x\in U}(x+s)$ (due to $E_{1,1}$) in the event $E_4$. Therefore the event $E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1}$ is the event
\[
e(W_j, \mathrm{acc}(S_j)) = e\left(g^{\prod_{x\in U}(x+s)}, g\right). \tag{5.15}
\]
Note now that $E_4$ is also conditioned on $E_{3,1}$. Thus all elements $e_i \in U$ belong to some $S_{j_i}$, and therefore the reported union cannot contain extra elements. Also, $E_4$ is conditioned on $F$ (incorrect union). Therefore the reported union must contain fewer elements, so there is a set $S_j$ ($1 \leq j \leq t$) that contains an element $r$ such that $r \notin U$. Therefore, since Relation 5.15 holds, the adversary Adv can find $P(s)$, $Q(s)$ and $\alpha \neq 0$ such that
\[
e(W_j, \mathrm{acc}(S_j)) = e(W_j, g)^{(s+r)P(s)} = e\left(g^{\prod_{x\in U}(x+s)}, g\right) = e(g,g)^{(s+r)Q(s) + \alpha}.
\]
Therefore we can derive an $(s+r)$-th root of $e(g,g)$ as
\[
e(g,g)^{\frac{1}{s+r}} = e(g, W_j)^{P(s)/\alpha}\, e(g,g)^{-Q(s)/\alpha}.
\]
This implies breaking the bilinear q-strong Diffie-Hellman assumption for the setting $(p, G, \mathcal{G}, e, g)$ (Assumption 3.2). By Assumption 3.2, this probability is neg($k$), and therefore $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap E_{1,1} \cap F]$ is neg($k$).

Thus the total probability $P$ is neg($k$). This concludes the proof for the security of a union query.

Subset. Let the subset query be "is $S_1 \subseteq S_2$?". For a positive answer, the adversary Adv outputs an incorrect answer true and also a proof that consists of the following elements:

1. Proofs $\Pi_1$ and $\Pi_2$;
2. A subset witness $W_{S_1, S_2}$ with reference to set $S_2$.

We define now the following events, related to the adversary's choice of proof above. Our goal is to express the probability of the security definition as a function of the following events.

• $E_2, E_{2,0}, E_{2,1}$: Same as in intersection, with the difference that we only refer to two sets, i.e., sets $S_1$ and $S_2$;

• $E_3$: The values $\beta_1'$ (contained in $\Pi_1$), $\beta_2'$ (contained in $\Pi_2$) and $W_{S_1,S_2}$ picked by Adv satisfy $e(W_{S_1,S_2}, \beta_1') = e(\beta_2', g)$;

• $F$: $S_1 \nsubseteq S_2$.

Similarly with the intersection security proof, let $P$ be the probability of Definition 2.5.
We observe that for the case of the positive subset query, $P$ can be expressed as the probability of the intersection of the events $E_2, E_3, F$. By simple probability calculus (and similarly with the intersection security proof), this can be written as
\[
P = \Pr[E_2 \cap E_3 \cap F] = \Pr[(E_{2,0} \cup E_{2,1}) \cap E_3 \cap F] \leq \Pr[E_{2,0}] + \Pr[E_3 \mid E_{2,1} \cap F].
\]
We compute each such probability separately:

1. $\Pr[E_{2,0}]$ is neg($k$) by Corollary 3.4;

2. $\Pr[E_3 \mid E_{2,1} \cap F]$: Here the event $E_3$ is conditioned on the event $E_{2,1} \cap F$. This condition allows us to replace $\beta_1'$ with $\mathrm{acc}(S_1)$ and $\beta_2'$ with $\mathrm{acc}(S_2)$ in the event $E_3$. Therefore the event $E_3 \mid E_{2,1} \cap F$ is the event
\[
e(W_{S_1,S_2}, \mathrm{acc}(S_1)) = e(\mathrm{acc}(S_2), g) \ \wedge\ S_1 \nsubseteq S_2.
\]
This event implies breaking the bilinear q-strong Diffie-Hellman assumption (Assumption 3.2), by Lemma 5.1. Therefore the probability $\Pr[E_3 \mid E_{2,1} \cap F]$ is neg($k$).

This concludes the security proof for the case of the positive subset query.

For a negative answer, the adversary Adv outputs an incorrect answer false and also a proof that consists of the following elements:

1. Proofs $\Pi_1$ and $\Pi_2$;
2. An element $y$;
3. A membership witness $W_y$ for element $y$;
4. A non-membership witness $(A_y, B_y)$.

We define now the following events, related to the adversary's choice of proof above. Our goal is to express the probability of the security definition (Definition 2.5) as a function of the following events.

• $E_2$: Same as in the positive answer;

• $E_3$: The values $\beta_1'$, $W_y$ and $y$ picked by Adv are such that $e(W_y, g^s g^y) = e(\beta_1', g)$. Event $E_3$ can be partitioned into two mutually exclusive events $E_{3,0}$ and $E_{3,1}$, i.e., $E_3 = E_{3,0} \cup E_{3,1}$:
 – $E_{3,0}$: $y \notin S_1$;
 – $E_{3,1}$: $y \in S_1$.

• $E_4$: The values $y$, $A_y$, $B_y$ and $\beta_2'$ picked by Adv are such that $e(g^y g^s, A_y)\, e(\beta_2', B_y) = e(g,g)$;

• $F$: $S_1 \subseteq S_2$.

Similarly with the intersection security proof, let $P$ be the probability of Definition 2.5. We observe that for the case of the negative subset query, $P$ can be expressed as the probability of the intersection of the events $E_2, E_3, E_4, F$. By simple probability calculus (and similarly with the intersection security proof), this can be written as
\[
P = \Pr[E_4 \cap E_3 \cap E_2 \cap F] = \Pr[E_4 \cap (E_{3,0} \cup E_{3,1}) \cap (E_{2,0} \cup E_{2,1}) \cap F] \leq \Pr[E_{2,0}] + \Pr[E_{3,0} \mid E_{2,1}] + \Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap F].
\]
We compute each such probability separately:

1. $\Pr[E_{2,0}]$ is neg($k$) by Corollary 3.4;

2. $\Pr[E_{3,0} \mid E_{2,1}]$: Here the event $E_{3,0}$ is conditioned on the event $E_{2,1}$. This condition allows us to replace $\beta_1'$ with $\mathrm{acc}(S_1)$ in the event $E_{3,0}$. Therefore the event $E_{3,0} \mid E_{2,1}$ is the event
\[
e(W_y, g^s g^y) = e(\mathrm{acc}(S_1), g) \ \wedge\ y \notin S_1.
\]
This event implies breaking the bilinear q-strong Diffie-Hellman assumption (Assumption 3.2), by Lemma 5.1. Therefore the probability $\Pr[E_{3,0} \mid E_{2,1}]$ is neg($k$);

3. $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap F]$: Due to the condition on $E_{3,1} \cap E_{2,1} \cap F$, this is the event
\[
e(g^y g^s, A_y)\, e(\mathrm{acc}(S_2), B_y) = e(g,g) \ \wedge\ S_1 \subseteq S_2.
\]
Since we have the condition on $E_{3,1} \cap F$ ($y \in S_1$ and $S_1 \subseteq S_2$), it must be that $y \in S_2$. By the security of the non-membership witness (Lemma 3.3), this implies breaking the q-strong Diffie-Hellman assumption (Assumption 3.2), which happens with probability neg($k$).

Thus the total probability $P$ is neg($k$). This concludes the proof for the security of a negative subset query.

Set difference. Let the difference query be $D = S_1 - S_2$. The adversary Adv outputs an incorrect answer $D = \{e_1, e_2, \ldots, e_\delta\} \neq S_1 - S_2$ and also a proof that consists of the following elements:
1. Coefficients $\gamma_\delta, \gamma_{\delta-1}, \ldots, \gamma_0$;
2. Proofs $\Pi_1$ and $\Pi_2$;
3. A subset witness $W_{D,S_1}$;
4. A proof $(W_{S_1-D,1}, W_{S_1-D,2}, F_{S_1-D,1}, F_{S_1-D,2})$ for the intersection $S_1 \cap S_2$.

We define now the following events, related to the adversary's choice of proof above. Our goal is to express the probability of the security definition (Definition 2.5) as a function of the following events.

• $E_1$: Same as in intersection;

• $E_2$: Same as in subset;

• $E_3$: The values $\gamma_\delta, \gamma_{\delta-1}, \ldots, \gamma_0$, $W_{D,S_1}$ and $\beta_1'$ (contained in $\Pi_1$) picked by Adv satisfy
\[
e\left(W_{D,S_1}, \prod_{i=0}^{\delta}\left(g^{s^i}\right)^{\gamma_i}\right) = e(\beta_1', g).
\]
Event $E_3$ can be partitioned into two mutually exclusive events $E_{3,0}$ and $E_{3,1}$, i.e., $E_3 = E_{3,0} \cup E_{3,1}$:
 – $E_{3,0}$: $D \nsubseteq S_1$;
 – $E_{3,1}$: $D \subseteq S_1$;

• $E_4$: The values $W_{D,S_1}$, $\beta_1'$, $\beta_2'$, $W_{S_1-D,1}$, $W_{S_1-D,2}$, $F_{S_1-D,1}$, $F_{S_1-D,2}$ picked by Adv are such that the respective tests for the intersection of $S_1$ and $S_2$ are satisfied, i.e.,
 1. $e(W_{S_1-D,1}, W_{D,S_1}) = e(\beta_1', g)$;
 2. $e(W_{S_1-D,2}, W_{D,S_1}) = e(\beta_2', g)$;
 3. $e(W_{S_1-D,1}, F_{S_1-D,1})\, e(W_{S_1-D,2}, F_{S_1-D,2}) = e(g,g)$.

• $F$: The difference $D$ is incorrect, i.e., $D \neq S_1 - S_2$.

Similarly with the intersection security proof, let $P$ be the probability of Definition 2.5. We observe that for the case of the difference query, $P$ can be expressed as the probability of the intersection of the events $E_1, E_2, E_3, E_4, F$. By simple probability calculus (and similarly with the intersection security proof), this can be written as
\[
\begin{aligned}
P &= \Pr[E_4 \cap E_3 \cap E_2 \cap E_1 \cap F] \\
  &= \Pr[E_4 \cap (E_{3,0} \cup E_{3,1}) \cap (E_{2,0} \cup E_{2,1}) \cap (E_{1,0} \cup E_{1,1}) \cap F] \\
  &\leq \Pr[E_{1,0}] + \Pr[E_{2,0}] + \Pr[E_{3,0} \mid E_{2,1} \cap E_{1,1}] + \Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap F].
\end{aligned}
\]
We compute each such probability separately:

1. $\Pr[E_{1,0}]$ is neg($k$) by Lemma 5.5;

2. $\Pr[E_{2,0}]$ is neg($k$) by Corollary 3.4;

3. $\Pr[E_{3,0} \mid E_{2,1} \cap E_{1,1}]$: For the event $E_{3,0} \mid E_{2,1} \cap E_{1,1}$, by replacing the values of the conditions, we get
\[
e\left(W_{D,S_1}, g^{\prod_{x\in D}(x+s)}\right) = e(\mathrm{acc}(S_1), g) \ \wedge\ D \nsubseteq S_1.
\]
This event implies breaking the bilinear q-strong Diffie-Hellman assumption (Assumption 3.2), by Lemma 5.1. Therefore the probability $\Pr[E_{3,0} \mid E_{2,1} \cap E_{1,1}]$ is neg($k$);

4. $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap F]$: By the conditions, since $D \subseteq S_1$, we can write $W_{D,S_1} = g^{\prod_{x \in S_1 - D}(x+s)}$. Therefore the event is equivalent to the conjunction of the following events:
 • $e\left(W_{S_1-D,1}, g^{\prod_{x\in S_1-D}(x+s)}\right) = e(\mathrm{acc}(S_1), g)$;
 • $e\left(W_{S_1-D,2}, g^{\prod_{x\in S_1-D}(x+s)}\right) = e(\mathrm{acc}(S_2), g)$;
 • $e(W_{S_1-D,1}, F_{S_1-D,1})\, e(W_{S_1-D,2}, F_{S_1-D,2}) = e(g,g)$.
We have already proved (intersection proof) that the probability that the above event holds while $S_1 - D \neq S_1 \cap S_2$ is neg($k$). However, the event $S_1 - D \neq S_1 \cap S_2$ is equivalent to the event $D \neq S_1 - S_2$, which is our event $F$. Therefore the probability $\Pr[E_4 \mid E_{3,1} \cap E_{2,1} \cap F]$ is neg($k$).

This completes the proof of security for all the queries of the sets collection data structure. □
Theorem 5.1 Consider a collection of $m$ sets $S_1, \ldots, S_m$ and let $M = \sum_{i=1}^{m}|S_i|$ and $0 < \epsilon < 1$. For a query operation involving $t$ sets (intersection/union/subset/difference), let $N$ be the sum of the sizes of the involved sets and $\delta$ be the answer size. Let now $k$ be the security parameter. Then there exists a publicly-verifiable authenticated data structure scheme ASC = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a dynamic sets collection data structure $D$ such that:

1. It is correct and secure according to Definitions 2.4 and 2.5 and based on the bilinear q-strong Diffie-Hellman assumption;
2. The access complexity of setup() is $O(m + M)$, outputting an authenticated data structure auth($D$) of $O(m + M)$ group complexity;
3. The access complexity of update() is $O(1)$, outputting update information upd of $O(1)$ group complexity;
4. The access complexity of refresh() is $O(1)$;
5. For all queries $q$ (intersection/union/subset/difference), the access complexity of query() is $O(N \log^2 N \log\log N + t m^{\epsilon} \log m)$, outputting a proof $\Pi(q)$ of $O(t + \delta)$ group complexity;
6. For all queries (intersection/union/subset/difference), the access complexity of verify() is $O(t + \delta)$.

Proof: The result follows from Lemmata 5.2, 5.3, 5.4, 5.7, 5.8, 5.9 and 5.10. □

5.6.1 Protocols

Three-party protocol. By using Theorem 2.1 we can easily derive the following corollary, which describes the use of the authenticated data structure scheme ASC of Theorem 5.1 in the three-party model:

Corollary 5.1 Consider a collection of $m$ sets $S_1, \ldots, S_m$ and let $M = \sum_{i=1}^{m}|S_i|$ and $0 < \epsilon < 1$. For a query operation involving $t$ sets (intersection/union/subset/difference), let $N$ be the sum of the sizes of the involved sets and $\delta$ be the answer size. Let now $k$ be the security parameter and assume that the bilinear q-strong Diffie-Hellman assumption holds. Then there exists a three-party authenticated data structures protocol (see Protocol 2.1) for verifying intersection, union, subset and difference queries $q$ on a dynamic sets collection data structure such that:

1. The setup at the source has $O(m + M)$ access complexity;
2. The update at the source has $O(1)$ access complexity;
3. The space needed at the source has $O(m + M)$ group complexity;
4. The communication between the source and the server has $O(1)$ group complexity;
5. The update at the server has $O(1)$ access complexity;
6. For all queries (intersection/union/subset/difference), the query at the server has $O(N \log^2 N \log\log N + t m^{\epsilon} \log m)$ access complexity;
7. The space needed at the server has $O(m + M)$ group complexity;
8. For all queries (intersection/union/subset/difference), the communication between the server and the client has $O(t + \delta)$ group complexity;
9. For all queries (intersection/union/subset/difference), the verification at the client has $O(t + \delta)$ access complexity;
10. For a query $q$ (intersection/union/subset/difference) sent by the client to the server at any time (even after updates), let $\alpha$ be an answer and let $\pi$ be a proof returned by the server. With probability $\Omega(1 - \mathrm{neg}(k))$, the client accepts the answer $\alpha$ if and only if $\alpha$ is correct.

Two-party protocol. Since the authenticated data structure scheme ASC uses the authenticated data structure scheme BHT, for which we have proved that Assumption 2.1 holds (see Corollary 3.6), by Theorems 2.2 and 5.1 we can state the final result for the two-party model:
Corollary 5.2 Consider a collection of $m$ sets $S_1, \ldots, S_m$ and let $M = \sum_{i=1}^{m}|S_i|$ and $0 < \epsilon < 1$. For a query operation involving $t$ sets (intersection/union/subset/difference), let $N$ be the sum of the sizes of the involved sets and $\delta$ be the answer size. Let now $k$ be the security parameter and assume that the bilinear q-strong Diffie-Hellman assumption holds. Then there exists a two-party authenticated data structures protocol (see Protocol 2.2) for verifying intersection, union, subset and difference queries $q$ on a dynamic sets collection data structure such that:

1. The protocol requires one round of interaction during updates;
2. The setup at the client has $O(m + M)$ access complexity;
3. The update at the client has $O(1)$ access complexity;
4. For all queries (intersection/union/subset/difference), the verification at the client has $O(t + \delta)$ access complexity;
5. The space needed at the client has $O(1)$ group complexity;
6. The communication between the client and the server has $O(1)$ group complexity during updates and $O(t + \delta)$ group complexity during queries;
7. The update at the server has $O(m^{\epsilon} \log m)$ access complexity;
8. For all queries (intersection/union/subset/difference), the query at the server has $O(N \log^2 N \log\log N + t m^{\epsilon}\log m)$ access complexity;
9. The space needed at the server has $O(m + M)$ group complexity;
10. For a query $q$ (intersection/union/subset/difference) sent by the client to the server at any time (even after updates), let $\alpha$ be an answer and let $\pi$ be a proof returned by the server. With probability $\Omega(1 - \mathrm{neg}(k))$, the client accepts the answer $\alpha$ if and only if $\alpha$ is correct.

5.7 Applications

In this section we discuss some applications of the presented authenticated sets collection data structure.

5.7.1 Keyword-search

First of all, we notice that our scheme can easily be used to authenticate keyword-search queries implemented by the inverted index data structure [9]: each term in the dictionary corresponds to a set of our sets collection data structure, which contains as elements all the documents that include this term. A usual text query for terms $m_1$ and $m_2$ returns those documents that are included in both the sets represented by $m_1$ and $m_2$, i.e., their intersection (a minimal sketch of this view is given below). By using our scheme, we can authenticate any such keyword-search query with costs that are proportional to the size of the answer of the query, and not proportional to the amount of data that the algorithm reads in order to process the query. Moreover, the derived authenticated inverted index can be efficiently updated as well.
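The following minimal (and unauthenticated) Python sketch of the inverted-index view, with made-up terms and document identifiers, may help fix ideas; the sets collection scheme then authenticates exactly this intersection:

```python
# Each dictionary term maps to the set of documents containing it; a multi-term
# query is the intersection of the corresponding sets. Contents are illustrative.
inverted_index = {
    "cloud":     {1, 2, 5, 8},
    "integrity": {2, 3, 5},
    "crypto":    {5, 8, 9},
}

def keyword_search(index, *terms):
    """Documents containing every query term: the t-set intersection."""
    return set.intersection(*(index[t] for t in terms))

print(keyword_search(inverted_index, "cloud", "integrity"))  # {2, 5}
```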
We continue with an extension of the authenticated inverted index, the timestamped keyword-search.

5.7.2 Timestamped keyword-search

Apart from applications in web search engines, the inverted index is used in other applications that employ keyword-search, such as email-search. In email-search, a word dictionary is again maintained, the terms of which are mapped into sets of email messages that contain the specific term. Therefore, when we are searching our inbox for emails containing terms $m_1$ and $m_2$, an inverted index query is executed. However, it is always desirable in email search to be able to introduce a "second" dimension in searching. For example, a query could be "give me the emails that contain terms $m_1$ and $m_2$ and which were received between time $t_1$ and $t_2$", where $t_1 < t_2$. We call this procedure timestamped keyword-search.

One solution for the verification of timestamped keyword-search would be to embed a timestamp in the documents (e.g., each email message) and have the client do the filtering locally, after he has verified—using our scheme—the intersection of the sets that correspond to terms $m_1$ and $m_2$. However, this is not operation-sensitive at all: the intersection can be a lot bigger than the set resulting after the application of the local filtering, making this straightforward solution inefficient.

We now describe an algorithmic construction to solve this problem. Let $t_1, t_2, \ldots, t_r$ be the discrete timestamps that we are interested in ($t_i$ can be viewed as a certain day of the month). We define a new sets collection data structure as follows. Imagine $t_1, t_2, \ldots, t_r$ are the leaves of a binary tree. We build a segment tree [97] on top of these timestamps as follows: each leaf storing timestamp $t_i$ contains the documents (e.g., email messages) that were received at time $t_i$. Moreover, the internal nodes of the binary tree contain the documents that correspond to the union (note that this union does not have any common elements) of the documents contained in the children's nodes, recursively defining in this way sets of documents for all the nodes of the tree. Therefore we end up with a new sets collection data structure that is built on top of these $2r - 1$ sets (one set per node of the tree), namely the sets $T_1, T_2, \ldots, T_{2r-1}$. The timestamped keyword-search is therefore verified by two sets collection data structures, one built on the text terms, namely the sets $S_1, S_2, \ldots, S_m$, and one built on top of the sets of the timestamps, namely the sets $T_1, T_2, \ldots, T_{2r-1}$.

Define now the extension of two timestamps, $\mathrm{ext}(t_1, t_2)$, to be the set of sets $T_i$ that "cover" the interval $[t_1, t_2]$, namely the set that contains sets the union of which equals the set of all timestamps in $[t_1, t_2]$. One can easily see that for every $1 \leq t_1 \leq t_2 \leq r$, it is $|\mathrm{ext}(t_1, t_2)| = O(\log r)$ (a sketch of this decomposition follows below).

Suppose now we want to verify the documents that contain terms $m_1$ and $m_2$ and which were received between $t_1$ and $t_2$. Namely, our query is described by the parameters $m_1, m_2, t_1, t_2$ (in the general case our query is described by $t$ terms $m_1, m_2, \ldots, m_t$ and two timestamps $t_1$ and $t_2$—see Corollary 5.3). All we have to do is to verify the intersection of the following sets: (a) the union of the sets in $\mathrm{ext}(t_1, t_2)$, (b) $S_1$ (the set that refers to term $m_1$) and (c) $S_2$ (the set that refers to term $m_2$). Let $T_1, T_2, \ldots, T_\ell$ be the disjoint sets contained in $\mathrm{ext}(t_1, t_2)$, where $\ell = O(\log r)$. The answer to the query is the set $(S_1 \cap S_2) \cap (T_1 \cup T_2 \cup \ldots \cup T_\ell)$, which can be written as $(S_1 \cap S_2 \cap T_1) \cup (S_1 \cap S_2 \cap T_2) \cup \ldots \cup (S_1 \cap S_2 \cap T_\ell)$. Since the $T_i$ are disjoint, each term of the union contributes at least one new element to the answer, and therefore we can verify this query in a nearly operation-sensitive way by authenticating the $\log r$ intersections separately (note there is an extra $O(\log r)$ multiplicative factor in the complexities of Corollary 5.3).
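The canonical-cover computation $\mathrm{ext}(t_1, t_2)$ is the standard segment-tree decomposition. The following Python sketch (illustrative, over timestamps $1, \ldots, r$) returns the $O(\log r)$ disjoint canonical ranges:

```python
# Sketch of ext(t1, t2): the O(log r) canonical segment-tree nodes whose
# disjoint union covers the interval [t1, t2]. Standard recursive decomposition.
def ext(node_lo, node_hi, t1, t2):
    if t2 < node_lo or node_hi < t1:
        return []                          # node disjoint from the query
    if t1 <= node_lo and node_hi <= t2:
        return [(node_lo, node_hi)]        # node fully covered: one canonical set
    mid = (node_lo + node_hi) // 2
    return ext(node_lo, mid, t1, t2) + ext(mid + 1, node_hi, t1, t2)

r = 16
cover = ext(1, r, 4, 13)
print(cover)  # [(4, 4), (5, 8), (9, 12), (13, 13)] -- O(log r) disjoint pieces
assert sum(hi - lo + 1 for lo, hi in cover) == 13 - 4 + 1
# The answer (S1 ∩ S2) ∩ (T1 ∪ ... ∪ T_l) is then verified as the union of the
# l = O(log r) disjoint intersections S1 ∩ S2 ∩ T_i, one per canonical node.
```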
Corollary 5.3 Consider a collection of $m$ sets $S_1, \ldots, S_m$, let $M = \sum_{i=1}^m |S_i|$, $0 < \epsilon < 1$, and let $t_1, t_2, \ldots, t_r$ be discrete timestamps. For a query operation involving a time interval $[t_1, t_2]$, let $t$ be the number of involved sets, $N$ be the sum of the sizes of the involved sets, and $\delta$ be the answer size. There exists an authenticated data structure scheme TKS = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a timestamped keyword-search data structure $D$ with the following properties:

1. It is correct and secure according to Definitions 2.4 and 2.5 and based on the bilinear q-strong Diffie-Hellman assumption;
2. The access complexity of setup() is $O(m + r + M)$, outputting an authenticated data structure auth($D$) of $O(m + r + M)$ group complexity;
3. The access complexity of update() is $O(\log r)$, outputting information upd of $O(1)$ group complexity;
4. The access complexity of refresh() is $O(\log r)$;
5. For a timestamped keyword-search query $q$, algorithm query() has $O(N \log^2 N \log\log N + t(m + r)^{\epsilon}\log(m + r))$ access complexity, outputting a proof $\Pi(q)$ of $O(t \log r + \delta)$ group complexity;
6. For a timestamped keyword-search query, the access complexity of verify() is $O(t\log r + \delta)$.

Note that the above corollary does not include a result concerning the verification of union queries with timestamps. This is due to the following: using the same notation as for the intersection, the answer to the union query would be the set $(S_1 \cup S_2) \cap (T_1 \cup T_2 \cup \ldots \cup T_\ell)$. The nature of this answer does not allow for any further algebraic processing, and therefore, in order to authenticate the whole expression, one needs to verify the two unions separately. This leads to a solution that is not operation-sensitive (we recall that the size of our query is $O(t)$); therefore the operation-sensitive verification of this type of queries cannot be achieved with our method—at least in a way similar to the techniques we have used so far. The same applies to difference queries.

5.8 Analysis

In this section we analyze the costs of our solution and compare with experimental results from other works. For bilinear maps and generic-group operations in the bilinear-map accumulator, we used the PBC library [1], a library for pairing-based cryptography, interfaced with C.

5.8.1 System setup

We choose our system parameters as follows. First of all, type A pairings are used, as described in [70]. These pairings are constructed on the curve $y^2 = x^3 + x$ over the base field $\mathbb{F}_q$, where $q$ is a prime number. The multiplicative cyclic group $G$ we are using is a subgroup of $E(\mathbb{F}_q)$, namely a subgroup of the points with coordinates in $\mathbb{F}_q$ that belong to the elliptic curve $E$. Therefore this pairing is symmetric. The order of $E(\mathbb{F}_q)$ is $q + 1$ and the order of the group $G$ is some prime factor $p$ of $q + 1$. The target group $\mathcal{G}$ of the bilinear map is a subgroup of $\mathbb{F}_{q^2}^{*}$. In order to instantiate type A pairings in the PBC library, we have to choose the sizes of the primes $q$ and $p$. The main constraint in choosing the bit-sizes of $q$ and $p$ is that we want to make sure that the discrete logarithm problem is difficult in $G$ (which has order $p$) and in $\mathbb{F}_{q^2}$. Typical values are 160 bits for $p$ and 512 bits for $q$, and we use these. Note that with this choice of parameters the size of the elements of $G$ (which have the form $(x, y)$, i.e., points on the elliptic curve) is 1024 bits. Finally, we assume that the accumulation tree built on top of the set digests has two levels, i.e., $\epsilon = 0.5$.

5.8.2 Communication cost

Here we analyze the communication cost of our scheme for an intersection of two sets. Assume the size of the reported intersection is $\delta$. According to the described query() algorithm for the intersection, the proof (apart from the answer itself) consists of the following values: (a) two subset witnesses, two completeness witnesses and two proofs (each proof consists of two proof elements of two group elements each); the size of all these elements, which are all elements of the group $G$, does not depend on the size of the intersection and equals $2 \times (1024 + 1024 + 4 \times 1024)/8 = 1536$ bytes; (b) the coefficients $b_i \in \mathbb{Z}_p$ (we recall $p$ is 160 bits long) of the intersection, for $i = 1, \ldots, \delta$, which have size $160\delta/8 = 20\delta$ bytes. Therefore the total communication cost is a linear function of $\delta$, namely $1536 + 20\delta$ (in bytes).
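The cost model above in executable form (a direct transcription of the byte counts just derived):

```python
# 2-intersection proof size: a fixed 1536 bytes of group elements plus 20 bytes
# (one 160-bit Z_p coefficient) per element of the reported intersection.
def proof_size_bytes(delta):
    fixed = 2 * (1024 + 1024 + 4 * 1024) // 8  # witnesses + proofs, in bytes
    return fixed + (160 // 8) * delta          # plus the delta coefficients

for delta in (0, 1, 10, 100, 1000):
    print(delta, proof_size_bytes(delta) / 1000, "KB")
# delta = 0, 1, 10, 100, 1000 -> 1.536, 1.556, 1.736, 3.536, 21.536 KB,
# matching the 1.53 / 1.55 / 1.73 / 3.53 / 21.53 KB entries of Table 5.2.
```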
We now compare the communication cost of our scheme with the analysis made in [79]. In Table 5.2 we compare with the results presented in Table IV of [79], where various set sizes $n_1$ and $n_2$ are used and the size of the intersection $\delta$ is always $0.01 n_2$. Note that in most cases our communication cost is much lower than the one reported in [79]. More importantly, it does not depend on the sizes of the sets participating in the intersection. In the cases where our cost is worse, this is due to the big constants introduced by the use of bilinear pairings and accumulators.

Table 5.2: Comparison of the 2-intersection communication overhead (proof size) of the scheme presented by Morselli et al. [79] with our scheme. Here $n_1$ and $n_2$ are the sizes of the sets that are intersected and $\delta$ is the size of the intersection.

  n1     | n2     | δ    | KB [79] | KB (this work)
  1000   | 1000   | 10   | 3.34    | 1.73
  1000   | 100    | 1    | 1.68    | 1.55
  1000   | 10     | 0    | 1.01    | 1.53
  1000   | 1      | 0    | 0.46    | 1.53
  10000  | 10000  | 100  | 26.88   | 3.53
  10000  | 1000   | 10   | 12.15   | 1.73
  10000  | 100    | 1    | 6.86    | 1.55
  10000  | 10     | 0    | 3.08    | 1.53
  100000 | 100000 | 1000 | 263.25  | 21.53
  100000 | 10000  | 100  | 116.13  | 3.53
  100000 | 1000   | 10   | 63.18   | 1.73
  100000 | 100    | 1    | 26.69   | 1.55

5.8.3 Verification cost

Let exp, mult, add be the times needed to perform an exponentiation, a multiplication and an addition, respectively, all modulo $p$. Let also EXP, MULT be the times required for an exponentiation and a multiplication in the group $G$, and let EXP', MULT' be the respective times in the target group $\mathcal{G}$ of the bilinear map. Finally, let MAP be the time needed to perform the pairing operation $e(\cdot,\cdot)$. We benchmarked all these operations using the PBC library [1] (version pbc-0.5.7) on a 64-bit, 2.8 GHz Intel-based, dual-core, dual-processor machine with 4 GB main memory, running Debian Linux, and derived the following times: MAP = 5 ms, MULT' = 0.005 ms, exp = 0.02 ms, add = 0.002 ms and mult = 0.002 ms.

We now analyze the verification cost of a 2-intersection in our scheme. Let $S_i$ and $S_j$ be the sets of the intersection. On input the proof, the verification algorithm has to perform the following tasks: (a) first it verifies the proofs $\Pi_i$ and $\Pi_j$, which requires two bilinear-map computations for each value, therefore taking time 4 MAP; (b) then algorithm certify() is executed, and the time needed for this part is $\delta(2\,\mathrm{mult} + 2\,\mathrm{add} + \mathrm{exp})$; (c) then the algorithm checks the subset condition, which takes time 4 MAP; (d) finally it checks the completeness condition, which takes time 2 MAP + MULT'. Therefore the total cost for the verification of a 2-intersection of size $\delta$ is
\[
10\,\mathrm{MAP} + \delta(2\,\mathrm{mult} + 2\,\mathrm{add} + \mathrm{exp}) + \mathrm{MULT}',
\]
which is a linear function of $\delta$, namely $50 + 0.028\delta$ (in ms).
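Again in executable form, using the benchmarked times above (all values in milliseconds):

```python
# Verification-time estimate for a 2-intersection of size delta, from the
# benchmarked PBC operation costs above (the pairings dominate).
MAP, MULT_T = 5.0, 0.005             # pairing; multiplication in target group
exp, add, mult = 0.02, 0.002, 0.002  # operations modulo p

def verify_time_ms(delta):
    """10 pairings + delta * (2 mult + 2 add + 1 exp) + one target-group mult."""
    return 10 * MAP + delta * (2 * mult + 2 * add + exp) + MULT_T

print(verify_time_ms(1000))  # ~78 ms for an intersection of size 1000
```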
Chapter 6

Optimality with multilinear forms

In the previous chapters of this thesis, we introduced authenticated data structure schemes based on several well-accepted cryptographic primitives such as accumulators, bilinear maps and lattices. Some of these schemes have desirable efficiency characteristics, such as operation sensitivity and parallel algorithms, which would not be achievable with traditional hash-based techniques. However, none of the authenticated data structure schemes presented so far is optimal (i.e., adding no extra asymptotic overhead to the respective plain data structure scheme), according to our natural definition of optimality given in Chapter 2 (see Definition 2.8).

One authenticated data structure scheme that is almost optimal is the scheme ASC, used for verifying set operations and presented in Chapter 5: although the verification and communication costs were shown to be optimal ($O(t + \delta)$), the query costs were increased by a polylogarithmic factor (see Theorem 5.1). As such, in this chapter we pose the following natural question: Can we construct an optimal authenticated data structure scheme? The answer is yes—but assuming the existence of a cryptographic primitive that does not exist yet! Moreover, and less importantly, the derived authenticated data structure scheme is not publicly verifiable, thus not allowing its use in a three-party protocol (Protocol 2.1). This shows the difficulty of the problem of achieving optimality in authenticated data structures.

To show the realization of an optimal authenticated data structure, we present an authenticated dictionary data structure that is based on a new cryptographic primitive proposed by Silverberg and Boneh, namely multilinear forms [19], the construction of which, however, remains an open problem to date. The use of such a primitive gives an authenticated dictionary with constant communication and constant verification complexity, while maintaining all other complexities logarithmic. To the best of our knowledge (see Table 6.1), this is the first optimal authenticated dictionary to appear in the literature, as it exactly matches the respective complexities (update, query, and communication complexity) of the optimal plain dictionary data structure (e.g., implemented as a red-black tree).

The multilinear form cryptographic primitive used in our construction can be described as the "multi" version of the well-known bilinear map. Although initially used to attack elliptic curve systems [76], bilinear maps (also extensively used in the previous chapters of this thesis), being in effect an efficient tool for solving the decisional Diffie-Hellman problem, eventually proved to be very useful in cryptography (e.g., [16, 17, 18]) after their first appearance in the literature for a "good purpose" [60]. However, the main limitation of bilinear maps is the fact that they cannot be applied twice, i.e., the output element cannot be fed back into the map $e(\cdot,\cdot)$ in an efficient way. Finding such maps, i.e., self-bilinear maps, which could be used recursively to construct multilinear forms, was recently proved to be infeasible for groups that are of interest in cryptography, i.e., groups where the computational Diffie-Hellman problem is hard [27].

However, since cryptographically interesting multilinear form generators—i.e., multilinear form generators for groups where the discrete logarithm problem is hard, e.g., elliptic curve groups; we call these generators admissible later in the chapter—are not known to exist to date, one can view our work from a different (and more theoretical) angle: a proof, through a complexity lower bound, of the nonexistence of optimal authenticated dictionaries would imply the nonexistence of cryptographically interesting multilinear form generators (see Theorem 6.2). This reveals yet another important relation between two fields—combinatorics and cryptography—and becomes more promising (towards proving the nonexistence of cryptographically interesting multilinear form generators) given recent advances in the derivation of general complexity lower bounds for memory checking [35] and authenticated data structures [106].

About multilinear forms.
Multilinear forms were proposed as a potentially useful tool in cryptography in 2003 by Silverberg and Boneh [19]. Since then, no efficient construction of cryptographic interest has appeared. A work similar in nature to ours, in which an efficient construction for a cryptographic application based on multilinear forms is presented, is that of Lee et al. [64]. The impossibility of deriving multilinear forms through self-bilinear maps is investigated by Cheon and Lee [27].

Table 6.1: Asymptotic access and group complexities of various authenticated data structure schemes for a dynamic dictionary storing $n$ elements, compared with the optimal authenticated dictionary MFD based on multilinear forms and derived in this chapter. Parameter $0 < \epsilon < 1$ is a constant and "M. q-DH" stands for "Multilinear q-strong Diffie-Hellman". The various acronyms used for variables and assumptions have all been defined in Table 3.1. Note that our construction requires two assumptions, namely M. q-DH and Generic CR.

  scheme           | setup() | update() | refresh() | query() | verify() | proof Π(q) | info. upd | public verif. | optimal | assumption
  [15, 48, 75, 81] | n       | log n    | log n     | log n   | log n    | log n      | 1         | yes           | no      | Generic CR
  [11]             | n       | 1        | 1         | n       | n        | n          | 1         | yes           | no      | D. Log
  [83]             | n       | 1        | n^ε       | 1       | 1        | 1          | 1         | yes           | no      | B. q-DH
  [23, 101]        | n       | 1        | n^ε log n | 1       | 1        | 1          | 1         | yes           | no      | Strong RSA
  [51]             | n       | n^ε      | n^ε       | n^ε     | 1        | 1          | n^ε       | yes           | no      | Strong RSA
  [90]             | n       | 1        | 1         | n       | 1        | 1          | 1         | yes           | no      | Strong RSA
  MFD              | n       | log n    | log n     | log n   | 1        | 1          | log n     | no            | yes     | M. q-DH, Generic CR

6.1 Dictionary data structure

In this chapter, the underlying data structure we are using (and for which we are designing an authenticated data structure scheme) is a dictionary. Let $X$ be a collection of $n$ elements from a totally-ordered universe $U$. Note that the total order is a requirement of the dictionary data structure, a property that distinguishes it from a hash table (and thus the difference in their complexities). The data structure scheme {update, query, check}, as defined in Definition 2.2, for a dictionary $D(X)$ is as follows:

1. $y \leftarrow$ query($a, b, D(X)$): Given two elements $a, b \in U$ with $a \leq b$, return the sorted list of successive elements $y = [y_1\ y_2 \ldots y_{w-1}\ y_w] \subseteq X$ such that $y_1 \leq a \leq y_2 \leq \ldots \leq y_{w-1} \leq b < y_w$. This is a general range search query; note that for $a = b$ this query reduces to a membership (or non-membership) query for $a$, outputting the interval of $X$ containing (or not containing) $a$. Answering a range search query can be implemented in $O(\log n + w)$ worst-case complexity with a red-black tree data structure [29], outputting an answer of size $O(w)$ (a small sketch of these semantics follows this list);

2. $D(X') \leftarrow$ update($x, D(X)$): Given an element $x \in U$ such that $x \notin X$, insert $x$ into $X$ and output $D(X')$; given an element $x \in U$ such that $x \in X$, delete $x$ from $X$ and output $D(X')$. Both insertions and deletions can be implemented in $O(\log n)$ worst-case complexity [29];

3. $\{$accept, reject$\} \leftarrow$ check($a, b, y, D(X)$): If $y = [y_1\ y_2 \ldots y_{w-1}\ y_w] \subseteq X$ is a sorted list of $w$ successive elements such that $y_1 \leq a \leq y_2 \leq \ldots \leq y_{w-1} \leq b < y_w$, return accept; else return reject.
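The range search semantics of item 1 can be sketched in a few lines of Python, with a sorted list standing in for the red-black tree (boundary cases where $a$ precedes the minimum or $b$ exceeds the maximum are handled by the $\pm\infty$ sentinels in the actual construction of Section 6.3):

```python
# Range search as defined above: for a <= b, return the successive stored
# elements y_1 <= a <= y_2 <= ... <= y_{w-1} <= b < y_w, i.e., the elements in
# [a, b] together with their two boundary neighbors.
from bisect import bisect_left, bisect_right

def range_query(sorted_elems, a, b):
    lo = bisect_left(sorted_elems, a) - 1   # predecessor of a (y_1)
    hi = bisect_right(sorted_elems, b)      # successor of b (y_w)
    return sorted_elems[max(lo, 0):hi + 1]

X = [3, 8, 15, 21, 34, 55]
print(range_query(X, 10, 40))  # [8, 15, 21, 34, 55]
```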
6.1.1 Non-optimal authenticated dictionaries

To verify range search queries on a dictionary of $n$ elements, we can use many authenticated data structure schemes extensively described in the literature, e.g., various hierarchical hashing constructions [15, 48, 75, 81, 92], the security of which is based on generic collision-resistant hashing. Let HBD = {genkey, setup, update, refresh, query, verify} be such a scheme, described below in Corollary 6.1. All these schemes are, however, non-optimal: when the output range is of size $w = o(\log n)$, the output proof complexity, as well as the verification complexity, are both $\Omega(\log n)$, thus not satisfying the definition of optimality (see Definition 2.8). Nevertheless, for reasons that will become clear later, we use such an authenticated dictionary in our construction:

Corollary 6.1 Let $k$ be the security parameter. Then there exists a non-optimal, publicly-verifiable authenticated data structure scheme HBD = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a dynamic dictionary $D$ storing $n$ elements such that:

1. It is correct according to Definition 2.4 and secure according to Definition 2.5, assuming the existence of generic collision-resistant hash functions;
2. The access complexity of setup() is $O(n)$, outputting an authenticated data structure auth($D$) of $O(n)$ group complexity;
3. The access complexity of update() is $O(\log n)$, outputting update information upd of $O(1)$ group complexity;
4. The access complexity of refresh() is $O(\log n)$;
5. For a range search query $q$ outputting an answer of size $w$ we have: (a) the access complexity of query() is $O(\log n + w)$; (b) the access complexity of verify() is $O(\log n + w)$; (c) the group complexity of the proof $\Pi(q)$ is $O(\log n + w)$.

We continue with the description of the cryptographic primitive to be used in our construction, the multilinear form.

6.2 Multilinear forms

Let $G$, $\mathcal{G}$ be two cyclic groups of prime order $p$ and let $g$ be a generator of $G$. We let the bit-size of $p$ (the order of both $G$ and $\mathcal{G}$) be polynomial in the security parameter $k$. We are now ready to define an admissible $t$-multilinear form. The definition is similar to the one presented in the original paper by Silverberg and Boneh [19]:

Definition 6.1 We say that a map $e : G^t \rightarrow \mathcal{G}$ is an admissible $t$-multilinear form if it satisfies the following properties:

1. $G$ and $\mathcal{G}$ are cyclic groups of the same prime order $p$;
2. The discrete logarithm problem is hard both in $G$ and $\mathcal{G}$;
3. For all $a_1, a_2, \ldots, a_t \in \mathbb{Z}_p^*$ and $x_1, x_2, \ldots, x_t \in G$ it is
\[
e(x_1^{a_1}, x_2^{a_2}, \ldots, x_t^{a_t}) = e(x_1, x_2, \ldots, x_t)^{a_1 a_2 \cdots a_t} \in \mathcal{G};
\]
4. The map is non-degenerate: if $g \in G$ generates $G$, then $e(g, g, \ldots, g) \in \mathcal{G}$ generates $\mathcal{G}$.

We call groups $G$ and $\mathcal{G}$ for which there exists an admissible $t$-multilinear form admissible $t$-multilinear groups.

Definition 6.2 An admissible $t$-multilinear form generator is a probabilistic polynomial-time algorithm that takes as input a natural number $t$ and the security parameter $1^k$ and outputs a uniformly random tuple of multilinear pairing parameters $(p, G, \mathcal{G}, e, g)$, where $G$ and $\mathcal{G}$ are admissible $t$-multilinear groups for which there exists an admissible $t$-multilinear form $e(\cdot, \ldots, \cdot) : G^t \rightarrow \mathcal{G}$ of $t$ inputs.

To prove the security of our construction, we are going to use the following assumption, which can be described as the "multi" version of the bilinear q-strong Diffie-Hellman assumption (see Assumption 3.2):

Assumption 6.1 (Multilinear q-strong Diffie-Hellman assumption) Let $k$ be the security parameter and let $(p, G, \mathcal{G}, e, g)$ be a uniformly randomly generated tuple of multilinear pairing parameters, output by an admissible $t$-multilinear form generator. Given the elements $g, g^s, \ldots, g^{s^q} \in G$, for some $s$ chosen at random from $\mathbb{Z}_p^*$ and $q = \mathrm{poly}(k)$, there is no polynomial-time algorithm that can output a pair $(a, e(g, g, \ldots, g)^{1/(s+a)}) \in \mathbb{Z}_p \times \mathcal{G}$, except with negligible probability neg($k$).
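Since no admissible multilinear form generator is known, one cannot give real code for Definition 6.1. The toy Python model below represents group elements by their discrete logarithms, which makes the multilinear identity of Property 3 easy to check—and, of course, makes the discrete logarithm problem trivially easy, so it has no cryptographic value whatsoever:

```python
# *Toy* model of a t-multilinear form, for experimenting with the algebra only:
# group elements are stored as their discrete logs mod p, so the identity
# e(x1^{a1}, ..., xt^{at}) = e(x1, ..., xt)^{a1*...*at} holds by construction.
p = 1_000_003  # illustrative prime group order

class Elem:
    """An element g^x of G (or e(g,...,g)^x of the target group), stored as x."""
    def __init__(self, x): self.x = x % p
    def __pow__(self, a): return Elem(self.x * a)
    def __eq__(self, other): return self.x == other.x

g = Elem(1)

def e(*elems):
    """Toy t-multilinear form: e(g^{x1}, ..., g^{xt}) = e(g,...,g)^{x1*...*xt}."""
    out = 1
    for el in elems:
        out = out * el.x % p
    return Elem(out)

# The defining Property 3, for t = 3:
x1, x2, x3 = Elem(17), Elem(23), Elem(31)
assert e(x1 ** 5, x2 ** 7, x3 ** 2) == e(x1, x2, x3) ** (5 * 7 * 2)
```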
6.3 An optimal authenticated dictionary

In this section we describe an authenticated dictionary data structure scheme, based on admissible multilinear form generators, that achieves communication and verification complexity proportional to the size $w$ of the reported output range. More importantly, these complexities are combined with logarithmic update and query costs.

Let $X = \{x_1, x_2, \ldots, x_n\}$ be the set of elements from a totally-ordered universe $U$ contained in the dictionary, where $x_1 < x_2 < \ldots < x_n$. Each element is represented with $k/2$ bits. The actual set we are going to store, in order to also support efficient range search and non-membership queries, is the set of $k$-bit intervals $A = \{a_0, a_1, a_2, \ldots, a_n\}$, where $a_i = x_i \| x_{i+1}$ for $i = 1, \ldots, n-1$, $a_0 = -\infty \| x_1$, and $a_n = x_n \| {+\infty}$. Note that the total order on $X$ imposes a natural total order on $A$. In our construction we use a red-black tree with data at the leaves (i.e., the data of the internal nodes navigates the searches and does not correspond to actual data) [29].

We now describe MFD = {genkey, setup, update, refresh, query, verify}, our authenticated data structure scheme for a dictionary data structure on the totally-ordered set $X = \{x_1, x_2, \ldots, x_n\}$. We note that the construction uses several features of the accumulator constructions of Chapter 3, as well as the authenticated data structure scheme HBD for a dictionary from Corollary 6.1. Again, the actual set on which we build our data structure is the set of intervals $A = \{a_0, a_1, a_2, \ldots, a_n\}$.

Algorithm $\{sk, pk\} \leftarrow$ genkey($1^k$): Using an admissible $k$-multilinear form generator (Definition 6.2, with $t = k$), the algorithm outputs a $k$-bit prime $p$, admissible $k$-multilinear groups $G$ and $\mathcal{G}$, a generator $g \in G$ of $G$, and an admissible $k$-multilinear form $e(\cdot, \ldots, \cdot) : G^k \rightarrow \mathcal{G}$. (Note that the number of inputs of the multilinear form is equal to the security parameter $k$: our construction requires $O(\log n)$ inputs for the multilinear form and, since we are in the computational model, it is always $\log n < k$.) Then it randomly picks a number $s \in \mathbb{Z}_p^*$ ($s$ is the trapdoor). An upper bound $q$ on the total number of elements to be stored in the data structure is decided, and the algorithm also computes the elements $g^s, g^{s^2}, \ldots, g^{s^q}$ of $G$. It also calls $\{sk, pk\} \leftarrow$ HBD.genkey($1^k$). The algorithm outputs $s \in \mathbb{Z}_p^*$ and HBD.$sk$ as $sk$, and everything else as $pk$.

Algorithm $\{\mathrm{auth}(D_0), d_0\} \leftarrow$ setup($D_0, sk, pk$): Let $D_0 = A = \{a_0, a_1, a_2, \ldots, a_n\}$ be the set of sorted intervals that corresponds to the underlying set of elements $X = \{x_1, x_2, \ldots, x_n\}$. Initially, the algorithm computes
\[
\mathrm{acc}(A) = g^{(a_0+s)(a_1+s)\cdots(a_n+s)} \in G. \tag{6.1}
\]
Let now $T$ be the red-black tree built on top of the intervals $a_i$, $i = 0, \ldots, n$. Note that there is a natural notion of order imposed on $A$, based on the order imposed on $X$. Let $v_0, v_1, \ldots, v_n$ be the leaves of the tree, storing the intervals $a_0, a_1, \ldots, a_n$, respectively. We define the label of $v_i$ as
\[
\mathrm{label}(v_i) = g^{a_i + s} \in G \quad \text{for all } i = 0, \ldots, n. \tag{6.2}
\]
Also, let $v_B$ be the internal node of $T$ that is the root of the subtree $T_B$ of $T$ containing the elements of some subset $B \subseteq A$. For every internal node $v_B$ (and for the root of the tree as well), the algorithm sets
\[
\mathrm{label}(v_B) = g^{\prod_{a \in B}(a+s)} \in G. \tag{6.3}
\]
All the labels label($\cdot$) are stored with the tree $T$.
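A small Python sketch of this label computation may be useful (toy field arithmetic in the exponent, with the trapdoor $s$ in the clear, as is available to setup(); the interval encoding and sentinel values are made-up stand-ins for $x_i \| x_{i+1}$ and $\pm\infty$):

```python
# Sketch of setup(): build the interval set A from X and compute the exponent
# prod_{a in B}(a + s) bottom-up, as in the labels label(v_B) = g^{prod (a+s)}.
p = 2**31 - 1
s = 123456789            # the trapdoor; known only to genkey()/update()/verify()

X = [5, 9, 42]           # sorted element collection
INF = p - 1              # illustrative stand-in for the +infinity encoding
A = [(0, X[0])] + [(X[i], X[i + 1]) for i in range(len(X) - 1)] + [(X[-1], INF)]

def encode(interval):
    """Map an interval x_i || x_{i+1} to a field element (illustrative)."""
    lo, hi = interval
    return (lo * 2**16 + hi) % p

def label_exponent(intervals):
    """Exponent of label(v_B): product over the subtree's intervals of (a + s)."""
    if len(intervals) == 1:
        return (encode(intervals[0]) + s) % p
    mid = len(intervals) // 2
    return label_exponent(intervals[:mid]) * label_exponent(intervals[mid:]) % p

# The root label's exponent is that of the accumulation value acc(A):
print(hex(label_exponent(A)))
```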
Subsequently, the algorithm calls HBD.setup($A, sk, pk$), building in this way a new hashing-based authenticated dictionary on top of $A$. It sets $d_0 = \{\mathrm{acc}(A), \mathrm{hash}(A)\}$, where hash($A$) = HBD.$d_0$. The structure auth($D_0$) contains the tree $T$, as well as the authenticated structure HBD.auth($A$). We now make the following important remark:

Remark 6.1 The hashing scheme employed by HBD includes in the hashing computation all the labels label($v$) (defined in Relation 6.3) of the internal nodes $v$. Namely, the hash value $h_v$ at some internal node $v$ that has left child $u$ and right child $w$ is computed as
\[
h_v = h(h_u \,\|\, v \,\|\, \mathrm{label}(v) \,\|\, h_w),
\]
where $h(\cdot)$ is the generic collision-resistant hash function used (e.g., SHA-2).

Lemma 6.1 Algorithm setup() of the authenticated data structure scheme MFD has $O(n)$ access complexity. Moreover, the authenticated data structure auth($D_0$) output by setup() has $O(n)$ group complexity.

Proof: First of all, HBD.setup() has $O(n)$ complexity, outputting an authenticated data structure of $O(n)$ group complexity, by Corollary 6.1. Denote now with $v_B$ an internal node of the tree $T$ that is the root of a subtree $T_B$ storing the subset $B \subseteq A$. For each leaf $v_i$ of the tree, the algorithm computes $P_i = a_i + s$ and then sets label($v_i$) = $g^{P_i}$. Note now that for every internal node $v_B$ with left child $v_{B_1}$ and right child $v_{B_2}$, it is $P_B = P_{B_1} P_{B_2}$ and label($v_B$) = $g^{P_B}$, since $B = B_1 \cup B_2$. The described recursive computation has $O(n)$ complexity (a postorder traversal of $T$). Moreover, since we store one label for each node of $T$, the total group complexity of the labels is $O(n)$. This completes the proof. □

Algorithm $\{D_{h+1}, \mathrm{auth}(D_{h+1}), d_{h+1}, \mathrm{upd}\} \leftarrow$ update($u, D_h, \mathrm{auth}(D_h), d_h, sk, pk$): We distinguish two cases (a small sketch of this reduction follows below):

(1) Insertion of an element $x \in X$: Let $a_1 = u\|z \in A$ be the interval stored in the dictionary such that $u < x < z$. Then the insertion of $x$ is equivalent to deleting the interval $a_1$ and inserting the intervals $b_1 = u\|x$ and $b_2 = x\|z$;

(2) Deletion of an element $x \in X$: Let $a_1 = u\|x \in A$ and $a_2 = x\|z \in A$ be the successive intervals stored in the dictionary. Then the deletion of $x$ is equivalent to deleting the intervals $a_1$ and $a_2$ and inserting the interval $b = u\|z$.

We have therefore reduced an update of an element of $X$ to a constant number of updates of intervals of $A$; we thus continue the description of the update algorithms with reference to intervals. Let $a$ be the interval of the update $u$. Interval $a$ defines a logarithmic number of nodes of $T$ that need to be accessed and modified in order for the update to be performed (this follows from the red-black tree properties). Let $p(a)$ be the set of those nodes. For every node $v \in p(a)$, the algorithm updates the label label($v$) and outputs the updated labels as the information upd. The algorithm also stores the new (updated) labels on the tree $T$, which is updated to $T'$. Finally, the algorithm calls $\{A', \mathrm{auth}(A'), d'\} \leftarrow$ HBD.update($u, A, \mathrm{auth}(A), d_h, sk, pk$) and outputs the following structures:

1. The new digest $d_{h+1}$, which contains $\mathrm{acc}(A') = \mathrm{acc}(A)^{(a+s)}$ (in the case of a deletion, $\mathrm{acc}(A') = \mathrm{acc}(A)^{(a+s)^{-1}}$) and the new digest hash($A'$) = HBD.$d'$, as output by calling HBD.update();
2. The new authenticated data structure auth($D_{h+1}$), which contains $T'$ and auth($A'$);
3. The information upd, which contains the updated labels.
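The interval reduction of update() in a few lines of Python (each resulting interval insertion or deletion then contributes a factor $(a+s)$ or $(a+s)^{-1}$ to the exponent of acc($A$)); intervals are modeled as pairs with illustrative sentinel endpoints:

```python
# Reduction of element updates to O(1) interval updates, as described above.
def insert_element(A, x):
    """Insert x: replace the interval (u, z) with u < x < z by (u, x), (x, z)."""
    (u, z) = next(iv for iv in A if iv[0] < x < iv[1])
    A.remove((u, z))
    A += [(u, x), (x, z)]

def delete_element(A, x):
    """Delete x: replace (u, x) and (x, z) by the single interval (u, z)."""
    (u, _) = next(iv for iv in A if iv[1] == x)
    (_, z) = next(iv for iv in A if iv[0] == x)
    A.remove((u, x)); A.remove((x, z))
    A.append((u, z))

A = [(0, 5), (5, 9), (9, 99)]    # X = {5, 9} with sentinel endpoints 0 and 99
insert_element(A, 7)
assert sorted(A) == [(0, 5), (5, 7), (7, 9), (9, 99)]
delete_element(A, 7)
assert sorted(A) == [(0, 5), (5, 9), (9, 99)]
```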
Lemma 6.2 Algorithm update() of the authenticated data structure scheme MFD has $O(\log n)$ access complexity. Moreover, the update information upd output by update() has $O(\log n)$ group complexity.

Proof: This result follows from the properties of the red-black tree [29]: only a logarithmic number of nodes change during a red-black tree update. Moreover, the label label($v$) of each such node $v$ can be updated with $O(1)$ complexity, since update() has access to the secret key $s$. Finally, by Corollary 6.1, HBD.update() has $O(\log n)$ access complexity. This makes the total update complexity $O(\log n)$. □

Algorithm $\{D_{h+1}, \mathrm{auth}(D_{h+1}), d_{h+1}\} \leftarrow$ refresh($u, D_h, \mathrm{auth}(D_h), d_h, \mathrm{upd}, pk$): The algorithm updates $T$ to $T'$ by using the information contained in upd. (Algorithm refresh() could perform this task without access to the information upd; however, since the algorithm does not have access to the secret key $sk$, this would require linear complexity.) Finally, the algorithm calls $\{A', \mathrm{auth}(A'), d'\} \leftarrow$ HBD.refresh($u, A, \mathrm{auth}(A), d_h, \mathrm{upd}, pk$) and outputs the following structures:

1. The new digest $d_{h+1}$, which contains acc($A'$) (contained in upd) and the new digest hash($A'$) = HBD.$d'$, as output by calling HBD.refresh();
2. The new authenticated data structure auth($D_{h+1}$), which contains $T'$ and auth($A'$).

Lemma 6.3 Algorithm refresh() of the authenticated data structure scheme MFD has $O(\log n)$ access complexity.

Proof: The algorithm performs a computation proportional to the size of the information upd; therefore, by Lemma 6.2, this part has $O(\log n)$ access complexity. Moreover, by Corollary 6.1, HBD.refresh() has $O(\log n)$ access complexity. Summing up, the total access complexity of the algorithm is $O(\log n)$. □

6.3.1 Dictionary queries and verification

In this section we show the construction of proofs for dictionary queries using the authenticated data structure scheme MFD. As defined in Section 6.1, a dictionary range search query is described by two arguments $a, b \in U$ with $a \leq b$. Moreover, the answer to the query is the sorted list of $w$ successive elements $y = [y_1\ y_2 \ldots y_{w-1}\ y_w] \subseteq X$ such that $y_1 \leq a \leq y_2 \leq \ldots \leq y_{w-1} \leq b < y_w$. For reasons to be made clear later, we distinguish two cases:

• If $w = \Omega(\log n)$, the proof is constructed by using the authenticated data structure scheme HBD. In this case, the size of the answer is $\Omega(\log n)$, and therefore the logarithmic-sized proofs of HBD (see Corollary 6.1) achieve optimality, according to Definition 2.8;

• If $w = o(\log n)$, the proof needs to be constructed in a different way, so that optimality can be achieved. Specifically, as we will see later, optimality will be achieved only for $w = O(1)$. This is where the multilinear forms need to be employed.

We continue with the formal description of algorithm query().

Algorithm $\{\Pi(q), \alpha(q)\} \leftarrow$ query($q, D_h, \mathrm{auth}(D_h), pk$): Let the query $q$ be two elements $a, b \in U$ with $a \leq b$, and suppose the answer to the query is the sorted list of $w$ successive elements $y = [y_1\ y_2 \ldots y_{w-1}\ y_w] \subseteq X$ such that $y_1 \leq a \leq y_2 \leq \ldots \leq y_{w-1} \leq b < y_w$. If $w = \Omega(\log n)$, let HBD.$\Pi(q)$ and HBD.$\alpha(q)$ be the proof and the answer, respectively, output by calling HBD.query($q, D_h, \mathrm{auth}(D_h), pk$), where $q$ contains the intervals $y_1\|y_2$ and $y_{w-1}\|y_w$ (note that $y_1\|y_2 \leq y_{w-1}\|y_w$). Then we set $\Pi(q)$ = HBD.$\Pi(q)$; namely, in this case the proof is constructed by using the authenticated data structure scheme HBD.

Let us now examine the most interesting case, where $w = o(\log n)$. In this case the proof $\Pi(q)$ consists of $w - 1$ group elements of $\mathcal{G}$, namely the witnesses
\[
W_{a_i} = e(g, g, \ldots, g)^{\prod_{a \in A - \{a_i\}}(a+s)} \in \mathcal{G}, \tag{6.4}
\]
where $a_i = y_i \| y_{i+1}$, for $i = 1, \ldots, w - 1$.
Note that $W_{a_i}$ is similar to the membership witness for accumulators, described in Chapter 3.

Lemma 6.4 Algorithm query() of the authenticated data structure scheme MFD has $O(\log n + w)$ access complexity when $w = \Omega(\log n)$ and $O(w \log n)$ access complexity when $w = o(\log n)$. Moreover, in both cases it outputs a proof $\Pi(q)$ of $O(w)$ group complexity.

Proof: If $w = \Omega(\log n)$, then the authenticated data structure scheme HBD, by Corollary 6.1, outputs proofs of $O(\log n + w)$ group complexity with $O(\log n + w)$ access complexity; since $w = \Omega(\log n)$, it is $O(\log n + w) = O(w)$. For the case $w = o(\log n)$, note that each witness $W_{a_i}$ in Relation 6.4 can be constructed with $O(\log n)$ access complexity: let $v_{i0}, v_{i1}, \ldots, v_{il}$ be the path in the tree $T$ from the leaf node storing the interval $a_i$ to the root $v_{il}$ of $T$, where $l = O(\log n)$. Let also $w_{i0}, w_{i1}, \ldots, w_{i(l-1)}$ be the sibling nodes of $v_{i0}, v_{i1}, \ldots, v_{i(l-1)}$, respectively (note that $w_{i0}$ might not exist). By the construction of the tree $T$ (it can be viewed as a segment tree [97]), for each $j = 0, 1, \ldots, l-1$ it is
\[
\mathrm{label}(w_{ij}) = g^{P_{ij}}, \tag{6.5}
\]
where
\[
\prod_{j=0}^{l-1} P_{ij} = \prod_{a \in A - \{a_i\}}(a+s). \tag{6.6}
\]
This means that information about the whole set can be retrieved by accessing $O(\log n)$ memory locations. Therefore, the algorithm constructs the witness $W_{a_i}$ by computing
\[
e\left(\mathrm{label}(w_{i0}), \ldots, \mathrm{label}(w_{i(l-1)}), g, \ldots, g\right) = e\left(g^{P_{i0}}, \ldots, g^{P_{i(l-1)}}, g, \ldots, g\right) = e(g, \ldots, g)^{\prod_{j=0}^{l-1} P_{ij}} = e(g, \ldots, g)^{\prod_{a \in A - \{a_i\}}(a+s)} = W_{a_i}.
\]
The above four equalities follow from Relation 6.5, the properties of the multilinear form $e(\cdot, \ldots, \cdot)$, and Relations 6.6 and 6.4, respectively. Since computing one such witness $W_{a_i}$ requires $O(\log n)$ inputs from the authenticated data structure, and since the proof construction requires the computation of $w - 1$ such witnesses, we conclude that for the case $w = o(\log n)$, computing the proof has $O(w \log n)$ access complexity. Finally, since the proof contains $w - 1$ such witnesses (which are elements of $\mathcal{G}$), we conclude that the group complexity of the proof in the case $w = o(\log n)$ is also $O(w)$. □
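The exponent bookkeeping of Lemma 6.4 can be checked numerically: the sibling subtrees along the leaf-to-root path partition $A - \{a_i\}$, so the product of their exponents $P_{ij}$ equals $\prod_{a \in A - \{a_i\}}(a+s)$. A Python sketch over a complete binary tree, with toy values standing in for the encoded intervals:

```python
# Key step of Lemma 6.4 in exponent arithmetic: along the leaf-to-root path of
# a_i, the sibling subtrees partition A - {a_i}. Toy parameters only.
p, s = 2**31 - 1, 987654321
A = [11, 22, 33, 44, 55, 66, 77, 88]   # encoded intervals at the leaves
i = 5                                   # query the leaf storing A[5]

def subtree_exponent(lo, hi):           # prod_{j in [lo, hi)} (A[j] + s)
    out = 1
    for a in A[lo:hi]:
        out = out * ((a + s) % p) % p
    return out

# Walk from leaf i to the root of the complete binary tree, multiplying in the
# exponent of each sibling subtree's label:
product, lo, hi = 1, i, i + 1
while (lo, hi) != (0, len(A)):
    size = hi - lo
    if lo % (2 * size) == 0:            # current node is a left child
        product = product * subtree_exponent(hi, hi + size) % p
        hi += size
    else:                               # current node is a right child
        product = product * subtree_exponent(lo - size, lo) % p
        lo -= size

expected = subtree_exponent(0, i) * subtree_exponent(i + 1, len(A)) % p
assert product == expected              # = prod_{a != a_i} (a + s)
```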
Lemma 6.6 The authenticated data structure scheme MFD = {genkey, setup, update, refresh, query, verify} is correct according to Definition 2.4.

Proof: Let D_0 be any dictionary storing the collection of intervals A, corresponding to an elements collection X (of n elements) from a totally-ordered universe U. Fix the security parameter k and output pk = {G, 𝔾, e(·, ..., ·), p} and sk = s ∈ Z_p^* by calling algorithm genkey(). Then output an authenticated data structure auth(D_0) and the respective digest d_0, by calling algorithm setup(). Pick a polynomial number of updates (namely, pick a polynomial number of elements from U for insertion or deletion) and update auth(D_0) and d_0 by calling algorithm refresh(). Let D_h be the final dictionary, auth(D_h) be the produced authenticated data structure and d_h be the final digest. Now let q be a range search query corresponding to elements a and b from U with a ≤ b. Algorithm query() outputs an answer α(q) which is the sorted list of w successive elements y = [y_1 y_2 ... y_{w−1} y_w] ⊆ X (note that [y_1||y_2, y_2||y_3, ..., y_{w−1}||y_w] ⊆ A) such that y_1 ≤ a ≤ y_2 ≤ ... ≤ y_{w−1} ≤ b < y_w. Let also Π(q) be the proof output by algorithm query(). We distinguish two cases:

1. If w = Ω(log n), the proof Π(q) is computed by algorithm HBD.query(). By the correctness of the authenticated data structure scheme HBD, verify() does not reject in this case;

2. If w = o(log n), the proof Π(q) consists of the witnesses W_{a_i} for i = 1, ..., w − 1, where a_i = y_i||y_{i+1}. Algorithm verify() does not reject since

$$W_{a_i}^{(a_i + s)} = \left(e(g, \ldots, g)^{\prod_{a \in A - \{a_i\}} (a+s)}\right)^{(a_i + s)} = e(g, \ldots, g)^{\prod_{a \in A} (a+s)} = e(\mathsf{acc}(A), g, \ldots, g)\,,$$

by the definition of W_{a_i} in Relation 6.4 and since acc(A) is always maintained to be the accumulation of all the intervals in A through algorithm refresh() (see Relation 6.1).

This completes the proof. □
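The proof relies on acc(A) being kept up to date across updates (Relation 6.1). At the exponent level, this maintenance follows the standard accumulator pattern: an insertion multiplies the accumulated product by (x + s), and a deletion divides it out, which requires the secret s. The following is a minimal sketch under illustrative parameters; the actual update()/refresh() algorithms additionally maintain the tree T and the HBD structure.

```python
# Exponent-level maintenance of acc(A) under updates (illustrative p, s).
p, s = 1009, 123

def acc_exponent(A):
    """Exponent of acc(A) = g^{prod_{a in A} (a + s)}."""
    e = 1
    for a in A:
        e = e * (a + s) % p
    return e

def insert(acc, x):
    """Insertion of interval x: one multiplication by (x + s)."""
    return acc * (x + s) % p

def delete(acc, x):
    """Deletion of interval x: divide (x + s) out, using the secret s
    to invert it modulo the (toy) group order p."""
    return acc * pow(x + s, -1, p) % p

acc = acc_exponent([17, 42, 77])
acc = insert(acc, 95)
assert acc == acc_exponent([17, 42, 77, 95])
acc = delete(acc, 42)
assert acc == acc_exponent([17, 77, 95])
```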
Lemma 6.7 The authenticated data structure scheme MFD = {genkey, setup, update, refresh, query, verify} is secure according to Definition 2.5.

Proof: Fix the security parameter k and output pk = {G, 𝔾, e(·, ..., ·), p} and sk = s ∈ Z_p^* by calling algorithm genkey(). Let Adv be a polynomially-bounded adversary. Adv picks an initial collection of n elements X, all belonging to a totally-ordered universe U. Let A be the respective collection of intervals, stored in a dictionary D_0. Adv outputs an authenticated data structure auth(D_0), by calling algorithm setup() through oracle access. Then Adv picks a polynomial number of updates (namely, he picks a polynomial number of elements from U for insertion or deletion). Let D_h be the final dictionary after the updates, let the updated final collection of intervals and elements be A and X respectively, and let d_h be the final digest as produced by the adversary through oracle access to algorithm update(). Let q be a dictionary query picked by the adversary, consisting of two elements a, b ∈ U, with a ≤ b. Suppose the adversary outputs an incorrect answer α which is however the sorted list of w successive elements [z_1 z_2 ... z_{w−1} z_w] such that z_1 ≤ a ≤ z_2 ≤ ... ≤ z_{w−1} ≤ b < z_w, and a respective proof Π. We will compute the probability that check(q, α, D_h) rejects, while verify(q, α, Π, d_h, sk, pk) accepts, as required by Definition 2.5.

If w = Ω(log n), by the security of the scheme HBD the event in question happens with probability neg(k). If w = o(log n), the proof Π consists of w − 1 witnesses W_1, W_2, ..., W_{w−1}, each one referring to the intervals b_1, b_2, ..., b_{w−1}, where b_i = z_i||z_{i+1}, respectively. Since the answer α is not correct, it should be the case that there exists b_i ∉ A (note that b_i ∉ A is equivalent to either adding extra elements to the reported range or omitting certain elements from the reported range) such that

$$W_i^{(b_i + s)} = e(g, \ldots, g)^{(a_0 + s)(a_1 + s)(a_2 + s)\cdots(a_n + s)}\,,$$

where A = {a_0, a_1, ..., a_n}. Since b_i ∉ {a_0, a_1, a_2, ..., a_n}, we can write

$$(a_0 + s)(a_1 + s)\cdots(a_n + s) = P(s)(b_i + s) + \lambda\,,$$

where the coefficients of polynomial P and the quantity λ ≠ 0 (nonzero precisely because b_i ∉ A) are computable in time polynomial in n, by polynomial division. Therefore the adversary Adv can compute

$$e(g, \ldots, g)^{\frac{1}{b_i + s}} = \left[W_i \cdot e(g, \ldots, g)^{-P(s)}\right]^{\lambda^{-1}}\,,$$

since e(g, ..., g)^{s^i} ∈ 𝔾 can efficiently be computed from g^{s^i} ∈ G by using the admissible multilinear form e : G^k → 𝔾, for all i = 0, ..., q. However, by Assumption 6.1, this happens with probability neg(k). □

We note here that the second part of the proof of security above (the case w = o(log n)) follows the same logic as the security proof of the bilinear-map accumulator in Lemma 5.1. However, in a multilinear setting, where e(·, ..., ·) is not used for verification, if we are to use only Assumption 6.1 we cannot prove security for subsets of elements, but only for one element. Proving security for subsets of elements in the multilinear setting would require a stronger assumption. Moreover, due to this limitation, we will only be able to prove optimality of the presented authenticated data structure scheme MFD for specific values of w, i.e., for w = O(1) or w = Ω(log n).
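The polynomial division invoked above is elementary: dividing ∏_j (x + a_j) by (x + b_i) leaves a nonzero remainder λ exactly when b_i is not a root, i.e., when b_i ∉ A. The following self-contained sketch, with a toy prime p and illustrative values for A and b, computes P and λ by synthetic division and checks the decomposition at a sample point.

```python
# Synthetic division over Z_p: write prod_j (x + a_j) = P(x)(x + b) + lam.
# Toy prime p and illustrative values; lam != 0 precisely because b is
# not in A (the remainder equals the product evaluated at x = -b).
p = 1009
A = [17, 42, 77, 95]          # accumulated intervals a_0, ..., a_n
b = 33                        # forged interval b_i, with b not in A

# Coefficients of f(x) = prod (x + a), highest degree first.
coeffs = [1]
for a in A:
    shifted = coeffs + [0]                        # coeffs * x
    scaled = [0] + [c * a % p for c in coeffs]    # coeffs * a
    coeffs = [(u + v) % p for u, v in zip(shifted, scaled)]

# Divide f(x) by (x + b), i.e., by (x - r) with r = -b mod p.
r, quotient, rem = (-b) % p, [], 0
for c in coeffs:
    rem = (rem * r + c) % p
    quotient.append(rem)
lam = quotient.pop()          # remainder lambda = f(-b) mod p

def eval_poly(cs, x):
    v = 0
    for c in cs:
        v = (v * x + c) % p
    return v

x0 = 5                        # spot-check the decomposition at one point
assert lam != 0
assert eval_poly(coeffs, x0) == (eval_poly(quotient, x0) * (x0 + b) + lam) % p
```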
Lemma 6.8 For range search queries outputting an answer of size w such that w = O(1) or w = Ω(log n), the authenticated data structure scheme MFD = {genkey, setup, update, refresh, query, verify} is optimal according to Definition 2.8.

Proof: According to Definition 2.8, the authenticated data structure scheme MFD for a dictionary D of n elements is optimal as long as w = O(1) or w = Ω(log n). This is because all of the following are true:

1. The authenticated data structure scheme MFD is correct and secure, by Lemmata 6.6 and 6.7 respectively;

2. For the group complexity of the authenticated data structure we have |auth(D)| = |D| = O(n), by Lemma 6.1;

3. By Lemmata 6.2 and 6.3, for the update access complexity we have |update()| + |upd| + |refresh()| = O(log n), which is proportional to the O(log n) update complexity of the underlying plain dictionary;

4. For the query access complexity, we distinguish two cases:

• When w = O(1) or w = Ω(log n), by Lemma 6.4, we have |query()| = O(log n + w), which is asymptotically equal to the O(log n + w) complexity of a range search query in the plain dictionary. So in this case optimality is achieved;

• When w = ω(1) and w = o(log n), by Lemma 6.4, we have |query()| = O(w log n), whereas the plain dictionary answers the query with O(log n + w) complexity. Since O(w log n) is not O(log n + w) in this regime, the query complexity constraint is not satisfied in this case.

5. For the group complexity of the proof we have |Π(q)| = O(|q| + |α(q)|) = O(w), by Lemma 6.4;

6. For the access complexity of the verification algorithm we have |verify()| = O(|q| + |α(q)|) = O(w).

Therefore the authenticated data structure scheme MFD is optimal for range search queries returning an answer of size w such that w = O(1) or w = Ω(log n). □

6.3.2 Main results

Theorem 6.1 Let k be the security parameter and assume the existence of an admissible Θ(k)-multilinear form generator, as defined in Definition 6.2. Then there exists an authenticated data structure scheme MFD = {genkey, setup, update, refresh, query, verify} for a data structure scheme defined for a dynamic dictionary D storing n elements such that:

1. It is correct according to Definition 2.4 and secure according to Definition 2.5, (i) under the multilinear q-strong Diffie-Hellman assumption and (ii) assuming the existence of generic collision-resistant hash functions;

2. It is optimal only for range search queries outputting an answer of size w such that w = O(1) or w = Ω(log n), according to Definition 2.8;

3. It is not publicly verifiable according to Definition 2.9;

4. The access complexity of setup() is O(n), outputting an authenticated data structure auth(D) of O(n) group complexity;

5. The access complexity of update() is O(log n), outputting update information upd of O(log n) group complexity;

6. The access complexity of refresh() is O(log n);

7. For a range search query q outputting an answer of size w we have:

(a) The access complexity of query() is O(log n + w) when w = Ω(log n) and O(w log n) when w = o(log n);

(b) The access complexity of verify() is O(w);

(c) The group complexity of the proof Π(q) is O(w).

Proof: This result follows directly from Lemmata 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7 and 6.8. Note that the scheme is not publicly verifiable since algorithm verify() requires the secret key as an input. Finally, we have to assume the existence of generic collision-resistant hash functions since we are using the authenticated data structure scheme HBD. □

We now present the final result of this chapter, which relates the optimality of an authenticated data structure scheme for a dictionary with the existence of admissible multilinear form generators.

Theorem 6.2 Let k be the security parameter and let D be a data structure scheme for a dictionary of n elements that supports range search queries q outputting an answer of size w such that w = O(1) or w = Ω(log n). If no optimal authenticated data structure scheme for D exists, then no admissible Θ(k)-multilinear form generator exists either.

Proof: Assume this is not the case, i.e., that an admissible Θ(k)-multilinear form generator does exist in the absence of an optimal authenticated data structure scheme for D. This is a contradiction, since we can use the construction of Theorem 6.1, which uses an admissible Θ(k)-multilinear form generator, to derive an optimal authenticated data structure scheme for D. □

Finally, we need to make the following important observation. Theorem 6.2 does not exclude the existence of some instance of a multilinear form, even in the absence of optimal authenticated dictionaries (say, for example, an instance of a multilinear form for three inputs). The result holds for all admissible Θ(k)-multilinear forms, where k is the security parameter.

6.3.3 Application in the two-party protocol

Due to the fact that the authenticated data structure scheme MFD is not publicly verifiable, it can only be used in a two-party protocol, since a three-party protocol always requires a publicly-verifiable authenticated data structure scheme (see Protocol 2.1).
However, in order to be able to use the authenticated data structure scheme MFD of Theorem 6.1 in a black-box way with Theorem 2.2, and so derive a two-party authenticated data structures protocol, we have to ensure that Assumption 2.1 holds for the authenticated data structure scheme MFD:

Lemma 6.9 Assumption 2.1 is true for the authenticated data structure scheme MFD. Moreover, for every update u, |Q_u| has O(1) complexity.

Proof: Let an update u refer to element e, i.e., either insert element e into the dictionary or delete element e from the dictionary. The respective set of queries Q_u required for Assumption 2.1 simply contains one query q for the range [e, e′] such that there are w = Ω(log n) elements between e and e′. Let {Π(q), α(q)} ← query(q, D_h, auth(D_h), pk). Since w = Ω(log n), Π(q) and α(q) are output by algorithm HBD.query(). We now describe the function z(·) from Assumption 2.1. Function z(·) extracts δ_u(D_h) and δ_u(auth(D_h)) from Π(q): Due to the hashing scheme employed in Remark 6.1, Π(q) contains all the structure of the red-black tree δ_u(D_h) that is accessed during update u, along with the labels label(·) (i.e., the ones that need to be updated by update()) that belong to the accessed authenticated data structure δ_u(auth(D_h)). Extracting that information has O(log n) complexity, equal to the verification complexity, as required by Assumption 2.1. This completes the proof. □

By Theorems 2.2 and 6.1 and Lemma 6.9, we can now state the final result for the two-party model:

Corollary 6.2 Let k be the security parameter and assume (i) the existence of an admissible Θ(k)-multilinear form generator; (ii) that the multilinear q-strong Diffie-Hellman assumption holds; and (iii) the existence of generic collision-resistant hash functions. Then there exists a two-party authenticated data structures protocol (see Protocol 2.2) for verifying range search queries q on a dynamic dictionary storing n elements, where w is the size of the answer to a range search query q, such that:

1. The protocol is interactive;

2. The setup at the client has O(n) access complexity;

3. The update at the client has O(log n) access complexity;

4. The verification at the client has O(w) access complexity;

5. The space needed at the client has O(1) group complexity;

6. The communication between the client and the server has O(log n) group complexity during updates and O(w) group complexity during queries;

7. The update at the server has O(log n) access complexity;

8. The query at the server has O(log n + w) access complexity when w = Ω(log n) and O(w log n) access complexity when w = o(log n);

9. The space needed at the server has O(n) group complexity;

10. For a query q sent by the client to the server at any time (even after updates), let α be an answer and let π be a proof returned by the server. With probability Ω(1 − neg(k)), the client accepts the answer α if and only if α is correct.
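To make the client/server roles of Protocol 2.2 concrete, the following toy puts the earlier exponent-level sketches together. It illustrates only the interaction pattern and is not the scheme itself: a real server would compute each witness from the public labels of Relation 6.5 in O(log n) time without knowing s (here the toy server is given s only because the group is not modeled), and the range-query semantics are simplified.

```python
# Toy two-party interaction: the client keeps O(1) state, the server
# stores the set and returns answers with witnesses.  All parameters
# are illustrative; no multilinear form (hence no group) is modeled.
p, s = 1009, 123

class Server:
    """Untrusted party: stores the outsourced set and serves queries."""
    def __init__(self, elements):
        self.A = sorted(elements)

    def range_query(self, lo, hi):
        answer = [a for a in self.A if lo <= a <= hi]
        # Witness exponents; the real server aggregates O(log n) public
        # tree labels instead (see the sketch after Lemma 6.5).
        witnesses = []
        for ai in answer:
            w = 1
            for a in self.A:
                if a != ai:
                    w = w * (a + s) % p
            witnesses.append(w)
        return answer, witnesses

class Client:
    """Keeps only the accumulation value of the outsourced set."""
    def __init__(self, elements):
        self.acc = 1
        for a in elements:
            self.acc = self.acc * (a + s) % p

    def verify(self, answer, witnesses):
        # Exponent-level analogue of Relations 6.7; needs the secret s,
        # mirroring why MFD is not publicly verifiable.
        return all(w * (a + s) % p == self.acc
                   for a, w in zip(answer, witnesses))

data = [17, 42, 77, 95]
server, client = Server(data), Client(data)
answer, wits = server.range_query(20, 80)
assert client.verify(answer, wits)          # honest answer accepted
assert not client.verify([18], wits[:1])    # forged element rejected
```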
6.4 Summary

In this chapter, we have presented the first optimal authenticated dictionary (Theorem 6.1) supporting range search queries that output answers of size w such that w = O(1) or w = Ω(log n), having verification and proof complexity equal to O(w) (as opposed to O(log n + w); see Table 6.1). Its design is based on multilinear forms, a recently-proposed cryptographic primitive [19] whose construction remains an open problem to date.

However, since multilinear forms are not known to exist yet, this work can be viewed from a different angle (Theorem 6.2): if one could prove that such optimal authenticated dictionaries cannot exist in the computational model, irrespective of cryptographic primitives, then our result would imply that certain admissible multilinear form generators cannot exist either (i.e., it can be viewed as a reduction). Thus, we provide an alternative avenue towards proving the nonexistence of multilinear form generators, in the context of general lower bounds for authenticated data structures [106] and for memory checking [35].

Chapter 7

Conclusions

This thesis studies the problem of efficiently verifying data and computations that are stored and performed, respectively, by untrusted parties. This research direction, lying within the framework of cloud cryptography, has become very relevant nowadays, given the great amount of information and computation that is outsourced to remote untrusted repositories, due to the increasing adoption of cloud computing in our everyday digital interactions. We therefore explore in depth the field of authenticated data structures, constructing a firm theoretical foundation in Chapter 2, and then continue, in the subsequent chapters, with the development of five different authenticated data structure schemes, each one constructed for a different problem. All our solutions are fully dynamic.

A common feature shared by all the authenticated data structures designed in this thesis is the use of advanced cryptography. A common goal of all the solutions has been to exploit the offered cryptographic tools in order to derive highly-desirable efficiency features, such as constant communication complexity (Chapter 3), parallel algorithms (Chapter 4), operation sensitivity (Chapter 5) and optimality (Chapter 6), which could not be achieved otherwise, e.g., with the use of traditional hash-based techniques. We prove the security of our constructions only under computational assumptions that are well-accepted by the cryptography community (e.g., the strong Diffie-Hellman and RSA problems and polynomial approximation of lattice problems).

                     HBD          RHT          BHT             LBT         MFD
                     [81]         Chapter 3    Chapter 3       Chapter 4   Chapter 6
setup()              n            n            n               n log n     n
update()             log n        1            1               1           log n
refresh()            log n        1            1               log n       log n
query()              log n        n            n               log n       log n
verify()             log n        1            1               log n       1
proof Π(q)           log n        1            1               log n       1
info. upd            1            1            1               1           log n
publicly verifiable  yes          yes          yes             yes         no
optimal              no           no           no              no          no
assumption           Generic CR   Strong RSA   Bilinear q-DH   GAPSVP      Multilinear q-DH
                                                                           and Generic CR

Table 7.1: Asymptotic access and group complexities of the authenticated data structure schemes presented in this thesis, applied to the fundamental problem of verifying read/write operations on an array of n entries, and compared with the first result on dynamic authenticated data structures by Naor and Nissim [81]. We note that, since all complexities for the plain table data structure are constant, no authenticated data structure scheme presented here is optimal. Moreover, based on the recent lower bound for memory checking by Dwork et al. [35], it seems unlikely that such a scheme could be derived.

The findings of this thesis indicate that understanding and employing advanced cryptographic tools can lead to significant complexity gains in authenticated data structures and, more generally, in verifiable computations.
Perhaps the most persuasive justification for the validity of this statement is Chapter 5 itself, where non-trivial computations over outsourced data (set operations) are verified with optimal costs, due to the use of bilinear maps. Moreover, this result provides evidence that more complicated functionalities (other than traditional set-membership computations) could possibly be verified efficiently in a public-key setting using authenticated data structures techniques, initiating in this fashion the quest for schemes that apply to other interesting problems (e.g., geometric computations).

7.1 Overview of thesis results and discussion

In this thesis, we observed that using different cryptography allows for various complexity trade-offs in authenticated data structures. In Table 7.1, we apply all our schemes (except for the scheme of Chapter 5) to the fundamental problem of verifying read/write operations on an array of n entries. This is a data structure where all our authenticated data structure schemes can be easily employed. Table 7.1 also includes a column referring to the seminal result by Naor and Nissim [81], where an authenticated dictionary based on a 2-3 tree implementation, and with logarithmic complexities, was presented.

From the results in Table 7.1, we draw the following conclusion: As of now, there is no optimal authenticated data structure (as defined in Definition 2.8) for the simplest functionality of reading and writing entries of a table (similar to the memory checking model). We note, however, that this does not come as a surprise: It would seem that deriving an optimal authenticated data structure scheme for a table (a RAM array) would violate the existing Ω(log n/ log log n) lower bounds that have appeared in the memory checking model [35]. This observation naturally raises an open problem: Can we design an authenticated table with Θ(log n/ log log n) complexities? Such a construction would potentially yield an optimal online memory checker and could be derived from the realm of more advanced cryptography.

7.2 Future work

It is our belief that providing security in the cloud is going to play a major role in adopting cloud computing as a new computing discipline. Concerning cloud integrity, future work includes a further investigation of the field of authenticated data structures. More specifically, one can focus on the verification of outsourced computations in an operation-sensitive way. Operation sensitivity, a crucial efficiency property, has so far been achieved in a practical and publicly-verifiable fashion only for specific computations, such as range search [52] and set operations (see Chapter 5). On the other hand, it has been shown that, in a privately-verifiable way and under certain assumptions, it is feasible for general computations, i.e., any boolean circuit, e.g., by using the model of outsourced verifiable computation [41]. However, these constructions are currently not very efficient. Aiming at publicly-verifiable solutions that could be used by cloud applications without changing the user experience, the question that arises is evident: Which outsourced computations (e.g., shortest paths) can be practically and publicly verified in an operation-sensitive way?

Another aspect of cloud security that can be investigated is cloud privacy, i.e., protecting the confidentiality of data that is stored remotely.
Resorting to a solution that merely encrypts our data before uploading it online defeats one of the main purposes of investing in cloud infrastructures: no advanced, meaningful outsourced computations can be performed on encrypted data. Achieving both goals, namely storing encrypted data and at the same time being able to do significant processing with it, was recently made possible with the proposal of a fully-homomorphic encryption scheme [43]. Implementing such a primitive, however, has not led to efficient solutions yet; it has, on the other hand, ignited a lot of enthusiasm for cloud privacy research. Our belief is that we have to settle for simpler and more efficient constructions that refer to specific functionalities, e.g., see the work on searchable symmetric encryption by Curtmola et al. [31]. Therefore, future directions could explore the computation of such specific functionalities (e.g., geometric queries, polynomial evaluation) on private data in an efficient way that will allow easy implementation and fast deployment.

Finally, another very interesting privacy topic that has emerged lately and lies at the intersection of algorithms and cryptography is the notion of data-oblivious algorithms [49, 109]. Data-oblivious algorithms do not perform any data-dependent operations, and therefore an adversary observing the flow of the circuit computation cannot distinguish between two different inputs. Applying oblivious algorithms in secure two-party computations can lead to considerable efficiency gains and practical protocols. This is because garbled circuits [113] can then be used only for primitive black boxes performing data-dependent operations (e.g., min, max), and not for the whole circuit. Recent results include highly efficient protocols for secure two-party sorting, selection, and permuting [49], as well as for various geometric problems [36]. Since the need for efficient secure two-party computation is now greater than ever, our belief is that there is a lot of research potential in transforming algorithms into oblivious algorithms so that they can be securely used by cloud applications.

More theoretically-oriented future research, as mentioned in Section 7.1, can involve improving the asymptotic bounds of memory checking [35] by using advanced cryptographic primitives, exploring the existence and limitations of optimal authenticated data structures, and studying the dynamization overhead of cloud cryptography. (So far, most cloud cryptography constructions work for static data only, and updates can be handled in a secure way only through total recomputation, which is highly inefficient.)

From a practical perspective, designing more efficient authenticated data structures to be used in practice is definitely a big challenge: Most practical applications nowadays extensively use fast authenticated data structures such as Merkle trees, the security of which is, however, based on totally empirical assumptions (e.g., the collision resistance of SHA-2). This provides great efficiency at the cost of risking the security of the application; e.g., SHA-2 replaced MD-5 due to an attack [103], after MD-5 had been used extensively over the years in many systems, such as authenticated file systems and authenticated storage systems. It would be great to come up with authenticated data structures whose security is based on a widely-accepted computational assumption (e.g., discrete log) and that at the same time can favorably compete in practice with the widely-used Merkle trees.
Bibliography

[1] PBC: The pairing-based cryptography library. http://crypto.stanford.edu/pbc/.
[2] T-mobile sidekick disaster: Danger's servers crashed, and they don't have a backup. http://techcrunch.com/2009/10/10/.
[3] Miklós Ajtai. Generating hard instances of lattice problems (extended abstract). In Proc. Symposium on Theory of Computing (STOC), pages 99–108, 1996.
[4] Aris Anagnostopoulos, Michael T. Goodrich, and Roberto Tamassia. Persistent authenticated dictionaries and their applications. In Proc. Information Security Conference (ISC), pages 379–393, 2001.
[5] Benny Applebaum, Yuval Ishai, and Eyal Kushilevitz. From secrecy to soundness: Efficient verification via secure computation. In Proc. International Colloquium on Automata, Languages and Programming (ICALP), pages 152–163, 2010.
[6] Mikhail J. Atallah, YounSun Cho, and Ashish Kundu. Efficient data authentication in an environment of untrusted third-party distributors. In Proc. International Conference on Data Engineering (ICDE), pages 696–704, 2008.
[7] Giuseppe Ateniese, Randal Burns, Reza Curtmola, Joseph Herring, Lea Kissner, Zachary Peterson, and Dawn Song. Provable data possession at untrusted stores. In Proc. International Conference on Computer and Communications Security (CCS), pages 598–609, 2007.
[8] Man Ho Au, Patrick P. Tsang, Willy Susilo, and Yi Mu. Dynamic universal accumulators for DDH groups and their application to attribute-based anonymous credential systems. In Proc. Cryptographers' Track at the RSA Conference (CT-RSA), pages 295–308, 2009.
[9] Ricardo A. Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval: The Concepts and Technology behind Search. Addison-Wesley, 2nd edition, 2010.
[10] Niko Baric and Birgit Pfitzmann. Collision-free accumulators and fail-stop signature schemes without trees. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 480–494, 1997.
[11] Mihir Bellare and Daniele Micciancio. A new paradigm for collision-free hashing: Incrementality at reduced cost. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 163–192, 1997.
[12] Mihir Bellare and Phillip Rogaway. Random oracles are practical: A paradigm for designing efficient protocols. In Proc. International Conference on Computer and Communications Security (CCS), pages 62–73, 1993.
[13] Josh Benaloh and Michael de Mare. One-way accumulators: A decentralized alternative to digital signatures. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 274–285, 1993.
[14] Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13:422–426, 1970.
[15] Manuel Blum, William S. Evans, Peter Gemmell, Sampath Kannan, and Moni Naor. Checking the correctness of memories. Algorithmica, 12(2/3):225–244, 1994.
[16] Dan Boneh and Xavier Boyen. Short signatures without random oracles and the SDH assumption in bilinear groups. Journal of Cryptology, 21(2):149–177, 2008.
[17] Dan Boneh and Matthew K. Franklin. Identity-based encryption from the Weil pairing. In Proc. International Cryptology Conference (CRYPTO), pages 213–229, 2001.
[18] Dan Boneh, Ilya Mironov, and Victor Shoup. A secure signature scheme from bilinear maps. In Proc. Cryptographers' Track at the RSA Conference (CT-RSA), pages 98–110, 2003.
[19] Dan Boneh and Alice Silverberg. Applications of multilinear forms to cryptography. Contemporary Mathematics, 324(1):71–90, 2003.
[20] Dan Boneh and Brent Waters. Conjunctive, subset, and range queries on encrypted data. In Proc. Theoretical Cryptography Conference (TCC), pages 535–554, 2007.
[21] Andrei Z. Broder and Michael Mitzenmacher. Network applications of Bloom filters: A survey. Internet Mathematics, 1(4):485–509, 2005.
[22] Jan Camenisch, Markulf Kohlweiss, and Claudio Soriente. An accumulator based on bilinear maps and efficient revocation for anonymous credentials. In Proc. Public Key Cryptography (PKC), pages 481–500, 2009.
[23] Jan Camenisch and Anna Lysyanskaya. Dynamic accumulators and application to efficient revocation of anonymous credentials. In Proc. International Cryptology Conference (CRYPTO), pages 61–76, 2002.
[24] Jan Camenisch and Anna Lysyanskaya. A signature scheme with efficient protocols. In Proc. Security and Cryptography for Networks (SCN), pages 268–289, 2002.
[25] Sébastien Canard and Aline Gouget. Multiple denominations in e-cash with compact transaction data. In Proc. Financial Cryptography (FC), pages 82–97, 2010.
[26] Larry Carter and Mark N. Wegman. Universal classes of hash functions. In Proc. Symposium on Theory of Computing (STOC), pages 106–112, 1977.
[27] Jung Hee Cheon and Dong Hoon Lee. A note on self-bilinear maps. Bulletin of the Korean Mathematical Society, 46(2):303–309, 2009.
[28] Kai-Min Chung, Yael Kalai, and Salil Vadhan. Improved delegation of computation using fully homomorphic encryption. In Proc. International Cryptology Conference (CRYPTO), pages 483–501, 2010.
[29] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press, 3rd edition, 2009.
[30] Scott A. Crosby. Efficient Tamper-Evident Data Structures for Untrusted Servers. PhD thesis, Rice University, May 2010.
[31] Reza Curtmola, Juan A. Garay, Seny Kamara, and Rafail Ostrovsky. Searchable symmetric encryption: improved definitions and efficient constructions. In Proc. International Conference on Computer and Communications Security (CCS), pages 79–88, 2006.
[32] Ivan Damgård and Nikos Triandopoulos. Supporting non-membership proofs with bilinear-map accumulators. Cryptology ePrint Archive, Report 2008/538, 2008. http://eprint.iacr.org/.
[33] Premkumar Devanbu, Michael Gertz, Chip Martel, and Stuart G. Stubblebine. Authentic third-party data publication. In Proc. Conference on Database Security (DBSEC), pages 101–112, 2000.
[34] Martin Dietzfelbinger, Anna Karlin, Kurt Mehlhorn, Friedhelm Meyer auf der Heide, Hans Rohnert, and Robert E. Tarjan. Dynamic perfect hashing: Upper and lower bounds. SIAM Journal on Computing, 23(4):738–761, 1994.
[35] Cynthia Dwork, Moni Naor, Guy Rothblum, and Vinod Vaikuntanathan. How efficient can memory checking be? In Proc. Theoretical Cryptography Conference (TCC), pages 503–520, 2009.
[36] David Eppstein, Michael T. Goodrich, and Roberto Tamassia. Privacy-preserving data-oblivious geometric algorithms for geographic data. In Proc. International Symposium on Advances in Geographic Information Systems (GIS), pages 13–22, 2010.
[37] C. Chris Erway, Alptekin Küpçü, Charalampos Papamanthou, and Roberto Tamassia. Dynamic provable data possession. In Proc. International Conference on Computer and Communications Security (CCS), pages 213–222, 2009.
[38] Li Fan, Pei Cao, Jussara Almeida, and Andrei Z. Broder. Summary cache: A scalable wide-area web cache sharing protocol. IEEE/ACM Trans. Networking, 8(3):281–293, 2000.
[39] Michael J. Freedman, Kobbi Nissim, and Benny Pinkas. Efficient private matching and set intersection. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 1–19, 2004.
[40] Joachim von zur Gathen and Jürgen Gerhard. Modern Computer Algebra. Cambridge University Press, 2nd edition, 2003.
[41] Rosario Gennaro, Craig Gentry, and Bryan Parno. Non-interactive verifiable computing: Outsourcing computation to untrusted workers. In Proc. International Cryptology Conference (CRYPTO), pages 465–482, 2010.
[42] Rosario Gennaro, Shai Halevi, and Tal Rabin. Secure hash-and-sign signatures without the random oracle. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 123–139, 1999.
[43] Craig Gentry. Fully homomorphic encryption using ideal lattices. In Proc. Symposium on Theory of Computing (STOC), pages 169–178, 2009.
[44] Oded Goldreich, Shafi Goldwasser, and Shai Halevi. Collision-free hashing from lattice problems. In Electronic Colloquium on Computational Complexity (ECCC), 3(56), 1996.
[45] Oded Goldreich, Silvio Micali, and Avi Wigderson. Proofs that yield nothing but their validity for all languages in NP have zero-knowledge proof systems. Journal of the ACM, 38(3):691–729, 1991.
[46] Michael T. Goodrich, Charalampos Papamanthou, Roberto Tamassia, and Nikos Triandopoulos. Athos: Efficient authentication of outsourced file systems. In Proc. Information Security Conference (ISC), pages 80–96, 2008.
[47] Michael T. Goodrich and Roberto Tamassia. Algorithm Design: Foundations, Analysis, and Internet Examples. John Wiley & Sons, 2002.
[48] Michael T. Goodrich, Roberto Tamassia, and Andrew Schwerin. Implementation of an authenticated dictionary with skip lists and commutative hashing. In Proc. DARPA Information Survivability Conference and Exposition II (DISCEX II), pages 68–82, 2001.
[49] Michael T. Goodrich. Randomized Shellsort: A simple oblivious sorting algorithm. In Proc. Symposium on Discrete Algorithms (SODA), pages 1–16, 2010.
[50] Michael T. Goodrich, Charalampos Papamanthou, and Roberto Tamassia. On the cost of persistence and authentication in skip lists. In Proc. Workshop on Experimental Algorithms (WEA), pages 94–107, 2007.
[51] Michael T. Goodrich, Roberto Tamassia, and Jasminka Hasic. An efficient dynamic and distributed cryptographic accumulator. In Proc. Information Security Conference (ISC), pages 372–388, 2002.
[52] Michael T. Goodrich, Roberto Tamassia, and Nikos Triandopoulos. Super-efficient verification of dynamic outsourced databases. In Proc. Cryptographers' Track at the RSA Conference (CT-RSA), pages 407–424, 2008.
[53] Michael T. Goodrich, Roberto Tamassia, and Nikos Triandopoulos. Efficient authenticated data structures for graph connectivity and geometric search problems. Algorithmica, 60(3):505–552, 2011.
[54] Eric Hall and Charanjit S. Jutla. Parallelizable authentication trees. In Proc. Selected Areas in Cryptography (SAC), pages 95–109, 2005.
[55] Brian Hayes. Cloud computing. Communications of the ACM, 51(7):9–11, 2008.
[56] Alexander Heitzmann, Bernardo Palazzi, Charalampos Papamanthou, and Roberto Tamassia. Efficient integrity checking of untrusted network storage. In Proc. International Workshop on Storage Security and Survivability (STORAGESS), pages 43–54, 2008.
[57] Jeffrey Hoffstein, Nick Howgrave-Graham, Jill Pipher, Joseph H. Silverman, and William Whyte. NTRUSIGN: Digital signatures using the NTRU lattice. In Proc. Cryptographers' Track at the RSA Conference (CT-RSA), pages 122–140, 2003.
[58] Andreas Hutflesz, Hans-Werner Six, and Peter Widmayer. Globally order preserving multidimensional linear hashing. In Proc. International Conference on Data Engineering (ICDE), pages 572–579, 1988.
[59] Joseph F. JaJa. An Introduction to Parallel Algorithms. Addison-Wesley, 1992.
[60] Antoine Joux. A one-round protocol for tripartite Diffie-Hellman. Journal of Cryptology, 17(4):263–276, 2004.
[61] Jonathan Katz and Yehuda Lindell. Introduction to Modern Cryptography. Chapman & Hall/CRC, 2007.
[62] Claire Kenyon and Jeffrey S. Vitter. Maximum queue size and hashing with lazy deletion. Algorithmica, 6:597–619, 1991.
[63] Dieter Kratsch, Ross M. McConnell, Kurt Mehlhorn, and Jeremy P. Spinrad. Certifying algorithms for recognizing interval graphs and permutation graphs. In Proc. Symposium on Discrete Algorithms (SODA), pages 158–167, 2003.
[64] Hyung-Mok Lee, Kyung Ju Ha, and Kyo-Min Ku. ID-based multi-party authenticated key agreement protocols from multilinear forms. In Proc. Information Security Conference (ISC), pages 104–117, 2005.
[65] Arjen K. Lenstra, Hendrik W. Lenstra Jr, and László Lovász. Factoring polynomials with rational coefficients. Mathematische Annalen, (261):515–534, 1982.
[66] Feifei Li, Marios Hadjieleftheriou, George Kollios, and Leonid Reyzin. Dynamic authenticated index structures for outsourced databases. In Proc. International Conference on Management of Data (SIGMOD), pages 121–132, 2006.
[67] Feifei Li, Ke Yi, Marios Hadjieleftheriou, and George Kollios. Proof-infused streams: Enabling authentication of sliding window queries on streams. In Proc. Very Large Data Bases (VLDB), pages 147–158, 2007.
[68] Jiangtao Li, Ninghui Li, and Rui Xue. Universal accumulators with efficient nonmembership proofs. In Proc. Applied Cryptography and Network Security (ACNS), pages 253–269, 2007.
[69] Nathan Linial and Ori Sasson. Non-expansive hashing. In Proc. Symposium on Theory of Computing (STOC), pages 509–517, 1996.
[70] Ben Lynn. On the Implementation of Pairing-Based Cryptosystems. PhD thesis, Stanford University, November 2008.
[71] Vadim Lyubashevsky and Daniele Micciancio. Generalized compact knapsacks are collision resistant. In Proc. International Colloquium on Automata, Languages and Programming (ICALP), pages 144–155, 2006.
[72] Kyriakos Mouratidis, Man Lung Yiu, and Yimin Lin. Efficient verification of shortest path search via authenticated hints. In Proc. International Conference on Data Engineering (ICDE), pages 237–248, 2010.
[73] Petros Maniatis. Historic Integrity in Distributed Systems. PhD thesis, Stanford University, August 2003.
[74] Petros Maniatis and Mary Baker. Enabling the archival storage of signed documents. In Proc. USENIX Conference on File and Storage Technologies (FAST), pages 31–45, 2002.
[75] Charles U. Martel, Glen Nuckolls, Premkumar T. Devanbu, Michael Gertz, April Kwong, and Stuart G. Stubblebine. A general model for authenticated data structures. Algorithmica, 39(1):21–41, 2004.
[76] Alfred Menezes, Scott Vanstone, and Tatsuaki Okamoto. Reducing elliptic curve logarithms to logarithms in a finite field. In Proc. Symposium on Theory of Computing (STOC), pages 80–89, 1991.
[77] Ralph C. Merkle. A certified digital signature. In Proc. International Cryptology Conference (CRYPTO), pages 218–238, 1989.
[78] Daniele Micciancio and Oded Regev. Worst-case to average-case reductions based on Gaussian measures. SIAM Journal on Computing, 37(1):267–302, 2007.
[79] Ruggero Morselli, Samrat Bhattacharjee, Jonathan Katz, and Peter J. Keleher. Trust-preserving set operations. In Proc. Conference on Computer Communications (INFOCOM), 2004.
[80] James K. Mullin. Spiral storage: Efficient dynamic hashing with constant performance. Computer Journal, 28:330–334, 1985.
[81] Moni Naor and Kobbi Nissim. Certificate revocation and certificate update. In Proc. USENIX Security Symposium (USENIX), pages 217–228, 1998.
[82] Moni Naor and Guy Rothblum. The complexity of online memory checking. Journal of the ACM, 56(1), 2009.
[83] Lan Nguyen. Accumulators from bilinear pairings and applications. In Proc. Cryptographers' Track at the RSA Conference (CT-RSA), pages 275–292, 2005.
[84] Glen Nuckolls. Verified query results from hybrid authentication trees. In Proc. Conference on Database Security (DBSEC), pages 84–98, 2005.
[85] National Institute of Standards and Technology. Secure hash standard (SHS). October 2008.
[86] Rafail Ostrovsky. Efficient computation on oblivious RAMs. In Proc. Symposium on Theory of Computing (STOC), pages 514–523, 1990.
[87] Mark H. Overmars. The Design of Dynamic Data Structures. Springer-Verlag, LNCS 156, 1983.
[88] HweeHwa Pang and Kyriakos Mouratidis. Authenticating the query results of text search engines. VLDB Endowment, 1(1):126–137, 2008.
[89] HweeHwa Pang and Kian-Lee Tan. Authenticating query results in edge computing. In Proc. International Conference on Data Engineering (ICDE), pages 560–571, 2004.
[90] Charalampos Papamanthou, Roberto Tamassia, and Nikos Triandopoulos. Authenticated hash tables. In Proc. International Conference on Computer and Communications Security (CCS), pages 437–448, 2008.
[91] Charalampos Papamanthou, Roberto Tamassia, and Nikos Triandopoulos. Optimal authenticated data structures with multilinear forms. In Proc. International Conference on Pairing-Based Cryptography (PAIRING), pages 246–264, 2010.
[92] Charalampos Papamanthou and Roberto Tamassia. Time and space efficient algorithms for two-party authenticated data structures. In Proc. International Conference on Information and Communications Security (ICICS), pages 1–15, 2007.
[93] Charalampos Papamanthou and Roberto Tamassia. Cryptography for efficiency: Authenticated data structures based on lattices and parallel online memory checking. Cryptology ePrint Archive, Report 2011/102, 2011. http://eprint.iacr.org/.
[94] Charalampos Papamanthou, Roberto Tamassia, and Nikos Triandopoulos. Optimal verification of operations on dynamic sets. In Proc. International Cryptology Conference (CRYPTO), 2011.
[95] Chris Peikert. Public-key cryptosystems from the worst-case shortest vector problem (extended abstract). In Proc. Symposium on Theory of Computing (STOC), pages 333–342, 2009.
[96] Franco P. Preparata and Dilip V. Sarwate. Computational complexity of Fourier transforms over finite fields. Mathematics of Computation, 31(139):740–751, 1977.
[97] Franco P. Preparata and Michael I. Shamos. Computational Geometry: An Introduction. Springer-Verlag, 1985.
[98] Oded Regev. Lattice-based cryptography. In Proc. International Cryptology Conference (CRYPTO), pages 131–141, 2006.
[99] Oded Regev. On the complexity of lattice problems with polynomial approximation factors. The LLL Algorithm, pages 475–496, 2010.
[100] Tomas Sander. Efficient accumulators without trapdoor (extended abstract). In Proc. International Conference on Information and Communications Security (ICICS), pages 252–262, 1999.
[101] Tomas Sander, Amnon Ta-Shma, and Moti Yung. Blind, auditable membership proofs. In Proc. Financial Cryptography (FC), pages 53–71, 2001.
[102] Victor Shoup. A Computational Introduction to Number Theory and Algebra. Cambridge University Press, 2nd edition, 2008.
[103] Marc Stevens, Alexander Sotirov, Jacob Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de Weger. Short chosen-prefix collisions for MD5 and the creation of a rogue CA certificate. In Proc. International Cryptology Conference (CRYPTO), pages 55–69, 2009.
[104] Roberto Tamassia and Nikos Triandopoulos. Efficient content authentication in peer-to-peer networks. In Proc. Applied Cryptography and Network Security (ACNS), pages 354–372, 2007.
[105] Roberto Tamassia. Authenticated data structures. In Proc. European Symposium on Algorithms (ESA), pages 2–5, 2003.
[106] Roberto Tamassia and Nikos Triandopoulos. Computational bounds on hierarchical data processing with applications to information security. In Proc. International Colloquium on Automata, Languages and Programming (ICALP), pages 153–165, 2005.
[107] Roberto Tamassia and Nikos Triandopoulos. Certification and authentication of data structures. In Proc. Alberto Mendelzon Workshop on Foundations of Data Management, 2010.
[108] Nikos Triandopoulos. Efficient Data Authentication. PhD thesis, Brown University, September 2006.
[109] Guan Wang, Tongbo Luo, Michael T. Goodrich, Wenliang Du, and Zutao Zhu. Bureaucratic protocols for secure two-party sorting, selection, and permuting. In Proc. Symposium on Information, Computer and Communications Security (ASIACCS), pages 226–237, 2010.
[110] Peishun Wang, Huaxiong Wang, and Josef Pieprzyk. A new dynamic accumulator for batch updates. In Proc. International Conference on Information and Communications Security (ICICS), pages 98–112, 2007.
[111] Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu. Finding collisions in the full SHA-1. In Proc. International Cryptology Conference (CRYPTO), pages 17–36, 2005.
[112] Yin Yang, Dimitris Papadias, Stavros Papadopoulos, and Panos Kalnis. Authenticated join processing in outsourced databases. In Proc. International Conference on Management of Data (SIGMOD), pages 5–18, 2009.
[113] Andrew Chi-Chih Yao. Protocols for secure computations (extended abstract). In Proc. Foundations of Computer Science (FOCS), pages 160–164, 1982.