k-Anonymity: Università degli Studi di Milano, 26013 Crema, Italia, {ciriani, decapita, foresti, samarati}@dti.unimi.it
1 Introduction
Today’s globally networked society places great demand on the dissemination
and sharing of information, which is probably becoming the most important
and demanded resource. While in the past released information was mostly
in tabular and statistical form (macrodata), many situations call today for
the release of specific data (microdata). In contrast to macrodata, which report
precomputed statistics, microdata let the final recipient perform whatever
analysis is needed.
2 V. Ciriani, S. De Capitani di Vimercati, S. Foresti, and P. Samarati
SSN  Name  Race   Date of birth  Sex  ZIP    Marital status  Disease
           asian  64/04/12       F    94142  divorced        hypertension
           asian  64/09/13       F    94141  divorced        obesity
           asian  64/04/15       F    94139  married         chest pain
           asian  63/03/13       M    94139  married         obesity
           asian  63/03/18       M    94139  married         short breath
           black  64/09/27       F    94138  single          short breath
           black  64/09/27       F    94139  single          obesity
           white  64/09/27       F    94139  single          chest pain
           white  64/09/27       F    94141  widow           short breath
1 T[QI] denotes the projection, maintaining duplicate tuples, of attributes QI in T.
respondents. For instance, with respect to the microdata table in Fig. 1 and
the quasi-identifier {Race, Date of birth, Sex, ZIP, Marital status},
it is easy to see that the table satisfies k-anonymity with k = 1 only, since
some combinations of values over the considered quasi-identifier occur only
once (e.g., the single occurrence “asian, 64/04/12, F, 94142, divorced”).
The enforcement of k-anonymity requires the preliminary identification of
the quasi-identifier. The quasi-identifier depends on the external information
available to the recipient, as this determines her linking ability (not all
possible external tables are available to every possible data recipient), and
different quasi-identifiers can potentially exist for a given table. For the
sake of simplicity, the original k-anonymity proposal [26] assumes that the
private table PT has a single quasi-identifier, composed of all attributes in PT
that can be externally available, and that PT contains at most one tuple for
each respondent. Therefore, although identifying the correct quasi-identifier
for a private table can be a difficult task, it is assumed that the
quasi-identifier has been properly recognized and defined. For instance, with
respect to the microdata table in Fig. 1, a quasi-identifier can be the set of
attributes {Race, Date of birth, Sex, ZIP, Marital status}.
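The check described above can be sketched in a few lines of Python. This is only an illustration: the function name anonymity_level and the tuple layout are assumptions made here, not part of the original proposal. The sketch projects the table of Fig. 1 on a quasi-identifier, keeping duplicates, and returns the size of the smallest group of identical projected tuples, that is, the largest k for which the table is k-anonymous.

```python
from collections import Counter

# Rows of the microdata table in Fig. 1 (SSN and Name are not released).
COLUMNS = ("Race", "Date of birth", "Sex", "ZIP", "Marital status", "Disease")
TABLE = [
    ("asian", "64/04/12", "F", "94142", "divorced", "hypertension"),
    ("asian", "64/09/13", "F", "94141", "divorced", "obesity"),
    ("asian", "64/04/15", "F", "94139", "married", "chest pain"),
    ("asian", "63/03/13", "M", "94139", "married", "obesity"),
    ("asian", "63/03/18", "M", "94139", "married", "short breath"),
    ("black", "64/09/27", "F", "94138", "single", "short breath"),
    ("black", "64/09/27", "F", "94139", "single", "obesity"),
    ("white", "64/09/27", "F", "94139", "single", "chest pain"),
    ("white", "64/09/27", "F", "94141", "widow", "short breath"),
]


def anonymity_level(rows, columns, quasi_identifier):
    """Return the largest k for which rows are k-anonymous w.r.t. the QI."""
    positions = [columns.index(attribute) for attribute in quasi_identifier]
    # Projection on the quasi-identifier, maintaining duplicate tuples.
    projection = [tuple(row[p] for p in positions) for row in rows]
    # The rarest combination of values determines the anonymity level.
    return min(Counter(projection).values())


QI = ("Race", "Date of birth", "Sex", "ZIP", "Marital status")
print(anonymity_level(TABLE, COLUMNS, QI))               # 1
print(anonymity_level(TABLE, COLUMNS, ("Race", "Sex")))  # 2
```

With the full quasi-identifier every combination of values is unique, so k = 1; restricting the projection to Race and Sex already yields groups of at least two tuples.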
[Figure: domain generalization hierarchies (DGH) and value generalization hierarchies (VGH)]
(a) Race: R0 = {asian, black, white}; R1 = {person}
(b) Sex: S0 = {M, F}; S1 = {not released}
(c) ZIP: Z0 = {94138, 94139, 94141, 94142}; Z1 = {9413*, 9414*}; Z2 = {941**}
(d) Marital status: M1 = {been married, never married}; M2 = {not released}
(e) Date of birth: D0 = {year/month/day}; D1 = {year/month}; D2 = {year}; D3 = {half-decade}
    Values: 63/03/13, 63/03/18, 64/04/12, 64/04/15, 64/09/13, 64/09/27 (D0); 63/03, 64/04, 64/09 (D1); 63, 64 (D2); 60-65 (D3)
[Figure: domain and value generalization strategies over ⟨Race, ZIP⟩]
Generalization Strategy 2: ⟨R0, Z0⟩ → ⟨R0, Z1⟩ → ⟨R1, Z1⟩ → ⟨R1, Z2⟩
    e.g., ⟨asian, 94138⟩ → ⟨asian, 9413*⟩ → ⟨person, 9413*⟩ → ⟨person, 941**⟩
Generalization Strategy 3: ⟨R0, Z0⟩ → ⟨R0, Z1⟩ → ⟨R0, Z2⟩ → ⟨R1, Z2⟩
    e.g., ⟨asian, 94138⟩ → ⟨asian, 9413*⟩ → ⟨asian, 941**⟩ → ⟨person, 941**⟩
Fig. 4. Hierarchy DGH⟨R0,Z0⟩ and corresponding Domain and Value Generalization Strategies
[Figure: hierarchy over ⟨Race, ZIP⟩, each node labeled with its generalization levels [Race, ZIP]]
⟨R1, Z2⟩ [1, 2]
⟨R1, Z1⟩ [1, 1]    ⟨R0, Z2⟩ [0, 2]
⟨R1, Z0⟩ [1, 0]    ⟨R0, Z1⟩ [0, 1]
⟨R0, Z0⟩ [0, 0]
                                 Suppression
Generalization  Tuple           Attribute       Cell      None
Attribute       AG_TS           AG_AS           AG_CS     AG_
                                ≡ AG_                     ≡ AG_AS
Cell            CG_TS           CG_AS           CG_CS     CG_
                not applicable  not applicable  ≡ CG_     ≡ CG_CS
None            _TS             _AS             _CS       _
                                                          not interesting
some cells can report the specific day (no generalization), others the
month (one step of generalization), others the year (two steps of gen-
eralization), and so on. Generalizing at the cell level has the advantage
of allowing the release of more specific values (as generalization can be
confined to specific cells rather than hitting whole columns). However,
besides a higher complexity of the problem, a possible drawback in the
application of generalization at the cell level is the complication aris-
ing from the management of values at different generalization levels
within the same column.
Suppression can be applied at the level of:
• Tuple (TS): suppression is performed at the level of row; a suppression
operation removes a whole tuple.
• Attribute (AS): suppression is performed at the level of column; a
suppression operation obscures all the values of a column.
• Cell (CS): suppression is performed at the level of single cells; as a
result, a k-anonymized table may wipe out only certain cells of a given
tuple/attribute.
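The three suppression levels listed above can be illustrated with a minimal Python sketch. The function names and the use of "*" as the marker for an obscured value are assumptions made for this example only.

```python
def suppress_tuples(rows, row_indexes):
    """TS: remove whole tuples (rows)."""
    dropped = set(row_indexes)
    return [row for i, row in enumerate(rows) if i not in dropped]


def suppress_attribute(rows, column):
    """AS: obscure every value of one column."""
    return [
        tuple("*" if j == column else value for j, value in enumerate(row))
        for row in rows
    ]


def suppress_cells(rows, cells):
    """CS: obscure only the listed (row, column) cells."""
    hidden = set(cells)
    return [
        tuple("*" if (i, j) in hidden else value for j, value in enumerate(row))
        for i, row in enumerate(rows)
    ]


rows = [
    ("asian", "F", "94142"),
    ("asian", "F", "94141"),
    ("white", "F", "94139"),
]
print(suppress_tuples(rows, [2]))
print(suppress_attribute(rows, 2))
print(suppress_cells(rows, [(0, 2), (1, 2)]))
```

Tuple suppression shortens the table, attribute suppression blanks a whole column, and cell suppression touches only the cells that prevent two tuples from coinciding (here the ZIP values of the first two rows).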
The possible combinations of the different choices for generalization and
suppression (including also the choice of not applying one of the two tech-
niques) result in different models for k-anonymity, which can represent a tax-
onomy for classifying the different k-anonymity proposals. Different models
bear different complexity and define in different ways the concept of minimal-
ity of the solutions.
A first attempt to introduce a taxonomy for classifying k-anonymity ap-
proaches has been described in [20], where the authors distinguish between the
application of suppression and generalization at the cell or attribute level. Our
taxonomy refines and completes this classification. Below we discuss the dif-
ferent models resulting from our classification, characterize them, and classify
existing approaches accordingly. We refer to each model with a pair (separated
by _), where the first element describes the level of generalization (AG,
CG, or none) and the second element describes the level of suppression (TS,
AS, CS, or none). The table in Fig. 8 summarizes these models.
AG_TS Generalization is applied at the level of attribute (column) and
suppression at the level of tuple (row). This is the assumption considered
in the original model [26], as well as in most of the subsequent approaches
providing efficient algorithms for solving the k-anonymity problem
[5, 18, 20, 29, 33], since it offers a good tradeoff between the computational
complexity and the quality of the anonymized table.
AG_AS Both generalization and suppression are applied at the level of column.
No specific approach has investigated this model. It must also be
noted that if attribute generalization is applied, attribute suppression is
not needed, since suppressing an attribute (i.e., not releasing any of its
values) to reach k-anonymity can equivalently be modeled via a generalization
of all the attribute values to the maximal element in the value
Fig. 9. A private table (a) and some 2-anonymized versions of it according to
different models
PT, these exact algorithms with attribute generalization and tuple suppres-
sion are practical. In particular, when |QI| ∈ O(log n), these exact algorithms
have computational time polynomial in the number of tuples of PT, provided
that the threshold on the number of suppressed tuples (MaxSup) is constant
in value.
Recently, many exact algorithms for producing k-anonymous tables
through attribute generalization and tuple suppression have been proposed [5,
20, 26, 29]. Samarati [26] presented an algorithm that exploits a binary search
on the domain generalization hierarchy to avoid an exhaustive visit of the
whole generalization space. Bayardo and Agrawal [5] presented an optimal
algorithm that starts from a fully generalized table (with all tuples equal)
and specializes the dataset into a minimal k-anonymous table, exploiting ad hoc
pruning techniques. LeFevre, DeWitt, and Ramakrishnan [20] described an
algorithm that uses a bottom-up technique and a priori computation. Finally,
Sweeney [29] proposed an algorithm that exhaustively examines all potential
generalizations to identify a minimal one satisfying the k-anonymity
requirement. This last approach is clearly impractical for large
datasets, and we will therefore not discuss it further. We will now describe
the other approaches in more detail.
2 Meyerson and Williams have also described in [24] an O(k log |QI|)-approximation
algorithm with polynomial time complexity (O(|QI| n^3)) for the _CS model.
     t1      t2      t3/t4/t5  t6      t7      t8      t9
t1   [0, 0]  [0, 1]  [0, 2]    [1, 2]  [1, 2]  [1, 2]  [1, 1]
t2   [0, 1]  [0, 0]  [0, 2]    [1, 2]  [1, 2]  [1, 2]  [1, 0]
t6   [1, 2]  [1, 2]  [1, 1]    [0, 0]  [0, 1]  [1, 1]  [1, 2]
t7   [1, 2]  [1, 2]  [1, 0]    [0, 1]  [0, 0]  [1, 0]  [1, 2]
t8   [1, 2]  [1, 2]  [1, 0]    [1, 1]  [1, 0]  [0, 0]  [0, 2]
t9   [1, 1]  [1, 0]  [1, 2]    [1, 2]  [1, 2]  [0, 2]  [0, 0]
Race                       ZIP
⟨[asian] [black] [white]⟩  ⟨[94138] [94139] [94141] [94142]⟩
    1       2       3          4       5       6       7
Bayardo and Agrawal [5] propose an algorithm for AG_TS, called
k-Optimize, which often obtains good solutions with a reduced computational
time. According to this approach, an attribute generalization for an attribute
A with an ordered domain D consists of a partitioning of the attribute domain
into intervals such that each possible value in the domain appears in some in-
terval and each value in a given interval I precedes any value in the intervals
following I. As an example, consider attribute Race on domain D1 = {asian,
black, white} where the values in D1 are ordered according to a lexicographic
order, and attribute ZIP on domain D2 = {94138, 94139, 94141, 94142} where
the values follow a numeric order. For instance, domain D1 can be partitioned
into three intervals, namely [asian], [black], and [white], and domain D2 can be
partitioned into four intervals, namely [94138], [94139], [94141], and [94142].
The approach then assumes an order among quasi-identifier attributes and
associates an integer, called index, with each interval in any domain of the
quasi-identifier attributes. The index assignment reflects the total order
relationship over intervals in the domains and among quasi-identifier attributes.
For instance, consider the quasi-identifier attributes Race and ZIP and suppose
that Race precedes ZIP. Figure 12 illustrates the value ordering and the
corresponding index values. As the figure shows, the index values
associated with the intervals of domain D1 of attribute Race are lower than
the index values associated with the intervals of domain D2 of attribute ZIP,
since we assume that Race precedes ZIP. Moreover, within each domain the
index assignment reflects the total order among intervals. More formally, the
indexes associated with the intervals of domain Di of attribute Ai are lower
than the indexes associated with the intervals of domain Dj of attribute Aj if
attribute Ai precedes Aj in the order relationship. Moreover, the indexes
associated with the intervals of domain Di follow the same order as the
intervals in Di.
A generalization is then represented through the union of the individual
index values for each attribute. The least value in an attribute domain can be
omitted since it will certainly appear in the generalizations for that domain.
For instance, with respect to the total order of the value domains in Fig. 12,
notation {6} identifies a generalization, where the generalizations are {1} for
attribute Race and {4, 6} for attribute ZIP. These, in turn, represent the
following value intervals: Race: ⟨[asian or black or white]⟩; ZIP: ⟨[94138 or
94139], [94141 or 94142]⟩. Note that the empty set { } represents the most general
anonymization. For instance, with respect to our example, { } corresponds to
the generalizations {1} for attribute Race and {4} for attribute ZIP, which
in turn correspond to the generalized values Race: ⟨[asian or black or white]⟩;
ZIP: ⟨[94138 or 94139 or 94141 or 94142]⟩.
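The index notation can be made concrete with a small Python sketch. The names assign_indexes and decode are hypothetical helpers introduced here for illustration, not part of k-Optimize as published. The sketch numbers the values of the ordered domains as in Fig. 12 and expands a set of index values into the intervals it denotes, treating the least value of each domain as implicitly present.

```python
# Ordered domains; Race precedes ZIP, as in Fig. 12.
DOMAINS = [
    ("Race", ["asian", "black", "white"]),
    ("ZIP", ["94138", "94139", "94141", "94142"]),
]


def assign_indexes(domains):
    """Map each index value (1, 2, ...) to its (attribute, value) pair."""
    index_of = {}
    next_index = 1
    for attribute, values in domains:
        for value in values:
            index_of[next_index] = (attribute, value)
            next_index += 1
    return index_of


def decode(index_set, domains):
    """Expand a set of index values into the intervals of each attribute."""
    result = {}
    next_index = 1
    for attribute, values in domains:
        intervals = []
        for position, value in enumerate(values):
            # The least value of a domain always opens an interval, so its
            # index is implicit; any other listed index opens a new interval.
            if position == 0 or next_index in index_set:
                intervals.append([value])
            else:
                intervals[-1].append(value)
            next_index += 1
        result[attribute] = intervals
    return result


print(assign_indexes(DOMAINS)[6])  # ('ZIP', '94141')
print(decode({6}, DOMAINS))
# {'Race': [['asian', 'black', 'white']], 'ZIP': [['94138', '94139'], ['94141', '94142']]}
print(decode(set(), DOMAINS))
# {'Race': [['asian', 'black', 'white']], 'ZIP': [['94138', '94139', '94141', '94142']]}
```

Each inner list stands for one interval, so ['94138', '94139'] corresponds to [94138 or 94139] in the notation of the text.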
k-Optimize builds a set enumeration tree over the set I of index values. The
root node of the tree is the empty set. The children of a node n will enumerate
those sets that can be formed by appending a single element of I to n, with
the restriction that this single element must follow every element already in
n according to the total order previously defined. Figure 13 illustrates an
example of set enumeration tree over I = {1, 2, 3}. The consideration of a
tree guarantees the existence of a unique path between the root and each
node. The visit of the set enumeration tree using a standard traversal strategy
is equivalent to the evaluation of each possible solution to the k-anonymity
problem. At each node n in the tree, the cost of the generalization strategy
represented by n is computed and compared against the best cost found until
that point; if lower, it becomes the new best cost. This approach, however, is
not practical because the number of nodes in the tree is 2^|I|; therefore, [5]
proposes heuristics and pruning strategies. In particular, k-Optimize prunes a
node n when it can determine that none of its descendants could be optimal.
According to a given cost function, k-Optimize computes a lower bound on
the cost that can be obtained by any node in the sub-tree rooted at n. The
subtree can be pruned if the computed lower bound is higher than the best
cost found by the algorithm until that point. Note that when a subtree is
pruned also additional nodes can be removed from the tree. For instance,
consider the set enumeration tree in Fig. 13 and suppose that node {1, 3} can
be pruned. This means that a solution that contains index values 1 and 3 is
not optimal and therefore also node {1, 2, 3} can be pruned.
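The visit of the set enumeration tree and the lower-bound pruning can be sketched as follows. This is a simplified illustration and not the k-Optimize implementation of [5]: the function names are hypothetical, the cost and lower-bound functions are toy stand-ins chosen only so that the bound is valid, and the additional pruning of nodes outside the pruned subtree is not modeled.

```python
def children(node, indexes):
    """Children of a node: append one index that follows every index in it."""
    last = node[-1] if node else None
    return [node + (i,) for i in indexes if last is None or i > last]


def enumerate_tree(indexes, node=()):
    """Depth-first visit of the set enumeration tree rooted at node."""
    yield node
    for child in children(node, indexes):
        yield from enumerate_tree(indexes, child)


def search(indexes, cost, lower_bound):
    """Skip every subtree whose lower bound exceeds the best cost so far."""
    best_node, best_cost = None, float("inf")
    visited = []
    stack = [()]
    while stack:
        node = stack.pop()
        if lower_bound(node) > best_cost:
            continue  # no node in this subtree can improve on the best cost
        visited.append(node)
        node_cost = cost(node)
        if node_cost < best_cost:
            best_node, best_cost = node, node_cost
        stack.extend(reversed(children(node, indexes)))
    return best_node, best_cost, visited


# Toy functions (assumptions): the cost grows with the indexes in the node,
# and the root pays a fixed penalty; sum(node) never exceeds the cost of the
# node or of any of its descendants, so it is a valid lower bound.
def toy_cost(node):
    return sum(node) + (5 if not node else 0)


def toy_lower_bound(node):
    return sum(node)


print(len(list(enumerate_tree([1, 2, 3]))))  # 8
print(search([1, 2, 3], toy_cost, toy_lower_bound))
# ((1,), 1, [(), (1,)])
```

The full tree over I = {1, 2, 3} has 2^3 = 8 nodes, while the pruned search evaluates only two of them: once node {1} reaches cost 1, every other subtree has a lower bound above the best cost and is skipped.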
k-Optimize can always compute the best solution in the space of the
generalization strategies. Since the algorithm tries to improve the solution at
each visited node by evaluating the corresponding generalization strategy, it is
also possible to fix a maximum computational time and obtain a good, but not
necessarily optimal, solution.
{ }
├── {1}
│   ├── {1, 2}
│   │   └── {1, 2, 3}
│   └── {1, 3}
├── {2}
│   └── {2, 3}
└── {3}
Fig. 13. An example of set enumeration tree over set I = {1, 2, 3} of indexes
4.3 Incognito
anonymity on all the possible pairs of attributes, that is, ⟨Race, Sex⟩, ⟨Race,
Marital status⟩, and ⟨Sex, Marital status⟩. In particular, Incognito has
to first check the 2-anonymity with respect to the lowest tuples that can be
formed with the single attributes generated at iteration 1 (i.e., ⟨R0, S0⟩, ⟨R0,
M1⟩, and ⟨S0, M1⟩). It is easy to see that the microdata table in Fig. 1 is
2-anonymous with respect to ⟨R0, S0⟩ and ⟨S0, M1⟩ but is not 2-anonymous
with respect to ⟨R0, M1⟩ because, for example, there is only one occurrence
of ⟨white, been married⟩. Incognito therefore proceeds by checking
generalizations ⟨R0, M2⟩ and ⟨R1, M1⟩. These generalizations satisfy 2-anonymity,
and Incognito can then start iteration 3. Due to the previous iterations, Incognito
has to first check generalizations ⟨R0, S0, M2⟩ and ⟨R1, S0, M1⟩. Since these two
generalizations satisfy the 2-anonymity property, the algorithm terminates.
Figure 14 illustrates on the left-hand side the complete domain generalization
hierarchies and on the right-hand side the sub-hierarchies computed by
Incognito at each iteration (i.e., from which the generalizations that are
known a priori not to satisfy k-anonymity have been discarded).
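The individual checks mentioned in this walk-through can be reproduced with a short Python sketch. It is not the Incognito algorithm itself: it only tests whether a given combination of generalization levels is 2-anonymous on the table of Fig. 1. The helper name is_k_anonymous is hypothetical, and the mapping of married, divorced, and widow to "been married" (and of single to "never married") at level M1 is an assumption consistent with the domain M1 = {been married, never married}.

```python
from collections import Counter

# (Race, Sex, Marital status) projection of the table in Fig. 1.
ROWS = [
    ("asian", "F", "divorced"), ("asian", "F", "divorced"),
    ("asian", "F", "married"), ("asian", "M", "married"),
    ("asian", "M", "married"), ("black", "F", "single"),
    ("black", "F", "single"), ("white", "F", "single"),
    ("white", "F", "widow"),
]

# One generalization function per level of each hierarchy.
RACE = [lambda v: v, lambda v: "person"]
SEX = [lambda v: v, lambda v: "not released"]
MARITAL = [
    lambda v: v,
    lambda v: "never married" if v == "single" else "been married",
    lambda v: "not released",
]
# Attribute letter -> (column position, hierarchy levels).
HIERARCHIES = {"R": (0, RACE), "S": (1, SEX), "M": (2, MARITAL)}


def is_k_anonymous(rows, levels, k):
    """levels maps attribute letters to levels, e.g. {"R": 0, "M": 1}."""
    groups = Counter(
        tuple(
            HIERARCHIES[a][1][level](row[HIERARCHIES[a][0]])
            for a, level in levels.items()
        )
        for row in rows
    )
    return min(groups.values()) >= k


print(is_k_anonymous(ROWS, {"R": 0, "S": 0}, 2))          # True
print(is_k_anonymous(ROWS, {"S": 0, "M": 1}, 2))          # True
print(is_k_anonymous(ROWS, {"R": 0, "M": 1}, 2))          # False
print(is_k_anonymous(ROWS, {"R": 0, "M": 2}, 2))          # True
print(is_k_anonymous(ROWS, {"R": 1, "M": 1}, 2))          # True
print(is_k_anonymous(ROWS, {"R": 0, "S": 0, "M": 2}, 2))  # True
print(is_k_anonymous(ROWS, {"R": 1, "S": 0, "M": 1}, 2))  # True
```

The only failing check is ⟨R0, M1⟩, where ⟨white, been married⟩ occurs once, which matches the outcome described in the text.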
The algorithms presented so far find exact solutions for the k-anonymity
problem. Since k-anonymity is an NP-hard problem, all these algorithms have
complexity exponential in the size of the quasi-identifier. Alternative approaches
have proposed the application of heuristic algorithms. The algorithm proposed
by Iyengar [18] is based on genetic algorithms and solves the k-anonymity
problem using an incomplete stochastic search method. The method does not
guarantee the quality of the solution proposed, but experimental results show
the validity of the approach. Winkler [34] proposes a method based on
simulated annealing for finding locally minimal solutions, which requires high
computational time and does not guarantee the quality of the solution.
Fung, Wang and Yu [12] present a top-down heuristic to make a table
to be released k-anonymous. The approach applies to both continuous and
categorical attributes. The top-down algorithm starts from the most general
solution and iteratively specializes some values of the current solution until
any further specialization would violate the k-anonymity requirement. Each
step of specialization increases the information and decreases the anonymity.
Therefore, at each iteration, the heuristic selects a “good” specialization
guided by a goodness metric. The metric takes into account both the
“information gain” and the “anonymity loss”.
Due to the heuristic nature of these approaches, no bounds on efficiency and
goodness of the solutions can be given; however, experimental results can be
used to assess the quality of the solution retrieved.
[Fig. 14: complete domain generalization hierarchies (left) and sub-hierarchies computed by Incognito (right); nodes listed from the most general level down]
⟨Race, Marital status⟩
    complete: ⟨R1, M2⟩; ⟨R0, M2⟩, ⟨R1, M1⟩; ⟨R0, M1⟩, ⟨R1, M0⟩; ⟨R0, M0⟩
    sub-hierarchy: ⟨R1, M2⟩; ⟨R0, M2⟩, ⟨R1, M1⟩; ⟨R0, M1⟩
⟨Sex, Marital status⟩
    complete: ⟨S1, M2⟩; ⟨S0, M2⟩, ⟨S1, M1⟩; ⟨S0, M1⟩, ⟨S1, M0⟩; ⟨S0, M0⟩
    sub-hierarchy: ⟨S1, M2⟩; ⟨S0, M2⟩, ⟨S1, M1⟩; ⟨S0, M1⟩
Iteration 3, ⟨Race, Sex, Marital status⟩
    complete: ⟨R1, S1, M2⟩; ⟨R0, S1, M2⟩, ⟨R1, S0, M2⟩, ⟨R1, S1, M1⟩; ⟨R0, S0, M2⟩, ⟨R0, S1, M1⟩, ⟨R1, S0, M1⟩, ⟨R1, S1, M0⟩; ⟨R0, S0, M1⟩, ⟨R0, S1, M0⟩, ⟨R1, S0, M0⟩; ⟨R0, S0, M0⟩
    sub-hierarchy: ⟨R1, S1, M2⟩; ⟨R0, S1, M2⟩, ⟨R1, S0, M2⟩, ⟨R1, S1, M1⟩; ⟨R0, S0, M2⟩, ⟨R1, S0, M1⟩
The algorithm for k = 2 exploits the minimum-weight [1, 2]-factor built on the
graph constructed for the 2-anonymity instance. A [1, 2]-factor of a graph
G is a spanning subgraph of G in which every vertex has degree 1 or
2 (i.e., no more than 2 incident edges). Such a subgraph is a vertex-disjoint
collection of edges and pairs of adjacent edges and can be computed
in polynomial time. Each component in the subgraph is treated as a cluster,
and we can obtain a 2-anonymized table by suppressing each cell for which
the vectors in the cluster differ in value. This procedure is a 1.5-approximation
algorithm.
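The last step of this procedure, turning clusters into a 2-anonymized table, can be sketched in Python. The computation of the minimum-weight [1, 2]-factor is not shown: the clusters below are given by hand and are purely illustrative, and both the function names and the "*" marker for a suppressed cell are assumptions of this sketch.

```python
def suppress_within_cluster(cluster):
    """Suppress every cell on which the tuples of one cluster differ."""
    columns = list(zip(*cluster))
    agree = [len(set(values)) == 1 for values in columns]
    return [
        tuple(value if same else "*" for value, same in zip(row, agree))
        for row in cluster
    ]


def anonymize(clusters):
    """Apply cell suppression cluster by cluster and collect the tuples."""
    released = []
    for cluster in clusters:
        released.extend(suppress_within_cluster(cluster))
    return released


# Hypothetical clusters over (Race, ZIP).
clusters = [
    [("asian", "94142"), ("asian", "94141")],
    [("black", "94139"), ("black", "94139"), ("white", "94139")],
]
for row in anonymize(clusters):
    print(row)
# ('asian', '*')
# ('asian', '*')
# ('*', '94139')
# ('*', '94139')
# ('*', '94139')
```

After suppression all tuples of a cluster coincide, so every released tuple appears at least twice; the quality of the result depends on how few cells the chosen clustering forces to be suppressed.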
The approximation algorithm for k = 3 is similar and guarantees a 2-
approximation solution.
Multidimensional k-Anonymity
ℓ-Diversity
knows that this entity is represented in the table, the attacker can infer the
associated sensitive value with certainty. For instance, with respect to the
2-anonymous table in Fig. 15, if Alice knows that Carol is a black female
and that her data are in the microdata table, she can infer that Carol suffers
from short breath, as both the tuples having these values for the Race and
Sex attributes are associated with the short breath value for the Disease
attribute. The 2-anonymous table is therefore exposed to attribute linkage.
The background knowledge attack is instead based on prior knowledge, held by
the attacker, of some additional external information. For instance, suppose
that Alice knows that Hellen is a white female. Alice can then infer that
Hellen suffers from chest pain or short breath. Suppose now that Alice
knows that Hellen runs for two hours every day. Since a person who suffers from
short breath cannot run for a long period, Alice can infer with probability
equal to 1 that Hellen suffers from chest pain.
To avoid such attacks, Machanavajjhala, Gehrke, and Kifer introduce the
notion of ℓ-diversity [23]. Given a private table PT and a generalization GT of
PT, let a q-block be a set of tuples in GT with the same quasi-identifier value.
A q-block is said to be ℓ-diverse if it contains at least ℓ different values for
the sensitive attribute. It is easy to see that, with this additional constraint,
the homogeneity attack is no longer applicable because each q-block has at
least ℓ (≥ 2) distinct sensitive attribute values. Analogously, the background
knowledge attack becomes more complicated as ℓ increases because the attacker
needs more knowledge to single out a unique value associable with a given
entity. The algorithm proposed in [23] therefore generates k-anonymous tables
with the ℓ-diversity property. The algorithm checks the ℓ-diversity property,
which is a monotonic property with respect to the generalization hierarchies
considered for k-anonymity purposes.
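The definition of an ℓ-diverse q-block translates directly into a short check. The Python sketch below is illustrative only: the function names are hypothetical, and the four rows are an assumed fragment consistent with the Carol and Hellen examples above, not a reproduction of Fig. 15.

```python
from collections import defaultdict


def q_blocks(rows, qi_positions, sensitive_position):
    """Group the sensitive values by quasi-identifier value (one q-block each)."""
    blocks = defaultdict(set)
    for row in rows:
        key = tuple(row[p] for p in qi_positions)
        blocks[key].add(row[sensitive_position])
    return blocks


def is_l_diverse(rows, qi_positions, sensitive_position, l):
    """True if every q-block holds at least l distinct sensitive values."""
    blocks = q_blocks(rows, qi_positions, sensitive_position)
    return all(len(values) >= l for values in blocks.values())


# Hypothetical generalized rows: (Race, Sex, Disease).
ROWS = [
    ("black", "F", "short breath"),
    ("black", "F", "short breath"),
    ("white", "F", "chest pain"),
    ("white", "F", "short breath"),
]
print(is_l_diverse(ROWS, (0, 1), 2, 2))      # False
print(is_l_diverse(ROWS[2:], (0, 1), 2, 2))  # True
```

The fragment is 2-anonymous, yet it is not 2-diverse: the q-block of black females carries a single disease value, which is exactly the situation that lets Alice infer Carol's condition.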
Evaluation of k-Anonymity
Some recent papers evaluate the results of k-anonymization using data min-
ing techniques [1, 12, 32]. In particular, Aggarwal [1] shows that, when the
number of attributes in the quasi-identifier increases, the information loss of
the resulting k-anonymized table may become very high. The intuition behind
this result is that the probability that k tuples in the private table are “sim-
ilar” (i.e., they correspond to the same tuple in the anonymized table with
a reduced loss of information) is very low. The ability to identify minimal
quasi-identifiers is therefore important.
Distributed Algorithms
Yao, Wang and Jajodia show that the problem of determining whether a set
of views violates k-anonymity is in general computationally hard (NP^NP-hard).
In the case where no functional dependencies exist among the views,
the problem becomes simpler, and a polynomial checking algorithm for its
solution is described [7].
The k-anonymity property has been studied also for protecting location pri-
vacy [6, 14]. In the context of location-based services, Bettini, Wang and
Jajodia [6] present a framework for evaluating the privacy of a user identity
when location information is released. In this case, k-anonymity is guaranteed,
not among a set of tuples of a database, but in a set of individuals that can
send a message in the same spatio-temporal context.
7 Conclusions
k-anonymity has recently been investigated as an interesting approach to pro-
tect microdata undergoing public or semi-public release from linking attacks.
In this chapter, we illustrated the original k-anonymity proposal and its
enforcement via generalization and suppression as means to protect respondents’
identities while releasing truthful information. We then discussed different
ways in which generalization and suppression can be applied, thus defining a
possible taxonomy for k-anonymity, and discussed the main proposals existing
in the literature for solving the k-anonymity problem in the different models.
We have also illustrated further studies building on the k-anonymity concept
to safeguard privacy.
8 Acknowledgments
This work was supported in part by the European Union within the PRIME
Project in the FP6/IST Programme under contract IST-2002-507591 and by
the Italian MIUR within the KIWI and MAPS projects.
References
1. Aggarwal C (2005). On k-anonymity and the curse of dimensionality. In Proc.
of the 31st International Conference on Very Large Data Bases (VLDB’05),
Trondheim, Norway.
2. Aggarwal G, Feder T, Kenthapadi K, Motwani R, Panigrahy R, Thomas D,
Zhu A (2005). Anonymizing tables. In Proc. of the 10th International Conference
on Database Theory (ICDT’05), pp. 246–258, Edinburgh, Scotland.
3. Aggarwal G, Feder T, Kenthapadi K, Motwani R, Panigrahy R, Thomas D,
Zhu A (2005). Approximation algorithms for k-anonymity. Journal of Privacy
Technology, paper number 20051120001.
4. Anderson R (1996). A security policy model for clinical information systems.
In Proc. of the 1996 IEEE Symposium on Security and Privacy, pp. 30–43,
Oakland, CA, USA.
5. Bayardo RJ, Agrawal R (2005). Data privacy through optimal k-anonymization.
In Proc. of the 21st International Conference on Data Engineering (ICDE’05),
pp. 217–228, Tokyo, Japan.
6. Bettini C, Wang XS, Jajodia S (2005). Protecting privacy against
location-based personal identification. In Proc. of the Secure Data Management,
Trondheim, Norway.
7. Jajodia S, Yao C, Wang XS (2005). Checking for k-anonymity violation by
views. In Proc. of the 31st International Conference on Very Large Data Bases
(VLDB’05), Trondheim, Norway.
8. Dobson J, Jajodia S, Olivier M, Samarati P, Thuraisingham B (1998). Privacy
issues in www and data mining. In Proc. of the 12th IFIP WG11.3 Working
Conference on Database Security, Chalkidiki, Greece. Panel notes.