On the Discrepancy Measures for the Optimal Equal Probability Partitioning in Bayesian Multivariate Micro-Aggregation

Kokolakis, George; Fouskakis, Dimitris

doi:10.1007/s00357-008-9014-8


Published: 24 October 2008

Volume 25, pages 209–224, (2008)

Journal of Classification


Abstract

Data holders, such as statistical institutions and financial organizations, face a serious and demanding task when producing data for official and public use: they must control the risk of identity disclosure and protect sensitive information when they communicate data-sets among themselves, to governmental agencies and to the public. One of the techniques applied is micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original data-set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions, one of which is conditional on the original data-set and the other conditional on the aggregated data-set. Assuming d-variate normal data-sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability m-partition of $\mathbb{R}^{d}$, with $m^{1/d} \in \mathbb{N}$, is the convex one provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes, regardless of which discrepancy measure is used. On the basis of this result, a method that produces a sub-optimal partition at a very small computational cost is presented.
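As an illustration of what an equal probability partition with axis-perpendicular sides can look like, the following minimal sketch cuts each axis at standard normal quantiles. It assumes the data have already been standardized to independent N(0, 1) coordinates, which is an assumption of this sketch and not a step stated in the abstract. The function names `axis_cuts` and `cell_index` are hypothetical, and the sketch is not the authors' sub-optimal method, whose details are not given here.

```python
from bisect import bisect_right
from statistics import NormalDist


def axis_cuts(k):
    """Interior cut points splitting N(0, 1) into k equal-probability intervals."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(j / k) for j in range(1, k)]


def cell_index(point, cuts):
    """Map a d-dimensional point to a tuple of per-axis interval indices.

    Assumes independent standard normal coordinates, so every cell has
    probability (1/k)**d = 1/m, where k = len(cuts) + 1.
    """
    return tuple(bisect_right(cuts, x) for x in point)


# Example: d = 2 and m = 9 cells, so k = m**(1/d) = 3 intervals per axis.
cuts = axis_cuts(3)
cell = cell_index((0.0, 1.2), cuts)
```

Here `cuts` holds the two interior quantiles (at probabilities 1/3 and 2/3), and the point (0.0, 1.2) falls in the middle interval on the first axis and the upper interval on the second, giving the cell (1, 2).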



Author information

Authors and Affiliations

Department of Applied Mathematics, National Technical University of Athens, Zografou Campus, Athens, 15780, Greece
George Kokolakis & Dimitris Fouskakis


Corresponding author

Correspondence to George Kokolakis.

Additional information

Published online xx, xx, xxxx.


About this article

Cite this article

Kokolakis, G., Fouskakis, D. On the Discrepancy Measures for the Optimal Equal Probability Partitioning in Bayesian Multivariate Micro-Aggregation. J Classif 25, 209–224 (2008). https://doi.org/10.1007/s00357-008-9014-8

Issue Date: November 2008
