Abstract
Disclosure analysis in two-way contingency tables is important in categorical data analysis. It concerns whether a data snooper can infer any protected cell values, which contain privacy-sensitive information, from the available marginal totals (i.e., row sums and column sums) of a two-way contingency table. Previous research has addressed this problem from various perspectives. However, systematic definitions of the disclosure of cell values are lacking, and no previous study has focused on the distribution of the cells that are subject to the various types of disclosure. In this paper, we define four types of possible disclosure based on the exact upper bound and/or lower bound of each cell, which can be computed from the marginal totals. For each type of disclosure, we discover the distribution pattern of the cells subject to disclosure. Based on these distribution patterns, we can speed up the search for all cells subject to disclosure.
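As a rough illustration of the bounds the abstract refers to, the sketch below computes the standard Fréchet bounds for a two-way table of nonnegative counts: cell (i, j) lies between max(0, r_i + c_j - N) and min(r_i, c_j), where r_i and c_j are the row and column sums and N is the grand total. The function names `cell_bounds` and `exactly_determined` are hypothetical, and the "lower bound equals upper bound" check is only one plausible disclosure criterion chosen for illustration; the paper's four disclosure types are not defined in this abstract and are not reproduced here.

```python
from typing import List, Tuple

Bounds = Tuple[int, int]


def cell_bounds(row_sums: List[int], col_sums: List[int]) -> List[List[Bounds]]:
    """Return (lower, upper) Frechet bounds for every cell of a two-way table.

    Assumes nonnegative cell counts and consistent marginals.
    """
    total = sum(row_sums)
    if total != sum(col_sums):
        raise ValueError("row sums and column sums must have the same total")
    return [
        [(max(0, r + c - total), min(r, c)) for c in col_sums]
        for r in row_sums
    ]


def exactly_determined(bounds: List[List[Bounds]]) -> List[Tuple[int, int]]:
    """List the (row, column) indices whose bounds coincide.

    Illustrative criterion only (an assumption, not the paper's definition).
    """
    return [
        (i, j)
        for i, row in enumerate(bounds)
        for j, (lower, upper) in enumerate(row)
        if lower == upper
    ]


if __name__ == "__main__":
    bounds = cell_bounds([5, 1], [4, 2])
    for row in bounds:
        print(row)
    print(exactly_determined(bounds))
```

With row sums (5, 1) and column sums (4, 2), the top-left cell is bounded by 3 and 4, so a snooper learns it is at least 3 even though no cell is pinned to a single value.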
The work of Lu and Li was supported in part by the SMU Research Office. The work of Wu was supported in part by USA National Science Foundation Grant IIS-0546027.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, H., Li, Y., Wu, X. (2006). Disclosure Analysis for Two-Way Contingency Tables. In: Domingo-Ferrer, J., Franconi, L. (eds) Privacy in Statistical Databases. PSD 2006. Lecture Notes in Computer Science, vol 4302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11930242_6
DOI: https://doi.org/10.1007/11930242_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49330-3
Online ISBN: 978-3-540-49332-7
eBook Packages: Computer Science, Computer Science (R0)