Part of the book series: Studies in Big Data ((SBD,volume 28))

Abstract

Defining effective methods for ensuring confidentiality requires privacy models together with measures for assessing disclosure risk. This chapter reviews the main models and measures.

Notes

  1. This way of proceeding is already discussed in, e.g., [29] (p. 408): “Methods that mask the key variables impede identification of the respondent in the file, and methods that mask the target variables limit what is learned if a match is made. Both approaches may be useful, and in practice a precise classification of variables as keys or targets may be difficult. However, masking of targets is more vulnerable to the trade-off between protection gain and information loss than masking of keys; hence masking of keys seems potentially more fruitful”.

  2. In databases, the schema defines the attributes, their types, and their relationships. It roughly corresponds to metadata in statistical disclosure control.

  3. A description of hash functions, which are very common in data structures, can be found, e.g., in [74].

  4. The question of which attributes act as quasi-identifiers when attacking a database is discussed, e.g., in the literature on data protection for graphs and social networks. There are a few competing definitions of k-anonymity for graphs, each corresponding to a different set of quasi-identifiers. We will discuss them in Sect. 6.4.2 (on algorithms for k-anonymity for big data).

  5. There are two main types of methods for anomaly detection: models based on misuses (a database of misuses is used to learn what an anomaly is) and models based on correct activity (the learned model explains normal activity, and whatever diverges from it is classified as an anomaly). The latter approach seems more suitable for protected data when the protection process can eliminate outliers and, thus, the anomalies of a database.
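As a rough illustration of note 4, the sketch below shows how the anonymity level of one and the same table changes with the set of attributes treated as quasi-identifiers. It is not taken from the chapter: the function name `anonymity_level`, the attribute names, and the toy records are all hypothetical.

```python
from collections import Counter

def anonymity_level(records, quasi_identifiers):
    """Return the largest k such that `records` is k-anonymous with
    respect to the given quasi-identifiers (0 for an empty table)."""
    if not records:
        return 0
    # Group records by their combination of quasi-identifier values.
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    # The smallest group bounds the anonymity of the whole table.
    return min(groups.values())

# Hypothetical toy table; values are already generalized.
records = [
    {"age": "30-39", "zip": "081**", "illness": "flu"},
    {"age": "30-39", "zip": "081**", "illness": "asthma"},
    {"age": "40-49", "zip": "081**", "illness": "flu"},
    {"age": "40-49", "zip": "082**", "illness": "cancer"},
]

print(anonymity_level(records, ["age"]))         # 2
print(anonymity_level(records, ["age", "zip"]))  # 1
```

With age alone as the quasi-identifier, every record shares its value with one other record; once the zip code is assumed to be known to the intruder as well, two records become unique.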

References

  1. Bambauer, J.: Tragedy of the deidentified data commons: an appeal for transparency and access. In: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality, Ottawa, Canada, 28–30 Oct 2013

  2. Yakowitz, J.: Tragedy of the data commons. Harv. J. Law Technol. 25(1), 1–67 (2011)

  3. Polonetsky, J., Wolf, C., Brennan, M.W.: Comments of the future of privacy forum. Future of Privacy. http://www.futureofprivacy.org/wp-content/uploads/01-17-2014-FPF-Comments-to-the-FCC.pdf (2014)

  4. de Montjoye, Y.-A., Radaelli, L., Singh, V.K., Pentland, A.S.: Unique in the shopping mall: on the reidentifiability of credit card metadata. Science 347, 536–539 (2015)

  5. Jändel, M.: Anonymization of personal data is impossible in practice. Presented in Kistamässan om Samhällssäkerhet (2015)

  6. Barth-Jones, D., El Emam, K., Bambauer, J., Cavoukian, A., Malin, B.: Assessing data intrusion threats. Science 348, 194–195 (2015)

  7. Sánchez, D., Martínez, S., Domingo-Ferrer, J.: Comment on “Unique in the shopping mall: reidentifiability of credit card metadata”. Science, 18 March 1274-a (2016)

  8. de Montjoye, Y.-A., Pentland, A.S.: Response. Science 348, 195 (2015)

  9. de Montjoye, Y.-A., Pentland, A.S.: Response to Comment on “Unique in the shopping mall: On the reidentifiability of credit card metadata”. Science, 18 March 1274-b (2016)

  10. Cavoukian, A., El Emam, K.: Dispelling the Myths Surrounding De-identification: Anonymization Remains a Strong Tool for Protecting Privacy (2011)

  11. Dalenius, T.: Towards a methodology for statistical disclosure control. Statistisk Tidskrift 5, 429–444 (1977)

  12. Krantz, D.H., Luce, R.D., Suppes, P., Tversky, A.: Foundations of Measurement: Additive and Polynomial Representations, vol. 1. Academic Press, New York (1971)

  13. Luce, R.D., Krantz, D.H., Suppes, P., Tversky, A.: Foundations of Measurement: Representation, Axiomatization, and Invariance, vol. 3. Academic Press, New York (1990)

  14. Roberts, F.S.: Measurement Theory. Addison-Wesley, Reading (1979)

  15. Suppes, P., Krantz, D.H., Luce, R.D., Tversky, A.: Foundations of Measurement: Geometrical, Threshold, and Probability Representations, vol. 2. Academic Press, San Diego (1989)

  16. Dalenius, T.: Finding a needle in a haystack—or identifying anonymous census records. J. Off. Stat. 2(3), 329–336 (1986)

  17. Domingo-Ferrer, J., Mateo-Sanz, J.M., Torra, V.: Comparing SDC methods for microdata on the basis of information loss and disclosure risk. In: Pre-proceedings of ETK-NTTS 2001, vol. 2, pp. 807–826. Eurostat (2001)

  18. Domingo-Ferrer, J., Torra, V.: A quantitative comparison of disclosure control methods for microdata. In: Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L. (eds.) Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, pp. 111–134. North-Holland, Amsterdam (2001)

  19. Mateo-Sanz, J.M., Sebé, F., Domingo-Ferrer, J.: Outlier protection in continuous microdata masking. In: PSD 2004. LNCS, vol. 3050, pp. 201–215 (2004)

  20. Templ, M.: Statistical disclosure control for microdata using the R-Package sdcMicro. Trans. Data Priv. 1, 67–85 (2008)

  21. Nin, J., Herranz, J., Torra, V.: Using classification methods to evaluate attribute disclosure risk. In: Proceedings of the MDAI 2010. LNCS, vol. 6408, pp. 277–286 (2010)

  22. Herranz, J., Matwin, S., Nin, J., Torra, V.: Classifying data from protected statistical datasets. Comput. Secur. 29, 875–890 (2010)

  23. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. 11, 1 (2009)

  24. Balsa, E., Troncoso, C., Díaz, C.: A metric to evaluate interaction obfuscation in online social networks. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 20, 877–892 (2012)

  25. Muralidhar, K., Sarathy, R.: Statistical dependence as the basis for a privacy measure for microdata release. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 20, 893–906 (2012)

  26. Sweeney, L.: \(k\)-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 557–570 (2002)

  27. Torra, V., Abowd, J.M., Domingo-Ferrer, J.: Using Mahalanobis distance-based record linkage for disclosure risk assessment. LNCS, vol. 4302, pp. 233–242 (2006)

  28. Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)

  29. Little, R.J.A.: Statistical analysis of masked data. J. Off. Stat. 9(2), 407–426 (1993)

  30. Nin, J., Herranz, J., Torra, V.: Rethinking rank swapping to decrease disclosure risk. Data Knowl. Eng. 64(1), 346–364 (2007)

  31. Torra, V.: OWA operators in data modeling and reidentification. IEEE Trans. Fuzzy Syst. 12(5), 652–660 (2004)

  32. Domingo-Ferrer, J., Torra, V.: Disclosure risk assessment in statistical microdata protection via advanced record linkage. Stat. Comput. 13, 343–354 (2003)

  33. Spruill, N.L.: The confidentiality and analytic usefulness of masked business microdata. In: Proceedings of the Section on Survey Research Methods 1983, pp. 602–610, American Statistical Association (1983)

  34. Paass, G.: Disclosure risk and disclosure avoidance for microdata. J. Bus. Econ. Stat. 6, 487–500 (1985)

  35. Paass, G., Wauschkuhn, U.: Datenzugang, Datenschutz und Anonymisierung—Analysepotential und Identifizierbarkeit von Anonymisierten Individualdaten. Oldenbourg Verlag (1985)

  36. Willenborg, L., de Waal, T.: Elements of Statistical Disclosure Control. Lecture Notes in Statistics. Springer, New York (2001)

  37. Elliot, M.J., Manning, A.M., Ford, R.W.: A computational algorithm for handling the special uniques problem. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 493–509 (2002)

  38. Manning, A.M., Haglin, D.J., Keane, J.A.: A recursive search algorithm for statistical disclosure assessment. Data Min. Knowl. Disc. 16, 165–196 (2008)

  39. Elliot, M.J., Skinner, C.J., Dale, A.: Special uniqueness, random uniques and sticky populations: some counterintuitive effects of geographical detail on disclosure risk. Res. Off. Stat. 1(2), 53–67 (1998)

  40. Elamir, E.A.H.: Analysis of re-identification risk based on log-linear models. In: Proceedings of the PSD 2004. LNCS, vol. 3050, pp. 273–281 (2004)

  41. Elliot, M.: Integrating file and record level disclosure risk assessment. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. LNCS, vol. 2316, pp. 126–134 (2002)

  42. Franconi, L., Polettini, S.: Individual risk estimation in \(\mu \)-argus: a review. In: PSD 2004. LNCS, vol. 3050, pp. 262–272 (2004)

  43. Winkler, W.E.: Re-identification methods for masked microdata. In: Proceedings of PSD 2004. LNCS, vol. 3050, pp. 216–230 (2004)

  44. Bacher, J., Brand, R., Bender, S.: Re-identifying register data by survey data using cluster analysis: an empirical study. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 589–607 (2002)

  45. Herranz, J., Nin, J., Rodríguez, P., Tassa, T.: Revisiting distance-based record linkage for privacy-preserving release of statistical datasets. Data Knowl. Eng. 100, 78–93 (2015)

  46. Lenz, R.: A graph theoretical approach to record linkage. In: Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Working Paper no. 35 (2003)

  47. Scannapieco, M., Cibella, N., Tosco, L., Tuoto, T., Valentino, L., Fortini, M., Mancini, L.: Relais (REcord Linkage At IStat): user’s guide. http://www.istat.it/en/tools/methods-and-it-tools/processing-tools/relais (2015)

  48. Torra, V., Miyamoto, S.: Evaluating fuzzy clustering algorithms for microdata protection. In: Proceedings of PSD 2004. LNCS, vol. 3050, pp. 175–186 (2004)

  49. Nin, J., Torra, V.: Analysis of the univariate microaggregation disclosure risk. New Gener. Comput. 27, 177–194 (2009)

  50. Nin, J., Herranz, J., Torra, V.: On the disclosure risk of multivariate microaggregation. Data Knowl. Eng. 67(3), 399–412 (2008)

  51. Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: Proceedings of the 2008 IEEE Symposium on Security and Privacy (SP 2008), pp. 111–125 (2008)

  52. Martínez, S., Valls, A., Sánchez, D.: An ontology-based record linkage method for textual microdata. In: Artificial Intelligence Research and Development, vol. 232, pp. 130–139. IOS Press (2011)

  53. Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann Publishers, San Francisco (2011)

  54. Rahm, E., Do, H.-H.: Data cleaning: problems and current approaches. Bull. IEEE Comput. Soci. Techn. Committee Data Eng. 23(4), 3–13 (2000)

  55. International Classification of Diseases (ICD), 10th revision. http://www.who.int/classifications/icd/en/. Accessed Jan 2017

  56. Gaines, B.R., Shaw, M.L.G.: Knowledge acquisition tools based on personal construct psychology. Knowl. Eng. Rev. 8, 49–85 (1993)

  57. Torra, V.: Towards the re-identification of individuals in data files with non-common variables. In: Proceedings of ECAI 2000, pp. 326–330 (2000)

  58. Boose, J.H.: Expertise Transfer for Expert System Design. Elsevier, New York (1986)

  59. Christen, P.: Data Matching—Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer, Heidelberg (2012)

  60. Torra, V., Narukawa, Y.: Modeling Decisions: Information Fusion and Aggregation Operators. Springer, Heidelberg (2007)

  61. Euzenat, J., Shvaiko, P.: Ontology Matching. Springer, Heidelberg (2013)

  62. Shvaiko, P., Euzenat, J.: Ontology matching: state of the art and future challenges. IEEE Trans. Knowl. Data Eng. 25(1), 158–176 (2013)

  63. Bergamaschi, S., Castano, S., Vincini, M., Beneventano, D.: Semantic integration of heterogeneous information sources. Data Knowl. Eng. 36(3), 215–249 (2001)

  64. Do, H.-H., Rahm, E.: COMA—a system for flexible combination of schema matching approaches. In: Proceedings of VLDB, pp. 610–621 (2002)

  65. Embley, D.W., Jackman, D., Xu, L.: Multifaceted Exploitation of Metadata for Attribute Match Discovery in Information Integration (2001)

  66. Princeton University: “About WordNet”. WordNet. Princeton University. http://wordnet.princeton.edu (2010). Accessed Jan 2017

  67. Rahm, E., Bernstein, P.A.: A survey of approaches to automatic schema matching. VLDB J. 10, 334–350 (2001)

  68. Haas, L.M., Miller, R.J., Niswonger, B., Tork Roth, M., Schwarz, P.M., Wimmers, E.L.: Transforming heterogeneous data with database middleware: beyond integration. Bull. IEEE Comput. Soci. Techn. Committee Data Eng. 22, 31–36 (1999)

  69. Borkar, V., Deshmukh, K., Sarawagi, S.: Automatic segmentation of text into structured records. In: Proceedings of ACM SIGMOD Conference (2001)

  70. Churches, T., Christen, P., Lim, K., Zhu, J.X.: Preparation of name and address data for record linkage using hidden Markov models. BMC Med. Inform. Decis. Making 2, 9 (2002)

  71. Li, W.-S., Clifton, C.: SEMINT: a tool for identifying attribute correspondences in heterogeneous databases using neural networks. Data Knowl. Eng. 33, 49–84 (2000)

  72. Winkler, W.E.: Matching and record linkage. Statistical Research Division, U.S. Bureau of the Census (USA), RR93/08 (1993). Also in B.G. Cox (ed.) Business Survey Methods, pp. 355–384. Wiley (1995)

  73. Steorts, R.C., Ventura, S.L., Sadinle, M., Fienberg, S.E.: A comparison of blocking methods for record linkage. In: PSD 2014. LNCS, vol. 8744, pp. 253–268 (2014)

  74. Aho, A.V., Ullman, J.D., Hopcroft, J.E.: Data Structures and Algorithms. Addison-Wesley, Reading (1988)

  75. Christen, P.: A survey of indexing techniques for scalable record linkage and deduplication. IEEE Trans. Knowl. Data Eng. 24, 1537–1555 (2012)

  76. Nin, J., Muntés-Mulero, V., Martínez-Bazan, N., Larriba-Pey, J.-L.: On the use of semantic blocking techniques for data cleansing and integration. In: Proceedings of IDEAS 2007, pp. 190–198 (2007)

  77. Michelson, M., Knoblock, C.A.: Learning blocking schemes for record linkage. In: AAAI (2006)

  78. Searcóid, M.O.: Metric Spaces. Springer, London (2007)

  79. Salari, M., Jalili, S., Mortazavi, R.: TBM, a transformation based method for microaggregation of large volume mixed data. Data Min. Knowl. Disc. (2016, in press). doi:10.1007/s10618-016-0457-y

  80. Navarro, G.: A guided tour to approximate string matching. ACM Comput. Surv. 33(1), 31–88 (2001)

  81. Stephen, G.A.: String Searching Algorithms. World Scientific Publishing Co., Singapore (1994)

  82. Odell, M.K., Russell, R.C.: U. S. Patents 1261167 (1918)

  83. Odell, M.K., Russell, R.C.: U. S. Patents 1435663 (1922)

  84. Knuth, D.E.: The Art of Computer Programming: Sorting and Searching, vol. 3. Addison-Wesley, Reading (1973)

  85. Jaro, M.A.: Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida. J. Am. Stat. Assoc. 84(406), 414–420 (1989)

  86. Newcombe, H.B.: Record linking: the design of efficient systems for linking records into individuals and family histories. Am. J. Hum. Genet. 19 Part I(3) (1967)

  87. Blair, C.R.: A program for correcting spelling errors. Inf. Control 3(1), 60–67 (1960)

  88. Jaro, M.A.: UNIMATCH: a record linkage system: user’s manual. U.S. Bureau of the Census, Washington DC (1978)

  89. Porter, E.H., Winkler, W.E.: Approximate string comparison and its effect on an advanced record linkage system. Report RR97/02, Statistical Research Division, U.S. Bureau of the Census, USA (1997)

  90. This link is currently outdated. http://www.census.gov/geo/msb/stand/strcmp.c. Code, http://www.perlmonks.org/?node=659795, https://people.rit.edu/rmb5229/320/project3/jaro_winkler.html

  91. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals, Doklady Academii nauk SSSR 163(4), 845–848 (1965) (in Russian). (Also in Cybern. Control Theor. 10(8), 707–710 (1966))

  92. Cohen, W.W., Ravikumar, P., Fienberg, S.E.: A comparison of string metrics for matching names and records. In: Proceedings of the KDD 2003 (2003)

  93. Herranz, J., Nin, J., Solé, M.: Optimal Symbol Alignment distance: a new distance for sequences of symbols. IEEE Trans. Knowl. Data Eng. 23(10), 1541–1554 (2011)

  94. Muralidhar, K., Domingo-Ferrer, J.: Rank-based record linkage for re-identification risk assessment. In: Proceedings of PSD 2016 (2016)

  95. Torra, V., Domingo-Ferrer, J.: Record linkage methods for multidatabase data mining. In: Torra, V. (ed.) Information Fusion in Data Mining, pp. 101–132. Springer, Heidelberg (2003)

  96. Fellegi, I.P., Sunter, A.B.: A theory for record linkage. J. Am. Stat. Assoc. 64(328), 1183–1210 (1969)

  97. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. 39, 1–38 (1977)

  98. Domingo-Ferrer, J., Torra, V.: Validating distance-based record linkage with probabilistic record linkage. LNCS, vol. 2504, pp. 207–215 (2002)

  99. Newcombe, H.B., Kennedy, J.M., Axford, S.L., James, A.P.: Automatic linkage of vital records. Science 130, 954 (1959)

  100. Winkler, W.E.: Methods for Record Linkage and Bayesian Networks, Bureau of the Census (USA), RR2002/05 (2002)

  101. Larsen, M.D., Rubin, D.B.: Iterative automated record linkage using mixture models. J. Am. Stat. Assoc. 79, 32–41 (2001)

  102. Winkler, W.E.: Improved decision rules in the Fellegi-Sunter model of record linkage, pp. 274–279. In: Proceedings of the Section on Survey Research Methods. American Statistical Association (1993)

  103. Tromp, M., Méray, N., Ravelli, A.C.J., Reitsma, J.B., Bonsel, G.J.: Ignoring dependency between linking variables and its impact on the outcome of probabilistic record linkage studies. J. Am. Med. Inform. Assoc. 15(5), 654–660 (2008)

  104. Daggy, J.K., Xu, H., Hui, S.L., Gamache, R.E., Grannis, S.J.: A practical approach for incorporating dependence among fields in probabilistic record linkage. BMC Med. Inform. Decis. Making 13, 97 (2013)

  105. Herzog, T.N., Scheuren, F.J., Winkler, W.E.: Data Quality and Record Linkage Techniques. Springer, New York (2007)

  106. Elmagarmid, A.K., Ipeirotis, P.G., Verykios, V.S.: Duplicate record detection: a survey. IEEE Trans. Knowl. Data Eng. 19(1), 1–16 (2007)

  107. https://www.cs.cmu.edu/~wcohen/matching/. Accessed Jan 2017

  108. Winkler, W.E.: Overview of record linkage and current research directions, U.S. Census Bureau RR2006/02 (2006)

  109. Vatsalan, D., Christen, P., Verykios, V.S.: A taxonomy of privacy-preserving record linkage techniques. Inf. Syst. 38, 946–969 (2013)

  110. Pagliuca, D., Seri, G.: Some results of individual ranking method on the system of enterprise accounts annual survey. Esprit SDC Project, Deliverable MI-3/D2 (1999)

  111. Beliakov, G., Pradera, A., Calvo, T.: Aggregation Functions: A Guide for Practitioners. Springer, Heidelberg (2007)

  112. Grabisch, M., Marichal, J.-L., Mesiar, R., Pap, E.: Aggregation Functions. In: Encyclopedia of Mathematics and its Applications, No. 127. Cambridge University Press (2009)

  113. Abril, D., Navarro-Arribas, G., Torra, V.: Improving record linkage with supervised learning for disclosure risk assessment. Inf. Fusion 13(4), 274–284 (2012)

  114. Torra, V., Navarro-Arribas, G., Abril, D.: Supervised learning for record linkage through weighted means and OWA operators. Control Cybern. 39(4), 1011–1026 (2010)

  115. Abril, D., Navarro-Arribas, G., Torra, V.: Choquet integral for record linkage. Ann. Oper. Res. 195, 97–110 (2012)

  116. Abril, D., Navarro-Arribas, G., Torra, V.: Supervised learning using a symmetric bilinear form for record linkage. Inf. Fusion 26, 144–153 (2016)

  117. IBM ILOG CPLEX: High-performance mathematical programming engine. International Business Machines Corp. http://www-01.ibm.com/software/integration/optimization/cplex/ (2010)

  118. Neumann, D.A., Norton Jr., V.T.: Clustering and isolation in the consensus problem for partitions. J. Classif. 3, 281–297 (1986)

  119. Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: \(k\)-anonymity and its enforcement through generalization and suppression. Technical report, SRI International (1998)

  120. Sweeney, L.: Achieving \(k\)-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002)

  121. Tendick, P., Matloff, N.: A modified random perturbation method for database security. ACM Trans. Database Syst. 19, 47–63 (1994)

  122. Stokes, K., Torra, V.: n-Confusion: a generalization of k-anonymity. In: Proceedings of the Fifth International Workshop on Privacy and Anonymity on Information Society (PAIS 2012) (2012)

  123. Stokes, K., Farràs, O.: Linear spaces and transversal designs: \(k\)-anonymous combinatorial configurations for anonymous database search. Des. Codes Cryptogr. 71, 503–524 (2014)

  124. Tassa, T., Mazza, A., Gionis, A.: k-Concealment: an alternative model of k-Type anonymity. Trans. Data Priv. 5(1), 189–222 (2012)

  125. Soria-Comas, J., Domingo-Ferrer, J.: Probabilistic k-anonymity through microaggregation and data swapping. FUZZ-IEEE 2012, 1–8 (2012)

  126. Gehrke, J., Hay, M., Lui, E., Pass, R.: Crowd-blending privacy. In: 32nd International Cryptology Conference (CRYPTO 2012) (2012)

  127. Gionis, A., Mazza, A., Tassa, T.: k-anonymization revisited. In: Proceedings of ICDE 2008 (2008)

  128. De Capitani di Vimercati, S., Foresti, S., Livraga, G., Samarati, P.: Data privacy: definitions and techniques. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 20(6), 793–817 (2012)

  129. Truta, T.M., Vinay, B.: Privacy protection: p-sensitive k-anonymity property. In: Proceedings of the 2nd International Workshop on Privacy Data Management (PDM 2006), p. 94 (2006)

  130. Truta, T.M., Campan, A., Sun, X.: An overview of p-sensitive k-anonymity models for microdata anonymization. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 20(6), 819–838 (2012)

  131. Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: L-diversity: privacy beyond k-anonymity. In: Proceedings of the IEEE ICDE (2006)

  132. Li, N., Li, T., Venkatasubramanian, S.: T-closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of the IEEE ICDE 2007 (2007)

  133. Fung, B.C.M., Wang, K., Yu, P.S.: Top-down specialization for information and privacy preservation. In: Proceedings of ICDE 2005 (2005)

  134. Stokes, K.: On computational anonymity. In: Proceedings of PSD 2012. LNCS, vol. 7556, pp. 336–347 (2012)

  135. Stokes, K., Torra, V.: Reidentification and k-anonymity: a model for disclosure risk in graphs. Soft Comput. 16(10), 1657–1670 (2012)

  136. Torra, V.: Towards the formalization of re-identification for some data masking methods. In: Proceedings of CCIA 2012, pp. 47–55 (2012)

  137. Torra, V., Stokes, K.: A formalization of re-identification in terms of compatible probabilities, CoRR abs/1301.5022 (2013)

  138. Torra, V., Stokes, K.: A formalization of record linkage and its application to data protection. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 20, 907–919 (2012)

  139. Kosinski, M., Stillwell, D., Graepel, T.: Private traits and attributes are predictable from digital records of human behavior. PNAS 110, 5802–5805 (2013)

  140. Kifer, D., Machanavajjhala, A.: No free lunch in data privacy. In: Proceedings of SIGMOD 2011 (2011)

  141. Stokes, K., Torra, V.: Multiple releases of k-anonymous data sets and k-anonymous relational databases. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 20(6), 839–854 (2012)

  142. Pei, J., Xu, J., Wang, Z., Wang, W., Wang, K.: Maintaining k-anonymity against incremental updates. In: Proceedings of SSDBM (2007)

  143. Truta, T.M., Campan, A.: K-anonymization incremental maintenance and optimization techniques. In: Proceedings of ACM SAC 2007, pp. 380–387 (2007)

  144. Nergiz, M.E., Clifton, C., Nergiz, A.E.: Multirelational k-anonymity. IEEE Trans. Knowl. Data Eng. 21(8), 1104–1117 (2009)

  145. Navarro-Arribas, G., Abril, D., Torra, V.: Dynamic anonymous index for confidential data. In: Proceedings of 8th DPM and SETOP, pp. 362–368 (2013)

  146. D’Acquisto, G., Domingo-Ferrer, J., Kikiras, P., Torra, V., de Montjoye, Y.-A., Bourka, A.: Privacy by design in big data: An overview of privacy enhancing technologies in the era of big data analytics. ENISA Report (2015)

  147. Estivill-Castro, V., Nettleton, D.F.: Privacy Tips: Would it be ever possible to empower online social-network users to control the confidentiality of their data? In: Proceedings of ASONAM 2015, pp. 1449–1456 (2015)

  148. Soria-Comas, J., Domingo-Ferrer, J.: Big data privacy: challenges to privacy principles and models. Data Sci. Eng. 1(1), 21–28 (2016)

  149. Torra, V., Navarro-Arribas, G.: Big data privacy and anonymization. In: Lehmann, A., Whitehouse, D., Fischer-Hübner, S., Fritsch, L.: Privacy and identity management—facing up to next steps. Springer (2017, in press)

Author information

Corresponding author

Correspondence to Vicenç Torra.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Torra, V. (2017). Privacy Models and Disclosure Risk Measures. In: Data Privacy: Foundations, New Developments and the Big Data Challenge. Studies in Big Data, vol 28. Springer, Cham. https://doi.org/10.1007/978-3-319-57358-8_5

  • DOI: https://doi.org/10.1007/978-3-319-57358-8_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57356-4

  • Online ISBN: 978-3-319-57358-8

  • eBook Packages: Engineering, Engineering (R0)