Abstract
There is growing public concern about personal data collected by both private and public sectors. People have very little control over what kinds of data are stored and how such data is used. Moreover, the ability to infer new knowledge from existing data is increasing rapidly with advances in database and data mining technologies. We describe a solution which allows people to take control by specifying constraints on the ways in which their data can be used. User constraints are represented in formal logic, and organizations that want to use this data provide formal proofs that the software they use to process data meets these constraints. Checking the proof by an independent verifier demonstrates that user constraints are (or are not) respected by this software. Our notion of “privacy correctness” differs from general software correctness in two ways. First, properties of interest are simpler and thus their proofs should be easier to automate. Second, this kind of correctness is stricter; in addition to showing a certain relation between input and output is realized, we must also show that only operations that respect privacy constraints are applied during execution. We have therefore an intensional notion of correctness, rather that the usual extensional one. We discuss how our mechanism can be put into practice, and we present the technical aspects via an example. Our example shows how users can exercise control when their data is to be used as input to a decision tree learning algorithm. We have formalized the example and the proof of preservation of privacy constraints in Coq.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Chen, W., Naughton, J.F., Bernstein, P.A. (eds.) 2000 ACM SIGMOD International Conference on Management of Data, pp. 439–450. ACM, New York (2000)
Bove, A., Capretta, V.: Nested general recursion and partiality in type theory. In: Boulton, R.J., Jackson, P.B. (eds.) TPHOLs 2001. LNCS, vol. 2152, pp. 121–135. Springer, Heidelberg (2001)
Bove, A., Capretta, V.: Modelling general recursion in type theory. To appear in Mathematical Structures in Computer Science (2004), Available at http://www.science.uottawa.ca/~vcapr396/
Capretta, V.: Abstraction and Computation. PhD thesis, Computing Science Institute, University of Nijmegen (2002)
Coquand, T.: Infinite objects in type theory. In: Barendregt, H., Nipkow, T. (eds.) TYPES 1993. LNCS, vol. 806, pp. 62–78. Springer, Heidelberg (1994)
Coq Development Team. The Coq Proof Assistant reference manual: Version 7.4. Technical report, INRIA (2003)
Coquand, T., Paulin, C.: Inductively defined types. In: Martin-Löf, P., Mints, G. (eds.) COLOG 1988. LNCS, vol. 417. Springer, Heidelberg (1990)
EPIC. Electronic Privacy Information Center (2005), http://www.epic.org/
Evfimievski, A., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. In: Eighth ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (July 2002)
Felty, A., Matwin, S.: Privacy-oriented data mining by proof checking. In: Elomaa, T., Mannila, H., Toivonen, H. (eds.) PKDD 2002. LNCS (LNAI), vol. 2431, pp. 138–149. Springer, Heidelberg (2002)
Giménez, E.: Codifying guarded definitions with recursive schemes. In: Smith, J., Dybjer, P., Nordström, B. (eds.) TYPES 1994. LNCS, vol. 996, pp. 39–59. Springer, Heidelberg (1995)
Giménez, E.: A Tutorial on Recursive Types in Coq. Technical Report 0221, Unité de recherche INRIA Rocquencourt (May 1998)
Hayati, K., Abadi, M.: Language-based enforcement of privacy policies. In: Martin, D., Serjantov, A. (eds.) PET 2004. LNCS, vol. 3424, pp. 302–313. Springer, Heidelberg (2005)
Hagino, T.: A typed lambda calculus with categorical type constructors. In: Pitt, D.H., Rydeheard, D.E., Poigné, A. (eds.) Category Theory and Computer Science. LNCS, vol. 283, pp. 140–157. Springer, Heidelberg (1987)
IPCO. Data mining: Staking a claim on your privacy, Information and Privacy Commissioner/Ontario (January 1998), http://www.ipc.on.ca/scripts/index.asp?action=31&P_ID=11387&N_ID=1&PT_ID=11351&U_ID=0
Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: Eighth ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining, July 2002, pp. 279–287 (2002)
Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. In: The ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2002) (June 2002)
Mitchell, T.M.: Machine Learning. McGraw Hill, New York (1997)
McBride, C., McKinna, J.: The view from the left. Journal of Functional Programming 14(1), 69–111 (2004)
Marché, C., Paulin-Mohring, C., Urbain, X.: The Krakatoa tool for certification of Java/JavaCard programs annotated in JML. Journal of Logic and Algebraic Programming 58(1–2), 89–106 (2004)
Paulin-Mohring, C.: Inductive Definitions in the System Coq - Rules and Properties. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664. Springer, Heidelberg (1993)
Ries, D.G.: Protecting consumer online privacy – an overview (May 2001), http://www.pbi.org/Goodies/privacy/privacy_ries.htm
USCM. Mayors attack urban redlining, mortgage discrimination, The US Conference of Mayors (1998), http://www.usmayors.org/uscm/news/press_releases/press_archive.asp?doc_id=98
Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Eighth ACM SIGKDD International Conference on Knowledge Discovery in Databases and Data Mining (July 2002)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (1999)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matwin, S., Felty, A., Hernádvölgyi, I., Capretta, V. (2005). Privacy in Data Mining Using Formal Methods. In: Urzyczyn, P. (eds) Typed Lambda Calculi and Applications. TLCA 2005. Lecture Notes in Computer Science, vol 3461. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11417170_21
Download citation
DOI: https://doi.org/10.1007/11417170_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25593-2
Online ISBN: 978-3-540-32014-2
eBook Packages: Computer ScienceComputer Science (R0)