Abstract
Increasingly, companies use multi-source data to operate new information systems, such as social networking, e-commerce, and location-based services. These systems leverage complex, multi-stakeholder data supply chains in which each stakeholder (e.g., users, developers, companies, and government) must manage privacy and security requirements that cover their practices. U.S. and European regulators expect companies to ensure consistency between their privacy policies and their data practices, including restrictions on what data may be collected, how it may be used, to whom it may be transferred, and for what purposes. To help developers check consistency, we identified a strict subset of commonly found privacy requirements and developed a methodology to map these requirements from natural language text to a formal language in description logic, called Eddy. Using this language, developers can detect conflicting privacy requirements within a policy and trace data flows within these policies. We derived our methodology from an exploratory case study of the Facebook platform policy and an extended case study using privacy policies from Zynga and AOL Advertising. In this paper, we report results from multiple analysts in a literal replication study, which includes a refined methodology and set of heuristics that we used to extract privacy requirements from policy texts. In addition to providing the method, we report results from performing automated conflict detection within the Facebook, Zynga, and AOL privacy specifications, and results from a computer simulation that demonstrates the scalability of our formal language toolset to specifications of reasonable size.
Notes
For the purposes of comparison to other approaches, we call our privacy requirements specification language “Eddy” for the circular movement of water that runs counter to the main current, a connotation that appears appropriate when describing data flows in multi-tier systems where the flow of data may run counter to conflicting privacy requirements.
Acknowledgments
We thank Dave Gordon and Darya Kurilova for their earlier feedback and the Requirements Engineering Lab at Carnegie Mellon University. This work was supported by NSF Award #1330596 and ONR Award #N00244-12-1-0014.
Appendix
The context-free grammar for the privacy requirements specification language, called Eddy, is presented here in the Extended Backus-Naur Form (EBNF):
Cite this article
Breaux, T.D., Hibshi, H. & Rao, A. Eddy, a formal language for specifying and analyzing data flow specifications for conflicting privacy requirements. Requirements Eng 19, 281–307 (2014). https://doi.org/10.1007/s00766-013-0190-7
DOI: https://doi.org/10.1007/s00766-013-0190-7