DOI: 10.1145/2030613.2030630
research-article

Anonymization of location data does not work: a large-scale measurement study

Published: 19 September 2011

Abstract

We examine a very large-scale data set of more than 30 billion call records made by 25 million cell phone users across all 50 states of the US and attempt to determine to what extent anonymized location data can reveal private user information. Our approach is to infer, from the call records, the "top N" locations for each user and correlate this information with publicly available side information such as census data. For example, the measured "top 2" locations likely correspond to home and work locations, and the "top 3" to home, work, and shopping/school/commute path locations. We consider the cases where those "top N" locations are measured with different levels of granularity, ranging from a cell sector to whole cell, zip code, city, county, and state. We then compute the anonymity set, namely the number of users who share a given set of "top N" locations, at different granularity levels. We find that the "top 1" location does not typically yield small anonymity sets. However, the "top 2" and "top 3" locations do, certainly at the sector or cell-level granularity. We consider a variety of factors that might affect the size of the anonymity set, for example the distance between the "top N" locations or the geographic environment (rural vs. urban). We also examine to what extent specific side information, in particular the size of the user's social network, decreases the anonymity set and therefore increases risks to privacy. Our study shows that sharing anonymized location data will likely lead to privacy risks and that, at a minimum, the data needs to be coarse in either the time domain (meaning the data is collected over short periods of time, in which case inferring the "top N" locations reliably is difficult) or the space domain (meaning the data granularity is strictly coarser than the cell level). In both cases, the utility of the anonymized location data will be decreased, potentially by a significant amount.
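
The abstract's core measurement (infer each user's "top N" locations, then count how many users share them) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record layout `(user_id, location_id)`, the function names `top_n_locations`, `anonymity_set_sizes`, and `coarsen`, the toy sector and cell labels, and the choice to treat the "top N" as an ordered tuple ranked by call count are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def top_n_locations(records, n):
    """Return {user: tuple of that user's n most frequent locations}.

    `records` is an iterable of (user_id, location_id) pairs (hypothetical layout).
    Ties are broken by location id so the result is deterministic.
    """
    counts = defaultdict(Counter)
    for user, loc in records:
        counts[user][loc] += 1
    top = {}
    for user, loc_counts in counts.items():
        ranked = sorted(loc_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        top[user] = tuple(loc for loc, _ in ranked[:n])
    return top

def anonymity_set_sizes(top):
    """Map each user to the number of users sharing the same top-N tuple."""
    group_sizes = Counter(top.values())
    return {user: group_sizes[key] for user, key in top.items()}

def coarsen(records, mapping):
    """Re-label each location at a coarser granularity (e.g. sector -> cell)."""
    return [(user, mapping[loc]) for user, loc in records]

# Toy data: three users, sector-level locations (all values are made up).
records = [
    ("u1", "A1"), ("u1", "A1"), ("u1", "B1"),
    ("u2", "A1"), ("u2", "A1"), ("u2", "B2"),
    ("u3", "A2"), ("u3", "A2"), ("u3", "B1"),
]
sector_to_cell = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

top1_sector = anonymity_set_sizes(top_n_locations(records, 1))
top2_sector = anonymity_set_sizes(top_n_locations(records, 2))
top2_cell = anonymity_set_sizes(top_n_locations(coarsen(records, sector_to_cell), 2))

print(top1_sector)  # top 1 alone leaves u1 and u2 indistinguishable
print(top2_sector)  # top 2 at sector level singles out every user
print(top2_cell)    # coarsening to cell level merges all three users
```

In this toy data, adding a second location shrinks every anonymity set to one user, while coarsening from sector to cell enlarges it again, which mirrors the privacy and utility trade-off the abstract describes.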

        Published In

        MobiCom '11: Proceedings of the 17th annual international conference on Mobile computing and networking
        September 2011
        362 pages
        ISBN:9781450304924
        DOI:10.1145/2030613

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. cellular data
        2. k-anonymity
        3. location
        4. privacy

        Qualifiers

        • Research-article

        Conference

        Mobicom'11

        Acceptance Rates

        Overall Acceptance Rate 440 of 2,972 submissions, 15%

        Cited By

        • (2024) Efficiency Boosts in Human Mobility Data Privacy Risk Assessment: Advancements within the PRUDEnce Framework. Applied Sciences, 14:17 (8014). DOI: 10.3390/app14178014. Online publication date: 7-Sep-2024
        • (2024) Privkit: A Toolkit of Privacy-Preserving Mechanisms for Heterogeneous Data Types. Proceedings of the Fourteenth ACM Conference on Data and Application Security and Privacy (319-324). DOI: 10.1145/3626232.3653284. Online publication date: 19-Jun-2024
        • (2024) A Privacy-Aware Remapping Mechanism for Location Data. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing (1433-1440). DOI: 10.1145/3605098.3636050. Online publication date: 8-Apr-2024
        • (2024) Anonymization: The imperfect science of using data while preserving privacy. Science Advances, 10:29. DOI: 10.1126/sciadv.adn7053. Online publication date: 19-Jul-2024
        • (2024) A Privacy-Preserving Querying Mechanism with High Utility for Electric Vehicles. IEEE Open Journal of Vehicular Technology, 5 (262-277). DOI: 10.1109/OJVT.2024.3360302. Online publication date: 2024
        • (2024) On Passive Privacy-Preserving Exposure Notification Using Hash Collisions. IEEE Internet of Things Journal, 11:9 (16134-16147). DOI: 10.1109/JIOT.2024.3353255. Online publication date: 1-May-2024
        • (2024) TrajectGuard: A Comprehensive Privacy-Risk Framework for Multiple-Aspects Trajectories. IEEE Access, 12 (136354-136378). DOI: 10.1109/ACCESS.2024.3462088. Online publication date: 2024
        • (2024) The exciting potential and daunting challenge of using GPS human-mobility data for epidemic modeling. Nature Computational Science, 4:6 (398-411). DOI: 10.1038/s43588-024-00637-0. Online publication date: 19-Jun-2024
        • (2024) Privacy-preserving generation and publication of synthetic trajectory microdata: A comprehensive survey. Journal of Network and Computer Applications, 230 (103951). DOI: 10.1016/j.jnca.2024.103951. Online publication date: Oct-2024
        • (2024) Advances in Privacy Preservation Technologies. Privacy Computing (17-42). DOI: 10.1007/978-981-99-4943-4_2. Online publication date: 13-Feb-2024
