Your Apps Give You Away: Distinguishing Mobile Users by Their App Usage Fingerprints
Article No.: 138, Pages 1 - 23
Abstract
Understanding mobile app usage has become instrumental for service providers seeking to optimize their online services. Meanwhile, there is a growing privacy concern that users' app usage may uniquely reveal who they are. In this paper, we seek to understand how likely it is that a user can be uniquely re-identified in the crowd by the apps she uses. We systematically quantify the uniqueness of app usage via large-scale empirical measurements. By collaborating with a major cellular network provider, we obtained a city-scale anonymized dataset on mobile app traffic (1.37 million users, 2000 apps, 9.4 billion network connection records). Through extensive analysis, we show that the set of apps that a user has installed is already highly unique. Among users with more than 10 apps, 88% can be uniquely re-identified by 4 randomly chosen apps. The uniqueness level is even higher if we consider when and where the apps are used. We also observe that user attributes (e.g., gender, social activity, and mobility patterns) all have an impact on the uniqueness of app usage. Our work takes the first step towards understanding the unique app usage patterns of a large user population, paving the way for further research on developing privacy-protection techniques and building personalized online services.
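To make the uniqueness measurement described in the abstract concrete, the following is a minimal sketch of one way such a statistic could be computed. It is not taken from the paper: the function name `uniqueness_rate`, its parameters, and the in-memory dictionary representation are all assumptions made for illustration. For each user with more than `min_apps` apps, it draws `k` of that user's apps at random and checks whether that user is the only one in the dataset who has all `k`.

```python
import random


def uniqueness_rate(user_apps, k, min_apps=10, seed=0):
    """Fraction of eligible users singled out by k of their own apps.

    user_apps: dict mapping a user id to the set of app names that user has
               (hypothetical input format, assumed for this sketch).
    k:         number of apps drawn at random from each user's set.
    min_apps:  only users with MORE than this many apps are evaluated.
    """
    rng = random.Random(seed)

    # Inverted index: app -> set of users who have that app.
    app_users = {}
    for user, apps in user_apps.items():
        for app in apps:
            app_users.setdefault(app, set()).add(user)

    # Keep users who pass the threshold and have at least k apps to draw from.
    eligible = [
        u for u, apps in user_apps.items()
        if len(apps) > min_apps and len(apps) >= k
    ]
    if not eligible:
        return 0.0

    unique = 0
    for user in eligible:
        # Sort first so the draw is reproducible for a given seed.
        sample = rng.sample(sorted(user_apps[user]), k)
        # Users who have every sampled app.
        candidates = set.intersection(*(app_users[a] for a in sample))
        if candidates == {user}:
            unique += 1
    return unique / len(eligible)


if __name__ == "__main__":
    # Toy data only; the threshold is lowered because the sets are tiny.
    toy = {
        "u1": {"mail", "maps", "chat"},
        "u2": {"mail", "maps", "games"},
        "u3": {"mail", "chat", "news"},
    }
    print(uniqueness_rate(toy, k=2, min_apps=2))
```

The inverted index keeps each check to an intersection of `k` user sets rather than a scan over all users, which matters at the scale the abstract mentions. A single random draw per user gives a noisy estimate; averaging over several seeds would be a natural refinement.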