DOI: 10.1145/3548606.3560626
research-article
Open access

Privacy Limitations of Interest-based Advertising on The Web: A Post-mortem Empirical Analysis of Google's FLoC

Published: 07 November 2022

Abstract

In 2020, Google announced it would disable third-party cookies in the Chrome browser to improve user privacy. To keep interest-based advertising viable while mitigating the risk of individualized user tracking, Google proposed FLoC (Federated Learning of Cohorts). The FLoC algorithm assigns users to "cohorts", groups of users with similar browsing behaviors, so that ads can be served to users based on their cohort. In 2022, after testing FLoC in a real-world trial, Google canceled the proposal with little explanation in favor of another way to enable interest-based advertising. This work provides a post-mortem analysis of two critical privacy risks of FLoC by applying an implementation of FLoC to a real-world browsing history dataset collected from over 90,000 U.S. devices over a one-year period.
First, we show how, contrary to its privacy goals, FLoC would have enabled individualized cross-site user tracking by giving trackers a unique user identifier available across sites, much like the third-party cookies FLoC was meant to improve on. The sequence of FLoC cohort IDs observed for a user over time can serve as this identifier, even with third-party cookies disabled. We estimate that more than 50% of the users in our dataset could be uniquely identified by their FLoC IDs after 3 weeks, and more than 95% after 4 weeks. We also show how these risks increase when cohort data are combined with browser fingerprinting, and why our results are conservative underestimates of the risks FLoC would have posed in a real-world deployment.
Second, we examine the risk of FLoC leaking sensitive demographic information about users. Although we find statistically significant differences in browsing behaviors between demographic groups, we do not find that FLoC significantly risks exposing race or income information about users in our dataset. Our contributions provide insights and example analyses for future approaches that seek to protect user privacy while monetizing the web.
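
The idea that a sequence of cohort IDs becomes identifying over time can be illustrated with a minimal sketch. This is not the authors' code or dataset: the function name `unique_fraction`, the user labels, and the cohort IDs below are hypothetical toy values chosen only to show how the share of uniquely identifiable users might be counted as more weeks are observed.

```python
from collections import Counter


def unique_fraction(sequences, weeks):
    """Return the share of users whose first `weeks` cohort IDs
    form a sequence that no other user shares."""
    # Truncate each user's cohort history to the observation window.
    prefixes = {user: tuple(ids[:weeks]) for user, ids in sequences.items()}
    # Count how many users share each truncated sequence.
    counts = Counter(prefixes.values())
    unique_users = sum(1 for p in prefixes.values() if counts[p] == 1)
    return unique_users / len(prefixes)


# Hypothetical toy data: one cohort ID per user per week.
toy_sequences = {
    "u1": [101, 205, 310],
    "u2": [101, 205, 311],
    "u3": [101, 207, 310],
    "u4": [102, 205, 310],
}

for weeks in (1, 2, 3):
    print(weeks, unique_fraction(toy_sequences, weeks))
```

In this toy example the share of unique users grows as the window lengthens, because each additional weekly cohort ID splits groups of users who previously shared the same sequence. A tracker that also holds fingerprinting attributes could, under the same assumption, append them to each tuple and separate users even faster.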


Published In

CCS '22: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security
November 2022
3598 pages
ISBN: 9781450394505
DOI: 10.1145/3548606
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data analysis
  2. privacy
  3. web advertising

Conference

CCS '22
Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
