DOI: 10.1145/3589334.3645409
research-article
Open access

Understanding GDPR Non-Compliance in Privacy Policies of Alexa Skills in European Marketplaces

Published: 13 May 2024

Abstract

    Amazon Alexa is one of the largest Voice Personal Assistant (VPA) platforms. It allows third-party developers to publish voice apps, called skills, to the Alexa skill store. To serve European users, Amazon has established multiple skill marketplaces in Europe and allows developers to publish skills in their native languages. Skills in European marketplaces must comply with the General Data Protection Regulation (GDPR), which imposes strict obligations on data collection and processing. Skills that collect data should provide a privacy policy that discloses their data practices to users and meets GDPR requirements.
    In this work, we analyze the privacy policies of skills in European marketplaces, focusing on whether the policies and the skills' data collection behaviors comply with GDPR. We collect a large-scale dataset covering skills with privacy policies across all European marketplaces. To classify whether a sentence in a privacy policy provides GDPR information, we build a labeled dataset of privacy policy sentences and use it to train a BERT model. We then analyze the GDPR compliance of European skills. Using a dynamic testing tool based on ChatGPT, we check whether skills' privacy policies comply with GDPR and are consistent with their actual data collection behaviors. Surprisingly, we find that 67% of the privacy policies fail to comply with GDPR and do not provide the necessary GDPR-related information. Of the 1,187 skills with data collection behaviors, 603 (50.8%) do not provide a complete privacy policy and 1,128 (95%) have GDPR non-compliance issues in their privacy policies. Meanwhile, we find that GDPR has a positive influence on European privacy policies.
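
    The two proportions reported for skills with data collection behaviors can be re-derived from the counts given in the abstract. The following is a minimal Python sketch of that arithmetic only; the variable and function names are illustrative assumptions and are not taken from the paper.

```python
# Counts as reported in the abstract; all names here are illustrative.
skills_with_data_collection = 1187
incomplete_policy = 603
gdpr_noncompliant = 1128


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


incomplete_pct = share(incomplete_policy, skills_with_data_collection)
noncompliant_pct = share(gdpr_noncompliant, skills_with_data_collection)

print(f"incomplete privacy policy: {incomplete_pct}%")
print(f"GDPR non-compliance issues: {noncompliant_pct}%")
```

    Rounded to one decimal place, the results agree with the 50.8% and 95% figures stated in the abstract.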

    Published In

    WWW '24: Proceedings of the ACM on Web Conference 2024
    May 2024
    4826 pages
    ISBN: 9798400701719
    DOI: 10.1145/3589334
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. amazon alexa
    2. gdpr
    3. privacy policy

    Qualifiers

    • Research-article

    Conference

    WWW '24
    WWW '24: The ACM Web Conference 2024
    May 13 - 17, 2024
    Singapore, Singapore

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
