
A Solution toward Transparent and Practical AI Regulation: Privacy Nutrition Labels for Open-source Generative AI-based Applications

Meixue Si (Renmin University of China), Shidong Pan† (CSIRO’s Data61, Australian National University), Dianshu Liao (Australian National University), Xiaoyu Sun (Australian National University), Zhen Tao (CSIRO’s Data61, Australian National University), Wenchang Shi (Renmin University of China), Zhenchang Xing (CSIRO’s Data61, Australian National University)

Equal contribution. †Corresponding author (shidong.pan@anu.edu.au).

Abstract

The rapid development and widespread adoption of Generative Artificial Intelligence-based (GAI) applications have greatly enriched our daily lives, benefiting people by enhancing creativity, personalizing experiences, improving accessibility, and fostering innovation and efficiency across various domains. However, along with the development of GAI applications, concerns have been raised about transparency in their privacy practices. Traditional privacy policies often fail to effectively communicate essential privacy information due to their complexity and length, and open-source community developers often neglect privacy practices even more. Only 12.2% of examined open-source GAI apps provide a privacy policy. To address this, we propose a regulation-driven GAI Privacy Label and introduce Repo2Label, a novel framework for automatically generating these labels based on code repositories. Our user study indicates common endorsement of the proposed GAI privacy label format. Additionally, Repo2Label achieves a precision of 0.81, recall of 0.88, and F1-score of 0.84 based on the benchmark dataset, significantly outperforming the developer self-declared privacy notices. We also discuss the common regulatory (in)compliance of open-source GAI apps, comparison with other privacy notices, and broader impacts to different stakeholders. Our findings suggest that Repo2Label could serve as a significant tool for bolstering the privacy transparency of GAI apps and make them more practical and responsible.

Keywords: Generative AI Applications, AI Regulation, Privacy Policy, Privacy Labels, Open-source

I Introduction

Figure 1: An overview of Repo2Label and an example GAI privacy label for Stable Diffusion. Given a repository (A), Repo2Label extracts all code files (B) and semi-structured textual documents (e.g., C) from the repository. Answers and references are then generated for each label field in our proposed regulation-driven GAI privacy nutrition labels (D).

Online Generative Artificial Intelligence-based (GAI) applications (hereafter “GAI apps”) have emerged as powerful resources for a wide array of productive tools, ranging from content creation to decision support systems. These GAI apps often leverage powerful pre-trained large language models (LLMs) to generate new content that resembles the input data, enabling them to produce text, images, code, and more [26]. The best-known example is ChatGPT, released in November 2022 [70], which creates new content through chat interactions based on user prompts [66, 36]. ChatGPT attracted over one million users within five days of its release, and within two months its user base had surged to 100 million [63]. Simultaneously, a plethora of open-source GAI apps have emerged, garnering significant attention within the open-source community. Empirical data indicate that the number of GAI repositories on GitHub more than doubled from 2022 to 2023 [48]. These repositories often contain the implementation code and guidance documentation for the GAI apps. For example, AutoGPT, an open-source and freely available project, has rapidly gained widespread attention within the open-source community [25, 71, 80]. Remarkably, in just seven days, the project received 44,000 stars on GitHub [77].

The rise of ChatGPT, AutoGPT, and other GAI apps that let users enter simple prompts has driven explosive growth in public adoption. The latest annual McKinsey Global Survey on the state of AI highlights the rapid expansion of GAI apps [19, 20]: 79% of all respondents report having been exposed to GAI, either for work or in other contexts. These GAI apps play increasingly crucial roles in both daily life and professional domains. However, given the unique characteristics inherent to GAI, the complexity and diversity of their outputs may give rise to trustworthiness concerns [84, 42], privacy concerns [27, 16, 34, 92, 91], and copyright implications [17, 93].

Concurrently, governments worldwide have acknowledged the transformative impact of GAI and are proactively implementing measures to address the associated challenges. For instance, during the AI Seoul Summit (https://www.gov.uk/government/topical-events/ai-seoul-summit-2024) in May 2024, ten countries and the European Union agreed to collaborate on establishing an international network dedicated to accelerating advancements in the science of AI safety [28], and 27 nations committed to work together on severe AI risks [29]. Policymakers have instituted a series of GAI-specific regulations aimed at safeguarding privacy, enhancing transparency, and ensuring the dependability of GAI apps. The recent amendments to Art. 52 of the EU AI Act [59] have introduced specific regulations for GAI, mandating that AI systems generating deepfake content must disclose that the content has been artificially manipulated. Similarly, other countries such as Singapore [9], Canada [69], and China [18, 8] have also drafted and enacted legislation for GAI. Additionally, GAI apps, as a specific subset of general software, should also comply with general regulations such as the General Data Protection Regulation (GDPR) [3] and the California Consumer Privacy Act (CCPA) [2]. Collectively, these regulations seek to mitigate the risks associated with GAI deployment and foster its beneficial integration into society.

To comply with transparency requirements mandated by privacy regulations, developers typically disclose software privacy practices to users through privacy notices. The most common form of these notices is the privacy policy [31, 22, 14]. However, previous research [46, 79, 83] indicates that traditional privacy policies often suffer from excessive jargon, lengthy content, and ambiguous language, which undermine their effectiveness in securing informed consent from users [64]. To mitigate the problem of information overload in privacy policy communication, researchers have introduced a more concise, standardized, and easy-to-understand form of privacy notice: the privacy nutrition label [38]. This privacy label is designed to convey privacy information to users in a streamlined manner, enabling them to quickly and accurately access the details they are most concerned about.

In this study, we propose the regulation-driven GAI Privacy Nutrition Label (hereafter “GAI Privacy Label”) and a novel framework, dubbed Repo2Label (Repository to privacy nutrition Label), to automatically generate GAI privacy labels based on the code repository (as shown in Fig. 1). First, we conducted an empirical study of the status quo of GAI apps and their privacy notices. Only 12.2% (18/148) of examined GAI apps offer a privacy policy, indicating a significant transparency deficiency in providing essential privacy information to end users. Then, we performed a thematic analysis of general privacy regulations (GDPR, CCPA, PIPL) and GAI-specific regulations to establish a regulation-driven GAI privacy label format. Next, we present the design and implementation of the Repo2Label framework, which aims to automatically generate GAI privacy labels based on code repositories. We evaluated various aspects of the proposed GAI privacy label design through a user study involving 48 participants; the results show that the proposed format is widely endorsed by participants. Additionally, on the manually annotated dataset, Repo2Label achieves a precision of 0.81, recall of 0.88, and F1-score of 0.84 under the optimal experimental settings. Overall, the key contributions are:

• To the best of our knowledge, this is the first research to empirically investigate the status quo of privacy notices for open-source GAI apps.

• To the best of our knowledge, this is the first research to propose regulation-driven privacy labels for GAI apps.

• We propose the Repo2Label framework for automatically generating GAI privacy labels based on code repositories. This code-based privacy notice generation method can more authentically reflect the privacy practices of GAI apps, compared to traditional self-declared approaches.

The rest of this paper is organized as follows: In Section II, we analyze the current state of GAI apps and their privacy notices. Section III presents the challenges faced by existing privacy labels and Section IV introduces our regulation-driven GAI privacy nutrition label. Section V details the Repo2Label framework for generating GAI privacy labels based on repositories. Section VI reports a user study about our GAI privacy label design and describes the performance of the Repo2Label framework. Section VII discusses the broader impact of our work on the community, and we conclude in Section VIII.

Ethical approval for this research was secured from our institution’s Institutional Review Board (IRB).

II Status Quo of GAI Applications

This section describes our empirical observations of GAI apps and an in-depth analysis of the current GAI apps on the market. Additionally, the problems existing in the current privacy notices, especially privacy policies, of GAI apps are discussed.

II-A GAI-based Applications

Generative Artificial Intelligence [88] refers to AI systems that utilize existing media to generate new content [67, 81, 82]. These systems leverage large datasets to learn patterns and structures, enabling them to create original outputs that resemble the input data. This capability spans various modalities, including text, images, audio, and video, making GAI a versatile tool in numerous applications [58]. A variety of apps with different functionality are appearing on the market. For instance, GAI can be used to produce realistic images from textual descriptions, generate emails, write news, and even simulate voices. The potential of GAI lies in its ability to extend human creativity and productivity by automating the creation of high-quality, innovative content.

Recent advancements in GAI, especially in the domain of LLMs, have substantially improved the ability of these systems to understand and process textual information. The leaps in this field have empowered GAI apps to interpret user inputs with greater precision and generate contextually relevant responses. This progress has markedly reduced the accessibility barriers to such GAI apps, enabling users, even those with limited expertise in prompts or instructions, to effectively utilize text-guided generation capabilities [10].

II-B An Overview of GAI Application Market

TABLE I: Statistics of GAI apps on gpt3demo.com and gpt4demo.com.

	gpt3demo	gpt4demo	Total
Available Code Repository	138	10	148
Available Privacy Policy	16	2	18
Available Privacy Labels	1	1	2
GAI Tools	887	87	974

(a) Overall.

No. Privacy Policies	18
No. Words	53,540
No. Sentences	2,222
Avg. Words per Privacy Policy	2,974
Avg. Sentences per Privacy Policy	123

(b) Privacy policies.

No. Repositories	148
No. Stars	1,693.7k
No. Forks	288.3k
Avg. Stars	11.4k
Avg. Forks	1.9k

(c) Code repositories.

Third-party GAI app collection websites are independent online marketplaces established by external developers or organizations. gpt3demo (https://gpt3demo.com) and gpt4demo (https://gpt4demo.com) are two popular third-party websites that, as pioneers of their kind, have actively collected and curated GAI apps and demonstrations since mid-2022. We use these two collections to harvest mainstream GAI apps on the market. Using customized web scrapers, we retrieved information including the name, category tags, code repository link, and privacy notices (e.g., privacy policies). The gpt3demo site lists 887 GAI apps in 228 fine-grained categories, and the gpt4demo site lists 87 GAI apps across 47 distinct categories. Most of these GAI apps position themselves as domain experts, incorporating professional domain knowledge to act as advisors for users in specific fields, including but not limited to Robot Lawyers and Coding Assistants.

We manually checked and selected open-source GAI apps that explicitly provided a GitHub repository address. After excluding three inaccessible repositories, we collected 148 open-source GAI apps with a valid GitHub repository link, which is 15.2% of all curated GAI apps. As shown in Table I.c, the average numbers of stars and forks are 11.4k and 1.9k, respectively, reflecting the significant popularity of these open-source GAI apps. For comparison, according to Repositories Ranking [7], the median number of stars of the top 10,000 GitHub repositories is about 5.8k. GAI apps provide expertise and consultation to support a wide range of application scenarios. However, the complexity and capability of these apps lead to potentially significant privacy risks, which remain to be adequately addressed [93].
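The share and the averages quoted in this paragraph follow directly from the counts in Table I. The short Python check below recomputes them; the variable names are illustrative and only figures reported in the table are used.

```python
# Counts as reported in Table I (star and fork totals are in thousands).
total_gai_apps = 974      # 887 on gpt3demo + 87 on gpt4demo
open_source_apps = 148    # apps with a valid GitHub repository link
total_stars_k = 1693.7
total_forks_k = 288.3

# Share of curated apps that are open source, in percent.
open_source_share = round(100 * open_source_apps / total_gai_apps, 1)
# Average stars and forks per repository, in thousands.
avg_stars_k = round(total_stars_k / open_source_apps, 1)
avg_forks_k = round(total_forks_k / open_source_apps, 1)

print(open_source_share, avg_stars_k, avg_forks_k)
```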

Notably, as this is a highly dynamic market, our data collection is a snapshot from January 2024. This subset is sufficient to demonstrate the characteristics of GAI apps discussed in this section. The official OpenAI GPT Store (https://chatgpt.com/gpts) is another GAI app repository that attracts increasing attention, but all of its apps are based solely on GPT foundation models and most of them are not open-source.

II-C Current Privacy Notices of Open-source GAI Applications

The privacy notice is an essential component of software, including the emerging GAI apps and tools. The privacy policy is the most common type of privacy notice and is widely required by privacy regulations and industry standards [76, 75]. In the mobile app ecosystem, two of the largest app stores, Google Play and the Apple App Store, both mandate that app developers provide a privacy policy before publishing an app. Fig. 2.a shows that the privacy policies of mobile apps in the Google Play app store are placed in a visually obvious and easily accessible position (in the sidebar) on the homepage. However, the GAI app market is an emerging sector, and there are no such industry standards that developers are required to follow. Therefore, we conduct an empirical study to investigate the current state and practices of privacy notices for open-source GAI apps, as follows.

For the 148 GAI apps, we manually inspected their GitHub repositories, official websites (if available), and tool interfaces (if available) to check whether privacy notices, especially privacy policies, are provided. Specifically, we visited all the aforementioned websites and searched for relevant information in their HTML sources. The keywords used for the search included “privacy”, “privacy policy”, “notices”, “terms”, “terms of services”, etc. Although the terms of services (ToS) are not privacy notices, the two are commonly placed side by side in practice, so these keywords can help locate privacy notices. Additionally, we used the Google search engine to search more broadly for the privacy notices of GAI apps. The search terms are “[GAI application name] + [keyword] (e.g., privacy policy)”, and for each search keyword, we evaluated all entries on the first result page. For example, the privacy policies of AgentGPT (official website: https://agentgpt.reworkd.ai; privacy policy: https://agentgpt.reworkd.ai/privacypolicy.html) and PromptLayer (official website: https://promptlayer.com/; privacy policy: https://promptlayer.com/privacy_policy.pdf) were discovered through this additional searching, as they are not directly included in the repositories or main interfaces. Fig. 2.b shows that the privacy policy link of AutoGPT, located in the footer of the advertisement website, is not easily accessible to users. In total, only 12.2% (18/148) of examined GAI apps offer a privacy policy, as listed in Table I.a, indicating a significant transparency deficiency in providing essential privacy information to end users. Our findings also highlight that existing privacy policies can be difficult for users to find without extra effort.
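The inspection described above was carried out manually. As an illustration only, the keyword step on a page's HTML source could be automated roughly as follows. This is a minimal sketch using Python's standard `html.parser`; the function name `find_notice_links`, the helper class, and the sample HTML are hypothetical and not part of the study.

```python
from html.parser import HTMLParser

# Keywords taken from the search procedure described in the text.
KEYWORDS = ["privacy policy", "privacy", "notices", "terms of services", "terms"]


class _LinkCollector(HTMLParser):
    """Collects (href, anchor text) pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_notice_links(html):
    """Return links whose URL or anchor text mentions any search keyword."""
    parser = _LinkCollector()
    parser.feed(html)
    hits = []
    for href, text in parser.links:
        haystack = f"{href} {text}".lower()
        if any(keyword in haystack for keyword in KEYWORDS):
            hits.append((href, text))
    return hits


# Hypothetical footer markup, for illustration.
sample = (
    '<footer><a href="/privacypolicy.html">Privacy Policy</a> '
    '<a href="/blog">Blog</a></footer>'
)
print(find_notice_links(sample))  # [('/privacypolicy.html', 'Privacy Policy')]
```

A match only flags a candidate link; whether the linked page is actually a privacy notice still needs a human check, as in the study.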

Among the small portion of GAI apps that do provide a privacy policy, the readability of these policies is as unsatisfactory as reported in previous studies [76, 51]. The policies are often lengthy, filled with jargon, and contain frequent hyperlinks and cross-references. Table I.b shows the statistics of the GAI app privacy policies we collected. The average length of the examined privacy policies is about 3,000 words, and the average reading time is about 12 minutes, according to [13]. We then calculated readability using the Flesch Reading-Ease Test for privacy policies [43, 76]; the average readability score is 39.9. This score indicates that fully comprehending the privacy policy of a GAI app requires at least a college-level education [43], which contradicts the mission of GAI apps to mitigate the expertise gap.
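For reference, the Flesch Reading-Ease score combines average sentence length and average syllables per word: 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words), where lower scores indicate harder text. The sketch below implements this standard formula from raw counts. The function name is illustrative, and syllable counts must come from a separate counter, since the source reports only word and sentence totals.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading-Ease score computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return 206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word


# Illustrative counts (not from the paper): 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 3))
```

With the totals in Table I.b (53,540 words in 2,222 sentences), the average sentence length is roughly 24 words, so the sentence-length term alone lowers the score by about 24 points.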

Surprisingly, through additional searching, we found that only two (1.4%, 2/148) GAI apps provide privacy labels as their privacy notices instead of privacy policies. Privacy labels are short-form, clear, table-like disclosures that enable users to quickly understand how their data is collected and utilized [39, 41, 24]. Privacy labels have proven useful over the past decade, and the concept has now been widely adopted in industry, for example by Apple and Google. By comparison, 60% of apps on the Apple App Store and 44% of apps on the Google Play Store had completed the necessary forms to generate privacy labels as of August 26, 2022 [50]. The two existing privacy labels for GAI apps are the compulsory privacy disclosures (similar to the Data Safety section in the Google Play app store) that the Chrome Web Store (https://chromewebstore.google.com/category/extensions) requires for Chrome extensions, as both GAI apps are implemented and launched in the Chrome store. This privacy label format is tailored for browser extensions and neglects the unique challenges and regulatory requirements of GAI apps. We detail the current challenges of privacy labels in the next section and propose a GAI-specific privacy label format in Section IV.

TABLE II: Thematic analysis results for requirements in general privacy regulations and GAI-specific regulations. Meas-GAI denotes Administrative Measures for Generative Artificial Intelligence Services [8]. Req-GAI denotes Basic Security Requirements for Generative Artificial Intelligence Service [18]. Prin-GAI denotes Principles for Responsible, Trustworthy and Privacy-Protective Generative AI Technologies [69]. MAIF-GAI denotes Model AI Governance Framework for Generative AI [9]. In the subsequent design of GAI privacy label, we further elaborate Tool Type into Tool Modality and Tool Functionality. Notably, we only include the intersection of these requirements in this table.

Regulation	Region	Publish Date	Operationalized Requirements [Article Reference]
General Privacy Regulations
GDPR [3]	EU	Apr’16	Right to Lodge Complaints [Art.13.2.(d) & 14.2.(e)]
			Data Encryption [Art.32.1.(a)]
			Right to be Forgotten [Art.13.2.(c) & 14.2.(d)]
			Data Retention [Art.13.2.(a) & Art.14.2.(a)]
			Right to Access [Art.13.2.(b) & 14.2.(c)]
			Right to Lodge Complaints [Art.13.2.(e)]
			Controller Contact [Art.13.1.(a)]
			Protection of Minors [Art.32.1]
CCPA [2]	California	Jun’18	Right to be Forgotten [§1798.120]
			Right to Access [§1798.110]
			Data Retention [§1798.100.a.(3)]
PIPL [6]	China	Aug’21	Right to be Forgotten [Art.15]
			Data Encryption [Art.51.(3)]
			Controller Contact [Art.17.(1) & Art.52]
			Right to Access [Art.45]
			Data Retention [Art.17.(2) & Art.19]
			Risk Notification [Art.51]
			Right to Lodge Complaints [Art.50]
			Protection of Minors [Art.31]
GAI-specific Regulations
Meas-GAI [8]	China	May’23	Base Model [Art.7]
			Target Users [Art.10]
			Protection of Minors [Art.10]
			Tool Type [Art.10]
			AI-generated Watermarking [Art.12]
			Data Retention [Art.11]
			Right to Lodge Complaints [Art.15 & Art.18]
			Risk Notification [Art.14]
Req-GAI [18]	China	Oct’23	Prompt Guardrail [Art.6.(b).1 & Art.7.(f).1 & Art.6.(b).2]
			AI-generated Watermarking [Art.7.(d)]
			Right to Lodge Complaints [Art.5.2.(b).3 & Art.7.(e)]
			Tool Type [Art.6.(c).1]
			Risk Notification [Art.5.2.(b).4 & Art.6.(b).2]
			Target Users [Art.6.(c).1]
			Base Model [Art.6.(a) & Art.6.(c).1 & Art.6.(c).2]
			Right to be Forgotten [Art.7.(c)]
			Protection of Minors [Art.7.(a).3 & Art.7.(a).4]
Prin-GAI [69]	Canada	Dec’23	Data Encryption [Art.3]
			Tool Type [Art.4]
			AI-generated Watermarking [Art.4]
			Right to Access [Art.6]
			Risk Notification [Art.4]
			Data Retention [Art.7]
			Working Details [Art.5]
			Prompt Guardrail [Art.8]
MAIF-GAI [9]	Singapore	Jan’24	Prompt Guardrail [Art.3.(d) & Art.3]
			Base Model [Art.3.(b)]
			Risk Notification [Art.3.(e) & Art.6.(a)]
			Tool Type [Art.3.(f)]
			AI-generated Watermarking [Art.7]
			Working Details [Art.3.(d) & Art.3.(g)]
AI Act [59]	EU	TBD	AI-generated Watermarking [Art. 52]

III Motivation

In this section, we examine the challenges encountered in creating privacy labels. Privacy labels are designed to accurately convey software privacy practices to users in a timely manner. In software ecosystems, extensive privacy labels ostensibly enhance transparency and user comprehension of data handling practices. However, researchers and consumer advocates have articulated various challenges regarding existing privacy labels and their generation [90, 51, 94, 45, 52, 44].

Challenge-1. The concept of privacy labels has transitioned from theoretical frameworks to practical applications. Apple [15], Google [78], and Amazon [11] have successively mandated that developers provide privacy labels for apps hosted within their respective app marketplaces. However, despite the widespread adoption of privacy labels at the corporate level, a lack of uniformity and established standards persists throughout their implementation. Previous work has discussed the differences in privacy labels between Apple and Google. Lin et al. [55] conducted a detailed comparison of the two companies’ privacy label designs and identified significant differences in the structure and categorization of data within the labels. They also found that the two companies use different terminology to describe the same concepts. Cranor et al. [21] discussed the key ingredients missing from mobile app privacy labels. The lack of standardization and consistency in privacy label design therefore remains a significant obstacle. This inconsistency not only hampers users’ ability to make informed decisions regarding their data privacy but also complicates regulatory compliance for developers.

Challenge-2. A central concern is the accuracy of the labels, that is, whether they faithfully represent actual privacy practices. This issue arises primarily because privacy label generation relies mainly on questionnaire-based methodologies [51, 74, 50]. In this process, app developers answer a set of questions about the privacy aspects of their apps, and their responses are used to generate the privacy labels. Self-declaration can produce privacy labels, but their quality may vary [51, 74, 50].

Developing privacy labels is a complex process that demands knowledge of both app features and the corresponding legal requirements. For developers, meticulously completing the questionnaires required to generate a privacy label presents a considerable challenge. First, developers often hold misconceptions about the terminology of privacy labels [51]. In addition, they may not completely understand the behavior of the software, especially the parts contributed by collaborators in the development team. Therefore, a code-based privacy label generation approach is needed to accurately reflect the actual behavior of GAI apps.

IV Proposed GAI Privacy Labels

To respond to Challenge-1 discussed in Section III, we conduct a regulation-driven privacy label design process and propose a GAI privacy label format. We then evaluate the proposed design through a human evaluation in Section VI-A.

In the design and implementation of privacy labels, it is crucial to identify the key components that should be included. The concept of privacy labels was first inspired by the nutritional labels on food packaging and introduced by Kelley et al. [39] in 2009. After that, various types of privacy labels have been proposed for different scenarios, including websites [40], IoT devices [24], and mobile apps [74, 78, 15]. Although the format and information in those privacy labels are different, they are commonly constituted by three major sections: a) data controller information; b) data practices and purposes; and c) risk disclosures.

To tackle the aforementioned issues, we aim for a standardized GAI privacy label that discloses the privacy practices of GAI apps both transparently and compliantly. To this end, we conducted an empirical study of existing regulations to examine and draft a GAI privacy label format. GDPR [3] and CCPA [2] pioneered general data and privacy protection; China subsequently launched the PIPL [6] to fill the corresponding gap. With the development of Generative AI models, GAI-based apps present unique challenges not seen in previous scenarios. First, it is difficult to finely define data types in GAI app interactions, as most GAI apps take diverse and sophisticated prompts from users as inputs. Unlike traditional scenarios where data types are relatively straightforward, interactions with GAI apps can involve nuanced and context-dependent data. Second, GAI apps often process large volumes of personal and sensitive data in different modalities, and their advanced capabilities can infer additional information about users. This introduces heightened privacy risks, as the potential for data misuse or unintended consequences increases. Third, data rights, such as the Right to Access, Rectify, or be Forgotten, become more complex in the context of GAI and have attracted increasing attention from users.

To respond to these unique challenges, governments have placed GAI-specific legislation on the agenda. By October 2023, 31 countries had passed AI legislation, and 13 more were debating AI laws [1]. These GAI-specific regulations impose comprehensive requirements on GAI apps, focusing on risk assessment and disclosure to ensure compliance with legal standards. In total, we consider three general privacy regulations (GDPR, CCPA, PIPL) and five GAI-specific regulations in four countries/regions (China, Canada, Singapore, and the EU). For each regulation, we carefully scrutinized and extracted the requirements that apply to GAI apps. For instance, the Singapore Model AI Governance Framework for Generative AI [9], article 3, stipulates that “A crucial step for safety is also to consider the context of the use case and conduct a risk assessment. For example, further fine-tuning or using user interaction techniques (such as input and output filters) can help to reduce harmful output…”. After a thematic analysis of this requirement, we encapsulated it as Prompt Guardrail, which is then incorporated as a GAI privacy label field. Two authors independently conducted the thematic analysis and created the initial codebook. Both authors have at least two years of experience in privacy regulation, AI governance, and Responsible AI. For any disagreement in the codebook, they discussed until they agreed on the same answer; if the disagreement persisted, a third author joined the discussion to facilitate a resolution. Finally, we determined the regulation-driven GAI privacy label fields as shown in Table II.

TABLE III: Tallies of GAI privacy label fields according to regulations. GDPR, CCPA, and PIPL are general regulations; the remaining five are GAI-specific regulations.

Label Field	GDPR (EU)	CCPA (California)	PIPL (China)	Meas-GAI [8] (China)	Req-GAI [18] (China)	Prin-GAI [69] (Canada)	MAIF-GAI [9] (Singapore)	AI Act [59] (EU)	Tallies
Base Model	✗	✗	✗	✓	✓	✗	✓	✗	3/9
Tool Type	✗	✗	✗	✓	✓	✓	✓	✗	4/9
Working Details	✗	✗	✗	✗	✗	✓	✓	✗	2/9
Controller Contact	✓	✗	✓	✗	✗	✗	✗	✗	2/9
Target Users	✗	✗	✗	✓	✓	✗	✗	✗	2/9
Data Retention	✓	✓	✓	✓	✗	✓	✗	✗	5/9
Right to Access	✓	✓	✓	✗	✗	✓	✗	✗	4/9
Right to be Forgotten	✓	✓	✓	✗	✓	✗	✗	✗	4/9
Right to Lodge Complaints	✓	✗	✓	✓	✓	✗	✗	✗	4/9
AI-generated Watermarking	✗	✗	✗	✓	✓	✓	✓	✓	5/9
Prompt Guardrail	✗	✗	✗	✗	✓	✓	✓	✗	3/9
Risk Notification	✗	✗	✓	✓	✓	✓	✓	✗	5/9
Data Encryption	✗	✗	✓	✗	✗	✓	✗	✗	2/9
Protection of Minors	✗	✗	✓	✓	✓	✗	✗	✗	3/9

Table III shows which regulations cover each label field. In designing the GAI privacy label format, we carefully considered the limits of human cognition, particularly the concept of information chunking proposed by Miller [61, 62]. Fig. 1.D illustrates an example of our proposed GAI privacy label. To optimize user comprehension and retention, we categorized the privacy label items into four primary sections:

1. Basic Info: This section aims to provide a fundamental description of the GAI app, including its base model, supported modalities, primary functions, working details, developer information, and target users.

2. Data Rights: This section details the essential data rights users possess when utilizing the tool, such as the right to access and the right to be forgotten.

3. Risk Related: This section outlines any potential risks associated with the use of the tool, including whether there are risk notifications and if there are identifiers for content generated by the tool.

4. Additional Info: This section discloses information on data encryption and special protections for minors.

The explanations for each label item are shown in Table IV. Additionally, the last three groups of privacy label fields have binary content. “Yes” denotes that this GAI app does implement this requirement, and “No” denotes otherwise. For each label item, there will be a floating bubble explaining the source of each answer. This structured approach ensures that users are comprehensively informed about the privacy aspects of the tool, enhancing transparency and trust.

TABLE IV: Explanations about GAI privacy labels. *The GitHub account does not count as a publicly available contact.

	Label Field	Explanation	Example Answers
Basic Info	Base Model	The names of foundation models that are embedded in this tool. (e.g., GPT-4, GPT-3.5, Ernie, etc)	GPT-3.5/GPT-4/…
	Tool Modality	The modalities of the information the tool receives and returns, respectively. (e.g., text-to-text, image-to-text)	Text to Image
	Tool Functionality	The major capabilities and services provided to users to meet their needs and solve specific problems.	Image Generation
	Working Details	Comprehensive details provided to users about this tool. (e.g., documents about how the system works, data processing process)	A link to the GAI app documentation
	Controller Contact	The publicly available contact of the GAI app developers*. (e.g., an email address)	abc@company.com
	Target Users	The intended audience or primary user base for this service.	Researchers
	Label Field	Explanation	Compliance Status
Data Rights	Data Retention	The practice of storing data for a specific period of time.	Yes/No
	Right to Access	The right of users to request to access their collected personal information.	Yes/No
	Right to be Forgotten	The right of users to request the erasure or deletion of their personal information.	Yes/No
	Right to Lodge Complaints	The right of users to lodge a complaint with a supervisory authority.	Yes/No
Risk Related	AI-generated Watermarking	A machine-readable and detectable mark embedded in content generated or modified by GAI systems.	Yes/No
	Prompt Guardrail	Comprehensive security protocols implemented to scrutinize both user inputs and system outputs for potential malicious activities. (e.g., employing stringent input/output filtering mechanisms)	Yes/No
	Risk Notification	A notification that informs users of the relevant risks they may face when using GAI tools. (e.g. copyright disputes)	Yes/No
Additional Info	Data Encryption	Data are encrypted and transferred over a secure connection.	Yes/No
	Protection of Minors	Special treatment made for the protection and convenience of children.	Yes/No

V Repo2Label framework

To respond to Challenge-2 discussed in Section III, we propose Repo2Label, an automated framework that generates GAI privacy labels from a code repository, so that the labels authentically reflect the privacy practices of GAI apps.

V-A Overview

Given a GAI app, our ultimate objective is to analyze its corresponding GitHub repository and determine the value of each GAI privacy label field described in Section IV. Fig. 3 gives an overview of the Repo2Label framework, which comprises four stages. First, we retrieve the code repository of the given GAI app for subsequent analysis (Section V-B). Then, we design multiple AI units to extract information about GAI privacy labels from the code repository and generate explanations for each field (Section V-C). Next, we conduct label verification to check the accuracy of the provided references and apply a reflection process to correct any hallucinated labels (Section V-D). Finally, we combine the GAI privacy labels generated for individual files into a comprehensive repository-level GAI privacy label, with references that explain the value of each field (Section V-E).

V-B Resource Extraction

The only input of the Repo2Label framework is the link to the GitHub repository of the given GAI app. Based on the link, we can access all files contained in the repository through the GitHub API [4]. Normally, these code repositories contain a semi-structured textual document, the README, that describes the implementation, along with various source code files. We exclude irrelevant art files (e.g., images), embedding vectors (e.g., the weights of models), and datasets, since they do not provide any informative hint about the behaviors of GAI apps. We filter out those files based on the file type.
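The file-type filter described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the suffix list `EXCLUDED_SUFFIXES` and the helper names are assumptions, since the paper does not enumerate the excluded file types.

```python
from pathlib import PurePosixPath

# Hypothetical exclusion list: the paper names the categories (art files,
# embedding vectors / model weights, datasets) but not concrete extensions.
EXCLUDED_SUFFIXES = {
    ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico",          # art files
    ".bin", ".pt", ".pth", ".ckpt", ".safetensors", ".npy",   # weights, embeddings
    ".csv", ".tsv", ".parquet", ".jsonl",                     # datasets
}


def keep_for_analysis(path: str) -> bool:
    """Return True if a repository file should be passed to label extraction."""
    return PurePosixPath(path).suffix.lower() not in EXCLUDED_SUFFIXES


def filter_repo_files(paths):
    """Keep only the files whose type may reveal the behavior of the GAI app."""
    return [p for p in paths if keep_for_analysis(p)]


if __name__ == "__main__":
    listing = ["README.md", "babyagi.py", "docs/logo.PNG", "data/train.csv"]
    print(filter_repo_files(listing))
```

Matching on the lower-cased suffix keeps the filter cheap and independent of file content; a content-based check (for example, detecting binary files) would be a possible refinement.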

V-C Privacy Label Extraction

This is the core stage of the whole framework. We aim to analyze the content in the repository to obtain the behaviors that are related to the GAI privacy label. This task requires understanding the functionality of individual functions within code files and the semantics of natural language descriptions. Recent studies have shown that large language models (LLMs) are exceptionally capable of software code understanding tasks [33, 54, 30, 73]. Considering our need to analyze the semantics of both code and natural language for extracting privacy nutrition label information, we designed several AI units by employing LLMs as our foundational model.

Privacy Nutrition Label Extraction AI Unit. For each AI unit, there are two inputs: a file from the repository and the definitions of the GAI privacy label fields. Specifically, all content in the file is treated as a character string, and the definitions are the field explanations in Table IV. We then design the prompt templates to exploit the LLMs’ capabilities, enabling the effective extraction of GAI privacy label information from code and textual files. The prompt template comprises three main components: task description, input, and output. The task description uses a structured prompt to clearly outline the process. This prompt has three key components. 1) @persona: This component instructs the model to assume the role of an expert data analyzer, ensuring clarity and focus on the task’s specific requirements. 2) @terminology: Here, essential terms are defined in a keyword format to clarify the context. For instance, a privacy label field such as Base Model is paired with its corresponding definition, “The names of foundation models that are embedded in this tool. (e.g., GPT-4, GPT-3.5, Ernie, etc)”. 3) @instruction: This section furnishes explicit instructions for executing the task:

•

@command: Offers an overarching strategy for the model to follow, ensuring that the extraction process adheres to defined objectives.
•

@rule: Outlines the fundamental constraints and considerations that must be observed during the GAI privacy label extraction process.
•

@Input_format: Specifies the format of the File Content to be inputted.
•

@Output_format: Details the expected presentation format for information extracted from the file content.

In the @command, we guide the model through the extraction process step by step using a chain-of-thought approach, as detailed in the works of Wei et al. [87], Wang et al. [86], and Lyu et al. [57]. This structured methodology helps unfold the reasoning required for each step of the task, enhancing the model’s ability to generate coherent and contextually accurate outputs. Additionally, in the @rule section, we establish specific guidelines to govern the extraction of privacy labels by the model, ensuring clarity and accuracy in its tasks:

•

@rule1: Limit the model’s task exclusively to information extraction.
•

@rule2: Require the model to provide a reference for each extraction.
•

@rule3: Ensure that all references provided by the model originate exclusively from the file content. The validity of these references will be evaluated in Section V-D to determine if any model-generated hallucinations occur during the extraction process and to enable necessary corrective reflections.
•

@rule4: Reiterate that all references come solely from the file content. Emphasizing this rule multiple times underlines its importance and reinforces the model’s adherence to rigorous standards.
•

@rule5: Reiterate that the model generates results according to the specified output format.

In addition to the privacy label content, the second and third rules make the model return the references from the file content that support each extracted label. These references are used in the next stage to verify the labels and to self-correct potential hallucinations. Based on the aforementioned prompt template, we develop four specialized AI units, each designed to target a specific section of the privacy labels:

•

Basic Info Extractor: This unit covers six key labels: Base Model, Tool Modality, Tool Functionality, Working Details, Controller Contact, and Target Users. It systematically extracts these labels from the input content, each supported by detailed explanations.
•

Data Rights Info Extractor: This unit covers four critical labels: Data Retention, Right to Access, Right to be Forgotten, and Right to Lodge Complaints. It extracts information pertinent to data rights from the input content, ensuring each label is clearly explained.
•

Risk Related Info Extractor: This unit focuses on extracting three risk-related labels: AI-generated Watermarking, Prompt Guardrail, and Risk Notification. Each label is detailed within the input to provide a clear contextual understanding.
•

Additional Info Extractor: Targeting additional factors, this unit processes two labels: Data Encryption and Protection of Minors.
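To make the template structure concrete, the following sketch assembles a prompt from the components named above (@persona, @terminology, @command, @rule1 to @rule5, @Input_format, @Output_format). The section tags and the two field definitions come from the text and Table IV; the wording of the persona, command, and rules, as well as the names `ExtractorUnit` and `build_prompt`, are illustrative assumptions rather than the authors' actual prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractorUnit:
    name: str
    fields: dict  # label field -> definition taken from Table IV


# Paraphrases of @rule1-@rule5; the exact wording is an assumption.
RULES = (
    "Only extract information; do not perform any other task.",
    "Provide a reference for each extracted label.",
    "Every reference must be copied from the file content.",
    "Again: references must come solely from the file content.",
    "Again: answer strictly in the specified output format.",
)

ADDITIONAL_INFO = ExtractorUnit(
    name="Additional Info Extractor",
    fields={
        "Data Encryption": "Data are encrypted and transferred over a secure connection.",
        "Protection of Minors": "Special treatment made for the protection and convenience of children.",
    },
)


def build_prompt(unit: ExtractorUnit, file_content: str) -> str:
    """Assemble one structured prompt for a single repository file."""
    terminology = "\n".join(f"- {f}: {d}" for f, d in unit.fields.items())
    rules = "\n".join(f"@rule{i}: {text}" for i, text in enumerate(RULES, start=1))
    output_format = "\n".join(
        f"{f}: <value or N/A> | reference: <verbatim excerpt>" for f in unit.fields
    )
    return "\n\n".join([
        "@persona: You are an expert data analyzer.",
        "@terminology:\n" + terminology,
        "@command: Think step by step: read the file, decide the value of each "
        "field, then quote the text that supports it.",
        rules,
        "@Input_format: File Content: <content of one repository file as a string>",
        "@Output_format:\n" + output_format,
        "File Content:\n" + file_content,
    ])


if __name__ == "__main__":
    print(build_prompt(ADDITIONAL_INFO, "key = Fernet.generate_key()"))
```

Keeping one `ExtractorUnit` per label group mirrors the four units listed above, so the same `build_prompt` function can serve the Basic Info, Data Rights, Risk Related, and Additional Info extractors.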

V-D Label Verification

LLMs are widely criticized for their tendency to produce hallucinations [37, 35, 95]. This phenomenon occurs because the inherently probabilistic nature of LLMs can result in outputs that appear convincingly accurate but are actually incorrect, making it difficult for humans to discern the inaccuracies. Privacy labels, which serve as crucial privacy notices for GAI app users, must therefore be as accurate as possible. Studies also show that LLMs can self-correct hallucinations when given proper strategies [72]. Therefore, we design the Label Verification stage to double-check potentially incorrect extracted privacy labels. Specifically, instead of directly checking the labels, we check the correctness of the references that the AI unit provided during the label-generation process. We use string matching to confirm whether each reference indeed exists within the file content. If the reference is in the file content, we consider the corresponding label generated by the model to be likely accurate and thus retain it. Conversely, if the reference is incorrect (i.e., it does not exist in the original file), we ask the AI unit to regenerate the label using a reflection prompt with the specific instruction: “You previously extracted a label with an incorrect reference that does not exist in the file content. Please ensure that the reference provided this time is present in the file content.” If the model provides incorrect references more than three times for the same label, we categorize the label as N/A, indicating its non-applicability or unreliability.
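The verification loop can be sketched as below. The reflection instruction is quoted from the text; everything else is an assumption made for illustration: the `extract` callable stands in for a call to the AI unit, whitespace normalization is one possible string matching choice, and "more than three times" is read as giving up after the fourth incorrect reference.

```python
from typing import Callable, Optional, Tuple

REFLECTION_PROMPT = (
    "You previously extracted a label with an incorrect reference that does "
    "not exist in the file content. Please ensure that the reference provided "
    "this time is present in the file content."
)
MAX_INCORRECT = 3  # the label becomes N/A once this count is exceeded


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return " ".join(text.split())


def reference_exists(reference: str, file_content: str) -> bool:
    """String matching: the reference must appear verbatim in the file."""
    ref = _normalize(reference)
    return bool(ref) and ref in _normalize(file_content)


def verify_label(
    file_content: str,
    extract: Callable[[Optional[str]], Tuple[str, str]],
) -> str:
    """Return a verified label, or "N/A" if the references stay incorrect.

    `extract(hint)` is a hypothetical wrapper around the AI unit that returns
    (label, reference); `hint` is None on the first call and the reflection
    prompt on every retry.
    """
    hint = None
    for _ in range(MAX_INCORRECT + 1):
        label, reference = extract(hint)
        if reference_exists(reference, file_content):
            return label
        hint = REFLECTION_PROMPT
    return "N/A"
```

Checking the reference rather than the label keeps the verification deterministic and cheap: it needs no second model call unless the reference fails to match.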

V-E Labels Merging

After the previous stages, we obtain the extracted GAI privacy labels and references for each file. We then aggregate them into a complete GAI privacy label for the whole repository. For each label field, we collect from the set of n files all labels that are not N/A, and record the source file path and the corresponding reference for each valid label. The recording format is <file_path, nutrition_label, reference>. For example,

•

codeRepos/babyagi/babyagi.py
•

gpt-3.5-turbo
•

LLM_MODEL = os.getenv('LLM_MODEL', os.getenv('OPENAI_API_MODEL', 'gpt-3.5-turbo')).lower()

To generate a repository-level nutrition label, we merge the labels from all files, taking their union. For instance, if the base model label across different files includes versions like gpt-4, gpt-3.5-turbo, and text-embedding-ada-002, the final label for the repository would be the combination of these three. Additionally, the final repository-level nutrition label features an expansion window that allows users to view all the references for each label. This feature provides a transparent view of the evidence supporting each label, enhancing the credibility and usability of our extracted information.
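The union-based merge can be illustrated with a short sketch. The record layout extends the triple above with the label field name so that several fields can be merged in one pass; that extension and the function name `merge_labels` are assumptions for illustration.

```python
from collections import defaultdict


def merge_labels(records):
    """Merge file-level labels into a repository-level label.

    `records` is an iterable of (file_path, field, value, reference) tuples.
    Labels marked "N/A" are skipped, values are combined by union, and every
    reference is kept so it can be shown as supporting evidence.
    """
    merged = defaultdict(lambda: {"values": [], "references": []})
    for file_path, field, value, reference in records:
        if value == "N/A":
            continue
        entry = merged[field]
        if value not in entry["values"]:  # union while preserving first-seen order
            entry["values"].append(value)
        entry["references"].append((file_path, reference))
    return dict(merged)


if __name__ == "__main__":
    records = [
        ("a.py", "Base Model", "gpt-4", "MODEL = 'gpt-4'"),
        ("b.py", "Base Model", "gpt-3.5-turbo", "'gpt-3.5-turbo'"),
        ("c.py", "Base Model", "text-embedding-ada-002", "'text-embedding-ada-002'"),
        ("d.py", "Base Model", "N/A", ""),
    ]
    print(merge_labels(records)["Base Model"]["values"])
```

A list is used instead of a set so that the merged values appear in a stable order, which keeps the rendered label reproducible across runs.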

VI Evaluation

In this section, we systematically evaluate 1) the design of the proposed GAI privacy label format and 2) the performance of the Repo2Label framework.

VI-A Evaluation of GAI Privacy Label Design

To examine the design of our proposed GAI privacy label format, we conducted a comprehensive online survey as a human evaluation. Ethical approval for this research was secured from our institution’s Institutional Review Board (IRB). It is important to note that before the formal experiment, we conducted a small-scale pilot study. This pilot study allowed us to preliminarily assess the duration required for the formal study, which averaged 11 minutes and 53 seconds. More critically, through actual task performance and participant feedback, we identified and refined ambiguous statements within the scales.

VI-A1 Participants Recruitment

Following the research approach proposed by Lin et al. [55], we recruited participants via the crowd worker platform Prolific (https://www.prolific.com). Hasegawa et al. [32] found that the participants of Usable Privacy and Security (UPS) field studies are often strongly biased towards individuals from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) countries. To proactively avoid unrepresentative experimental results, we employed a purposive sampling method [85] to balance the distribution of participants across gender, age, and nationality, thereby ensuring diversity in these aspects among our participants. With a total planned recruitment of 48 participants, we divided the experiment into two stages. Initially, we released 20 slots on the Prolific platform. Subsequently, we adjusted our recruitment criteria (age, nationality) to facilitate the participation of individuals who had not yet taken part in our experiment. In the end, we recruited 22 females and 26 males; 20 of them were from developing countries (normally regarded as non-WEIRD) and 28 were from developed countries (normally regarded as WEIRD).

VI-A2 Survey Design

We conducted the online survey via the Prolific platform to collect participants’ assessments of the design of the GAI privacy labels. After presenting the informed consent form to the participants, we provided a detailed information sheet as an introduction to the concept of the GAI privacy labels. We also listed several examples to help participants gain a better understanding. Our evaluation questionnaire primarily focused on key dimensions such as the understandability and interpretability of the GAI privacy labels, and participants were asked to rate statements about those aspects on a 5-point Likert scale. Previous UPS studies have reported several issues with privacy notices, including excessive terminology and lengthy content [47, 43] and poor practical utility because of unacceptable reading costs [60], and have stressed the importance of standardized privacy notification formats [40]. We therefore paid particular attention to user needs, reading burden, and concise representation to ensure the GAI privacy label is both normative and practical. Detailed questions are displayed in Table V.

TABLE V: Questionnaire and results of human evaluation for our GAI privacy label design. “Mdn.” stands for Median. “Distr.” stands for the Distribution of the responses (from left to right: strongly disagree to strongly agree).

No.	Aspect	Question	Mean	Mdn	SD	Distr.
$\textit{Q}_{1}$	User Needs	The GAI privacy nutrition label provides the information I concern about in terms of online privacy.	3.98	4	1.11
$\textit{Q}_{2}$	Reading Burden	The GAI privacy nutrition label can relieve my reading pressure compared to normal privacy policies.	4.02	4	1.11
$\textit{Q}_{3}$	Privacy Assurance	The GAI privacy nutrition label can enhance privacy assurance.	4.08	4.5	1.20
$\textit{Q}_{4}$	Trackability	For each label data field, the GAI privacy nutrition label can provide the source and the reason.	4.21	5	1.12
$\textit{Q}_{5}$	Understandability	The GAI privacy nutrition label is easy to be comprehended in a timely manner.	4.06	4	1.03
$\textit{Q}_{6}$	Interpretability	It is easy to interpret the meaning of each data field of the GAI privacy nutrition label.	4.31	5	0.98
$\textit{Q}_{7}$	Concise Representation	The representation and design of the GAI privacy nutrition label are compact and concise.	4.33	5	1.12
$\textit{Q}_{8}$	Appropriate Volume	The volume of GAI privacy nutrition label is neither too much nor too little, and it can be read in a timely manner.	4.5	5	0.89

VI-A3 Results Analysis

Table V shows the results of the human evaluation. Overall, the proposed design of GAI privacy labels received high ratings across all eight dimensions. In particular, Concise Representation and Appropriate Volume were rated the highest, with over 80% of participants selecting “Agree” or “Strongly Agree”. This indicates that the GAI privacy label delivers information succinctly and without redundancy, thus meeting the informational needs of users efficiently. Conversely, Reading Burden was identified as a dimension with relatively lower agreement. Nevertheless, 71% of participants still rated this dimension as “Agree” or “Strongly Agree”. The high mean (4.02) and median (4) scores for this dimension further support this finding, suggesting that while there is room for improvement, users generally perceive the reading burden as acceptable. These findings highlight the label’s quality and high level of user endorsement.

VI-B Evaluation of Repo2Label Framework

The accuracy of privacy labels is crucial, as inaccuracies can lead to information overload for users and severely undermine their trust in the software.

VI-B1 Dataset Construction

In Section II-B, we identified 148 GAI apps and their GitHub repositories. Given the absence of ground truth for privacy labels for GAI apps, we manually crafted a benchmark dataset for the evaluation. As this is a time-consuming task, we randomly sampled approximately 20% of the repositories, resulting in a total of 29 repositories for manual annotation. This involved 922 code files, with an average of 160 lines of code per file. Two experienced researchers painstakingly examined the content of the repository files of the sampled GAI apps and manually annotated the value of each privacy label field for each file. Both annotators have at least three years of experience in AI4SE (AI for Software Engineering) research and two years in UPS research. Listing 1 presents an example where the label Right to be Forgotten is marked as ‘Yes’. In this case, the GAI app offers users the functionality to clear their conversation history. Similarly, Listing 2 illustrates a code example where the label field AI-generated Watermarking is marked as ‘Yes’. In this instance, the GAI app is capable of adding watermarks to generated images, thereby mitigating potential copyright issues. Furthermore, when functions like the one depicted in Listing 3, which perform security checks on user inputs, are present in the repository, the Prompt Guardrail label is accordingly marked as ‘Yes’. Examples of Right to Lodge Complaints, Risk Notification, and Data Encryption are illustrated in Listing 4, Listing 5, and Listing 6, respectively.

async function deleteConversation(conversationId) {
  const accessToken = await getAccessToken();
  const resp = await fetch(
    `https://chat.openai.com/backend-api/conversation/${conversationId}`,
    {
      method: "PATCH",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${accessToken}`,
      },
      body: JSON.stringify({ is_visible: false }),
    }
  )
    .then((r) => r.json())
    .catch(() => ({}));
  if (resp?.success) {
    return true;
  }
  return false;
}

Listing 1: A code example of Right to be Forgotten label from Large Language and Vision Assistant (699 stars).

print("Creating invisible watermark encoder (see https://github.com/ShieldMnt/invisible-watermark)...")
wm = "StableDiffusionV1"
wm_encoder = WatermarkEncoder()
wm_encoder.set_watermark('bytes', wm.encode('utf-8'))

Listing 2: A code example of AI-generated Watermarking label from Stable Diffusion (63k stars).

def from_defaults(
    cls,
    temperature: float = 0.7,
    answer_style: int = 1,
    safety_setting: List["genai.SafetySetting"] = [],
) -> "GoogleTextSynthesizer":
    """Create a new Google AQA.

    Example:
      responder = GoogleTextSynthesizer.create(
          temperature=0.7,
          answer_style=AnswerStyle.ABSTRACTIVE,
          safety_setting=[
              SafetySetting(
                  category=HARM_CATEGORY_SEXUALLY_EXPLICIT,
                  threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
              ),
          ]
      )......"

Listing 3: A code example of Prompt Guardrail label from GPT Index (27k stars).

Please click the "Flag" button if you get any inappropriate answers! We will collect those to keep improving our moderator.

Listing 4: A code example of Right to Lodge Complaints label from Large Language and Vision Assistant (13k stars).

## Disclaimer

As a model capable of generating free form text, the output of the model is not guaranteed to be free of
offensive material, so appropriate caution is advised when using the model.

Listing 5: A code example of Risk Notification label from Macaw (456 stars).

def test_encrypt_decrypt():
    key = Fernet.generate_key()
    service = EncryptionService(key)

    original_text = "Hello, world!"
    encrypted = service.encrypt(original_text)
    decrypted = service.decrypt(encrypted)

    assert original_text == decrypted

Listing 6: A code example of Data Encryption label from AgentGPT (29k stars).

Ultimately, 13,830 labels were annotated based on 922 code files in 29 repositories. It took an average of 5 minutes to annotate each code file. For any disagreement in the annotation, the two annotators discussed the case to reach the same answer; if the disagreement persisted, a senior researcher joined the discussion to facilitate a resolution. The Cohen’s Kappa of the initial annotation is $\kappa=0.78$, indicating a high level of inter-rater agreement.
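For readers unfamiliar with the statistic, Cohen's Kappa compares the observed agreement $p_o$ between two annotators with the agreement $p_e$ expected by chance, $\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below computes it for two lists of annotations; the function name and the toy annotations are illustrative and are not the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = count_a.keys() | count_b.keys()
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:  # both raters used a single identical category
        return 1.0
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    annotator_1 = ["Yes", "Yes", "No", "No"]
    annotator_2 = ["Yes", "No", "No", "No"]
    print(cohens_kappa(annotator_1, annotator_2))
```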

Fig. 4 shows the frequency of the GAI Privacy Label fields marked as “Yes”. All 29 repositories include information on Tool Modality and Tool Functionality. It is important to note that not every repository explicitly specifies the base model used, as the base model can sometimes be determined by user input and some of the apps are Visual Studio Code plugins. Additionally, Working Details and Controller Contact exist in about half of the examined repositories. Within the Basic Info section, Target Users has the lowest ratio. In contrast to this attention to Basic Info, we observed a notable deficiency in the other sections. Only Data Retention, Risk Notification, Right to Access, and Right to be Forgotten are sporadically covered, each appearing infrequently. Moreover, no implementation related to Protection of Minors was detected in any of the real-world apps. These observations suggest that while developers consistently focus on basic information, they often overlook other crucial privacy-related alignment measures during the development process, potentially leaving significant gaps in privacy considerations. A comprehensive analysis of these findings is provided in Section VII-A.

TABLE VI: Performance of the Repo2Label framework. “Prec.” stands for precision and “Rec.” stands for recall. “V” stands for the verification stage. GPT-4 Turbo is a large multimodal model released by OpenAI. GPT-4o (“o” for “omni”) is the most advanced multimodal flagship model, released in May 2024. GPT-4o is twice as fast as GPT-4 Turbo, 50% cheaper, and significantly better at handling text in non-English languages [5].

		Basic Info			Data Rights			Risk Related			Additional Info			Average
LLM	Settings	Prec.	Rec.	F1	Prec.	Rec.	F1	Prec.	Rec.	F1	Prec.	Rec.	F1	Prec.	Rec.	F1
GPT-4o	zero-shot	.68	.84	.75	.76	.73	.74	.78	.90	.84	.62	.83	.71	.68	.84	.75
	zero-shot + V	.81	.89	.85	.76	.70	.73	.86	.90	.88	.83	.62	.71	.81	.88	.84
	few-shot	.67	.83	.74	.76	.73	.74	.50	.89	.64	.62	.83	.71	.67	.83	.74
	few-shot + V	.80	.87	.84	.76	.70	.73	.49	.89	.63	.62	.83	.71	.79	.87	.83
GPT-4 Turbo	zero-shot	.67	.69	.68	.50	.19	.27	.42	.84	.56	.23	.50	.32	.64	.68	.66
	zero-shot + V	.69	.70	.70	.50	.19	.27	.57	.84	.68	.25	.50	.33	.68	.69	.68
	few-shot	.58	.72	.64	.59	.63	.61	.61	.74	.67	.18	.43	.25	.58	.71	.64
	few-shot + V	.60	.73	.66	.58	.70	.63	.55	.84	.67	.38	.50	.43	.60	.73	.66

VI-B2 Evaluation of Repo2Label

Table VI reports the performance of Repo2Label in generating GAI privacy labels. We utilized widely recognized and high-performing foundation LLMs, GPT-4o [5] and GPT-4 Turbo [5], to drive the AI units in the framework. Additionally, we conducted a comparative analysis of the results before and after applying the verification process described in Section V-D. Given the In-Context Learning (ICL) capabilities of LLMs [23], we also compared results under zero-shot and few-shot settings.

Results show that GPT-4o greatly outperforms GPT-4 Turbo under all settings, achieving an average F1-score of 0.84 under its best setting, compared with at most 0.68 for GPT-4 Turbo (Table VI). Contrary to our expectations, the few-shot strategy does not significantly increase the performance compared to the zero-shot strategy, especially for GPT-4 Turbo. Upon manual inspection of the results from Repo2Label, we found that examples that are not sufficiently representative can cause the LLM to capture irrelevant content in the few-shot setting. For instance, when given examples related to the Protection of Minors, as illustrated in Listing 7, the LLMs incorrectly classified all code files dealing with Sexual Content as related to the Protection of Minors. This misclassification occurred because the provided examples led the LLMs to overgeneralize the context.

Additionally, the verification step improved the average precision, recall, and F1-score in every configuration. Specifically, with the GPT-4o model and a zero-shot setup, applying verification increased precision from 0.68 to 0.81 (+19.12%), recall from 0.84 to 0.88 (+4.76%), and F1-score from 0.75 to 0.84 (+12.00%). Other configurations involving few-shot learning and GPT-4 Turbo also showed improvements after verification, although the extent varied. The least improvement was observed with GPT-4 Turbo in a zero-shot setup. In short, using GPT-4o with the zero-shot method followed by verification achieved the best performance, with a precision of 0.81, recall of 0.88, and F1-score of 0.84.
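As a reminder of how these figures relate, the sketch below shows the standard definitions of precision, recall, and F1-score, together with the relative change used for the reported percentages. The function names are illustrative, and the check at the end uses only the averaged values from Table VI.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def relative_change(before: float, after: float) -> float:
    """Relative change in percent, e.g. the gain after adding verification."""
    return (after - before) / before * 100


if __name__ == "__main__":
    # GPT-4o, zero-shot + V (averages in Table VI): precision 0.81, recall 0.88.
    print(round(f1_score(0.81, 0.88), 2))
    # Precision gain from zero-shot (0.68) to zero-shot + V (0.81).
    print(round(relative_change(0.68, 0.81), 2))
```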

Llama Guard safety taxonomy:

- Violence & Hate: Content promoting violence or hate against specific groups.
- Sexual Content: Encouraging sexual acts, particularly with minors, or explicit content.

Listing 7: A code example of Protection of Minors label from GPT Index (27k stars).

VI-B3 Comparison between Repo2Label and self-declared privacy policies

TABLE VII: Results of quality examination on self-declared privacy policies of GAI apps.

	Precision	Recall	F1
Basic Info	0.17	0.02	0.04
Data Rights	0.12	0.38	0.18
Risk Related	0.50	0.25	0.33
Additional Info	0.10	0.20	0.13
Overall	0.15	0.11	0.13

Among the 29 annotated GAI apps and their repositories, 11 GAI apps provide a privacy policy. We manually scrutinized these privacy policies with respect to the aspects covered by the GAI privacy labels and compared them against the benchmark annotations. Table VII presents the quality of the disclosures in GAI app privacy policies, with an overall F1-score of only 0.13. This indicates that the long-standing issue of under-disclosure in privacy policies persists in the GAI app context, and may even be more pronounced. Moreover, because there is no enforcement from a central market, such as the Google Play app store for mobile apps, developers do not proactively work on providing authentic privacy policies. For example, Large Language and Vision Assistant has attracted over 13k GitHub stars, but its privacy policy contains only 63 words. Although the examined privacy policies state various data rights, our manual annotation results reveal that the actual implementation of these apps does not align with their stated privacy policies. For instance, 72.2% of the privacy policies claim that their services are not intended for use by children. However, the code contains no age verification or other validation functions to enforce this statement.

VII Discussion

This discussion aims to provide a reference and basis for a more comprehensive understanding of the significance and implications of our study.

VII-A Regulatory (In)compliance of Open-source GAI Apps

By observing the manually annotated dataset of privacy practices of GAI apps, we notice a significant discrepancy between GAI app implementations and regulations. Notably, a label marked as “No” does not necessarily indicate that the GAI app violates regulations, and vice versa. There are several reasons for this: 1) Open-source GAI apps are not commercial software, and they may not be subject to certain regulations; 2) Regulations vary by region, and their interpretation could be subjective and controversial; and 3) The implementation of privacy practices may not rigorously comply with regulatory requirements. Nonetheless, our findings still indicate that the open-source GAI app community commonly lacks awareness of and responsiveness to those regulations. At the early stage of the market, compliance often cannot be guaranteed, but we advocate for the community to pay attention to these requirements.

VII-B Comparison with Other Privacy Notices

To establish transparency surrounding AI-enabled products (e.g., GAI apps) and thus build trust, researchers have also proposed various types of privacy notices in addition to privacy policies and privacy labels, such as Model Cards [65, 53] and AI Bills of Materials (AIBOMs) [89]. Model Cards are files that accompany foundation models (e.g., Llama) and provide handy information. They are essential for discoverability, reproducibility, and sharing. Typically, Model Cards efficiently provide critical information about the model’s dataset, evaluation results, and potential ethical considerations. However, Model Cards do not provide privacy disclosures, and the dynamic nature of model development makes it challenging to keep their content up to date. AIBOMs offer a more specialized disclosure aimed at experts, focusing on elements such as software dependencies, versions, and licenses. While Model Cards and AIBOMs play a crucial role in enhancing transparency and promoting responsible AI usage, they currently exist only as a community norm and are not proactively aligned with privacy and GAI-specific laws and regulations.

VII-C Broader Impact for Stakeholders

VII-C1 GAI app developers

Previous research has indicated that developers tend to prioritize the development of system-specific functionalities, often neglecting privacy considerations as a primary concern [56, 12, 49]. Our study contributes to the creation of more accurate privacy labels, thereby relieving developers of the time-consuming task of creating these labels manually. In contrast to the self-declaration approach traditionally required for generating privacy labels, our work shows better usability and compliance. Furthermore, collaborative development of GAI apps is commonplace, and individual developers may not retain all tool-specific details in memory. The privacy labels generated by our Repo2Label framework provide developers with a reference for understanding the implementation and rationale behind each privacy practice. This facilitates better communication and reduces the overhead for developers who need to collaborate with others. Automating the generation of privacy labels with Repo2Label also supports timely updates and adherence to legal regulations.

VII-C2 End-users

The concise and easily readable format of GAI privacy labels helps users quickly grasp the privacy practices of GAI apps. In comparison to lengthy and academically demanding privacy policies, these privacy labels lower the barrier for ordinary users to understand GAI apps, due to their comprehensibility and acceptability.

While the aim of developing GAI is to benefit a broader demographic, the increasing complexity of privacy practices in these apps can create additional understanding barriers. Our approach potentially broadens the accessibility of GAI apps to a more diverse range of users. This includes individuals who may not possess advanced literacy skills as well as those who face challenges related to reading disorders. By making privacy practices more transparent and understandable, we can ensure that a broader demographic can benefit from the development of these technologies.

VII-C3 Regulators

The rapid advancement of GAI has introduced new risks, positioning regulators as pivotal players in addressing these issues. Repo2Label aims to respond to high-level requirements about transparency and the presentation of privacy disclosures. For instance, the Model AI Governance Framework for Generative AI proposed in Singapore [9] calls for GAI systems to provide model information to downstream users in a format akin to privacy labels, stating that “End-users need greater understanding of content provenance across the content lifecycle and to learn to utilise tools to verify for authenticity.” Additionally, we advocate for regulators to be more agile in this dynamic market and to provide more proactive UPS solutions that help practitioners comply with regulations.

VII-D Broader Impact for the Ecosystem

Privacy labels can enhance data controllers’ internal compliance routines [68]. Hence, GAI privacy labels can contribute to the security of the entire open-source software ecosystem. Our privacy label design serves as a reference model for building responsible AI. Drawing from research in related fields, open-source projects with Model Cards have seen a significant increase in downloads [53], primarily due to improved transparency. This supports the expectation that GAI apps with privacy labels in the open-source community will also experience higher user adoption. Our regulation-driven privacy label helps align privacy practices and technologies with current legal standards.

VII-E Threats to Validity

1) Internal Validity: We access the aforementioned GPT models through OpenAI’s official API in our experiments and evaluate the performance of our framework on the GAI app dataset. The results from GPT models may vary between runs due to the probabilistic nature of the models. We have taken extensive measures to mitigate this risk, including providing references when generating labels and implementing a verification step to improve accuracy. We believe that with the continued development of more advanced foundation models, this issue can be further alleviated.

2) External Validity: Our privacy label design incorporates common requirements from various regulations to reflect the current regulatory focus on generative AI tools. However, for our privacy labels to be deployed in individual countries, they must adhere to specific local regulations. In addition, although our framework is evaluated on a dataset based on GitHub code repositories, the core analysis focuses on individual code files, which makes it easy to generalize to other code repositories.

VIII Conclusion

GAI apps have greatly facilitated and enriched our daily lives, yet have raised concerns about transparency in their privacy practices. Traditional privacy policies often fail to effectively communicate essential privacy information due to their complexity and length, and open-source community developers often neglect privacy practices even more. Only 12.2% of examined open-source GAI apps provide a privacy policy. In this paper, we propose a regulation-driven GAI privacy label and introduce Repo2Label, a novel framework for automatically generating these labels based on code repositories. Our user study indicates endorsement of the proposed GAI privacy label design. Additionally, Repo2Label achieves a precision of 0.81, recall of 0.88, and F1-score of 0.84 under optimal settings (GPT-4o and verification) based on the benchmark dataset, significantly outperforming the developer self-declared privacy notices. Our findings suggest that Repo2Label could serve as a significant tool for bolstering the privacy transparency of GAI apps, making them more practical and responsible.
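As a quick consistency check on the reported figures, the short sketch below recomputes the F1-score from the stated precision and recall. It assumes that the reported F1-score is the standard harmonic mean of the aggregate precision and recall; if the score were instead averaged per label or per repository, the result could differ slightly. The function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for Repo2Label under the best setting (GPT-4o with verification).
reported_precision = 0.81
reported_recall = 0.88

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 0.84
```

Under this assumption, the recomputed value matches the reported F1-score of 0.84 after rounding to two decimal places.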

References

[1] Ai regulation is coming- what is the likely outcome? https://www.csis.org/blogs/strategic-technologies-blog/ai-regulation-coming-what-likely-outcome.
[2] “California consumer privacy act of 2018 (CCPA),” https://cppa.ca.gov/regulations/pdf/cppa_act.pdf.
[3] “General data protection regulation (GDPR),” https://gdpr-info.eu.
[4] Github rest api documentation. https://docs.github.com/en/rest?apiVersion=2022-11-28.
[5] “Openai.” https://platform.openai.com/docs/models.
[6] “Personal information protection law of the people’s republic of china (PIPL),” http://www.npc.gov.cn/npc/index.html.
[7] Repositories ranking of github. https://gitstar-ranking.com/repositories.
[8] “Administrative measures for generative artificial intelligence services.” http://www.cac.gov.cn/2023-07/13/c_1690898327029107.htm, 2023.
[9] A. V. F. (AIVF) and I. M. D. A. (IMDA), “Model ai governance framework for generative ai.” https://aiverifyfoundation.sg/downloads/Proposed_MGF_Gen_AI_2024.pdf, 2024.
[10] S. Ali, P. Ravi, R. Williams, D. DiPaola, and C. Breazeal, “Constructing dreams using generative ai,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 21, 2024, pp. 23268–23275.
[11] Amazon, “Amazon appstore privacy labels,” https://developer.amazon.com/docs/app-submission/appstore-privacy-labels.html, 2023.
[12] R. Balebako and L. Cranor, “Improving app privacy: Nudging app developers to protect user privacy,” IEEE Security & Privacy, vol. 12, no. 4, pp. 55–58, 2014.
[13] J. Blakkarly and D. Graham. (2022) Privacy policy comparison reveals half have poor readability. choice.com.au/consumers-and-data/.
[14] D. Bui, B. Tang, and K. G. Shin, “Detection of inconsistencies in privacy practices of browser extensions,” in 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023, pp. 2780–2798.
[15] I. C. Campbell, “Apple will require apps to add privacy ‘nutrition labels’ starting december 8th,” The Verge, 2020.
[16] N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson et al., “Extracting training data from large language models,” in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 2633–2650.
[17] N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tramer, B. Balle, D. Ippolito, and E. Wallace, “Extracting training data from diffusion models,” in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 5253–5270.
[18] C. N. I. S. S. T. Committee, “Basic security requirements for generative artificial intelligence service.” https://cset.georgetown.edu/wp-content/uploads/t0574_generative_AI_safety_EN.pdf, 2023.
[19] M. Company, “The state of ai in 2023: Generative ai’s breakout year,” https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year, 2023.
[20] ——, “The state of ai in early 2024: Gen ai adoption spikes and starts to generate value,” https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai#/, 2024.
[21] L. F. Cranor, “Mobile-app privacy nutrition labels missing key ingredients for success,” Communications of the ACM, vol. 65, no. 11, pp. 26–28, 2022.
[22] H. Cui, R. Trimananda, A. Markopoulou, and S. Jordan, “PoliGraph: Automated privacy policy analysis using knowledge graphs,” in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 1037–1054.
[23] Q. Dong, L. Li, D. Dai, C. Zheng, Z. Wu, B. Chang, X. Sun, J. Xu, and Z. Sui, “A survey on in-context learning,” arXiv preprint arXiv:2301.00234, 2022.
[24] P. Emami-Naeini, Y. Agarwal, L. F. Cranor, and H. Hibshi, “Ask the experts: What should be on an iot privacy and security label?” in 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020, pp. 447–464.
[25] M. Fırat and S. Kuleli, “What if gpt4 became autonomous: The auto-gpt project and use cases,” Journal of Emerging Computer Technologies, vol. 3, no. 1, pp. 1–6, 2023.
[26] F. Fui-Hoon Nah, R. Zheng, J. Cai, K. Siau, and L. Chen, “Generative ai and chatgpt: Applications, challenges, and ai-human collaboration,” pp. 277–304, 2023.
[27] A. Golda, K. Mekonen, A. Pandey, A. Singh, V. Hassija, V. Chamola, and B. Sikdar, “Privacy and security concerns in generative ai: A comprehensive survey,” IEEE Access, 2024.
[28] GOV.UK, “Global leaders agree to launch first international network of ai safety institutes to boost cooperation of ai,” https://www.gov.uk/government/news/global-leaders-agree-to-launch-first-international-network-of-ai-safety-institutes-to-boost-understanding-of-ai, 2024.
[29] ——, “New commitment to deepen work on severe ai risks concludes ai seoul summit,” https://www.gov.uk/government/news/new-commitmentto-deepen-work-on-severe-ai-risks-concludes-ai-seoul-summit, 2024.
[30] L. Han, S. Pan, Z. Xing, J. Sun, S. Yitagesu, X. Zhang, and Z. Feng, “Don’t chase your tail! missing key aspects augmentation in textual vulnerability descriptions of long-tail software through feature inference,” arXiv preprint arXiv:2405.07430, 2024.
[31] H. Harkous, K. Fawaz, R. Lebret, F. Schaub, K. G. Shin, and K. Aberer, “Polisis: Automated analysis and presentation of privacy policies using deep learning,” in 27th USENIX Security Symposium (USENIX Security 18), 2018, pp. 531–548.
[32] A. A. Hasegawa, D. Inoue, and M. Akiyama, “How weird is usable privacy and security research?”
[33] X. Hou, Y. Zhao, Y. Liu, Z. Yang, K. Wang, L. Li, X. Luo, D. Lo, J. Grundy, and H. Wang, “Large language models for software engineering: A systematic literature review,” arXiv preprint arXiv:2308.10620, 2023.
[34] J. Huang, H. Shao, and K. C.-C. Chang, “Are large pre-trained language models leaking your personal information?” arXiv preprint arXiv:2205.12628, 2022.
[35] L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin et al., “A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions,” arXiv preprint arXiv:2311.05232, 2023.
[36] M. Javaid, A. Haleem, and R. P. Singh, “A study on chatgpt for industry 4.0: Background, potentials, challenges, and eventualities,” Journal of Economy and Technology, vol. 1, pp. 127–143, 2023.
[37] Z. Ji, N. Lee, R. Frieske, T. Yu, D. Su, Y. Xu, E. Ishii, Y. J. Bang, A. Madotto, and P. Fung, “Survey of hallucination in natural language generation,” ACM Computing Surveys, vol. 55, no. 12, pp. 1–38, 2023.
[38] P. G. Kelley, “Designing a privacy label: assisting consumer understanding of online privacy practices,” in CHI’09 Extended Abstracts on Human Factors in Computing Systems, 2009, pp. 3347–3352.
[39] P. G. Kelley, J. Bresee, L. F. Cranor, and R. W. Reeder, “A “nutrition label” for privacy,” in Proceedings of the 5th Symposium on Usable Privacy and Security, 2009, pp. 1–12.
[40] P. G. Kelley, L. Cesca, J. Bresee, and L. F. Cranor, “Standardizing privacy notices: an online study of the nutrition label approach,” in Proceedings of the SIGCHI Conference on Human factors in Computing Systems, 2010, pp. 1573–1582.
[41] P. G. Kelley, L. F. Cranor, and N. Sadeh, “Privacy as part of the app decision-making process,” in Proceedings of the SIGCHI conference on human factors in computing systems, 2013, pp. 3393–3402.
[42] K. Kenthapadi, H. Lakkaraju, and N. Rajani, “Generative ai meets responsible ai: Practical challenges and opportunities,” in Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023, pp. 5805–5806.
[43] J. P. Kincaid, R. P. Fishburne Jr, R. L. Rogers, and B. S. Chissom, “Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel,” 1975.
[44] S. Koch, M. Wessels, B. Altpeter, M. Olvermann, and M. Johns, “Keeping privacy labels honest,” Proceedings on Privacy Enhancing Technologies, 2022.
[45] K. Kollnig, A. Shuba, M. Van Kleek, R. Binns, and N. Shadbolt, “Goodbye tracking? impact of ios app tracking transparency and privacy labels,” in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, pp. 508–520.
[46] J. Korunovska, B. Kamleitner, and S. Spiekermann, “The challenges and impact of privacy policy comprehension,” arXiv preprint arXiv:2005.08967, 2020.
[47] B. Krumay and J. Klar, “Readability of privacy policies,” in Data and Applications Security and Privacy XXXIV: 34th Annual IFIP WG 11.3 Conference, DBSec 2020, Regensburg, Germany, June 25–26, 2020, Proceedings 34. Springer, 2020, pp. 388–399.
[48] G. S. Kyle Daigle, “Octoverse: The state of open source and rise of ai in 2023,” https://github.blog/2023-11-08-the-state-of-open-source-and-ai/, 2023.
[49] T. Li, Y. Agarwal, and J. I. Hong, “Coconut: An ide plugin for developing privacy-friendly apps,” Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 2, no. 4, pp. 1–35, 2018.
[50] T. Li, L. F. Cranor, Y. Agarwal, and J. I. Hong, “Matcha: An ide plugin for creating accurate privacy nutrition labels,” Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 8, no. 1, pp. 1–38, 2024.
[51] T. Li, K. Reiman, Y. Agarwal, L. F. Cranor, and J. I. Hong, “Understanding challenges for developers to create accurate privacy nutrition labels,” in Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 2022, pp. 1–24.
[52] Y. Li, D. Chen, T. Li, Y. Agarwal, L. F. Cranor, and J. I. Hong, “Understanding ios privacy nutrition labels: An exploratory large-scale analysis of app store data,” in CHI Conference on Human Factors in Computing Systems Extended Abstracts, 2022, pp. 1–7.
[53] W. Liang, N. Rajani, X. Yang, E. Ozoani, E. Wu, Y. Chen, D. S. Smith, and J. Zou, “What’s documented in ai? systematic analysis of 32k ai model cards,” arXiv preprint arXiv:2402.05160, 2024.
[54] D. Liao, S. Pan, Q. Huang, X. Ren, Z. Xing, H. Jin, and Q. Li, “Context-aware code generation framework for code repositories: Local, global, and third-party library awareness,” arXiv preprint arXiv:2312.05772, 2023.
[55] Y. Lin, J. Juneja, E. Birrell, and L. Cranor, “Data safety vs. app privacy: Comparing the usability of android and ios privacy labels,” arXiv preprint arXiv:2312.03918, 2023.
[56] K.-U. Loser and M. Degeling, “Security and privacy as hygiene factors of developer behavior in small and agile teams,” in ICT and Society: 11th IFIP TC 9 International Conference on Human Choice and Computers, HCC11 2014, Turku, Finland, July 30–August 1, 2014. Proceedings 11. Springer, 2014, pp. 255–265.
[57] Q. Lyu, S. Havaldar, A. Stein, L. Zhang, D. Rao, E. Wong, M. Apidianaki, and C. Callison-Burch, “Faithful chain-of-thought reasoning,” arXiv preprint arXiv:2301.13379, 2023.
[58] Y. Lyu, T. Hao, and Z. Yi, “Design futures with gai: Exploring the potential of generative ai tools in collaborative speculation,” in International Conference on Human-Computer Interaction. Springer, 2023, pp. 149–161.
[59] T. Madiega, “Artificial intelligence act,” European Parliament: European Parliamentary Research Service, 2021.
[60] A. M. McDonald and L. F. Cranor, “The cost of reading privacy policies,” Isjlp, vol. 4, p. 543, 2008.
[61] G. Miller, “Human memory and the storage of information,” IRE Transactions on Information Theory, vol. 2, no. 3, pp. 129–137, 1956.
[62] G. A. Miller, “The magical number seven, plus or minus two: Some limits on our capacity for processing information.” Psychological review, vol. 63, no. 2, p. 81, 1956.
[63] D. Milmo, “Chatgpt reaches 100 million users two months after launch,” The Guardian, vol. 2, 2023.
[64] G. R. Milne and M. J. Culnan, “Strategies for reducing online privacy risks: Why consumers read (or don’t read) online privacy notices,” Journal of interactive marketing, vol. 18, no. 3, pp. 15–29, 2004.
[65] M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman, B. Hutchinson, E. Spitzer, I. D. Raji, and T. Gebru, “Model cards for model reporting,” in Proceedings of the conference on fairness, accountability, and transparency, 2019, pp. 220–229.
[66] S. Mohamadi, G. Mujtaba, N. Le, G. Doretto, and D. A. Adjeroh, “Chatgpt in the age of generative ai and large language models: a concise survey,” arXiv preprint arXiv:2307.04251, 2023.
[67] A. Nigam, R. Pollice, M. Krenn, G. dos Passos Gomes, and A. Aspuru-Guzik, “Beyond generative models: superfast traversal, optimization, novelty, exploration and discovery (stoned) algorithm for molecules using selfies,” Chemical science, vol. 12, no. 20, pp. 7079–7090, 2021.
[68] M. Novović, “Privacy nutrition labels, app store and the gdpr: Unintended consequences?” Journal of Data Protection & Privacy, vol. 5, no. 3, pp. 267–280, 2022.
[69] O. of the Privacy Commissioner of Canada, “Principles for responsible, trustworthy and privacy-protective generative ai technologies.” https://www.priv.gc.ca/en/privacy-topics/technology/artificial-intelligence/gd_principles_ai/#wb-cont, 2023.
[70] OpenAI, “Introducing chatgpt.” https://openai.com/index/chatgpt, 2022.
[71] S. Ortiz, “What is auto-gpt? everything to know about the next powerful ai tool,” 2023.
[72] L. Pan, M. Saxon, W. Xu, D. Nathani, X. Wang, and W. Y. Wang, “Automatically correcting large language models: Surveying the landscape of diverse self-correction strategies,” arXiv preprint arXiv:2308.03188, 2023.
[73] S. Pan, T. Guo, L. Zhang, P. Liu, Z. Xing, and X. Sun, “A large-scale investigation of semantically incompatible apis behind compatibility issues in android apps,” arXiv preprint arXiv:2406.17431, 2024.
[74] S. Pan, T. Hoang, D. Zhang, Z. Xing, X. Xu, Q. Lu, and M. Staples, “Toward the cure of privacy policy reading phobia: Automated generation of privacy nutrition labels from privacy policies,” arXiv preprint arXiv:2306.10923, 2023.
[75] S. Pan, Z. Tao, T. Hoang, D. Zhang, Z. Xing, X. Xu, M. Staples, and D. Lo, “Seeprivacy: Automated contextual privacy policy generation for mobile applications,” 2023.
[76] S. Pan, D. Zhang, M. Staples, Z. Xing, J. Chen, X. Xu, and T. Hoang, “Is it a trap? a large-scale empirical study and comprehensive assessment of online automated privacy policy generators for mobile apps,” USENIX Security 2024, 2024.
[77] M. Pogla, “Auto-gpt: Understanding its constraints and limitations,” https://autogpt.net/auto-gpt-understanding-its-constraints-and-limitations/, 2023.
[78] J. Porter, “Google play store’s app privacy labels start appearing,” The Verge, 2022.
[79] J. R. Reidenberg, J. Bhatia, T. D. Breaux, and T. B. Norton, “Ambiguity in privacy policies and the impact of regulation,” The Journal of Legal Studies, vol. 45, no. S2, pp. S163–S190, 2016.
[80] T. B. Richards et al., “Auto-gpt: An autonomous gpt-4 experiment,” Accessed May, vol. 3, p. 2023, 2023.
[81] H. Saleem, “What is generative ai and how much power does it have,” Published Aug, vol. 20, 2020.
[82] O. Sbai, M. Elhoseiny, A. Bordes, Y. LeCun, and C. Couprie, “Design: Design inspiration from generative networks,” in Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018, pp. 0–0.
[83] R. I. Singh, M. Sumeeth, and J. Miller, “Evaluating the readability of privacy policies in mobile environments,” International Journal of Mobile Human Computer Interaction (IJMHCI), vol. 3, no. 1, pp. 55–78, 2011.
[84] L. Sun, Y. Huang, H. Wang, S. Wu, Q. Zhang, C. Gao, Y. Huang, W. Lyu, Y. Zhang, X. Li et al., “Trustllm: Trustworthiness in large language models,” arXiv preprint arXiv:2401.05561, 2024.
[85] M. D. C. Tongco, “Purposive sampling as a tool for informant selection,” 2007.
[86] X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou, “Self-consistency improves chain of thought reasoning in language models,” arXiv preprint arXiv:2203.11171, 2022.
[87] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., “Chain-of-thought prompting elicits reasoning in large language models,” Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.
[88] J. D. Weisz, M. Muller, J. He, and S. Houde, “Toward general design principles for generative ai applications,” arXiv preprint arXiv:2301.05578, 2023.
[89] B. Xia, D. Zhang, Y. Liu, Q. Lu, Z. Xing, and L. Zhu, “Trust in software supply chains: Blockchain-enabled sbom and the aibom future,” arXiv preprint arXiv:2307.02088, 2023.
[90] Y. Xiao, Z. Li, Y. Qin, X. Bai, J. Guan, X. Liao, and L. Xing, “Lalaine: Measuring and characterizing non-compliance of apple privacy labels,” in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 1091–1108.
[91] D. Zhang, P. Finckenberg-Broman, T. Hoang, S. Pan, Z. Xing, M. Staples, and X. Xu, “Right to be forgotten in the era of large language models: Implications, challenges, and solutions,” arXiv preprint arXiv:2307.03941, 2023.
[92] D. Zhang, S. Pan, T. Hoang, Z. Xing, M. Staples, X. Xu, L. Yao, Q. Lu, and L. Zhu, “To be forgotten or to be fair: Unveiling fairness implications of machine unlearning methods,” AI and Ethics, vol. 4, no. 1, pp. 83–93, 2024.
[93] D. Zhang, B. Xia, Y. Liu, X. Xu, T. Hoang, Z. Xing, M. Staples, Q. Lu, and L. Zhu, “Privacy and copyright protection in generative ai: A lifecycle perspective,” in Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI, 2024, pp. 92–97.
[94] S. Zhang, Y. Feng, Y. Yao, L. F. Cranor, and N. Sadeh, “How usable are ios app privacy labels?” Proceedings on Privacy Enhancing Technologies, 2022.
[95] W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong et al., “A survey of large language models,” arXiv preprint arXiv:2303.18223, 2023.