research-article

Open access

The Unequal Opportunities of Large Language Models: Examining Demographic Biases in Job Recommendations by ChatGPT and LLaMA

Authors:

Robert McCormack,

Fred MorstatterAuthors Info & Claims

EAAMO '23: Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization

Article No.: 34, Pages 1 - 15

https://doi.org/10.1145/3617694.3623257

Published: 30 October 2023 Publication History

All formats PDF

Abstract

Warning: This paper discusses and contains content that is offensive or upsetting. Large Language Models (LLMs) have seen widespread deployment in various real-world applications. Understanding these biases is crucial to comprehend the potential downstream consequences when using LLMs to make decisions, particularly for historically disadvantaged groups. In this work, we propose a simple method for analyzing and comparing demographic bias in LLMs, through the lens of job recommendations. We demonstrate the effectiveness of our method by measuring intersectional biases within ChatGPT and LLaMA, two cutting-edge LLMs. Our experiments primarily focus on uncovering gender identity and nationality bias; however, our method can be extended to examine biases associated with any intersection of demographic identities. We identify distinct biases in both models toward various demographic identities, such as both models consistently suggesting low-paying jobs for Mexican workers or preferring to recommend secretarial roles to women. Our study highlights the importance of measuring the bias of LLMs in downstream applications to understand the potential for harm and inequitable outcomes. Our code is available at https://github.com/Abel2Code/Unequal-Opportunities-of-LLMs.

References

[1]

Abubakar Abid, Maheen Farooqi, and James Zou. 2021. Persistent Anti-Muslim Bias in Large Language Models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (Virtual Event, USA) (AIES ’21). Association for Computing Machinery, New York, NY, USA, 298–306. https://doi.org/10.1145/3461702.3462624

Digital Library

[2]

Heather Antecol and Kelly Bedard. 2004. The racial wage gap: The importance of labor force attachment differences across black, Mexican, and white men. Journal of Human Resources 39, 2 (2004), 564–583.

[3]

Bard 2023. Google AI Updates: Bard and New AI Features in Search. Retrieved May 7, 2023 from https://blog.google/technology/ai/bard-google-ai-search-updates/

[4]

Su Lin Blodgett, Solon Barocas, Hal Daumé III, and Hanna Wallach. 2020. Language (Technology) is Power: A Critical Survey of “Bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 5454–5476. https://doi.org/10.18653/v1/2020.acl-main.485

[5]

Su Lin Blodgett, Gilsinia Lopez, Alexandra Olteanu, Robert Sim, and Hanna M. Wallach. 2021. Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets. In Annual Meeting of the Association for Computational Linguistics.

[6]

Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356, 6334 (2017), 183–186. https://doi.org/10.1126/science.aal4230 arXiv:https://www.science.org/doi/pdf/10.1126/science.aal4230

[7]

ChatGPT 2023. Introducing ChatGPT. Retrieved May 7, 2023 from https://openai.com/blog/chatgpt

[8]

Roi Cohen, Mor Geva, Jonathan Berant, and Amir Globerson. 2023. Crawling The Internal Knowledge-Base of Language Models. In Findings of the Association for Computational Linguistics: EACL 2023. Association for Computational Linguistics, Dubrovnik, Croatia, 1856–1869. https://aclanthology.org/2023.findings-eacl.139

[9]

Emilio Ferrara. 2023. Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models. arxiv:2304.03738 [cs.CY]

[10]

Maarten Grootendorst. 2022. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arxiv:2203.05794 [cs.CL]

[11]

Yulin Hswen, Qiuyuan Qin, David R Williams, K Viswanath, SV Subramanian, and John S Brownstein. 2020. Online negative sentiment towards Mexicans and Hispanics and impact on mental well-being: A time-series analysis of social media data during the 2016 United States presidential election. Heliyon 6, 9 (2020).

[12]

HuggingChat 2023. HuggingChat. Retrieved May 7, 2023 from https://huggingface.co/chat/

[13]

HuggingFace 2022. sentence-transformers/all-MiniLM-L6-v2. Retrieved May 7, 2023 from https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

[14]

Hannah Kirk, Yennie Jun, Haider Iqbal, Elias Benussi, Filippo Volpin, Frederic A. Dreyer, Aleksandar Shtedritski, and Yuki M. Asano. 2021. Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models. arxiv:2102.04130 [cs.CL]

[15]

Li Lucy and David Bamman. 2021. Gender and Representation Bias in GPT-3 Generated Stories. In Proceedings of the Third Workshop on Narrative Understanding. Association for Computational Linguistics, Virtual, 48–55. https://doi.org/10.18653/v1/2021.nuse-1.5

[16]

Todor Markov, Chong Zhang, Sandhini Agarwal, Tyna Eloundou, Teddy Lee, Steven Adler, Angela Jiang, and Lilian Weng. 2023. A Holistic Approach to Undesired Content Detection in the Real World. arxiv:2208.03274 [cs.CL]

[17]

Douglas S Massey. 2009. Racial formation in theory and practice: The case of Mexicans in the United States. Race and social problems 1 (2009), 12–26.

[18]

Robert W. McGee. 2023. Is Chat Gpt Biased Against Conservatives? An Empirical Study. (15 February 2023). https://doi.org/10.2139/ssrn.4359405

[19]

Leland McInnes, John Healy, and James Melville. 2020. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arxiv:1802.03426 [stat.ML]

[20]

Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2021. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 54, 6, Article 115 (jul 2021), 35 pages. https://doi.org/10.1145/3457607

Digital Library

[21]

Moin Nadeem, Anna Bethke, and Siva Reddy. 2021. StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 5356–5371. https://doi.org/10.18653/v1/2021.acl-long.416

[22]

Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. 2019. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 6464 (2019), 447–453. https://doi.org/10.1126/science.aax2342 arXiv:https://www.science.org/doi/pdf/10.1126/science.aax2342

[23]

Orestis Papakyriakopoulos and Ethan Zuckerman. 2021. The media during the rise of trump: Identity politics, immigration," Mexican" demonization and hate-crime. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 15. 467–478.

[24]

Filip Radlinski, Krisztian Balog, Fernando Diaz, Lucas Gill Dixon, and Ben Wedin. 2022. On Natural Language User Profiles for Transparent and Scrutable Recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22).

Digital Library

[25]

Cordelia W. Reimers. 1983. Labor Market Discrimination Against Hispanic and Black Men. The Review of Economics and Statistics 65, 4 (1983), 570–579. http://www.jstor.org/stable/1935925

[26]

David Rozado. 2023. The Political Biases of ChatGPT. Social Sciences 12, 3 (2023). https://doi.org/10.3390/socsci12030148

[27]

Jérôme Rutinowski, Sven Franke, Jan Endendyk, Ina Dormuth, and Markus Pauly. 2023. The Self-Perception and Political Biases of ChatGPT. arxiv:2304.07333 [cs.CY]

[28]

Preethi Seshadri, Pouya Pezeshkpour, and Sameer Singh. 2022. Quantifying Social Biases Using Templates is Unreliable. arxiv:2210.04337 [cs.CL]

[29]

Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng. 2019. The Woman Worked as a Babysitter: On Biases in Language Generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 3407–3412. https://doi.org/10.18653/v1/D19-1339

[30]

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. 2023. LLaMA: Open and Efficient Foundation Language Models. arxiv:2302.13971 [cs.CL]

[31]

Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, and Stuart Shieber. 2020. Investigating Gender Bias in Language Models Using Causal Mediation Analysis. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 12388–12401. https://proceedings.neurips.cc/paper_files/paper/2020/file/92650b2e92217715fe312e6fa7b90d82-Paper.pdf

[32]

Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry Gilbert, Ashraf Elnashar, Jesse Spencer-Smith, and Douglas C Schmidt. 2023. A prompt pattern catalog to enhance prompt engineering with chatgpt. arXiv preprint arXiv:2302.11382 (2023).

[33]

Terry Yue Zhuo, Yujin Huang, Chunyang Chen, and Zhenchang Xing. 2023. Exploring AI Ethics of ChatGPT: A Diagnostic Analysis. arxiv:2301.12867 [cs.CL]

Cited By

Georgiou G(2024)ChatGPT Exhibits Bias Toward Developed Countries Over Developing Ones, as Indicated by a Sentiment Analysis ApproachJournal of Language and Social Psychology10.1177/0261927X24129833744:1(132-141)Online publication date: 15-Nov-2024
https://doi.org/10.1177/0261927X241298337
Fabris ABaranowska NDennis MGraus DHacker PSaldivar JZuiderveen Borgesius FBiega A(2024)Fairness and Bias in Algorithmic Hiring: A Multidisciplinary SurveyACM Transactions on Intelligent Systems and Technology10.1145/369645716:1(1-54)Online publication date: 23-Sep-2024
https://dl.acm.org/doi/10.1145/3696457
Liu ESo WHosoi PD'Ignazio C(2024)Racial Steering by Large Language Models: A Prospective Audit of GPT-4 on Housing RecommendationsProceedings of the 4th ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization10.1145/3689904.3694709(1-13)Online publication date: 29-Oct-2024
https://dl.acm.org/doi/10.1145/3689904.3694709
Show More Cited By

Index Terms

The Unequal Opportunities of Large Language Models: Examining Demographic Biases in Job Recommendations by ChatGPT and LLaMA

Recommendations

Gender bias and stereotypes in Large Language Models
CI '23: Proceedings of The ACM Collective Intelligence Conference

Large Language Models (LLMs) have made substantial progress in the past several months, shattering state-of-the-art benchmarks in many domains. This paper investigates LLMs’ behavior with respect to gender stereotypes, a known issue for prior models. We ...
Indian-BhED: A Dataset for Measuring India-Centric Biases in Large Language Models
GoodIT '24: Proceedings of the 2024 International Conference on Information Technology for Social Good

Large Language Models (LLMs), now used daily by millions, can encode societal biases, exposing their users to representational harms. A large body of scholarship on LLM bias exists but it predominantly adopts a Western-centric frame and attends ...
Intersectionality and cyberbullying

Display Omitted Our paper applies an intersectional approach to the study of cyberbullying.We explore the conditional impact of race, gender, and sexuality on victimization.We conducted an original survey of students in a Midwestern high school (N=752)...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

EAAMO '23: Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization

October 2023

498 pages

ISBN:9798400703812

DOI:10.1145/3617694

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 October 2023

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Defense Advanced Research Projects Agency

Conference

EAAMO '23

Sponsor:

EAAMO '23: Equity and Access in Algorithms, Mechanisms, and Optimization

October 30 - November 1, 2023

MA, Boston, USA

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

10
Total Citations
View Citations
1,945
Total Downloads

Downloads (Last 12 months)1,602
Downloads (Last 6 weeks)205

Reflects downloads up to 08 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Georgiou G(2024)ChatGPT Exhibits Bias Toward Developed Countries Over Developing Ones, as Indicated by a Sentiment Analysis ApproachJournal of Language and Social Psychology10.1177/0261927X24129833744:1(132-141)Online publication date: 15-Nov-2024
https://doi.org/10.1177/0261927X241298337
Fabris ABaranowska NDennis MGraus DHacker PSaldivar JZuiderveen Borgesius FBiega A(2024)Fairness and Bias in Algorithmic Hiring: A Multidisciplinary SurveyACM Transactions on Intelligent Systems and Technology10.1145/369645716:1(1-54)Online publication date: 23-Sep-2024
https://dl.acm.org/doi/10.1145/3696457
Liu ESo WHosoi PD'Ignazio C(2024)Racial Steering by Large Language Models: A Prospective Audit of GPT-4 on Housing RecommendationsProceedings of the 4th ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization10.1145/3689904.3694709(1-13)Online publication date: 29-Oct-2024
https://dl.acm.org/doi/10.1145/3689904.3694709
Lin JDai XXi YLiu WChen BZhang HLiu YWu CLi XZhu CGuo HYu YTang RZhang W(2024)How Can Recommender Systems Benefit from Large Language Models: A SurveyACM Transactions on Information Systems10.1145/3678004Online publication date: 13-Jul-2024
https://doi.org/10.1145/3678004
Klein LD'Ignazio C(2024)Data Feminism for AIProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency10.1145/3630106.3658543(100-112)Online publication date: 3-Jun-2024
https://dl.acm.org/doi/10.1145/3630106.3658543
Kumar AAndreopoulos WAttar N(2024)Cross-Linguistic Examination of Gender Bias Large Language Models2024 Artificial Intelligence x Humanities, Education, and Art (AIxHEART)10.1109/AIxHeart62327.2024.00020(70-75)Online publication date: 30-Sep-2024
https://doi.org/10.1109/AIxHeart62327.2024.00020
Nedungadi PRamesh MGovindaraju VRao BBerbeglia PRaman R(2024)Emerging leaders or persistent gaps? Generative AI research may foster women in STEMInternational Journal of Information Management10.1016/j.ijinfomgt.2024.10278577(102785)Online publication date: Aug-2024
https://doi.org/10.1016/j.ijinfomgt.2024.102785
Barman KWood NPawlowski P(2024)Beyond transparency and explainability: on the need for adequate and contextualized user guidelines for LLM useEthics and Information Technology10.1007/s10676-024-09778-226:3Online publication date: 17-Jul-2024
https://dl.acm.org/doi/10.1007/s10676-024-09778-2
Ducel FNévéol AFort K(2024)“You’ll be a nurse, my son!” Automatically assessing gender biases in autoregressive language models in French and ItalianLanguage Resources and Evaluation10.1007/s10579-024-09780-6Online publication date: 24-Oct-2024
https://doi.org/10.1007/s10579-024-09780-6
Castelblanco GCruz‐Castro LYang Z(2024)Performance of a Large‐Language Model in scoring construction management capstone design projectsComputer Applications in Engineering Education10.1002/cae.2279632:6Online publication date: 14-Sep-2024
https://doi.org/10.1002/cae.22796

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

HTML Format

View this article in HTML Format.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Figures

Tables

Media

View Table of Conten