DOI: 10.1145/3630106.3658981
Research article | Open access

Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI

Published: 05 June 2024

Abstract

The factual inaccuracies ("hallucinations") of large language models have recently inspired more research on knowledge-enhanced language modeling approaches. These are often assumed to enhance the overall trustworthiness and objectivity of language models. Meanwhile, the issue of bias is usually only mentioned as a limitation of statistical representations. This dissociation of knowledge enhancement and bias is in line with previous research on AI engineers' assumptions about knowledge, which indicates that this community commonly understands knowledge as objective and value-neutral. We argue that claims and practices by actors in the field still reflect this underlying conception of knowledge. We contrast this assumption with literature from social and, in particular, feminist epistemology, which argues that the idea of a universal disembodied knower is blind to the reality of knowledge practices and seriously challenges claims of "objective" or "neutral" knowledge.
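To make "knowledge enhancement" concrete: approaches such as KEPLER and CoLAKE (evaluated below) train a text encoder on a knowledge-graph objective alongside the usual language modeling loss, so that facts from a knowledge base shape the same representations that serve language tasks. The following Python sketch illustrates such a joint objective under simplifying assumptions: a TransE-style triple score and a margin-ranking loss stand in for the more elaborate losses of the actual systems, and all names and hyperparameters are illustrative.

import torch
import torch.nn.functional as F

def transe_score(h, r, t):
    # TransE intuition: for a plausible (head, relation, tail) triple,
    # head + relation ≈ tail, so a smaller distance is a better score.
    return -torch.norm(h + r - t, p=2, dim=-1)

def joint_loss(mlm_loss, h, r, t, t_corrupt, margin=1.0, lam=1.0):
    # Margin-ranking knowledge-embedding loss: true triples should
    # outscore triples with corrupted (randomly swapped) tails.
    pos = transe_score(h, r, t)
    neg = transe_score(h, r, t_corrupt)
    ke_loss = F.relu(margin - pos + neg).mean()
    # Knowledge enhancement: one encoder optimized for two objectives.
    return mlm_loss + lam * ke_loss

# Toy usage: random embeddings stand in for encoder outputs.
h, r, t, t_neg = (torch.randn(8, 64) for _ in range(4))
loss = joint_loss(torch.tensor(2.3), h, r, t, t_neg)

Whatever the exact formulation, the relevant point for this paper is that the knowledge source itself, not only the text corpus, now shapes the model's representations.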
Knowledge enhancement techniques commonly use Wikidata and Wikipedia as their knowledge sources because of their large scale, public accessibility, and assumed trustworthiness. In this work, they serve as a case study of how the social setting and the identity of knowers influence epistemic processes. Indeed, the communities behind Wikidata and Wikipedia are known to be male-dominated, and many instances of hostile behavior have been reported over the past decade. As a result, the contents of these knowledge bases are highly biased, and it is therefore doubtful that they would contribute to bias reduction. In fact, our empirical evaluations of RoBERTa, KEPLER, and CoLAKE demonstrate that knowledge enhancement may not live up to the hopes of increased objectivity. In our study, the average probability of stereotypical associations was preserved on two out of three metrics, and performance-related gender gaps on knowledge-driven tasks were also preserved.
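As a rough illustration of how such stereotype measurements work: common bias metrics for masked language models compare the model's pseudo-log-likelihood for a stereotypical sentence with that of a minimally different anti-stereotypical counterpart. The Python sketch below shows this style of probe; the checkpoint, the sentence pair, and the scoring details are illustrative assumptions and do not reproduce the paper's exact metrics or data.

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "roberta-base"  # illustrative; any masked LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    # Mask each token in turn and sum the log-probabilities the model
    # assigns to the original tokens (CrowS-Pairs-style scoring).
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip the <s> and </s> specials
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Hypothetical minimal pair; a positive difference means the model
# prefers the stereotypical variant.
stereo = "Women are bad at math."
anti = "Men are bad at math."
print(pseudo_log_likelihood(stereo) - pseudo_log_likelihood(anti))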
We build on these results and critical literature to argue that the label of "knowledge" and the commonly held beliefs about it can obscure the harm that is still done to marginalized groups. Knowledge enhancement is at risk of perpetuating epistemic injustice, and AI engineers’ understanding of knowledge as objective per se conceals this injustice. Finally, to get closer to trustworthy language models, we need to rethink knowledge in AI and aim for an agenda of diversification and scrutiny from outgroup members.


Cited By

  • What We Talk About When We Talk About K-12 Computing Education. In 2024 Working Group Reports on Innovation and Technology in Computer Science Education (2025), 226-257. https://doi.org/10.1145/3689187.3709612. Online publication date: 22 January 2025.

Published In

FAccT '24: Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency
June 2024
2580 pages
ISBN: 9798400704505
DOI: 10.1145/3630106
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bias
  2. epistemology
  3. fairness
  4. feminism
  5. knowledge enhancement
  6. knowledge graphs
  7. language models
  8. natural language processing
  9. representation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • German Research Foundation (DFG)

Conference

FAccT '24

Article Metrics

  • Downloads (last 12 months): 1,074
  • Downloads (last 6 weeks): 192

Reflects downloads up to 19 Feb 2025
