DOI: 10.1145/3632410.3633297
Tutorial · Open access

Towards understanding and mitigating the hallucinations in NLP and Speech

Published: 04 January 2024

Abstract

With recent advances in natural language processing, driven by deep learning architectures such as the transformer, performance on many challenging NLP tasks, such as question answering, machine translation, and abstractive summarization, has improved dramatically. However, it is observed that even state-of-the-art models, while generating natural and fluent-looking text, are often unfaithful: their output may contain facts or information that is irrelevant or not supported by the input. This phenomenon is referred to in the literature as hallucination. A similar phenomenon is observed in end-to-end speech recognition systems, where portions of the output text do not match the acoustics of the corresponding speech signal.
In this tutorial, we introduce the problem of hallucination in various Speech and NLP tasks such as machine translation, summarization, and speech recognition. We categorize the hallucinations observed in these models and describe techniques to quantify them. Next, we describe recent techniques to mitigate hallucinations for many of these tasks. We draw the attention of the AI community to the potential problems of hallucination in NLP and speech.
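As a minimal illustration of what "quantifying" unsupported content can look like (a generic sketch, not a method presented in this tutorial), one crude proxy measures the fraction of n-grams in the generated text that never appear in the input; a high novel-n-gram rate flags output that the source cannot support:

```python
def novel_ngram_rate(source: str, generated: str, n: int = 2) -> float:
    """Fraction of n-grams in `generated` that do not occur in `source`.

    A simple lexical proxy for unsupported (potentially hallucinated)
    content; it cannot detect paraphrased or abstractive hallucinations.
    """
    def ngrams(text: str, n: int) -> set:
        toks = text.lower().split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    src, gen = ngrams(source, n), ngrams(generated, n)
    if not gen:  # generated text shorter than n tokens
        return 0.0
    return len(gen - src) / len(gen)


# One bigram of five ("the rug") is absent from the source:
print(novel_ngram_rate("the cat sat on the mat",
                       "the cat sat on the rug"))  # → 0.2
```

Surface-overlap metrics like this are only a first screen; entailment- or attribution-based faithfulness scoring is needed to catch hallucinations that reword the source.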


Cited By

  • (2024) "MAI – A Proactive Speech Agent for Metacognitive Mediation in Collaborative Learning." Proceedings of the 6th ACM Conference on Conversational User Interfaces, 1–5. https://doi.org/10.1145/3640794.3665585
  • (2024) "Estimating the Completeness of Discrete Speech Units." 2024 IEEE Spoken Language Technology Workshop (SLT), 415–422. https://doi.org/10.1109/SLT61566.2024.10832198
  • (2024) "The AI Act in a law enforcement context: The case of automatic speech recognition for transcribing investigative interviews." Forensic Science International: Synergy, 9, 100563. https://doi.org/10.1016/j.fsisyn.2024.100563
  • (2024) "Can Large Language Models (LLMs) Compete with Human Requirements Reviewers? – Replication of an Inspection Experiment on Requirements Documents." Product-Focused Software Process Improvement, 27–42. https://doi.org/10.1007/978-3-031-78386-9_3
  • (2024) "Assessing Generative Language Models in Classification Tasks: Performance and Self-evaluation Capabilities in the Environmental and Climate Change Domain." Natural Language Processing and Information Systems, 302–313. https://doi.org/10.1007/978-3-031-70242-6_29


Published In

CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), January 2024, 627 pages.
Publisher: Association for Computing Machinery, New York, NY, United States.
This work is licensed under a Creative Commons Attribution 4.0 International License.

      Qualifiers

      • Tutorial
      • Research
      • Refereed limited

      Conference

      CODS-COMAD 2024
