Reliable LLM: Hallucination & Knowledge & Uncertainty (From Factuality Perception to Expression)



Introduction

This project introduces the background of LLM hallucination 👻 and mitigation methods related to uncertainty 🤔 and knowledge 📓. Research works are collected and systematically clustered by direction and method for reliable AI development. The project also provides a framework for improving LLMs' factuality perception and eliciting factual expressions to address the hallucination issue.

You are welcome to contribute to this project by sharing valuable papers and exchanging ideas!

Outline

👻 Hallucination & Factuality

Definition of LLM Hallucination

The definitions of hallucination vary and depend on the specific task. This project focuses on hallucination in knowledge-intensive tasks (closed-book QA, dialogue, RAG, commonsense reasoning, translation, etc.), where hallucinations refer to non-factual or incorrect content in generations that is unfaithful to world knowledge.

Causes of LLM Hallucination

The causes of hallucination include unfiltered incorrect statements in pretraining data, the limited input length of the model architecture, the maximum-likelihood training objective, and diverse decoding strategies.

For released LLMs, the architecture, input length, pretraining data, and training strategy are fixed, and tracing incorrect text in massive pretraining corpora is challenging. This project therefore focuses on detecting hallucinations by tracing what LLMs learn during pretraining and on mitigating hallucinations during fine-tuning and decoding.

Compared with open-ended generation tasks, knowledge-intensive tasks have a specific ground-truth reference: world knowledge. We can therefore estimate an LLM's knowledge boundary map to specify what it knows. Ensuring the certainty level or honesty of an LLM about a piece of factual knowledge is crucial for hallucination detection (moving from the grey area to the green area).

📓 LLM Knowledge

The diagram above is a rough, simplified representation of the knowledge boundary. In reality, as with humans, much knowledge sits in a state of uncertainty rather than being simply known or unknown. Moreover, maximum-likelihood prediction during pretraining makes LLMs prone to over-confident responses. Even when an LLM knows a fact, making it accurately tell what it knows is also important.

This adds complexity to determining the knowledge boundary, which leads to two challenging questions:

  1. How to accurately perceive (Perception) the knowledge boundary?

    (Example: Given a question, such as "What is the capital of France?", the model is required to provide its confidence level for this question.)

  2. How to accurately express (Expression) knowledge when the boundary is somewhat vague? (The previous work U2Align is one method for enhancing expression. Current interest in this second “expression” stage also lies in “alignment” methods.)

    (Example: If the confidence level for answering "Paris" to the above question is 40%, should the model refuse to answer or provide a response? A minimal sketch of this perceive-then-express decision is given below.)
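
The following is a minimal sketch of the perception-then-expression loop described above, assuming a generic `query_llm` chat call and an illustrative 0.6 abstention threshold (both are placeholders rather than values taken from the works collected here):

```python
# Illustrative only: `query_llm` stands in for any chat/completion API call,
# and the 0.6 threshold is an arbitrary example, not a recommended value.

def query_llm(prompt: str) -> str:
    """Placeholder for an actual LLM API call."""
    raise NotImplementedError

def perceive_confidence(question: str, answer: str) -> float:
    """Perception: elicit a verbalized confidence for a candidate answer."""
    prompt = (
        f"Question: {question}\nProposed answer: {answer}\n"
        "How confident are you that the answer is correct? "
        "Reply with a single probability between 0 and 1."
    )
    return float(query_llm(prompt))

def express(question: str, answer: str, threshold: float = 0.6) -> str:
    """Expression: answer only when the perceived confidence clears the threshold."""
    if perceive_confidence(question, answer) >= threshold:
        return answer
    return "I'm not sure."  # abstain when the knowledge boundary is vague
```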

🤔 Uncertainty Estimation

Traditional Model Calibration

  • Models trained with maximum likelihood estimation (MLE) are prone to over-confident predictions, so estimating a confidence score or uncertainty is crucial for reliable AI applications.
  • A model is considered well-calibrated if the confidence scores of its predictions (e.g., SoftMax probabilities) are well-aligned with the actual probability of the answers being correct.
  • Expected Calibration Error (ECE) and the Reliability Diagram are used to measure calibration performance; a minimal ECE sketch follows the figure caption below.

Uncalibrated (left), over-confident (mid) and well-calibrated (right) models.
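
As a concrete reference, a minimal binned ECE computation over (confidence, correctness) pairs might look like the sketch below; the equal-width binning is the standard choice and the function name is illustrative:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """Binned ECE: weighted average gap between mean confidence and accuracy per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(confidences[in_bin].mean() - correct[in_bin].mean())
    return float(ece)

# Example: over-confident predictions yield a large ECE.
# expected_calibration_error([0.9, 0.95, 0.8, 0.99], [1, 0, 0, 1])  # ≈ 0.46
```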

Uncertainty Estimation of Generative Models

  • To calibrate generative LLMs, we need to quantify confidence and uncertainty over generated sentences.
  • Uncertainty: Categorized into aleatoric (data) and epistemic (model) uncertainty. Frequently measured by the entropy of the prediction, which indicates how dispersed the model's prediction is (see the sketch after this list).
  • Confidence: Generally associated with both the input and the prediction.
  • The terms uncertainty and confidence are often used interchangeably.
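
A minimal numeric illustration of the entropy measure mentioned above, computed over a single next-token distribution (the token logits are the only assumed input; a sequence-level score would aggregate such values):

```python
import numpy as np

def predictive_entropy(logits) -> float:
    """Entropy of the softmax distribution over next-token logits:
    a flatter (higher-entropy) distribution means a more dispersed prediction."""
    logits = np.asarray(logits, dtype=float)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return float(-np.sum(probs * np.log(probs + 1e-12)))

# predictive_entropy([5.0, 0.1, 0.1])  # low entropy: confident prediction
# predictive_entropy([1.0, 1.0, 1.0])  # high entropy (= ln 3): dispersed prediction
```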

Although the knowledge boundary is important for knowledge-intensive tasks, previous works do not give it a specific definition or concept. Current methods for estimating knowledge boundaries draw on confidence/uncertainty estimation methods, including ① logit-based methods that use token-level probabilities; ② prompt-based methods that make LLMs express confidence in words; ③ sampling-based methods that calculate consistency across samples; and ④ training-based methods that learn the ability to express uncertainty.
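
As a rough illustration of how families ①–③ differ mechanically, a hedged sketch follows; the function and prompt names are invented for this example, and ④ requires fine-tuning so it is only noted in a comment:

```python
import math
from collections import Counter

def logit_based_confidence(token_logprobs) -> float:
    """① Logit-based: length-normalized sequence probability from token log-probs."""
    return math.exp(sum(token_logprobs) / max(len(token_logprobs), 1))

# ② Prompt-based: ask the model to verbalize its confidence, e.g.
VERBALIZED_CONFIDENCE_PROMPT = (
    "Answer the question, then state how confident you are as a probability in [0, 1]."
)

def sampling_based_consistency(sampled_answers) -> float:
    """③ Sampling-based: share of sampled answers that agree with the majority answer."""
    counts = Counter(sampled_answers)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(sampled_answers)

# ④ Training-based methods fine-tune the model to output such scores itself
#    and are omitted here (see the tables below).
```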





Related Works of Hallucination & Knowledge & Uncertainty

👻 Hallucination & Factuality

Hallucination Detection

Consistency-based Detection

Title Conference/Journal
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models EMNLP 2023
RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought prePrint

Internal State based Detection

Title Conference/Journal
The Internal State of an LLM Knows When It's Lying prePrint
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models prePrint
On the Universal Truthfulness Hyperplane Inside LLMs prePrint
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection prePrint
LLM Internal States Reveal Hallucination Risk Faced With a Query prePrint
Discovering Latent Knowledge in Language Models Without Supervision prePrint

📓 LLM Knowledge

Knowledge Boundary

Title Conference/Journal
Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models prePrint
Can AI Assistants Know What They Don’t Know? prePrint
Do Large Language Models Know What They Don't Know? prePrint
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation EMNLP 2023
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? prePrint

🤔 Uncertainty Estimation

Survey & Investigation

Title Conference/Journal
A Survey of Confidence Estimation and Calibration in Large Language Models prePrint
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis EMNLP 2022
Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach prePrint
Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models prePrint
Large Language Models Must Be Taught to Know What They Don’t Know prePrint

Uncertainty Quantification

Title Conference/Journal
Language Models (Mostly) Know What They Know prePrint
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation ICLR 2023
Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models prePrint
When Quantization Affects Confidence of Large Language Models? prePrint
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs ICLR 2024
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities prePrint
Semantically Diverse Language Generation for Uncertainty Estimation in Language Models prePrint
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models prePrint

Linguistic Uncertainty Expressions

Title Conference/Journal
Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models EMNLP 2023
Teaching Models to Express Their Uncertainty in Words TMLR 2022
Relying on the Unreliable: The Impact of Language Models’ Reluctance to Express Uncertainty prePrint
"I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust FAccT 2024
Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words? prePrint

Confidence Expressions Improvements

These works focus on improving the confidence expressions of LLMs in a two-stage form: 1) self-prompting LLMs to generate responses to queries and collecting the samples to construct a dataset with specific features, and 2) fine-tuning LLMs on the collected dataset to improve the targeted capability. A generic sketch of this recipe is given below.
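
The sketch below illustrates this two-stage recipe under simplifying assumptions: a hypothetical `sample_llm` helper, a consistency-based confidence label, and a JSONL output format, none of which correspond to any single paper listed here.

```python
import json
from collections import Counter

def sample_llm(question: str, n: int = 8) -> list:
    """Placeholder: sample n answers from the LLM with temperature > 0."""
    raise NotImplementedError

def build_confidence_dataset(questions, out_path: str = "confidence_sft.jsonl") -> None:
    """Stage 1: self-prompt, then label each majority answer with its sample consistency."""
    with open(out_path, "w") as f:
        for question in questions:
            answers = sample_llm(question)
            answer, freq = Counter(answers).most_common(1)[0]
            confidence = freq / len(answers)
            target = f"{answer} (Confidence: {confidence:.0%})"
            f.write(json.dumps({"prompt": question, "response": target}) + "\n")

# Stage 2: run standard supervised fine-tuning on the collected JSONL file so the
# model learns to append a confidence statement aligned with its sample consistency.
```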

Title Conference/Journal
Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience prePrint
Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning prePrint
Uncertainty in Language Models: Assessment through Rank-Calibration prePrint
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales prePrint
Linguistic Calibration of Language Models prePrint
R-Tuning: Instructing Large Language Models to Say ‘I Don’t Know’ prePrint

Hallucination Detection by Uncertainty

Title Conference/Journal
On Hallucination and Predictive Uncertainty in Conditional Language Generation EACL 2021
Learning Confidence for Transformer-based Neural Machine Translation ACL 2022
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4 EMNLP 2023
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models EMNLP 2023
Detecting Hallucinations in Large Language Models using Semantic Entropy Nature
LLM Internal States Reveal Hallucination Risk Faced With a Query prePrint

Factuality Alignment by Confidence

Title Conference/Journal
When to Trust LLMs: Aligning Confidence with Response Quality prePrint
Fine-tuning Language Models for Factuality ICLR 2024
Uncertainty Aware Learning for Language Model Alignment ACL 2024
FLAME: Factuality-Aware Alignment for Large Language Models prePrint
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation prePrint
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation ACL 2024

Generative Model Calibration

Title Conference/Journal
Reducing Conversational Agents’ Overconfidence Through Linguistic Calibration TACL 2022
Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models ICLR 2023
Calibrating the Confidence of Large Language Models by Eliciting Fidelity prePrint
Few-Shot Recalibration of Language Models prePrint
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering TACL 2022
Knowing More About Questions Can Help: Improving Calibration in Question Answering ACL 2021 Findings
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback EMNLP 2023
Re-Examining Calibration: The Case of Question Answering TACL 2021
Calibrating Large Language Models Using Their Generations Only prePrint
Calibrating Large Language Models with Sample Consistency prePrint
Linguistic Calibration of Language Models prePrint


🔭 Future Directions

  1. More advanced methods to assist LLM hallucination detection and human decision-making. (A new paradigm)
  2. Confidence estimation for long-form generations such as code, novels, etc. (Benchmark)
  3. Learning to explain and clarify the model's confidence estimation and calibration. (Natural language)
  4. Calibration on human variation (Misalignment between LM measures and human disagreement).
  5. Confidence estimation and calibration for multi-modal LLMs.