Representations Matter: Embedding Modes of Large Language Models using Dynamic Mode Decomposition

Akrout, Mohamed

arXiv:2309.01245 (cs)

[Submitted on 3 Sep 2023]

Abstract: Existing large language models (LLMs) are known for generating "hallucinated" content, namely fabricated text consisting of plausible-looking yet unfounded facts. To identify when these hallucination scenarios occur, we examine the properties of the generated text in the embedding space. Specifically, we draw inspiration from the dynamic mode decomposition (DMD) tool to analyze how patterns of text embeddings evolve across sentences. We empirically demonstrate that the spectrum of sentence embeddings over paragraphs is consistently low-rank for the generated text, unlike that of the ground-truth text. Importantly, we find that evaluation cases with LLM hallucinations correspond to ground-truth embedding patterns with a higher number of modes, which are poorly approximated by the few modes associated with LLM embedding patterns. In analogy to near-field electromagnetic evanescent waves, the embedding DMD eigenmodes of the generated text with hallucinations vanish quickly across sentences, as opposed to those of the ground-truth text. This suggests that hallucinations result from both the generation techniques and the underlying representation.
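To make the DMD idea in the abstract concrete, the following is a minimal sketch of a generic "exact DMD" applied to a sequence of sentence embeddings, written with NumPy. It is not the paper's pipeline: the abstract does not specify the embedding model or how the rank is chosen, so the function name `dmd`, the `energy` threshold used to pick the rank, and the synthetic two-mode "paragraph" are all assumptions made for illustration.

```python
import numpy as np


def dmd(snapshots, energy=0.99):
    """Exact DMD of a snapshot matrix (hypothetical helper, not from the paper).

    snapshots: array of shape (d, m); column k is the embedding of sentence k.
    energy:    fraction of squared singular-value energy kept when choosing
               the truncation rank (an assumed heuristic).
    Returns (eigenvalues, modes, rank).
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]  # pairs (x_k, x_{k+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)

    # Smallest rank whose cumulative energy reaches the threshold.
    cum = np.cumsum(s**2) / np.sum(s**2)
    r = min(int(np.searchsorted(cum, energy)) + 1, len(s))

    U_r, s_r, V_r = U[:, :r], s[:r], Vh[:r].conj().T
    # Reduced linear operator: A_tilde = U_r^H Y V_r diag(1 / s_r)
    A_tilde = (U_r.conj().T @ Y @ V_r) / s_r
    eigvals, W = np.linalg.eig(A_tilde)
    modes = ((Y @ V_r) / s_r) @ W  # DMD modes in the embedding space
    return eigvals, modes, r


# Synthetic stand-in for a paragraph: 8 "sentences" in a 16-dimensional
# embedding space, driven by two decaying modes (assumed values 0.9 and 0.5).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((16, 2)))  # orthonormal directions
lams = np.array([0.9, 0.5])
snapshots = np.stack([Q @ lams**k for k in range(8)], axis=1)

eigvals, modes, rank = dmd(snapshots)
print("rank:", rank)
print("|eigenvalues|:", np.abs(eigvals))
```

In this toy setting the snapshot matrix has only two modes, so the selected rank is low, and each eigenvalue has magnitude below 1, which means the corresponding mode decays from one sentence to the next. That is one plausible reading of modes that "vanish quickly across sentences"; with real data, the columns would come from a sentence-embedding model of the reader's choice.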

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2309.01245 [cs.CL] (or arXiv:2309.01245v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2309.01245

Submission history

From: Mohamed Akrout
[v1] Sun, 3 Sep 2023 19:10:18 UTC (952 KB)
