Leveraging a Cognitive Model to Measure Subjective Similarity of Human and GPT-4 Written Content

Malloy, Tyler; Ferreira, Maria José; Fang, Fei; Gonzalez, Cleotilde

arXiv:2409.00269 (cs)

[Submitted on 30 Aug 2024 (v1), last revised 10 Oct 2024 (this version, v2)]

Abstract: Cosine similarity between two documents can be computed using token embeddings formed by Large Language Models (LLMs) such as GPT-4, and used to categorize those documents across a range of uses. However, these similarities are ultimately dependent on the corpora used to train these LLMs, and may not reflect subjective similarity of individuals or how their biases and constraints impact similarity metrics. This lack of cognitively-aware personalization of similarity metrics can be particularly problematic in educational and recommendation settings where there is a limited number of individual judgements of category or preference, and biases can be particularly relevant. To address this, we rely on an integration of an Instance-Based Learning (IBL) cognitive model with LLM embeddings to develop the Instance-Based Individualized Similarity (IBIS) metric. This similarity metric is beneficial in that it takes into account individual biases and constraints in a manner that is grounded in the cognitive mechanisms of decision making. To evaluate the IBIS metric, we also introduce a dataset of human categorizations of emails as being either dangerous (phishing) or safe (ham). This dataset is used to demonstrate the benefits of leveraging a cognitive model to measure the subjective similarity of human participants in an educational setting.
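
As a rough illustration of the baseline the abstract starts from, the sketch below computes cosine similarity between two documents represented by token embeddings. It is not the IBIS metric and not the authors' code. The helper names (`mean_pool`, `cosine_similarity`), the choice of mean pooling to turn token vectors into one document vector, and the toy three-dimensional vectors are all assumptions made for illustration; real LLM embeddings have far more dimensions.

```python
import math


def mean_pool(token_embeddings):
    """Average equal-length token vectors into one document vector.

    Mean pooling is an assumption here; the abstract does not say how
    token embeddings are combined.
    """
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(vec[i] for vec in token_embeddings) / n for i in range(dim)]


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy token embeddings (hypothetical values, not model output).
doc_a = mean_pool([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
doc_b = mean_pool([[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
doc_c = mean_pool([[0.0, 0.0, 1.0]])

print(cosine_similarity(doc_a, doc_b))  # same direction, so close to 1
print(cosine_similarity(doc_a, doc_c))  # orthogonal, so 0
```

A score of this kind depends only on the embedding vectors, which is the limitation the abstract raises: two readers with different biases would receive the same similarity value for the same pair of documents.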

Comments:	7 figures, 1 table
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2409.00269 [cs.CL]
	(or arXiv:2409.00269v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2409.00269

Submission history

From: Tyler Malloy
[v1] Fri, 30 Aug 2024 21:54:13 UTC (1,674 KB)
[v2] Thu, 10 Oct 2024 14:51:43 UTC (1,731 KB)
