Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation

Peng, Letian; Zhang, Yuwei; Shang, Jingbo

arXiv:2307.07099 (cs)

[Submitted on 14 Jul 2023 (v1), last revised 22 May 2024 (this version, v3)]

Abstract: Prompting large language models (LLMs) for data augmentation has recently become common practice in few-shot NLP tasks. In this paper, we propose Chain-of-Thought Attribute Manipulation (CoTAM), a novel approach that generates new data from existing examples by tweaking only the user-provided, task-specific attribute, e.g., sentiment polarity or topic in movie reviews. Instead of the conventional control of latent representations, we leverage chain-of-thought prompting to edit the text directly in three steps: (1) attribute decomposition, (2) manipulation proposal, and (3) sentence reconstruction. Extensive results on various tasks, such as text (pair) classification, aspect-based sentiment analysis, and conditional text generation, show that CoTAM outperforms other LLM-based augmentation methods given the same number of training examples, for both fine-tuning and in-context learning. Remarkably, a 2D visualization of the augmented dataset using principal component analysis reveals a human-recognizable decision boundary that is likely induced by the attribute manipulation, demonstrating the potential of our proposed approach.
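
The three steps named in the abstract can be pictured as one chain-of-thought prompt followed by a relabeling of the edited text. The sketch below is illustrative only and is not the authors' prompt or code: the function names `build_cotam_prompt` and `augment`, the prompt wording, the `llm` callable, and the assumption that the model puts the reconstructed sentence on the last line of its reply are all hypothetical.

```python
from typing import Callable, Tuple

# Step names taken from the abstract; everything else here is an assumption.
STEPS = (
    "attribute decomposition",
    "manipulation proposal",
    "sentence reconstruction",
)


def build_cotam_prompt(text: str, attribute: str, source: str, target: str) -> str:
    """Build one chain-of-thought prompt that walks through the three steps.

    The wording is hypothetical; the paper's actual prompt may differ.
    """
    return "\n".join([
        f'Sentence: "{text}"',
        f"Step 1 ({STEPS[0]}): List the attributes of the sentence, "
        f"including its {attribute} ({source}).",
        f"Step 2 ({STEPS[1]}): Propose how to change only the {attribute} "
        f"from {source} to {target}, keeping every other attribute fixed.",
        f"Step 3 ({STEPS[2]}): On the final line, write the new sentence "
        f"that follows the proposal.",
    ])


def augment(
    text: str,
    attribute: str,
    source: str,
    target: str,
    llm: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (new_text, new_label) for one existing example.

    `llm` is any function mapping a prompt string to a reply string.
    Assumption: the reconstructed sentence is the last line of the reply.
    """
    reply = llm(build_cotam_prompt(text, attribute, source, target)).strip()
    if not reply:
        raise ValueError("empty reply from the language model")
    new_text = reply.splitlines()[-1].strip()
    return new_text, target
```

Passing a stub such as `lambda prompt: "...\nA dull film."` as `llm` is enough to exercise the flow without calling a real model. Keeping the model behind a plain callable leaves the choice of LLM client open.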

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2307.07099 [cs.CL]
	(or arXiv:2307.07099v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2307.07099

Submission history

From: Letian Peng
[v1] Fri, 14 Jul 2023 00:10:03 UTC (9,402 KB)
[v2] Sat, 18 May 2024 19:57:15 UTC (9,922 KB)
[v3] Wed, 22 May 2024 00:08:47 UTC (9,922 KB)
