RuleR: Improving LLM Controllability by Rule-based Data Recycling

Li, Ming; Chen, Han; Wang, Chenguang; Nguyen, Dang; Li, Dianqi; Zhou, Tianyi

arXiv:2406.15938 (cs)

[Submitted on 22 Jun 2024]

Abstract: Large language models (LLMs) still lack fine-grained controllability over their responses, which is critical to both their performance and the user experience. However, curating supervised fine-tuning (SFT) datasets to improve LLM controllability usually relies on human experts or proprietary LLMs, which incurs additional cost. To bridge this gap, we propose Rule-based Data Recycling (RuleR), a data augmentation method that incorporates multiple constraints into the original data samples according to predefined rules, creating new training tasks that consolidate the controllability of LLMs. Instead of creating new data from scratch, RuleR "recycles" existing data by simply applying rule-based edits to the responses and appending the corresponding rule-instructions to the original instructions. Experimental results demonstrate RuleR's effectiveness in improving LLM controllability while maintaining general instruction-following capabilities. The code will be released on this https URL.
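
The recycling step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Rule` class, the `recycle` function, the `instruction`/`response` field names, and the two example rules are hypothetical and are not taken from the paper, whose actual rule set is not listed in the abstract. The sketch only shows the general idea of pairing each rule-instruction with a deterministic edit of the response.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: an instruction suffix plus a deterministic response edit."""
    instruction: str
    edit: Callable[[str], str]


# Illustrative rules only; these are assumptions, not the paper's rule set.
RULES: List[Rule] = [
    Rule("Respond in all uppercase letters.", str.upper),
    Rule('End your response with the word "DONE".', lambda r: r.rstrip() + " DONE"),
]


def recycle(sample: Dict[str, str], rules: List[Rule]) -> Dict[str, str]:
    """Build a new sample by applying each rule in order; the input is not mutated."""
    instruction = sample["instruction"]
    response = sample["response"]
    for rule in rules:
        # Append the rule-instruction to the instruction text ...
        instruction = f"{instruction} {rule.instruction}"
        # ... and edit the response so that it satisfies the new constraint.
        response = rule.edit(response)
    return {"instruction": instruction, "response": response}


original = {
    "instruction": "Name the capital of France.",
    "response": "Paris is the capital.",
}
augmented = recycle(original, RULES)
print(augmented["instruction"])
print(augmented["response"])
```

Because each edit is a plain function of the existing response, a sketch like this needs no human annotator or external model to produce the constrained sample, which is the cost argument the abstract makes.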

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2406.15938 [cs.CL]
	(or arXiv:2406.15938v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.15938

Submission history

From: Ming Li
[v1] Sat, 22 Jun 2024 20:57:12 UTC (431 KB)
