CLMSM: A Multi-Task Learning Framework for Pre-training on Procedural Text

Nandy, Abhilash; Kapadnis, Manav Nitin; Goyal, Pawan; Ganguly, Niloy

arXiv:2310.14326 (cs)

[Submitted on 22 Oct 2023]

Abstract: In this paper, we propose CLMSM, a domain-specific, continual pre-training framework that learns from a large set of procedural recipes. CLMSM uses a multi-task learning framework to optimize two objectives: (a) contrastive learning using hard triplets to learn fine-grained differences across entities in the procedures, and (b) a novel Mask-Step Modelling objective to learn the step-wise context of a procedure. We test CLMSM on the downstream tasks of tracking entities and aligning actions between two procedures on three datasets, one of which is an open-domain dataset that does not conform to the pre-training dataset. We show that CLMSM not only outperforms baselines on recipes (in-domain) but also generalizes to open-domain procedural NLP tasks.
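
The two objectives named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption rather than the authors' implementation: the helper names (`triplet_loss`, `mask_step`, `multi_task_loss`), the Euclidean margin loss, the `[MASK]` token, the choice to mask every token of one step, and the weighted sum used to combine the two losses are all hypothetical.

```python
import math
import random

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on Euclidean distance (assumed form).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def mask_step(steps, rng):
    """Replace every token of one randomly chosen step with MASK.

    Returns the masked procedure and the index of the masked step.
    Masking a whole step is an assumption about Mask-Step Modelling.
    """
    idx = rng.randrange(len(steps))
    masked = [
        " ".join(MASK for _ in step.split()) if i == idx else step
        for i, step in enumerate(steps)
    ]
    return masked, idx


def multi_task_loss(contrastive, mask_step_loss, weight=0.5):
    """Combine the two objectives with a weighted sum (assumed scheme)."""
    return weight * contrastive + (1.0 - weight) * mask_step_loss


recipe = ["boil water", "add pasta to pot", "drain"]
masked_recipe, masked_idx = mask_step(recipe, random.Random(0))
```

In this sketch a "hard" triplet would be one where the negative lies close to the anchor, so the margin term stays positive and the loss is non-zero. A model trained on `masked_recipe` would be asked to recover the tokens of step `masked_idx` from the surrounding steps.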

Comments:	Accepted to EMNLP Findings 2023, 14 pages, 4 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.14326 [cs.CL]
	(or arXiv:2310.14326v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.14326

Submission history

From: Abhilash Nandy
[v1] Sun, 22 Oct 2023 15:20:11 UTC (9,098 KB)
