Mahesan Niranjan


2024

Do Prompt Positions Really Matter?
Junyu Mao | Stuart E. Middleton | Mahesan Niranjan
Findings of the Association for Computational Linguistics: NAACL 2024

Prompt-based models have gathered a lot of attention from researchers due to their remarkable advancements in the fields of zero-shot and few-shot learning. Developing an effective prompt template plays a critical role. However, prior studies have mainly focused on prompt vocabulary searching or embedding initialization within a predefined template with the prompt position fixed. In this empirical study, we conduct the most comprehensive analysis to date of prompt position for diverse Natural Language Processing (NLP) tasks. Our findings quantify the substantial impact prompt position has on model performance. We observe that the prompt positions used in prior studies are often sub-optimal, and this observation is consistent even in widely used instruction-tuned models. These findings suggest prompt position optimisation as a valuable research direction to augment prompt engineering methodologies and prompt position-aware instruction tuning as a potential way to build more robust models in the future.

2014

Sinhala-Tamil Machine Translation: Towards better Translation Quality
Randil Pushpananda | Ruvan Weerasinghe | Mahesan Niranjan
Proceedings of the Australasian Language Technology Association Workshop 2014

Bayesian Reordering Model with Feature Selection
Abdullah Alrajeh | Mahesan Niranjan
Proceedings of the Ninth Workshop on Statistical Machine Translation

Large-scale Reordering Model for Statistical Machine Translation using Dual Multinomial Logistic Regression
Abdullah Alrajeh | Mahesan Niranjan
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2009

Handling phrase reorderings for machine translation
Yizhao Ni | Craig Saunders | Sandor Szedmak | Mahesan Niranjan
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers