Authors:
Meng Wang 1; Zhixiong Zhang 1,2; Hanyu Li 1,2 and Guangyin Zhang 2
Affiliations:
1 National Science Library, Chinese Academy of Sciences, Beijing, China; 2 University of Chinese Academy of Sciences, Beijing, China
Keyword(s):
Research Question Generation, Prompt Engineering, Knowledge Extraction, LLMs, Knowledge-Rich Regions.
Abstract:
Research questions are crucial to the development of science and an important driving force for
scientific evolution and progress. This study analyses the key meta-knowledge required for generating
research questions from scientific literature, namely the research objective and the research method. To extract
meta-knowledge, we obtained feature words of meta-knowledge from knowledge-rich regions and embedded
them into DeBERTa (Decoding-enhanced BERT with disentangled attention) for training. Compared to
existing models, our proposed approach demonstrates superior performance across all metrics for identifying
meta-knowledge, improving the F1 score by +9% over BERT (88% vs. 97%), +3% over BERT-CNN (94% vs. 97%),
and +2% over DeBERTa (95% vs. 97%). We then construct prompts that integrate the meta-knowledge to
fine-tune LLMs. Compared to the baseline model, the LLMs fine-tuned with meta-knowledge prompt
engineering achieve an average F1 score of 88.6% on the research question generation task, an improvement
of 8.4%. Overall, our approach can be applied to research question generation in different domains.
Additionally, by updating or replacing the meta-knowledge, the model can also serve as a
theoretical foundation and model basis for generating other types of sentences.
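To make the prompt-engineering step concrete, the sketch below shows one way a prompt could integrate the two kinds of meta-knowledge (research objective and research method) before fine-tuning or querying an LLM. This is an illustrative assumption, not the authors' actual template: the wording, function name, and field labels are all hypothetical.

```python
# Hypothetical sketch: compose a research-question-generation prompt
# from extracted meta-knowledge. The template text is an assumption,
# not the prompt used in the paper.

def build_prompt(objective: str, method: str) -> str:
    """Combine the research objective and research method into one prompt."""
    return (
        "Given the following meta-knowledge extracted from a paper:\n"
        f"- Research objective: {objective}\n"
        f"- Research method: {method}\n"
        "Generate a concise research question that the paper addresses."
    )

# Example use with placeholder meta-knowledge values.
prompt = build_prompt(
    objective="identify meta-knowledge in scientific literature",
    method="feature words from knowledge-rich regions embedded into DeBERTa",
)
print(prompt)
```

In a fine-tuning setting, each training example would pair such a prompt with the paper's actual research question as the target output.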