
KPLLM-STE: Knowledge-enhanced and prompt-aware large language models for short-text expansion

Published: 20 December 2024

Abstract

Short-text expansion plays a significant role in enhancing the quality, diversity, and practicality of short texts, helping users understand the content they express more comprehensively. In this paper, we aim to enhance the capabilities of large language models in short-text expansion through knowledge graphs and propose knowledge-enhanced and prompt-aware large language models for short-text expansion (KPLLM-STE). First, we construct a multi-dimensional knowledge graph covering semantics, sentiment, and topics from domain-specific text using large language models. Second, based on this knowledge graph, we propose a method for mining prompts for a given short text along the three dimensions of semantics, sentiment, and topics. Finally, we match triplets in the knowledge graph against the generated prompts in the three dimensions; the matched triplets are then integrated by the large language model to generate an expansion of the given short text. Experiments are conducted with three large language models on two public datasets, and the results show that our model improves on multiple metrics of text similarity, readability, and coherence compared with the expansions produced by baseline large language models and existing methods.
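To make the described pipeline concrete, the following is a minimal sketch of the prompt-mining, triplet-matching, and expansion steps outlined in the abstract. It is not the authors' implementation: the knowledge-graph representation, the `mine_prompts` and `match_triplets` helpers, and the `llm_generate` callable are all hypothetical placeholders standing in for the paper's components.

```python
# Minimal sketch of the KPLLM-STE pipeline described in the abstract.
# All helpers and the LLM client are hypothetical placeholders,
# not the authors' implementation.

from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (head entity, relation, tail entity)


def mine_prompts(short_text: str, dimension: str) -> List[str]:
    """Mine prompts for one dimension (semantics, sentiment, or topics).

    Placeholder: a real system would query the multi-dimensional
    knowledge graph built from domain-specific text.
    """
    return [f"{dimension} cue for: {short_text}"]


def match_triplets(prompts: List[str], kg: List[Triplet]) -> List[Triplet]:
    """Return triplets whose head or tail appears in any prompt (toy string matcher)."""
    hits = []
    for head, rel, tail in kg:
        if any(head in p or tail in p for p in prompts):
            hits.append((head, rel, tail))
    return hits


def expand_short_text(
    short_text: str,
    kg: List[Triplet],
    llm_generate: Callable[[str], str],
) -> str:
    """Assemble prompts and matched triplets, then ask an LLM for the expansion."""
    all_triplets: List[Triplet] = []
    for dim in ("semantics", "sentiment", "topics"):
        prompts = mine_prompts(short_text, dim)
        all_triplets.extend(match_triplets(prompts, kg))

    facts = "; ".join(f"{h} {r} {t}" for h, r, t in all_triplets)
    instruction = (
        "Expand the following short text using these facts.\n"
        f"Short text: {short_text}\n"
        f"Facts: {facts}\n"
        "Expansion:"
    )
    return llm_generate(instruction)  # any text-generation callable can be plugged in
```

In this sketch, prompt mining and triplet matching are reduced to string operations purely for illustration; the paper's method derives them from the constructed knowledge graph and lets the large language model integrate the matched triplets into the final expansion.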

Published In

World Wide Web, Volume 28, Issue 1
Jan 2025
297 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 20 December 2024
Accepted: 07 December 2024
Revision received: 14 November 2024
Received: 25 July 2024

Author Tags

  1. Short-text expansion
  2. Knowledge graph
  3. Large language model
  4. Prompt

Author Tags

  1. Information and Computing Sciences
  2. Artificial Intelligence and Image Processing
  3. Information Systems
  4. Language, Communication and Culture
  5. Linguistics

Qualifiers

  • Research-article

Funding Sources

  • National Key Research and Development Program of China (Research and Demonstration Application of Key Technologies for Personalized Learning Driven by Educational Big Data)
  • National Natural Science Foundation of China
  • Research Cultivation Fund for The Youth Teachers of South China Normal University
