Editorial: Innovation measurement for scientific communication (IMSC) in the era of big data

Zhongyi Wang (School of Information Management, Central China Normal University, Wuhan, China)

Haihua Chen (Department of Information Science, University of North Texas, Denton, Texas, USA and Intelligent Data Engineering and Analytics Lab, University of North Texas, Denton, Texas, USA)

Chengzhi Zhang (School of Economics and Management, Nanjing University of Science and Technology, Nanjing, China)

Wei Lu (School of Information Management, Wuhan University, Wuhan, China)

Jian Wu (Department of Computer Science, Old Dominion University, Norfolk, Virginia, USA)

The Electronic Library

ISSN: 0264-0473

Article publication date: 31 October 2024

Issue publication date: 31 October 2024


Citation

Wang, Z., Chen, H., Zhang, C., Lu, W. and Wu, J. (2024), "Editorial: Innovation measurement for scientific communication (IMSC) in the era of big data", The Electronic Library, Vol. 42 No. 6, pp. 849-853. https://doi.org/10.1108/EL-12-2024-353

Publisher: Emerald Publishing Limited

Introduction

Scientific communication plays a central role in addressing many of the pressing challenges facing society today (Achiam et al., 2022). In this context, the measurement of innovation within scientific communication is crucial for both understanding and advancing knowledge in today’s fast-paced, data-driven world (Wang et al., 2023). However, measuring innovation accurately is a complex endeavour. The rapid growth of scientific output and the increasing complexity of interdisciplinary research make it challenging to identify and assess truly ground-breaking work.

Artificial intelligence (AI) has already proven its effectiveness across a range of societal applications (Barakat et al., 2021). AI, including large language models (LLMs), excels in identifying patterns from vast data sets and generating accurate predictions based on these insights (Ni et al., 2024; Rubungo et al., 2023; Xu et al., 2024). In the scientific domain, AI has demonstrated remarkable capabilities in synthesis, comprehension, reasoning and interpretation, showcasing its potential to revolutionise research processes (Krenn et al., 2022; Xu et al., 2021; Zheng et al., 2023). By harnessing these capabilities, AI offers promising new approaches for evaluating innovation more effectively and accurately. LLMs, as a subset of AI, are particularly well-suited for handling complex linguistic data, allowing for nuanced understanding and generation of scientific discourse. Their ability to process interdisciplinary knowledge and integrate information across domains makes LLMs essential tools in assessing innovation across various scientific fields.

Developing robust methods for measuring innovation is essential, not only for recognising pioneering contributions in scientific communication but also for deepening our understanding of research impact and fostering a culture of continuous advancement within the academic community. As we navigate the era of big data, the integration of AI into innovation measurement tools is critical – not only for identifying significant scientific contributions but also for enhancing our understanding of how innovation emerges and spreads across the scientific landscape.

Moreover, communication lies at the heart of science, permeating every stage of the research process and enabling scientists to share their findings with peers through journal articles, preprints and conference proceedings (Russell, 2001). The rise of information technology, particularly personal computers and the World Wide Web, has transformed scientific communication, shifting it from a traditional print-based system reliant on scientific journals to one increasingly dependent on electronic communication and digital storage (Hurd, 2000). As a result, the measurement of innovation in scientific communication frequently involves analysing digital journal articles, preprints and other electronic formats to assess their impact and contributions to the field.

For example, Wang et al. (2024b) developed a content-based evaluation method for scientific papers that focuses on integrity, clarity, novelty and significance, thereby enhancing innovation measurement. Similarly, Zhao et al. (2024) found that teams with more thought leaders tend to produce less disruptive ideas, while factors such as team size, diversity and international collaboration also influence innovation. Wang et al. (2024a) introduced the D_Div index, a method incorporating interdisciplinary factors, which has been validated to outperform traditional techniques in identifying breakthroughs. Liu et al. (2024b) used the pre-trained Bio-BERT model to explore gender disparities in scientific novelty, analysing biomedical doctoral theses to uncover gender-based influences on innovation. Iacopini et al. (2018) introduced a model for the emergence of innovations in which cognitive processes are described as random walks on a network of linked ideas or concepts, with an innovation corresponding to the first visit to a novel node. Hofstra et al. (2020) demonstrated that underrepresented groups contribute significantly to scientific novelty, yet their innovations are often undervalued, highlighting a gap in current innovation measurement. Uzzi et al. (2013) found that high-impact papers blend conventional and novel combinations of prior work, with teams being 37.7% more likely than solo authors to introduce these innovative combinations. This highlights the role of collaboration in driving scientific innovation.
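To make the random-walk view of innovation concrete, the following is a minimal sketch rather than the cited authors' implementation. It assumes an unbiased walk on a small hypothetical concept network (the cited model may weight or reinforce links differently), and the function name count_novelties and the toy ring network are illustrative assumptions. An "innovation" is recorded each time the walker reaches a node it has not visited before.

```python
import random


def count_novelties(adjacency, start, steps, seed=0):
    """Walk `steps` steps on a graph and return the step numbers at which
    a node is visited for the first time (a "novelty").

    adjacency: dict mapping each node to a list of neighbouring nodes.
    Assumption: the walk is unbiased, i.e. each neighbour is equally likely.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    visited = {start}
    current = start
    novelty_times = []
    for t in range(1, steps + 1):
        current = rng.choice(adjacency[current])
        if current not in visited:
            visited.add(current)
            novelty_times.append(t)
    return novelty_times


# Hypothetical toy network: six linked concepts arranged in a ring.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
times = count_novelties(ring, start=0, steps=200, seed=42)
print(times)  # step numbers at which a new concept was reached
```

The gaps between successive entries in the returned list give a rough sense of how the pace of discovery slows as the unexplored part of the network shrinks.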

The primary objective of this special issue of The Electronic Library, titled “Innovation Measurement in Scientific Communication in the Big Data Era”, is to leverage advanced technologies, such as big data and AI, to develop more effective and reliable methods for measuring innovation in scientific communication.

The present issue

For this Special Issue, each submission was initially evaluated by one of the Guest Editors to assess its relevance to the theme. Submissions that aligned with the topic were then reviewed by at least two external reviewers, while those that did not were declined. Following a rigorous peer review process, three submissions were accepted for publication. The contributions featured in this issue explored the application of various big data and AI technologies in identifying and measuring innovation in scientific communication.

We extend our sincere appreciation to all contributors for their invaluable insights and rigorous research. Each paper in this Special Issue has significantly enriched our understanding of innovation measurement, collectively advancing the discourse in this critical area. The three accepted papers offer valuable insights into the evolving landscape of the field. By analysing these contributions, we aim to emphasise the importance of combining established methodologies with new technological advancements to tackle the complex challenges of evaluating scientific innovation in the big data era.

Yang et al. (2024) designed a framework for analysing the distribution of novelty in academic papers from a collaboration perspective. They first used the BERTopic model to identify the topic of each paper, then calculated a novelty score for each paper based on combination innovation theory. Building on this, they analysed the novelty of papers across different topics and examined how collaboration patterns among authors vary between topics. The study seeks to explain how these collaboration patterns relate to the novelty of papers. Through an empirical study of articles published in Chinese library science journals indexed by the Chinese Social Sciences Citation Index (CSSCI) from 2000 to 2022, they found that papers with different topics and novelty levels exhibit distinct author collaboration patterns: low-novelty topics are dominated by solo authors, while high-novelty topics involve a higher proportion of inter-institutional collaboration.
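As an illustration of how a combination-based novelty score can be computed, the sketch below scores each paper by the share of its keyword pairs that have not appeared in any earlier paper. This is one simple reading of combinatorial novelty and is not claimed to be the scoring used by Yang et al. (2024); the function name novelty_scores, the use of keywords as the combined elements and the toy corpus are all assumptions made for illustration.

```python
from itertools import combinations


def novelty_scores(papers):
    """Score papers by the fraction of their keyword pairs that are new.

    papers: list of (paper_id, keywords) tuples in chronological order.
    Returns a dict mapping paper_id to a score in [0, 1].
    Assumption: a pair is "novel" if no earlier paper contains both keywords.
    """
    seen_pairs = set()
    scores = {}
    for paper_id, keywords in papers:
        # Unordered, de-duplicated keyword pairs for this paper.
        pairs = {frozenset(p) for p in combinations(sorted(set(keywords)), 2)}
        new_pairs = pairs - seen_pairs
        scores[paper_id] = len(new_pairs) / len(pairs) if pairs else 0.0
        seen_pairs |= pairs
    return scores


# Hypothetical toy corpus, ordered by publication date.
corpus = [
    ("p1", ["digital library", "metadata"]),
    ("p2", ["digital library", "metadata", "deep learning"]),
    ("p3", ["digital library", "metadata"]),
]
scores = novelty_scores(corpus)
print(scores)
```

In this toy corpus the first paper scores 1.0 because its only pair is new, the second scores about 0.67 because two of its three pairs are new, and the third scores 0.0 because it repeats an existing combination.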

Liu et al. (2024a) introduced the mix prompt tuning method, which uses both manually designed and automatically learned prompt templates to enhance multi-granularity academic function recognition tasks in low-resource settings. By reducing reliance on annotated data, this approach integrates deep learning and prompt-based techniques, significantly improving performance. This versatile, semi-supervised model contributes to innovation measurement in scientific communication by being applicable to various low-resource classification tasks in the scientific domain.

Chen et al. (2024) examined the relationship between team institutional composition and the fine-grained novelty of academic papers in the field of natural language processing. Using entity extraction and entity combination methods, they evaluated paper novelty and categorised author teams into academic, industrial and mixed institutions. The study highlights how collaboration between industry and academia influences the creation of innovative work, offering a nuanced approach that leverages AI techniques to assess novelty based on specific entity combinations. This work contributes to the broader discourse on measuring innovation in scientific communication in the big data era.

Future directions

In the era of big data and rapid advancements in AI, these technologies have attracted significant attention from both the academic community and various professional fields. This shift has introduced new opportunities and perspectives for innovation measurement in scientific communication. As demonstrated by the research articles featured in this Special Issue, the application of AI and big data technologies in innovation measurement has produced notable results. In this section, we explore the anticipated developments and trends in innovation measurement in the context of big data.

Open science is expected to play a pivotal role in fostering greater collaboration and innovation within the scientific community. By promoting transparency, accessibility and the sharing of research outputs, open science accelerates knowledge dissemination and encourages more diverse and inclusive contributions to scientific progress. Future research should focus on how innovation measurement frameworks can be adapted to incorporate open science practices, such as open access publications, preprint archives and open peer review. These practices not only democratise access to knowledge but also offer new avenues for measuring the impact and innovative potential of scientific work beyond traditional metrics.

The integration of big data and AI presents significant opportunities for advancing innovation measurement. Big data offers a vast repository of information – including publications, patents, data sets and social media interactions – that can be analysed to reveal patterns and trends in scientific communication. AI, especially through machine learning and deep learning algorithms, can process and analyse this data at an unprecedented scale and speed. Future research should focus on the development of AI-driven tools capable of automatically evaluating research contributions, predicting emerging fields and assessing the potential impact of ongoing projects. Such tools could improve the accuracy, comprehensiveness and timeliness of innovation metrics, providing a more nuanced understanding of scientific progress.

AI agents, designed to perform tasks autonomously, represent a promising direction for the future of innovation measurement. These agents can simulate human decision-making processes and operate continuously, offering real-time monitoring and evaluation of scientific communication. Future studies should investigate how AI agents can be used to autonomously track research outputs, conduct preliminary peer reviews and forecast the trajectory of scientific fields. By integrating AI agents with big data and machine learning models, it is possible to develop sophisticated systems capable of identifying ground-breaking research and informing strategic decisions in science policy and funding.

Conclusion

In conclusion, the future of innovation measurement in scientific communication is deeply intertwined with the advancements in open science, big data, AI and AI agents. By focusing on these emerging areas, future research can create more effective, inclusive and adaptive frameworks for evaluating innovation. These advancements will not only enhance our understanding of how scientific knowledge is generated and disseminated but also ensure that the most innovative and impactful research receives the recognition and support it deserves in an increasingly complex, data-driven world.

References

Achiam, M., Kupper, J.F.H. and Roche, J. (2022), “Inclusion, reflection and co-creation: responsible science communication across the globe”, Journal of Science Communication, Vol. 21 No. 4, p. E.

Barakat, Y., Bourekkadi, S., Khoulji, S. and Kerkeb, M.L. (2021), “What contributions of artificial intelligence in innovation?”, E3S Web of Conferences, Vol. 234, p. 105.

Chen, Z., Zhang, C., Zhang, H., Zhao, Y., Yang, C. and Yang, Y. (2024), “Exploring the relationship between team institutional composition and novelty in academic papers based on fine-grained knowledge entities”, The Electronic Library, doi: 10.1108/EL-03-2024-0070.

Hofstra, B., Kulkarni, V.V., Munoz-Najar Galvez, S., He, B., Jurafsky, D. and McFarland, D.A. (2020), “The diversity-innovation paradox in science”, Proceedings of the National Academy of Sciences, Vol. 117 No. 17, pp. 9284-9291.

Hurd, J.M. (2000), “The transformation of scientific communication: a model for 2020”, Journal of the American Society for Information Science, Vol. 51 No. 14, pp. 1279-1283.

Iacopini, I., Milojević, S. and Latora, V. (2018), “Network dynamics of innovation processes”, Physical Review Letters, Vol. 120 No. 4, p. 48301.

Krenn, M., Pollice, R., Guo, S.Y., Aldeghi, M., Cervera-Lierta, A., Friederich, P., dos Passos Gomes, G., Häse, F., Jinich, A. and Nigam, A. (2022), “On scientific understanding with artificial intelligence”, Nature Reviews Physics, Vol. 4 No. 12, pp. 761-769.

Liu, J., Xiong, Z., Jiang, Y., Ma, Y., Lu, W., Huang, Y. and Cheng, Q. (2024a), “Low-resource multi-granularity academic function recognition based on multiple prompt knowledge”, The Electronic Library, doi: 10.1108/EL-01-2024-0022.

Liu, M., Xie, Z., Yang, A.J., Yu, C., Xu, J., Ding, Y. and Bu, Y. (2024b), “The prominent and heterogeneous gender disparities in scientific novelty: evidence from biomedical doctoral theses”, Information Processing and Management, Vol. 61 No. 4, p. 103743.

Ni, H., Meng, S., Chen, X., Zhao, Z., Chen, A., Li, P., Zhang, S., Yin, Q., Wang, Y. and Chan, Y. (2024), “Harnessing earnings reports for stock predictions: a QLoRA-enhanced LLM approach”, arXiv preprint arXiv:2408.06634.

Rubungo, A.N., Arnold, C., Rand, B.P. and Dieng, A.B. (2023), “LLM-Prop: predicting physical and electronic properties of crystalline solids from their text descriptions”, arXiv preprint arXiv:2310.14029.

Russell, J.M. (2001), “Scientific communication at the beginning of the twenty-first century”, International Social Science Journal, Vol. 53 No. 168.

Uzzi, B., Mukherjee, S., Stringer, M. and Jones, B. (2013), “Atypical combinations and scientific impact”, Science, Vol. 342 No. 6157, pp. 468-472.

Wang, Z., Qiao, X., Chen, J., Li, L., Zhang, H., Ding, J. and Chen, H. (2024a), “Exploring and evaluating the index for interdisciplinary breakthrough innovation detection”, The Electronic Library, Vol. 42 No. 4, pp. 536-552.

Wang, Z., Zhang, H., Chen, H., Feng, Y. and Ding, J. (2024b), “Content-based quality evaluation of scientific papers using coarse feature and knowledge entity network”, Journal of King Saud University - Computer and Information Sciences, Vol. 36 No. 6, p. 102119.

Wang, Z., Chen, H., Zhang, C., Lu, W. and Wu, J. (2023), “JCDL2023 workshop: Innovation measurement for scientific communication (IMSC) in the era of big data”, ACM/IEEE Joint Conference on Digital Libraries (JCDL 23), pp. 303-305.

Xu, Y., Liu, X., Cao, X., Huang, C., Liu, E., Qian, S., Liu, X., Wu, Y., Dong, F. and Qiu, C.-W. (2021), “Artificial intelligence: a powerful paradigm for scientific research”, The Innovation, Vol. 2 No. 4, p. 100179.

Xu, X., Yao, B., Dong, Y., Gabriel, S., Yu, H., Hendler, J., Ghassemi, M., Dey, A.K. and Wang, D. (2024), “Mental-LLM: leveraging large language models for mental health prediction via online text data”, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Vol. 8 No. 1, pp. 1-32.

Yang, C., Wang, Y. and Zhang, C. (2024), “Unveiling novelty evolution in the field of library and information science in China”, The Electronic Library, doi: 10.1108/EL-03-2024-0071.

Zhao, Y., Wang, Y., Zhang, H., Kim, D., Lu, C., Zhu, Y. and Zhang, C. (2024), “Do more heads imply better performance? An empirical study of team thought leaders’ impact on scientific team performance”, Information Processing and Management, Vol. 61 No. 4, p. 103757.

Zheng, Y., Koh, H.Y., Ju, J., Nguyen, A.T., May, L.T., Webb, G.I. and Pan, S. (2023), “Large language models for scientific synthesis, inference and explanation”, arXiv preprint arXiv:2310.07984.

Acknowledgements

This paper forms part of a special section “Innovation measurement for scientific communication (IMSC) in the era of big data”, guest edited by Zhongyi Wang, Haihua Chen, Chengzhi Zhang, Wei Lu and Jian Wu.