The use and effects of visual representations in knowledge production have been a charged topic in scientific research. In the field of humanities studies, however, this topic remains under-examined despite the increasing applications of data visualization in the field. This paper aims to understand how visual representations facilitate narrative construction in published articles in the emerging field of digital humanities (DH). Through the methods of content analysis and close reading, we analyzed the narrative functions of visualizations in the argumentation process in a selected sample of research articles published in the Journal of Cultural Analytics from 2017 to 2019. With four observations from the analysis, this study presents a preliminary yet innovative examination of DH's visual language and proposes suggestions on integrating existing functional frameworks of data visualization with the research contexts of digital humanities.
Clarivate Analytics's Web of Science (WoS) is the world's leading scientific citation search and analytical information platform. It is used both as a research tool supporting a broad array of scientific tasks across diverse knowledge domains and as a dataset for large-scale data-intensive studies. WoS has been used in thousands of published academic studies over the past 20 years. It is also the most enduring commercial legacy of Eugene Garfield. Despite the central position WoS holds in contemporary research, the quantitative impact of WoS has not been previously examined by rigorous scientific studies. To better understand how this key piece of Eugene Garfield's heritage has contributed to science, we investigated the ways in which WoS (and associated products and features) is mentioned in a sample of 19,478 English-language research and review papers published between 1997 and 2017, as indexed in WoS databases. We offered descriptive analyses of the distribution of the papers across countries, institutions, and knowledge domains. We also used natural language processing techniques to identify the verbs and nouns in the abstracts of these papers that are grammatically connected to WoS-related phrases. This is the first study to empirically investigate the documentation of the use of the WoS platform in published academic papers in both scientometric and linguistic terms.
Despite its rising position as a first-class research object, scientific software remains a marginal object in studies of scholarly communication. This study aims to fill the gap by examining the co-mention network of R packages across all Public Library of Science (PLoS) journals. To that end, we developed a software entity extraction method and identified 14,310 instances of R packages across the 13,684 PLoS journal papers mentioning or citing R. A paper-level co-mention network of these packages was visualized and analyzed using three major centrality measures: degree centrality, betweenness centrality, and PageRank. We analyzed the distributive patterns of R packages in all PLoS papers, identified the top packages mentioned in these papers, and examined the clustering structure of the network. Specifically, we found that the discipline and function of the packages can partly explain the largest clusters. The present study offers the first large-scale analysis of R packages' extensive use in scientific research. As such, it lays the foundation for future explorations of various roles played by software packages in the scientific enterprise.
This paper addresses software citation by analyzing how R and its packages are cited in a sample of PLoS papers. A codebook is developed to support a content analysis of the full-text papers. Our results indicate that the software R and its packages are inconsistently cited, as is the case with other scientific software. The inconsistency derives partly from the variety of citation standards currently used for software, and partly from the fact that these standards are not well followed by authors on multiple levels. This work sheds light on the future development of software citation standards, especially given the present landscape of conflicting citation practices. Moreover, our approach furnishes a possible blueprint for dealing with the granularity of software entities in scientific citation: we consider citations of the core R software environment, of specific R packages, and of individual functions.
Scientific software is as important to scientific studies as raw data. Yet attention to this genre of research data is limited in studies on data reuse, citation, and metadata standards. This paper presents results from an exploratory study that examined how reuse information about scientific software is presented in current citation practice and in natural language descriptions in research papers. We selected LAMMPS, a popular simulation package used in materials science, for this study. Both descriptive metadata elements and the types of reuse are examined in a sample of 400 research papers. The results indicate that both descriptive metadata elements and reuse types for LAMMPS are presented in incomplete and inconsistent ways, which undermines the value of scientific software as a type of research data. Our findings call for future studies on metadata standards to facilitate the identification of information related to scientific software reuse.