Search | arXiv e-print repository

doi 10.1016/j.nlp.2024.100085

Cutting through the noise to motivate people: A comprehensive analysis of COVID-19 social media posts de/motivating vaccination

Authors: Ashiqur Rahman, Ehsan Mohammadi, Hamed Alhoori

Abstract: The COVID-19 pandemic exposed significant weaknesses in the healthcare information system. The overwhelming volume of misinformation on social media and other socioeconomic factors created extraordinary challenges to motivate people to take proper precautions and get vaccinated. In this context, our work explored a novel direction by analyzing an extensive dataset collected over two years, identifying the topics de/motivating the public about COVID-19 vaccination. We analyzed these topics based on time, geographic location, and political orientation. We noticed that while the motivating topics remain the same over time and geographic location, the demotivating topics change rapidly. We also identified that intrinsic motivation, rather than external mandate, is more advantageous to inspire the public. This study addresses scientific communication and public motivation in social media. It can help public health officials, policymakers, and social media platforms develop more effective messaging strategies to cut through the noise of misinformation and educate the public about scientific findings.

Submitted 26 July, 2024; v1 submitted 14 June, 2024; originally announced July 2024.

Comments: 51 pages, 13 figures, 12 tables. Accepted at Natural Language Processing Journal

Journal ref: Natural Language Processing Journal, Volume 8, 2024, 100085, ISSN 2949-7191

arXiv:2310.03174 [pdf, other]

Test Case Recommendations with Distributed Representation of Code Syntactic Features

Authors: Mosab Rezaei, Hamed Alhoori, Mona Rahimi

Abstract: Frequent modifications of unit test cases are inevitable due to software's continuous underlying changes in source code, design, and requirements. Since manually maintaining software test suites is tedious, time-consuming, and costly, automating the process of generation and maintenance of test units will significantly impact the effectiveness and efficiency of software testing processes. To this end, we propose an automated approach which exploits both structural and semantic properties of source code methods and test cases to recommend the most relevant and useful unit tests to the developers. The proposed approach initially trains a neural network to transform method-level source code, as well as unit tests, into distributed representations (embedded vectors) while preserving the importance of the structure in the code. Retrieving the semantic and structural properties of a given method, the approach computes cosine similarity between the method's embedding and the previously embedded training instances. Further, according to the similarity scores between the embedding vectors, the model identifies the closest methods of embedding and the associated unit tests as the most similar recommendations. The results on the Methods2Test dataset showed that, while there is no guarantee to have similar relevant test cases for the group of similar methods, the proposed approach extracts the most similar existing test cases for a given method in the dataset, and evaluations show that recommended test cases decrease the developers' effort in generating expected test cases.

Submitted 4 October, 2023; originally announced October 2023.

Comments: 8 pages, 4 figures, 14th Workshop on Automating Test Case Design, Selection and Evaluation (A-TEST 2023) co-located with 38th IEEE/ACM International Conference on ASE 2023 conference
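
The retrieval step described in the abstract above (cosine similarity between a method's embedding and previously embedded instances, then returning the closest methods with their unit tests) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-dimensional vectors, the `corpus` tuples, and the names `cosine_similarity` and `recommend_tests` are all assumptions made for the example, and a real system would use embeddings produced by the trained neural network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_tests(query_embedding, corpus, top_k=3):
    """Rank (method_name, embedding, test_name) entries by similarity to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), method_name, test_name)
        for method_name, embedding, test_name in corpus
    ]
    # Highest similarity first; only the score is used as the sort key.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy corpus: method name, made-up embedding, associated unit test.
corpus = [
    ("add", [1.0, 0.0, 0.0], "testAdd"),
    ("subtract", [0.9, 0.1, 0.0], "testSubtract"),
    ("parse", [0.0, 1.0, 0.0], "testParse"),
]

top = recommend_tests([1.0, 0.05, 0.0], corpus, top_k=2)
for score, method_name, test_name in top:
    print(f"{method_name}: {test_name} (similarity {score:.3f})")
```

Cosine similarity ignores vector magnitude, so two methods whose embeddings point in the same direction are ranked as close regardless of scale.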

arXiv:2308.12580 [pdf, other]

Laying foundations to quantify the "Effort of Reproducibility"

Authors: Akhil Pandey Akella, David Koop, Hamed Alhoori

Abstract: Why are some research studies easy to reproduce while others are difficult? Casting doubt on the accuracy of scientific work is not fruitful, especially when an individual researcher cannot reproduce the claims made in the paper. There could be many subjective reasons behind the inability to reproduce a scientific paper. The field of Machine Learning (ML) faces a reproducibility crisis, and surveying a portion of published articles has resulted in a group realization that although sharing code repositories would be appreciable, code bases are not the end all be all for determining the reproducibility of an article. Various parties involved in the publication process have come forward to address the reproducibility crisis, and solutions such as badging articles as reproducible, reproducibility checklists at conferences (NeurIPS, ICML, ICLR, etc.), and sharing artifacts on OpenReview come across as promising solutions to the core problem. The breadth of literature on reproducibility focuses on measures required to avoid irreproducibility, and there is not much research into the effort behind reproducing these articles. In this paper, we investigate the factors that contribute to the ease and difficulty of reproducing previously published studies and report on the foundational framework to quantify effort of reproducibility.

Submitted 24 August, 2023; originally announced August 2023.

Comments: Accepted at the ACM/IEEE JCDL 2023 conference. Refer to https://2023.jcdl.org/program/schedule-printable/ for confirmation

arXiv:2306.12118 [pdf, other]

doi 10.1109/JCDL57899.2023.00067

Visualizing Relation Between (De)Motivating Topics and Public Stance toward COVID-19 Vaccine

Authors: Ashiqur Rahman, Hamed Alhoori

Abstract: While social media plays a vital role in communication nowadays, misinformation and trolls can easily take over the conversation and steer public opinion on these platforms. We saw the effect of misinformation during the COVID-19 pandemic when public health officials faced significant push-back while trying to motivate the public to vaccinate. To tackle the current and any future threats in emergencies and motivate the public towards a common goal, it is essential to understand how public motivation shifts and which topics resonate among the general population. In this study, we proposed an interactive visualization tool to inspect and analyze the topics that resonated among Twitter-sphere during the COVID-19 pandemic and understand the key factors that shifted public stance for vaccination. This tool can easily be generalized for any scenario for visual analysis and to increase the transparency of social media data for researchers and the general population alike.

Submitted 6 July, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

arXiv:2306.09812 [pdf, other]

Boundary Blending: Reconsidering the Design of Multi-View Visualizations

Authors: Maoyuan Sun, Abdul Rahman Shaikh, Yue Ma, David Koop, Hamed Alhoori

Abstract: Multiple-view visualizations (MVs) have been widely used for visual analysis. Each view shows some part of the data in a usable way, and together multiple views enable a holistic understanding of the data under investigation. For example, an analyst may check a social network graph, a map of sensitive locations, a table of transaction records, and a collection of reports to identify suspicious activities. While each view is designed to preserve its own visual context with visible borders or perceivable spatial distance from others, the key to solving real-world analysis problems often requires "breaking" such boundaries, and further integrating and synthesizing the data scattered across multiple views. This calls for blending the boundaries in MVs, instead of simply breaking them, which brings key questions: what are possible boundaries in MVs, and what are design options that can support the boundary blending in MVs? To answer these questions, we present three boundaries in MVs: 1) data boundary, 2) representation boundary, and 3) semantic boundary, corresponding to three major aspects regarding the usage of MVs: encoded information, visual representation, and interpretation. Then, we discuss four design strategies (highlighting, linking, embedding, and extending) and their pros and cons for supporting boundary blending in MVs. We conclude our discussion with future research opportunities.

Submitted 16 June, 2023; originally announced June 2023.

ACM Class: H.5.0

arXiv:2301.04369 [pdf, other]

Reproducibility Signals in Science: A preliminary analysis

Authors: Akhil Pandey Akella, Hamed Alhoori, David Koop

Abstract: Reproducibility is an important feature of science; experiments are retested, and analyses are repeated. Trust in the findings increases when consistent results are achieved. Despite the importance of reproducibility, significant work is often involved in these efforts, and some published findings may not be reproducible due to oversights or errors. In this paper, we examine a myriad of features in scholarly articles published in computer science conferences and journals and test how they correlate with reproducibility. We collected data from three different sources that labeled publications as either reproducible or irreproducible and employed statistical significance tests to identify features of those publications that hold clues about reproducibility. We found the readability of the scholarly article and accessibility of the software artifacts through hyperlinks to be strong signals noticeable amongst reproducible scholarly articles.

Submitted 11 January, 2023; originally announced January 2023.

Comments: Accepted as a Workshop paper for WIESP-22 (https://ui.adsabs.harvard.edu/WIESP/2022/Schedule)

arXiv:2209.07333 [pdf, other]

doi 10.2478/jdis-2022-0003

Public Reaction to Scientific Research via Twitter Sentiment Prediction

Authors: Murtuza Shahzad, Hamed Alhoori

Abstract: Social media users share their ideas, thoughts, and emotions with other users. However, it is not clear how online users would respond to new research outcomes. This study aims to predict the nature of the emotions expressed by Twitter users toward scientific publications. Additionally, we investigate what features of the research articles help in such prediction. Identifying the sentiments of research articles on social media will help scientists gauge a new societal impact of their research articles.

Submitted 11 September, 2022; originally announced September 2022.

Comments: Journal of Data and Information Science

Journal ref: Journal of Data and Information Science (2022), Volume 7, Issue 1, 97-124

arXiv:2209.06212 [pdf]

doi 10.1016/j.joi.2022.101288

Quantifying the Online Long-Term Interest in Research

Authors: Murtuza Shahzad, Hamed Alhoori, Reva Freedman, Shaikh Abdul Rahman

Abstract: Research articles are being shared in increasing numbers on multiple online platforms. Although the scholarly impact of these articles has been widely studied, the online interest determined by how long the research articles are shared online remains unclear. Being cognizant of how long a research article is mentioned online could be valuable information to the researchers. In this paper, we analyzed multiple social media platforms on which users share and/or discuss scholarly articles. We built three clusters for papers, based on the number of yearly online mentions having publication dates ranging from the year 1920 to 2016. Using the online social media metrics for each of these three clusters, we built machine learning models to predict the long-term online interest in research articles. We addressed the prediction task with two different approaches: regression and classification. For the regression approach, the Multi-Layer Perceptron model performed best, and for the classification approach, the tree-based models performed better than other models. We found that old articles are most evident in the contexts of economics and industry (i.e., patents). In contrast, recently published articles are most evident in research platforms (i.e., Mendeley) followed by social media platforms (i.e., Twitter).

Submitted 13 September, 2022; originally announced September 2022.

Comments: Journal of Informetrics

Journal ref: Journal of Informetrics 16.2 (2022): 101288

arXiv:2209.02380 [pdf, other]

YouTube and Science: Models for Research Impact

Authors: Abdul Rahman Shaikh, Hamed Alhoori, Maoyuan Sun

Abstract: Video communication has been rapidly increasing over the past decade, with YouTube providing a medium where users can post, discover, share, and react to videos. There has also been an increase in the number of videos citing research articles, especially since it has become relatively commonplace for academic conferences to require video submissions. However, the relationship between research articles and YouTube videos is not clear, and the purpose of the present paper is to address this issue. We created new datasets using YouTube videos and mentions of research articles on various online platforms. We found that most of the articles cited in the videos are related to medicine and biochemistry. We analyzed these datasets through statistical techniques and visualization, and built machine learning models to predict (1) whether a research article is cited in videos, (2) whether a research article cited in a video achieves a level of popularity, and (3) whether a video citing a research article becomes popular. The best models achieved F1 scores between 80% and 94%. According to our results, research articles mentioned in more tweets and news coverage have a higher chance of receiving video citations. We also found that video views are important for predicting citations and increasing research articles' popularity and public engagement with science.

Submitted 1 September, 2022; originally announced September 2022.

Comments: 21 pages, 12 figures, Scientometrics Journal

arXiv:2207.07558 [pdf, other]

Toward Systematic Design Considerations of Organizing Multiple Views

Authors: Abdul Rahman Shaikh, David Koop, Hamed Alhoori, Maoyuan Sun

Abstract: Multiple-view visualization (MV) has been used for visual analytics in various fields (e.g., bioinformatics, cybersecurity, and intelligence analysis). Because each view encodes data from a particular perspective, analysts often use a set of views laid out in 2D space to link and synthesize information. The difficulty of this process is impacted by the spatial organization of these views. For instance, connecting information from views far from each other can be more challenging than neighboring ones. However, most visual analysis tools currently either fix the positions of the views or completely delegate this organization of views to users (who must manually drag and move views). This either limits user involvement in managing the layout of MV or is overly flexible without much guidance. Then, a key design challenge in MV layout is determining the factors in a spatial organization that impact understanding. To address this, we review a set of MV-based systems and identify considerations for MV layout rooted in two key concerns: perception, which considers how users perceive view relationships, and content, which considers the relationships in the data. We show how these allow us to study and analyze the design of MV layout systematically.

Submitted 15 July, 2022; originally announced July 2022.

Comments: Short paper with 4 pages + 1 reference page, 2 figures, 1 table, accepted at IEEE VIS 2022 conference

arXiv:2109.14099 [pdf, other]

An Explainable-AI approach for Diagnosis of COVID-19 using MALDI-ToF Mass Spectrometry

Authors: Venkata Devesh Reddy Seethi, Zane LaCasse, Prajkta Chivte, Joshua Bland, Shrihari S. Kadkol, Elizabeth R. Gaillard, Pratool Bharti, Hamed Alhoori

Abstract: The severe acute respiratory syndrome coronavirus type-2 (SARS-CoV-2) caused a global pandemic and immensely affected the global economy. Accurate, cost-effective, and quick tests have proven substantial in identifying infected people and mitigating the spread. Recently, multiple alternative platforms for testing coronavirus disease 2019 (COVID-19) have been published that show high agreement with current gold standard real-time polymerase chain reaction (RT-PCR) results. These new methods do away with nasopharyngeal (NP) swabs, eliminate the need for complicated reagents, and reduce the burden on RT-PCR test reagent supply. In the present work, we have designed an artificial intelligence-based (AI) testing method to provide confidence in the results. Current AI applications for COVID-19 studies often lack a biological foundation in the decision-making process, and our AI approach is one of the earliest to leverage explainable AI (X-AI) algorithms for COVID-19 diagnosis using mass spectrometry. Here, we have employed X-AI to explain the decision-making process on a local (per-sample) and global (all samples) basis underscored by biologically relevant features. We evaluated our technique with data extracted from human gargle samples and achieved a testing accuracy of 94.12%. Such techniques would strengthen the relationship between AI and clinical diagnostics by providing biomedical researchers and healthcare workers with trustworthy and, most importantly, explainable test results.

Submitted 23 May, 2023; v1 submitted 28 September, 2021; originally announced September 2021.

arXiv:2108.01044 [pdf, other]

doi 10.1109/TVCG.2021.3114801

SightBi: Exploring Cross-View Data Relationships with Biclusters

Authors: Maoyuan Sun, Abdul Rahman Shaikh, Hamed Alhoori, Jian Zhao

Abstract: Multiple-view visualization (MV) has been heavily used in visual analysis tools for sensemaking of data in various domains (e.g., bioinformatics, cybersecurity and text analytics). One common task of visual analysis with multiple views is to relate data across different views. For example, to identify threats, an intelligence analyst needs to link people from a social network graph with locations on a crime-map, and then search for and read relevant documents. Currently, exploring cross-view data relationships heavily relies on view-coordination techniques (e.g., brushing and linking), which may require significant user effort on many trial-and-error attempts, such as repetitiously selecting elements in one view, and then observing and following elements highlighted in other views. To address this, we present SightBi, a visual analytics approach for supporting cross-view data relationship explorations. We discuss the design rationale of SightBi in detail, with identified user tasks regarding the use of cross-view data relationships. SightBi formalizes cross-view data relationships as biclusters, computes them from a dataset, and uses a bi-context design that highlights creating stand-alone relationship-views. This helps preserve existing views and offers an overview of cross-view data relationships to guide user exploration. Moreover, SightBi allows users to interactively manage the layout of multiple views by using newly created relationship-views. With a usage scenario, we demonstrate the usefulness of SightBi for sensemaking of cross-view data relationships.

Submitted 27 September, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

Comments: IEEE VIS 2021, ACM 2012 CCS - Human-centered computing, Visualization, Visualization design and evaluation methods

ACM Class: H.5.2

Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2021

arXiv:2012.13599 [pdf]

Early Indicators of Scientific Impact: Predicting Citations with Altmetrics

Authors: Akhil Pandey Akella, Hamed Alhoori, Pavan Ravikanth Kondamudi, Cole Freeman, Haiming Zhou

Abstract: Identifying important scholarly literature at an early stage is vital to the academic research community and other stakeholders such as technology companies and government bodies. Due to the sheer amount of research published and the growth of ever-changing interdisciplinary areas, researchers need an efficient way to identify important scholarly work. The number of citations a given research publication has accrued has been used for this purpose, but these take time to occur and longer to accumulate. In this article, we use altmetrics to predict the short-term and long-term citations that a scholarly publication could receive. We build various classification and regression models and evaluate their performance, finding neural networks and ensemble models to perform best for these tasks. We also find that Mendeley readership is the most important factor in predicting the early citations, followed by other factors such as the academic status of the readers (e.g., student, postdoc, professor), followers on Twitter, online post length, author count, and the number of mentions on Twitter, Wikipedia, and across different countries.

Submitted 25 December, 2020; originally announced December 2020.

arXiv:2001.01029 [pdf, other]

doi 10.1145/3375192

Measuring the Diversity of Facebook Reactions to Research

Authors: Cole Freeman, Hamed Alhoori, Murtuza Shahzad

Abstract: Online and in the real world, communities are bonded together by emotional consensus around core issues. Emotional responses to scientific findings often play a pivotal role in these core issues. When there is too much diversity of opinion on topics of science, emotions flare up and give rise to conflict. This conflict threatens positive outcomes for research. Emotions have the power to shape how people process new information. They can color the public's understanding of science, motivate policy positions, even change lives. And yet little work has been done to evaluate the public's emotional response to science using quantitative methods. In this paper, we use a dataset of responses to scholarly articles on Facebook to analyze the dynamics of emotional valence, intensity, and diversity. We present a novel way of weighting click-based reactions that increases their comprehensibility, and use these weighted reactions to develop new metrics of aggregate emotional responses. We use our metrics along with LDA topic models and statistical testing to investigate how users' emotional responses differ from one scientific topic to another. We find that research articles related to gender, genetics, or agricultural/environmental sciences elicit significantly different emotional responses from users than other research topics. We also find that there is generally a positive response to scientific research on Facebook, and that articles generating a positive emotional response are more likely to be widely shared, a conclusion that contradicts previous studies of other social media platforms.

Submitted 3 January, 2020; originally announced January 2020.

Comments: 17 pages, 3 figures, ACM Group

arXiv:1911.01275 [pdf]

Using Arabic Tweets to Understand Drug Selling Behaviors

Authors: Wesam Alruwaili, Bradley Protano, Tejasvi Sirigiriraju, Hamed Alhoori

Abstract: Twitter is a popular platform for e-commerce in the Arab region, including the sale of illegal goods and services. Social media platforms present multiple opportunities to mine information about behaviors pertaining to both illicit and pharmaceutical drugs and likewise to legal prescription drugs sold without a prescription, i.e., illegally. Recognized as a public health risk, the sale and use of illegal drugs, counterfeit versions of legal drugs, and legal drugs sold without a prescription constitute a widespread problem that is reflected in and facilitated by social media. Twitter provides a crucial resource for monitoring legal and illegal drug sales in order to support the larger goal of finding ways to protect patient safety. We collected our dataset using Arabic keywords. We then categorized the data using four machine learning classifiers. Based on a comparison of the respective results, we assessed the accuracy of each classifier in predicting two important considerations in analysing the extent to which drugs are available on social media: references to drugs for sale and the legality/illegality of the drugs thus advertised. For predicting tweets selling drugs, the Support Vector Machine classifier yielded the highest accuracy rate (96%), whereas for predicting the legality of the advertised drugs, the Naive Bayes classifier yielded the highest accuracy rate (85%).

Submitted 26 October, 2019; originally announced November 2019.

arXiv:1906.08244 [pdf, other]

Predicting Patent Citations to measure Economic Impact of Scholarly Research

Authors: Abdul Rahman Shaikh, Hamed Alhoori

Abstract: A crucial goal of funding research and development has always been to advance economic development. On this basis, a considerable body of research undertaken with the purpose of determining what exactly constitutes economic impact and how to accurately measure that impact has been published. Numerous indicators have been used to measure economic impact, although no single indicator has been widely adopted. Based on patent data collected from Altmetric, we predict patent citations through various social media features using several classification models. Patents citing a research paper imply the potential it has for direct application in its field. These predictions can be utilized by researchers in determining the practical applications for their work when applying for patents.

Submitted 7 June, 2019; originally announced June 2019.

Comments: 2 Pages, 1 figure, JCDL conference

arXiv:1905.10975 [pdf, other]

Shared Feelings: Understanding Facebook Reactions to Scholarly Articles

Authors: Cole Freeman, Mrinal Kanti Roy, Michele Fattoruso, Hamed Alhoori

Abstract: Research on social-media platforms has tended to rely on textual analysis to perform research tasks. While text-based approaches have significantly increased our understanding of online behavior and social dynamics, they overlook features on these platforms that have grown in prominence in the past few years: click-based responses to content. In this paper, we present a new dataset of Facebook Reactions to scholarly content. We give an overview of its structure, analyze some of the statistical trends in the data, and use it to train and test two supervised learning algorithms. Our preliminary tests suggest the presence of stratification in the number of users following pages, divisions that seem to fall in line with distinctions in the subject matter of those pages.

Submitted 27 May, 2019; originally announced May 2019.

Comments: 4 pages, 5 figures, JCDL 2019

arXiv:1804.03522 [pdf]

doi 10.1002/pra2.2017.14505401163

What Makes A Research Article Newsworthy?

Authors: Harish Varma Siravuri, Hamed Alhoori

Abstract: There has been tremendous growth in the amount of scientific literature being published every year. Yet, very little of it receives press coverage. Mentions by news outlets often establish the relevance the research has to society in general. In the present study, we focused on better understanding the factors that contribute to a research article's newsworthiness. We have built three classifiers to predict the likelihood of a research article receiving online press coverage, based on features that quantify the attention it has received on various online platforms. The Random Forest classifier performed best with an accuracy rate of 0.92.

Submitted 7 April, 2018; originally announced April 2018.

arXiv:1708.01658 [pdf, ps, other]

Exploring Features for Predicting Policy Citations

Authors: Christian Bailey, Bharat Kale, Jamieson Walker, Harish Varma Siravuri, Hamed Alhoori, Micheal E. Papka

Abstract: In this study, we performed an initial investigation and evaluation of altmetrics and their relationship with public policy citation of research papers. We examined methods for using altmetrics and other data to predict whether a research paper is cited in public policy and applied receiver operating characteristic (ROC) curve analysis to various feature groups in order to evaluate their potential usefulness. Of the methods we tested, classifying based on tweet count provided the best results, achieving an area under the ROC curve of 0.91.

Submitted 15 June, 2017; originally announced August 2017.

Comments: 2 pages, accepted to JCDL '17

arXiv:1706.04140 [pdf, ps, other]

doi 10.1145/3091478.3098865

Predicting Research that will be Cited in Policy Documents

Authors: Bharat Kale, Harish Varma Siravuri, Hamed Alhoori, Michael E. Papka

Abstract: Scientific publications and other genres of research output are increasingly being cited in policy documents. Citations in documents of this nature could be considered a critical indicator of the significance and societal impact of the research output. In this study, we built classification models that predict whether a particular research work is likely to be cited in a public policy document based on the attention it received online, primarily on social media platforms. We evaluated the classifiers based on their accuracy, precision, and recall values. We found that Random Forest and Multinomial Naive Bayes classifiers performed better overall.

Submitted 13 June, 2017; originally announced June 2017.

Comments: 2 page extended abstract submitted for ACM WebSci'17 conference

arXiv:1706.02192 [pdf]

doi 10.1145/3091478.3098861

Pokémon Go: Impact on Yelp Restaurant Reviews

Authors: Pavan Ravikanth Kondamudi, Bradley Protono, Hamed Alhoori

Abstract: Pokémon Go, the popular Augmented Reality based mobile application, launched in July of 2016. The game's meteoric rise in usage since that time has had an impact on not just the mobile gaming industry, but also the physical activity of players, where they travel, where they spend their money, and possibly how they interact with other social media applications. In this paper, we studied the impact of Pokémon Go on Yelp reviews. For restaurants near PokéStops, we found a slight drop in the number of online reviews.

Submitted 7 June, 2017; originally announced June 2017.

arXiv:1612.07863 [pdf]

Anatomy of Scholarly Information Behavior Patterns in the Wake of Academic Social Media Platforms

Authors: Hamed Alhoori, Mohammed Samaka, Richard Furuta, Edward A. Fox

Abstract: As more scholarly content is born digital or converted to a digital format, digital libraries are becoming increasingly vital to researchers seeking to leverage scholarly big data for scientific discovery. Although scholarly products are available in abundance, especially in environments created by the advent of social networking services, little is known about international scholarly information needs, information-seeking behavior, or information use. The purpose of this paper is to address these gaps via an in-depth analysis of the information needs and information-seeking behavior of researchers, both students and faculty, at two universities, one in the U.S. and the other in Qatar. Based on this analysis, the study identifies and describes new behavior patterns on the part of researchers as they engage in the information-seeking process. The analysis reveals that the use of academic social networks has notable effects on various scholarly activities. Further, this study identifies differences between students and faculty members in regard to their use of academic social networks, and it identifies differences between researchers according to discipline. Although the researchers who participated in the present study represent a range of disciplinary and cultural backgrounds, the study reports a number of similarities in terms of the researchers' scholarly activities.

Submitted 7 August, 2018; v1 submitted 22 December, 2016; originally announced December 2016.

arXiv:1612.05817 [pdf]

Recommendation of Scholarly Venues Based on Dynamic User Interests

Authors: Hamed Alhoori, Richard Furuta

Abstract: The ever-growing number of venues publishing academic work makes it difficult for researchers to identify venues that publish data and research most in line with their scholarly interests. A solution is needed, therefore, whereby researchers can identify information dissemination pathways in order to both access and contribute to an existing body of knowledge. In this study, we present a system to recommend scholarly venues rated in terms of relevance to a given researcher's current scholarly pursuits and interests. We collected our data from an academic social network and modeled researchers' scholarly reading behavior in order to propose a new and adaptive implicit rating technique for venues. We present a way to recommend relevant, specialized scholarly venues using these implicit ratings that can provide quick results, even for new researchers without a publication history and for emerging scholarly venues that do not yet have an impact factor. We performed a large-scale experiment with real data to evaluate the current scholarly recommendation system and showed that our proposed system achieves better results than the baseline. The results provide important up-to-the-minute signals that, compared with post-publication usage-based metrics, represent a closer reflection of a researcher's interests.

Submitted 26 December, 2017; v1 submitted 17 December, 2016; originally announced December 2016.

Showing 1–23 of 23 results for author: Alhoori, H