Events in clinical narratives are naturally associated with medical trials such as surgery, vaccination, lab test, medication, medical procedure, and diagnosis, and they are interrelated through many temporal relations; however, it is difficult to define these events quantitatively or consistently in coarse time-bins (e.g. before vaccination, after admission). The grouping of medical events into temporal clusters is key to applications such as longitudinal studies, clinical question answering, and information retrieval. In this paper, we developed two algorithms based on Min-conflicts and K-means to enable labeling a sequence of medical events with predefined time-bins. The computation is based solely on temporal similarity and integrated with a timeline visualization tool.
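As a rough illustration of the K-means idea mentioned in this abstract, the sketch below clusters one-dimensional event times and maps the clusters, ordered by centroid, onto predefined time-bin names. It is not the authors' algorithm: the function names (kmeans_1d, label_events), the day-offset representation of events, and the evenly spread initial centroids are all assumptions made for the example.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's K-means on scalar values (hypothetical helper)."""
    ordered = sorted(values)
    # Assumption: deterministic start, centroids spread evenly over the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def label_events(offsets, bin_names):
    """Assign each event offset the time-bin whose rank matches its cluster's rank."""
    k = len(bin_names)
    centroids = kmeans_1d(offsets, k)
    # Rank clusters chronologically so the earliest cluster gets the first bin name.
    order = sorted(range(k), key=lambda i: centroids[i])
    rank = {cluster: position for position, cluster in enumerate(order)}
    labels = []
    for v in offsets:
        nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
        labels.append(bin_names[rank[nearest]])
    return labels


# Hypothetical event times, in days relative to an anchor event.
offsets = [-10, -9, -8, 0, 1, 20, 21, 22]
labels = label_events(offsets, ["before", "during", "after"])
```

The example uses only temporal distance between events, which matches the abstract's statement that the computation relies solely on temporal similarity; everything else about the design is illustrative.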
A major obstacle impeding progress on the “web of data” is content creation, a difficult, tedious, and time-consuming task. How do we make human-scalable, user-friendly tools to enable the web of data? Content integrity is also a major concern. How do we engender confidence in results returned from the web of data? Although the two problems are seemingly unrelated, we show in this paper that it is exactly their relationship that is the key to solving both. We can semi-automatically derive both data and metadata from data-rich web pages to create a web of data that we then superimpose over these data-rich web pages. We link the web of data to the current web of pages, resulting in a higher-order “web of knowledge.” This web of knowledge provides provenance and thus engenders the confidence necessary to raise the level of the web from “data” to “knowledge.” We focus mainly on two prototype tools we have implemented: (1) TISP, a tool to automatically generate ontologies for ...
To date, there are no effective treatments for most neurodegenerative diseases. However, certain foods may be associated with these diseases and bring an opportunity to prevent or delay neurodegenerative progression. Our objective is to construct a knowledge graph for neurodegenerative diseases using literature mining to study their relations with diet. We collected biomedical annotations (Disease, Chemical, Gene, Species, SNP&Mutation) in the abstracts from 4,300 publications relevant to both neurodegenerative diseases and diet using PubTator, an NIH-supported tool that can extract biomedical concepts from literature. A knowledge graph was created from these annotations. Graph embeddings were then trained with the node2vec algorithm to support potential concept clustering and similar concept identification. We found several food-related species and chemicals that might come from diet and have an impact on neurodegenerative diseases. 1 Scientific Background Neurodegenerative disease...
There are huge and growing amounts of biological data that reside in different online repositories. Most of these Web-based sources focus only on specific areas or allow only limited types of user queries. To obtain needed information, biologists usually have to traverse different Web sources and combine their data manually. In this research, we propose a system that can help users overcome these difficulties. Given a user’s query within the area of molecular biology, our system can automatically discover appropriate repositories, retrieve useful information from these repositories, and integrate the retrieved information.
AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, 2020
The human papillomavirus (HPV) vaccine is the most effective way to prevent HPV-related cancers. Integrating provider vaccine counseling is crucial to improving HPV vaccine completion rates. Automating the counseling experience through a conversational agent could help improve HPV vaccine coverage and reduce the burden of vaccine counseling for providers. In a previous study, we tested a simulated conversational agent that provided HPV vaccine counseling for parents using the Wizard of Oz protocol. In the current study, we assessed the conversational agent among young college adults (n=24), a population that may have missed the HPV vaccine during their adolescence when vaccination is recommended. We also administered surveys for system and voice usability, and for health beliefs concerning the HPV vaccine. Participants perceived the agent to have high usability that is slightly better or equivalent to other voice interactive interfaces, and there is some evidence that the agent impa...
In this study, we introduce an ontology-driven software engine to provide dialogue interaction functionality for a conversational agent for HPV vaccine counseling. Currently, HPV vaccination rates are low, which puts unprotected individuals at risk of being infected with HPV, a virus that leads to life-threatening cancers. In addition, we developed a question answering subsystem to support the dialogue engine. In this paper, we discuss our design and development of an ontology-driven dialogue engine that uses the Patient Health Information Dialogue Ontology, an ontology that we previously developed, and a question answering subsystem based on various previous methods to supplement the dialogue engine’s interaction with the user. Our next step is to test the functional ability of the ontology-driven software components and deploy the engine in a live environment to be integrated with a speech interface.
The informed consent process is a complicated procedure involving permissions as well as a variety of entities and actions. In this paper, we discuss the use of Semantic Web Rule Language (SWRL) to further extend the Informed Consent Ontology (ICO) to allow for semantic machine-based reasoning to manage and generate important permission-based information that can later be viewed by stakeholders. We present four use cases of permissions from the All of Us informed consent document and translate these permissions into SWRL expressions to extend and operationalize ICO. Our efforts show how SWRL is able to infer some of the implicit information based on the defined rules, and demonstrate the utility of ICO through the use of SWRL extensions. Future work will include developing formal and generalized rules and expressing permissions from the entire document, as well as working towards integrating ICO into software systems to enhance the semantic representation of informed consent for biomed...
HCI International 2021 - Late Breaking Papers: HCI Applications in Health, Transport, and Industry, 2021
Narratives can have a powerful impact on our health-related beliefs, attitudes, and behaviors. The human papillomavirus (HPV) vaccine can protect against human papillomavirus, which leads to different types of cancers. However, HPV vaccination rates are low. This study explored the effectiveness of a narrative-based interactive game about the HPV vaccines as a method to communicate knowledge and perhaps create behavioral outcomes. We developed a serious storytelling game called Vaccination Vacation inspired by personal narratives of individuals who were impacted by HPV. We tested the game using a randomized control study of 99 adult participants and compared the HPV knowledge and vaccine beliefs of the Gamer Group (who played the game, n = 44) and the Reader Group (who read a vaccine information sheet, n = 55). We also evaluated the usability of the game. In addition to high usability, the interactive game slightly impacted the beliefs about the HPV vaccine over standard delivery of vaccine information, especially among those who never received the HPV vaccine. We also observed some gender-based differences in perception towards usability and the likelihood of frequently playing the game. A narrative-based game could bring positive changes to players' HPV-related health beliefs. The combination of more comprehensive HPV vaccine information with the narratives may produce a larger impact. Narrative-based games can be effectively used in other vaccine education interventions and warrant future research.
Background: Dyadic-based social network analyses have been effective in a variety of behavioral- and health-related research areas. We introduce an ontology-driven approach towards social network analysis through encoding social data and inferring new information from the data. Methods: The Friend of a Friend (FOAF) ontology is a lightweight social network ontology. We enriched FOAF by deriving social interaction data and relationships from social data to extend its domain scope. Results: Our effort produced the Friend of a Friend with Benefits (FOAF+) ontology that aims to support the spectrum of human interaction. A preliminary semiotic evaluation revealed a semantically rich and comprehensive knowledge base to represent complex social network relationships. With Semantic Web Rule Language, we demonstrated FOAF+'s potential to infer social network ties between individual data. Conclusion: Using logical rules, we defined interpersonal dyadic social connections, which can create inferred li...
Background: Fast food, with its abundance and availability to consumers, may have health consequences due to high calorie intake, which is a major contributor to life-threatening diseases. Providing nutritional information has some impact on consumer decisions to self-regulate and promote healthier diets, and thus, government regulations have mandated the publishing of nutritional content to assist consumers, including for fast food. However, fast food nutritional information is fragmented, and we see a benefit in collating nutritional data to synthesize knowledge for individuals. Methods: We developed the ontology of fast food facts as an opportunity to standardize knowledge of fast food and link nutritional data that could be analyzed and aggregated for the information needs of consumers and experts. The ontology is based on metadata from 21 fast food establishment nutritional resources and authored in OWL2 using Protégé. Results: Three evaluators reviewed the logical structure of...
Background: Social media platforms such as YouTube are hotbeds for the spread of misinformation about vaccines. Objective: The aim of this study was to explore how individuals are exposed to antivaccine misinformation on YouTube based on whether they start their viewing from a keyword-based search or from antivaccine seed videos. Methods: Four networks of videos based on YouTube recommendations were collected in November 2019. Two search networks were created from provaccine and antivaccine keywords to resemble goal-oriented browsing. Two seed networks were constructed from conspiracy and antivaccine expert seed videos to resemble direct navigation. Video contents and network structures were analyzed using the network exposure model. Results: Viewers are more likely to encounter antivaccine videos through direct navigation starting from an antivaccine video than through goal-oriented browsing. In the two seed networks, provaccine videos, antivaccine videos, and videos containing health ...
With the proliferation of heterogeneous health care data in the last three decades, biomedical ontologies and controlled biomedical terminologies play an increasingly important role in knowledge representation and management, data integration, natural language processing, as well as decision support for health information systems and biomedical research. Biomedical ontologies and controlled terminologies are intended to assure interoperability. Nevertheless, the quality of biomedical ontologies has hindered their applicability and subsequent adoption in real-world applications. Ontology evaluation is an integral part of ontology development and maintenance. In the biomedicine domain, ontology evaluation is often conducted by third parties as a quality assurance (or auditing) effort that focuses on identifying modeling errors and inconsistencies. In this work, we first organized four categorical schemes of ontology evaluation methods in the existing literature to create an integrated...
BMC medical informatics and decision making, Jan 5, 2017
Knowledge engineering for ontological knowledgebases is resource and time intensive. To alleviate these issues, especially for novices, automated tools from the natural language domain can assist in the development process of ontologies. We focus on the development of ontologies for the public health domain and use patient-centric sources from MedlinePlus related to HPV-causing cancers. This paper demonstrates the use of a lightweight open information extraction (OIE) tool to derive accurate knowledge triples that can lead to the seeding of an ontological knowledgebase. We developed a custom application, which interfaced with an information extraction software library, to help facilitate the tasks towards producing knowledge triples from textual sources. The results of our efforts generated accurate extractions ranging from 80-89% precision. These triples can later be transformed to OWL/RDF representation for our planned ontological knowledgebase. OIE delivers an effective and ...
BMC medical informatics and decision making, Jan 5, 2017
As one of the serious public health issues, vaccination refusal has been attracting more and more attention, especially for newly approved human papillomavirus (HPV) vaccines. Understanding public opinion towards HPV vaccines, especially concerns on social media, is of significant importance for HPV vaccination promotion. In this study, we leveraged a hierarchical machine learning based sentiment analysis system to extract public opinions towards HPV vaccines from Twitter. English tweets containing HPV vaccine-related keywords were collected from November 2, 2015 to March 28, 2016. Manual annotation was done to evaluate the performance of the system on the unannotated tweets corpus. Time series analysis was then applied to this corpus to track the trends of machine-deduced sentiments and their associations with different days of the week. The evaluation of the unannotated tweets corpus showed that the micro-averaging F scores have reached 0.786. The learning system deduced the ...
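For readers unfamiliar with the metric reported in this abstract, the sketch below shows how a micro-averaged F score can be computed: true positives, false positives, and false negatives are pooled over all classes before precision and recall are taken. It assumes each tweet carries exactly one gold label and one predicted label, which the abstract does not state; the function name micro_f1 and the sample labels are made up for the example.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1, assuming one gold and one predicted label per item."""
    labels = set(gold) | set(predicted)
    tp = fp = fn = 0
    for label in labels:
        for g, p in zip(gold, predicted):
            if p == label and g == label:
                tp += 1  # predicted this class and it was correct
            elif p == label and g != label:
                fp += 1  # predicted this class but the gold label differs
            elif p != label and g == label:
                fn += 1  # missed an item of this class
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentiment labels for four tweets.
gold = ["positive", "negative", "neutral", "positive"]
predicted = ["positive", "negative", "positive", "negative"]
score = micro_f1(gold, predicted)
```

Under the single-label assumption, the pooled counts make micro-averaged F1 equal to plain accuracy, so frequent classes weigh more heavily than rare ones.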
Analysing public opinions on HPV vaccines on social media using machine learning based approaches will help us understand the reasons behind the low vaccine coverage and come up with corresponding strategies to improve vaccine uptake. Our objective is to propose a machine learning system that is able to extract comprehensive public sentiment on HPV vaccines on Twitter with satisfying performance. We collected and manually annotated 6,000 HPV vaccine-related tweets as a gold standard. An SVM model was chosen and a hierarchical classification method was proposed and evaluated. Additional feature set evaluation and model parameter optimization were done to maximize the machine learning model performance. A hierarchical classification scheme that contains 10 categories was built to assess public opinions toward HPV vaccines comprehensively. A gold corpus of 6,000 annotated tweets with Kappa annotation agreement at 0.851 was created and made publicly available. The hierarchical classification model with optimi...
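The Kappa agreement figure in this abstract corrects raw annotator agreement for agreement expected by chance. The sketch below computes Cohen's kappa for two annotators; the abstract does not say which kappa variant or how many annotators were used, so both are assumptions, as are the function name cohen_kappa and the sample labels.

```python
from collections import Counter


def cohen_kappa(annotator_a, annotator_b):
    """Cohen's kappa for two annotators labeling the same items."""
    n = len(annotator_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(annotator_a, annotator_b)) / n
    counts_a, counts_b = Counter(annotator_a), Counter(annotator_b)
    # Chance agreement: product of each annotator's label proportions, summed over labels.
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(annotator_a) | set(annotator_b)
    ) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators on four tweets.
first = ["positive", "positive", "negative", "negative"]
second = ["positive", "positive", "negative", "positive"]
kappa = cohen_kappa(first, second)
```

In this toy case the annotators agree on three of four items (0.75), chance agreement is 0.5, and kappa is therefore 0.5.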
Vaccines have been one of the most successful public health interventions to date. The use of vaccination, however, also comes with possible adverse events. The U.S. FDA/CDC Vaccine Adverse Event Reporting System (VAERS) currently contains more than 200,000 reports for post-vaccination events that occur after the administration of vaccines licensed in the United States. Although the data from VAERS has been applied to many public health and vaccine safety studies, an individual report does not necessarily indicate a causal relationship between the vaccine and the reported symptoms. Further statistical analysis and summarization need to be done before this data can be leveraged. In this paper, we introduce our preliminary work on summarizing the VAERS data and representing the vaccine-symptom correlations, as well as the metadata of their relations, using RDF. We then apply network analysis approaches to the RDF data to illustrate a use case of the data. We further discuss our vision of integrating the data with vaccine information from other sources using an RDF linked data approach to facilitate more comprehensive analyses.
Events in clinical narratives are naturally associated with medical trials such as surgery, vacci... more Events in clinical narratives are naturally associated with medical trials such as surgery, vaccination, lab test, medication, medical procedure, diagnosis, and they are interrelated with many temporal relations, however it is difficult to define these events quantitatively or consistently in coarse time-bins (e.g. before vaccination, after admission). The grouping of medical events onto temporal clusters is a key to applications such as longitudinal studies, clinical question answering, and information retrieval. In this paper, we developed two algorithms based on Min-conflicts and K-means to enable labeling a sequence of medical events with predefined time-bins. The computation is based solely on temporal similarity and integrated with a timeline visualization tool.
A major obstacle impeding progress on the “web of data” is content creation—a difficult, tedious,... more A major obstacle impeding progress on the “web of data” is content creation—a difficult, tedious, and timeconsuming task. How do we make human-scalable, user-friendly tools to enable the web of data? Content integrity is also a major concern. How do we engender confidence in results returned from the web of data? Although seemingly unrelated, we show in this paper that it is exactly their relationship that is the key to solving both problems. As we show in this paper, we can semi-automatically derive both data and metadata from data-rich web pages to create a web of data that we then superimpose over these data-rich web pages. We link the web of data to the current web of pages, resulting in a higher-order “web of knowledge.” This web of knowledge provides provenance and thus engenders the confidence necessary to raise the level of the web from “data” to “knowledge.” We focus mainly on two prototype tools we have implemented: (1) TISP—a tool to automatically generate ontologies for ...
To date, there are no effective treatments for most neurodegenerative diseases. However, certain ... more To date, there are no effective treatments for most neurodegenerative diseases. However, certain foods may be associated with these diseases and bring an opportunity to prevent or delay neurodegenerative progression. Our objective is to construct a knowledge graph for neurodegenerative diseases using literature mining to study their relations with diet. We collected biomedical annotations (Disease, Chemical, Gene, Species, SNP&Mutation) in the abstracts from 4,300 publications relevant to both neurodegenerative diseases and diet using PubTator, an NIH-supported tool that can extract biomedical concepts from literature. A knowledge graph was created from these annotations. Graph embeddings were then trained with the node2vec algorithm to support potential concept clustering and similar concept identification. We found several food-related species and chemicals that might come from diet and have an impact on neurodegenerative diseases. 1 Scientific Background Neurodegenerative disease...
There are huge and growing amounts of biological data that reside in different online repositorie... more There are huge and growing amounts of biological data that reside in different online repositories. Most of these Web-based sources only focus on some specific areas or only allow limited types of user queries. To obtain needed information, biologists usually have to traverse different Web sources and combine their data manually. In this research, we propose a system that can help users to overcome these difficulties. Given a user’s query within the the area of molecular biology, our system can automatically discover appropriate repositories, retrieve useful information from these repositories and integrate the retrieved information together.
AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, 2020
The human papillomavirus (HPV) vaccine is the most effective way to prevent HPV-related cancers. ... more The human papillomavirus (HPV) vaccine is the most effective way to prevent HPV-related cancers. Integrating provider vaccine counseling is crucial to improving HPV vaccine completion rates. Automating the counseling experience through a conversational agent could help improve HPV vaccine coverage and reduce the burden of vaccine counseling for providers. In a previous study, we tested a simulated conversational agent that provided HPV vaccine counseling for parents using the Wizard of OZ protocol. In the current study, we assessed the conversational agent among young college adults (n=24), a population that may have missed the HPV vaccine during their adolescence when vaccination is recommended. We also administered surveys for system and voice usability, and for health beliefs concerning the HPV vaccine. Participants perceived the agent to have high usability that is slightly better or equivalent to other voice interactive interfaces, and there is some evidence that the agent impa...
In this study, we introduce an ontology-driven software engine to provide dialogue interaction fu... more In this study, we introduce an ontology-driven software engine to provide dialogue interaction functionality for a conversational agent for HPV vaccine counseling. Currently, the HPV vaccination rates are low that risks unprotected individuals at being infected with HPV, a virus that leads to life-threatening cancers. In addition, we developed a question answering subsystem to support the dialogue engine. In this paper, we discuss our design and development of an ontology-driven dialogue engine that uses the Patient Health Information Dialogue Ontology, an ontology that we previously developed, and a question answering subsystem based on various previous methods to supplement the dialogue engine’s interaction with the user. Our next step is to test the functional ability of the ontology-driven software components and deploy the engine in a live environment to be integrated with a speech interface.
The informed consent process is a complicated procedure involving permissions as well a variety o... more The informed consent process is a complicated procedure involving permissions as well a variety of entities and actions. In this paper, we discuss the use of Semantic Web Rule Language (SWRL) to further extend the Informed Consent Ontology (ICO) to allow for semantic machine-based reasoning to manage and generate important permission-based information that can later be viewed by stakeholders. We present four use cases of permissions from the All of Us informed consent document and translate these permissions into SWRL expressions to extend and operationalize ICO. Our efforts show how SWRL is able to infer some of the implicit information based on the defined rules, and demonstrate the utility of ICO through the use of SWRL extensions. Future work will include developing formal and generalized rules and expressing permissions from the entire document, as well as working towards integrating ICO into software systems to enhance the semantic representation of informed consent for biomed...
HCI International 2021 - Late Breaking Papers: HCI Applications in Health, Transport, and Industry, 2021
Narratives can have a powerful impact on our health-related beliefs, attitudes, and behaviors. Th... more Narratives can have a powerful impact on our health-related beliefs, attitudes, and behaviors. The human papillomavirus (HPV) vaccine can protect against human papillomavirus that leads to different types of cancers. However, HPV vaccination rates are low. This study explored the effectiveness of a narrative-based interactive game about the HPV vaccines as a method to communicate knowledge and perhaps create behavioral outcomes. We developed a serious storytelling game called Vaccination Vacation inspired by personal narratives of individuals who were impacted by the HPV. We tested the game using a randomized control study of 99 adult participants and compared the HPV knowledge and vaccine beliefs of the Gamer Group (who played the game, n = 44) and the Reader group (who read a vaccine information sheet, n = 55). We also evaluated the usability of the game. In addition to high usability, the interactive game slightly impacted the beliefs about the HPV vaccine over standard delivery of vaccine information, especially among those who never received the HPV vaccine. We also observed some gender-based differences in perception towards usability and the likelihood of frequently playing the game. A narrative-based game could bring positive changes to players' HPV-related health beliefs. The combination of more comprehensive HPV vaccine information with the narratives may produce a larger impact. Narrative-based games can be effectively used in other vaccine education interventions and warrant future research.
Background Dyadic-based social networks analyses have been effective in a variety of behavioral- ... more Background Dyadic-based social networks analyses have been effective in a variety of behavioral- and health-related research areas. We introduce an ontology-driven approach towards social network analysis through encoding social data and inferring new information from the data. Methods The Friend of a Friend (FOAF) ontology is a lightweight social network ontology. We enriched FOAF by deriving social interaction data and relationships from social data to extend its domain scope. Results Our effort produced Friend of a Friend with Benefits (FOAF+) ontology that aims to support the spectrum of human interaction. A preliminary semiotic evaluation revealed a semantically rich and comprehensive knowledge base to represent complex social network relationships. With Semantic Web Rules Language, we demonstrated FOAF+ potential to infer social network ties between individual data. Conclusion Using logical rules, we defined interpersonal dyadic social connections, which can create inferred li...
Background Fast food with its abundance and availability to consumers may have health consequence... more Background Fast food with its abundance and availability to consumers may have health consequences due to the high calorie intake which is a major contributor to life threatening diseases. Providing nutritional information has some impact on consumer decisions to self regulate and promote healthier diets, and thus, government regulations have mandated the publishing of nutritional content to assist consumers, including for fast food. However, fast food nutritional information is fragmented, and we realize a benefit to collate nutritional data to synthesize knowledge for individuals. Methods We developed the ontology of fast food facts as an opportunity to standardize knowledge of fast food and link nutritional data that could be analyzed and aggregated for the information needs of consumers and experts. The ontology is based on metadata from 21 fast food establishment nutritional resources and authored in OWL2 using Protégé. Results Three evaluators reviewed the logical structure of...
Background Social media platforms such as YouTube are hotbeds for the spread of misinformation ab... more Background Social media platforms such as YouTube are hotbeds for the spread of misinformation about vaccines. Objective The aim of this study was to explore how individuals are exposed to antivaccine misinformation on YouTube based on whether they start their viewing from a keyword-based search or from antivaccine seed videos. Methods Four networks of videos based on YouTube recommendations were collected in November 2019. Two search networks were created from provaccine and antivaccine keywords to resemble goal-oriented browsing. Two seed networks were constructed from conspiracy and antivaccine expert seed videos to resemble direct navigation. Video contents and network structures were analyzed using the network exposure model. Results Viewers are more likely to encounter antivaccine videos through direct navigation starting from an antivaccine video than through goal-oriented browsing. In the two seed networks, provaccine videos, antivaccine videos, and videos containing health ...
With the proliferation of heterogeneous health care data in the last three decades, biomedical on... more With the proliferation of heterogeneous health care data in the last three decades, biomedical ontologies and controlled biomedical terminologies play a more and more important role in knowledge representation and management, data integration, natural language processing, as well as decision support for health information systems and biomedical research. Biomedical ontologies and controlled terminologies are intended to assure interoperability. Nevertheless, the quality of biomedical ontologies has hindered their applicability and subsequent adoption in real-world applications. Ontology evaluation is an integral part of ontology development and maintenance. In the biomedicine domain, ontology evaluation is often conducted by third parties as a quality assurance (or auditing) effort that focuses on identifying modeling errors and inconsistencies. In this work, we first organized four categorical schemes of ontology evaluation methods in the existing literature to create an integrated...
BMC medical informatics and decision making, Jan 5, 2017
Knowledge engineering for ontological knowledgebases is resource and time intensive. To alleviate... more Knowledge engineering for ontological knowledgebases is resource and time intensive. To alleviate these issues, especially for novices, automated tools from the natural language domain can assist in the development process of ontologies. We focus towards the development of ontologies for the public health domain and use patient-centric sources from MedlinePlus related to HPV-causing cancers. This paper demonstrates the use of a lightweight open information extraction (OIE) tool to derive accurate knowledge triples that can lead to the seeding of an ontological knowledgebase. We developed a custom application, which interfaced with an information extraction software library, to help facilitate the tasks towards producing knowledge triples from textual sources. The results of our efforts generated accurate extractions ranging from 80-89% precision. These triples can later be transformed to OWL/RDF representation for our planned ontological knowledgebase. OIE delivers an effective and ...
BMC medical informatics and decision making, Jan 5, 2017
As a serious public health issue, vaccination refusal has been attracting more and more attention, especially for newly approved human papillomavirus (HPV) vaccines. Understanding public opinion towards HPV vaccines, especially concerns expressed on social media, is of significant importance for HPV vaccination promotion. In this study, we leveraged a hierarchical machine learning based sentiment analysis system to extract public opinions towards HPV vaccines from Twitter. English tweets containing HPV vaccine-related keywords were collected from November 2, 2015 to March 28, 2016. Manual annotation was done to evaluate the performance of the system on the unannotated tweets corpus. Time series analysis was then applied to this corpus to track the trends of machine-deduced sentiments and their associations with different days of the week. The evaluation on the unannotated tweets corpus showed that the micro-averaged F score reached 0.786. The learning system deduced the ...
Analysing public opinions on HPV vaccines on social media using machine learning based approaches will help us understand the reasons behind the low vaccine coverage and come up with corresponding strategies to improve vaccine uptake. Our objective is to propose a machine learning system that can extract comprehensive public sentiment on HPV vaccines on Twitter with satisfactory performance. We collected and manually annotated 6,000 HPV vaccine-related tweets as a gold standard. An SVM model was chosen, and a hierarchical classification method was proposed and evaluated. Additional feature set evaluation and model parameter optimization were performed to maximize the machine learning model performance. A hierarchical classification scheme that contains 10 categories was built to assess public opinions toward HPV vaccines comprehensively. A gold corpus of 6,000 annotated tweets with a Kappa annotation agreement of 0.851 was created and made publicly available. The hierarchical classification model with optimi...
Vaccines have been one of the most successful public health interventions to date. The use of vaccination, however, also comes with possible adverse events. The U.S. FDA/CDC Vaccine Adverse Event Reporting System (VAERS) currently contains more than 200,000 reports of post-vaccination events that occur after the administration of vaccines licensed in the United States. Although the data from VAERS has been applied to many public health and vaccine safety studies, an individual report does not necessarily indicate a causal relationship between the vaccine and the reported symptoms. Further statistical analysis and summarization need to be done before this data can be leveraged. In this paper, we introduce our preliminary work on summarizing the VAERS data and representing the vaccine-symptom correlations, as well as the metadata of their relations, using RDF. We then apply network analysis approaches to the RDF data to illustrate a use case of the data. We further discuss our vision of integrating the data with vaccine information from other sources using an RDF linked data approach to facilitate more comprehensive analyses.
Papers by Cui Tao