Shaina Raza

    Recommending points of interest (POI) is a challenging task that requires extracting comprehensive location data from location-based social media platforms. To provide effective location-based recommendations, it is important to analyze users’ historical behavior and preferences. In this study, we present a location-aware recommendation system that uses Bidirectional Encoder Representations from Transformers (BERT) to offer personalized location-based suggestions. Our model combines location information and user preferences to provide more relevant recommendations than models that predict the next POI in a sequence. In experiments on two benchmark datasets, our BERT-based model surpasses the baseline models in hit ratio (HR) by a margin of 6% over the second-best performing baseline. Furthermore, our model shows a gain of 1–2% in NDCG over the second-best baseline. These results indi...
    Background Substance use, including the non-medical use of prescription medications, is a global health problem resulting in hundreds of thousands of overdose deaths and other health problems. Social media has emerged as a potent source of information for studying substance use-related behaviours and their consequences. Mining large-scale social media data on the topic requires the development of natural language processing (NLP) and machine learning frameworks customized for this problem. Our objective in this research is to develop a framework for conducting a content analysis of Twitter chatter about the non-medical use of a set of prescription medications. Methods We collected Twitter data for four medications—fentanyl and morphine (opioids), alprazolam (benzodiazepine), and Adderall® (stimulant), and identified posts that indicated non-medical use using an automatic machine learning classifier. In our NLP framework, we applied supervised named entity recognition (NER) to identi...
    Clinical decision-making is a challenging and time-consuming task that involves integrating a vast amount of patient data, including medical history, test results, and notes from clinicians. To assist this process, clinical recommender systems have been developed to provide personalized recommendations to healthcare practitioners. However, creating effective clinical recommender systems is complex due to the diversity and intricacy of clinical data and the need for customized recommendations. In this paper, we propose a two-stage recommender framework for clinical decision-making based on the publicly available MIMIC dataset of electronic health records. The first stage of the framework employs a deep neural network-based model to retrieve a set of candidate items, such as diagnosis, medication, and prescriptions, from the patient’s electronic health records. The model is trained to extract relevant information from clinical notes using a pre-trained language model. The second stage o...
    The ability to extract critical information about an infectious disease in a timely manner is critical for population health research. The lack of procedures for mining large amounts of health data is a major impediment. The goal of this research is to use natural language processing (NLP) to extract key information (clinical factors, social determinants of health) from free text. The proposed framework describes database construction, NLP modules for locating clinical and non-clinical (social determinants) information, and a detailed evaluation protocol for evaluating results and demonstrating the effectiveness of the proposed framework. The use of COVID-19 case reports is demonstrated for data construction and pandemic surveillance. The proposed approach outperforms benchmark methods in F1-score by about 1–3%. A thorough examination reveals the disease’s presence as well as the frequency of symptoms in patients. The findings suggest that prior knowledge gained through transfer lea...
    Background Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data. Objective This study aims to use natural language processing (NLP) to extract the key information (clinical factors, social determinants of health) from published cases in the literature. Methods The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic-named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports. Results The named entity recognition implementation in the NLP layer achieves a performance gain of about 1–3% compared to benchmark methods. Furthermore, even without extensive data labeling, the relation extractio...
    Background Social determinants of health (SDOH) are non-medical factors that influence health outcomes. There is a wealth of SDOH information available via electronic health records, clinical reports, and social media, usually in free-text format, which poses a significant challenge and necessitates the use of natural language processing (NLP) techniques to extract key information. Objective The objective of this research is to advance the automatic extraction of SDOH from clinical texts. Setting and Data The case reports of COVID-19 patients from the published literature are curated to create a corpus. A portion of the data is annotated by experts to create gold labels, and active learning is used for corpus re-annotation. Methods A named entity recognition (NER) framework is developed and tested to extract SDOH along with a few prominent clinical entities (diseases, treatments, diagnosis) from the free texts. The proposed model consists of three deep neural networks – A Transformer-base...
    Background Despite significant advancements in biomedical named entity recognition methods, the clinical application of these systems continues to face many challenges: (1) most of the methods are trained on a limited set of clinical entities; (2) these methods are heavily reliant on a large amount of data for both pretraining and prediction, making their use in production impractical; (3) they do not consider non-clinical entities that are also related to a patient’s health, such as social, economic or demographic factors. Methods In this paper, we develop Bio-Epidemiology-NER (https://pypi.org/project/Bio-Epidemiology-NER/), an open-source Python package for detecting biomedical named entities in text. The approach is Transformer-based and trained on a dataset that is annotated with many named entities (medical, clinical, biomedical and epidemiological). This approach improves on previous efforts in three ways: (1) it recognizes many clinical entity types, suc...
    The clinical application of detecting COVID-19 factors is a challenging task. The existing named entity recognition models are usually trained on a limited set of named entities. Besides clinical factors, non-clinical factors such as social determinants of health (SDoH) are also important for studying the infectious disease. In this paper, we propose a generalizable machine learning approach that improves on previous efforts by recognizing a large number of clinical risk factors and SDoH. The novelty of the proposed method lies in the combination of a number of deep neural networks, including the BiLSTM-CNN-CRF method and a transformer-based embedding layer. Experimental results on a cohort of COVID-19 data prepared from PubMed articles show the superiority of the proposed approach. When compared to other methods, the proposed approach achieves a performance gain of about 1–5% in terms of macro- and micro-average F1 scores. Clinical practitioners and researchers can use this approac...
    Background: The management of hyperglycemia in hospitalized patients has a significant impact on both morbidity and mortality. This study used a large clinical database to predict the need for diabetic patients to be hospitalized, which could lead to improvements in patient safety. These predictions, however, may be vulnerable to health disparities caused by social determinants such as race, age, and gender. These biases must be removed early in the data collection process, before they enter the system and are reinforced by model predictions, resulting in biases in the model’s decisions. In this paper, we propose a machine learning pipeline capable of making predictions as well as detecting and mitigating biases. This pipeline analyses clinical data, determines whether biases exist, removes them, and then makes predictions. We demonstrate the classification accuracy and fairness in model predictions using experiments. Results: The results show that when we mitigate biases early in a m...
    Background Due to the growing amount of COVID-19 research literature, medical experts, clinical scientists, and researchers frequently struggle to stay up to date on the most recent findings. There is a pressing need to assist researchers and practitioners in mining and responding to COVID-19-related questions on time. Methods This paper introduces CoQUAD, a question-answering system that can efficiently extract answers to COVID-19-related questions. Two datasets are provided in this work: a reference-standard dataset built using the CORD-19 and LitCOVID initiatives, and a gold-standard dataset prepared by experts from the public health domain. CoQUAD has a Retriever component trained on the BM25 algorithm that searches the reference-standard dataset for relevant documents based on a question related to COVID-19. CoQUAD also has a Reader component that consists of a Transformer-based model, namely MPNet, which is used to read the paragraphs and find the answer...
    This research presents a review of the main datasets developed for COVID-19 research. We hope this collection will continue to bring together members of the computing community, biomedical experts, and policymakers in the pursuit of effective COVID-19 treatments and management policies. Many organizations worldwide, such as the World Health Organization (WHO), Johns Hopkins, the National Institutes of Health (NIH), and the COVID-19 open science table, have made numerous datasets available to the public. However, these datasets originate from a variety of different sources and initiatives. The purpose of this research is to summarize the open COVID-19 datasets to make them more accessible to the research community for health systems design and analysis.
    News recommender systems face unique challenges because readers’ interests change rapidly over time. Some of a reader’s interests are long-term and some are short-term, and a news recommender system needs to address both. Diversification is also required in a news recommender system to keep readers engaged in the reading process and expose them to various viewpoints. We propose a deep neural network for the news recommendation problem that learns multi-faceted news representations from the news content. The proposed model also learns the reader’s long-term interests from the whole click history and the short-term ones from the click history using LSTMs. The attention mechanism is used to learn a reader’s diversified interests. We give different levels of attention to the news and reader components. Experiments on two news datasets have shown the superiority of our proposed method compared to state-of-the-art methods.
    News recommender systems are usually designed to provide accurate and personalized recommendations to the readers. The diversity of the recommended results has received much less attention in this field. When it is considered, the current state-of-the-art models often apply the re-ranking mechanisms to promote the diversified results to the individual users. In this work, we propose a latent factor model to achieve the requisite level of accuracy while maintaining a reasonable level of diversity in a news recommender system. The existing latent factor methods mostly rely on Tikhonov regularization to improve the generality of the learnt models. These methods tend to focus mainly on accuracy measures, i.e., generating recommendations highly aligned with a user's past preference, which may cause a decrease in the diversity of information to which news readers are exposed. In our work, we make effective use of elastic-net regression to regularize the model for both the accuracy and the diversity in a single optimization framework. We demonstrate the effectiveness of our model over the state-of-the-art methods by conducting extensive experiments on a real-world news dataset.
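The contrast drawn above between Tikhonov (L2) regularization and elastic-net regularization in a latent factor model can be illustrated with a minimal sketch. It assumes a plain matrix factorization with user factors `P` and item factors `Q`; the function names, the `lam` and `l1_ratio` hyperparameters, and the exact form of the objective are illustrative assumptions, not the formulation used in the paper.

```python
import numpy as np

def tikhonov_penalty(P, Q, lam):
    """L2 (Tikhonov) penalty on the latent factors."""
    return lam * (np.sum(P ** 2) + np.sum(Q ** 2))

def elastic_net_penalty(P, Q, lam, l1_ratio):
    """Elastic-net penalty: a weighted mix of L1 and L2 terms.

    l1_ratio = 0 reduces to the Tikhonov penalty; l1_ratio = 1 is pure L1.
    """
    l1 = np.sum(np.abs(P)) + np.sum(np.abs(Q))
    l2 = np.sum(P ** 2) + np.sum(Q ** 2)
    return lam * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)

def regularized_loss(R, mask, P, Q, lam, l1_ratio):
    """Squared error on observed entries (mask == 1) plus the elastic-net penalty."""
    err = mask * (R - P @ Q.T)
    return np.sum(err ** 2) + elastic_net_penalty(P, Q, lam, l1_ratio)
```

The L1 term tends to push some factor entries to exactly zero while the L2 term shrinks the rest, which is the general motivation for mixing the two; how this trade-off relates to diversity is specific to the paper and is not reproduced here.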
    In the past, news recommender systems have been built to recommend lists of news items similar to those that a user has accessed before (content-based), or similar to those that have been read by similar users (collaborative filtering). However, the highly volatile nature of news content and the dynamic, evolving preferences of users are either ignored or not taken into full consideration in these systems. In a news recommender system, a user’s short-term interests may change suddenly due to an emerging social or personal event or breaking news, while their long-term interests may change gradually or remain stable. It is often more appropriate to associate these long-term interests with news categories than with individual news items. In this paper, we propose a biased matrix factorization model that considers both the temporal dynamics of user preferences and the news taxonomy to build a news recommender system. By conducting an extensive experiment on a collection of news data, we demonstrate the effectiveness of our proposed model against traditional matrix factorization models as well as other neural recommender baselines. The findings from our experiments show that news category is an important factor when readers choose news articles to read, and that temporal factors, considered at different temporal resolutions, also play a role in this process.
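The general idea of a biased matrix factorization that accounts for news category and time can be sketched as below. This is a minimal illustration under stated assumptions: the prediction is taken to be a global mean plus user, item, category, and user-by-time-bin bias terms plus a latent factor dot product. The specific bias terms, the `item_cat` lookup, and the time binning are hypothetical and are not taken from the paper.

```python
import numpy as np

def predict(mu, b_user, b_item, b_cat, b_user_time, P, Q, item_cat, u, i, t_bin):
    """Predicted preference of user u for news item i in time bin t_bin.

    mu          : global mean score
    b_user      : per-user bias, shape (n_users,)
    b_item      : per-item bias, shape (n_items,)
    b_cat       : per-category bias, shape (n_categories,)  (hypothetical term)
    b_user_time : per-user, per-time-bin bias, shape (n_users, n_bins)  (hypothetical term)
    P, Q        : user and item latent factors
    item_cat    : category index of each item, shape (n_items,)
    """
    baseline = (mu + b_user[u] + b_item[i]
                + b_cat[item_cat[i]] + b_user_time[u, t_bin])
    return baseline + P[u] @ Q[i]
```

Using coarser or finer time bins for `b_user_time` is one simple way to represent different temporal resolutions in such a model.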
    In a news recommender system, a reader’s preferences change over time. Some preferences drift quite abruptly (short-term preferences), while others change over a longer period of time (long-term preferences). Although the existing news recommender systems consider the reader’s full history, they often ignore the dynamics in the reader’s behavior. Thus, they cannot meet the demand of the news readers for their time-varying preferences. In addition, the state-of-the-art news recommendation models are often focused on providing accurate predictions, which can work well in traditional recommendation scenarios. However, in a news recommender system, diversity is essential, not only to keep news readers engaged, but also to play a key role in a democratic society. In this PhD dissertation, our goal is to build a news recommender system to address these two challenges. Our system should be able to: (i) accommodate the dynamics in reader behavior; and (ii) consider both accuracy and diversit...
    Nowadays, more and more news readers tend to read news online where they have access to millions of news articles from multiple sources. In order to help users to find the right and relevant content, news recommender systems (NRS) are developed to relieve the information overload problem and suggest news items that users might be interested in. In this paper, we highlight the major challenges faced by the news recommendation domain and identify the possible solutions from the state-of-the-art. Due to the rapid growth of building recommender systems using deep learning models, we divide our discussion into two parts. In the first part, we present an overview of the conventional recommendation solutions, datasets, evaluation criteria beyond accuracy and recommendation platforms being used in NRS. In the second part, we explain deep learning-based recommendation solutions applied in NRS. Different from previous surveys, we also study the effects of news recommendations on user behav...
    News recommender systems are marked by a few unique challenges specific to the news domain. These challenges emerge from rapidly evolving readers’ interests over dynamically generated news items that continuously change over time. News reading is also driven by a blend of a reader’s long-term and short-term interests. In addition, diversity is required in a news recommender system, not only to keep the reader engaged in the reading process but also to expose them to different views and opinions. In this paper, we propose a deep neural network that jointly learns informative news and readers’ interests in a unified framework. We learn the news representation (features) from the headlines, snippets (body) and taxonomy (category, subcategory) of news. We learn a reader’s long-term interests from the reader’s click history, short-term interests from the recent clicks via LSTMs and the diversified reader’s interests through the attention mechanism. We also apply different levels o...
    The web is full of numerous educational resources but they are not being properly used by the educators. There is so much pedagogical content available on the open web that is being ignored. A lot of learning initiatives stepped in to propose recommendations and guidelines to ensure interoperability of digital content. This has led to the development of learning objects repository (LOR) whose goals are interoperability, reuse, sharing, and retrieval of learning content. However, at the same time, the reproduction of learning material should not breach the copyright protection of the right holders as it is an act of cybercrime. In the lifecycle of LOR development, learning objects (LOs) are annotated using metadata descriptors to specify their syntax and semantics. This annotation process has led to the development of learning objects metadata (LOM) whose ultimate goals are to make searching and cataloging of LOs an easier task. LOM standard includes a number of sections, one of whic...
    The problem of fairness is garnering a lot of interest in the academic and broader literature due to the increasing use of data-centric systems and algorithms in machine learning. This paper introduces Dbias (https://pypi.org/project/Dbias/), an open-source Python package for ensuring fairness in news articles. Dbias can take any text and determine whether it is biased. It then detects biased words in the text, masks them, and suggests a set of sentences with new words that are bias-free or at least less biased. We conduct extensive experiments to assess the performance of Dbias. To see how well our approach works, we compare it to existing fairness models. We also test the individual components of Dbias to see how effective they are. The experimental results show that Dbias outperforms all the baselines in terms of accuracy and fairness. We make this package (Dbias) publicly available for developers and practitioners to mitigate biases in textual data (such as news articles)...