QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians - Neo4j
This talk will unveil QIAGEN’s Biomedical Knowledge Base products, elucidating their structure and schema design optimized for complex data exploration and sophisticated question-answering in the biomedical sector.
Data Harmonization for a Molecularly Driven Health System - Warren Kibbe
Seminar for Dr. Min Zhang's Purdue Bioinformatics Seminar Series. Touched on learning health systems, the Gen3 Data Commons, the NCI Genomic Data Commons, Data Harmonization, FAIR, and open science.
Focus on the Evidence: a knowledge graph approach to profiling drug targets - Nolan Nichols
Presented at D4 2020
The presentation focuses on the application of a knowledge graph approach to profiling drug targets. The speaker, Nolan Nichols, highlights the potential of genetic modifiers in developing transformative therapies for patients suffering from diseases such as spinal muscular atrophy (SMA). The company, Maze Therapeutics, uses advanced data science and collaboration with AWS healthcare and life sciences teams to access and analyze meaningful human genetics data and to build a cloud-based data architecture. The presentation also covers the use of semantic technologies in drug discovery, target discovery, and target validation, and the integration of proprietary and shared data through the knowledge graph. It concludes with a summary of the company's 2019 launch with a $190 million investment and its focus on translating genetic-modifier insights into new therapeutics.
Enabling Discovery in High-Risk Plaque using Semantic Web Approaches - Tom Plasterer
The High-Risk Plaque (HRP) initiative is a joint research and development effort to advance the understanding, recognition, and management of high-risk plaque for the benefit of multiple stakeholders in the healthcare system. As the primary underlying cause of heart attacks, high-risk (or vulnerable) plaque is the number one cause of death in the Western world. There are currently no established methods for screening, diagnosing, or treating high-risk plaque.
The HRP initiative leverages recent advances in biology and information technology to design and optimize a care cycle for high-risk plaque, promising to reduce the morbidity, mortality, and cost associated with cardiovascular disease. The initiative is led by the world’s foremost scientists in the fields of cardiology, pathology, and imaging, and is made possible through funding from leading pharmaceutical and medical technology entities.
HRP takes advantage of semantic web technologies for physician- and researcher-led data analysis and data interoperability. One of the key applications is a web tool linking patient demographics, clinical chemistries, physical measurements, and cardiovascular imaging modalities. This empowers scientists to rapidly compare multiple clinical parameters to find patients of interest, assisting greatly in defining high-risk plaque.
1) Quantitative medicine uses large amounts of medical data and advanced analytics to determine the most effective treatment for individual patients based on their specific clinical profile and biomarkers. This approach can help reduce healthcare costs and improve outcomes compared to the traditional one-size-fits-all model.
2) However, realizing the promise of quantitative personalized medicine is challenging due to the huge quantities of diverse medical data located in dispersed systems, lack of computing capabilities, and barriers to data sharing.
3) Grid and service-oriented computing approaches are helping to address these challenges by enabling federated querying, analysis, and sharing of medical data and services across organizations through virtual integration rather than true consolidation.
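The virtual-integration idea in point 3 can be sketched in a few lines: each organization answers the same query over its own local data, and a mediator merges the answers without ever copying records into a central store. This is a minimal illustration, not any specific grid framework; the site names and record fields are hypothetical.

```python
# Each "site" exposes the same query interface over its own local data;
# the mediator fans a query out and merges results without consolidating
# the underlying records into a central repository (virtual integration).
def make_site(records):
    def query(predicate):
        # The site evaluates the predicate locally and returns only matches.
        return [r for r in records if predicate(r)]
    return query

# Hypothetical records held by two independent organizations.
site_a = make_site([{"patient": "A1", "marker": "HER2", "value": 3.1}])
site_b = make_site([{"patient": "B7", "marker": "HER2", "value": 0.4},
                    {"patient": "B9", "marker": "KRAS", "value": 1.2}])

def federated_query(sites, predicate):
    """Send one query to every site and merge the partial results."""
    results = []
    for site in sites:
        results.extend(site(predicate))
    return results

her2 = federated_query([site_a, site_b], lambda r: r["marker"] == "HER2")
print(len(her2))  # 2 matching records, drawn from two organizations
```

The key property is that the predicate travels to the data rather than the data traveling to a central warehouse, which is what makes the approach compatible with organizational barriers to data sharing.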
Clinical Research Informatics Year-in-Review 2024 - Peter Embi
Peter Embi, MD's Clinical Research Informatics year-in-review, presented at the 2024 AMIA Informatics Summit in Boston, MA on March 20, 2024.
The document outlines various resources and programs available through the National Institutes of Health (NIH) to support research and development efforts, from early stage screening and validation to late stage clinical trials. It describes screening programs, technology characterization services, preclinical and clinical development resources across multiple NIH institutes focused on areas like cancer, neurodegeneration, infectious diseases, and more. The document encourages collaborations between NIH researchers and outside entities through licensing agreements, cooperative research agreements, and material transfers to help move technologies toward public health benefits.
NCI Cancer Genomics, Open Science and PMI: FAIR - Warren Kibbe
Talk given to the NLM Fellows on July 8, 2016. Touches on Cancer Genomics, Open Science and PMI: FAIR in NCI genomics thinking and projects. Includes discussion of the Genomic Data Commons (GDC), Cancer Data Ecosystem, Data sharing, and the NCI cancer clinical trials open API.
As the author of “Big Data in Healthcare Hype and Hope,” Dr. Feldman has interviewed over 180 emerging tech and healthcare companies, always asking, “How can your new approach help patients?” Her research shows that data, as an enabling tool, has the power to give us critical new insights into not only what causes disease, but what comprises normal. Despite this promise, few patients have reaped the benefits of personalized medicine. A panel of leading big data innovators will discuss the evolving health data ecosystem and how big data is being leveraged for research, discovery, clinical trials, genomics, and cancer care. Case studies and real-life examples of what’s working, what’s not working, and how we can help speed up progress to get patients the right care at the right time will be explored and debated.
• Bonnie Feldman, DDS, MBA - Chief Growth Officer, @DrBonnie360
• Colin Hill - CEO, GNS Healthcare
• Jonathan Hirsch - Founder & President, Syapse
• Andrew Kasarskis, PhD - Co-Director, Icahn Institute for Genomics & Multiscale Biology; Associate Professor, Genetics & Genomic Studies, Icahn School of Medicine at Mt. Sinai
• William King - CEO, Zephyr Health
New York eHealth Collaborative Digital Health Conference
November 18, 2014
2016 Data Commons and Data Science Workshop, June 7th and 8th, 2016. Covers the Genomic Data Commons, FAIR, and NCI efforts to make data more findable, publicly accessible, interoperable (machine readable), and reusable, and to support recognition and attribution.
Cancer Moonshot, Data sharing and the Genomic Data Commons - Warren Kibbe
Gave the inaugural Informatics Grand Rounds at City of Hope on September 8th. The NIH Commons, Genomic Data Commons, NCI Cloud Pilots, Cancer Moonshot, and the rationale for changing incentives around data sharing were all discussed.
Building linked data large-scale chemistry platform - challenges, lessons and... - Valery Tkachenko
Chemical databases have been around for decades, but in recent years we have observed a qualitative change from rather small in-house built proprietary databases to large-scale, open and increasingly complex chemistry knowledgebases. This tectonic shift has imposed new requirements for database design and system architecture as well as the implementation of completely new components and workflows which did not exist in chemical databases before. Probably the most profound change is being caused by the linked nature of modern resources - individual databases are becoming nodes and hubs of a huge and truly distributed web of knowledge. This change has important aspects such as data and format standards, interoperability, provenance, security, quality control and metainformation standards.
ChemSpider at the Royal Society of Chemistry was the first public chemical database to incorporate rigorous quality control, introducing both community curation and automated quality checks at the scale of tens of millions of records. Yet we have come to realize that this approach may now be incomplete in a quickly changing world of linked data. In this presentation we will talk about the challenges associated with building modern public and private chemical databases, as well as lessons we have learned from our past and present experience. We will also discuss solutions to some common problems.
This document summarizes a flexible analytical platform for precision clinical research, pharmaceutical R&D, and education. It describes the large and growing omics data analysis market and the need to extract biological meaning from big biomedical data. The platform uses machine learning, biological pathway analysis, visualization, and other techniques to analyze genomics, proteomics, transcriptomics, metabolomics, and other omics data types. It provides basic processing, predictive modeling, and decision support to help with clinical trials, molecular diagnostics, and more. The business model involves remote cloud access, full-service projects, reporting, customization, and educational programs. Testimonials highlight how the platform has helped diverse research teams.
Research Data Alliance (RDA) Webinar: What do you really know about that anti... - dkNET
What do you really know about that antibody? Ask dkNET
Research resources, defined here as the tools researchers use in their scientific studies, are a foundation of the biomedical enterprise. It is critical for researchers to be able to select the proper tools for their research, but also to be aware of any issues that may arise in their application. Software tools and datasets may have bugs, cell lines get contaminated, knockouts may be incomplete, and antibodies may have specificity problems. Such problematic resources can continue to be used in scientific studies even after problems are detected. Many factors, including the inability to easily retrieve alerts about problematic resources, result in their continued use, wasting both time and money. To make it easy to find information about research resources and how they perform, dkNET (the NIDDK Information Network, https://dknet.org), an online portal supported by the US National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), has developed a resource information network that uses Research Resource Identifiers (RRIDs) and natural language processing to aggregate information about individual antibodies, cell lines, organisms, digital tools, plasmids, and biosamples. This information is presented in a Resource Report that shows, for example, which papers have been published using a resource, who is using it, and whether issues have been reported. Using this information, dkNET also provides tools to create authentication reports in support of the NIH rigor and reproducibility guidelines. The dkNET portal includes additional information to help researchers easily use and navigate large amounts of data and information about research resources in support of reproducible science.
By the end of this webinar, participants will be familiar with the services and tools provided at dkNET and will be able to create a detailed research resource report and produce an authentication report in support of NIH mandates and policies.
Presenter: Maryann Martone, PhD, FAIR Data Informatics Lab (FDI Lab), University of California, San Diego
Medical innovation calls for new models of collaboration among government, academia, and industry.
Barriers to research and ultimate commercialization will be lowered by bringing together best practices from industry and academic settings.
The Hippocrates platform facilitates early drug development, extending from basic research to drug invention and commercialization, significantly saving time and money.
The platform is designed to facilitate collaboration among stakeholders while taking advantage of the vast resources currently available on the web to generate and aggregate content based on the end user's research needs.
Realising the potential of Health Data Science: opportunities and challenges... - Paolo Missier
This document summarizes a presentation on opportunities and challenges for applying health data science and AI in healthcare. It discusses the potential of predictive, preventative, personalized and participatory (P4) approaches using large health datasets. However, it notes major challenges including data sparsity, imbalance, inconsistency and high costs. Case studies on liver disease and COVID datasets demonstrate issues requiring data engineering. Ensuring explanations and human oversight are also key to adopting AI in clinical practice. Overall, the document outlines a complex landscape and the need for better data science methods to realize the promise of data-driven healthcare.
tranSMART Community Meeting 5-7 Nov 13 - Session 3: Pfizer’s Recent Use of tr... - David Peyruc
The document summarizes Pfizer's use of the tranSMART platform for various genomics and clinical data analyses including genome-wide association studies (GWAS), supporting exploratory data types like metabolomics and FACS data, and large collaborative efforts like the Alzheimer's Disease Neuroimaging Initiative (ADNI) and Parkinson's Progression Markers Initiative (PPMI) datasets. It also discusses analytical integration with Genedata Expressionist and plans for future enhancements to tranSMART like improved GWAS support and additional genotype data. Contributors to these efforts are acknowledged.
Enabling Patient-Driven Medicine Using Graph Database - Neo4j
The document discusses using a graph database approach to integrate diverse types of patient omics and clinical data in order to enable a systems biology perspective in medicine. It describes building a graph database called GENOME to integrate multi-omics and clinical data from brain tumor patients to identify novel biomarkers and survival correlates. Random walk simulations on the graph are able to find statistically significant pathways involving different data types that may be biologically meaningful. This approach allows an unbiased and comprehensive analysis not possible with traditional methods.
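The random-walk idea described above can be illustrated with a toy example: repeatedly walk a small heterogeneous graph and count how often each node is visited, so that nodes linking different data types surface through co-traversal. This is a minimal sketch only; the node names (a sample, a gene, a protein, a pathway, a clinical variable) are hypothetical, and the GENOME database itself is not reproduced here.

```python
import random
from collections import Counter

def random_walks(adj, start, n_walks=1000, length=4, seed=0):
    """Count node visits over many fixed-length random walks from `start`."""
    rng = random.Random(seed)  # fixed seed keeps the toy run reproducible
    visits = Counter()
    for _ in range(n_walks):
        node = start
        for _ in range(length):
            neighbors = adj[node]
            if not neighbors:
                break
            node = rng.choice(neighbors)  # uniform step to a neighbor
            visits[node] += 1
    return visits

# Hypothetical graph mixing data types: a tumor sample linked to a gene,
# its protein product, a pathway, and a clinical survival variable.
adj = {
    "sample1":      ["EGFR", "survival"],
    "EGFR":         ["sample1", "MAPK_pathway", "EGFR_protein"],
    "EGFR_protein": ["EGFR", "MAPK_pathway"],
    "MAPK_pathway": ["EGFR", "EGFR_protein"],
    "survival":     ["sample1"],
}

visits = random_walks(adj, "sample1")
# Frequently visited nodes suggest paths connecting omics and clinical data.
print(visits.most_common(3))
```

In a real analysis the visit counts would be compared against a null model (e.g. walks on degree-preserving randomized graphs) to assess statistical significance, which is the step that separates meaningful pathways from hub effects.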
CINECA webinar slides: Open science through fair health data networks dream o... - CINECAProject
Since the FAIR data principles were published in 2016, many organizations, including science funders and governments, have adopted them to promote and foster true open science collaborations. However, defining a vision and creating a video of a Personal Health Train that leverages worldwide FAIR health data in a federated manner is one step. Actually making this happen at scale, and being able to show new scientific and medical insights from it, is quite another!
In this webinar, we will dive into the basics of FAIR health data, but also take stock of the current situation in health data networks: after a year of frantic research and collaborations and many open datasets and hackathons on COVID-19, has the situation actually improved? Are we sharing health data on a global scale to improve medical practice, or is quality medical data still only accessible to researchers with the right credentials and deep pockets?
This webinar is part of the “How FAIR are you” webinar series and hackathon, which aim at increasing and facilitating the uptake of FAIR approaches into software, training materials and cohort data, to facilitate responsible and ethical data and resource sharing and implementation of federated applications for data analysis.
The CINECA webinar series aims to discuss ways to address common challenges and share best practices in the field of cohort data analysis, as well as to distribute CINECA project results. All CINECA webinars include an audience Q&A session during which attendees can ask questions and make suggestions. Please note that all webinars are recorded and available for later viewing.
This webinar took place on 21st January 2021 and is part of the CINECA webinar series.
For previous and upcoming CINECA webinars see:
https://www.cineca-project.eu/webinars
At GVK BIO, the seamless integration of life sciences expertise with information technology delivers database products and services that accelerate your research from discovery to development.
Knowledge Discovery using an Integrated Semantic Web - Michel Dumontier
The document discusses HyQue, a system for knowledge discovery that facilitates hypothesis formulation and evaluation by leveraging Semantic Web technologies to provide access to facts, expert knowledge, and web services. HyQue uses an event-based data model and domain rules to calculate a quantitative measure of evidence for hypothesized events. It aims to enable users to pose a hypothesis and have the system automatically evaluate it using available data, ontologies, and services.
Maze's Compass Platform - A data fabric for drug discovery and development - Nolan Nichols
Maze Therapeutics has developed the Compass Platform, an informatics framework for drug discovery. The platform integrates multiple data sources to translate genetic insights into potential drug targets. It uses a data lake architecture to store over 50 billion records from public and experimental sources. Maze's new Sightline portal provides an overview of the company's research and development portfolio based on this integrated data. The initial focus is on analyzing allelic series to increase confidence in target concepts.
FAIR as a Working Principle for Cancer Genomic Data - Ian Fore
This document discusses making cancer genomic data FAIR (Findable, Accessible, Interoperable, and Reusable) as a working principle. It summarizes a talk given by Ian Fore of the National Cancer Institute on using FAIR data principles for cancer genomic data. The document also briefly describes several other talks from a conference track on FAIR data.
Access the webinar: http://goo.gl/p08pTz
These slides were presented in a webinar by Denodo in collaboration with BioStorage Technologies and Indiana Clinical and Translational Sciences Institute and Regenstrief Institute.
BioStorage Technologies, Inc., Indiana Clinical and Translational Sciences Institute, and Regenstrief Institute (CTSI) have joined Denodo to talk about the important role of technological advancements, such as data virtualization, in advancing biospecimen research.
By watching this webinar, you can gain insight into best practices around the integration of biospecimen and research data as well as technology solutions that provide consolidated views and rapid conversions of this data into valuable business insights. You will also learn how data virtualization can assist with the integration of data residing in heterogeneous repositories and can securely deliver aggregated data in real-time.
Pine Biotech is a company that merges big omics data analysis with clinical care and precision applications for Real World Evidence: research and development of new targets and therapeutics, stratified clinical trials, and development of biomarkers for early detection and companion diagnostics. We want to improve patient outcomes and provide tools for researchers and clinicians to have an impact on healthcare.
This webinar is part of the “How FAIR are you” webinar series and hackathon, which aim at increasing and facilitating the uptake of FAIR approaches into software, training materials and cohort data, to facilitate responsible and ethical data and resource sharing and implementation of federated applications for data analysis.
The CINECA webinar series aims to discuss ways to address common challenges and share best practices in the field of cohort data analysis, as well as distribute CINECA project results. All CINECA webinars include an audience Q&A session during which attendees can ask questions and make suggestions. Please note that all webinars are recorded and available for posterior viewing. CINECA webinars include an audience Q&A session during which attendees can ask questions and make suggestions.
This webinar took place on 21st January 2021 and is part of the CINECA webinar series.
For previous and upcoming CINECA webinars see:
https://www.cineca-project.eu/webinars
At GVK BIO, seamless integration of life sciences expertise with information technology helps bring database products and services to accelerate your research from discovery to development.
Knowledge Discovery using an Integrated Semantic WebMichel Dumontier
The document discusses HyQue, a system for knowledge discovery that facilitates hypothesis formulation and evaluation by leveraging Semantic Web technologies to provide access to facts, expert knowledge, and web services. HyQue uses an event-based data model and domain rules to calculate a quantitative measure of evidence for hypothesized events. It aims to enable users to pose a hypothesis and have the system automatically evaluate it using available data, ontologies, and services.
Maze's Compass Platform - A data fabric for drug discovery and developmentNolan Nichols
Maze Therapeutics has developed the Compass Platform, an informatics framework for drug discovery. The platform integrates multiple data sources to translate genetic insights into potential drug targets. It uses a data lake architecture to store over 50 billion records from public and experimental sources. Maze's new Sightline portal provides an overview of the company's research and development portfolio based on this integrated data. The initial focus is on analyzing allelic series to increase confidence in target concepts.
FAIR as a Working Principle for Cancer Genomic DataIan Fore
This document discusses making cancer genomic data FAIR (Findable, Accessible, Interoperable, and Reusable) as a working principle. It summarizes a talk given by Ian Fore of the National Cancer Institute on using FAIR data principles for cancer genomic data. The document also briefly describes several other talks from a conference track on FAIR data.
Access the webinar: http://goo.gl/p08pTz
These slides were presented in a webinar by Denodo in collaboration with BioStorage Technologies and Indiana Clinical and Translational Sciences Institute and Regenstrief Institute.
BioStorage Technologies, Inc., Indiana Clinical and Translational Sciences Institute, and Regenstrief Institute (CTSI) have joined Denodo to talk about the important role of technological advancements, such as data virtualization, in advancing biospecimen research.
By watching this webinar, you can gain insight into best practices around the integration of biospecimen and research data as well as technology solutions that provide consolidated views and rapid conversions of this data into valuable business insights. You will also learn how data virtualization can assist with the integration of data residing in heterogeneous repositories and can securely deliver aggregated data in real-time.
Pine Biotech - a company that merges big -omics data analysis with clinical care and precision applications for Real World Evidence: research & development of new targets and therapeutics, stratified clinical trials, and development of biomarkers for early detection and companion diagnostics. We want to improve patient outcomes and provide tools for researchers and clinicians to have an impact on healthcare.
2. Legal disclaimer
QIAGEN products shown here are intended for molecular biology applications. These products are not intended for the diagnosis, prevention or treatment of a disease.
For up-to-date licensing information and product-specific disclaimers, see the respective QIAGEN kit instructions for use or user operator manual. QIAGEN instructions for use and user manuals are available at www.qiagen.com or can be requested from QIAGEN Technical Services (or your local distributor).
3. QIAGEN Digital Insights (QDI)
Leading provider of genomic and clinical knowledge, analysis and interpretation tools and services for scientists and clinicians
…one of 3 Business Units within QIAGEN
Powered by the acquisition of: (company logos shown on the slide)
4. QIAGEN Discovery Insights: leading provider of expert-curated knowledge
June 14, 2024
Curated research findings: highlight pathways, map networks, discover mechanisms of action
Curated ‘omics data: search across diseases and tissues, find comparisons, identify biomarkers
Curated gene variants: somatic or germline compendiums, observed clinical case distribution
5. Applications
Quickly and efficiently generate novel, high-quality discoveries through highly flexible data analysis and exploration
Analytics-driven drug discovery: combine our leading data with your innovative analysis approaches and a wide range of advanced algorithms developed by the industry to power analytics- and AI-driven drug discovery
Build applications: use the data within your own analysis and data-exploration applications
Integrate: integrate the data with other data types and sources, as well as third-party technologies; it can act as a foundational data model
Most popular applications (the primary application categories):
• Biomedical knowledge graph construction and analysis
• Analytics and AI-driven target identification and drug repositioning
• Target, disease and drug intelligence portals
• Disease subtype and biomarker identification based on functional features
6. QIAGEN Biomedical Knowledge Base
Break knowledge silos to power R&D with data science
Biomedical KB-HD (human-derived)
• Manually curated by expert scientists
• Contains over 24 million biomedical relationships
Biomedical KB-AI (generative AI-derived)
• Curated through advanced AI processes
• Boasts 600 million+ biomedical relationships
Both editions:
• Quarterly updates
• Available as flat files, knowledge graphs, APIs
• FAIR friendly
• Foundational data model that can scale
Saving time and facilitating research with comprehensive databases.
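As a sketch of how a flat-file export might be consumed, the snippet below parses a hypothetical tab-separated triple file into an adjacency map. The column names and example rows are illustrative stand-ins, not the actual KB-HD/KB-AI export schema:

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a flat-file export: one (source, relation, target)
# triple per line. The real KB export column layout may differ.
flat_file = io.StringIO(
    "source\trelation\ttarget\n"
    "GENE_A\tcauses\tAsthma\n"
    "GENE_B\tcorrelates_with\tAsthma\n"
    "DRUG_X\tinhibits\tGENE_B\n"
)

# Adjacency map: node name -> list of (relation, neighbour) pairs
graph = defaultdict(list)
for row in csv.DictReader(flat_file, delimiter="\t"):
    graph[row["source"]].append((row["relation"], row["target"]))
```

The same triples could equally be bulk-loaded into Neo4j or a SQL schema; the adjacency map is just the smallest structure that makes the relationships queryable.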
7. Many ways to access QIAGEN-curated relationships
Access options: downloadable flat files; Python, R and REST APIs; causal analysis and export functions; Neo4j and SQL database imports
Sources: PubMed, TargetScan, BioGRID, UMLS, SNOMED, MeSH, FDA, ClinVar, ClinicalTrials.gov, DrugBank
Coverage: 94,000 diseases; 17,000 drugs; 51,000 functions; 49,000 chemicals; 20 M research findings
14. Design Choices: non-directional relationships as a single relationship
Protein-protein interaction has no directionality, so how should we represent it? Storing it as a single relationship avoids the problem of deduplicating relationships afterwards.
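One way to realize this choice, sketched in Cypher with hypothetical labels and property names, is to MERGE the pair in a canonical order so each interaction is stored exactly once, and then to match without specifying direction:

```cypher
// Sketch (hypothetical labels): store each protein-protein interaction once
MATCH (a:protein), (b:protein)
WHERE a.name IN [$p1, $p2] AND b.name IN [$p1, $p2]
  AND a.name < b.name            // canonical order, so (A,B) and (B,A) collapse
MERGE (a)-[:interacts_with]->(b)

// Queries then ignore the stored direction entirely
MATCH (p:protein {name: $p1})-[:interacts_with]-(q:protein)
RETURN q.name
```

Because readers always match the relationship undirected, the direction that happens to be stored carries no meaning.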
15. Design Choices: Clinical Trial Fine-Grained Representation
Shi, X., Du, J. Constructing a finer-grained representation of clinical trial results from ClinicalTrials.gov. Sci Data 11, 41 (2024). https://doi.org/10.1038/s41597-023-02869-7
16. Design Choices: Clinical Trial Evidence Representation
An Evidence node links a Drug, a Drug Target and a Disease, with the evidence attributes stored on the Evidence node itself.
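A minimal Cypher sketch of this reification, with hypothetical labels and attribute names, creates one node per evidence item and fans out to the entities it connects:

```cypher
// Hypothetical labels/attributes: one node per clinical-trial evidence item
MATCH (d:drug {name: $drug}),
      (t:gene {name: $target}),
      (dz:disease {name: $disease})
CREATE (e:evidence {source: 'ClinicalTrials.gov', phase: $phase, outcome: $outcome})
CREATE (e)-[:about_drug]->(d),
       (e)-[:about_target]->(t),
       (e)-[:about_disease]->(dz)
```

Reifying the evidence lets one finding connect three (or more) entities at once, which a plain binary relationship cannot express.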
19. Graph Customization: Build Your Own Graph
• Custom names of nodes and relationships
• Customization of attributes
• Aggregation of edges
• Subgraph centered around a certain node
• Exclude irrelevant portions of the content
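Edge aggregation, for example, can be sketched in Cypher using the labels from the later asthma queries (the summary relationship type is an assumption): parallel C/CO edges between the same gene-disease pair collapse into one edge carrying a support count:

```cypher
MATCH (g:gene)-[r:C|CO]->(d:disease)
WITH g, d, count(r) AS support, collect(DISTINCT type(r)) AS evidence_types
MERGE (g)-[s:associated_with]->(d)   // hypothetical aggregate type
SET s.support = support, s.evidence_types = evidence_types
```

The same WITH/MERGE pattern applies to the other customizations, such as restricting to a subgraph around a chosen node before aggregating.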
20. Good schema design is a balance between simplicity and comprehensiveness
29. Graph representation enables discovery through exploration within the complex interconnections of biomedical data
30. What genes cause or correlate with asthma?
(node types shown: Genes, Diseases)
match (d:disease {name: 'Asthma'})<-[r:C|CO]-(g0:gene)
where any (
    subtype_list in g0.node_subtype
    where subtype_list in [
        'enzyme', 'transcription regulator', 'transporter',
        'kinase', 'G-protein coupled receptor', 'peptidase',
        'transmembrane receptor', 'ion channel', 'phosphatase',
        'translation regulator', 'cytokine', 'growth factor',
        'ligand-dependent nuclear receptor'])
return d, r, g0
Result: 355 nodes, 8309 relationships
31. How are asthma-related genes functionally linked?
(node types shown: Genes, Tox Functions, Pathways)
...
optional match (g0:gene)-[:is_a*]->(g1:gene {macromolecule_level: 'ortholog group level'})
optional match (g1:gene {macromolecule_level: 'ortholog group level'})-[r1:member_of]-(p:pathway|toxlist)
with p, collect(distinct g1) as genes, collect(r1) as relationships
where size(genes) >= 2
return genes, relationships, p
Result: 281 genes, 426 pathways, 62 toxlists, 3507 relationships
33. Can we repurpose drugs to target key intersections?
(node types shown: Genes, Tox Functions, Pathways, Drugs)
Drugs known to activate or inhibit the highlighted genes are overlaid on the graph. One candidate is an immunosuppressant approved for atopic dermatitis that has completed a phase two trial for asthma.
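Under the same schema as the earlier asthma queries, the drug overlay could be fetched roughly as follows; the activates/inhibits relationship types are assumptions, not confirmed names from the QIAGEN schema:

```cypher
MATCH (g:gene)-[:C|CO]->(d:disease {name: 'Asthma'})
MATCH (drug:drug)-[a:activates|inhibits]->(g)
RETURN drug.name, type(a), g.name
```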
34. Link Prediction
Complex Embeddings for Simple Link Prediction, Theo Trouillon et al.
Task: predict missing gene–disease links
• Defined the train/test split using Neo4j: sampled random gene–disease links for the test set, and marked links to child and parent diseases as excluded
• Trained ComplEx embeddings with DGL-KE
• Compared against predictions based on node degree
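The split step might be expressed in Cypher roughly as follows; the `split` property and the `is_a` disease-hierarchy relationship are assumptions for illustration:

```cypher
// Sketch: hold out a random 10% of gene–disease links as the test set.
// The 'split' property and 'is_a' hierarchy edges are assumed.
MATCH (g:gene)-[r]->(d:disease)
SET r.split = CASE WHEN rand() < 0.1 THEN 'test' ELSE 'train' END;

// Exclude links to parent/child diseases of any test-set disease,
// so the model cannot trivially recover held-out edges.
MATCH (g:gene)-[t {split: 'test'}]->(d:disease)
MATCH (d)-[:is_a*1..]-(d2:disease)<-[r]-(g)
SET r.split = 'exclude';
```

Excluding hierarchy neighbors matters because a link to a parent or child disease is nearly equivalent to the held-out link itself.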
35. QIAGEN Biomedical KB-AI provides the greatest depth and breadth of knowledge for critical pharmaceutical research
Unstructured relationship sources
Structured relationship sources
Graph enrichment sources
Sources include NIH PMC, PubMed, arXiv, medRxiv, bioRxiv, Google Patents, GWAS Catalog, dbSNP, ChEMBL, RxNav, CPDB, ClinicalTrials.gov, UniChem, PubChem, FDA, HGNC, Reactome, Gene Ontology, MeSH, Open Targets, UniProt, and DailyMed
12 billion+ triples; 600 million+ relationships
• 335 million+ relationships from scientific research
• 9.4 million+ relationships from patents
• 14.9 million+ relationships from grants
• 4.7 million+ relationships from clinical trials
• 279 million+ relationships from structured sources
Discovery: identify new targets and indications with genetic evidence found across scientific literature
Clinical development: establish potential biomarkers for diseases
Business development and strategy: understand the competitive landscape by target, drug, and indication, and augment scientific due diligence
Data are generated using state-of-the-art entity disambiguation, semantically meaningful relationship extraction, and causal relationship extraction.
36. Entities and Relationships in Biomedical KB-AI
Relationships:
• Semantic: 290 million
• Causal: 9 million
• Adverse effects: 280 million
• Clinical trials: 4.7 million
• GWAS: 2 million
37. Preclinical Competitive Intelligence
Biomedical KB-AI provides many competitive
intelligence sources including
• Patents
• Clinical trials
• Research papers
• Grant applications
GLP1R patent mentions
38. Preclinical Competitive Intelligence
Top 20% of clinical trial sponsors for GLP1
40. Timeline – Top 4 drugs targeting GLP1R
Evidence comes from NIH grants, publications, and patents
Evidence accumulation for GLP1R-interacting drugs
41. Rare disease research
Hypophosphatasia (HPP) and Ehlers-Danlos syndrome (EDS)
• Building a chat interface over HPP and EDS scientific publications
• Building a model that augments HPP and EDS research
42. Graph representation supports complex analyses of biomedical data
43. Acknowledgments
• Kyle Nilson
• Millie Zhou
• Ivana Grbesa
• Francesco Lamanna
• Andreas Kramer
• Bob Rebres
• Burk Braun
• Swati Mishra
• Bjarke Skjernaa
• Allan Merrild
• Rune Gee Madsen
• Poul Liboriussen
• Thomas Hyldgaard
• Venkatesh Moktali
• Alex Jarasch
• Alexander Erdl
• Vincent Vialard