Article

Bioinformatic Multi-Strategy Profiling of Congenital Heart Defects for Molecular Mechanism Recognition

by Fabyanne Guimarães de Oliveira 1,2, João Vitor Pacheco Foletto 1,2,3, Yasmin Chaves Scimczak Medeiros 2,3, Lavínia Schuler-Faccini 1,2,4,5 and Thayne Woycinck Kowalski 1,2,3,4,6,7,*

1 Laboratory of Medical Genetics and Evolution, Graduate Program in Genetics and Molecular Biology, Genetics Department, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre 91501-970, Brazil
2 Teratogen Information System (SIAT), Medical Genetics Service, Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre 90035-903, Brazil
3 Laboratory of Genomic Medicine, Center of Experimental Research, Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre 90035-903, Brazil
4 National Institute on Population Medical Genetics (INAGEMP), Porto Alegre 90035-903, Brazil
5 Graduate Program in Children and Adolescent Health, Medicine Faculty, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre 90035-903, Brazil
6 Bioinformatics Core, Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre 90035-903, Brazil
7 Graduate Program in Medicine: Medical Sciences, Medicine Faculty, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre 90035-003, Brazil
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(22), 12052; https://doi.org/10.3390/ijms252212052
Submission received: 9 October 2024 / Revised: 29 October 2024 / Accepted: 6 November 2024 / Published: 9 November 2024

Abstract

Congenital heart defects (CHDs) rank among the most common birth defects, presenting diverse phenotypes. Genetic and environmental factors are critical in molding the process of cardiogenesis. However, these factors’ interactions are not fully comprehended. Hence, this study aimed to identify and characterize differentially expressed genes involved in CHD development through bioinformatics pipelines. We analyzed experimental datasets available in genomic databases, using transcriptome, gene enrichment, and systems biology strategies. Network analysis based on genetic and phenotypic ontologies revealed that EP300, CALM3, and EGFR genes facilitate rapid information flow, while NOTCH1, TNNI3, and SMAD4 genes are significant mediators within the network. Differential gene expression (DGE) analysis identified 2513 genes across three study types, (1) Tetralogy of Fallot (ToF); (2) Hypoplastic Left Heart Syndrome (HLHS); and (3) Trisomy 21/CHD, with LYVE1, PLA2G2A, and SDR42E1 genes found in three of the six studies. Interaction networks between genes from ontology searches and the DGE analysis were evaluated, revealing interactions in ToF and HLHS groups, but none in Trisomy 21/CHD. Through enrichment analysis, we identified immune response and energy generation as some of the relevant ontologies. This integrative approach revealed genes not previously associated with CHD, along with their interactions and underlying biological processes.

1. Introduction

Congenital heart defects (CHDs) are anatomical malformations of the heart and/or major vessels that occur during the embryonic period [1]. Heart defects account for 28% of all severe congenital anomalies, representing a significant global health problem [2]. CHDs occur in 0.8–1 of every 100 live births, and 25% of diagnosed CHDs need surgery or intervention, leading to a risk of death within the first month of birth [3,4]. CHDs can be classified into eight categories: conotruncal defects (Tetralogy of Fallot, double outlet right ventricle, and D-transposition of the great arteries), atrioventricular septal defects, left ventricular outflow tract obstructions (hypoplastic left heart syndrome, aortic stenosis, and coarctation of the aorta), septal defects (ventricular and atrial), right ventricular outflow tract obstructions (tricuspid and pulmonary atresia), heterotaxic malformations, complete defects (L-transposition of the great arteries and other defects), and anomalous pulmonary venous return (total or partial) [5].
The etiology of these defects is multifactorial and includes environmental factors, which may contribute to 10% of CHDs, as well as genetic causes, which may be associated with syndromes or occur as isolated heart defects [6]. Maternal diabetes and obesity, alcohol exposure, congenital infections (rubella, hepatitis B), and certain medications (lithium, isotretinoin) are considered environmental factors that increase the risk of congenital heart defects [7,8,9,10,11]. Approximately 70% of CHDs occur in isolation and have a multifactorial etiology; this includes the most severe cardiac defects. CHDs can also be associated with other congenital defects or form part of known genetic syndromes, which usually have a known etiology, such as chromosomal, monogenic, and/or teratogenic causes [12]. Genetic contributions to the development of CHDs vary widely; for example, CHDs are diagnosed in 35% to 50% of newborns with Trisomy 21, in 60% to 80% with Trisomy 18 and 13, and in 33% with Monosomy X [12,13]. Copy number variations (CNVs) also influence CHDs and can either occur de novo or be inherited. Many CNVs are related to clinically recognized syndromes, such as 22q11.2 deletion syndrome, 7q11.23 deletion (Williams–Beuren syndrome), and 5p15.2 deletion (Cri-Du-Chat syndrome) [13,14,15]. De novo mutations (DNMs) are found in genes highly expressed during heart development, with DNMs being linked to approximately 10% of congenital heart defects [16,17].
The advances in sequencing technologies have helped increase the understanding of the human genome and, consequently, contributed to the discovery of new candidate genes associated with CHDs, facilitating the identification of their genetic variants [18]. It is estimated that the use of molecular biology techniques, bioinformatics tools, and the availability of population study databases have enabled the identification of pathogenic variants in definitive and candidate genes for congenital heart defects in 45% of affected patients [16,19]. The functional characterization of a gene includes identifying its interactions, both at genomic and protein levels. Systems biology is an integrated approach that provides a comprehensive view of genes, proteins, and their interactions through models that assess disturbances, proper phenotype evaluation, and computational methods for probabilistic and mathematical modeling. This approach can assist in understanding gene interactions and the etiology of congenital heart defects [20,21]. Genetic and phenotypic ontology databases have been widely used to explore these molecular mechanisms for characterizing phenotypes and gene functions [22,23].
Based on this approach, the present study aimed to identify and characterize genes that have potential roles in heart development and are consequently involved in the molecular mechanisms related to congenital heart defects. Considering that these defects are prevalent in newborns with diverse phenotypes, in silico analyses were conducted using systems biology tools to investigate the relationship between genes involved in cardiac development and CHDs through genetic and phenotypic ontologies.

2. Results

A scheme of the strategy developed in the present study is described in Figure 1.

2.1. Gene and Phenotype Ontology Analysis and Network Statistics

The GO selection yielded 651 genes corresponding to 430 previously selected ontologies (Table S1), while the 19 CHD phenotypes provided 1111 genes through the Human Phenotype Ontology (HPO) (Table S2). A Venn diagram revealed that 177 genes are shared between both repositories (Figure S1). Some of these genes are known to play critical roles at various stages of cardiac development, including NKX2-5, T-box family genes (TBX1, TBX2, and TBX5), GATA family genes (GATA4, GATA5, and GATA6), and MYH6. Among the genes identified in the HPO, 934 (84%) have not been previously associated with CHDs; however, they are involved in other biological processes such as peroxisome maintenance and organization, cilium assembly and organization, and microtubule-based transport and movement. These processes are critical for the formation of a healthy heart, and their disruption can directly or indirectly influence the development of CHDs. Using the STRING tool, we analyzed the GO and HPO networks individually and in combination, applying the resulting network to Cytoscape v.3.10.0 for topological analysis (Figure S1). To better illustrate gene interactions, a network featuring the genes shared between the HPO and GO is also shown (Figure 2).
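The overlap counts above follow from simple set arithmetic. The short Python sketch below recomputes the Venn regions from the three reported totals (651 GO genes, 1111 HPO genes, 177 shared); the function name `venn_sizes` is hypothetical and is used only for illustration, since the original Venn diagrams were drawn with a web tool.

```python
def venn_sizes(size_a, size_b, size_shared):
    """Return the sizes of the regions of a two-set Venn diagram."""
    if size_shared > min(size_a, size_b):
        raise ValueError("shared count cannot exceed either set size")
    return {
        "only_a": size_a - size_shared,
        "shared": size_shared,
        "only_b": size_b - size_shared,
        "union": size_a + size_b - size_shared,
    }

# Totals reported in the text: GO = 651, HPO = 1111, shared = 177.
regions = venn_sizes(651, 1111, 177)

# Share of HPO genes that are absent from the GO selection.
hpo_only_percent = round(100 * regions["only_b"] / 1111)

print(regions)
print(hpo_only_percent)
```

Under these totals, the HPO-only region contains 934 genes, which is the figure quoted above as 84% of the HPO list.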
Following the same approach, we combined the common HPO and GO genes into a network, identifying the main nodes involved in information flow. Sixty-one genes (nodes) exhibited at least one interaction (edge), with a clustering coefficient of 0.151 and an average number of neighbors of 3.115; that is, each node is connected, on average, to about three other nodes. Based on this analysis, EP300, CALM3, and EGFR had the highest betweenness and closeness centrality values, indicating that they lie on many of the shortest paths between other genes and can reach the rest of the network in few steps. Additionally, the genes NOTCH1, TNNI3, and SMAD4 emerged as significant mediators of information flow to other genes within the network and have a known relationship with CHDs. However, to better understand the gene interactions involved in CHDs, we chose to search for additional genes that could contribute to the various CHD phenotypes.
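To make the two summary statistics concrete, the following plain-Python sketch computes the average number of neighbors and the average clustering coefficient on a small toy graph. The node names and edges are hypothetical, and the convention that nodes with fewer than two neighbors contribute a coefficient of zero is an assumption about how such tools typically report the value; the study itself obtained these numbers from Cytoscape.

```python
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def average_neighbors(adj):
    """Mean node degree over all nodes that have at least one edge."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: undefined cases count as zero
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Mean of the local clustering coefficients."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

# Hypothetical toy network: a triangle A-B-C with a pendant node D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_adj = build_adjacency(toy_edges)

print(average_neighbors(toy_adj))
print(average_clustering(toy_adj))
```

A low clustering coefficient such as the reported 0.151 means that the neighbors of a typical gene are rarely connected to each other, so the network is held together by a few hub-like nodes rather than by dense local groups.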

2.2. Differential Gene Expression Analysis

A total of 122 gene expression studies were obtained through a search in the Gene Expression Omnibus (GEO) repository and screened by two authors against the inclusion criteria. Among these, 35 studies (28.7%) were excluded for not comprising data on CHDs. Of the 87 studies (71.3%) related to CHDs, 31 studies (35.6%) did not use microarray or bulk RNA-Seq gene expression methodologies, 17 studies (19.5%) had fewer than four samples for cases and/or controls, 13 studies (14.9%) were knockout studies, 6 studies (6.9%) lacked a control group, and another 6 studies (6.9%) did not include cardiac samples. These studies were therefore excluded from the analysis. Fourteen studies (16.1%) met the selection criteria; however, only seven were included in the final analysis due to processing limitations: GSE196443 (Trisomy 21/CHD), GSE217557 (Trisomy 21/CHD), GSE141955 (Tetralogy of Fallot (ToF)), GSE132401 (ToF and Single Ventricle Disease (SVD)), GSE36761 (ToF), GSE23959 (Hypoplastic Left Heart Syndrome (HLHS)), and GSE209677 (cardiac cell differentiation). Details about these selected studies are available in Table S3.
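The screening flow can be checked with a few lines of arithmetic. The Python sketch below uses only the counts stated in the paragraph above; the dictionary labels and the helper `percent` are hypothetical names chosen for readability.

```python
screened = 122   # studies returned by the GEO search
not_chd = 35     # excluded for not comprising CHD data

# Exclusion reasons among the CHD-related studies, as listed in the text.
exclusions_among_chd = {
    "not microarray or bulk RNA-Seq": 31,
    "too few samples in cases and/or controls": 17,
    "knockout study": 13,
    "no control group": 6,
    "no cardiac samples": 6,
}

chd_related = screened - not_chd
eligible = chd_related - sum(exclusions_among_chd.values())

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

print(chd_related, eligible, percent(eligible, chd_related))
```

The exclusion counts sum to 73, which leaves the 14 eligible studies reported above out of the 87 CHD-related ones.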
The selected studies were analyzed individually, resulting in 2513 differentially expressed genes across the six studies, as shown in Table S4. The dataset of SVD diagnosis samples (GSE141955) did not present significantly differentially expressed genes and was excluded from further analysis. The Lymphatic Vessel Endothelial Hyaluronan Receptor 1 (LYVE1) gene, which encodes a membrane glycoprotein involved in hyaluronan transport during various stages of cell growth, was found to be differentially expressed in three studies: it was upregulated in GSE23959 (HLHS) and downregulated in the ToF studies GSE141955 and GSE36761. Similarly, a member of the phospholipase A2 family, PLA2G2A, was also differentially expressed, with downregulation observed across these studies. Additionally, a member of the short-chain dehydrogenase/reductase enzyme family, SDR42E1, was found to be upregulated in three of the analyzed studies: GSE36761 (ToF), GSE132401 (ToF), and GSE217557 (Trisomy 21/CHD). Two further genes are involved in cell recognition and membrane signaling: the Mannose Receptor C-Type 1 (MRC1), which encodes membrane receptors involved in the endocytosis process, was downregulated in both GSE141955 (ToF) and GSE36761 (ToF), while the Potassium Two Pore Domain Channel Subfamily K Member 3 (KCNK3), a member of the potassium channel protein superfamily, was downregulated in GSE36761 (ToF) and GSE132401 (ToF). The ADAM Metallopeptidase with Thrombospondin Type 1 Motif 9 (ADAMTS9) gene, which plays a role in proteoglycan cleavage and organ morphology regulation during development, was significantly downregulated in two studies: GSE23959 (HLHS) and GSE36761 (ToF).
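The recurrence pattern described above can be tabulated directly. The Python sketch below encodes only the gene, study, and direction triples stated in this paragraph and counts in how many studies each gene appears. Listing PLA2G2A as downregulated in the same three studies as LYVE1 is our reading of "across these studies" and should be treated as an assumption; the helper names are hypothetical.

```python
from collections import defaultdict

# (gene, GEO accession, direction) as described in the text.
reported = [
    ("LYVE1", "GSE23959", "up"),
    ("LYVE1", "GSE141955", "down"),
    ("LYVE1", "GSE36761", "down"),
    # Assumption: "these studies" refers to the three LYVE1 studies.
    ("PLA2G2A", "GSE23959", "down"),
    ("PLA2G2A", "GSE141955", "down"),
    ("PLA2G2A", "GSE36761", "down"),
    ("SDR42E1", "GSE36761", "up"),
    ("SDR42E1", "GSE132401", "up"),
    ("SDR42E1", "GSE217557", "up"),
    ("MRC1", "GSE141955", "down"),
    ("MRC1", "GSE36761", "down"),
    ("KCNK3", "GSE36761", "down"),
    ("KCNK3", "GSE132401", "down"),
    ("ADAMTS9", "GSE23959", "down"),
    ("ADAMTS9", "GSE36761", "down"),
]

def directions_by_gene(records):
    """Map each gene to {study: direction}."""
    table = defaultdict(dict)
    for gene, study, direction in records:
        table[gene][study] = direction
    return table

def recurrent_genes(records, min_studies):
    """Genes differentially expressed in at least `min_studies` studies."""
    table = directions_by_gene(records)
    return sorted(g for g, s in table.items() if len(s) >= min_studies)

def same_direction(records, gene):
    """True if the gene changes in the same direction in every study."""
    return len(set(directions_by_gene(records)[gene].values())) == 1

print(recurrent_genes(reported, 3))
print(same_direction(reported, "LYVE1"))
```

Checking direction as well as recurrence matters here, because LYVE1 recurs in three studies but is upregulated in HLHS and downregulated in ToF.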
To better comprehend the impact of the genes identified in the process of heart development, differential gene expression was analyzed in a healthy cardiac cell differentiation dataset (GSE209677) and compared with previously identified genes in ToF, HLHS, and Trisomy 21 studies to determine if these genes’ expressions were altered through the differentiation process. Using a Venn diagram, 578 common genes were identified, including PLA2G2A and ADAMTS9, which were significant in studies involving ToF and HLHS, and KCNK3, which was significant only in ToF studies.
Additionally, we compared the differentially expressed genes shared across all datasets with those identified through ontology searches in the GO and HPO. We grouped the studies into three categories based on CHD diagnosis: (1) studies on ToF; (2) the study on HLHS; and (3) studies on Trisomy 21 with CHDs (Table 1). Using the systems biology approach, we observed that in the HLHS (GSE23959) and ToF (GSE36761) datasets, there were interactions between the nodes representing the differentially expressed genes from each study and the genes found in the GO and HPO. In contrast, for the Trisomy 21 with CHD datasets, no interaction was detected between the networks (Figure 3). Through the combined analyses, we identified genes that may contribute to the development of CHDs and characterized the interactions and biological processes in which they are involved.

2.3. Ontology Enrichment Analysis

The genes identified in the DGE analyses that were not common to the GO and HPO searches were further analyzed to verify their associated ontologies and signaling pathways. As mentioned, the datasets were divided into three groups according to the type of CHD. The ontologies identified for the ToF studies (GSE141955, GSE132401, and GSE36761) were related to various immunologic processes, including leukocyte-mediated immunity, with 109 genes involved in this process, such as C1RL, IGLC7, and IGHV3-33; and the adaptive immune response based on somatic recombination of immune receptors derived from immunoglobulin superfamily domains, with 98 associated genes, including CCL19, C3, and IGHG2. The regulation of cell–cell adhesion was also enriched, with 76 genes involved, such as LILRB2, ITGB2, and IL1RN (Figure S2a).
For the HLHS study (GSE23959), the ontologies were related to biological processes, such as the generation of precursor metabolites and energy, with 20 associated genes including ATP5MG, NDUFB9, and NDUFA8; energy derivation by oxidation of organic compounds, with 16 genes involved such as COQ10A, UQCRB, and NDUFAF1; and aerobic respiration, with ten genes involved, including PANK2, NDUFA9, and SUCLA2 (Figure S2b). Ontologies such as response and defense against viruses, antimicrobial humoral immune responses mediated by antimicrobial peptides, and the regulation of myoblast fusion were associated with 20 genes in the Trisomy 21 with CHDs datasets (GSE196443 and GSE217557), including GNLY, IFI44L, and CXCL9 (Figure S2c). The genes associated with the observed ontologies were also compared with the dataset related to the differentiation of healthy cardiac cells (GSE209677). Out of the 578 related genes, IL1RN (ToF) and IFI44L (Trisomy 21/CHD) were linked to cardiac cell differentiation. Therefore, the ontologies enriched among differentially expressed genes not yet described in the GO and HPO correspond to biological processes that directly influence cardiac development and, when affected, may contribute to alterations in this stage of heart formation.

3. Discussion

Heart development is mediated by several biological mechanisms, involving many genes and environmental factors. Through the search for genetic (651 genes) and phenotypic (1111 genes) ontologies, we identified 177 common genes already described as contributing to CHDs, including EP300, CALM3, EGFR, NOTCH1, TNNI3, and SMAD4. However, CHDs present high clinical variability even in recognized syndromes, as they can be associated with congenital or isolated defects. Their complex etiology directly influences the phenotype presented. Hence, we expanded the search for genes that may alter the disease phenotype using DGE analysis. This strategy resulted in 2513 differentially expressed genes obtained from six datasets, divided into three categories based on CHD diagnosis: ToF, HLHS as an isolated CHD, and a dataset of Trisomy 21 associated with CHDs. The genes LYVE1, PLA2G2A, and SDR42E1 were deregulated in three of the six analyzed studies, followed by MRC1, KCNK3, and ADAMTS9, which were differentially expressed in two of the six analyzed studies. All differentially expressed genes, categorized by CHD diagnosis, were compared with a dataset of healthy cardiac cells, where we identified 578 common genes between the two datasets, including the aforementioned PLA2G2A, ADAMTS9, and KCNK3 genes. The genes identified in the present study are listed in Table 2. Additionally, with a systems biology approach, the genetic and phenotypic ontologies identified were compared with the genes differentially expressed. Datasets on isolated CHDs (ToF and HLHS) showed greater interaction with the GO and HPO data, whereas CHDs associated with Trisomy 21 showed no interaction between genes from the repositories. Having identified that genes common to the GO and HPO databases are involved in the development of CHDs, we sought to determine how the genes not found in these GO and HPO repositories contribute to CHDs. 
Therefore, using gene ontology overrepresentation analysis, we found that most of these genes are involved in immunological processes, energy generation, secondary metabolite production, and cellular communication, which are fundamental for broad cellular maintenance. The identification of these genes that do not directly act on cardiac development might be a consequence of the bioinformatics approach used, which identifies a wide range of genes, including those with many cellular functions. Therefore, to verify the specific contribution of these genes to the development of CHDs, experimental validation is necessary, which constitutes a limitation of the study.
To advance the understanding of congenital heart disease (CHD), it is essential to explore both embryology and associated genetic factors. Such exploratory approaches can be performed using available resources such as the GO and HPO. In this study, we used these resources to identify previously described gene and phenotypic ontologies for CHD. The NKX2-5 gene stands out as it is involved in multiple stages of heart formation, acting as a key marker in the differentiation of cardiac precursor cells, including the development of the conduction system, valves, and cardiac septa [24]. This gene works with other highly conserved transcription factors (such as TBX20, GATA4, and MYH6) and has a central role in organizing the process of cardiogenesis [25]. Variations in genes known to regulate cardiac development have been extensively studied in both isolated and syndromic CHD, and have been linked to numerous phenotypes, including ToF, atrial septal defect, and ventricular septal defect [26].
Using systems biology, we integrated the genes identified in the GO and HPO to provide new insights into CHD, revealing genes that might play key roles in the flow of biological information. EP300, which is critical for cell regulation and differentiation through chromatin remodeling, already has pathogenic variants associated with ToF [26]. EGFR encodes tyrosine kinase receptors and is essential for cardiac cell development, specialization, and differentiation, particularly in regulating human aortic valve embryogenesis [27,28]. CALM3 encodes a highly conserved protein expressed in the heart, regulating various ion channels in cardiac cells, and is involved in several biological processes, such as muscle contraction, inflammation, metabolism, and immune responses [29]. While EP300, EGFR, and CALM3, along with other identified genes, have been associated with CHD, it is crucial to continue identifying genetic variants. Pathogenic variants in specific genes may interfere with embryonic viability and development, potentially leading to embryonic lethality, as seen with variants in the EGFR gene [28]. We also identified and characterized genes involved in three different CHD phenotypes through DGE analysis of six datasets available in the GEO repository. This analysis revealed 2513 differentially expressed genes, including LYVE1 and PLA2G2A, which were found in two of the ToF studies and in the HLHS study. Notably, these genes were not previously mentioned in the literature in relation to CHD. LYVE1 was described as a marker in abnormal lymphatic system development studies, particularly in congenital diaphragmatic hernia [30], and increased nuchal translucency, as seen in Noonan syndrome cases [31]. PLA2G2A has been identified in studies focused on tumors [32] and diabetes [33].
The SDR42E1 gene, differentially expressed in two ToF studies and in one Trisomy 21/CHD study, plays a crucial role in vitamin D biosynthesis [34], with variations in this gene linked to fragile cornea syndrome [35]. The MRC1 gene, involved in biological processes such as chemotaxis and leukocyte migration, is associated with cleft palate, another congenital defect [36], and was differentially expressed in two ToF studies. The KCNK3 and ADAMTS9 genes have been implicated in cardiac function, with KCNK3 identified as a predisposing factor for pulmonary arterial hypertension (PAH) [37], primarily affecting atrial function and playing roles in rhythm regulation and cardiac conduction [38]. ADAMTS9 is essential for proper cardiovascular development and adult homeostasis, with expression in derivatives of the secondary heart field, vascular smooth muscle cells in the arterial wall, mesenchymal cells of the valves, and non-myocardial cells of the ventricles. It has a described association with CHDs such as bicuspid aortic valve disease [39,40], although it was not found to be related to HLHS or ToF.
With all these genes identified and based on the indication that they play a role in the development of CHD, we aimed to compare these results with datasets of healthy cardiac cells. This comparison revealed 578 genes shared among all analyzed datasets and corroborated the roles of KCNK3 and ADAMTS9 in biological processes relevant to cardiac embryogenesis [37,39,40]. Additionally, we found that while genes such as PLA2G2A are related to heart formation, their specific roles in the development of CHD remain unclear. Considering the broad range of genes not yet directly linked to CHD, we integrated the datasets from the three CHD groups with the data identified in the GO and HPO. Through a protein interaction network analysis, we observed that in the ToF and HLHS groups, there were interactions between nodes related to the disease and those in the GO and HPO; however, this pattern was not evident in the Trisomy 21/CHD group.
ToF, the most common cyanotic CHD, is characterized by a ventricular septal defect, right ventricular outflow tract obstruction, an overlapping of the ventricular septum by the aortic root, and right ventricular hypertrophy [41]. Multiple transcription factors and signaling molecules are related to the disease as reported in the literature, including GATA4, NKX2-5, JAG1, FOXC2, TBX5, and TBX1, which is consistent with the interactions observed with the genes listed in the repositories [42]. Ventricular function is affected in ToF, which may influence gene expression during development. We observed this difference in our results, particularly in the analysis of GSE132401, which studied induced pluripotent stem cells and presented a small number of differentially expressed genes compared to GSE36761, which analyzed ventricular samples. HLHS, on the other hand, is a severe cyanotic CHD resulting from the underdevelopment of the left ventricle, mitral valve, aortic valve, and ascending aorta. Numerous genetic variants and molecular pathways in HLHS have been discovered in recent decades, which was reflected in the interactions identified in the network [43].
The same interaction pattern was not observed in the studies of Trisomy 21 and CHD. Despite CHD being one of the main causes of morbidity and mortality, and the presence of Trisomy 21 being associated with a 50-fold greater likelihood of developing CHD compared to the general population [44], we failed to identify interactions in the repositories used for this study. This indicates a need for new approaches to elucidate the various phenotypes of CHD, especially considering the different etiologies.
Therefore, analyzing the differentially expressed genes from datasets that did not overlap with the GO and HPO was essential for understanding their involvement in congenital heart disease (CHD) through gene ontologies. We observed various genes associated with the immune response across all analyzed datasets. This may be related to multiple biological processes occurring during the embryonic period, such as placental formation, which develops alongside the heart and acts as a barrier that regulates nutrient and oxygen transfer while preventing the passage of pathogens and cells that could impair development [45]. Placental dysfunction has been linked to poor cardiac development, as previously described in the literature [46,47]. Also, we identified genes enriched in ontologies related to the generation of precursor metabolites and energy, which have been associated with the development and postnatal functions of the right ventricle in newborns with CHD. Changes in the ventricular phenotype, such as volume overload in HLHS, could lead to serious consequences that are not yet fully understood [48].

4. Materials and Methods

4.1. Selection of Ontologies for CHD

The complete list of gene ontologies (GOs) and phenotype ontologies was obtained from the AmiGO and Human Phenotype Ontology (HPO) databases, respectively. The GO search was performed using keywords (“cardiac”, “heart”, “cardio”, “myocardial”, “myocardium”, “atrium”, “atrial”, “ventricular”, “ventricle”, “septum”, “septal”, “valve”) with the GO.db package (R v.4.3.1), which identified 727 ontologies, of which 430 were selected for the next step. This selection was conducted by two independent authors who reviewed the ontologies and retained those considered relevant.
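The keyword step can be illustrated with a small filter over term names. The Python sketch below is not the authors' GO.db code: it assumes case-insensitive substring matching, and the five example term names are an illustrative sample rather than the 727 terms retrieved in the study.

```python
# Keywords listed in the text for the GO search.
KEYWORDS = (
    "cardiac", "heart", "cardio", "myocardial", "myocardium", "atrium",
    "atrial", "ventricular", "ventricle", "septum", "septal", "valve",
)

def matches_keyword(term_name, keywords=KEYWORDS):
    """True if any keyword occurs in the term name (case-insensitive)."""
    name = term_name.lower()
    return any(keyword in name for keyword in keywords)

# Illustrative sample of ontology term names (assumption, not study data).
example_terms = [
    "heart development",
    "cardiac muscle contraction",
    "ventricular septum development",
    "mitotic cell cycle",
    "lung development",
]

candidate_terms = [t for t in example_terms if matches_keyword(t)]
print(candidate_terms)
```

A keyword filter of this kind is deliberately broad, which is consistent with the manual review described above, where two authors reduced the 727 matched ontologies to 430.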
With respect to the HPO, the selection was based on the CHDs diagnosed at the Hospital de Clínicas de Porto Alegre, according to a project on active vigilance of congenital anomalies, based on Cardoso et al., 2021 [49]. A total of 2448 genes were found for 19 phenotypes. After removing duplicate genes, 1111 genes were selected for further analysis. To better visualize the selected genes from both ontologies, Venn diagrams were generated throughout the study (https://bioinformatics.psb.ugent.be/webtools/Venn/, accessed on 15 September 2024) [50].
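Removing duplicates across phenotypes and intersecting the result with the GO gene list are both set operations. The Python sketch below shows the idea with placeholder phenotype and gene names; all identifiers in the example data are hypothetical and stand in for the 19 phenotypes and their gene lists.

```python
def unique_genes(phenotype_to_genes):
    """Collapse per-phenotype gene lists into one duplicate-free set."""
    genes = set()
    for gene_list in phenotype_to_genes.values():
        genes.update(gene_list)
    return genes

def total_annotations(phenotype_to_genes):
    """Count gene entries before duplicates are removed."""
    return sum(len(gene_list) for gene_list in phenotype_to_genes.values())

# Placeholder data: genes may be annotated to more than one phenotype.
hpo_example = {
    "phenotype_1": ["GENE_A", "GENE_B", "GENE_C"],
    "phenotype_2": ["GENE_B", "GENE_C", "GENE_D"],
    "phenotype_3": ["GENE_A", "GENE_E"],
}
go_example = {"GENE_C", "GENE_E", "GENE_F"}

hpo_genes = unique_genes(hpo_example)
shared = hpo_genes & go_example

print(total_annotations(hpo_example), len(hpo_genes), sorted(shared))
```

In the study, the same reduction takes the 2448 gene entries found for the 19 phenotypes down to 1111 unique genes.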

4.2. Systems Biology Analysis

The genes selected from the ontology analyses were input into the STRING v.12 tool [51], where networks of protein–protein interactions (PPIs) for Homo sapiens were generated based on experimental evidence of interactions and co-expression, with a minimum interaction score of 0.400 (medium confidence). The assembled networks were then imported into the Cytoscape v.3.10.0 software to calculate network statistics [52]. Two critical parameters were considered: (1) betweenness centrality, which reflects how often a node lies on the shortest paths between other nodes, so that nodes with high betweenness centrality may be crucial for regulating information flow; and (2) closeness centrality, which measures how efficiently information spreads from a given node to all others [53]. A comparison between the GO and HPO networks was performed using the DyNet application of Cytoscape v.3.10.0.
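The two centrality measures can be computed from first principles on a small graph. The Python sketch below filters scored edges at 0.4, then computes closeness from breadth-first distances and betweenness with Brandes' algorithm for unweighted graphs. It is an illustration only: the node names and scores are hypothetical, the betweenness values are raw pair counts (Cytoscape may report a normalized value), and treating the cutoff as inclusive is an assumption.

```python
from collections import deque

def adjacency_from_scored_edges(scored_edges, min_score=0.4):
    """Keep edges whose score is at least `min_score` (assumed inclusive)."""
    adj = {}
    for a, b, score in scored_edges:
        if score >= min_score and a != b:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj

def closeness(adj, node):
    """Reachable nodes divided by the sum of shortest-path lengths."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical scored interactions; the A-D edge falls below the cutoff.
scored = [("A", "B", 0.9), ("B", "C", 0.7), ("C", "D", 0.45), ("A", "D", 0.2)]
net = adjacency_from_scored_edges(scored)
bc = betweenness(net)

print(bc)
print(closeness(net, "B"), closeness(net, "A"))
```

In this toy chain A-B-C-D, the inner nodes B and C carry all paths between the ends, which is the sense in which a gene with high betweenness can act as a mediator in the network.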

4.3. Selection and Analysis of Gene Expression Data

Gene expression datasets were obtained from the Gene Expression Omnibus (GEO) repository [54] using the following search strategy: ((cardiac OR heart OR cardio) AND (anomaly OR defect OR malformation)), filtered for Homo sapiens. The data from each study were collected using the GEO Scraper script and selected by two authors. The inclusion criteria were as follows: studies of gene expression in human cells, tissues, or samples from patients diagnosed with CHD, conducted using microarray or RNA-seq technologies. Studies without raw data available in the GEO, knockout studies, studies with only four samples (n = 4) in the case and/or control groups, and studies without a CHD diagnosis were excluded from the analysis.
For microarray studies, the datasets were downloaded manually and analyzed in the R language using Robust Multi-array Average (RMA) with the affy package [55]. Differentially expressed genes were calculated using the limma package [56]. RNA-seq data followed the pipelines described by Conesa et al., 2016 [57]. Sequence read archives (SRAs) were uploaded into the Galaxy platform [58] using the fastq-dump tool [59]. The quality of the sequences was assessed using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/; accessed on 7 September 2024) [60], followed by sequence alignment to the reference genome GRCh38 (hg38 canonical) using Bowtie2 [61], and transcript counting with featureCounts [62]. In R, normalization was performed using the Trimmed Mean of M-values (TMM) method, and differentially expressed genes were analyzed using the edgeR package [63]. Genes with |logFC| > 1 and an adjusted p-value < 0.05 were considered significantly differentially expressed.
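The significance rule at the end of this paragraph can be sketched as a two-step filter: adjust the p-values, then keep genes that pass both cutoffs. The Python example below assumes the Benjamini-Hochberg adjustment, which the text does not name explicitly; the gene labels and values are hypothetical and the code does not reproduce limma or edgeR.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_genes(genes, logfc, pvalues, lfc_cut=1.0, alpha=0.05):
    """Genes with |logFC| > lfc_cut and adjusted p-value < alpha."""
    adj = benjamini_hochberg(pvalues)
    return [
        (gene, "up" if lfc > 0 else "down")
        for gene, lfc, q in zip(genes, logfc, adj)
        if abs(lfc) > lfc_cut and q < alpha
    ]

# Hypothetical results table for four genes.
genes = ["g1", "g2", "g3", "g4"]
logfc = [2.0, -1.5, 0.4, 1.2]
pvals = [0.001, 0.01, 0.02, 0.5]

hits = significant_genes(genes, logfc, pvals)
print(hits)
```

Here g3 fails the fold-change cutoff despite a small adjusted p-value, and g4 fails the p-value cutoff despite a large fold change, so only g1 and g2 are retained.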
Using the STRING tool, we analyzed the combined GO and HPO networks together with the differentially expressed genes, as previously defined. Each category was individually integrated with the GO/HPO network, and the resulting networks were applied to Cytoscape v.3.10.0. We evaluated the interaction between the genes found in GO/HPO and the differentially expressed genes, and identified if any gene presented gene ontology and/or phenotypic characteristics associated with CHD.
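One way to express the question asked in this step is to look for edges that connect an ontology-derived gene to a differentially expressed gene outside the ontology lists. The Python sketch below does this on placeholder data; the function name `bridging_edges` and all node names are hypothetical, and the study itself assessed these interactions in STRING and Cytoscape.

```python
def bridging_edges(edges, ontology_genes, deg_genes):
    """Edges linking an ontology gene to a DEG absent from the ontologies."""
    ontology = set(ontology_genes)
    deg_only = set(deg_genes) - ontology
    return [
        (a, b)
        for a, b in edges
        if (a in ontology and b in deg_only) or (b in ontology and a in deg_only)
    ]

# Placeholder network: ONT_* come from GO/HPO, DEG_* from expression data.
example_edges = [
    ("ONT_1", "ONT_2"),
    ("ONT_2", "DEG_1"),
    ("DEG_1", "DEG_2"),
    ("DEG_3", "ONT_1"),
]
ontology_nodes = ["ONT_1", "ONT_2"]
deg_nodes = ["DEG_1", "DEG_2", "DEG_3"]

links = bridging_edges(example_edges, ontology_nodes, deg_nodes)
print(links)
```

An empty result for a disease group would correspond to the situation reported for the Trisomy 21 with CHD datasets, where no interaction between the two gene sets was detected.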

4.4. Enrichment Analyses

The differentially expressed genes identified through DGE that were not common with the gene lists obtained from the GO and HPO were evaluated concerning ontologies and signaling pathways. The GO repositories and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were accessed using the clusterProfiler package [64], using over-representation analysis. For this analysis, the studies were divided into three groups based on CHDs: (1) isolated CHD studies (Tetralogy of Fallot); (2) one isolated CHD study (Hypoplastic Left Heart Syndrome); and (3) CHD studies associated with Down syndrome. The genes identified in the HPO that were not common with those found in the GO were also subjected to the same analysis. Biological processes, molecular functions, and cellular components were the types of ontologies accessed in this analysis. A summary of all databases used in this study can be seen in Table 3.
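Over-representation analysis is commonly based on a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of annotated genes in a list drawn at random from the background. The Python sketch below implements that test under this assumption; the parameter names and the toy numbers are hypothetical, and it does not reproduce clusterProfiler's handling of backgrounds or multiple-testing correction.

```python
from math import comb

def ora_pvalue(overlap, list_size, annotated, background):
    """P(X >= overlap) for X ~ Hypergeometric(background, annotated, list_size).

    overlap    : genes in the list that carry the annotation
    list_size  : number of genes in the list being tested
    annotated  : genes in the background that carry the annotation
    background : total number of genes in the background
    """
    upper = min(list_size, annotated)
    total = comb(background, list_size)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 10 background genes, 4 annotated, a list of 3 with 2 hits.
p_toy = ora_pvalue(overlap=2, list_size=3, annotated=4, background=10)
print(p_toy)
```

A small p-value for a term means the gene list contains more members of that term than expected by chance; in practice the p-values from all tested terms are then adjusted for multiple testing.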

5. Conclusions

The approach used in this study allowed us to integrate several genes previously described for congenital heart disease (CHD) and to explore new interactions based on the phenotypes analyzed from public datasets. The availability of these data allowed us to identify and characterize 2513 genes across the six studies analyzed, while seeking to understand the biological processes involved and the interactions between these genes. We identified genes that have not yet been directly described in the CHD literature. Because the heart is the first functional organ to develop, in parallel with other biological processes that occur during embryogenesis, it is necessary to understand cellular differentiation in cardiogenesis, especially when seeking to explain the varied phenotypes of CHD and the repercussions of the disease. Therefore, the data presented will contribute to a better understanding of CHD and provide valuable insights for future research.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms252212052/s1.

Author Contributions

Conceptualization, F.G.d.O. and T.W.K.; methodology, F.G.d.O., J.V.P.F. and Y.C.S.M.; software, F.G.d.O., J.V.P.F. and Y.C.S.M.; validation, F.G.d.O. and J.V.P.F.; formal analysis, F.G.d.O., J.V.P.F. and T.W.K.; investigation, F.G.d.O. and T.W.K.; resources, L.S.-F. and T.W.K.; data curation, F.G.d.O. and T.W.K.; writing—original draft preparation, F.G.d.O. and T.W.K.; writing—review and editing, L.S.-F. and T.W.K.; visualization, L.S.-F. and T.W.K.; supervision, L.S.-F. and T.W.K.; project administration, L.S.-F.; funding acquisition, T.W.K. and L.S.-F. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Accord OPAS/Ministério da Saúde/Fundação Médica do RS Projeto (2178-4) SCON2020-00173—Vigilância e Atenção em Anomalias Congênitas no RS; Hospital de Clínicas de Porto Alegre (HCPA)-Fundo de Incentivo à Pesquisa e Eventos (FIPE), grants no. 2019-0792 and 2020-0174. The scholarships of the authors were funded by the Fundação de Amparo à Pesquisa do Rio Grande do Sul (FAPERGS), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and Fundação Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). F.G.d.O. is the recipient of a CNPq scholarship (grant no 165593/2021-0), J.V.P.F. is the recipient of a CAPES scholarship (grant no 88887.834560/2023-00) and T.W.K. is the recipient of a CNPq scholarship (grant no. 150181/2023-0).

Institutional Review Board Statement

This study only used data publicly available from databases and repositories. Nevertheless, it was submitted to the Research Ethics Committee of Hospital de Clínicas de Porto Alegre and was approved under number CAAE 30886520910015327.

Informed Consent Statement

This study only used data publicly available in databases and repositories.

Data Availability Statement

The study only used data publicly available in databases and repositories. All the results generated are fully available in the Supplementary Materials.

Acknowledgments

We would like to thank Giovanna Giudicelli and the Bioinformatics Core of Hospital de Clínicas de Porto Alegre for their aid in the transcriptome processing.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Syamasundar, P. Congenital Heart Defects—A Review [Internet]. In Congenital Heart Disease—Selected Aspects; IntechOpen: London, UK, 2012.
  2. Dolk, H.; Loane, M.; Garne, E. European Surveillance of Congenital Anomalies (EUROCAT) Working Group. Congenital heart defects in Europe: Prevalence and perinatal mortality, 2000 to 2005. Circulation 2011, 123, 841–849.
  3. Liu, Y.; Chen, S.; Zühlke, L.; Black, G.C.; Choy, M.; Li, N.; Keavney, B.D. Global birth prevalence of congenital heart defects 1970–2017: Updated systematic review and meta-analysis of 260 studies. Int. J. Epidemiol. 2019, 48, 455–463.
  4. Singh, Y. Evaluation of a child with suspected congenital heart disease. Paediatr. Child Health 2018, 28, 556–561.
  5. Botto, L.D.; Lin, A.E.; Riehle-Colarusso, T.; Malik, S.; Correa, A. Seeking causes: Classifying and evaluating congenital heart defects in etiologic studies. Birth Defects Res. Part A Clin. Mol. Teratol. 2007, 79, 714–727.
  6. Diab, N.S.; Barish, S.; Dong, W.; Zhao, S.; Allington, G.; Yu, X.; Kahle, K.T.; Brueckner, M.; Jin, S.C. Molecular genetics and complex inheritance of congenital heart disease. Genes 2021, 12, 1020.
  7. Wu, H.; Yang, Y.; Jia, J.; Guo, T.; Lei, J.; Deng, Y.Z.; He, Y.; Wang, Y.; Peng, Z.; Zhang, Y.; et al. Maternal Preconception Hepatitis B Virus Infection and Risk of Congenital Heart Diseases in Offspring Among Chinese Women Aged 20 to 49 Years. JAMA Pediatr. 2023, 177, 498–505.
  8. Kalisch-Smith, J.I.; Ved, N.; Sparrow, D.B. Environmental risk factors for congenital heart disease. Cold Spring Harb. Perspect. Biol. 2020, 12, a037234.
  9. Zhang, T.N.; Wu, Q.J.; Liu, Y.S.; Lv, J.le.; Sun, H.; Chang, Q.; Liu, C.F.; Zhao, Y.H. Environmental Risk Factors and Congenital Heart Disease: An Umbrella Review of 165 Systematic Reviews and Meta-Analyses with More Than 120 Million Participants. Front. Cardiovasc. Med. 2021, 8, 640729.
  10. Mondal, D.; Shenoy, R.S.; Mishra, S. Retinoic Acid Embryopathy. Int. J. Appl. Basic Med. Res. 2017, 7, 264–265.
  11. Hedermann, G.; Hedley, P.L.; Thagaard, I.N.; Krebs, L.; Ekelund, C.K.; Sørensen, T.I.A.; Christiansen, M. Maternal obesity and metabolic disorders associate with congenital heart defects in the offspring: A systematic review. PLoS ONE 2021, 16, e0252343.
  12. Digilio, M.C.; Marino, B. What Is New in Genetics of Congenital Heart Defects? Front. Pediatr. 2016, 4, 120.
  13. Costain, G.; Silversides, C.K.; Bassett, A.S. The importance of copy number variation in congenital heart Disease. Genom. Med. 2016, 1, 16031.
  14. Wimalasundera, R.C.; Gardiner, H.M. Congenital heart disease and aneuploidy. Prenat. Diagn. 2004, 24, 1116–1122.
  15. Glessner, J.T.; Bick, A.G.; Ito, K.; Homsy, J.G.; Rodriguez-Murillo, L.; Fromer, M.; Mazaika, E.; Vardarajan, B.; Italia, M.; Leipzig, J.; et al. Increased frequency of de novo copy number variants in congenital heart disease by integrative analysis of single nucleotide polymorphism array and exome sequence data. Circ. Res. 2014, 115, 884–896.
  16. Pierpont, M.E.; Brueckner, M.; Chung, W.K.; Garg, V.; Lacro, R.V.; McGuire, A.L.; Mital, S.; Priest, J.R.; Pu, W.T.; Roberts, A.; et al. Genetic Basis for Congenital Heart Disease: Revisited: A Scientific Statement from the American Heart Association. Circulation 2018, 138, e653–e711.
  17. Zaidi, S.; Choi, M.; Wakimoto, H.; Ma, L.; Jiang, J.; Overton, J.D.; Romano-Adesman, A.; Bjornson, R.D.; Breitbart, R.E.; Brown, K.K.; et al. De novo mutations in histone-modifying genes in congenital heart disease. Nature 2013, 498, 220–223.
  18. Fahed, A.C.; Gelb, B.D.; Seidman, J.G.; Seidman, C.E. Genetics of congenital heart disease: The glass half empty. Circ. Res. 2013, 112, 707–720.
  19. Sifrim, A.; Hitz, M.P.; Wilsdon, A.; Breckpot, J.; Turki, S.H.A.; Thienpont, B.; McRae, J.; Fitzgerald, T.W.; Singh, T.; Swaminathan, G.J.; et al. Distinct genetic architectures for syndromic and nonsyndromic congenital heart defects identified by exome sequencing. Nat. Genet. 2016, 48, 1060–1065.
  20. Sperling, S.R. Systems biology approaches to heart development and congenital heart disease. Cardiovasc. Res. 2011, 91, 269–278.
  21. Yue, R.; Dutta, A. Computational systems biology in disease modeling and control, review and perspectives. Npj Syst. Biol. Appl. 2022, 8, 37.
  22. Zhu, J.; Zhao, Q.; Katsevich, E.; Sabatti, C. Exploratory Gene Ontology Analysis with Interactive Visualization. Sci. Rep. 2019, 9, 7793.
  23. Köhler, S.; Carmody, L.; Vasilevsky, N.; Jacobsen, J.O.B.; Danis, D.; Gourdine, J.P.; Gargano, M.; Harris, N.L.; Matentzoglu, N.; McMurry, J.A.; et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res. 2019, 47, D1018–D1027.
  24. Behiry, E.G.; Al-Azzouny, M.A.; Sabry, D.; Behairy, O.G.; Salem, N.E. Association of NKX2-5, GATA4, and TBX5 polymorphisms with congenital heart disease in Egyptian children. Mol. Genet. Genom. Med. 2019, 7, e612.
  25. Elliott, D.A.; Kirk, E.P.; Schaft, D.; Harvey, R.P. NK-2 Class Homeodomain Proteins: Conserved Regulators of Cardiogenesis. In Heart Development and Regeneration; Academic Press: Cambridge, MA, USA, 2010; Volume 1.
  26. Ware, S.M.; Lynn-Jefferies, J. New Genetic Insights into Congenital Heart Disease. J. Clin. Exp. Cardiol. 2012, 1.
  27. McBride, K.L.; Zender, G.A.; Fitzgerald-Butt, S.M.; Seagraves, N.J.; Fernbach, S.D.; Zapata, G.; Lewin, M.; Towbin, J.A.; Belmont, J.W. Association of common variants in ERBB4 with congenital left ventricular outflow tract obstruction defects. Birth Defects Res. A Clin. Mol. Teratol. 2011, 91, 162–168.
  28. Olayioye, M.A.; Neve, R.M.; Lane, H.A.; Hynes, N.E. The ErbB signaling network: Receptor heterodimerization in development and cancer. EMBO J. 2000, 19, 3159–3167.
  29. Crotti, L.; Johnson, C.N.; Graf, E.; De Ferrari, G.M.; Cuneo, B.F.; Ovadia, M.; Papagiannis, J.; Feldkamp, M.D.; Rathi, S.G.; Kunic, J.D.; et al. Calmodulin mutations associated with recurrent cardiac arrest in infants. Circulation 2013, 127, 1009–1017.
  30. Shue, E.; Wu, J.; Schecter, S.; Miniati, D. Aberrant pulmonary lymphatic development in the nitrofen mouse model of congenital diaphragmatic hernia. J. Pediatr. Surg. 2013, 48, 1198–1204.
  31. de Mooij, Y.M.; Van den Akker, N.M.; Bekker, M.N.; Bartelings, M.M.; Van Vugt, J.M.; Gittenberger-de-Groot, A.C. Aberrant lymphatic development in euploid fetuses with increased nuchal translucency including Noonan syndrome. Prenat. Diagn. 2011, 31, 159–166.
  32. Praml, C.; Savelyeva, L.; Perri, P.; Schwab, M. Cloning of the human aflatoxin B1-aldehyde reductase gene at 1p35-1p36.1 in a region frequently altered in human tumor cells. Cancer Res. 1998, 58, 5014–5018, Erratum in Cancer Res. 1999, 59, 3019.
  33. Khajeniazi, S.; Marjani, A.; Shakeri, R.; Hakimi, S. Polymorphism of Secretary PLA2G2A Gene Associated with Its Serum Level in Type2 Diabetes Mellitus Patients in Northern Iran. Endocr. Metab. Immune Disord. Drug Targets. 2019, 19, 1192–1197.
  34. Hendi, N.N.; Nemer, G. In silico characterization of the novel SDR42E1 as a potential vitamin D modulator. J. Steroid Biochem. Mol. Biol. 2024, 238, 106447.
  35. Bouhouche, A.; Albaroudi, N.; El Alaoui, M.A.; Askander, O.; Habbadi, Z.; El Hassani, A.; Iraqi, H.; El Fahime, E.; Belmekki, M. Identification of the novel SDR42E1 gene that affects steroid biosynthesis associated with the oculocutaneous genital syndrome. Exp. Eye Res. 2021, 209, 108671.
  36. Wang, B.; Xu, M.; Zhao, J.; Yin, N.; Wang, Y.; Song, T. Single-cell Transcriptomics Reveals Activation of Macrophages in All-trans Retinoic Acid (atRA)-induced Cleft Palate. J. Craniofacial Surg. 2024, 35, 177–184.
  37. Girerd, B.; Perros, F.; Antigny, F.; Humbert, M.; Montani, D. KCNK3: New gene target for pulmonary hypertension? Expert. Rev. Respir. Med. 2014, 8, 385–387.
  38. Saint-Martin-Willer, A.; Santos-Gomes, J.; Adão, R.; Brás-Silva, C.; Eyries, M.; Pérez-Vizcaino, F.; Capuano, V.; Montani, D.; Antigny, F. Physiological and pathophysiological roles of the KCNK3 potassium channel in the pulmonary circulation and the heart. J. Physiol. 2023, 601, 3717–3737.
  39. Kern, C.B.; Wessels, A.; McGarity, J.; Dixon, L.J.; Alston, E.; Argraves, W.S.; Geeting, D.; Nelson, C.M.; Menick, D.R.; Apte, S.S. Reduced versican cleavage due to Adamts9 haploinsufficiency is associated with cardiac and aortic anomalies. Matrix Biol. 2010, 29, 304–316.
  40. Padang, R.; Bagnall, R.D.; Tsoutsman, T.; Bannon, P.G.; Semsarian, C. Comparative transcriptome profiling in human bicuspid aortic valve disease using RNA sequencing. Physiol. Genom. 2015, 47, 75–87.
  41. Bailliard, F.; Anderson, R.H. Tetralogy of Fallot. Orphanet J. Rare Dis. 2009, 4, 2.
  42. Morgenthau, A.; Frishman, W.H. Genetic Origins of Tetralogy of Fallot. Cardiol. Rev. 2018, 26, 86–92.
  43. Bejjani, A.T.; Wary, N.; Gu, M. Hypoplastic left heart syndrome (HLHS): Molecular pathogenesis and emerging drug targets for cardiac repair and regeneration. Expert. Opin. Ther. Targets 2021, 25, 621–632.
  44. Dimopoulos, K.; Constantine, A.; Clift, P.; Condliffe, R.; Moledina, S.; Jansen, K.; Inuzuka, R.; Veldtman, G.R.; Cua, C.L.; Tay, E.L.W.; et al. Cardiovascular Complications of Down Syndrome: Scoping Review and Expert Consensus. Circulation 2023, 147, 425–441.
  45. Ward, E.J.; Bert, S.; Fanti, S.; Malone, K.M.; Maughan, R.T.; Gkantsinikoudi, C.; Prin, F.; Volpato, L.K.; Piovezan, A.P.; Graham, G.J.; et al. Placental Inflammation Leads to Abnormal Embryonic Heart Development. Circulation 2023, 147, 956–972.
  46. Radhakrishna, U.; Albayrak, S.; Zafra, R.; Baraa, A.; Vishweswaraiah, S.; Veerappa, A.M.; Mahishi, D.; Saiyed, N.; Mishra, N.K.; Guda, C.; et al. Placental epigenetics for evaluation of fetal congenital heart defects: Ventricular Septal Defect (VSD). PLoS ONE 2019, 14, e0200229.
  47. Adams, R.H.; Porras, A.; Alonso, G.; Jones, M.; Vintersten, K.; Panelli, S.; Valladares, A.; Perez, L.; Klein, R.; Nebreda, A.R. Essential role of p38alpha MAP kinase in placental but not embryonic cardiovascular development. Mol. Cell 2000, 6, 109–116.
  48. Sun, S.; Hu, Y.; Xiao, Y.; Wang, S.; Jiang, C.; Liu, J.; Zhang, H.; Hong, H.; Li, F.; Ye, L. Postnatal Right Ventricular Developmental Track Changed by Volume Overload. J. Am. Heart Assoc. 2021, 10, e020854.
  49. Cardoso-dos-Santos, A.C.; Medeiros-De-Souza, A.C.; Bremm, J.M.; Alves, R.F.S.; de Araújo, V.E.M.; Leite, J.C.L.; Schuler-Faccini, L.; Sanseverino, M.T.V.; Karam, S.d.M.; Félix, T.M.; et al. Lista de anomalias congênitas prioritárias para vigilância no âmbito do Sistema de Informações sobre Nascidos Vivos do Brasil. Epidemiol. E Serv. Saude 2021, 30, e2020835.
  50. Bioinformatics & Evolutionary Genomics. 2024. Available online: https://bioinformatics.psb.ugent.be/webtools/Venn/ (accessed on 15 September 2024).
  51. Szklarczyk, D.; Gable, A.L.; Lyon, D.; Junge, A.; Wyder, S.; Huerta-Cepas, J.; Simonovic, M.; Doncheva, N.T.; Morris, J.H.; Bork, P.; et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019, 47, D607–D613.
  52. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003, 13, 2498–2504.
  53. Yu, H.; Kim, P.M.; Sprecher, E.; Trifonov, V.; Gerstein, M. The Importance of Bottlenecks in Protein Networks: Correlation with Gene Essentiality and Expression Dynamics. PLoS Comput. Biol. 2007, 3, e59.
  54. Clough, E.; Barrett, T. The Gene Expression Omnibus Database. In Statistical Genomics; Springer: New York, NY, USA, 2016; pp. 93–110.
  55. Gautier, L.; Cope, L.; Bolstad, B.M.; Irizarry, R.A. affy—Analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 2004, 20, 307–315.
  56. Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47.
  57. Conesa, A.; Madrigal, P.; Tarazona, S.; Gomez-Cabrero, D.; Cervera, A.; McPherson, A.; Szcześniak, M.W.; Gaffney, D.J.; Elo, L.L.; Zhang, X.; et al. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016, 17, 13.
  58. Afgan, E.; Baker, D.; Batut, B.; van den Beek, M.; Bouvier, D.; Čech, M.; Chilton, J.; Clements, D.; Coraor, N.; Grüning, B.A.; et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 2018, 46, W537–W544.
  59. Leinonen, R.; Sugawara, H.; Shumway, M. The Sequence Read Archive. Nucleic Acids Res. 2011, 39, D19–D21.
  60. FastQC: A Quality Control Tool for High Throughput Sequence Data. 2023. Available online: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 7 May 2024).
  61. Langmead, B.; Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 2012, 9, 357–359.
  62. Liao, Y.; Smyth, G.K.; Shi, W. FeatureCounts: An efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 2014, 30, 923–930.
  63. Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2009, 26, 139–140.
  64. Yu, G.; Wang, L.G.; Han, Y.; He, Q.Y. ClusterProfiler: An R package for comparing biological themes among gene clusters. OMICS A J. Integr. Biol. 2012, 16, 284–287.
  65. The Gene Ontology Consortium; Aleksander, S.A.; Balhoff, J.; Carbon, S.; Cherry, J.M.; Drabkin, H.J.; Ebert, D.; Feuermann, M.; Gaudet, P.; Harris, N.L.; et al. The Gene Ontology knowledgebase in 2023. Genetics 2023, 224, iyad031.
  66. Gargano, M.A.; Matentzoglu, N.; Coleman, B.; Addo-Lartey, E.B.; Anagnostopoulos, A.V.; Anderton, J.; Avillach, P.; Bagley, A.M.; Bakštein, E.; Balhoff, J.P.; et al. The Human Phenotype Ontology in 2024: Phenotypes around the world. Nucleic Acids Res. 2024, 52, D1333–D1346.
  67. Barrett, T.; Wilhite, S.E.; Ledoux, P.; Evangelista, C.; Kim, I.F.; Tomashevsky, M.; Marshall, K.A.; Phillippy, K.H.; Sherman, P.M.; Holko, M.; et al. NCBI GEO: Archive for functional genomics data sets—Update. Nucleic Acids Res. 2013, 41, D991–D995.
  68. Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.; et al. The STRING database in 2023: Protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023, 51, D638–D646.
  69. Kanehisa, M.; Furumichi, M.; Sato, Y.; Matsuura, Y.; Ishiguro-Watanabe, M. KEGG: Biological systems database as a model of the real world. Nucleic Acids Res. 2024.
Figure 1. Schematic research strategy comprising systems biology analysis, differential gene expression analysis, and enrichment analyses.
Figure 2. A network comprising experimental evidence of protein–protein interactions, considering genes in common between the GO and HPO repositories. The size of each node reflects the speed of information flow, with larger nodes indicating a faster flow. Pink and purple nodes represent genes with the highest levels of betweenness and closeness centrality, acting as critical control points for the flow and the speed at which information is relayed to other genes in the network. The green nodes—NOTCH1, TNNI3, and SMAD4—exhibit significant closeness centrality, playing key roles in mediating information flow within smaller groups of genes in the network.
Figure 3. A network of interactions between differentially expressed genes from each study and genes associated with the Gene Ontology (GO) and Human Phenotype Ontology (HPO). (A) Purple nodes represent GO/HPO genes, pink nodes represent genes from Tetralogy of Fallot studies, and blue nodes represent common genes identified and already described for gene ontologies and phenotypes associated with CHD. (B) Purple nodes represent GO/HPO genes, orange nodes represent genes from the Hypoplastic Left Heart Syndrome study, and magenta nodes represent common genes identified and already described for gene ontologies and phenotypes associated with CHDs. (C) Purple nodes represent GO/HPO genes, and green nodes represent genes associated with Trisomy 21/CHD. In this case, there was no interaction between GO/HPO genes and differentially expressed genes, and no genes in common between the databases.
Table 1. The proportion of differentially expressed genes considering downregulated genes, upregulated genes, and shared genes in the GO and HPO repositories.

| Study | Study Type | CHD | Controls (N) | Cases (N) | DGE (N) ¹ | Upregulated | Downregulated | GO (N) ² | HPO (N) ³ | GO + HPO |
|---|---|---|---|---|---|---|---|---|---|---|
| GSE196443 | RNA-Seq | Trisomy 21/CHD | 5 | 5 | 23 | 23 | 0 | 2 | 0 | 0 |
| GSE217557 | RNA-Seq | Trisomy 21/CHD | 32 | 50 | 13 | 12 | 1 | 0 | 0 | 0 |
| GSE36761 | RNA-Seq | ToF ⁴ | 7 | 22 | 2228 | 727 | 1501 | 65 | 59 | 13 |
| GSE132401 | RNA-Seq | ToF ⁴ | 5 | 5 | 111 | 69 | 42 | 6 | 5 | 4 |
| GSE141955 | Microarray | ToF ⁴ | 6 | 9 | 35 | 2 | 33 | 0 | 0 | 0 |
| GSE23959 | Microarray | HLHS ⁵ | 6 | 10 | 184 | 130 | 54 | 9 | 9 | 2 |

¹ Differentially expressed genes identified. ² Gene Ontology. ³ Human Phenotype Ontology. ⁴ Tetralogy of Fallot. ⁵ Hypoplastic Left Heart Syndrome.
Table 2. Genes that may contribute to the development of CHD and their functions.

| Gene | Analysis Source | Gene Function |
|---|---|---|
| EP300 | Systems Biology Network | Chromatin binding and transcription coactivator activity. |
| CALM3 | Systems Biology Network | Calcium ion binding and protein domain specific binding. |
| EGFR | Systems Biology Network | Identical protein binding and protein kinase activity. |
| NOTCH1 | Systems Biology Network | DNA-binding transcription factor activity and sequence-specific DNA binding. |
| TNNI3 | Systems Biology Network | Protein kinase binding and protein domain specific binding. |
| SMAD4 | Systems Biology Network | DNA-binding transcription factor activity and sequence-specific DNA binding. |
| LYVE1 | DGE | Signaling receptor activity and hyaluronic acid binding. |
| PLA2G2A | DGE | Calcium ion binding and phospholipase A2 activity. |
| SDR42E1 | DGE | Oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor and 3-beta-hydroxy-delta5-steroid dehydrogenase activity. |
| MRC1 | DGE | Signaling receptor activity and mannose binding. |
| KCNK3 | DGE | Protein homodimerization activity and obsolete protein C-terminus binding. |
| ADAMTS9 | DGE | Metalloendopeptidase activity and endopeptidase activity. |
Table 3. Description of databases used in this study.

| Database | Description | Main Features | Purpose of the Study | Reference |
|---|---|---|---|---|
| Gene Ontology (GO) | It provides structured information about genetic functions, serving as the basis for computational analysis of large-scale molecular biology and genetic experiments. | Data availability in three categories: biological process, molecular function, and cellular component. | Identify the available ontologies for the development of CHD. | The Gene Ontology Consortium, 2023 [65] |
| Human Phenotype Ontology (HPO) | It provides an ontology of clinically relevant phenotypes, disease phenotype annotations, and the algorithms that operate on them. The HPO can be used to support differential diagnoses, translational research, and a range of applications in computational biology, providing the means to compute clinical phenotypes. | Describes phenotypic abnormalities in human diseases. | Identify phenotypes associated with CHD. | Gargano et al., 2024 [66] |
| Gene Expression Omnibus (GEO) | A public repository for high-throughput gene expression data, where you can access datasets from multiple organisms and biological conditions. | It includes publicly accessible gene expression, microarray, and RNA-Seq data. | Investigate gene expression profiles related to CHD. | Barrett et al., 2013 [67] |
| STRING | It systematically integrates protein–protein interactions from diverse sources, including the scientific literature, experimental databases, and computational predictions. | Data are curated from diverse sources: scientific literature, computational interaction predictions, coexpression, conserved genomic context, databases of interaction experiments, and known complexes/pathways from curated sources. | Identify relevant interaction networks for CHD-related genes using experimental data and coexpression. | Szklarczyk et al., 2023 [68] |
| Kyoto Encyclopedia of Genes and Genomes (KEGG) | A database for representing and analyzing biological systems, with maps of metabolic and signaling pathways, cellular interactions, and disease pathways. | Includes information on genes and proteins, disease pathways, drug information, and integration with other databases. | Identify biological pathways involved in CHD and their associated genes. | Kanehisa et al., 2024 [69] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

de Oliveira, F.G.; Foletto, J.V.P.; Medeiros, Y.C.S.; Schuler-Faccini, L.; Kowalski, T.W. Bioinformatic Multi-Strategy Profiling of Congenital Heart Defects for Molecular Mechanism Recognition. Int. J. Mol. Sci. 2024, 25, 12052. https://doi.org/10.3390/ijms252212052
