We report the first large-scale exome-wide analysis of the combined germline-somatic landscape in ovarian cancer. Here we analyze germline and somatic alterations in 429 ovarian carcinoma cases and 557 controls. We identify 3,635 high-confidence, rare truncation and 22,953 missense...
PURPOSE: Cancer neoantigens are important targets of cancer immunotherapy. Neoantigen vaccines have the potential to induce or enhance highly specific antitumor immune responses with minimal risk of autoimmunity. We have developed a neoantigen DNA vaccine platform capable of efficiently presenting both HLA class I and II epitopes. To test the safety, feasibility and efficacy of this platform, we performed a phase 1 clinical trial in triple negative breast cancer patients with persistent disease following neoadjuvant chemotherapy, a patient population at high risk of disease recurrence. EXPERIMENTAL DESIGN: Expressed somatic mutations were identified by tumor/normal exome sequencing and tumor RNA sequencing. The pVACtools software suite was used to identify and prioritize cancer neoantigens. Neoantigen DNA vaccines were designed and manufactured in an academic GMP facility at Washington University School of Medicine. Neoantigen DNA vaccines were administered via electroporation follo...
The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reprodu...
3965 Poster Board III-901 We have recently established that whole genome sequencing is a valid, unbiased approach that can identify novel candidate mutations that may be important for AML pathogenesis (Ley et al Nature 2008, Mardis et al NEJM 2009). Acute promyelocytic leukemia (APL, FAB M3 AML) is a subtype of AML characterized by the t(15;17)(q22;q11.2) translocation that creates an oncogenic fusion gene, PML-RARA. Our laboratory has previously modeled APL in a mouse in an effort to understand the genetic events that lead to the disease. In our knockin mouse model, a human PML-RARA cDNA was targeted to the 5' untranslated region of the mouse cathepsin G gene on chromosome 14 (mCG-PR). The targeting vector was transfected into the RW-4 embryonic stem cell line, derived from a 129/SvJ mouse. The transfected RW-4 cells were injected into C57Bl/6 blastocysts, and chimeric offspring were bred to C57Bl/6 mice. F1 129/SvJ x C57Bl/6 mice were subsequently backcrossed onto the B6/Tacon...
404 To characterize the genomic events associated with distinct subtypes of AML, we used whole genome sequencing to compare 24 tumor/normal sample pairs from patients with normal karyotype (NK) M1-AML (12 cases) and t(15;17)-positive M3-AML (12 cases). All single nucleotide variants (SNVs), small insertions and deletions (indels), and cryptic structural variants (SVs) identified by whole genome sequencing (average coverage 28x) were validated using sample-specific custom Nimblegen capture arrays, followed by Illumina sequencing; an average coverage of 972 reads per somatic variant yielded 10,597 validated somatic variants (average 421/genome). Of these somatic mutations, 308 occurred in 286 unique genes; on average, 9.4 somatic mutations per genome had translational consequences. Several important themes emerged: 1) AML genomes contain a diverse range of recurrent mutations. We assessed the 286 mutated genes for recurrency in an additional 34 NK M1-AML cases and 9 M3-AML cases. We i...
Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biology, and in most cancers outside the sex organs. Efforts to link these clinical differences to specific molecular features have focused on somatic mutations within the coding regions of the genome. Here, we describe the first pan-cancer analysis of sex differences in whole genomes of 1,983 tumours of 28 subtypes from the ICGC Pan-Cancer Analysis of Whole Genomes project. We both confirm the results of exome studies and uncover previously undescribed sex differences. These include sex-biases in coding and non-coding cancer drivers, in mutation prevalence and, strikingly, in mutational signatures related to underlying mutational processes. These results underline the pervasiveness of molecular sex differences and strengthen the call for increased consideration of sex in cancer research. Sex disparities in cancer epidemiology include an increased overall cancer risk in males corresponding w...
Sarcomas are a broad family of mesenchymal malignancies exhibiting remarkable histologic diversity. We describe the multi-platform molecular landscape of 206 adult soft tissue sarcomas representing 6 major types. Along with novel insights into the biology of individual sarcoma types, we report three overarching findings: (1) unlike most epithelial malignancies, these sarcomas (excepting synovial sarcoma) are characterized predominantly by copy-number changes, with low mutational loads and only a few genes (TP53, ATRX, RB1) highly recurrently mutated across sarcoma types; (2) within sarcoma types, genomic and regulomic diversity of driver pathways defines molecular subtypes associated with patient outcome; and (3) the immune microenvironment, inferred from DNA methylation and mRNA profiles, associates with outcome and may inform clinical trials of immune checkpoint inhibitors. Overall, this large-scale analysis reveals previously unappreciated sarcoma-type-specific changes in copy nu...
Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, proposing a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology an...
Protein ubiquitination is a dynamic and reversible process of adding single ubiquitin molecules or various ubiquitin chains to target proteins. Here, using multidimensional omic data of 9,125 tumor samples across 33 cancer types from The Cancer Genome Atlas, we perform comprehensive molecular characterization of 929 ubiquitin-related genes and 95 deubiquitinase genes. Among them, we systematically identify top somatic driver candidates, including mutated FBXW7 with cancer-type-specific patterns and amplified MDM2 showing a mutually exclusive pattern with BRAF mutations. Ubiquitin pathway genes tend to be upregulated in cancer mediated by diverse mechanisms. By integrating pan-cancer multiomic data, we identify a group of tumor samples that exhibit worse prognosis. These samples are consistently associated with the upregulation of cell-cycle and DNA repair pathways, characterized by mutated TP53, MYC/TERT amplification, and APC/PTEN deletion. Our analysis highlights the importance of...
DNA damage repair (DDR) pathways modulate cancer risk, progression, and therapeutic response. We systematically analyzed somatic alterations to provide a comprehensive view of DDR deficiency across 33 cancer types. Mutations with accompanying loss of heterozygosity were observed in over 1/3 of DDR genes, including TP53 and BRCA1/2. Other prevalent alterations included epigenetic silencing of the direct repair genes EXO5, MGMT, and ALKBH3 in ∼20% of samples. Homologous recombination deficiency (HRD) was present at varying frequency in many cancer types, most notably ovarian cancer. However, in contrast to ovarian cancer, HRD was associated with worse outcomes in several other cancers. Protein structure-based analyses allowed us to predict functional consequences of rare, recurrent DDR mutations. A new machine-learning-based classifier developed from gene expression data allowed us to identify alterations that phenocopy deleterious TP53 mutations. These frequent DDR gene alterations i...
Hotspot mutations in splicing factor genes have been recently reported at high frequency in hematological malignancies, suggesting the importance of RNA splicing in cancer. We analyzed whole-exome sequencing data across 33 tumor types in The Cancer Genome Atlas (TCGA), and we identified 119 splicing factor genes with significant non-silent mutation patterns, including mutation over-representation, recurrent loss of function (tumor suppressor-like), or hotspot mutation profile (oncogene-like). Furthermore, RNA sequencing analysis revealed altered splicing events associated with selected splicing factor mutations. In addition, we were able to identify common gene pathway profiles associated with the presence of these mutations. Our analysis suggests that somatic alteration of genes involved in the RNA-splicing process is common in cancer and may represent an underappreciated hallmark of tumorigenesis.
Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1...
We analyzed 921 adenocarcinomas of the esophagus, stomach, colon, and rectum to examine shared and distinguishing molecular characteristics of gastrointestinal tract adenocarcinomas (GIACs). Hypermutated tumors were distinct regardless of cancer type and comprised those enriched for insertions/deletions, representing microsatellite instability cases with epigenetic silencing of MLH1 in the context of CpG island methylator phenotype, plus tumors with elevated single-nucleotide variants associated with mutations in POLE. Tumors with chromosomal instability were diverse, with gastroesophageal adenocarcinomas harboring fragmented genomes associated with genomic doubling and distinct mutational signatures. We identified a group of tumors in the colon and rectum lacking hypermutation and aneuploidy termed genome stable and enriched in DNA hypermethylation and mutations in KRAS, SOX9, and PCBP1.
For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedente...
We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes (wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant) characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tum...
Renal cell carcinoma (RCC) is not a single disease, but several histologically defined cancers with different genetic drivers, clinical courses, and therapeutic responses. The current study evaluated 843 RCC from the three major histologic subtypes, including 488 clear cell RCC, 274 papillary RCC, and 81 chromophobe RCC. Comprehensive genomic and phenotypic analysis of the RCC subtypes reveals distinctive features of each subtype that provide the foundation for the development of subtype-specific therapeutic and management strategies for patients affected with these cancers. Somatic alteration of BAP1, PBRM1, and PTEN and altered metabolic pathways correlated with subtype-specific decreased survival, while CDKN2A alteration, increased DNA hypermethylation, and increases in the immune-related Th2 gene expression signature correlated with decreased survival within all major histologic subtypes. CIMP-RCC demonstrated an increased immune signature, and a uniform and distinct metabolic e...
Cancer progression involves the gradual loss of a differentiated phenotype and acquisition of progenitor and stem-cell-like features. Here, we provide novel stemness indices for assessing the degree of oncogenic dedifferentiation. We used an innovative one-class logistic regression (OCLR) machine-learning algorithm to extract transcriptomic and epigenetic feature sets derived from non-transformed pluripotent stem cells and their differentiated progeny. Using OCLR, we were able to identify previously undiscovered biological mechanisms associated with the dedifferentiated oncogenic state. Analyses of the tumor microenvironment revealed unanticipated correlation of cancer stemness with immune checkpoint expression and infiltrating immune cells. We found that the dedifferentiated oncogenic phenotype was generally most prominent in metastatic tumors. Application of our stemness indices to single-cell data revealed patterns of intra-tumor molecular heterogeneity. Finally, the indices allo...
Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characteriza...
We report the first large-scale exome-wide analysis of the combined germline-somatic landscape in... more We report the first large-scale exome-wide analysis of the combined germline-somatic landscape in ovarian cancer. Here we analyze germline and somatic alterations in 429 ovarian carcinoma cases and 557 controls. We identify 3,635 high confidence, rare truncation and 22,953 missense Users may view, print, copy, download and text and data- mine the content in such documents, for the purposes of academic research,
PURPOSE: Cancer neoantigens are important targets of cancer immunotherapy. Neoantigen vaccines ha... more PURPOSE: Cancer neoantigens are important targets of cancer immunotherapy. Neoantigen vaccines have the potential to induce or enhance highly specific antitumor immune responses with minimal risk of autoimmunity. We have developed a neoantigen DNA vaccine platform capable of efficiently presenting both HLA class I and II epitopes. To test the safety, feasibility and efficacy of this platform, we performed a phase 1 clinical trial in triple negative breast cancer patients with persistent disease following neoadjuvant chemotherapy, a patient population at high risk of disease recurrence. EXPERIMENTAL DESIGN: Expressed somatic mutations were identified by tumor/normal exome sequencing and tumor RNA sequencing. The pVACtools software suite was used to identify and prioritize cancer neoantigens. Neoantigen DNA vaccines were designed and manufactured in an academic GMP facility at Washington University School of Medicine. Neoantigen DNA vaccines were administered via electroporation follo...
The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensu... more The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reprodu...
3965 Poster Board III-901 We have recently established that whole genome sequencing is a valid, u... more 3965 Poster Board III-901 We have recently established that whole genome sequencing is a valid, unbiased approach that can identify novel candidate mutations that may be important for AML pathogenesis (Ley et al Nature 2008, Mardis et al NEJM 2009). Acute promyelocytic leukemia (APL, FAB M3 AML) is a subtype of AML characterized by the t(15;17)(q22;q11.2) translocation that creates an oncogenic fusion gene, PML-RARA. Our laboratory has previously modeled APL in a mouse in an effort to understand the genetic events that lead to the disease. In our knockin mouse model, a human PML-RARA cDNA was targeted to the 5' untranslated region of the mouse cathepsin G gene on chromosome 14 (mCG-PR). The targeting vector was transfected into the RW-4 embryonic stem cell line, derived from a 129/SvJ mouse. The transfected RW-4 cells were injected into C57Bl/6 blastocysts, and chimeric offspring were bred to C57Bl/6 mice. F1 129/SvJ x C57Bl/6 mice were subsequently backcrossed onto the B6/Tacon...
404 To characterize the genomic events associated with distinct subtypes of AML, we used whole ge... more 404 To characterize the genomic events associated with distinct subtypes of AML, we used whole genome sequencing to compare 24 tumor/normal sample pairs from patients with normal karyotype (NK) M1-AML (12 cases) and t(15;17)-positive M3-AML (12 cases). All single nucleotide variants (SNVs), small insertions and deletions (indels), and cryptic structural variants (SVs) identified by whole genome sequencing (average coverage 28x) were validated using sample-specific custom Nimblegen capture arrays, followed by Illumina sequencing; an average coverage of 972 reads per somatic variant yielded 10,597 validated somatic variants (average 421/genome). Of these somatic mutations, 308 occurred in 286 unique genes; on average, 9.4 somatic mutations per genome had translational consequences. Several important themes emerged: 1) AML genomes contain a diverse range of recurrent mutations. We assessed the 286 mutated genes for recurrency in an additional 34 NK M1-AML cases and 9 M3-AML cases. We i...
Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biolo... more Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biology, and in most cancers outside the sex organs. Efforts to link these clinical differences to specific molecular features have focused on somatic mutations within the coding regions of the genome. Here, we describe the first pan-cancer analysis of sex differences in whole genomes of 1,983 tumours of 28 subtypes from the ICGC Pan-Cancer Analysis of Whole Genomes project. We both confirm the results of exome studies, and also uncover previously undescribed sex differences. These include sex-biases in coding and non-coding cancer drivers, mutation prevalence and strikingly, in mutational signatures related to underlying mutational processes. These results underline the pervasiveness of molecular sex differences and strengthen the call for increased consideration of sex in cancer research.Sex disparities in cancer epidemiology include an increased overall cancer risk in males corresponding w...
Sarcomas are a broad family of mesenchymal malignancies exhibiting remarkable histologic diversit... more Sarcomas are a broad family of mesenchymal malignancies exhibiting remarkable histologic diversity. We describe the multi-platform molecular landscape of 206 adult soft tissue sarcomas representing 6 major types. Along with novel insights into the biology of individual sarcoma types, we report three overarching findings: (1) unlike most epithelial malignancies, these sarcomas (excepting synovial sarcoma) are characterized predominantly by copy-number changes, with low mutational loads and only a few genes (TP53, ATRX, RB1) highly recurrently mutated across sarcoma types; (2) within sarcoma types, genomic and regulomic diversity of driver pathways defines molecular subtypes associated with patient outcome; and (3) the immune microenvironment, inferred from DNA methylation and mRNA profiles, associates with outcome and may inform clinical trials of immune checkpoint inhibitors. Overall, this large-scale analysis reveals previously unappreciated sarcoma-type-specific changes in copy nu...
Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations o... more Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, proposing a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology an...
Protein ubiquitination is a dynamic and reversible process of adding single ubiquitin molecules o... more Protein ubiquitination is a dynamic and reversible process of adding single ubiquitin molecules or various ubiquitin chains to target proteins. Here, using multidimensional omic data of 9,125 tumor samples across 33 cancer types from The Cancer Genome Atlas, we perform comprehensive molecular characterization of 929 ubiquitin-related genes and 95 deubiquitinase genes. Among them, we systematically identify top somatic driver candidates, including mutated FBXW7 with cancer-type-specific patterns and amplified MDM2 showing a mutually exclusive pattern with BRAF mutations. Ubiquitin pathway genes tend to be upregulated in cancer mediated by diverse mechanisms. By integrating pan-cancer multiomic data, we identify a group of tumor samples that exhibit worse prognosis. These samples are consistently associated with the upregulation of cell-cycle and DNA repair pathways, characterized by mutated TP53, MYC/TERT amplification, and APC/PTEN deletion. Our analysis highlights the importance of...
DNA damage repair (DDR) pathways modulate cancer risk, progression, and therapeutic response. We ... more DNA damage repair (DDR) pathways modulate cancer risk, progression, and therapeutic response. We systematically analyzed somatic alterations to provide a comprehensive view of DDR deficiency across 33 cancer types. Mutations with accompanying loss of heterozygosity were observed in over 1/3 of DDR genes, including TP53 and BRCA1/2. Other prevalent alterations included epigenetic silencing of the direct repair genes EXO5, MGMT, and ALKBH3 in ∼20% of samples. Homologous recombination deficiency (HRD) was present at varying frequency in many cancer types, most notably ovarian cancer. However, in contrast to ovarian cancer, HRD was associated with worse outcomes in several other cancers. Protein structure-based analyses allowed us to predict functional consequences of rare, recurrent DDR mutations. A new machine-learning-based classifier developed from gene expression data allowed us to identify alterations that phenocopy deleterious TP53 mutations. These frequent DDR gene alterations i...
Hotspot mutations in splicing factor genes have been recently reported at high frequency in hemat... more Hotspot mutations in splicing factor genes have been recently reported at high frequency in hematological malignancies, suggesting the importance of RNA splicing in cancer. We analyzed whole-exome sequencing data across 33 tumor types in The Cancer Genome Atlas (TCGA), and we identified 119 splicing factor genes with significant non-silent mutation patterns, including mutation over-representation, recurrent loss of function (tumor suppressor-like), or hotspot mutation profile (oncogene-like). Furthermore, RNA sequencing analysis revealed altered splicing events associated with selected splicing factor mutations. In addition, we were able to identify common gene pathway profiles associated with the presence of these mutations. Our analysis suggests that somatic alteration of genes involved in the RNA-splicing process is common in cancer and may represent an underappreciated hallmark of tumorigenesis.
Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known t... more Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1...
We analyzed 921 adenocarcinomas of the esophagus, stomach, colon, and rectum to examine shared an... more We analyzed 921 adenocarcinomas of the esophagus, stomach, colon, and rectum to examine shared and distinguishing molecular characteristics of gastrointestinal tract adenocarcinomas (GIACs). Hypermutated tumors were distinct regardless of cancer type and comprised those enriched for insertions/deletions, representing microsatellite instability cases with epigenetic silencing of MLH1 in the context of CpG island methylator phenotype, plus tumors with elevated single-nucleotide variants associated with mutations in POLE. Tumors with chromosomal instability were diverse, with gastroesophageal adenocarcinomas harboring fragmented genomes associated with genomic doubling and distinct mutational signatures. We identified a group of tumors in the colon and rectum lacking hypermutation and aneuploidy termed genome stable and enriched in DNA hypermethylation and mutations in KRAS, SOX9, and PCBP1.
For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedente...
We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes (wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant) characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tum...
Renal cell carcinoma (RCC) is not a single disease, but several histologically defined cancers with different genetic drivers, clinical courses, and therapeutic responses. The current study evaluated 843 RCC from the three major histologic subtypes, including 488 clear cell RCC, 274 papillary RCC, and 81 chromophobe RCC. Comprehensive genomic and phenotypic analysis of the RCC subtypes reveals distinctive features of each subtype that provide the foundation for the development of subtype-specific therapeutic and management strategies for patients affected with these cancers. Somatic alteration of BAP1, PBRM1, and PTEN and altered metabolic pathways correlated with subtype-specific decreased survival, while CDKN2A alteration, increased DNA hypermethylation, and increases in the immune-related Th2 gene expression signature correlated with decreased survival within all major histologic subtypes. CIMP-RCC demonstrated an increased immune signature, and a uniform and distinct metabolic e...
Cancer progression involves the gradual loss of a differentiated phenotype and acquisition of progenitor and stem-cell-like features. Here, we provide novel stemness indices for assessing the degree of oncogenic dedifferentiation. We used an innovative one-class logistic regression (OCLR) machine-learning algorithm to extract transcriptomic and epigenetic feature sets derived from non-transformed pluripotent stem cells and their differentiated progeny. Using OCLR, we were able to identify previously undiscovered biological mechanisms associated with the dedifferentiated oncogenic state. Analyses of the tumor microenvironment revealed unanticipated correlation of cancer stemness with immune checkpoint expression and infiltrating immune cells. We found that the dedifferentiated oncogenic phenotype was generally most prominent in metastatic tumors. Application of our stemness indices to single-cell data revealed patterns of intra-tumor molecular heterogeneity. Finally, the indices allo...
Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characteriza...
Papers by Michael McLellan