FusionFlow: An Integrated System Workflow for Gene Fusion Detection in Genomic Samples

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2022)

Abstract

Gene fusion is a genomic alteration in which two genes are juxtaposed after a break event to form a new hybrid gene, which can lead to cancer development and progression. However, identifying gene fusions is not trivial, as it requires managing and processing very large amounts of data: genomic data (particularly DNA and RNA) can reach up to 300 GB per sample. Furthermore, specific software and hardware architectures are required to process this type of data correctly. Although many tools are available for detecting gene fusions, systematic workflows that are free and easily usable even by non-specialists are, to date, hardly available.

This paper presents an integrated system for identifying gene fusions in RNA and DNA genomic samples, focusing on hardware and software architectural aspects. The proposed workflow is easy to use, scalable, and highly reproducible. It includes five gene fusion detection tools: three mainly intended for RNA samples (EricScript, Arriba, FusionCatcher) and two for DNA samples (INTEGRATE and GeneFuse). The workflow runs on servers using Nextflow (a DSL for data-driven computational pipelines), Docker containers, and Conda virtual environments.
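The grouping of tools by sample type described above can be pictured with a small Python sketch. It is purely illustrative and is not taken from the FusionFlow source: the names `TOOLS_BY_SAMPLE_TYPE` and `tools_for` are hypothetical, and the mapping only restates the abstract's split (three tools mainly intended for RNA, two for DNA) without modelling how Nextflow actually schedules them.

```python
# Illustrative sketch only; not the FusionFlow implementation.
# The mapping restates the tool grouping given in the abstract.
TOOLS_BY_SAMPLE_TYPE = {
    "RNA": ("EricScript", "Arriba", "FusionCatcher"),
    "DNA": ("INTEGRATE", "GeneFuse"),
}


def tools_for(sample_type: str) -> tuple[str, ...]:
    """Return the detection tools associated with a sample type (hypothetical helper)."""
    key = sample_type.strip().upper()
    if key not in TOOLS_BY_SAMPLE_TYPE:
        raise ValueError(f"unsupported sample type: {sample_type!r}")
    return TOOLS_BY_SAMPLE_TYPE[key]


if __name__ == "__main__":
    for sample_type in ("RNA", "DNA"):
        print(sample_type, "->", ", ".join(tools_for(sample_type)))
```

In a real workflow this selection would more likely be expressed in the pipeline definition itself; the sketch only makes the RNA/DNA split explicit.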

Funding

This study was funded by the European Union’s Horizon 2020 research and innovation programme DECIDER under Grant Agreement 965193.

Author information

Correspondence to Marta Lovino.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Citarrella, F., Bontempo, G., Lovino, M., Ficarra, E. (2022). FusionFlow: An Integrated System Workflow for Gene Fusion Detection in Genomic Samples. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_8

  • DOI: https://doi.org/10.1007/978-3-031-15743-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15742-4

  • Online ISBN: 978-3-031-15743-1

  • eBook Packages: Computer Science, Computer Science (R0)
