Effective Identification and Annotation of Fungal Genomes

Liu, Jian; Sun, Jia-Liang; Liu, Yong-Zhuang

doi:10.1007/s11390-021-0856-4

Regular Paper
Published: 31 March 2021

Volume 36, pages 248–260, (2021)

Journal of Computer Science and Technology

Jian Liu¹,
Jia-Liang Sun¹ &
Yong-Zhuang Liu²

Abstract

In the past few decades, the dangers of mycosis have caused widespread concern. With the development of sequencing technology, the effective analysis of fungal sequencing data has become a research hotspot. As the volume of fungal sequencing data grows, there is still a lack of adequate approaches for the identification and functional annotation of fungal chromosomal genomes. To address this challenge, this paper first examines approaches to the identification and annotation of fungal genomes based on short and long reads sequenced on multiple platforms such as Illumina and PacBio. It then develops an automated bioinformatics pipeline called PFGI for the identification and annotation task. The experimental evaluation on real-world data from the ENA (European Nucleotide Archive) shows that PFGI provides a user-friendly way to perform fungal identification and annotation from sequencing data, and that its results are accurate to the species level (97% sequence identity).
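The species-level criterion quoted in the abstract (97% sequence identity) can be illustrated with a minimal sketch. This is not PFGI's implementation, which the abstract does not describe. The function names, the choice to skip gapped columns, and the pre-aligned input strings are all assumptions made for illustration.

```python
# Illustrative only: percent identity between two pre-aligned sequences,
# compared against the 97% species-level threshold quoted in the abstract.
# Names and the gap-handling rule are hypothetical, not taken from PFGI.

SPECIES_IDENTITY_THRESHOLD = 97.0  # percent, as stated in the abstract


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_species_level_match(aligned_a: str, aligned_b: str,
                           threshold: float = SPECIES_IDENTITY_THRESHOLD) -> bool:
    """True when the identity meets or exceeds the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice, identity values are usually read from an aligner's report rather than recomputed by hand, and tools differ in how they count gaps, so the denominator used here is only one possible convention.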

Author information

Authors and Affiliations

College of Computer Science, Nankai University, Tianjin, 300350, China
Jian Liu & Jia-Liang Sun
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
Yong-Zhuang Liu

Corresponding author

Correspondence to Jian Liu.

Supplementary Information

ESM 1

(PDF 378 kb)

About this article

Cite this article

Liu, J., Sun, JL. & Liu, YZ. Effective Identification and Annotation of Fungal Genomes. J. Comput. Sci. Technol. 36, 248–260 (2021). https://doi.org/10.1007/s11390-021-0856-4

Received: 01 August 2020
Accepted: 23 February 2021
Published: 31 March 2021
Issue Date: April 2021
DOI: https://doi.org/10.1007/s11390-021-0856-4
