Preprint | Technical Note | Version 1 | Preserved in Portico | This version is not peer-reviewed

AssemblyQC: A NextFlow Pipeline for Reproducible Reporting of Assembly Quality

Version 1 : Received: 7 June 2024 / Approved: 7 June 2024 / Online: 10 June 2024 (06:39:41 CEST)

How to cite: Rashid, U.; Wu, C.; Shiller, J.; Smith, K.; Crowhurst, R.; Davy, M.; Chen, T.-H.; Carvajal, I.; Bailey, S.; Thomson, S.; Deng, C. AssemblyQC: A NextFlow Pipeline for Reproducible Reporting of Assembly Quality. Preprints 2024, 2024060518. https://doi.org/10.20944/preprints202406.0518.v1

Abstract

Summary: Genome assembly projects have grown exponentially due to breakthroughs in sequencing technologies and assembly algorithms. Evaluating the quality of genome assemblies is critical to ensure the reliability of downstream analysis and interpretation. To fulfil this task, we have developed the AssemblyQC pipeline, which performs file-format validation, contaminant checking, contiguity measurement, gene- and repeat-space completeness quantification, telomere inspection, taxonomic assignment, synteny alignment, scaffold examination through Hi-C contact-map visualisation, and assessments of completeness, consensus quality and phasing through k-mer analysis. It produces a comprehensive HTML report with method descriptions, tables, and visualisations.

Availability and Implementation: The pipeline uses Nextflow for workflow orchestration and adheres to the best practices established by the nf-core community. It offers a reproducible, scalable, and portable method to assess the quality of genome assemblies. The code is available on GitHub: https://github.com/Plant-Food-Research-Open/assemblyqc

Supplementary information: Pipeline usage documentation, parameter descriptions and example outputs are available on GitHub: https://github.com/Plant-Food-Research-Open/assemblyqc/tree/main/docs. A preview report is also hosted online: https://plant-food-research-open.github.io/assemblyqc
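To illustrate what a contiguity measurement can look like, here is a minimal sketch of N50, a statistic widely used for this purpose. The abstract does not say which statistics AssemblyQC reports or how it computes them, so the choice of N50, the function name `n50`, and the contig lengths below are assumptions for illustration only, not the pipeline's implementation.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length
    # until at least half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bases), not taken from the preprint.
contig_lengths = [100, 80, 50, 30, 20, 10]
print(n50(contig_lengths))  # prints 80
```

Here the total length is 290, and the two longest contigs (100 + 80 = 180) already cover at least half of it, so the N50 is 80.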

Keywords

Genome; Quality assessment; Nextflow

Subject

Biology and Life Sciences, Other
