DOI: 10.5555/2755753.2757131
research-article

FPGA accelerated DNA error correction

Published: 09 March 2015

Abstract

Correcting errors in DNA sequencing data is an important step that can improve the quality of downstream analysis. Although many error-correction methods have been proposed for Illumina reads, their throughput is not high enough to process data from large genomes. This paper describes the first FPGA-based error-correction tool, FPGA Accelerated DNA Error Correction (FADE), which aims to improve the throughput of DNA error correction for Illumina reads. FADE is based on BLESS, an algorithm that is highly accurate but slow. The Bloom filter, which is the main data structure of BLESS, and the BLESS error-correction subroutines for different types of errors have been implemented on an FPGA. We compared our design with the software version of BLESS using DNA sequencing data generated from four genomes and achieved a speedup of up to 43 times in the best case and 36 times on average.
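
To make the role of the Bloom filter concrete, the following is a minimal software sketch of a k-mer membership test, the kind of lookup a Bloom filter-based corrector relies on. It is not the BLESS or FADE implementation: the hash scheme (double hashing over SHA-256), the filter size, the value of k, and the names `BloomFilter`, `kmers`, and `suspicious_kmers` are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Bit-array Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing (an assumed scheme): derive every bit position
        # from two 64-bit values taken from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd stride
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read: str, k: int) -> list[str]:
    """Return all overlapping substrings of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def suspicious_kmers(read: str, k: int, trusted: BloomFilter) -> list[tuple[int, str]]:
    """Return (offset, k-mer) pairs whose k-mer is absent from the trusted set."""
    return [(i, km) for i, km in enumerate(kmers(read, k)) if km not in trusted]


if __name__ == "__main__":
    trusted = BloomFilter(num_bits=1 << 16, num_hashes=3)
    for km in kmers("ACGTACGTAC", 4):  # toy stand-in for trusted k-mers
        trusted.add(km)
    # The read below differs from the trusted sequence by one base, so the
    # k-mers overlapping that base are expected to be flagged.
    print(suspicious_kmers("ACGTTCGT", 4, trusted))
```

A k-mer that was inserted is always reported as present, while an absent k-mer is occasionally reported as present, so a corrector built on this structure has to tolerate a small false-positive rate. The lookup consists of a few independent hash computations and single-bit reads, which is one plausible reason the structure maps well onto hardware.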

Cited By

  • (2018) SPECTR. Proceedings of the 47th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3225058.3225060. Online publication date: 13-Aug-2018

Published In

DATE '15: Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition
March 2015
1827 pages
ISBN:9783981537048

Publisher

EDA Consortium

San Jose, CA, United States

Author Tags

  1. DNA
  2. FPGA
  3. bloom filter
  4. error correction

Qualifiers

  • Research-article

Conference

DATE '15
Sponsor:
  • EDAA
  • EDAC
  • SIGDA
  • Russian Academy of Sciences
DATE '15: Design, Automation and Test in Europe
March 9 - 13, 2015
Grenoble, France

Acceptance Rates

DATE '15 Paper Acceptance Rate 206 of 915 submissions, 23%;
Overall Acceptance Rate 518 of 1,794 submissions, 29%
