Abstract
Pattern discovery is one of the fundamental tasks in bioinformatics, and pattern recognition is a powerful technique for searching for sequence patterns in biological sequence databases. The significant increase in the number of DNA and protein sequences increases the need to improve the performance of pattern matching algorithms. For this purpose, heterogeneous architectures can be a good choice due to their potential for high performance and energy efficiency. In this paper we present an efficient implementation of the Aho-Corasick (AC) and PFAC (Parallel Failureless Aho-Corasick) algorithms on a heterogeneous CPU/GPU architecture. We progressively redesigned the algorithms and data structures to fit the GPU architecture. Our results on different protein sequence data sets show a 15% speedup compared to the original implementation of the PFAC algorithm.
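To make the failureless idea concrete, the following is a minimal sequential sketch of PFAC-style matching in Python. It is not the implementation evaluated in this paper: it runs on the CPU, the outer loop only stands in for the one-thread-per-start-position scheme a GPU would use, and the names `build_trie` and `pfac_scan` as well as the sample protein fragment are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Tuple


def build_trie(patterns: List[str]) -> Tuple[List[Dict[str, int]], Dict[int, int]]:
    """Build a plain trie (goto function only, no failure links)."""
    goto: List[Dict[str, int]] = [{}]   # goto[state][char] -> next state
    final: Dict[int, int] = {}          # final[state] -> index of the matched pattern
    for index, pattern in enumerate(patterns):
        state = 0
        for char in pattern:
            if char not in goto[state]:
                goto.append({})
                goto[state][char] = len(goto) - 1
            state = goto[state][char]
        final[state] = index
    return goto, final


def pfac_scan(text: str, patterns: List[str]) -> List[Tuple[int, str]]:
    """Return (start position, pattern) for every occurrence of a pattern in text."""
    goto, final = build_trie(patterns)
    matches: List[Tuple[int, str]] = []
    # Each iteration plays the role of one GPU thread: it owns a single start
    # position and terminates at the first character with no valid transition.
    for start in range(len(text)):
        state = 0
        for pos in range(start, len(text)):
            next_state = goto[state].get(text[pos])
            if next_state is None:
                break                   # no failure link to follow: this "thread" ends
            state = next_state
            if state in final:
                matches.append((start, patterns[final[state]]))
    return matches
```

For example, `pfac_scan("MKVLAAG", ["KVL", "LAA", "AA", "VLAAG"])` returns `[(1, "KVL"), (2, "VLAAG"), (3, "LAA"), (4, "AA")]`. Because every start position is scanned independently, overlapping matches are found without failure transitions, and most scans stop after only a few characters.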
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Soroushnia, S., Daneshtalab, M., Plosila, J., Liljeberg, P. (2014). Heterogeneous Parallelization of Aho-Corasick Algorithm. In: Saez-Rodriguez, J., Rocha, M., Fdez-Riverola, F., De Paz Santana, J. (eds) 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014). Advances in Intelligent Systems and Computing, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-319-07581-5_19
DOI: https://doi.org/10.1007/978-3-319-07581-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07580-8
Online ISBN: 978-3-319-07581-5
eBook Packages: Engineering, Engineering (R0)