research-article

Evolving to Find Optimizations Humans Miss: Using Evolutionary Computation to Improve GPU Code for Bioinformatics Applications

Published: 29 November 2024

Abstract

GPUs are used in many settings to accelerate large-scale scientific computation, including simulation, computational biology, and molecular dynamics. However, optimizing code to run efficiently on GPUs requires developers to have both a detailed understanding of the application logic and significant knowledge of parallel programming and GPU architectures. This paper shows that GEVO, an automated GPU program optimization tool, can leverage evolutionary computation to find code edits that reduce the runtime of three important applications (multiple sequence alignment, agent-based simulation, and molecular dynamics) by 28.9%, 29%, and 17.8%, respectively. The paper presents an in-depth analysis of the discovered optimizations, revealing that (1) several of the most important optimizations involve significant epistasis, (2) the primary sources of improvement are application-specific, and (3) many of the optimizations generalize across GPU architectures. In general, the discovered optimizations are not straightforward even for a human GPU expert, showcasing the potential of automated program optimization tools both to reduce the optimization burden for human domain experts and to provide new insights for GPU experts.
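
The abstract's first finding turns on epistasis: the effect of a set of code edits applied together differs from the sum of their individual effects. The sketch below illustrates that definition only. The edit names `A`, `B`, `C`, the runtime table, and the helper functions are hypothetical and invented for illustration; they are not measurements from the paper and not GEVO's implementation.

```python
from itertools import combinations

# Hypothetical kernel runtimes (ms) for each subset of three candidate edits.
# These numbers are invented for illustration, not taken from the paper.
RUNTIME_MS = {
    frozenset(): 100.0,                 # unmodified kernel
    frozenset({"A"}): 100.0,            # edit A alone: no gain
    frozenset({"B"}): 101.0,            # edit B alone: slightly slower
    frozenset({"C"}): 95.0,             # edit C alone: small gain
    frozenset({"A", "B"}): 80.0,        # A and B together: large gain
    frozenset({"A", "C"}): 95.0,
    frozenset({"B", "C"}): 96.0,
    frozenset({"A", "B", "C"}): 75.0,
}


def gain(edits):
    """Runtime reduction (ms) of an edit set relative to the unmodified kernel."""
    return RUNTIME_MS[frozenset()] - RUNTIME_MS[frozenset(edits)]


def epistasis(a, b):
    """Joint gain of two edits minus the sum of their individual gains."""
    return gain({a, b}) - (gain({a}) + gain({b}))


def pairwise_epistasis(edits):
    """Epistasis for every unordered pair of edits, keyed by sorted pair."""
    return {(a, b): epistasis(a, b) for a, b in combinations(sorted(edits), 2)}


if __name__ == "__main__":
    for pair, value in pairwise_epistasis({"A", "B", "C"}).items():
        print(pair, value)
```

In this toy table, edits `A` and `B` give no benefit on their own but a large one together, so a search that accepts only individually beneficial edits would never find the pair. A population-based search that retains neutral or slightly harmful edits can.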

Information

Published In

ACM Transactions on Evolutionary Learning and Optimization, Volume 4, Issue 4
December 2024
231 pages
EISSN: 2688-3007
DOI: 10.1145/3613700

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 29 November 2024
Online AM: 15 November 2024
Accepted: 30 October 2024
Revised: 20 September 2024
Received: 18 June 2023
Published in TELO Volume 4, Issue 4

Author Tags

1. Genetic improvement
2. Evolutionary programming
3. Bioinformatics
4. Genetic programming

Qualifiers

• Research-article

Funding Sources

• ONR