DOI: 10.1145/3377555.3377890
research-article

Vectorization-aware loop unrolling with seed forwarding

Published: 24 February 2020

Abstract

Loop unrolling is a widely adopted loop transformation, commonly used to enable subsequent optimizations. Straight-line-code vectorization (SLP) is one optimization that benefits from unrolling. SLP converts isomorphic instruction sequences into vector code. Since unrolling generates repeated isomorphic instruction sequences, it enables SLP to vectorize more code. However, most production compilers apply these two optimizations independently and without coordination. Unrolling is commonly tuned to avoid code bloat, not to maximize the potential for vectorization, which leads to missed vectorization opportunities.
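As a minimal illustration of why unrolling helps SLP (a sketch, not code from the paper), consider a scalar multiply-add loop and a hand-unrolled version of it. The function names `saxpy_rolled` and `saxpy_unrolled4` and the unroll factor of 4 are assumptions chosen for this example. The unrolled body contains four statements with the same operation shape over adjacent elements, which is the kind of isomorphic straight-line sequence an SLP vectorizer looks for.

```c
#include <stddef.h>

/* Rolled loop: one scalar multiply-add per iteration. */
void saxpy_rolled(float *restrict y, const float *restrict x, float a, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Unrolled by 4 (illustrative factor): the body now holds four isomorphic
   statements over adjacent elements, a candidate for packing into one
   4-wide vector operation. */
void saxpy_unrolled4(float *restrict y, const float *restrict x, float a, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i + 0] += a * x[i + 0];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    /* Remainder loop for trip counts that are not a multiple of 4. */
    for (; i < n; ++i)
        y[i] += a * x[i];
}
```

Both functions compute the same result; only the second exposes the repeated pattern in straight-line form.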
We propose VALU, a novel loop unrolling heuristic that takes vectorization into account when making unrolling decisions. The heuristic is powered by an analysis that estimates the potential benefit of SLP vectorization for the unrolled version of the loop, and it selects the unrolling factor that maximizes the utilization of the vector units. VALU also forwards the vectorizable code to SLP, allowing SLP to bypass its greedy search for vectorizable seed instructions and exposing more vectorization opportunities.
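The idea of choosing an unroll factor by vector-unit utilization can be sketched with a deliberately simplified toy model. This is not VALU's actual analysis, which the abstract does not spell out. The functions `lane_utilization` and `pick_unroll_factor`, their parameters, and the assumption that every unrolled copy contributes exactly one vector lane are all hypothetical.

```c
/* Hypothetical toy model: fraction of vector lanes filled when `unroll`
   copies of one scalar operation on `elem_bits`-wide data are packed into
   `vec_bits`-wide vector registers. */
static double lane_utilization(unsigned unroll, unsigned vec_bits, unsigned elem_bits) {
    if (elem_bits == 0 || unroll == 0) return 0.0;
    unsigned lanes = vec_bits / elem_bits;
    if (lanes == 0) return 0.0;
    unsigned vectors = (unroll + lanes - 1) / lanes;  /* ceil(unroll / lanes) */
    return (double)unroll / (double)(vectors * lanes);
}

/* Return the smallest factor in [1, max_unroll] with the highest modeled
   utilization. Preferring the smallest factor limits code growth. */
static unsigned pick_unroll_factor(unsigned max_unroll, unsigned vec_bits, unsigned elem_bits) {
    unsigned best = 1;
    double best_util = lane_utilization(1, vec_bits, elem_bits);
    for (unsigned u = 2; u <= max_unroll; ++u) {
        double util = lane_utilization(u, vec_bits, elem_bits);
        if (util > best_util) {  /* strict '>' keeps the smallest best factor */
            best_util = util;
            best = u;
        }
    }
    return best;
}
```

Under this model, 128-bit vectors with 32-bit elements and a cap of 8 give a factor of 4, the first factor that fills every lane. A real heuristic would also have to account for which instructions are actually vectorizable and for the cost of the resulting code.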
Our evaluation on a production compiler shows that VALU uncovers many vectorization opportunities that were missed by the default loop unroller and vectorizers. This results in more vectorized code and significant performance speedups for 17 of the kernels of the TSVC benchmark suite, reaching up to 2× speedup over the already highly optimized -O3. Our evaluation on full benchmarks from FreeBench and MiBench shows that VALU results in a geo-mean speedup of 1.06×.

    Published In

    CC 2020: Proceedings of the 29th International Conference on Compiler Construction
    February 2020
    222 pages
    ISBN: 9781450371209
    DOI: 10.1145/3377555
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Auto-Vectorization
    2. Loop Unrolling
    3. SIMD
    4. SLP

    Qualifiers

    • Research-article

    Funding Sources

    • EPSRC

    Conference

    CC '20
    Cited By

    • (2024) If-Convert as Early as You Must. Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, pp. 26-38. DOI: 10.1145/3640537.3641562. Online publication date: 17-Feb-2024
    • (2024) Boost Linear Algebra Computation Performance via Efficient VNNI Utilization. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pp. 149-163. DOI: 10.1145/3620666.3651333. Online publication date: 27-Apr-2024
    • (2024) Enhancing Performance through Control-Flow Unmerging and Loop Unrolling on GPUs. Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization, pp. 106-118. DOI: 10.1109/CGO57630.2024.10444819. Online publication date: 2-Mar-2024
    • (2023) Autovesk: Automatic Vectorized Code Generation from Unstructured Static Kernels Using Graph Transformations. ACM Transactions on Architecture and Code Optimization 21:1, pp. 1-25. DOI: 10.1145/3631709. Online publication date: 9-Nov-2023
    • (2022) Improving Vectorization Heuristics in a Dynamic Compiler with Machine Learning Models. Proceedings of the 14th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages, pp. 36-47. DOI: 10.1145/3563838.3567679. Online publication date: 29-Nov-2022
    • (2022) GCD²: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPs. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 512-529. DOI: 10.1109/MICRO56248.2022.00044. Online publication date: Oct-2022
    • (2022) Efficient Loop Unrolling Factor Prediction Algorithm using Machine Learning Models. 2022 3rd International Conference for Emerging Technology (INCET), pp. 1-8. DOI: 10.1109/INCET54531.2022.9825092. Online publication date: 27-May-2022
    • (2022) Loop Rolling for Code Size Reduction. 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 217-229. DOI: 10.1109/CGO53902.2022.9741256. Online publication date: 2-Apr-2022
    • (2021) Inlining for Code Size Reduction. Proceedings of the 25th Brazilian Symposium on Programming Languages, pp. 17-24. DOI: 10.1145/3475061.3475081. Online publication date: 27-Sep-2021
    • (2021) PostSLP: Cross-Region Vectorization of Fully or Partially Vectorized Code. Languages and Compilers for Parallel Computing, pp. 15-31. DOI: 10.1007/978-3-030-72789-5_2. Online publication date: 26-Mar-2021
