HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond

Authors:

Cong Hao

MLCAD '24: Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD

Article No.: 23, Pages 1 - 9

https://doi.org/10.1145/3670474.3685961

Published: 09 September 2024

Abstract

Machine learning (ML) techniques have been applied to high-level synthesis (HLS) flows for quality-of-result (QoR) prediction and design space exploration (DSE). Nevertheless, the scarcity of accessible high-quality HLS datasets and the complexity of building such datasets present great challenges to FPGA and ML researchers. Existing datasets either cover only a subset of previously published benchmarks, provide no way to enumerate optimization design spaces, are limited to a specific vendor, or have no reproducible and extensible software for dataset construction. Many works also lack user-friendly ways to add more designs to existing datasets, limiting wider adoption and sustainability of such datasets.

In response to these challenges, we introduce HLSFactory, a comprehensive framework designed to facilitate the curation and generation of high-quality HLS design datasets. HLSFactory has three main stages: 1) a design space expansion stage to elaborate single HLS designs into large design spaces using various optimization directives across multiple vendor tools, 2) a design synthesis stage to execute HLS and FPGA tool flows concurrently across designs, and 3) a data aggregation stage for extracting standardized data into packaged datasets for ML usage. This tripartite architecture not only ensures broad coverage of data points via design space expansion but also supports interoperability with tools from multiple vendors. Users can contribute to each stage easily by submitting their own HLS designs or synthesis results via provided user APIs. The framework is also flexible, allowing extensions at every step via user APIs with custom frontends, synthesis tools, and scripts.
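
The three stages described above can be illustrated with a minimal Python sketch. This is not HLSFactory's actual API: the function names, the directive dictionary, and the toy latency and DSP model are all hypothetical, chosen only to show how a single base design could be expanded into a design space, processed concurrently, and flattened into uniform rows. A real flow would replace `run_synthesis` with calls to a vendor HLS tool.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical directive options for one base design (illustrative names only).
DIRECTIVE_OPTIONS = {
    "unroll_factor": [1, 2, 4],
    "pipeline": [False, True],
    "array_partition_factor": [1, 2],
}


def expand_design_space(base_name, options):
    """Stage 1: enumerate every combination of directive values."""
    keys = sorted(options)
    return [
        {"design": base_name, **dict(zip(keys, values))}
        for values in product(*(options[k] for k in keys))
    ]


def run_synthesis(point):
    """Stage 2 placeholder: a real flow would invoke an HLS tool here.

    The numbers below come from a made-up toy model, not from any tool.
    """
    parallelism = point["unroll_factor"] * point["array_partition_factor"]
    latency = 1024 // parallelism // (2 if point["pipeline"] else 1)
    return {**point, "latency_cycles": latency, "dsp": 4 * parallelism}


def aggregate(results):
    """Stage 3: flatten per-design results into uniform rows for ML use."""
    columns = sorted(results[0])
    return columns, [[r[c] for c in columns] for r in results]


def build_dataset(base_name, options, max_workers=4):
    """Run the three stages end to end for one base design."""
    points = expand_design_space(base_name, options)
    # Threads suffice in this sketch; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_synthesis, points))
    return aggregate(results)


columns, rows = build_dataset("gemm", DIRECTIVE_OPTIONS)
print(len(rows), "design points; columns:", columns)
```

Even with three small directive lists, the single base design yields 3 × 2 × 2 = 12 design points, which is the kind of growth the design space expansion stage relies on.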

To demonstrate the framework functionality, we include an initial set of built-in base designs from PolyBench, MachSuite, Rosetta, CHStone, Kastner et al.'s Parallel Programming for FPGAs, and curated kernels from existing open-source HLS designs. We report the statistical analyses and design space visualizations to demonstrate the completed end-to-end compilation flow, and to highlight the effectiveness of our design space expansion beyond the initial base dataset, which greatly contributes to dataset diversity and coverage.

In addition to its evident application in ML, we showcase the versatility and multi-functionality of our framework through seven case studies:

I) Building an ML model for post-implementation QoR prediction

II) Using design space sampling in stage 1 to expand the design space covered from a small base set of HLS designs

III) Demonstrating the speedup from the fine-grained design parallelism backend

IV) Extending HLSFactory to target Intel's HLS flow across all stages

V) Adding and running new auxiliary designs using HLSFactory

VI) Integration of previously published HLS data in stage 3

VII) Using HLSFactory to perform HLS tool version regression benchmarking

Code available at https://github.com/sharc-lab/HLSFactory.
