
PyHDL-Eval: An LLM Evaluation Framework for Hardware Design Using Python-Embedded DSLs

Authors: Christopher Batten, Nathaniel Pinckney, Brucek Khailany

MLCAD '24: Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD

Article No.: 10, Pages 1 - 17

https://doi.org/10.1145/3670474.3685948

Published: 09 September 2024

Abstract

Embedding hardware design frameworks within Python is a promising technique for improving the productivity of hardware engineers. At the same time, there is significant interest in using large language models (LLMs) to improve key chip design tasks. This paper describes PyHDL-Eval, a new framework for evaluating LLMs on specification-to-RTL tasks in the context of Python-embedded domain-specific languages (DSLs). The framework includes 168 problems, Verilog reference solutions, Verilog test benches, Python test scripts, and workflow orchestration scripts. We use the framework to conduct a detailed case study comparing five LLMs (CodeGemma 7B, Llama3 8B/70B, GPT4, and GPT4 Turbo) targeting Verilog and five Python-embedded DSLs (PyMTL3, PyRTL, MyHDL, Migen, and Amaranth). Our results demonstrate the promise of in-context learning when applied to smaller models (e.g., the pass rate for CodeGemma 7B improves from 14.9% to 32.7% on Verilog) and to Python-embedded DSLs (e.g., the pass rate for Llama3 70B improves from 0.6% to 33.0% on PyMTL3). We find that LLMs perform better when targeting Verilog than when targeting Python-embedded DSLs (e.g., the pass rate for GPT4 Turbo is 72.2% on Verilog and 29.8-62.0% on the Python-embedded DSLs), even though the DSLs are embedded in a popular general-purpose host language. PyHDL-Eval will serve as a useful framework for future research at the intersection of Python-embedded DSLs and LLMs.
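
The abstract reports pass rates but does not say how they are computed. As a minimal sketch, assuming each problem is attempted with several generated samples and that the reported figure is the per-problem pass fraction averaged over all problems, the calculation could look like the following. The function name, the data layout, and the example outcomes are all hypothetical and are not taken from the PyHDL-Eval scripts.

```python
from statistics import mean


def pass_rate(results):
    """Return the percentage of passing samples, averaged over problems.

    `results` maps a problem name to a non-empty list of booleans, one per
    generated sample (True if that sample passed its test bench).
    This layout is an assumption made for illustration only.
    """
    if not results:
        raise ValueError("no problems to score")
    # Fraction of passing samples for each problem, then the mean across problems.
    per_problem = [sum(samples) / len(samples) for samples in results.values()]
    return 100.0 * mean(per_problem)


# Hypothetical outcomes for three problems with four samples each.
example = {
    "prob_a": [True, True, False, True],
    "prob_b": [False, False, False, False],
    "prob_c": [True, True, True, True],
}
print(f"{pass_rate(example):.1f}%")
```

Averaging per problem before averaging across problems keeps every problem equally weighted even if the number of samples differs between problems; other conventions, such as pass@k, would give different numbers.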

Cited By

Abdollahi M, Yeganli S, Baharloo M, Baniasadi A (2024). Hardware Design and Verification with Large Language Models: A Scoping Review, Challenges, and Open Issues. Electronics 14:1 (120). Online publication date: 30-Dec-2024. https://doi.org/10.3390/electronics14010120

Index Terms

PyHDL-Eval: An LLM Evaluation Framework for Hardware Design Using Python-Embedded DSLs
1. Computing methodologies
  1. Machine learning
2. Hardware
  1. Electronic design automation
    1. Hardware description languages and compilation


Published In

MLCAD '24: Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD

September 2024

321 pages

ISBN:9798400706998

DOI:10.1145/3670474

General Chairs: Hussam Amrouch (Technical University of Munich), Jiang Hu (Texas A&M University)

Program Chairs: Siddharth Garg (New York University), Yibo Lin (Peking University)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGDA: ACM Special Interest Group on Design Automation
IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

MLCAD '24

Sponsor:

SIGDA

MLCAD '24: 2024 ACM/IEEE International Symposium on Machine Learning for CAD

September 9 - 11, 2024

Salt Lake City, UT, USA

Acceptance Rates

MLCAD '24 Paper Acceptance Rate 35 of 83 submissions, 42%;

Overall Acceptance Rate 35 of 83 submissions, 42%

