research-article
Open access

Unboxed Data Constructors: Or, How cpp Decides a Halting Problem

Published: 05 January 2024

Abstract

We propose a new language feature for ML-family languages: the ability to selectively unbox certain data constructors, so that their runtime representation gets compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation.
We discuss the use case of big numbers, where unboxing makes it possible to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, and discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes.
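To make the notion of confusion concrete, here is a minimal sketch of a disjointness check, not the analysis implemented in the prototype. It assumes OCaml's usual runtime representation, where a value is either an immediate integer or a heap block carrying a tag. The names `overlap`, `conflicts`, `can_unbox` and the shape constants are hypothetical and chosen for illustration only.

```python
# Hypothetical model: the "shape" of a type is a pair
# (possible immediates, possible block tags). ANY means "all of them".
ANY = None

def overlap(a, b):
    """True if two sets (or ANY) share at least one element."""
    if a is ANY:
        return b is ANY or len(b) > 0
    if b is ANY:
        return len(a) > 0
    return bool(a & b)

def conflicts(shape1, shape2):
    """Two shapes conflict if some runtime value could belong to both."""
    return overlap(shape1[0], shape2[0]) or overlap(shape1[1], shape2[1])

# Assumed shapes, following the standard OCaml runtime tags.
INT = (ANY, frozenset())                      # any immediate, never a block
STRING = (frozenset(), frozenset({252}))      # block with String_tag
CUSTOM = (frozenset(), frozenset({255}))      # custom block, e.g. a bignum
CONSTANT_0 = (frozenset({0}), frozenset())    # first constant constructor

def boxed_constructor(tag):
    """Shape of an ordinary (boxed) constructor with arguments."""
    return (frozenset(), frozenset({tag}))

def can_unbox(arg_shape, other_shapes):
    """Accept unboxing only if the argument's shape is disjoint from all others."""
    return not any(conflicts(arg_shape, s) for s in other_shapes)

# Big numbers: an unboxed small int next to an unboxed custom block is fine.
print(can_unbox(INT, [CUSTOM]))        # True
# Two unboxed ints would be indistinguishable at runtime.
print(can_unbox(INT, [INT]))           # False
# An unboxed int also collides with a constant constructor (immediate 0).
print(can_unbox(INT, [CONSTANT_0]))    # False
```

In this model the big-number type is accepted because small integers are immediates while arbitrary-precision numbers live in custom blocks, so no value belongs to both shapes.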
Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in the presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.
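The idea of detecting non-termination during expansion can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm proved correct in the paper: type expressions are first-order, only the head of the expression is expanded, and every node remembers the set of definitions whose expansion produced it, much as cpp remembers which macros are currently being expanded. The encoding of types as tuples and the names `head_normalize`, `Cycle` and `DEFS` are hypothetical.

```python
class Cycle(Exception):
    """Raised when expanding a definition would loop forever."""

def annotate(ty, trace):
    """Attach an expansion trace to every constructor node of a source type."""
    if ty[0] == "var":
        return ty
    _, name, args = ty
    return ("app", name, [annotate(a, trace) for a in args], trace)

def subst(body, env, trace):
    """Instantiate a definition body.

    Nodes that come from the body receive `trace`;
    nodes that come from the arguments keep their own trace.
    """
    if body[0] == "var":
        return env[body[1]]
    _, name, args = body
    return ("app", name, [subst(a, env, trace) for a in args], trace)

def head_normalize(ty, defs):
    """Expand the head of a closed type until it is not a defined name."""
    node = annotate(ty, frozenset())
    while node[1] in defs:
        _, name, args, trace = node
        if name in trace:
            # This node was produced by expanding `name` itself.
            raise Cycle(name)
        params, body = defs[name]
        node = subst(body, dict(zip(params, args)), trace | {name})
    return node[1]

# Example definitions (assumed): 'a id = 'a, and a looping abbreviation.
DEFS = {
    "id": (["a"], ("var", "a")),
    "loop": ([], ("app", "loop", [])),
}

INT_TY = ("app", "int", [])
# id (id int): the inner `id` comes from the argument, so it may be expanded.
print(head_normalize(("app", "id", [("app", "id", [INT_TY])]), DEFS))  # int
```

Expanding `id (id int)` succeeds even though `id` is unfolded twice, because the second occurrence came from the argument rather than from the body of `id`. Expanding `loop` raises `Cycle` on the second step instead of running forever.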

Cited By

  • (2024) Unboxing Virgil ADTs for Fun and Profit. Proceedings of the Workshop Dedicated to Jens Palsberg on the Occasion of His 60th Birthday, pp. 43-52. https://doi.org/10.1145/3694848.3694857. Online publication date: 22-Oct-2024
  • (2024) Double-Ended Bit-Stealing for Algebraic Data Types. Proceedings of the ACM on Programming Languages 8:ICFP, pp. 88-120. https://doi.org/10.1145/3674628. Online publication date: 15-Aug-2024

Published In

Proceedings of the ACM on Programming Languages, Volume 8, Issue POPL
January 2024
2820 pages
EISSN:2475-1421
DOI:10.1145/3554315
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 January 2024
Published in PACMPL Volume 8, Issue POPL

Author Tags

  1. boxing
  2. data representation
  3. recursive definitions
  4. sum types
  5. tagging
  6. termination

Qualifiers

  • Research-article
