DOI: 10.1145/3643795.3648376

MoonBit: Explore the Design of an AI-Friendly Programming Language

Published: 10 September 2024

Abstract

MoonBit, a new general-purpose programming language designed for cloud and edge computing, was initiated in late 2022, coinciding with the announcement of ChatGPT. Language models such as GPT, now capable of producing practical programs, are revolutionizing the way we write programs and interact with computers. However, significant challenges persist: the models' inability to understand the global context of an entire project and its dependencies, the need for human verification and correction of generated code, and the lack of assurance that generated code meets basic requirements such as syntactic correctness.
In this paper, we explore the design of the MoonBit language, highlighting its AI integration and emphasizing the synergy between traditional code intelligence and large language model capabilities. We also introduce a real-time, semantics-based sampler that guides the inference process of language models. This approach ensures that generated programs are both syntactically correct and free from obvious semantic flaws, such as type errors. Crucially, it achieves this with minimal impact on overall performance: our evaluation demonstrates a notable improvement in code quality without sacrificing the models' responsiveness.
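The abstract does not reproduce the sampler's implementation, so the following Python sketch only illustrates the general shape of semantics-based sampling as described above. It assumes a Hugging Face-style causal language model, and `checker.accepts_prefix` stands in for a hypothetical incremental syntax/type checker; neither of these names comes from the paper.

```python
# A minimal sketch of semantics-guided sampling, NOT the authors' code.
# `checker` is a hypothetical object exposing accepts_prefix(text) -> bool,
# an incremental check that `text` can still be extended into a program
# that is syntactically valid and free of type errors.
import torch

def constrained_generate(model, tokenizer, checker, prompt,
                         max_new_tokens=128, top_k=32):
    ids = tokenizer.encode(prompt, return_tensors="pt")
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]        # next-token scores
        chosen = None
        # Try candidates best-first; usually the model's top choice already
        # passes, which keeps the checker off the hot path most of the time.
        for tok in torch.topk(logits, top_k).indices.tolist():
            candidate = tokenizer.decode(ids[0].tolist() + [tok])
            if checker.accepts_prefix(candidate):    # hypothetical API
                chosen = tok
                break
        if chosen is None:                           # no valid continuation
            break
        ids = torch.cat([ids, torch.tensor([[chosen]])], dim=1)
        if chosen == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

Because invalid tokens are rejected during decoding rather than after generation, every emitted prefix stays well-formed, which is consistent with the abstract's claim of improved code quality at minimal cost to responsiveness.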



    Information

    Published In

    LLM4Code '24: Proceedings of the 1st International Workshop on Large Language Models for Code
    April 2024
    144 pages
    ISBN: 9798400705793
    DOI: 10.1145/3643795


    In-Cooperation

    • Faculty of Engineering of University of Porto

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. large language model
    2. program synthesis
    3. static analysis

    Qualifiers

    • Short-paper

    Conference

    LLM4Code '24

