DOI: 10.1145/3643795.3648376

MoonBit: Explore the Design of an AI-Friendly Programming Language

Published: 10 September 2024

Abstract

MoonBit, a new general-purpose programming language designed for cloud and edge computing, was initiated in late 2022, coinciding with the announcement of ChatGPT. Language models such as GPT, now capable of producing practical programs, are revolutionizing the way we write programs and interact with computers. However, significant challenges persist: the models' inability to understand the global context of an entire project and its dependencies, the need for human verification and correction of generated code, and the lack of assurance that generated code meets basic requirements such as syntactic correctness.
In this paper, we explore the design of the MoonBit language, highlighting its AI integration and emphasizing the synergy between traditional code intelligence and large language model capabilities. We also introduce a real-time, semantics-based sampler that guides the inference process of language models. This approach ensures that generated programs are both syntactically correct and free from obvious semantic flaws, such as type errors. Crucially, it achieves this with minimal impact on overall performance: our evaluation demonstrates a notable improvement in code quality without sacrificing the models' responsiveness.
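The abstract does not reproduce the sampler's implementation, so the following Python sketch only illustrates the general shape of semantics-based sampling as described above. It assumes a Hugging Face-style causal language model, and `checker.accepts_prefix` stands in for a hypothetical incremental syntax/type checker; neither of these names comes from the paper.

```python
# A minimal sketch of semantics-guided sampling, NOT the authors' code.
# `checker` is a hypothetical object exposing accepts_prefix(text) -> bool,
# an incremental check that `text` can still be extended into a program
# that is syntactically valid and free of type errors.
import torch

def constrained_generate(model, tokenizer, checker, prompt,
                         max_new_tokens=128, top_k=32):
    ids = tokenizer.encode(prompt, return_tensors="pt")
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]        # next-token scores
        chosen = None
        # Try candidates best-first; usually the model's top choice already
        # passes, which keeps the checker off the hot path most of the time.
        for tok in torch.topk(logits, top_k).indices.tolist():
            candidate = tokenizer.decode(ids[0].tolist() + [tok])
            if checker.accepts_prefix(candidate):    # hypothetical API
                chosen = tok
                break
        if chosen is None:                           # no valid continuation
            break
        ids = torch.cat([ids, torch.tensor([[chosen]])], dim=1)
        if chosen == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

Because invalid tokens are rejected during decoding rather than after generation, every emitted prefix stays well-formed, which is consistent with the abstract's claim of improved code quality at minimal cost to responsiveness.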



    Information

    Published In

    LLM4Code '24: Proceedings of the 1st International Workshop on Large Language Models for Code
    April 2024
    144 pages
    ISBN: 9798400705793
    DOI: 10.1145/3643795


    In-Cooperation

    • Faculty of Engineering of University of Porto

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. large language model
    2. program synthesis
    3. static analysis

    Qualifiers

    • Short-paper

    Conference

    LLM4Code '24

