Research article | Open access

Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models

Published: 12 July 2024

Abstract

Pythonic idioms are highly valued and widely used in the Python programming community, yet many Python users find them challenging to apply. Neither a purely rule-based approach nor an LLM-only approach is sufficient to overcome three persistent challenges of code idiomatization: code misses, wrong detections, and wrong refactorings. Motivated by the determinism of rules and the adaptability of LLMs, we propose a hybrid approach consisting of three modules. We not only write prompts to instruct LLMs to complete tasks, but also invoke Analytic Rule Interfaces (ARIs), which are Python functions produced by prompting LLMs to generate code. We first construct a knowledge module with three elements, ASTscenario, ASTcomponent, and Condition, and prompt LLMs to generate Python code that is incorporated into an ARI library for subsequent use. Then, for any syntax-error-free Python code, we invoke ARIs from the library to extract ASTcomponents from the ASTscenario and filter out those that do not satisfy the Condition. Finally, we design prompts that instruct LLMs to abstract and idiomatize code, and invoke ARIs from the library to rewrite the non-idiomatic code into idiomatic code. We evaluate our approach against RIdiom and Prompt-LLM on the nine Pythonic idioms established in RIdiom. Our approach achieves superior accuracy, recall, and F1-score while maintaining precision comparable to RIdiom, with every metric for every idiom exceeding or approaching 90%. We further extend the evaluation to four new Pythonic idioms, where our approach consistently outperforms Prompt-LLM, exceeding 90% in accuracy, precision, recall, and F1-score.
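
To make the idiomatization task concrete, the sketch below illustrates the kind of AST-based detection and rewriting that an Analytic Rule Interface might perform for one common Pythonic idiom: replacing a for-loop that appends to a list with a list comprehension. The function names, the specific rule, and the mapping of the loop onto ASTscenario/ASTcomponent/Condition are our own illustration built on Python's standard ast module (Python 3.9+ for ast.unparse); this is not code from the paper's ARI library.

```python
import ast
import textwrap

def find_append_loops(source: str):
    """Detect for-loops whose body is a single `<list>.append(<expr>)` call.

    In the spirit of the paper's three elements (our own mapping, not the
    paper's implementation): the for-loop is the ASTscenario, the append
    call is the ASTcomponent, and "the loop body contains exactly one
    statement" is the Condition.
    """
    tree = ast.parse(source)
    matches = []
    for node in ast.walk(tree):
        if isinstance(node, ast.For) and len(node.body) == 1:
            stmt = node.body[0]
            if (isinstance(stmt, ast.Expr)
                    and isinstance(stmt.value, ast.Call)
                    and isinstance(stmt.value.func, ast.Attribute)
                    and stmt.value.func.attr == "append"):
                matches.append(node)
    return matches

def rewrite_as_comprehension(loop: ast.For) -> str:
    """Rewrite a detected loop into an equivalent list-comprehension assignment.

    A full refactoring would also remove the preceding empty-list
    initialization; this sketch only prints the rewritten statement.
    """
    call = loop.body[0].value                   # the `xs.append(expr)` call
    target_list = ast.unparse(call.func.value)  # e.g. "squares"
    element = ast.unparse(call.args[0])         # e.g. "n * n"
    loop_var = ast.unparse(loop.target)         # e.g. "n"
    iterable = ast.unparse(loop.iter)           # e.g. "numbers"
    return f"{target_list} = [{element} for {loop_var} in {iterable}]"

non_idiomatic = textwrap.dedent("""
    squares = []
    for n in numbers:
        squares.append(n * n)
""")

for loop in find_append_loops(non_idiomatic):
    print(rewrite_as_comprehension(loop))
    # -> squares = [n * n for n in numbers]
```

In the paper's terms, the deterministic AST traversal above plays the role that rules and ARIs play (avoiding wrong detections), while the LLM handles the more open-ended abstraction and code-generation steps; this sketch hard-codes both ends for brevity.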


    Published In

    Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE
    July 2024, 2770 pages
    EISSN: 2994-970X
    DOI: 10.1145/3554322
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 July 2024
    Published in PACMSE Volume 1, Issue FSE

    Author Tags

    1. Code Change
    2. Large Language Model
    3. Pythonic Idioms
