Research article | Open access

Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models

Published: 12 July 2024

Abstract

Pythonic idioms are highly valued and widely used in the Python programming community, yet many Python users find them challenging to apply. Neither a purely rule-based approach nor an LLM-only approach is sufficient to overcome three persistent challenges of code idiomatization: code misses, wrong detections, and wrong refactorings. Motivated by the determinism of rules and the adaptability of LLMs, we propose a hybrid approach consisting of three modules. We not only write prompts to instruct LLMs to complete tasks, but also invoke Analytic Rule Interfaces (ARIs), which are Python functions produced by prompting LLMs to generate code. We first construct a knowledge module with three elements, ASTscenario, ASTcomponent, and Condition, and prompt LLMs to generate Python code that is incorporated into an ARI library for subsequent use. Then, for any syntax-error-free Python code, we invoke ARIs from the library to extract ASTcomponents from the ASTscenario and filter out those that do not satisfy the Condition. Finally, we design prompts that instruct LLMs to abstract and idiomatize code, and invoke ARIs from the library to rewrite the non-idiomatic code into idiomatic code. We evaluate our approach against RIdiom and Prompt-LLM on the nine Pythonic idioms established in RIdiom. Our approach achieves superior accuracy, recall, and F1-score while maintaining precision comparable to RIdiom, with every metric for every idiom exceeding or approaching 90%. We further extend the evaluation to four new Pythonic idioms, where our approach consistently outperforms Prompt-LLM, exceeding 90% in accuracy, precision, recall, and F1-score.
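
To make the idiomatization task concrete, the sketch below illustrates the kind of AST-based detection and rewriting that an Analytic Rule Interface might perform for one common Pythonic idiom: replacing a for-loop that appends to a list with a list comprehension. The function names, the specific rule, and the mapping of the loop onto ASTscenario/ASTcomponent/Condition are our own illustration built on Python's standard ast module (Python 3.9+ for ast.unparse); this is not code from the paper's ARI library.

```python
import ast
import textwrap

def find_append_loops(source: str):
    """Detect for-loops whose body is a single `<list>.append(<expr>)` call.

    In the spirit of the paper's three elements (our own mapping, not the
    paper's implementation): the for-loop is the ASTscenario, the append
    call is the ASTcomponent, and "the loop body contains exactly one
    statement" is the Condition.
    """
    tree = ast.parse(source)
    matches = []
    for node in ast.walk(tree):
        if isinstance(node, ast.For) and len(node.body) == 1:
            stmt = node.body[0]
            if (isinstance(stmt, ast.Expr)
                    and isinstance(stmt.value, ast.Call)
                    and isinstance(stmt.value.func, ast.Attribute)
                    and stmt.value.func.attr == "append"):
                matches.append(node)
    return matches

def rewrite_as_comprehension(loop: ast.For) -> str:
    """Rewrite a detected loop into an equivalent list-comprehension assignment.

    A full refactoring would also remove the preceding empty-list
    initialization; this sketch only prints the rewritten statement.
    """
    call = loop.body[0].value                   # the `xs.append(expr)` call
    target_list = ast.unparse(call.func.value)  # e.g. "squares"
    element = ast.unparse(call.args[0])         # e.g. "n * n"
    loop_var = ast.unparse(loop.target)         # e.g. "n"
    iterable = ast.unparse(loop.iter)           # e.g. "numbers"
    return f"{target_list} = [{element} for {loop_var} in {iterable}]"

non_idiomatic = textwrap.dedent("""
    squares = []
    for n in numbers:
        squares.append(n * n)
""")

for loop in find_append_loops(non_idiomatic):
    print(rewrite_as_comprehension(loop))
    # -> squares = [n * n for n in numbers]
```

In the paper's terms, the deterministic AST traversal above plays the role that rules and ARIs play (avoiding wrong detections), while the LLM handles the more open-ended abstraction and code-generation steps; this sketch hard-codes both ends for brevity.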


    Published In

    Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE
    July 2024, 2770 pages
    EISSN: 2994-970X
    DOI: 10.1145/3554322
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 July 2024
    Published in PACMSE Volume 1, Issue FSE

    Author Tags

    1. Code Change
    2. Large Language Model
    3. Pythonic Idioms
