Hard to Read and Understand Pythonic Idioms? DeIdiom and Explain Them in Non-Idiomatic Equivalent Code

Authors:

Zhenchang Xing,

Liming Zhu

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

Article No.: 227, Pages 1 - 12

https://doi.org/10.1145/3597503.3639101

Published: 12 April 2024

Abstract

The Python community designs pythonic idioms so that Python users can express their intent more concisely and efficiently. Our analysis of 154 Stack Overflow questions about understanding pythonic idioms shows that Python users face a variety of challenges in comprehending them, while the usage of pythonic idioms in 7,577 GitHub projects shows how prevalent they are. Using a statistical sampling method, we find that pythonic idioms bring not only lexical conciseness but also the creation of variables and functions, which means that mapping idiomatic code back to non-idiomatic code is not straightforward. The usage of pythonic idioms may even have negative effects such as code redundancy, bugs, and performance degradation. To alleviate these readability issues and negative effects, we develop a transformation tool, DeIdiom, that automatically transforms idiomatic code into equivalent non-idiomatic code. We test and review over 7,572 idiomatic code instances of nine pythonic idioms (list/set/dict-comprehension, chain-comparison, truth-value-test, loop-else, assign-multi-targets, for-multi-targets, star); the results show that DeIdiom achieves high accuracy. Our user study with 20 participants demonstrates that the explanatory non-idiomatic code generated by DeIdiom helps Python users understand pythonic idioms correctly and efficiently, and leads to a more positive appreciation of pythonic idioms.
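To make the idea of "equivalent non-idiomatic code" concrete, the sketch below pairs four of the idioms named in the abstract (list-comprehension, chain-comparison, loop-else, star) with hand-written expansions. It is an illustration only: the function names are hypothetical, the expansions are one plausible reading of each idiom, and none of it is claimed to be DeIdiom's actual output or transformation rules.

```python
# Hand-written illustration; NOT output produced by DeIdiom.
# All function names here are hypothetical.

def squares_of_evens_idiomatic(numbers):
    # list-comprehension idiom
    return [n * n for n in numbers if n % 2 == 0]

def squares_of_evens_plain(numbers):
    # Explicit loop that builds the same list step by step.
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result

def in_range_idiomatic(low, value, high):
    # chain-comparison idiom
    return low <= value < high

def in_range_plain(low, value, high):
    # Two separate comparisons joined with `and`.
    return low <= value and value < high

def first_negative_idiomatic(numbers):
    # loop-else idiom: the else branch runs only if no break occurred.
    for n in numbers:
        if n < 0:
            found = n
            break
    else:
        found = None
    return found

def first_negative_plain(numbers):
    # An explicit flag records whether the loop ended via break.
    found = None
    hit_break = False
    for n in numbers:
        if n < 0:
            found = n
            hit_break = True
            break
    if not hit_break:
        found = None
    return found

def split_head_idiomatic(items):
    # star idiom in an assignment target (assumes items is non-empty)
    head, *tail = items
    return head, tail

def split_head_plain(items):
    # Indexing and slicing instead of starred unpacking
    # (also assumes items is non-empty).
    items = list(items)
    head = items[0]
    tail = items[1:]
    return head, tail
```

Each pair returns the same value for the same input, for example `squares_of_evens_idiomatic([3, -4, 8])` and `squares_of_evens_plain([3, -4, 8])`. The expanded versions are longer, but they spell out the temporary variables and control flow that the idiomatic forms leave implicit.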

Cited By

Zhang Z, Xing Z, Ren X, Lu Q, Xu X. 2024. Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models. Proceedings of the ACM on Software Engineering 1, FSE, 1107--1128. DOI: 10.1145/3643776. Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3643776

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Software maintenance tools

Published In

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

May 2024

2942 pages

ISBN:9798400702174

DOI:10.1145/3597503

Co-chairs:
Ana Paiva,
Rui Abreu,
Program Co-chairs:
Abhik Roychoudhury,
Margaret Storey

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

In-Cooperation

Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICSE '24

Sponsor:

SIGSOFT

ICSE '24: IEEE/ACM 46th International Conference on Software Engineering

April 14 - 20, 2024

Lisbon, Portugal

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
