DOI: 10.1145/3597503.3639101
research-article
Open access

Hard to Read and Understand Pythonic Idioms? DeIdiom and Explain Them in Non-Idiomatic Equivalent Code

Published: 12 April 2024

Abstract

The Python community designs pythonic idioms so that Python users can express their intent more concisely and efficiently. Our analysis of 154 Stack Overflow questions about the challenges of understanding pythonic idioms shows that Python users face various difficulties in comprehending them, and the usage of pythonic idioms in 7,577 GitHub projects shows that they are prevalent. Using a statistical sampling method, we find that pythonic idioms result in not only lexical conciseness but also the creation of variables and functions, which indicates that mapping idiomatic code back to non-idiomatic code is not straightforward. The usage of pythonic idioms may even have negative effects such as code redundancy, bugs, and performance degradation. To alleviate these readability issues and negative effects, we develop a transformation tool, DeIdiom, that automatically transforms idiomatic code into equivalent non-idiomatic code. We test and review over 7,572 idiomatic code instances of nine pythonic idioms (list/set/dict-comprehension, chain-comparison, truth-value-test, loop-else, assign-multi-targets, for-multi-targets, star); the results show that DeIdiom is highly accurate. Our user study with 20 participants demonstrates that the explanatory non-idiomatic code generated by DeIdiom helps Python users understand pythonic idioms correctly and efficiently, and leads to a more positive appreciation of pythonic idioms.
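To make the idea of "equivalent non-idiomatic code" concrete, the sketch below pairs several of the idioms named in the abstract with a hand-written longer form. These pairs are illustrations only and are not output from DeIdiom; every function name is hypothetical, and the equivalences hold only under the assumptions stated in the comments (for example, that the arguments are lists). Set and dict comprehensions expand analogously to the list case and are omitted for brevity.

```python
# Hand-written illustration; NOT DeIdiom output. All names are hypothetical.

def squares_idiomatic(nums):          # list-comprehension
    return [n * n for n in nums if n % 2 == 0]

def squares_plain(nums):
    result = []                       # extra variable the idiom hides
    for n in nums:
        if n % 2 == 0:
            result.append(n * n)
    return result

def in_range_idiomatic(low, x, high):  # chain-comparison
    return low <= x < high

def in_range_plain(low, x, high):
    return low <= x and x < high

def describe_idiomatic(items):        # truth-value-test
    return "non-empty" if items else "empty"

def describe_plain(items):
    # Assumes items is a list; other types define truthiness differently.
    if len(items) != 0:
        return "non-empty"
    return "empty"

def first_negative_idiomatic(nums):   # loop-else
    for n in nums:
        if n < 0:
            found = n
            break
    else:                             # runs only if the loop never hit break
        found = None
    return found

def first_negative_plain(nums):
    found = None
    hit_break = False                 # flag variable replaces the else clause
    for n in nums:
        if n < 0:
            found = n
            hit_break = True
            break
    if not hit_break:
        found = None
    return found

def swap_idiomatic(a, b):             # assign-multi-targets
    a, b = b, a
    return a, b

def swap_plain(a, b):
    tmp = a                           # temporary variable made explicit
    a = b
    b = tmp
    return (a, b)

def total_idiomatic(pairs):           # for-multi-targets
    total = 0
    for qty, price in pairs:
        total += qty * price
    return total

def total_plain(pairs):
    total = 0
    for pair in pairs:
        qty = pair[0]
        price = pair[1]
        total += qty * price
    return total

def largest_idiomatic(args):          # star
    return max(*args)

def largest_plain(args):
    # Assumes args has exactly three elements.
    return max(args[0], args[1], args[2])
```

Even in these small cases the longer forms introduce names that do not appear in the idiomatic version (`result`, `hit_break`, `tmp`, `pair`), which is consistent with the abstract's observation that mapping idioms back to non-idiomatic code involves creating variables.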



Published In

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
May 2024
2942 pages
ISBN: 9798400702174
DOI: 10.1145/3597503
This work is licensed under a Creative Commons Attribution International 4.0 License.

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. pythonic idioms
  2. code transformation
  3. program comprehension
