DOI: 10.1145/3361242.3361251
research-article

An approach to helping developers learn open source projects based on machine learning

Published: 28 October 2019

Abstract

    Developers often learn good coding practices and design patterns by reading the code of well-known open-source projects, and they contribute to such projects to improve their programming skills. A developer who has just joined an existing open-source project must first read and understand its code. However, almost no project maintains design documentation. New developers can rely only on the user guide, which focuses on how to use the code rather than how to develop it, or on brief code comments, which makes the code relatively difficult to understand. To help developers learn open-source projects more quickly, we propose a machine-learning-based approach. First, we build a code structure graph for the project code by static analysis. Second, we implement a project entry recommendation approach based on clustering and machine learning that recommends project entries suitable for developers to read. Third, we implement a learning path recommendation algorithm, which recommends learning paths based on the function nodes that developers select in the code structure graph and helps them understand open-source projects better. In experiments, we select two famous C++ open-source projects, Lua and Memcached, as examples for learning path recommendation. The experimental results show that our approach saves developers considerable time in learning open-source projects while maintaining the accuracy of the recommendations.
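
    The idea of recommending a learning path from a selected function node can be illustrated with a minimal sketch. It is not the algorithm from the paper, which the abstract does not specify. It assumes the code structure graph is reduced to a plain call graph and that a "learning path" is simply the breadth-first order of functions reachable from the selected node. The graph contents and the name `learning_path` are hypothetical.

```python
from collections import deque

# Hypothetical call graph: each key is a function, each value lists the
# functions it calls. The names are illustrative and are not taken from
# Lua or Memcached.
CALL_GRAPH = {
    "main": ["parse_args", "run"],
    "parse_args": ["usage"],
    "run": ["load_config", "serve"],
    "load_config": [],
    "serve": ["load_config"],
    "usage": [],
}


def learning_path(graph, start):
    """Return the functions reachable from `start` in breadth-first order.

    Assumption: reading callers before their callees is a reasonable
    reading order. The paper's recommendation algorithm may differ.
    """
    if start not in graph:
        raise KeyError(f"unknown function: {start}")
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for callee in graph.get(node, []):
            if callee not in seen:  # visit each function only once
                seen.add(callee)
                queue.append(callee)
    return order


if __name__ == "__main__":
    print(learning_path(CALL_GRAPH, "run"))
```

    Selecting `run` as the starting node yields a path that visits `run` first and then its callees, each exactly once. A real system would likely rank or prune nodes rather than list everything reachable.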

    Cited By

    • (2024) Whodunit: Classifying Code as Human Authored or GPT-4 Generated - A case study on CodeChef problems. Proceedings of the 21st International Conference on Mining Software Repositories, 394-406. DOI: 10.1145/3643991.3644926. Online publication date: 15-Apr-2024
    • (2021) Open model of education using Open Source principles. Trendovi u poslovanju 9:1, 40-48. DOI: 10.5937/trendpos2101041V. Online publication date: 2021
    • (2021) Automatic Learning Path Recommendation for Open Source Projects Using Deep Learning on Knowledge Graphs. 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), 824-833. DOI: 10.1109/COMPSAC51774.2021.00115. Online publication date: Jul-2021

    Published In

    Internetware '19: Proceedings of the 11th Asia-Pacific Symposium on Internetware
    October 2019
    179 pages
    ISBN:9781450377010
    DOI:10.1145/3361242

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Code structure graph
    2. Learning path recommendation
    3. Machine Learning
    4. Software reverse engineering

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • National Key Research and Development Program of China
    • National Natural Science Foundation of China
    • National Basic Research Program of China

    Conference

    Internetware '19

    Acceptance Rates

    Internetware '19 Paper Acceptance Rate 20 of 35 submissions, 57%;
    Overall Acceptance Rate 55 of 111 submissions, 50%

    Article Metrics

    • Downloads (Last 12 months): 8
    • Downloads (Last 6 weeks): 0
