Research article, DOI: 10.1145/2875913.2875924

GEMiner: Mining Social and Programming Behaviors to Identify Experts in Github

Published: 06 November 2015

Abstract

Hosting over 10 million repositories, GitHub has become the largest open source community in the world. Besides being a place to share code, GitHub is also a social network in which developers can follow one another and keep track of projects that interest them. Because GitHub plays these multiple roles, integrating the heterogeneous data about each developer to identify experts is a challenging task. In this paper, we propose GEMiner, a novel approach to identifying experts in specific programming languages on GitHub. Unlike previous approaches, GEMiner analyzes both the social behaviors and the programming behaviors of a developer to determine that developer's expertise. To model social behaviors, GEMiner implements a Multi-Sources PageRank algorithm that integrates the heterogeneous social networks in GitHub. To model programming behaviors, GEMiner analyzes what developers do when they program (e.g., their commit activities and their preferred programming languages). Based on our expertise models and the programming language data we extracted, GEMiner can then identify experts in specific programming languages on GitHub. We conducted experiments on a real data set, and our results show that GEMiner identifies experts with accuracy 60% higher than that of state-of-the-art algorithms.
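
The abstract does not spell out how the Multi-Sources PageRank algorithm combines its networks, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that each social network (for example a "follow" graph and a "star" graph, both hypothetical here) is ranked separately with ordinary PageRank and that the per-network scores are then merged with a weighted sum. The function names, the weights, and the toy data are all illustrative assumptions.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # node -> list of nodes it points to


def pagerank(graph: Graph, damping: float = 0.85, iters: int = 50) -> Dict[str, float]:
    """Ordinary power-iteration PageRank on a single directed graph."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u, targets in graph.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


def combined_rank(networks: Dict[str, Graph], weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted sum of per-network PageRank scores.

    This combination rule is an assumption made for illustration only.
    """
    total = sum(weights.values())
    scores: Dict[str, float] = {}
    for name, graph in networks.items():
        for node, value in pagerank(graph).items():
            scores[node] = scores.get(node, 0.0) + weights[name] / total * value
    return scores


# Toy data (hypothetical): an edge u -> v means "u follows v" or
# "u starred a repository owned by v".
networks = {
    "follow": {"alice": ["bob"], "carol": ["bob"], "bob": []},
    "star": {"alice": ["carol"], "bob": ["carol"], "carol": []},
}
weights = {"follow": 0.7, "star": 0.3}
scores = combined_rank(networks, weights)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

With these toy weights the "follow" network dominates, so the developer followed by everyone ends up ranked first. A real system would also need the programming-behavior signals the abstract mentions, which this sketch leaves out.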

Cited By

  • (2023) Help! I need somebody. A Mapping Study about Expert Identification in Software Development. Proceedings of the XXXVII Brazilian Symposium on Software Engineering, pages 154-163. DOI: 10.1145/3613372.3613389. Online publication date: 25-Sep-2023
  • (2021) Antecedents of Different Social Network Structures on Open Source Projects Popularity. Smart Business: Technology and Data Enabled Innovative Business Models and Practices, pages 143-157. DOI: 10.1007/978-3-030-67781-7_14. Online publication date: 31-Jan-2021
  • (2019) Towards Community and Expert Detection in Open Source Global Development. 2019 IEEE 23rd International Conference on Computer Supported Cooperative Work in Design (CSCWD), pages 350-355. DOI: 10.1109/CSCWD.2019.8791872. Online publication date: May-2019

Published In

Internetware '15: Proceedings of the 7th Asia-Pacific Symposium on Internetware
November 2015
247 pages
ISBN: 9781450336413
DOI: 10.1145/2875913
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • Key Laboratory of High Confidence Software Technologies, Ministry of Education

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Experts Identification
  2. Github
  3. Social Network

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Internetware '15

Acceptance Rates

Overall Acceptance Rate 55 of 111 submissions, 50%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 03 Oct 2024

Citations

  • (2018) How swift developers handle errors. Proceedings of the 15th International Conference on Mining Software Repositories, pages 292-302. DOI: 10.1145/3196398.3196428. Online publication date: 28-May-2018
  • (2018) A Free-Choice Social Learning Network for Computational Thinking. 2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT), pages 69-71. DOI: 10.1109/ICALT.2018.00023. Online publication date: Jul-2018
