DOI: 10.1145/1137983.1138016

Mining email social networks

Published: 22 May 2006

Abstract

Communication and co-ordination activities are central to large software projects, but they are difficult to observe and study in traditional (closed-source, commercial) settings because of the prevalence of informal, direct communication modes. OSS projects, on the other hand, use the internet as the communication medium, and typically conduct discussions in an open, public manner. As a result, the email archives of OSS projects provide a useful trace of the communication and co-ordination activities of the participants. However, there are various challenges that must be addressed before this data can be effectively mined. Once this is done, we can construct social networks of email correspondents, and begin to address some interesting questions. These include questions relating to participation in the email discussions; the social status of different types of OSS participants; the relationship between email activity and commit activity (in the CVS repositories); and the relationship between social status and commit activity. In this paper, we begin with a discussion of our infrastructure (including a novel use of Scientific Workflow software), then discuss our approach to mining the email archives, and finally present some preliminary results from our data analysis.
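As a rough illustration of what constructing a social network of email correspondents could look like, the sketch below links each message to the message it replies to and counts directed replier-to-sender edges. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `reply_edges` and `in_degree`, the use of the `In-Reply-To` header as the only link between messages, and the sample messages are all hypothetical, and the alias-resolution challenges the abstract alludes to are not handled.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def reply_edges(raw_messages):
    """Count directed (replier, original sender) pairs in raw email messages.

    Assumption: a reply is identified solely by its In-Reply-To header.
    """
    sender_of = {}  # Message-ID -> lower-cased sender address
    parsed = []     # (sender, parent Message-ID) for every message
    for raw in raw_messages:
        msg = message_from_string(raw)
        sender = parseaddr(msg.get("From", ""))[1].lower()
        msg_id = (msg.get("Message-ID") or "").strip()
        if sender and msg_id:
            sender_of[msg_id] = sender
        parsed.append((sender, (msg.get("In-Reply-To") or "").strip()))

    edges = Counter()
    for sender, parent_id in parsed:
        target = sender_of.get(parent_id)
        # Skip messages whose parent is unknown, and self-replies.
        if sender and target and target != sender:
            edges[(sender, target)] += 1
    return edges


def in_degree(edges):
    """Number of distinct correspondents who replied to each address."""
    degree = Counter()
    for _, target in edges:
        degree[target] += 1
    return degree


# Hypothetical three-message thread: Bob and Carol both reply to Alice.
SAMPLE = [
    "From: Alice <alice@example.org>\nMessage-ID: <1@example.org>\n\nhello\n",
    "From: Bob <bob@example.org>\nMessage-ID: <2@example.org>\n"
    "In-Reply-To: <1@example.org>\n\nreply\n",
    "From: carol@example.org\nMessage-ID: <3@example.org>\n"
    "In-Reply-To: <1@example.org>\n\nreply\n",
]

if __name__ == "__main__":
    edges = reply_edges(SAMPLE)
    print(dict(edges))
    print(dict(in_degree(edges)))
```

In-degree is used here only as a simple stand-in for the "social status" measures the abstract mentions; a real analysis would first have to merge the multiple addresses a single participant may use.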


Published In

MSR '06: Proceedings of the 2006 international workshop on Mining software repositories
May 2006
191 pages
ISBN:1595933972
DOI:10.1145/1137983

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. open source
  2. social networks

Qualifiers

  • Article

Conference

ICSE06