research-article

Measuring the heterogeneity of cross-company dataset

Authors:

Jia Chen,

Ye Yang,

Wen Zhang,

Gregory Gay

PROFES '10: Proceedings of the 11th International Conference on Product Focused Software

Pages 55 - 58

https://doi.org/10.1145/1961258.1961272

Published: 21 June 2010

Abstract

As a standard practice, general effort estimate models are calibrated from large cross-company datasets. However, many of the records within such datasets are taken from companies that have calibrated the model to match their own local practices. Locally calibrated models are a double-edged sword; they often improve estimate accuracy for that particular organization, but they also encourage the growth of local biases. Such biases remain present when projects from that firm are used in a new cross-company dataset. Over time, such biases compound, and the reliability and accuracy of a general model derived from the data will be affected by the increased level of heterogeneity. In this paper, we propose a statistical measure of the exact level of heterogeneity of a cross-company dataset. In experimental tests, we measure the heterogeneity of two COCOMO-based datasets and demonstrate that one is more homogeneous than the other. Such a measure has potentially important implications for both model maintainers and model users. Furthermore, a heterogeneity measure can be used to inform users of the appropriate data handling techniques.

Index Terms

Measuring the heterogeneity of cross-company dataset
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Implementation management
        Pricing and resource allocation
      2. Project and people management

Published In

PROFES '10: Proceedings of the 11th International Conference on Product Focused Software

June 2010

158 pages

ISBN:9781450302814

DOI:10.1145/1961258

General Chair: Markku Oivo (University of Oulu, Finland)

Program Chairs: M. Ali Babar (IT University of Copenhagen, Denmark), Matias Vierimaa (VTT, Oulu, Finland)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

Profes '10

Sponsor:

LERO
SIGSOFT

Profes '10: International Conference on Product Focused Software

June 21 - 23, 2010

Limerick, Ireland
