
A Survey of Controlled Experiments in Software Engineering

Published: 01 September 2005

Abstract

The classical method for identifying cause-effect relationships is to conduct controlled experiments. This paper reports on the current state of how controlled experiments in software engineering are conducted and the extent to which relevant information is reported. Among the 5,453 scientific articles published in 12 leading software engineering journals and conferences in the decade from 1993 to 2002, 103 articles (1.9 percent) reported controlled experiments in which individuals or teams performed one or more software engineering tasks. This survey quantitatively characterizes the topics of the experiments and their subjects (number of subjects, students versus professionals, recruitment, and rewards for participation), tasks (type of task, duration, and type and size of application), and environments (location, development tools). Furthermore, the survey reports on how internal and external validity is addressed and the extent to which experiments are replicated. The gathered data reflect the relevance of software engineering experiments to industrial practice and the scientific maturity of software engineering research.
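The headline statistic follows directly from the two counts given in the abstract: 103 of 5,453 articles is roughly 1.9 percent. As a quick arithmetic check, the short Python sketch below re-derives the figure (the variable names are illustrative, not taken from the paper):

    # Re-derive the proportion of surveyed articles that report
    # controlled experiments, using the counts quoted in the abstract.
    total_articles = 5453       # articles in 12 leading journals/conferences, 1993-2002
    experiment_articles = 103   # articles reporting controlled experiments

    share = 100 * experiment_articles / total_articles
    print(f"{share:.1f} percent")   # prints "1.9 percent", matching the abstract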



Reviews

Larry Bernstein

This is a very important and insightful paper. By digging through the limited literature on controlled software engineering experiments, the authors have done a yeoman's job. They show clearly that planned methods work and that agile methods also work; as with all software engineering methods, which works better depends on the nature of the problem. The authors show what has been done and conclude that more experiments are needed. In their summary, they point out that "although as many as 108 institutions from 19 countries were involved in conducting the [software engineering] experiments, a relatively low proportion of software engineering articles (1.9 percent) report controlled experiments." The authors go on to excuse this miscarriage of professional ethics, but they should not. Their data is an indictment of computer science and software engineering. Enough money is spent, much of it wasted on bankrupt software development, that the field demands controlled experiments. Repeatable experiments can elevate better hypotheses to the level of theory. In section 9, the authors state that "only 18 percent of the surveyed experiments were replications." We need to mount a large effort, and obtain the necessary resources, to build a truly scientific theory of software design and implementation. The National Science Foundation in the US funds only a very limited amount of this kind of work. I was pleased to see, as the authors show in Table 4, that so many countries are participating in this global search, with Visaggio and Lanubile at the University of Bari in Italy leading the way.

Online Computing Reviews Service



Published In

IEEE Transactions on Software Engineering, Volume 31, Issue 9
September 2005
88 pages

Publisher

IEEE Press

Publication History

Published: 01 September 2005

Author Tags

  1. Controlled experiments
  2. empirical software engineering
  3. research methodology
  4. survey

Qualifiers

  • Research-article


Bibliometrics & Citations


Article Metrics

  • Downloads (last 12 months): 0
  • Downloads (last 6 weeks): 0
Reflects downloads up to 17 Oct 2024.


Cited By

  • (2024) "Establishing Metrics to Encourage Broader Use of Atomic Requirements - A Call for Exchange and Experimentation," ACM SIGSOFT Software Engineering Notes, vol. 49, no. 3, pp. 23-26, doi: 10.1145/3672089.3672096, 18 July 2024.
  • (2024) "Rocks Coding, Not Development: A Human-Centric, Experimental Evaluation of LLM-Supported SE Tasks," Proceedings of the ACM on Software Engineering, vol. 1, no. FSE, pp. 699-721, doi: 10.1145/3643758, 12 July 2024.
  • (2024) "An empirical evaluation of RAIDE," Science of Computer Programming, vol. 231, doi: 10.1016/j.scico.2023.103013, 1 Jan. 2024.
  • (2024) "Guidelines for using financial incentives in software-engineering experimentation," Empirical Software Engineering, vol. 29, no. 5, doi: 10.1007/s10664-024-10517-w, 10 Aug. 2024.
  • (2023) "Rethinking People Analytics With Inverse Transparency by Design," Proceedings of the ACM on Human-Computer Interaction, vol. 7, no. CSCW2, pp. 1-29, doi: 10.1145/3610083, 4 Oct. 2023.
  • (2023) "What's (Not) Working in Programmer User Studies?" ACM Transactions on Software Engineering and Methodology, vol. 32, no. 5, pp. 1-32, doi: 10.1145/3587157, 24 July 2023.
  • (2023) "How Do Computing Education Researchers Talk About Threats and Limitations?" Proceedings of the 2023 ACM Conference on International Computing Education Research, vol. 1, pp. 381-396, doi: 10.1145/3568813.3600114, 7 Aug. 2023.
  • (2023) "From Code Complexity Metrics to Program Comprehension," Communications of the ACM, vol. 66, no. 5, pp. 52-61, doi: 10.1145/3546576, 21 Apr. 2023.
  • (2023) "Studying the Influence and Distribution of the Human Effort in a Hybrid Fitness Function for Search-Based Model-Driven Engineering," IEEE Transactions on Software Engineering, vol. 49, no. 12, pp. 5189-5202, doi: 10.1109/TSE.2023.3329730, 1 Dec. 2023.
  • (2023) "Construct Validity in Software Engineering," IEEE Transactions on Software Engineering, vol. 49, no. 3, pp. 1374-1396, doi: 10.1109/TSE.2022.3176725, 1 Mar. 2023.
