research-article

Open access

Psychometrics in Behavioral Software Engineering: A Methodological Introduction with Guidelines

Authors:

Daniel Graziotin, Stefan Wagner

ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 31, Issue 1

Article No.: 7, Pages 1 - 36

https://doi.org/10.1145/3469888

Published: 28 September 2021

Abstract

A meaningful and deep understanding of the human aspects of software engineering (SE) requires psychological constructs to be considered. Psychology theory can facilitate the systematic and sound development as well as the adoption of instruments (e.g., psychological tests, questionnaires) to assess these constructs. In particular, to ensure high quality, the psychometric properties of instruments need evaluation. In this article, we provide an introduction to psychometric theory for the evaluation of measurement instruments for SE researchers. We present guidelines that enable using existing instruments and developing new ones adequately. We conducted a comprehensive review of the psychology literature framed by the Standards for Educational and Psychological Testing. We detail activities used when operationalizing new psychological constructs, such as item pooling, item review, pilot testing, item analysis, factor analysis, statistical properties of items, reliability, validity, and fairness in testing and test bias. We provide an openly available example of a psychometric evaluation based on our guidelines. We hope to encourage a culture change in SE research towards the adoption of established methods from psychology. To improve the quality of behavioral research in SE, studies focusing on introducing, validating, and then using psychometric instruments need to be more common.

Supplementary Material

graziotin.zip (367.38 KB)

Supplemental movie, appendix, image and software files for Psychometrics in Behavioral Software Engineering: A Methodological Introduction with Guidelines

Cited By

Hicks CLee CRamsey M(2024)Developer Thriving: Four Sociocognitive Factors That Create Resilient Productivity on Software TeamsIEEE Software10.1109/MS.2024.338295741:4(68-77)Online publication date: Jul-2024
https://doi.org/10.1109/MS.2024.3382957
Höppner STichy M(2024)Traceability and reuse mechanisms, the most important properties of model transformation languagesEmpirical Software Engineering10.1007/s10664-023-10428-229:2Online publication date: 24-Feb-2024
https://dl.acm.org/doi/10.1007/s10664-023-10428-2
Valový M(2023)Psychological Aspects of Pair ProgrammingProceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering10.1145/3593434.3593458(210-216)Online publication date: 14-Jun-2023
https://doi.org/10.1145/3593434.3593458
Show More Cited By

Index Terms

Psychometrics in Behavioral Software Engineering: A Methodological Introduction with Guidelines

Published In

ACM Transactions on Software Engineering and Methodology Volume 31, Issue 1

January 2022

665 pages

ISSN:1049-331X

EISSN:1557-7392

DOI:10.1145/3481711

Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology

Copyright © 2021 Copyright held by the owner/author(s).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 September 2021

Accepted: 01 June 2021

Revised: 01 April 2021

Received: 01 July 2020

Published in TOSEM Volume 31, Issue 1

Qualifiers

Research-article
Refereed

Funding Sources

Swedish Armed Forces
Swedish Defense Materiel Administration
Swedish Governmental Agency for Innovation Systems (VINNOVA)
Marianne and Marcus Wallenberg Foundation
Alexander von Humboldt (AvH) Foundation

Article Metrics

Total Citations: 12
Total Downloads: 1,800
Downloads (Last 12 months): 661
Downloads (Last 6 weeks): 74

Reflects downloads up to 26 Jul 2024

Citations

Cited By

Hicks C, Lee C, Ramsey M. (2024). Developer Thriving: Four Sociocognitive Factors That Create Resilient Productivity on Software Teams. IEEE Software 41:4 (68-77). DOI: 10.1109/MS.2024.3382957. Online publication date: Jul-2024.
https://doi.org/10.1109/MS.2024.3382957
Höppner S, Tichy M. (2024). Traceability and reuse mechanisms, the most important properties of model transformation languages. Empirical Software Engineering 29:2. DOI: 10.1007/s10664-023-10428-2. Online publication date: 24-Feb-2024.
https://dl.acm.org/doi/10.1007/s10664-023-10428-2
Valový M. (2023). Psychological Aspects of Pair Programming. Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering (210-216). DOI: 10.1145/3593434.3593458. Online publication date: 14-Jun-2023.
https://doi.org/10.1145/3593434.3593458
Feitelson D. (2023). “We do not appreciate being experimented on”. Journal of Systems and Software 204:C. DOI: 10.1016/j.jss.2023.111774. Online publication date: 1-Oct-2023.
https://dl.acm.org/doi/10.1016/j.jss.2023.111774
Felipe D, Kalinowski M, Graziotin D, Natividade J. (2023). Psychometric instruments in software engineering research on personality. Journal of Systems and Software 203:C. DOI: 10.1016/j.jss.2023.111740. Online publication date: 1-Sep-2023.
https://dl.acm.org/doi/10.1016/j.jss.2023.111740
Börstler J, Ali N, Svensson M, Petersen K. (2023). Investigating acceptance behavior in software engineering—Theoretical perspectives. Journal of Systems and Software 198:C. DOI: 10.1016/j.jss.2022.111592. Online publication date: 1-Apr-2023.
https://dl.acm.org/doi/10.1016/j.jss.2022.111592
Lenberg P, Feldt R, Gren L, Wallgren Tengberg L, Tidefors I, Graziotin D. (2023). Qualitative software engineering research. Journal of Software: Evolution and Process 36:6. DOI: 10.1002/smr.2607. Online publication date: 12-Sep-2023.
https://dl.acm.org/doi/10.1002/smr.2607
Zolduoarrati E, Licorish S, Stanger N. (2022). Impact of individualism and collectivism cultural profiles on the behaviour of software developers. Journal of Systems and Software 192:C. DOI: 10.1016/j.jss.2022.111427. Online publication date: 1-Oct-2022.
https://dl.acm.org/doi/10.1016/j.jss.2022.111427
Stol K, Schaarschmidt M, Goldblit S. (2022). Gamification in software engineering: the mediating role of developer engagement and job satisfaction. Empirical Software Engineering 27:2. DOI: 10.1007/s10664-021-10062-w. Online publication date: 1-Mar-2022.
https://dl.acm.org/doi/10.1007/s10664-021-10062-w
Valový M. (2022). Effects of Pilot, Navigator, and Solo Programming Roles on Motivation: An Experimental Study. New Perspectives in Software Engineering (84-98). DOI: 10.1007/978-3-031-20322-0_6. Online publication date: 30-Oct-2022.
https://doi.org/10.1007/978-3-031-20322-0_6