
Missing the missing values: The ugly duckling of fairness in machine learning

Published: 28 May 2021

Abstract

Nowadays, there is increasing concern in machine learning about the causes underlying unfair decision making, that is, algorithmic decisions that discriminate against some groups over others, especially groups defined by protected attributes such as gender, race and nationality. Missing values are one frequent manifestation of all these latent causes: protected groups are more reluctant to give information that could be used against them, sensitive information for some groups can be erased by human operators, or data acquisition may simply be less complete and systematic for minority groups. However, most recent techniques, libraries and experimental results dealing with fairness in machine learning have simply ignored missing data. In this paper, we present the first comprehensive analysis of the relation between missing values and algorithmic fairness for machine learning: (1) we analyse the sources of missing data and bias, mapping the common causes; (2) we find that rows containing missing values are usually fairer than the rest, which should discourage treating missing values as the uncomfortable ugly data that techniques and libraries for handling algorithmic bias discard at the first opportunity; (3) we study the trade-off between performance and fairness when the rows with missing values are used (either because the technique deals with them directly or through imputation methods); and (4) we show that the sensitivity of six different machine-learning techniques to missing values is usually low, which reinforces the view that the rows with missing data contribute more to fairness through the other, non-missing, attributes. We end the paper with a series of recommended procedures about what to do with missing data when aiming for fair decision making.
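
To make the trade-off studied in point (3) concrete, the sketch below contrasts two common treatments of rows with missing values (listwise deletion versus mean imputation) and audits each with both accuracy and a simple group-fairness measure, the demographic parity difference. It is a minimal illustration under stated assumptions, not the paper's experimental pipeline: the synthetic data, the missingness mechanism that affects the minority group more often, and the scikit-learn logistic regression are all choices made for brevity.

```python
# Minimal sketch (not the paper's pipeline): compare dropping rows with
# missing values against mean imputation, tracking accuracy and the
# demographic parity difference for a protected attribute. The data,
# column names and model are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4000
df = pd.DataFrame({
    "group": rng.integers(0, 2, n),  # protected attribute (0 = majority, 1 = minority)
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
})
df["label"] = ((df.x1 + 0.5 * df.group + rng.normal(scale=0.5, size=n)) > 0).astype(int)
# Make x2 missing far more often for the minority group (a MAR-like mechanism).
mask = rng.random(n) < np.where(df.group == 1, 0.35, 0.05)
df.loc[mask, "x2"] = np.nan

def fit_and_audit(data: pd.DataFrame) -> tuple[float, float]:
    """Return (accuracy, demographic parity difference) on a held-out split."""
    train, test = train_test_split(data, test_size=0.3, random_state=0)
    model = LogisticRegression().fit(train[["x1", "x2"]], train["label"])
    pred = model.predict(test[["x1", "x2"]])
    acc = (pred == test["label"]).mean()
    # Demographic parity difference: |P(pred=1 | group=0) - P(pred=1 | group=1)|
    rates = pd.Series(pred, index=test.index).groupby(test["group"]).mean()
    return acc, abs(rates[0] - rates[1])

# Strategy 1: listwise deletion, which discards minority rows disproportionately.
print("drop   :", fit_and_audit(df.dropna()))

# Strategy 2: mean imputation, which keeps every row in the data.
imputed = df.fillna({"x2": df["x2"].mean()})
print("impute :", fit_and_audit(imputed))
```

Because deletion removes minority rows disproportionately, the audited disparity will typically differ between the two strategies; the paper's broader point is that discarding incomplete rows can discard exactly the rows that make the data fairer.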

Published In

International Journal of Intelligent Systems, Volume 36, Issue 7 (July 2021), 606 pages
ISSN: 0884-8173
DOI: 10.1002/int.v36.7
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Publisher

John Wiley and Sons Ltd., United Kingdom

Author Tags

  1. algorithmic bias
  2. confirmation bias
  3. data imputation
  4. fairness
  5. missing values
  6. sample bias
  7. survey bias

Qualifiers

  • Research-article


Cited By

  • (2024) A Survey on Trustworthy Recommender Systems. ACM Transactions on Recommender Systems. DOI: 10.1145/3652891. Online publication date: 13-Apr-2024.
  • (2024) The Impact of Differential Feature Under-reporting on Algorithmic Fairness. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 1355-1382. DOI: 10.1145/3630106.3658977. Online publication date: 3-Jun-2024.
  • (2024) Fairness in Machine Learning: A Survey. ACM Computing Surveys, 56(7), 1-38. DOI: 10.1145/3616865. Online publication date: 9-Apr-2024.
  • (2024) Towards a Non-Ideal Methodological Framework for Responsible ML. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3613904.3642501. Online publication date: 11-May-2024.
  • (2023) Adapting fairness interventions to missing values. Proceedings of the 37th International Conference on Neural Information Processing Systems, 59388-59409. DOI: 10.5555/3666122.3668717. Online publication date: 10-Dec-2023.
  • (2023) Aleatoric and epistemic discrimination. Proceedings of the 37th International Conference on Neural Information Processing Systems, 27040-27062. DOI: 10.5555/3666122.3667298. Online publication date: 10-Dec-2023.
  • (2023) Towards Risk-Free Trustworthy Artificial Intelligence. International Journal of Intelligent Systems, 2023. DOI: 10.1155/2023/4459198. Online publication date: 1-Jan-2023.
  • (2023) Fairness Without Demographic Data: A Survey of Approaches. Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, 1-12. DOI: 10.1145/3617694.3623234. Online publication date: 30-Oct-2023.
  • (2023) Can language models automate data wrangling? Machine Learning, 112(6), 2053-2082. DOI: 10.1007/s10994-022-06259-9. Online publication date: 1-Jun-2023.
  • (2023) A systematic review of generative adversarial imputation network in missing data imputation. Neural Computing and Applications, 35(27), 19685-19705. DOI: 10.1007/s00521-023-08840-2. Online publication date: 1-Sep-2023.
