Quantifying Fairness and Discrimination in Predictive Models

  • Chapter
Machine Learning for Econometrics and Related Topics

Part of the book series: Studies in Systems, Decision and Control ((SSDC,volume 508))

Abstract

The analysis of discrimination has long interested economists and lawyers. In recent years, the computer science and machine learning literature has taken up the subject, offering a fresh reading of it. This interest follows numerous criticisms of algorithms used to translate texts or to identify people in images. With the arrival of massive data and the use of increasingly opaque algorithms, discriminatory algorithms are not surprising: enriching the data indefinitely makes it easy to obtain a proxy for a sensitive variable. According to [69], “technology is neither good nor bad, nor is it neutral”, and therefore, as claimed by [61], “machine learning won’t give you anything like gender neutrality ‘for free’ that you didn’t explicitly ask for”. In this chapter, we first recall the general context for predictive models in classification. We then present the main concepts of group fairness, which are based on independence between the sensitive variable and the prediction, possibly conditional on additional information. We go further by presenting the concepts of individual fairness. Finally, we see how to correct potential discrimination, so that a model is more ethical.

Notes

  1. Project https://immersion.media.mit.edu/.

  2. Some articles define the score as \(\boldsymbol{x}^\top \boldsymbol{\beta }\), which takes values in \(\mathbb {R}\). The score we define is an increasing function of this linear combination. Note that here, \(\Phi \) denotes the cumulative distribution function of a \(\mathcal {N}(0,1)\) variable.
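
As a rough illustration of Note 2, the sketch below assumes the score takes the probit form \(\Phi (\boldsymbol{x}^\top \boldsymbol{\beta })\); the note suggests this form but does not spell it out here. The function names and the coefficient values are hypothetical and chosen only for the example.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Cumulative distribution function Phi of a N(0,1) variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_score(x, beta):
    """Real-valued score x^T beta, as used in some articles."""
    return sum(xi * bi for xi, bi in zip(x, beta))


def probit_score(x, beta):
    """Assumed score Phi(x^T beta): increasing in x^T beta, with values in (0, 1)."""
    return standard_normal_cdf(linear_score(x, beta))


# Hypothetical coefficients and feature vectors, for illustration only.
beta = [0.5, -1.0]
x_low = [0.0, 1.0]   # x^T beta = -1.0
x_high = [2.0, 0.0]  # x^T beta = +1.0

print(linear_score(x_low, beta), probit_score(x_low, beta))
print(linear_score(x_high, beta), probit_score(x_high, beta))
```

Because \(\Phi \) is increasing, ranking individuals by either score gives the same order; only the scale changes, from \(\mathbb {R}\) to the interval \((0,1)\).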

References

  1. Agarwal, S.: Trade-offs between fairness and interpretability in machine learning. In: IJCAI 2021 Workshop on AI for Social Good (2021)

  2. Aigner, D.J., Cain, G.G.: Statistical theories of discrimination in labor markets. Indus. Labor Relat. Rev. 30(2), 175–187 (1977)

  3. Avraham, R., Logue, K.D., Schwarcz, D.: Towards a universal framework for insurance anti-discrimination laws. Connecticut Insur. Law J. 21, 1 (2014)

  4. Awasthi, P., Kleindessner, M., Morgenstern, J.: Equalized odds postprocessing under imperfect group information. In: International Conference on Artificial Intelligence and Statistics, pp. 1770–1780. PMLR (2020)

  5. Barocas, S., Hardt, M., Narayanan, A.: Fairness and Machine Learning (2019). fairmlbook.org

  6. Barry, L., Charpentier, A.: The Fairness of Machine Learning in Insurance: New Rags for an Old Man? (2022)

  7. Bechavod, Y., Ligett, K.: Penalizing Unfairness in Binary Classification (2017). arXiv:1707.00044

  8. Becker, G.S.: The Economics of Discrimination. University of Chicago Press (1957)

  9. Becker, G.S.: Is Ethnic and Other Profiling Discrimination? The Becker-Posner Blog (2005)

  10. Bergstrom, C.T., West, J.D.: Calling Bullshit: The Art of Skepticism in a Data-Driven World. Random House Trade Paperbacks (2021)

  11. Berk, R., Heidari, H., Jabbari, S., Joseph, M., Kearns, M., Morgenstern, J., Neel, S., Roth, A.: A convex framework for fair regression (2017). arXiv:1706.02409

  12. Berk, R., Heidari, H., Jabbari, S., Kearns, M., Roth, A.: Fairness in criminal justice risk assessments: the state of the art. Sociol. Methods Res. 50(1), 3–44 (2021)

  13. Besse, P., del Barrio, E., Gordaliza, P., Loubes, J.-M.: Confidence Intervals for Testing Disparate Impact in Fair Learning (2018). arXiv:1807.06362

  14. Beutel, A., Chen, J., Doshi, T., Qian, H., Wei, L., Wu, Y., Heldt, L., Zhao, Z., Hong, L., Chi, E.H., et al.: Fairness in recommendation ranking through pairwise comparisons. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2212–2220 (2019)

  15. Beutel, A., Chen, J., Zhao, Z., Chi, E.H.: Data decisions and theoretical implications when adversarially learning fair representations (2017). arXiv:1707.00075

  16. Biddle, D.: Adverse Impact and Test Validation: A Practitioner’s Guide to Valid and Defensible Employment Testing. Routledge (2017)

  17. Bohren, J.A., Haggag, K., Imas, A., Pope, D.G.: Inaccurate statistical discrimination: An identification problem. Technical report, National Bureau of Economic Research (2019)

  18. Borkan, D., Dixon, L., Sorensen, J., Thain, N., Vasserman, L.: Nuanced metrics for measuring unintended bias with real data for text classification. In: Companion Proceedings of the 2019 World Wide Web Conference, pp. 491–500 (2019)

  19. Calmon, F.P., Wei, D., Ramamurthy, K.N., Varshney, K.R.: Optimized data pre-processing for discrimination prevention (2017). arXiv:1704.03354

  20. Caton, S., Haas, C.: Fairness in machine learning: A survey (2020). arXiv:2010.04053

  21. Chakraborty, S., Raghavan, K.R., Johnson, M.P., Srivastava, M.B.: A framework for context-aware privacy of sensor data on mobile systems. In: Proceedings of the 14th Workshop on Mobile Computing Systems and Applications, HotMobile ’13. Association for Computing Machinery (2013)

  22. Charles, K.K., Guryan, J.: Studying discrimination: fundamental challenges and recent progress. Ann. Rev. Econ. 3(1), 479–511 (2011)

  23. Charpentier, A.: Insurance: Biases, Discrimination and Fairness. Institut Louis Bachelier (2022)

  24. Charpentier, A., Flachaire, E., Gallic, E.: Causal inference with optimal transport. In: Thach, N.N., Kreinovich, V., Ha, D.T., Trung, N.D. (eds.) Optimal Transport Statistics for Economics and Related Topics. Springer Verlag (2023)

  25. Charpentier, A., Flachaire, E., Ly, A.: Econometrics and machine learning. Economie et Statistique 505(1), 147–169 (2018)

  26. Chicco, D., Jurman, G.: The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 21(1), 1–13 (2020)

  27. Cho, J., Hwang, G., Suh, C.: A fair classifier using mutual information. In: 2020 IEEE International Symposium on Information Theory (ISIT), pp. 2521–2526. IEEE (2020)

  28. Chouldechova, A.: Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017)

  29. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision making and the cost of fairness (2017). arXiv:1701.08230

  30. Cornu, G.: Vocabulaire juridique, vol. 6. Presses Universitaires de France (2016)

  31. Cunningham, S.: Causal Inference. Yale University Press (2021)

  32. Dalenius, T.: Towards a methodology for statistical disclosure control. Statistik Tidskrift 15, 429–444 (1977)

  33. Daniels, N.: The Functions of Insurance and the Fairness of Genetic Underwriting, pp. 119–145. Medical Underwriting and Social Policy, Genetics and Life Insurance (2004)

  34. Autor, D.H.: Why are there still so many jobs? The history and future of workplace automation. J. Econ. Perspect. 29(3), 3–30 (2015)

  35. Dawid, A.P.: The well-calibrated Bayesian. J. Am. Stat. Assoc. 77(379), 605–610 (1982)

  36. Denuit, M., Charpentier, A., Trufin, J.: Autocalibration and Tweedie-dominance for insurance pricing with machine learning. Math. Econ. Insurance (2021)

  37. Duivesteijn, W., Feelders, A.: Nearest neighbour classification with monotonicity constraints. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 301–316. Springer (2008)

  38. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214–226 (2012)

  39. Dwork, C., Immorlica, N., Kalai, A.T., Leiserson, M.: Decoupled classifiers for group-fair and efficient machine learning. In: Conference on Fairness, Accountability and Transparency, pp. 119–133. PMLR (2018)

  40. Edgeworth, F.Y.: Equal pay to men and women for equal work. Econ. J. 32(128), 431–457 (1922)

  41. Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., Venkatasubramanian, S.: Certifying and removing disparate impact. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 259–268 (2015)

  42. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E.P., Roth, D.: A comparative study of fairness-enhancing interventions in machine learning. In: Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 329–338 (2019)

  43. Gajane, P., Pechenizkiy, M.: On formalizing fairness in prediction with machine learning (2017). arXiv:1710.03184

  44. Gambs, S., Killijian, M.-O., del Prado Cortez, M.N.: Show me how you move and I will tell you who you are. In: Proceedings of the 3rd ACM International Workshop on Security and Privacy in GIS and LBS (2010)

  45. Gebelein, H.: Das statistische Problem der Korrelation als Variations- und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung. ZAMM: J. Appl. Math. Mech./Zeitschrift für Angewandte Mathematik und Mechanik 21(6), 364–379 (1941)

  46. Goodfellow, I., McDaniel, P., Papernot, N.: Making machine learning robust against adversarial inputs. Commun. ACM 61(7), 56–66 (2018)

  47. Grari, V., Charpentier, A., Lamprier, S., Detyniecki, M.: A fair pricing model via adversarial learning (2022). arXiv:2202.12008

  48. Hale, K.: A.I. bias caused 80% of black mortgage applicants to be denied. Forbes 09 (2021)

  49. Harcourt, B.E.: Surveiller et punir à l’âge actuariel. Déviance et Société 35, 163 (2011)

  50. Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. Adv. Neural Inform. Process. Syst. 29, 3315–3323 (2016)

  51. Hastie, T., Tibshirani, R., Wainwright, M.: Statistical learning with sparsity. Monogr. Stat. Appl. Probab. 143, 143 (2015)

  52. Hernán, M.A., Robins, J.M.: Causal inference (2010)

  53. Hirschfeld, H.O.: A connection between correlation and contingency. Math. Proc. Cambridge Philos. Soc. 31(4), 520–524 (1935)

  54. Imai, K.: Quantitative Social Science: An Introduction. Princeton University Press (2018)

  55. Imbens, G.W., Rubin, D.B.: Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press (2015)

  56. Ito, J.: Supposedly ‘fair’ algorithms can perpetuate discrimination. Wired 02(05), 2019 (2021)

  57. Jung, C., Kannan, S., Lee, C., Pai, M. M., Roth, A., Vohra, R.: Fair prediction with endogenous behavior (2020). arXiv:2002.07147

  58. Kamiran, F., Calders, T.: Classifying without discriminating. In: 2009 2nd International Conference on Computer, Control and Communication, pp. 1–6. IEEE (2009)

  59. Kamiran, F., Calders, T.: Data preprocessing techniques for classification without discrimination. Knowl. Inform. Syst. 33(1), 1–33 (2012)

  60. Kamishima, T., Akaho, S., Sakuma, J.: Fairness-aware learning through regularization approach. In: 2011 IEEE 11th International Conference on Data Mining Workshops, pp. 643–650 (2011)

  61. Kearns, M., Roth, A.: The Ethical Algorithm: The Science of Socially Aware Algorithm Design. Oxford University Press (2019)

  62. Kelly, H.: A Priest’s Phone Location Data Outed His Private Life. It Could Happen to Anyone. The Washington Post, 22 July 2021 (2021)

  63. Khaitan, T.: Indirect discrimination. In: Lippert-Rasmussen, K. (ed.) Handbook of the Ethics of Discrimination, pp. 30–41. Routledge (2017)

  64. Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.: Avoiding discrimination through causal reasoning (2017). arXiv:1706.02744

  65. Kim, M.P., Reingold, O., Rothblum, G.N.: Fairness through computationally-bounded awareness (2018). arXiv:1803.03239

  66. Kim, P.T.: Auditing Algorithms for Discrimination. Univ. Pennsylvania Law Rev. 166, 189 (2017)

  67. Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., Mullainathan, S.: Human decisions and machine predictions. Q. J. Econ. 133(1), 237–293 (2017)

  68. Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair determination of risk scores (2016). arXiv:1609.05807

  69. Kranzberg, M.: Technology and history: “Kranzberg’s laws”. Technol. Cult. 27(3), 544–560 (1986)

  70. Krüger, F., Ziegel, J.F.: Generic conditions for forecast dominance. J. Bus. Econ. Stat. 39(4), 972–983 (2021)

  71. Kusner, M.J., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 4066–4076. NIPS (2017)

  72. Lippert-Rasmussen, K.: Nothing personal: On statistical discrimination. J. Polit. Philos. 15(4), 385–403 (2007)

  73. Lohia, P.K., Ramamurthy, K.N., Bhide, M., Saha, D., Varshney, K.R., Puri, R.: Bias mitigation post-processing for individual and group fairness. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2847–2851. IEEE (2019)

  74. Luong, B.T., Ruggieri, S., Turini, F.: \(k\)-NN as an implementation of situation testing for discrimination discovery and prevention. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 502–510 (2011)

  75. Matthews, B.W.: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA)-Protein Struct. 405(2), 442–451 (1975)

  76. Mayer, J., Mutchler, P., Mitchell, J.C.: Evaluating the privacy properties of telephone metadata. Proc. Natl. Acad. Sci. 113(20), 5536–5541 (2016)

  77. McKinsey: Technology, Jobs and the Future of Work. McKinsey Global Institute (2017)

  78. Menon, A.K., Williamson, R.C.: The cost of fairness in binary classification. In: Conference on Fairness, Accountability and Transparency, pp. 107–118. PMLR (2018)

  79. Mercat-Bruns, M.: Discrimination at Work. University of California Press (2016)

  80. Miracle, J.M.: De-Anonymization Attack Anatomy and Analysis of Ohio Nursing Workforce Data Anonymization. PhD thesis, Wright State University (2016)

  81. Morgan, S.L., Winship, C.: Counterfactuals and Causal Inference. Cambridge University Press (2015)

  82. Palmer, D.E.: Insurance, risk assessment and fairness: An ethical analysis. In: Insurance Ethics for a More Ethical World. Emerald Group Publishing Limited (2007)

  83. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann (1988)

  84. Pearl, J., Mackenzie, D.: The Book of Why: The New Science of Cause and Effect. Basic Books (2018)

  85. Pedreshi, D., Ruggieri, S., Turini, F.: Discrimination-aware data mining. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 560–568 (2008)

  86. Phelps, E.S.: The statistical theory of racism and sexism. Am. Econ. Rev. 62(4), 659–661 (1972)

  87. Rényi, A.: On measures of dependence. Acta Mathematica Hungarica 10(3–4), 441–451 (1959)

  88. Rothschild-Elyassi, G., Koehler, J., Simon, J.: Actuarial Justice, chapter 14, pp. 194–206. John Wiley & Sons, Ltd (2018)

  89. Rubin, D.B.: Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 66(5), 688 (1974)

  90. Scott, J., Marshall, G.: A Dictionary of Sociology. Oxford University Press, USA (2009)

  91. Sokolova, M., Japkowicz, N., Szpakowicz, S.: Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. In: Australasian Joint Conference on Artificial Intelligence, pp. 1015–1021. Springer (2006)

  92. Thomas, R.G.: Some novel perspectives on risk classification. The Geneva Papers on Risk and Insurance-Issues and Practice 32(1), 105–132 (2007)

  93. Thomsen, F.K.: Direct discrimination. In: Lippert-Rasmussen, K. (ed.) Handbook of the Ethics of Discrimination, pp. 19–29. Routledge (2017)

  94. Van Calster, B., McLernon, D.J., Van Smeden, M., Wynants, L., Steyerberg, E.W.: Calibration: the Achilles heel of predictive analytics. BMC Med. 17(1), 1–7 (2019)

  95. Verma, S., Rubin, J.: Fairness definitions explained. In: 2018 IEEE/ACM International Workshop on Software Fairness (Fairware), pp. 1–7. IEEE (2018)

  96. Vogel, R., Bellet, A., Clémençon, S., et al.: Learning fair scoring functions: Bipartite ranking under ROC-based fairness constraints. In: International Conference on Artificial Intelligence and Statistics, pp. 784–792. PMLR (2021)

  97. Žliobaitė, I.: On the relation between accuracy and fairness in binary classification (2015). arXiv:1505.05723

  98. Wirth, L.: Morale and minority groups. Am. J. Sociol. 47(3), 415–433 (1941)

  99. Zafar, M.B., Valera, I., Rodriguez, M.G., Gummadi, K.P.: Fairness constraints: Mechanisms for fair classification (2017). arXiv:1507.05259

  100. Zhang, B.H., Lemoine, B., Mitchell, M.: Mitigating unwanted biases with adversarial learning (2018). arXiv:1801.07593

  101. Žliobaitė, I.: Measuring discrimination in algorithmic decision making. Data Mining Knowl. Discov. 31(4), 1060–1089 (2017)

Acknowledgements

Arthur Charpentier acknowledges the financial support of the AXA Research Fund through the joint research initiative use and value of unusual data in actuarial science, as well as NSERC grant 2019-07077.

Author information

Corresponding author

Correspondence to Arthur Charpentier.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Charpentier, A. (2024). Quantifying Fairness and Discrimination in Predictive Models. In: Kreinovich, V., Sriboonchitta, S., Yamaka, W. (eds) Machine Learning for Econometrics and Related Topics. Studies in Systems, Decision and Control, vol 508. Springer, Cham. https://doi.org/10.1007/978-3-031-43601-7_3

  • DOI: https://doi.org/10.1007/978-3-031-43601-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43600-0

  • Online ISBN: 978-3-031-43601-7

  • eBook Packages: Engineering, Engineering (R0)
