Abstract
The analysis of discrimination has long interested economists and lawyers. In recent years, the computer science and machine learning literature has taken up the subject and offered a fresh reading of it. This interest stems from numerous criticisms of algorithms used to translate texts or to identify people in images. With the arrival of massive data and the use of increasingly opaque algorithms, discriminatory algorithms are not surprising: by enriching the data indefinitely, it has become easy to obtain a proxy for a sensitive variable. According to [69], “technology is neither good nor bad, nor is it neutral”, and therefore “machine learning won’t give you anything like gender neutrality ‘for free’ that you didn’t explicitly ask for”, as claimed by [61]. In this article, we review the general context for predictive models in classification. We present the main concepts of group fairness, which are based on independence between the sensitive variable and the prediction, possibly conditional on additional information. We then go further and present the concepts of individual fairness. Finally, we see how to correct potential discrimination, in order to guarantee that a model is more ethical.
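As a purely illustrative sketch of what independence between the sensitive variable and the prediction can mean in practice, the Python snippet below compares positive-prediction rates across two groups, first unconditionally and then conditionally on the observed outcome. The toy data, the names `s`, `y`, `y_hat` and the helper `positive_rate` are assumptions made for this example and do not come from the chapter.

```python
def positive_rate(y_hat, s, group, y=None, given=None):
    """Share of positive predictions within `group`.

    If `given` is set, only observations with y == given are kept,
    which yields a rate conditional on the observed outcome.
    Raises ZeroDivisionError if no observation matches.
    """
    keep = [
        i for i in range(len(s))
        if s[i] == group and (given is None or y[i] == given)
    ]
    return sum(y_hat[i] for i in keep) / len(keep)


# Hypothetical toy data: sensitive attribute, observed outcome, binary prediction.
s     = ["A", "A", "A", "A", "B", "B", "B", "B"]
y     = [1, 1, 0, 0, 1, 1, 0, 0]
y_hat = [1, 1, 1, 0, 1, 0, 0, 0]

# Unconditional independence (often called demographic parity):
# gap in P(y_hat = 1) between the two groups.
dp_gap = positive_rate(y_hat, s, "A") - positive_rate(y_hat, s, "B")

# Independence conditional on the outcome (often called equalized odds):
# gaps in true positive rates (y = 1) and false positive rates (y = 0).
tpr_gap = (positive_rate(y_hat, s, "A", y, given=1)
           - positive_rate(y_hat, s, "B", y, given=1))
fpr_gap = (positive_rate(y_hat, s, "A", y, given=0)
           - positive_rate(y_hat, s, "B", y, given=0))

print(dp_gap, tpr_gap, fpr_gap)
```

A gap of zero in each comparison would be consistent with the corresponding independence criterion on this sample; in practice, sampling noise means such gaps should be read with confidence intervals rather than as exact tests.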
Notes
1. Project https://immersion.media.mit.edu/.
2. Some articles define the score as \(\boldsymbol{x}^\top \boldsymbol{\beta }\), which takes values in \(\mathbb {R}\). The score we define is an increasing function of this linear combination. Note that here, \(\Phi \) denotes the cumulative distribution function of a \(\mathcal {N}(0,1)\) variable.
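The distinction in this note can be sketched in a few lines of Python. The snippet assumes that the increasing function is \(\Phi \) itself (a probit-type score), which the note suggests but does not state outright; the coefficient vector `beta` and the feature vectors are hypothetical values chosen only for illustration.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z): cumulative distribution function of N(0,1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_index(x, beta):
    """The real-valued linear combination x' beta."""
    return sum(xi * bi for xi, bi in zip(x, beta))


def probit_score(x, beta):
    """Score in (0, 1), obtained as Phi(x' beta) (assumed link, see lead-in)."""
    return std_normal_cdf(linear_index(x, beta))


# Hypothetical coefficients and two hypothetical individuals.
beta = [0.5, -1.0]
x_low = [0.0, 1.0]    # linear index = -1.0
x_high = [2.0, 0.0]   # linear index = +1.0

for x in (x_low, x_high):
    print(linear_index(x, beta), probit_score(x, beta))
```

Because \(\Phi \) is strictly increasing, ranking individuals by the linear index or by the transformed score gives the same order; only the scale changes, from \(\mathbb {R}\) to the interval (0, 1).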
References
Agarwal, S.: Trade-offs between fairness and interpretability in machine learning. In: IJCAI 2021 Workshop on AI for Social Good (2021)
Aigner, D.J., Cain, G.G.: Statistical theories of discrimination in labor markets. Indus. Labor Relat. Rev. 30(2), 175–187 (1977)
Avraham, R., Logue, K.D., Schwarcz, D.: Towards a universal framework for insurance anti-discrimination laws. Connecticut Insur. Law J. 21, 1 (2014)
Awasthi, P., Kleindessner, M., Morgenstern, J.: Equalized odds postprocessing under imperfect group information. In: International Conference on Artificial Intelligence and Statistics, pp. 1770–1780. PMLR (2020)
Barocas, S., Hardt, M., Narayanan, A.: Fairness and Machine Learning (2019). fairmlbook.org
Barry, L., Charpentier, A.: The Fairness of Machine Learning in Insurance: New Rags for an Old Man? (2022)
Bechavod, Y., Ligett, K.: Penalizing Unfairness in Binary Classification (2017). arXiv:1707.00044
Becker, G.S.: The Economics of Discrimination. University of Chicago Press (1957)
Becker, G.S.: Is Ethnic and Other Profiling Discrimination? The Becker-Posner Blog (2005)
Bergstrom, C.T., West, J.D.: Calling Bullshit: The Art of Skepticism in a Data-Driven World. Random House Trade Paperbacks (2021)
Berk, R., Heidari, H., Jabbari, S., Joseph, M., Kearns, M., Morgenstern, J., Neel, S., Roth, A.: A convex framework for fair regression (2017). arXiv:1706.02409
Berk, R., Heidari, H., Jabbari, S., Kearns, M., Roth, A.: Fairness in criminal justice risk assessments: the state of the art. Sociol. Methods Res. 50(1), 3–44 (2021)
Besse, P., del Barrio, E., Gordaliza, P., Loubes, J.-M.: Confidence Intervals for Testing Disparate Impact in Fair Learning (2018). arXiv:1807.06362
Beutel, A., Chen, J., Doshi, T., Qian, H., Wei, L., Wu, Y., Heldt, L., Zhao, Z., Hong, L., Chi, E.H., et al.: Fairness in recommendation ranking through pairwise comparisons. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2212–2220 (2019)
Beutel, A., Chen, J., Zhao, Z., Chi, E.H.: Data decisions and theoretical implications when adversarially learning fair representations (2017). arXiv:1707.00075
Biddle, D.: Adverse Impact and Test Validation: A Practitioner’s Guide to Valid and Defensible Employment Testing. Routledge (2017)
Bohren, J.A., Haggag, K., Imas, A., Pope, D.G.: Inaccurate statistical discrimination: An identification problem. Technical report, National Bureau of Economic Research (2019)
Borkan, D., Dixon, L., Sorensen, J., Thain, N., Vasserman, L.: Nuanced metrics for measuring unintended bias with real data for text classification. In: Companion Proceedings of the 2019 World Wide Web Conference, pp. 491–500 (2019)
Calmon, F.P., Wei, D., Ramamurthy, K.N., Varshney, K.R.: Optimized data pre-processing for discrimination prevention (2017). arXiv:1704.03354
Caton, S., Haas, C.: Fairness in machine learning: A survey (2020). arXiv:2010.04053
Chakraborty, S., Raghavan, K.R., Johnson, M.P., Srivastava, M.B.: A framework for context-aware privacy of sensor data on mobile systems. In: Proceedings of the 14th Workshop on Mobile Computing Systems and Applications, HotMobile ’13. Association for Computing Machinery (2013)
Charles, K.K., Guryan, J.: Studying discrimination: fundamental challenges and recent progress. Ann. Rev. Econ. 3(1), 479–511 (2011)
Charpentier, A.: Insurance: Biases, Discrimination and Fairness. Institut Louis Bachelier (2022)
Charpentier, A., Flachaire, E., Gallic, E.: Causal inference with optimal transport. In: Thach, N.N., Kreinovich, V., Ha, D.T., Trung, N.D. (eds.) Optimal Transport Statistics for Economics and Related Topics. Springer Verlag (2023)
Charpentier, A., Flachaire, E., Ly, A.: Econometrics and machine learning. Economie et Statistique 505(1), 147–169 (2018)
Chicco, D., Jurman, G.: The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 21(1), 1–13 (2020)
Cho, J., Hwang, G., Suh, C.: A fair classifier using mutual information. In: 2020 IEEE International Symposium on Information Theory (ISIT), pp. 2521–2526. IEEE (2020)
Chouldechova, A.: Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017)
Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision making and the cost of fairness (2017). arXiv:1701.08230
Cornu, G.: Vocabulaire juridique, vol. 6. Presses Universitaires de France (2016)
Cunningham, S.: Causal Inference. Yale University Press (2021)
Dalenius, T.: Towards a methodology for statistical disclosure control. Statistik Tidskrift 15, 429–444 (1977)
Daniels, N.: The Functions of Insurance and the Fairness of Genetic Underwriting, pp. 119–145. Medical Underwriting and Social Policy, Genetics and Life Insurance (2004)
Autor, D.H.: Why are there still so many jobs? The history and future of workplace automation. J. Econ. Perspect. 29(3), 3–30 (2015)
Dawid, A.P.: The well-calibrated Bayesian. J. Am. Stat. Assoc. 77(379), 605–610 (1982)
Denuit, M., Charpentier, A., Trufin, J.: Autocalibration and Tweedie-dominance for insurance pricing with machine learning. Insur. Math. Econ. (2021)
Duivesteijn, W., Feelders, A.: Nearest neighbour classification with monotonicity constraints. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 301–316. Springer (2008)
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214–226 (2012)
Dwork, C., Immorlica, N., Kalai, A.T., Leiserson, M.: Decoupled classifiers for group-fair and efficient machine learning. In: Conference on Fairness, Accountability and Transparency, pp. 119–133. PMLR (2018)
Edgeworth, F.Y.: Equal pay to men and women for equal work. Econ. J. 32(128), 431–457 (1922)
Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., Venkatasubramanian, S.: Certifying and removing disparate impact. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 259–268 (2015)
Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E.P., Roth, D.: A comparative study of fairness-enhancing interventions in machine learning. In: Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 329–338 (2019)
Gajane, P., Pechenizkiy, M.: On formalizing fairness in prediction with machine learning (2017). arXiv:1710.03184
Gambs, S., Killijian, M.-O., Núñez del Prado Cortez, M.: Show me how you move and I will tell you who you are. In: Proceedings of the 3rd ACM International Workshop on Security and Privacy in GIS and LBS (2010)
Gebelein, H.: Das statistische Problem der Korrelation als Variations- und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung. ZAMM: J. Appl. Math. Mech./Zeitschrift für Angewandte Mathematik und Mechanik 21(6), 364–379 (1941)
Goodfellow, I., McDaniel, P., Papernot, N.: Making machine learning robust against adversarial inputs. Commun. ACM 61(7), 56–66 (2018)
Grari, V., Charpentier, A., Lamprier, S., Detyniecki, M.: A fair pricing model via adversarial learning (2022). arXiv:2202.12008
Hale, K.: A.I. bias caused 80% of Black mortgage applicants to be denied. Forbes 09 (2021)
Harcourt, B.E.: Surveiller et punir à l’âge actuariel. Déviance et Société 35, 163 (2011)
Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. Adv. Neural Inform. Process. Syst. 29, 3315–3323 (2016)
Hastie, T., Tibshirani, R., Wainwright, M.: Statistical learning with sparsity. Monogr. Stat. Appl. Probab. 143, 143 (2015)
Hernán, M.A., Robins, J.M.: Causal Inference (2010)
Hirschfeld, H.O.: A connection between correlation and contingency. Math. Proc. Cambridge Philos. Soc. 31(4), 520–524 (1935)
Imai, K.: Quantitative Social Science: An Introduction. Princeton University Press (2018)
Imbens, G.W., Rubin, D.B.: Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press (2015)
Ito, J.: Supposedly ‘fair’ algorithms can perpetuate discrimination. Wired 02(05), 2019 (2021)
Jung, C., Kannan, S., Lee, C., Pai, M. M., Roth, A., Vohra, R.: Fair prediction with endogenous behavior (2020). arXiv:2002.07147
Kamiran, F., Calders, T.: Classifying without discriminating. In: 2009 2nd International Conference on Computer, Control and Communication, pp. 1–6. IEEE (2009)
Kamiran, F., Calders, T.: Data preprocessing techniques for classification without discrimination. Knowl. Inform. Syst. 33(1), 1–33 (2012)
Kamishima, T., Akaho, S., Sakuma, J.: Fairness-aware learning through regularization approach. In: 2011 IEEE 11th International Conference on Data Mining Workshops, pp. 643–650 (2011)
Kearns, M., Roth, A.: The Ethical Algorithm: The Science of Socially Aware Algorithm Design. Oxford University Press (2019)
Kelly, H.: A Priest’s Phone Location Data Outed His Private Life. It Could Happen to Anyone. The Washington Post, 22 July 2021 (2021)
Khaitan, T.: Indirect discrimination. In: Lippert-Rasmussen, K. (ed.) Handbook of the Ethics of Discrimination, pp. 30–41. Routledge (2017)
Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.: Avoiding discrimination through causal reasoning (2017). arXiv:1706.02744
Kim, M.P., Reingold, O., Rothblum, G.N.: Fairness through computationally-bounded awareness (2018). arXiv:1803.03239
Kim, P.T.: Auditing Algorithms for Discrimination. Univ. Pennsylvania Law Rev. 166, 189 (2017)
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., Mullainathan, S.: Human decisions and machine predictions. Q. J. Econ. 133(1), 237–293 (2017)
Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair determination of risk scores (2016). arXiv:1609.05807
Kranzberg, M.: Technology and history: “Kranzberg’s laws”. Technol. Cult. 27(3), 544–560 (1986)
Krüger, F., Ziegel, J.F.: Generic conditions for forecast dominance. J. Bus. Econ. Stat. 39(4), 972–983 (2021)
Kusner, M.J., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 4066–4076. NIPS (2017)
Lippert-Rasmussen, K.: Nothing personal: On statistical discrimination. J. Polit. Philos. 15(4), 385–403 (2007)
Lohia, P.K., Ramamurthy, K.N., Bhide, M., Saha, D., Varshney, K.R., Puri, R.: Bias mitigation post-processing for individual and group fairness. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2847–2851. IEEE (2019)
Luong, B.T., Ruggieri, S., Turini, F.: \(k\)-NN as an implementation of situation testing for discrimination discovery and prevention. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 502–510 (2011)
Matthews, B.W.: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA)-Protein Struct. 405(2), 442–451 (1975)
Mayer, J., Mutchler, P., Mitchell, J.C.: Evaluating the privacy properties of telephone metadata. Proc. Natl. Acad. Sci. 113(20), 5536–5541 (2016)
McKinsey: Technology, Jobs and the Future of Work. McKinsey Global Institute (2017)
Menon, A.K., Williamson, R.C.: The cost of fairness in binary classification. In: Conference on Fairness, Accountability and Transparency, pp. 107–118. PMLR (2018)
Mercat-Bruns, M.: Discrimination at Work. University of California Press (2016)
Miracle, J.M.: De-Anonymization Attack Anatomy and Analysis of Ohio Nursing Workforce Data Anonymization. PhD thesis, Wright State University (2016)
Morgan, S.L., Winship, C.: Counterfactuals and Causal Inference. Cambridge University Press (2015)
Palmer, D.E.: Insurance, risk assessment and fairness: An ethical analysis. In: Insurance Ethics for a More Ethical World. Emerald Group Publishing Limited (2007)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann (1988)
Pearl, J., Mackenzie, D.: The Book of Why: The New Science of Cause and Effect. Basic Books (2018)
Pedreshi, D., Ruggieri, S., Turini, F.: Discrimination-aware data mining. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 560–568 (2008)
Phelps, E.S.: The statistical theory of racism and sexism. Am. Econ. Rev. 62(4), 659–661 (1972)
Rényi, A.: On measures of dependence. Acta Mathematica Hungarica 10(3–4), 441–451 (1959)
Rothschild-Elyassi, G., Koehler, J., Simon, J.: Actuarial Justice, chapter 14, pp. 194–206. John Wiley & Sons, Ltd (2018)
Rubin, D.B.: Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 66(5), 688 (1974)
Scott, J., Marshall, G.: A Dictionary of Sociology. Oxford University Press, USA (2009)
Sokolova, M., Japkowicz, N., Szpakowicz, S.: Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. In: Australasian Joint Conference on Artificial Intelligence, pp. 1015–1021. Springer (2006)
Thomas, R.G.: Some novel perspectives on risk classification. The Geneva Papers on Risk and Insurance-Issues and Practice 32(1), 105–132 (2007)
Thomsen, F.K.: Direct discrimination. In: Lippert-Rasmussen, K. (ed.) Handbook of the Ethics of Discrimination, pp. 19–29. Routledge (2017)
Van Calster, B., McLernon, D.J., Van Smeden, M., Wynants, L., Steyerberg, E.W.: Calibration: the Achilles heel of predictive analytics. BMC Med. 17(1), 1–7 (2019)
Verma, S., Rubin, J.: Fairness definitions explained. In: 2018 IEEE/ACM International Workshop on Software Fairness (Fairware), pp. 1–7. IEEE (2018)
Vogel, R., Bellet, A., Clémençon, S., et al.: Learning fair scoring functions: Bipartite ranking under ROC-based fairness constraints. In: International Conference on Artificial Intelligence and Statistics, pp. 784–792. PMLR (2021)
Žliobaitė, I.: On the relation between accuracy and fairness in binary classification (2015). arXiv:1505.05723
Wirth, L.: Morale and minority groups. Am. J. Sociol. 47(3), 415–433 (1941)
Zafar, M.B., Valera, I., Rodriguez, M.G., Gummadi, K.P.: Fairness constraints: Mechanisms for fair classification (2017). arXiv:1507.05259
Zhang, B.H., Lemoine, B., Mitchell, M.: Mitigating unwanted biases with adversarial learning (2018). arXiv:1801.07593
Žliobaitė, I.: Measuring discrimination in algorithmic decision making. Data Mining Knowl. Discov. 31(4), 1060–1089 (2017)
Acknowledgements
Arthur Charpentier acknowledges the financial support of the AXA Research Fund through the joint research initiative “Use and value of unusual data in actuarial science”, as well as NSERC grant 2019-07077.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Charpentier, A. (2024). Quantifying Fairness and Discrimination in Predictive Models. In: Kreinovich, V., Sriboonchitta, S., Yamaka, W. (eds) Machine Learning for Econometrics and Related Topics. Studies in Systems, Decision and Control, vol 508. Springer, Cham. https://doi.org/10.1007/978-3-031-43601-7_3
DOI: https://doi.org/10.1007/978-3-031-43601-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43600-0
Online ISBN: 978-3-031-43601-7
eBook Packages: Engineering, Engineering (R0)