
Fair Feature Selection: A Causal Perspective

Published: 19 June 2024

Abstract

Fair feature selection for classification decision tasks has recently attracted significant attention from researchers. However, existing fair feature selection algorithms do not fully explain the causal relationships between features and sensitive attributes, which can reduce the accuracy with which fair features are identified. To address this issue, we propose a fair causal feature selection algorithm called FairCFS. Specifically, FairCFS constructs a localized causal graph that identifies the Markov blankets of the class and sensitive variables, and uses it to block the transmission of sensitive information when selecting fair causal features. Extensive experiments on seven public real-world datasets show that FairCFS achieves accuracy comparable to that of eight state-of-the-art feature selection algorithms while delivering superior fairness.
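To make the notion of a Markov blanket concrete, the following minimal sketch computes the blankets of a class variable and a sensitive variable in a small, hand-written causal graph. It is not the FairCFS algorithm: the graph, the variable names, and the helper `markov_blanket` are illustrative assumptions, and the graph is taken as known rather than learned from data. The sketch relies only on the standard definition that a node's Markov blanket in a directed acyclic graph consists of its parents, its children, and the other parents of its children (spouses).

```python
from typing import Dict, Set

# Hypothetical toy DAG, written as child -> set of parents.
# "income" plays the role of the class variable and "sex" the sensitive variable.
PARENTS: Dict[str, Set[str]] = {
    "sex": set(),
    "education": set(),
    "occupation": {"sex", "education"},
    "hours": {"occupation"},
    "income": {"education", "hours"},
}


def markov_blanket(node: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return parents, children, and spouses of `node` in a known DAG."""
    # Children: every node that lists `node` among its parents.
    children = {c for c, ps in parents.items() if node in ps}
    # Spouses: other parents of those children.
    spouses = {p for c in children for p in parents[c]} - {node}
    return set(parents[node]) | children | spouses


mb_class = markov_blanket("income", PARENTS)
mb_sensitive = markov_blanket("sex", PARENTS)

# Features that sit in both blankets are flagged for inspection only;
# this is an illustration, not a fairness-preserving selection rule.
overlap = mb_class & mb_sensitive

print(sorted(mb_class))      # blanket of the class variable
print(sorted(mb_sensitive))  # blanket of the sensitive variable
print(sorted(overlap))       # features appearing in both blankets
```

In this toy graph, "education" appears in both blankets because it is a parent of the class variable and a spouse of the sensitive variable. How FairCFS uses the two blankets to block sensitive information is not described in the abstract, so the sketch stops at identifying them.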

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 7
August 2024
505 pages
EISSN: 1556-472X
DOI: 10.1145/3613689
Editor: Jian Pei

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 June 2024
Online AM: 03 February 2024
Accepted: 30 January 2024
Revised: 09 January 2024
Received: 15 September 2023
Published in TKDD Volume 18, Issue 7

Author Tags

  1. Causal fairness
  2. Fair feature selection
  3. Markov blanket

Qualifiers

  • Research-article

Funding Sources

  • National Key Research and Development Program of China
  • National Natural Science Foundation of China
  • Natural Science Project of Anhui Provincial Education Department
