DOI: 10.1145/3531146.3533197

Towards Fair Unsupervised Learning

Published: 20 June 2022

Abstract

Bias-mitigating techniques are now well established in the supervised learning literature and have shown their ability to tackle fairness-accuracy as well as fairness-fairness trade-offs. These are usually predicated on different conceptions of fairness, such as demographic parity or equalised odds, which depend on the labels available in the dataset. In practice, however, unsupervised learning is often used as part of a machine learning pipeline (for instance, to perform dimensionality reduction or representation learning via SVD) or as a standalone model (for example, to derive a customer segmentation via k-means). It is thus crucial to develop approaches towards fair unsupervised learning. This work investigates fair unsupervised learning within the broad framework of generalised low-rank models (GLRM). Importantly, we introduce the concept of a fairness functional, which encompasses both traditional unsupervised learning techniques and min-max algorithms (whereby one minimises the maximum group loss). To do so, we design straightforward alternate convex search or biconvex gradient descent algorithms that also provide partial debiasing techniques. Finally, we show on benchmark datasets that our fair generalised low-rank models (“fGLRM”) perform well and help reduce disparity amongst groups while incurring only small runtime overheads.
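The abstract's algorithmic recipe, alternating (biconvex) updates of the two low-rank factors combined with a min-max treatment of the per-group losses, can be illustrated with a small sketch. The Python code below is a hypothetical illustration under quadratic loss, not the authors' fGLRM implementation; the function name fair_low_rank, the exponentiated-gradient weight update, and all hyper-parameters are assumptions made for this example.

# Hypothetical sketch (not the paper's code): low-rank factorisation A ~ X @ Y
# where group weights are pushed, via an exponentiated-gradient step, towards
# the group with the worst reconstruction error, approximating a min-max
# ("minimise the maximum group loss") objective with alternating least squares.
import numpy as np

def fair_low_rank(A, groups, rank=2, n_outer=50, n_inner=5, eta=1.0, seed=0):
    A = np.asarray(A, dtype=float)
    groups = np.asarray(groups)
    rng = np.random.default_rng(seed)
    labels = np.unique(groups)
    X = rng.normal(size=(A.shape[0], rank))
    Y = rng.normal(size=(rank, A.shape[1]))
    w = np.ones(len(labels)) / len(labels)          # group weights on the simplex

    for _ in range(n_outer):
        # Weight each row by its group's current weight.
        row_w = w[np.searchsorted(labels, groups)]
        sw = np.sqrt(row_w)[:, None]
        for _ in range(n_inner):                    # alternating (biconvex) updates
            # Each row of X solves an independent least-squares problem,
            # so the group weights do not change this update.
            X = np.linalg.lstsq(Y.T, A.T, rcond=None)[0].T
            # Y solves a group-weighted least-squares problem.
            Y = np.linalg.lstsq(sw * X, sw * A, rcond=None)[0]
        # Per-group mean squared reconstruction errors.
        losses = np.array([np.mean((A[groups == g] - X[groups == g] @ Y) ** 2)
                           for g in labels])
        # Exponentiated-gradient step: upweight the worst-off group.
        w = w * np.exp(eta * losses)
        w = w / w.sum()
    return X, Y, dict(zip(labels, losses))

On a dataset A with a protected attribute s, calling fair_low_rank(A, s, rank=3) and comparing the returned per-group errors against those of a plain rank-3 SVD gives a quick check of whether the reweighting narrows the gap between groups; the exact trade-off between average and worst-group error depends on the step size eta.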





          Published In

          FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
          June 2022
          2351 pages
          ISBN: 9781450393522
          DOI: 10.1145/3531146


          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Author Tags

          1. Clustering
          2. Fairness
          3. PCA
          4. Unsupervised Learning

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Conference

          FAccT '22


