DOI: 10.1145/3531146.3533149
Research article · Open access

Model Multiplicity: Opportunities, Concerns, and Solutions

Published: 20 June 2022

Abstract

Recent scholarship has brought attention to the fact that there often exist multiple models for a given prediction task with equal accuracy that differ in their individual-level predictions or aggregate properties. This phenomenon, which we call model multiplicity, can introduce a good deal of flexibility into the model selection process, creating a range of exciting opportunities. By demonstrating that there are many different ways of making equally accurate predictions, multiplicity gives model developers the freedom to prioritize other values in their model selection process without having to abandon their commitment to maximizing accuracy. However, multiplicity also brings to light a concerning truth: model selection on the basis of accuracy alone, the default procedure in many deployment scenarios, fails to consider what might be meaningful differences between equally accurate models with respect to other criteria such as fairness, robustness, and interpretability. Unless these criteria are taken into account explicitly, developers might end up making unnecessary trade-offs or could even mask intentional discrimination. Furthermore, the prospect that there might exist another model of equal accuracy that flips a prediction for a particular individual may lead to a crisis in justifiability: why should an individual be subject to an adverse model outcome if there exists an equally accurate model that treats them more favorably? In this work, we investigate how to take advantage of the flexibility afforded by model multiplicity while addressing the concerns about justifiability that it might raise.
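The core phenomenon is straightforward to reproduce. Below is a minimal sketch, not taken from the paper, that assumes scikit-learn and a synthetic dataset from make_classification: several models of the same class, trained under different random seeds, reach nearly identical test accuracy yet disagree on individual predictions.

```python
# Minimal illustration of model multiplicity (a sketch, not the authors' code):
# equally accurate models can still flip predictions for individual points.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification task.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Train the same model class under different random seeds.
models = [
    RandomForestClassifier(n_estimators=100, random_state=seed).fit(X_train, y_train)
    for seed in range(5)
]

# Accuracies are typically within a fraction of a percent of one another.
accuracies = [m.score(X_test, y_test) for m in models]
print("test accuracies:", np.round(accuracies, 3))

# Yet a nontrivial share of test points receive conflicting predictions
# from these "equally good" models (in binary classification, any pairwise
# disagreement implies disagreement with the first model).
preds = np.stack([m.predict(X_test) for m in models])
disagree = (preds != preds[0]).any(axis=0)
print(f"points with conflicting predictions: {disagree.mean():.1%}")
```

Under an accuracy-only selection rule, any of these models could be the one deployed, which is exactly the flexibility, and the arbitrariness, that the paper examines.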

Published In

FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022, 2351 pages
ISBN: 9781450393522
DOI: 10.1145/3531146
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

      Author Tags

      1. model multiplicity
      2. arbitrariness
      3. discrimination
      4. fairness
      5. predictive multiplicity
      6. procedural multiplicity
      7. recourse


