DOI: 10.5555/3692070.3692107 · Research article

Remembering to be fair: non-Markovian fairness in sequential decision making

Published: 03 January 2025

Abstract

Fair decision making has largely been studied with respect to a single decision. Here we investigate the notion of fairness in the context of sequential decision making, where multiple stakeholders can be affected by the outcomes of decisions. We observe that fairness often depends on the history of the sequential decision-making process, and in this sense is inherently non-Markovian. We further observe that fairness often needs to be assessed at time points within the process, not just at the end of the process. To advance our understanding of this class of fairness problems, we explore the notion of non-Markovian fairness in the context of sequential decision making. We identify properties of non-Markovian fairness, including notions of long-term, anytime, periodic, and bounded fairness. We explore the interplay between non-Markovian fairness and memory, and how memory can support the construction of fair policies. Finally, we introduce the FairQCM algorithm, which can automatically augment its training data to improve sample efficiency in the synthesis of fair policies via reinforcement learning.
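
As a reading aid only (and not the paper's FairQCM algorithm), the Python sketch below illustrates the abstract's central observation: a fairness signal computed from the allocation history is non-Markovian with respect to the instantaneous state, yet a small memory, here cumulative per-stakeholder allocation counts, suffices to evaluate it at any point in the process. The number of stakeholders, the horizon, and the greedy placeholder policy are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative sketch only (not the paper's FairQCM algorithm). It shows the
# abstract's core point: a fairness signal computed from the allocation
# history is non-Markovian with respect to the instantaneous state, but a
# small memory (cumulative per-stakeholder counts) makes it checkable at any
# time step. Stakeholder count, horizon, and the policy are assumptions.

N_STAKEHOLDERS = 3   # hypothetical number of stakeholders
HORIZON = 12         # hypothetical episode length

counts = np.zeros(N_STAKEHOLDERS)  # memory: cumulative allocations so far


def fairness_signal(counts: np.ndarray) -> float:
    """Negative spread of cumulative allocations (0 = perfectly even so far).

    Because it depends on the whole allocation history, it cannot be computed
    from a single memoryless state.
    """
    return float(-(counts.max() - counts.min()))


for t in range(HORIZON):
    # Placeholder policy: allocate to the currently least-served stakeholder.
    # A learned policy would instead condition on the memory-augmented state.
    action = int(np.argmin(counts))
    counts[action] += 1

    # "Anytime" flavour of fairness: the signal is inspected at every step,
    # not only at the end of the episode.
    print(f"t={t:2d}  allocate to stakeholder {action}  "
          f"fairness={fairness_signal(counts):.1f}")
```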

Published In

ICML'24: Proceedings of the 41st International Conference on Machine Learning
July 2024, 63010 pages

Publisher

JMLR.org
