DOI: 10.5555/3692070.3692107 · Research article

Remembering to be fair: non-Markovian fairness in sequential decision making

Published: 03 January 2025

Abstract

Fair decision making has largely been studied with respect to a single decision. Here we investigate the notion of fairness in the context of sequential decision making, where multiple stakeholders can be affected by the outcomes of decisions. We observe that fairness often depends on the history of the sequential decision-making process, and in this sense is inherently non-Markovian. We further observe that fairness often needs to be assessed at time points within the process, not just at the end of the process. To advance our understanding of this class of fairness problems, we explore the notion of non-Markovian fairness in the context of sequential decision making. We identify properties of non-Markovian fairness, including notions of long-term, anytime, periodic, and bounded fairness. We explore the interplay between non-Markovian fairness and memory, and how memory can support the construction of fair policies. Finally, we introduce the FairQCM algorithm, which can automatically augment its training data to improve sample efficiency in the synthesis of fair policies via reinforcement learning.
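
As a reading aid only (and not the paper's FairQCM algorithm), the Python sketch below illustrates the abstract's central observation: a fairness signal computed from the allocation history is non-Markovian with respect to the instantaneous state, yet a small memory, here cumulative per-stakeholder allocation counts, suffices to evaluate it at any point in the process. The number of stakeholders, the horizon, and the greedy placeholder policy are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative sketch only (not the paper's FairQCM algorithm). It shows the
# abstract's core point: a fairness signal computed from the allocation
# history is non-Markovian with respect to the instantaneous state, but a
# small memory (cumulative per-stakeholder counts) makes it checkable at any
# time step. Stakeholder count, horizon, and the policy are assumptions.

N_STAKEHOLDERS = 3   # hypothetical number of stakeholders
HORIZON = 12         # hypothetical episode length

counts = np.zeros(N_STAKEHOLDERS)  # memory: cumulative allocations so far


def fairness_signal(counts: np.ndarray) -> float:
    """Negative spread of cumulative allocations (0 = perfectly even so far).

    Because it depends on the whole allocation history, it cannot be computed
    from a single memoryless state.
    """
    return float(-(counts.max() - counts.min()))


for t in range(HORIZON):
    # Placeholder policy: allocate to the currently least-served stakeholder.
    # A learned policy would instead condition on the memory-augmented state.
    action = int(np.argmin(counts))
    counts[action] += 1

    # "Anytime" flavour of fairness: the signal is inspected at every step,
    # not only at the end of the episode.
    print(f"t={t:2d}  allocate to stakeholder {action}  "
          f"fairness={fairness_signal(counts):.1f}")
```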

Published In

ICML'24: Proceedings of the 41st International Conference on Machine Learning
July 2024, 63010 pages

Publisher

JMLR.org
