DOI: 10.1145/1835804.1835817

Optimizing debt collections using constrained reinforcement learning

Published: 25 July 2010

Abstract

The problem of optimally managing the collections process by taxation authorities is of prime importance, not only for the revenue it brings but also as a means of administering a fair taxing system. The analogous problem of debt collections management in the private sector, for example by banks and credit card companies, is also gaining increasing attention. Given the recent successes of data analytics and optimization in various business areas, the question arises as to what extent such collections processes can be improved by the use of leading-edge data modeling and optimization techniques. In this paper, we propose and develop a novel approach to this problem based on the framework of constrained Markov Decision Processes (MDPs), and report on our experience in an actual deployment of a tax collections optimization system at the New York State Department of Taxation and Finance (NYS DTF).
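
To make the constrained MDP framework concrete: in its standard form (following Altman's classic formulation), the objective is to maximize expected discounted reward subject to an upper bound on expected discounted cost, and a small finite instance can be solved exactly as a linear program over discounted state-action occupancy measures. The sketch below illustrates only that generic textbook technique, not the system described in the paper; every quantity in it (the transition model P, rewards R, action costs C, the budget, and the discount factor) is an invented placeholder.

# A minimal sketch of the constrained-MDP idea, NOT the authors' deployed
# system: solve a tiny CMDP exactly via linear programming over discounted
# state-action occupancy measures. All quantities below are invented.
import numpy as np
from scipy.optimize import linprog

n_s, n_a, gamma = 2, 2, 0.95            # states, actions, discount factor
# P[s, a, s']: transition probabilities; rows over s' sum to one.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.1, 0.9]]])
R = np.array([[1.0, 4.0], [0.5, 2.0]])  # expected collection per step
C = np.array([[0.0, 1.0], [0.0, 1.5]])  # resource cost of each action
mu = np.array([0.5, 0.5])               # initial state distribution
budget = 5.0                            # cap on expected discounted cost

# Variable x[s, a] >= 0 is the discounted occupancy measure, flattened.
# Flow conservation ties the occupancies to the dynamics:
#   sum_a x(s', a) - gamma * sum_{s, a} P(s'|s, a) x(s, a) = mu(s')
A_eq = np.zeros((n_s, n_s * n_a))
for sp in range(n_s):
    for s in range(n_s):
        for a in range(n_a):
            A_eq[sp, s * n_a + a] = float(s == sp) - gamma * P[s, a, sp]

res = linprog(c=-R.flatten(),            # linprog minimizes, so negate reward
              A_ub=C.flatten()[None, :], # expected discounted cost <= budget
              b_ub=[budget],
              A_eq=A_eq, b_eq=mu,
              bounds=[(0, None)] * (n_s * n_a))

x = res.x.reshape(n_s, n_a)
policy = x / x.sum(axis=1, keepdims=True)  # recover pi(a|s) from occupancies
print("occupancy measure:\n", x)
print("optimal (possibly randomized) policy:\n", policy)

A characteristic feature visible even in this toy instance is that the optimal constrained policy may be randomized in states where the cost constraint binds, in contrast to unconstrained MDPs, which always admit a deterministic optimal policy.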

Supplementary Material

JPG File (kdd2010_abe_odcucrl_01.jpg)
MOV File (kdd2010_abe_odcucrl_01.mov)




    Published In

    KDD '10: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    July 2010
    1240 pages
    ISBN: 9781450300551
    DOI: 10.1145/1835804


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. business analytics and optimization
    2. constrained Markov decision process
    3. debt collection optimization
    4. reinforcement learning

    Qualifiers

    • Research-article

    Conference

    KDD '10

    Acceptance Rates

    Overall acceptance rate: 1,133 of 8,635 submissions (13%)


