DOI: 10.1145/3152494.3167992

Unsupervised cost sensitive predictions with side information

Published: 11 January 2018

Abstract

In many security and healthcare systems, a sequence of sensors/tests is used for detection and diagnosis. Each test outputs a prediction of the latent state and carries with it inherent costs. Our objective is to learn strategies for selecting a test that gives the best trade-off between accuracy and cost. Unfortunately, it is often impossible to acquire ground-truth annotations, and we are left with the problem of unsupervised sensor selection (USS). Hanawal et al. [9] reduce USS to a special case of the multi-armed bandit problem with side information and develop polynomial-time algorithms that achieve sub-linear regret. In this paper, we extend the earlier analysis with contextual information, propose an algorithm with sub-linear regret, and verify our results on synthetic and real datasets.
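To make the setting concrete, below is a minimal, illustrative Python sketch of the kind of contextual-bandit selection rule this line of work builds on, in the spirit of LinUCB [4, 11]. It is not the algorithm proposed in the paper: the per-sensor costs, the binary reward signal, and all identifiers are hypothetical placeholders, used only to show how a context-dependent, cost-aware choice among sensors can be made.

import numpy as np

# Illustrative LinUCB-style arm selection with a cost penalty (sketch only,
# not the paper's USS algorithm). Each arm corresponds to one sensor/test.

class LinUCBArm:
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)      # regularized design matrix
        self.b = np.zeros(dim)    # accumulated reward-weighted contexts
        self.alpha = alpha        # width of the confidence bonus

    def ucb(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b    # ridge estimate of the arm's parameter
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

def select_sensor(arms, context, costs):
    # optimistic reward estimate minus the (known) cost of using the sensor
    scores = [arm.ucb(context) - c for arm, c in zip(arms, costs)]
    return int(np.argmax(scores))

# Toy run: 3 sensors, 5-dimensional contexts, hypothetical costs and feedback.
rng = np.random.default_rng(0)
arms = [LinUCBArm(dim=5) for _ in range(3)]
costs = [0.0, 0.1, 0.3]
for t in range(100):
    x = rng.normal(size=5)
    k = select_sensor(arms, x, costs)
    reward = rng.binomial(1, 0.5)   # placeholder; in USS no ground-truth labels are observed
    arms[k].update(x, reward)

Note that in the unsupervised setting studied in the paper, rewards cannot be computed from ground-truth labels; the sketch only illustrates the trade-off between contextual side information and sensor cost, not how feedback is obtained.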

References

[1]
Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári. 2011. Improved algorithms for linear stochastic bandits. In Advances in Neural Information Processing Systems. 2312--2320.
[2]
Deepak Agarwal, Bee-Chung Chen, Pradheep Elango, Nitin Motgi, Seung-Taek Park, Raghu Ramakrishnan, Scott Roy, and Joe Zachariah. 2009. Online models for content optimization. In Advances in Neural Information Processing Systems. 17--24.
[3]
Peter Auer. 2002. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research 3, Nov (2002), 397--422.
[4]
Wei Chu, Lihong Li, Lev Reyzin, and Robert E Schapire. 2011. Contextual bandits with linear payoff functions. In International Conference on Artificial Intelligence and Statistics. 208--214.
[5]
Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, and José Reis. 2009. Modeling wine preferences by data mining from physicochemical properties. Decision Support Systems 47, 4 (2009), 547--553.
[6]
Varsha Dani, Thomas P Hayes, and Sham M Kakade. 2008. Stochastic Linear Optimization under Bandit Feedback. In COLT. 355--366.
[7]
Kelwin Fernandes, Pedro Vinagre, and Paulo Cortez. 2015. A proactive intelligent decision support system for predicting the popularity of online news. In Portuguese Conference on Artificial Intelligence. Springer, 535--546.
[8]
Sarah Filippi, Olivier Cappe, Aurélien Garivier, and Csaba Szepesvári. 2010. Parametric bandits: The generalized linear case. In Advances in Neural Information Processing Systems. 586--594.
[9]
Manjesh Hanawal, Csaba Szepesvári, and Venkatesh Saligrama. 2017. Unsupervised Sequential Sensor Acquisition. In Artificial Intelligence and Statistics. 803--811.
[10]
Kwang-Sung Jun, Aniruddha Bhargava, Robert Nowak, and Rebecca Willett. 2017. Scalable Generalized Linear Bandits: Online Computation and Hashing. arXiv preprint arXiv:1706.00136 (2017).
[11]
Lihong Li, Wei Chu, John Langford, and Robert E Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web. ACM, 661--670.
[12]
Lihong Li, Yu Lu, and Dengyong Zhou. 2017. Provable Optimal Algorithms for Generalized Linear Contextual Bandits. arXiv preprint arXiv:1703.00048 (2017).
[13]
M. Lichman. 2013. UCI Machine Learning Repository. (2013). http://archive.ics.uci.edu/ml
[14]
Paat Rusmevichientong and John N Tsitsiklis. 2010. Linearly parameterized bandits. Mathematics of Operations Research 35, 2 (2010), 395--411.

Published In

CODS-COMAD '18: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data
January 2018
379 pages
ISBN: 9781450363419
DOI: 10.1145/3152494
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. contextual bandits
  2. cost-sensitive learning
  3. multi-armed bandits
  4. sequential decision making
  5. unsupervised learning

Qualifiers

  • Short-paper

Conference

CoDS-COMAD '18

Acceptance Rates

CoDS-COMAD '18 paper acceptance rate: 50 of 150 submissions, 33%
Overall acceptance rate: 197 of 680 submissions, 29%
