Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback

Letard, Alexandre; Amghar, Tassadit; Camp, Olivier; Gutowski, Nicolas

arXiv:2009.07518 (cs)

[Submitted on 16 Sep 2020]

Abstract: Recent work on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed Bandits (COM-MAB) shows good results on a global accuracy metric. In recommender systems, this can be achieved through personalization. However, with a combinatorial online learning approach, personalization requires a large amount of user feedback, which can be hard to acquire when users must be solicited directly and frequently. For many fields of activity undergoing digitization, online learning is unavoidable, so a number of approaches that retrieve user feedback implicitly have been implemented. Nevertheless, implicit feedback can be misleading or inefficient for the agent's learning. We propose a novel approach that reduces the amount of explicit feedback required by COM-MAB algorithms while providing levels of global accuracy and learning efficiency similar to those of classical competitive methods. We present this approach for handling user feedback and evaluate it using three distinct strategies. Even when users return only a limited share of feedback (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2009.07518 [cs.LG]
	(or arXiv:2009.07518v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2009.07518

Submission history

From: Alexandre Letard
[v1] Wed, 16 Sep 2020 07:32:51 UTC (37 KB)
