Posterior Sampling via Autoregressive Generation

Zhang, Kelly W; Cai, Tiffany Tianhui; Namkoong, Hongseok; Russo, Daniel

arXiv:2405.19466 (cs)

[Submitted on 29 May 2024 (v1), last revised 8 Oct 2024 (this version, v2)]

Abstract: Real-world decision-making requires grappling with a perpetual lack of data as environments change; intelligent agents must comprehend uncertainty and actively gather information to resolve it. We propose a new framework for learning bandit algorithms from massive historical data, which we demonstrate in a cold-start recommendation problem. First, we use historical data to pretrain an autoregressive model to predict a sequence of repeated feedback/rewards (e.g., responses to news articles shown to different users over time). In learning to make accurate predictions, the model implicitly learns an informed prior based on rich action features (e.g., article headlines) and how to sharpen beliefs as more rewards are gathered (e.g., clicks as each article is recommended). At decision-time, we autoregressively sample (impute) an imagined sequence of rewards for each action, and choose the action with the largest average imputed reward. Far from a heuristic, our approach is an implementation of Thompson sampling (with a learned prior), a prominent active exploration algorithm. We prove our pretraining loss directly controls online decision-making performance, and we demonstrate our framework on a news recommendation task where we integrate end-to-end fine-tuning of a pretrained language model to process news article headline text to improve performance.
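
The decision-time step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in place of the pretrained autoregressive model it assumes a simple Beta-Bernoulli predictive rule (`predict_next`), and the per-action `priors`, the `horizon` length, and all function names are hypothetical choices made for illustration. For each action, the sketch imputes the unobserved rewards one at a time, conditioning on everything observed or imputed so far, then picks the action with the largest average reward over the completed sequence.

```python
import random


def predict_next(prior, rewards):
    """Stand-in (assumption) for a pretrained sequence model.

    Returns P(next reward = 1 | rewards so far) under a Beta(a, b)
    prior on a Bernoulli click rate.
    """
    a, b = prior
    return (a + sum(rewards)) / (a + b + len(rewards))


def impute_mean_reward(prior, observed, horizon, rng):
    """Autoregressively fill in rewards up to `horizon`, then average."""
    seq = list(observed)
    while len(seq) < horizon:
        p = predict_next(prior, seq)  # condition on observed + imputed rewards
        seq.append(1 if rng.random() < p else 0)
    return sum(seq) / len(seq)


def choose_action(priors, histories, horizon, rng):
    """Pick the action whose imputed reward sequence has the largest mean."""
    means = {
        action: impute_mean_reward(priors[action], histories[action], horizon, rng)
        for action in priors
    }
    return max(means, key=means.get)


def run_bandit(true_rates, priors, steps, horizon, seed=0):
    """Simulate the loop: choose an action, observe a reward, record it."""
    rng = random.Random(seed)
    histories = {action: [] for action in true_rates}
    for _ in range(steps):
        action = choose_action(priors, histories, horizon, rng)
        reward = 1 if rng.random() < true_rates[action] else 0
        histories[action].append(reward)
    return histories
```

Under this stand-in model, the average of a long imputed sequence behaves like a draw from the Beta posterior over the click rate, which is why the procedure acts like Thompson sampling. In the paper's setting, the prior would instead be learned from action features such as headlines rather than supplied by hand as it is here.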

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2405.19466 [cs.LG] (or arXiv:2405.19466v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2405.19466

Submission history

From: Kelly Zhang
[v1] Wed, 29 May 2024 19:24:44 UTC (1,714 KB)
[v2] Tue, 8 Oct 2024 15:55:06 UTC (1,714 KB)
