- Research article, March 2024
Double duality: variational primal-dual policy optimization for constrained reinforcement learning
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 385, Pages 18431–18473. We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure, subject to a convex constraint. Designing algorithms for a constrained convex MDP faces several challenges, ...
- Research article, March 2024
Topological hidden Markov models
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 340, Pages 16258–16306. The Hidden Markov Model (HMM) is a classic modelling tool with a wide swath of applications. Its inception considered observations restricted to a finite alphabet, but it was quickly extended to multivariate continuous distributions. In this article, we ...
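As background for the entry above: a classical finite-alphabet HMM evaluates the likelihood of an observation sequence with the forward algorithm. Below is a minimal sketch of the scaled forward recursion for a generic discrete HMM (the paper's topological extension is not reproduced here):

```python
import numpy as np

def hmm_forward_loglik(pi, A, B, obs):
    """Log-likelihood of an observation sequence under a discrete HMM.

    pi:  (S,)   initial state distribution
    A:   (S, S) transition matrix, A[i, j] = P(next=j | cur=i)
    B:   (S, V) emission matrix,   B[i, k] = P(obs=k | state=i)
    obs: sequence of symbol indices in range(V)
    """
    alpha = pi * B[:, obs[0]]          # forward variable at t = 0
    loglik = 0.0
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate one step, then emit
        s = alpha.sum()                # rescale to avoid underflow
        loglik += np.log(s)
        alpha /= s
    return loglik + np.log(alpha.sum())
```

The rescaling makes the recursion numerically stable for long sequences while accumulating the exact log-likelihood.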
- Research article, March 2024
Sparse Markov models for high-dimensional inference
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 279, Pages 13221–13274. Finite-order Markov models are well-studied models for dependent finite alphabet data. Despite their generality, application in empirical work is rare when the order d is large relative to the sample size n (e.g., d = O(n)). Practitioners rarely use ...
- Research article, March 2024
Nearest neighbor Dirichlet mixtures
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 261, Pages 12196–12241. There is a rich literature on Bayesian methods for density estimation, which characterize the unknown density as a mixture of kernels. Such methods have advantages in terms of providing uncertainty quantification in estimation, while being adaptive to a ...
- Research article, March 2024
Improving multiple-try Metropolis with local balancing
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 248, Pages 11771–11829. Multiple-try Metropolis (MTM) is a popular Markov chain Monte Carlo method with the appealing feature of being amenable to parallel computing. At each iteration, it samples several candidates for the next state of the Markov chain and randomly selects ...
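The MTM scheme summarized above can be sketched in its simplest classical form, with a symmetric Gaussian proposal and weight function w(y, x) = π(y); this is the baseline algorithm, not the locally balanced variant the paper develops:

```python
import numpy as np

def mtm_step(x, log_pi, k=5, scale=1.0, rng=None):
    """One multiple-try Metropolis step for a 1-D target (symmetric proposal).

    With a symmetric proposal and weight w(y, x) = pi(y), the acceptance
    ratio reduces to sum_i pi(y_i) / sum_i pi(x*_i).
    """
    rng = np.random.default_rng() if rng is None else rng
    # 1) draw k candidates around the current state
    ys = x + scale * rng.standard_normal(k)
    wy = np.exp(log_pi(ys))
    # 2) select one candidate proportionally to its weight
    j = rng.choice(k, p=wy / wy.sum())
    y = ys[j]
    # 3) draw k-1 reference points around the selected candidate, plus x
    xs = np.append(y + scale * rng.standard_normal(k - 1), x)
    wx = np.exp(log_pi(xs))
    # 4) generalized Metropolis accept/reject
    if rng.random() < wy.sum() / wx.sum():
        return y
    return x
```

With k = 1 this reduces to ordinary random-walk Metropolis; the k candidate evaluations in step 1 are what make MTM amenable to parallel computing.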
- Research article, March 2024
Polynomial-time algorithms for counting and sampling Markov equivalent DAGs with applications
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 213, Pages 10140–10184. Counting and sampling directed acyclic graphs from a Markov equivalence class are fundamental tasks in graphical causal analysis. In this paper we show that these tasks can be performed in polynomial time, solving a long-standing open problem in this ...
- Research article, March 2024
GFlowNet foundations
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 210, Pages 10006–10060. Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context, with a training objective that makes them approximately sample in proportion to a given reward function. In this ...
- Research article, March 2024
Q-learning for MDPs with general spaces: convergence and near optimality via quantization under weak continuity
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 199, Pages 9531–9564. Reinforcement learning algorithms often require finiteness of state and action spaces in Markov decision processes (MDPs) (also called controlled Markov chains) and various efforts have been made in the literature towards the applicability of such ...
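For context, the finite-space setting that quantization reduces a general MDP to is standard tabular Q-learning. A minimal sketch on a toy deterministic chain MDP (a hypothetical example, not from the paper):

```python
import numpy as np

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     eps=0.1, seed=0):
    """Tabular Q-learning on a deterministic chain MDP.

    States 0..n_states-1; action 0 steps left (clipped at 0), action 1 steps
    right; entering the last state gives reward 1 and ends the episode.
    Q is initialized optimistically at 1 to encourage early exploration.
    """
    rng = np.random.default_rng(seed)
    Q = np.ones((n_states, 2))
    Q[n_states - 1] = 0.0            # terminal state has zero value
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard Q-learning update toward the bootstrapped target
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

On this chain the learned values converge to gamma raised to the distance from the goal, and the greedy policy moves right everywhere.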
- Research article, March 2024
Learning good state and action representations for Markov decision process via tensor decomposition
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 115, Pages 5157–5209. The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representations ...
- Research article, March 2024
Provably sample-efficient model-free algorithm for MDPs with peak constraints
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 60, Pages 2579–2603. In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent ...
- Research article, March 2024
Reinforcement learning for joint optimization of multiple rewards
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 49, Pages 2039–2079. Finding optimal policies which maximize long-term rewards of Markov Decision Processes requires the use of dynamic programming and backward induction to solve the Bellman optimality equation. However, many real-world problems require optimization of an ...
- Research article, March 2024
Can reinforcement learning find Stackelberg-Nash equilibria in general-sum Markov games with myopically rational followers?
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 35, Pages 1350–1401. We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers. In particular, we focus on the class of games where the followers are myopically rational; i.e., they aim to ...
- Research article, January 2022
Fully general online imitation learning
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 334, Pages 15066–15095. In imitation learning, imitators and demonstrators are policies for picking actions given past interactions with the environment. If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had been ...
- Research article, January 2022
On the convergence rates of policy gradient methods
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 282, Pages 12887–12922. We consider infinite-horizon discounted Markov decision problems with finite state and action spaces and study the convergence rates of the projected policy gradient method and a general class of policy mirror descent methods, all with direct ...
- Research article, January 2022
Double spike Dirichlet priors for structured weighting
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 248, Pages 11293–11320. Assigning weights to a large pool of objects is a fundamental task in a wide variety of applications. In this article, we introduce the concept of structured high-dimensional probability simplexes, in which most components are zero or near zero and the ...
- Research article, January 2022
Mappings for marginal probabilities with applications to models in statistical physics
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 245, Pages 11168–11203. We present local mappings that relate the marginal probabilities of a global probability mass function represented by its primal normal factor graph to the corresponding marginal probabilities in its dual normal factor graph. The mapping is based on the ...
- Research article, January 2022
Online nonnegative CP-dictionary learning for Markovian data
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 148, Pages 6630–6679. Online Tensor Factorization (OTF) is a fundamental tool in learning low-dimensional interpretable features from streaming multi-modal data. While various algorithmic and theoretical aspects of OTF have been investigated recently, a general convergence ...
- Research article, January 2022
Multiple testing in nonparametric hidden Markov models: an empirical Bayes approach
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 94, Pages 4061–4117. Given a nonparametric Hidden Markov Model (HMM) with two states, the question of constructing efficient multiple testing procedures is considered, treating the states as unknown null and alternative hypotheses. A procedure is introduced, based on ...
- Research article, January 2022
Stacking for non-mixing Bayesian computations: the curse and blessing of multimodal posteriors
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 79, Pages 3426–3471. When working with multimodal Bayesian posterior distributions, Markov chain Monte Carlo (MCMC) algorithms have difficulty moving between modes, and default variational or mode-based approximate inferences will understate posterior uncertainty. And, even ...
- Research article, January 2022
Optimal transport for stationary Markov chains via policy iteration
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 45, Pages 2175–2226. We study the optimal transport problem for pairs of stationary finite-state Markov chains, with an emphasis on the computation of optimal transition couplings. Transition couplings are a constrained family of transport plans that capture the dynamics of ...
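For background on the last entry: the unconstrained analogue of the transition-coupling problem is the classical discrete optimal transport linear program between two marginal distributions, sketched below with SciPy's `linprog` (a generic example; the paper's policy-iteration method and the transition-coupling constraint are not shown):

```python
import numpy as np
from scipy.optimize import linprog

def discrete_ot(p, q, C):
    """Classical discrete optimal transport between finite distributions
    p (length m) and q (length n) with cost matrix C, solved as an LP:
    minimize <C, X> subject to X >= 0, row sums = p, column sums = q."""
    m, n = C.shape
    # equality constraints: m row-marginal rows, then n column-marginal rows
    A = np.zeros((m + n, m * n))
    for i in range(m):
        A[i, i * n:(i + 1) * n] = 1.0   # row i of the plan sums to p[i]
    for j in range(n):
        A[m + j, j::n] = 1.0            # column j of the plan sums to q[j]
    res = linprog(C.ravel(), A_eq=A, b_eq=np.concatenate([p, q]),
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(m, n)
```

The transition-coupling formulation studied in the paper further restricts the feasible plans so that they respect the Markov dynamics of both chains, which is what motivates its policy-iteration approach.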