- Research article, March 2024
Double duality: variational primal-dual policy optimization for constrained reinforcement learning
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 385, Pages 18431–18473. We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure, subject to a convex constraint. Designing algorithms for a constrained convex MDP faces several challenges, ...
- Research article, March 2024
Topological hidden Markov models
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 340, Pages 16258–16306. The Hidden Markov Model (HMM) is a classic modelling tool with a wide swath of applications. Its inception considered observations restricted to a finite alphabet, but it was quickly extended to multivariate continuous distributions. In this article, we ...
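As background for the entry above: a classical finite-alphabet HMM evaluates the likelihood of an observation sequence with the forward algorithm. Below is a minimal sketch of the scaled forward recursion for a generic discrete HMM (the paper's topological extension is not reproduced here):

```python
import numpy as np

def hmm_forward_loglik(pi, A, B, obs):
    """Log-likelihood of an observation sequence under a discrete HMM.

    pi:  (S,)   initial state distribution
    A:   (S, S) transition matrix, A[i, j] = P(next=j | cur=i)
    B:   (S, V) emission matrix,   B[i, k] = P(obs=k | state=i)
    obs: sequence of symbol indices in range(V)
    """
    alpha = pi * B[:, obs[0]]          # forward variable at t = 0
    loglik = 0.0
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate one step, then emit
        s = alpha.sum()                # rescale to avoid underflow
        loglik += np.log(s)
        alpha /= s
    return loglik + np.log(alpha.sum())
```

The rescaling makes the recursion numerically stable for long sequences while accumulating the exact log-likelihood.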
- Research article, March 2024
Sparse Markov models for high-dimensional inference
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 279, Pages 13221–13274. Finite-order Markov models are well-studied models for dependent finite alphabet data. Despite their generality, application in empirical work is rare when the order d is large relative to the sample size n (e.g., d = O(n)). Practitioners rarely use ...
- Research article, March 2024
Nearest neighbor Dirichlet mixtures
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 261, Pages 12196–12241. There is a rich literature on Bayesian methods for density estimation, which characterize the unknown density as a mixture of kernels. Such methods have advantages in terms of providing uncertainty quantification in estimation, while being adaptive to a ...
- Research article, March 2024
Improving multiple-try Metropolis with local balancing
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 248, Pages 11771–11829. Multiple-try Metropolis (MTM) is a popular Markov chain Monte Carlo method with the appealing feature of being amenable to parallel computing. At each iteration, it samples several candidates for the next state of the Markov chain and randomly selects ...
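The MTM scheme summarized above can be sketched in its simplest classical form, with a symmetric Gaussian proposal and weight function w(y, x) = π(y); this is the baseline algorithm, not the locally balanced variant the paper develops:

```python
import numpy as np

def mtm_step(x, log_pi, k=5, scale=1.0, rng=None):
    """One multiple-try Metropolis step for a 1-D target (symmetric proposal).

    With a symmetric proposal and weight w(y, x) = pi(y), the acceptance
    ratio reduces to sum_i pi(y_i) / sum_i pi(x*_i).
    """
    rng = np.random.default_rng() if rng is None else rng
    # 1) draw k candidates around the current state
    ys = x + scale * rng.standard_normal(k)
    wy = np.exp(log_pi(ys))
    # 2) select one candidate proportionally to its weight
    j = rng.choice(k, p=wy / wy.sum())
    y = ys[j]
    # 3) draw k-1 reference points around the selected candidate, plus x
    xs = np.append(y + scale * rng.standard_normal(k - 1), x)
    wx = np.exp(log_pi(xs))
    # 4) generalized Metropolis accept/reject
    if rng.random() < wy.sum() / wx.sum():
        return y
    return x
```

With k = 1 this reduces to ordinary random-walk Metropolis; the k candidate evaluations in step 1 are what make MTM amenable to parallel computing.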
- Research article, March 2024
Polynomial-time algorithms for counting and sampling Markov equivalent DAGs with applications
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 213, Pages 10140–10184. Counting and sampling directed acyclic graphs from a Markov equivalence class are fundamental tasks in graphical causal analysis. In this paper we show that these tasks can be performed in polynomial time, solving a long-standing open problem in this ...
- Research article, March 2024
GFlowNet foundations
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 210, Pages 10006–10060. Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context, with a training objective that makes them approximately sample in proportion to a given reward function. In this ...
- Research article, March 2024
Q-learning for MDPs with general spaces: convergence and near optimality via quantization under weak continuity
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 199, Pages 9531–9564. Reinforcement learning algorithms often require finiteness of state and action spaces in Markov decision processes (MDPs) (also called controlled Markov chains) and various efforts have been made in the literature towards the applicability of such ...
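For context, the finite-space setting that quantization reduces a general MDP to is standard tabular Q-learning. A minimal sketch on a toy deterministic chain MDP (a hypothetical example, not from the paper):

```python
import numpy as np

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     eps=0.1, seed=0):
    """Tabular Q-learning on a deterministic chain MDP.

    States 0..n_states-1; action 0 steps left (clipped at 0), action 1 steps
    right; entering the last state gives reward 1 and ends the episode.
    Q is initialized optimistically at 1 to encourage early exploration.
    """
    rng = np.random.default_rng(seed)
    Q = np.ones((n_states, 2))
    Q[n_states - 1] = 0.0            # terminal state has zero value
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard Q-learning update toward the bootstrapped target
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

On this chain the learned values converge to gamma raised to the distance from the goal, and the greedy policy moves right everywhere.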
- Research article, March 2024
Learning good state and action representations for Markov decision process via tensor decomposition
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 115, Pages 5157–5209. The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representations ...
- Research article, March 2024
Provably sample-efficient model-free algorithm for MDPs with peak constraints
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 60, Pages 2579–2603. In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent ...
- Research article, March 2024
Reinforcement learning for joint optimization of multiple rewards
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 49, Pages 2039–2079. Finding optimal policies which maximize long-term rewards of Markov Decision Processes requires the use of dynamic programming and backward induction to solve the Bellman optimality equation. However, many real-world problems require optimization of an ...
- Research article, March 2024
Can reinforcement learning find Stackelberg-Nash equilibria in general-sum Markov games with myopically rational followers?
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 35, Pages 1350–1401. We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers. In particular, we focus on the class of games where the followers are myopically rational; i.e., they aim to ...
- Research article, January 2022
Fully general online imitation learning
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 334, Pages 15066–15095. In imitation learning, imitators and demonstrators are policies for picking actions given past interactions with the environment. If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had been ...
- Research article, January 2022
On the convergence rates of policy gradient methods
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 282, Pages 12887–12922. We consider infinite-horizon discounted Markov decision problems with finite state and action spaces and study the convergence rates of the projected policy gradient method and a general class of policy mirror descent methods, all with direct ...
- Research article, January 2022
Double spike Dirichlet priors for structured weighting
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 248, Pages 11293–11320. Assigning weights to a large pool of objects is a fundamental task in a wide variety of applications. In this article, we introduce the concept of structured high-dimensional probability simplexes, in which most components are zero or near zero and the ...
- Research article, January 2022
Mappings for marginal probabilities with applications to models in statistical physics
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 245, Pages 11168–11203. We present local mappings that relate the marginal probabilities of a global probability mass function represented by its primal normal factor graph to the corresponding marginal probabilities in its dual normal factor graph. The mapping is based on the ...
- Research article, January 2022
Online nonnegative CP-dictionary learning for Markovian data
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 148, Pages 6630–6679. Online Tensor Factorization (OTF) is a fundamental tool in learning low-dimensional interpretable features from streaming multi-modal data. While various algorithmic and theoretical aspects of OTF have been investigated recently, a general convergence ...
- Research article, January 2022
Multiple testing in nonparametric hidden Markov models: an empirical Bayes approach
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 94, Pages 4061–4117. Given a nonparametric Hidden Markov Model (HMM) with two states, the question of constructing efficient multiple testing procedures is considered, treating the states as unknown null and alternative hypotheses. A procedure is introduced, based on ...
- Research article, January 2022
Stacking for non-mixing Bayesian computations: the curse and blessing of multimodal posteriors
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 79, Pages 3426–3471. When working with multimodal Bayesian posterior distributions, Markov chain Monte Carlo (MCMC) algorithms have difficulty moving between modes, and default variational or mode-based approximate inferences will understate posterior uncertainty. And, even ...
- Research article, January 2022
Optimal transport for stationary Markov chains via policy iteration
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 45, Pages 2175–2226. We study the optimal transport problem for pairs of stationary finite-state Markov chains, with an emphasis on the computation of optimal transition couplings. Transition couplings are a constrained family of transport plans that capture the dynamics of ...
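For background on the last entry: the unconstrained analogue of the transition-coupling problem is the classical discrete optimal transport linear program between two marginal distributions, sketched below with SciPy's `linprog` (a generic example; the paper's policy-iteration method and the transition-coupling constraint are not shown):

```python
import numpy as np
from scipy.optimize import linprog

def discrete_ot(p, q, C):
    """Classical discrete optimal transport between finite distributions
    p (length m) and q (length n) with cost matrix C, solved as an LP:
    minimize <C, X> subject to X >= 0, row sums = p, column sums = q."""
    m, n = C.shape
    # equality constraints: m row-marginal rows, then n column-marginal rows
    A = np.zeros((m + n, m * n))
    for i in range(m):
        A[i, i * n:(i + 1) * n] = 1.0   # row i of the plan sums to p[i]
    for j in range(n):
        A[m + j, j::n] = 1.0            # column j of the plan sums to q[j]
    res = linprog(C.ravel(), A_eq=A, b_eq=np.concatenate([p, q]),
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(m, n)
```

The transition-coupling formulation studied in the paper further restricts the feasible plans so that they respect the Markov dynamics of both chains, which is what motivates its policy-iteration approach.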