Markov Decision Process
Recent papers in Markov Decision Process
We study a decision problem faced by an energy limited wireless device that operates in discrete time. There is some external arrival to the device's transmit buffer. The possible decisions are: a) to serve some of the buffer content; b)... more
This paper presents a method for short-term electricity price forecasting based on a combination of Monte Carlo simulation and Markov chains. The method provides an estimation of the probabilities of various electricity price ranges,... more
A major issue in supply chain inventory management is the coordination of inventory policies adopted by different supply chain actors, such as suppliers, manufacturers, distributors, so as to smooth material flow and minimize costs while... more
the physical phenomena. By this method, the students enhance their problem-solving abilities with minimal programming skills. By using examples, the paper presents an approach to computer-aided problem-solving methods for junior-level... more
Limited battery power at wireless video sensor nodes, along with the transmission quality requirements for video data, makes quality-of-service (QoS) provisioning in a wireless video sensor network a very challenging task. In this paper,... more
This paper gives a brief overview of version 2.0 of PRISM, a tool for the automatic formal verification of probabilistic systems, and some of the case studies to which it has already been applied.
Sepsis, the tenth-leading cause of death in the United States, accounts for more than $16.7 billion in annual health care costs. A significant factor in these costs is hospital length of stay. The lack of standardized hospital discharge... more
This book, written in Portuguese, deals with Operational Research topics, such as linear programming models and direct techniques for solving Ax = b systems, revised simplex method, Markov processes and queuing systems, in addition to... more
Example 1. Consider the two-state, continuous-time Markov process with transition rate diagram for some positive constants A and B. The generator matrix is given by Q = [[−A, A], [B, −B]]. Solve the forward Kolmogorov equation for a given initial... more
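As a sketch of the computation this example asks for (with made-up rates A = 2, B = 3 and a state-0-first labelling assumed here), the forward equation dp/dt = pQ for this two-state chain has the closed form p0(t) = B/(A+B) + (p0(0) − B/(A+B))·e^(−(A+B)t):

```python
import numpy as np

# Hypothetical rates for illustration; any positive A, B work.
A, B = 2.0, 3.0

# Generator matrix: state 0 -> 1 at rate A, state 1 -> 0 at rate B.
Q = np.array([[-A,  A],
              [ B, -B]])

def p(t, p0=np.array([1.0, 0.0])):
    """Closed-form solution of dp/dt = p Q for the two-state chain:
    p0(t) relaxes exponentially to B/(A+B) at rate A+B."""
    pi0 = B / (A + B)
    p0_t = pi0 + (p0[0] - pi0) * np.exp(-(A + B) * t)
    return np.array([p0_t, 1.0 - p0_t])

# Cross-check against p(t) = p(0) expm(Q t), with expm approximated
# by a truncated Taylor series.
def expm(M, terms=30):
    out, term = np.eye(2), np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

t = 0.7
assert np.allclose(p(t), np.array([1.0, 0.0]) @ expm(Q * t))
```

For large t the solution approaches the stationary distribution (B/(A+B), A/(A+B)), as expected from πQ = 0.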
The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950's. During the decades of the last century this theory has grown dramatically. It... more
The semiconductor industry is very capital intensive, and capacity utilization significantly affects the capital effectiveness and profitability of semiconductor manufacturing companies. Due to the constant technology advance driven by Moore's... more
The majority of learning algorithms available today focus on approximating the state (V) or state-action (Q) value function, and efficient action selection comes as an afterthought. On the other hand, real-world problems tend to have... more
Recently, it has been recognized that revenue management of cruise ships is different from that of airlines or hotels. Among the main differences is the presence of multiple capacity constraints in cruise ships, i.e., the number of cabins... more
We formulate and analyze a Markov decision process (dynamic programming) model for airline seat allocation (yield management) on a single-leg flight with multiple fare classes. Unlike previous models, we allow cancellation, no-shows, and... more
The focus of this work is the computation of efficient strategies for commodity trading in a multi-market environment. In today's "global economy" commodities are often bought in one location and then sold (right away, or after some... more
We introduce the concept of a Markov risk measure and we use it to formulate risk-averse control problems for two Markov decision models: a finite horizon model and a discounted infinite horizon model. For both models we derive... more
Material transportation is one of the most important aspects of open-pit mine operations. The problem usually involves a truck dispatching system in which decisions on truck assignments and destinations are taken in real-time. Due to its... more
We study the 'Creation of Pooling in Inventory and Queueing Models'. This research consists of the study of sharing a scarce resource (such as inventory, server capacity, or production capacity) between multiple customer classes. This is... more
We provide a tutorial on the construction and evaluation of Markov decision processes (MDPs), which are powerful analytical tools used for sequential decision making under uncertainty that have been widely used in many industrial and... more
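As a minimal illustration of the kind of MDP solution method such a tutorial covers, the following value-iteration sketch computes an optimal value function and greedy policy; the 2-state, 2-action transition and reward numbers are invented for the example:

```python
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9

# P[s, a, s'] = transition probability; R[s, a] = expected reward.
# Toy numbers; each P[s, a, :] sums to 1.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(500):                   # Bellman backups to a fixed point
    Q = R + gamma * P @ V              # Q[s,a] = R[s,a] + γ Σ_s' P[s,a,s'] V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)              # greedy policy w.r.t. the converged V
```

With a discount factor γ < 1 the backup is a contraction, so the loop converges to the unique optimal value function regardless of initialization.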
Q-learning is a simple, powerful algorithm for behavior learning. It was derived in the context of single agent decision making in Markov decision process environments, but its applicability is much broader—in experiments in multia-... more
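The single-agent case can be sketched in a few lines. The two-state environment below is invented for illustration (action 1 in state 1 pays reward 1, everything else 0); the update line is the standard tabular Q-learning rule:

```python
import random

random.seed(0)

def step(s, a):
    """Toy environment: next state is uniform random; only (s=1, a=1)
    is rewarded."""
    s_next = random.choice([0, 1])
    reward = 1.0 if (s == 1 and a == 1) else 0.0
    return s_next, reward

alpha, gamma, eps = 0.1, 0.9, 0.2      # learning rate, discount, exploration
Q = [[0.0, 0.0], [0.0, 0.0]]           # tabular Q-values, Q[state][action]

s = 0
for _ in range(20000):
    # epsilon-greedy action selection
    if random.random() < eps:
        a = random.randrange(2)
    else:
        a = max((0, 1), key=lambda x: Q[s][x])
    s_next, r = step(s, a)
    # Q-learning update toward the bootstrapped target.
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
    s = s_next
```

After training, Q[1][1] exceeds Q[1][0] by roughly the immediate reward of 1, since both actions lead to the same next-state distribution.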
Markov chains provide support for problems involving decisions under uncertainty over a continuous period of time. The greater availability of, and access to, processing power through computers allows these models to be used more often to... more
Board games are often taken as examples to teach decision-making algorithms in artificial intelligence (AI). These algorithms are generally presented with a strong focus on winning the game. Unfortunately, a few important aspects, such as... more
This article presents experimental results obtained with an original architecture enabling generic learning in the framework of factored Markov decision processes observable out of order (PDMFOD). The article... more
An approach to real-time control of a network of signalized intersections is proposed based on a discrete time, stationary, Markov control model (also known as Markov decision process or Markov dynamic programming). The approach... more
Many market participants now employ algorithmic trading, commonly defined as the use of computer algorithms to automatically make certain trading decisions, submit orders, and manage those orders after submission. Identifying and... more
We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world state; the planner must find a policy... more
Gaussian process-based algorithmic trading strategy identification (Quantitative Finance, 2015): Many market participants now employ algorithmic trading, commonly defined as the use of computer algorithms, to automatically make certain... more
Decision-making in an environment of uncertainty and imprecision for real-world problems is a complex task. In this paper we introduce general finite state fuzzy Markov chains that have a finite convergence to a stationary (may be... more
This paper first explores the decision-making process in agile teams using scrum practices and second identifies factors that influence the decision-making process during the Sprint Planning and Daily Scrum Meetings. We conducted 34... more
Recent trends in the commercial aviation industry have resulted in rapidly increasing complexity and decentralisation in service parts logistics systems. As a consequence, MRO service providers tend to adopt more flexible strategies, such... more
Android devices provide opportunities for users to install third-party applications through various online markets. This brings security and privacy concerns to the users, since third-party applications may pose serious threats. The... more
The development of service robots has recently received considerable attention. Their deployment, however, normally involves a substantial programming effort to develop a particular application. With the incorporation of service robots to... more
The problem of optimal policy formulation for teams of resource-limited agents in stochastic environments is composed of two strongly-coupled subproblems: a resource allocation problem and a policy optimization problem. We show how to... more
We study the problem of learning near-optimal behavior in finite Markov Decision Processes (MDPs) with a polynomial number of samples. These "PAC-MDP" algorithms include the well-known E^3 and R-MAX algorithms as well as the more recent... more
We consider the problem of optimally parking empty cars in an elevator group so as to anticipate and intercept the arrival of new passengers and minimize their waiting times. Two solutions are proposed, for the down-peak and up-peak... more
With standard assumptions the routing and wavelength assignment problem (RWA) can be viewed as a Markov Decision Process (MDP). The problem, however, defies an exact solution because of the huge size of the state space. Only heuristic... more
Compared to the wave's intelligence we are equal to a goldfish. This I say not in jest but because it has taken us millions of years to finally arrive at this day where we find the simple truth it isn't even hidden, it never really was,... more
In the Probabilistic I/O Automata (PIOA) framework, nondeterministic choices are resolved using perfect-information schedulers, which are similar to history-dependent policies for Markov decision processes (MDPs). These schedulers are too... more
In supervised learning scenarios, feature selection has been studied widely in the literature. Here, feature selection is considered as an empirical strategy for restricting the state space and lessening the complexity of the hypothesis. In this work... more
In this paper we introduce a stochastic model for dialogue systems based on Markov decision processes. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem and solved by a... more