Markov Decision Process
Recent papers in Markov Decision Process
We study a decision problem faced by an energy limited wireless device that operates in discrete time. There is some external arrival to the device's transmit buffer. The possible decisions are: a) to serve some of the buffer content; b)... more
This paper presents a method for short-term electricity price forecasting based on a combination of Monte Carlo simulation and Markov chains. The method provides an estimation of the probabilities of various electricity price ranges,... more
A major issue in supply chain inventory management is the coordination of inventory policies adopted by different supply chain actors, such as suppliers, manufacturers, distributors, so as to smooth material flow and minimize costs while... more
the physical phenomena. By this method, the students enhance their problem-solving abilities with minimal programming skills. By using examples, the paper presents an approach to computer-aided problem-solving methods for junior-level... more
Limited battery power at wireless video sensor nodes, along with the transmission quality requirements for video data, makes quality-of-service (QoS) provisioning in a wireless video sensor network a very challenging task. In this paper,... more
This paper gives a brief overview of version 2.0 of PRISM, a tool for the automatic formal verification of probabilistic systems, and some of the case studies to which it has already been applied.
Sepsis, the tenth-leading cause of death in the United States, accounts for more than $16.7 billion in annual health care costs. A significant factor in these costs is hospital length of stay. The lack of standardized hospital discharge... more
This book, written in Portuguese, deals with Operational Research topics, such as linear programming models and direct techniques for solving Ax = b systems, revised simplex method, Markov processes and queuing systems, in addition to... more
Example 1. Consider the two-state, continuous-time Markov process with transition rate diagram for some positive constants A and B. The generator matrix is given by Q = [[−A, A], [B, −B]]. Solve the forward Kolmogorov equation for a given initial... more
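As a sketch of the computation this example asks for (with made-up rates A = 2, B = 3 and a state-0-first labelling assumed here), the forward equation dp/dt = pQ for this two-state chain has the closed form p0(t) = B/(A+B) + (p0(0) − B/(A+B))·e^(−(A+B)t):

```python
import numpy as np

# Hypothetical rates for illustration; any positive A, B work.
A, B = 2.0, 3.0

# Generator matrix: state 0 -> 1 at rate A, state 1 -> 0 at rate B.
Q = np.array([[-A,  A],
              [ B, -B]])

def p(t, p0=np.array([1.0, 0.0])):
    """Closed-form solution of dp/dt = p Q for the two-state chain:
    p0(t) relaxes exponentially to B/(A+B) at rate A+B."""
    pi0 = B / (A + B)
    p0_t = pi0 + (p0[0] - pi0) * np.exp(-(A + B) * t)
    return np.array([p0_t, 1.0 - p0_t])

# Cross-check against p(t) = p(0) expm(Q t), with expm approximated
# by a truncated Taylor series.
def expm(M, terms=30):
    out, term = np.eye(2), np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

t = 0.7
assert np.allclose(p(t), np.array([1.0, 0.0]) @ expm(Q * t))
```

For large t the solution approaches the stationary distribution (B/(A+B), A/(A+B)), as expected from πQ = 0.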
The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950's. During the decades of the last century this theory has grown dramatically. It... more
The semiconductor industry is very capital intensive, and capacity utilization significantly affects the capital effectiveness and profitability of semiconductor manufacturing companies. Due to the constant technology advance driven by Moore's... more
The majority of learning algorithms available today focus on approximating the state (V) or state-action (Q) value function, and efficient action selection comes as an afterthought. On the other hand, real-world problems tend to have... more
Recently, it has been recognized that revenue management of cruise ships is different from that of airlines or hotels. Among the main differences is the presence of multiple capacity constraints in cruise ships, i.e., the number of cabins... more
We formulate and analyze a Markov decision process (dynamic programming) model for airline seat allocation (yield management) on a single-leg flight with multiple fare classes. Unlike previous models, we allow cancellation, no-shows, and... more
The focus of this work is the computation of efficient strategies for commodity trading in a multi-market environment. In today's "global economy" commodities are often bought in one location and then sold (right away, or after some... more
We introduce the concept of a Markov risk measure and we use it to formulate risk-averse control problems for two Markov decision models: a finite horizon model and a discounted infinite horizon model. For both models we derive... more
Material transportation is one of the most important aspects of open-pit mine operations. The problem usually involves a truck dispatching system in which decisions on truck assignments and destinations are taken in real-time. Due to its... more
We study the 'Creation of Pooling in Inventory and Queueing Models'. This research consists of the study of sharing a scarce resource (such as inventory, server capacity, or production capacity) between multiple customer classes. This is... more
We provide a tutorial on the construction and evaluation of Markov decision processes (MDPs), which are powerful analytical tools used for sequential decision making under uncertainty that have been widely used in many industrial and... more
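As a minimal illustration of the kind of MDP solution method such a tutorial covers, the following value-iteration sketch computes an optimal value function and greedy policy; the 2-state, 2-action transition and reward numbers are invented for the example:

```python
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9

# P[s, a, s'] = transition probability; R[s, a] = expected reward.
# Toy numbers; each P[s, a, :] sums to 1.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(500):                   # Bellman backups to a fixed point
    Q = R + gamma * P @ V              # Q[s,a] = R[s,a] + γ Σ_s' P[s,a,s'] V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)              # greedy policy w.r.t. the converged V
```

With a discount factor γ < 1 the backup is a contraction, so the loop converges to the unique optimal value function regardless of initialization.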
Q-learning is a simple, powerful algorithm for behavior learning. It was derived in the context of single agent decision making in Markov decision process environments, but its applicability is much broader—in experiments in multia-... more
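The single-agent case can be sketched in a few lines. The two-state environment below is invented for illustration (action 1 in state 1 pays reward 1, everything else 0); the update line is the standard tabular Q-learning rule:

```python
import random

random.seed(0)

def step(s, a):
    """Toy environment: next state is uniform random; only (s=1, a=1)
    is rewarded."""
    s_next = random.choice([0, 1])
    reward = 1.0 if (s == 1 and a == 1) else 0.0
    return s_next, reward

alpha, gamma, eps = 0.1, 0.9, 0.2      # learning rate, discount, exploration
Q = [[0.0, 0.0], [0.0, 0.0]]           # tabular Q-values, Q[state][action]

s = 0
for _ in range(20000):
    # epsilon-greedy action selection
    if random.random() < eps:
        a = random.randrange(2)
    else:
        a = max((0, 1), key=lambda x: Q[s][x])
    s_next, r = step(s, a)
    # Q-learning update toward the bootstrapped target.
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
    s = s_next
```

After training, Q[1][1] exceeds Q[1][0] by roughly the immediate reward of 1, since both actions lead to the same next-state distribution.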
Markov chains provide support for problems involving decisions under uncertainty over a continuous period of time. The greater availability of, and access to, processing power through computers allows these models to be used more often to... more
Board games are often taken as examples to teach decision-making algorithms in artificial intelligence (AI). These algorithms are generally presented with a strong focus on winning the game. Unfortunately, a few important aspects, such as... more
This article presents experimental results obtained with an original architecture enabling generic learning in the framework of factored Markov decision processes observable out of order (PDMFOD). The article... more
An approach to real-time control of a network of signalized intersections is proposed based on a discrete time, stationary, Markov control model (also known as Markov decision process or Markov dynamic programming). The approach... more
Many market participants now employ algorithmic trading, commonly defined as the use of computer algorithms to automatically make certain trading decisions, submit orders, and manage those orders after submission. Identifying and... more
We provide a method, based on the theory of Markov decision processes, for efficient planning in stochastic domains. Goals are encoded as reward functions, expressing the desirability of each world state; the planner must find a policy... more
Gaussian process-based algorithmic trading strategy identification (Quantitative Finance, 2015): Many market participants now employ algorithmic trading, commonly defined as the use of computer algorithms, to automatically make certain... more
Decision-making in an environment of uncertainty and imprecision for real-world problems is a complex task. In this paper we introduce general finite state fuzzy Markov chains that have a finite convergence to a stationary (may be... more
This paper first explores the decision-making process in agile teams using scrum practices and second identifies factors that influence the decision-making process during the Sprint Planning and Daily Scrum Meetings. We conducted 34... more
Recent trends in the commercial aviation industry have resulted in rapidly increasing complexity and decentralisation in service parts logistics systems. As a consequence, MRO service providers tend to adopt more flexible strategies, such... more
Android devices provide opportunities for users to install third-party applications through various online markets. This brings security and privacy concerns to the users, since third-party applications may pose serious threats. The... more
The development of service robots has recently received considerable attention. Their deployment, however, normally involves a substantial programming effort to develop a particular application. With the incorporation of service robots to... more
The problem of optimal policy formulation for teams of resource-limited agents in stochastic environments is composed of two strongly-coupled subproblems: a resource allocation problem and a policy optimization problem. We show how to... more
We study the problem of learning near-optimal behavior in finite Markov Decision Processes (MDPs) with a polynomial number of samples. These "PAC-MDP" algorithms include the well-known E^3 and R-MAX algorithms as well as the more recent... more
We consider the problem of optimally parking empty cars in an elevator group so as to anticipate and intercept the arrival of new passengers and minimize their waiting times. Two solutions are proposed, for the down-peak and up-peak... more
With standard assumptions the routing and wavelength assignment problem (RWA) can be viewed as a Markov Decision Process (MDP). The problem, however, defies an exact solution because of the huge size of the state space. Only heuristic... more
Compared to the wave's intelligence we are equal to a goldfish. This I say not in jest but because it has taken us millions of years to finally arrive at this day where we find the simple truth it isn't even hidden, it never really was,... more
In the Probabilistic I/O Automata (PIOA) framework, nondeterministic choices are resolved using perfect-information schedulers, which are similar to history-dependent policies for Markov decision processes (MDPs). These schedulers are too... more
In supervised learning scenarios, feature selection has been studied widely in the literature. Here, feature selection is considered as an empirical strategy for restricting the state space and lessening the complexity of the hypothesis. In this work... more
In this paper we introduce a stochastic model for dialogue systems based on Markov decision processes. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem and solved by a... more