Hoong Lau

    Many real-world systems involve interaction among a large number of agents to achieve a common goal, for example, air traffic control. Several model-free RL algorithms have been proposed for such settings. A key limitation is that the empirical reward signal in the model-free case is not very effective in addressing the multiagent credit assignment problem, which determines an agent's contribution to the team's success. This results in lower solution quality and high sample complexity. To address this, we contribute (a) an approach to learn a differentiable reward model for both continuous and discrete action settings by exploiting the collective nature of interactions among agents, a feature commonly present in large-scale multiagent applications; (b) a shaped reward model analytically derived from the learned reward model to address the key challenge of credit assignment; (c) a model-based multiagent RL approach that integrates shaped rewards into well-known RL algorithms such as...
    Law enforcement agencies in dense urban environments, faced with a wide range of incidents to handle and limited manpower, are turning to data-driven AI to inform their policing strategy. In this paper we present a patrol scheduling system called GRAND-VISION: Ground Response Allocation and Deployment - Visualization, Simulation, and Optimization. The system employs deep learning to generate incident sets that are used to train a patrol schedule that can accommodate varying manpower, break times, manual pre-allocations, and a variety of spatio-temporal demand features. The complexity of the scenario results in a system with real world applicability, which we demonstrate through simulation on historical data obtained from a large urban law enforcement agency.
    Algorithm portfolios seek to determine an effective set of algorithms that can be used within an algorithm selection framework to solve problems. A limited number of these portfolio studies focus on generating different versions of a target algorithm using different parameter configurations. In this paper, we employ a Design of Experiments (DOE) approach to determine a promising range of values for each parameter of an algorithm. These ranges are further processed to determine a portfolio of parameter configurations, which would be used within two online Algorithm Selection approaches for solving different instances of a given combinatorial optimization problem effectively. We apply our approach on a Simulated Annealing-Tabu Search (SA-TS) hybrid algorithm for solving the Quadratic Assignment Problem (QAP) as well as an Iterated Local Search (ILS) on the Travelling Salesman Problem (TSP). We also generate a portfolio of parameter configurations using best-of-breed parameter tuning a...
    In this article, we investigate effective ways of utilizing crowdworkers in providing various urban services. The task recommendation platform that we design can match tasks to crowdworkers based on workers’ historical trajectories and time budget limits, thus making recommendations personal and efficient. One major challenge we manage to address is the handling of crowdworker’s trajectory uncertainties. In this article, we explicitly allow multiple routine routes to be probabilistically associated with each worker. We formulate this problem as an integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structures of the formulation and apply the Lagrangian relaxation technique to scale up computation. Numerical experiments have been performed over the instances generated using the realistic public transit dataset in Singapore. The results show that we can find significantly better solutions than the determ...
    With the advent of e-commerce, logistics providers are faced with the challenge of handling fluctuating and sparsely distributed demand, which raises their operational costs significantly. As a result, horizontal cooperation is gaining momentum around the world. One of the major impediments, however, is the lack of a stable and fair profit-sharing mechanism. In this paper, we address this problem using the framework of computational cooperative games. We first present the cooperative vehicle routing game as a model for collaborative logistics operations. Using the axioms of the Shapley value as the conditions for fairness, we show that a stable, fair and budget-balanced allocation does not exist in many instances of the game. By relaxing budget balance, we then propose an allocation scheme based on the normalized Shapley value. We show that this scheme maintains stability and fairness while requiring minimum subsidy. Finally, using numerical experiments we demonstrate the feasibility of the ...
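    As a rough illustration of the Shapley value mentioned in the abstract above, the sketch below computes the standard (unnormalized) Shapley cost shares for a toy three-carrier game and checks them against the core. The coalition costs, the carrier names A, B and C, and the helper names are hypothetical; treating "stable" as core membership is an assumption, and the paper's normalized scheme is not reproduced here.

```python
from itertools import combinations, permutations
from math import isclose


def shapley_values(players, cost):
    """Average each player's marginal cost over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += cost(with_p) - cost(coalition)
            coalition = with_p
    return {p: totals[p] / len(orders) for p in players}


def in_core(players, cost, shares, tol=1e-9):
    """Assumed stability test: no coalition is charged more than its own cost."""
    for size in range(1, len(players) + 1):
        for group in combinations(players, size):
            if sum(shares[p] for p in group) > cost(frozenset(group)) + tol:
                return False
    return True


# Hypothetical routing cost of each coalition of carriers.
COSTS = {
    frozenset(): 0.0,
    frozenset("A"): 10.0, frozenset("B"): 12.0, frozenset("C"): 8.0,
    frozenset("AB"): 18.0, frozenset("AC"): 15.0, frozenset("BC"): 16.0,
    frozenset("ABC"): 24.0,
}
PLAYERS = ["A", "B", "C"]

phi = shapley_values(PLAYERS, COSTS.__getitem__)
stable = in_core(PLAYERS, COSTS.__getitem__, phi)
```

    In this toy instance the shares sum to the grand-coalition cost and happen to lie in the core; the abstract's point is that this need not hold in general, which is what motivates relaxing budget balance.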
    Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-arm ba...
    In real-world urban logistics operations, changes to the routes and tasks occur in response to dynamic events. To ensure customers' demands are met, planners need to make these changes quickly (sometimes instantaneously). This paper proposes the formulation of a dynamic vehicle routing problem with time windows and both known and stochastic customers as a route-based Markov Decision Process. We propose a solution approach, called DRLSA, that combines Deep Reinforcement Learning (specifically neural network-based Temporal-Difference learning with experience replay) to approximate the value function with a routing heuristic based on Simulated Annealing. Our approach enables optimized re-routing decisions to be generated almost instantaneously. Furthermore, to exploit the structure of this problem, we propose a state representation based on the total cost of the remaining routes of the vehicles. We show that the cost of the remaining routes of vehicles can serve as a proxy to the seque...
    Police patrol aims to fulfill two main objectives, namely to project presence and to respond to incidents in a timely manner. Incidents happen dynamically and can disrupt the initially planned patrol schedules. The key decisions are which patrol agent to dispatch to an incident and, subsequently, how to adapt the patrol schedules in response to such dynamically occurring incidents while still fulfilling both objectives, which can sometimes conflict. In this paper, we define this real-world problem as a Dynamic Bi-Objective Police Patrol Dispatching and Rescheduling Problem and propose a solution approach that combines Deep Reinforcement Learning (specifically neural network-based Temporal-Difference learning with experience replay) to approximate the value function and a rescheduling heuristic based on ejection chains to learn both dispatching and rescheduling policies jointly. To address the dual objectives, we propose a reward function that impl...
    Effective placement of emergency response vehicles (such as ambulances, fire trucks, police cars) to deal with medical, fire or criminal activities can reduce the incident response time by a few seconds, which in turn can potentially save a human life. Owing to its adoption in Emergency Medical Services (EMSs) worldwide, existing research on improving emergency response has focused on optimizing the objective of bounded time (i.e., the number of incidents served in a fixed time). Due to the dependence of this objective on temporal uncertainty, optimizing the bounded time objective is challenging. In this paper, we propose a new objective referred to as the bounded rank (which is the number of incidents served by a base station whose rank is below a bounded rank value) that has nice theoretical properties and serves as an indirect substitute for the bounded time objective. To understand the theoretical properties of this new objective in the context of the spatio-temporal uncertainty associ...
    This paper introduces and addresses a new multi-agent variant of the orienteering problem (OP), namely the multi-agent orienteering problem with capacity constraints (MAOPCC). Different from the existing variants of OP, MAOPCC allows a group of visitors to concurrently visit a node but limits the number of visitors simultaneously being served at each node. In this work, we solve MAOPCC in a centralized manner and optimize the total collected rewards of all agents. A branch and bound algorithm is first proposed to find an optimal MAOPCC solution. Since finding an optimal solution for MAOPCC can become intractable as the number of vertices and agents increases, a computationally efficient sequential algorithm that sacrifices the solution quality is then proposed. Finally, a probabilistic iterated local search algorithm is developed to find a sufficiently good solution in a reasonable time. Our experimental results show that the latter strikes a good tradeoff between the solution quali...
    Mobile crowd-sourcing can become a strategy to perform time-sensitive urban tasks (such as municipal monitoring and last-mile logistics) by effectively coordinating smartphone users. The success of the mobile crowd-sourcing platform depends mainly on its effectiveness in engaging crowd-workers, and recent studies have shown that compared to the pull-based approach, which relies on crowd-workers to browse and commit to tasks they would want to perform, the push-based approach can take workers' daily routines into consideration and generate highly effective recommendations. As a result, workers waste less time on detours, plan more in advance, and require much less planning effort. However, the push-based systems are not without drawbacks. The major concern is the potential privacy invasion that could result from the disclosure of individuals' mobility traces to the crowd-sourcing platform. In this paper, we first demonstrate specific threats of continuous sharing of use...
    Orienteering Problems (OPs) are used to model many routing and trip planning problems. OPs are a variant of the well-known traveling salesman problem where the goal is to compute the highest reward path that includes a subset of vertices and has an overall travel time less than a specified deadline. However, the applicability of OPs is limited due to the assumption of deterministic and static travel times. To that end, Campbell et al. extended OPs to Stochastic OPs (SOPs) to represent uncertain travel times (Campbell et al. 2011). In this article, we make the following key contributions: (1) We extend SOPs to Dynamic SOPs (DSOPs), which allow for time-dependent travel times; (2) we introduce a new objective criterion for SOPs and DSOPs to represent a percentile measure of risk; (3) we provide non-linear optimization formulations along with their linear equivalents for solving the risk-sensitive SOPs and DSOPs; (4) we provide a local search mechanism for solving the risk-sensitive SO...
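    To make the idea of a percentile measure of risk under uncertain travel times concrete, here is a minimal sketch that evaluates one fixed path against a handful of sampled travel-time scenarios. The scenario data, node names, deadline and function names are all hypothetical, and the abstract above does not spell out the exact risk criterion, so this only illustrates an empirical percentile and an on-time probability, not the paper's formulation.

```python
import math


def path_durations(path, scenarios):
    """Total travel time of `path` under each sampled scenario."""
    legs = list(zip(path, path[1:]))
    return [sum(scenario[leg] for leg in legs) for scenario in scenarios]


def empirical_percentile(values, q):
    """Smallest value v such that at least a fraction q of values are <= v."""
    ordered = sorted(values)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]


def on_time_probability(values, deadline):
    """Fraction of scenarios in which the path finishes by the deadline."""
    return sum(v <= deadline for v in values) / len(values)


# Hypothetical sampled travel times for the legs s -> a and a -> t.
SCENARIOS = [
    {("s", "a"): 3, ("a", "t"): 4},
    {("s", "a"): 4, ("a", "t"): 4},
    {("s", "a"): 3, ("a", "t"): 6},
    {("s", "a"): 5, ("a", "t"): 7},
    {("s", "a"): 4, ("a", "t"): 5},
]

durations = path_durations(["s", "a", "t"], SCENARIOS)
p80 = empirical_percentile(durations, 0.8)
on_time = on_time_probability(durations, deadline=10)
```

    A risk-sensitive planner could, for example, compare candidate paths by such a percentile instead of by expected travel time; how that comparison enters the optimization is left to the paper.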
    We address the problem of maritime traffic management in busy waterways to increase the safety of navigation by reducing congestion. We model maritime traffic as a large multiagent system with individual vessels as agents and the VTS authority as the regulatory agent. We develop a maritime traffic simulator based on historical traffic data that incorporates realistic domain constraints such as uncertain and asynchronous movement of vessels. We also develop a traffic coordination approach that provides speed recommendations to vessels in different zones. We exploit the nature of collective interactions among agents to develop a policy gradient approach that can scale up to real-world problems. Empirical results on synthetic and real-world problems show that our approach can significantly reduce congestion while keeping the traffic throughput high.
    By effectively reaching out to and engaging a larger population of mobile users, mobile crowd-sourcing has become a strategy to perform a large number of urban tasks. Recent empirical studies have shown that compared to the pull-based approach, which expects the users to browse through the list of tasks to perform, the push-based approach that actively recommends tasks can greatly improve the overall system performance. As the efficiency of the push-based approach is achieved by incorporating workers' mobility traces, privacy is naturally a concern. In this paper, we propose a novel, two-stage, user-controlled obfuscation technique that provides a trade-off-amenable framework catering to multi-attribute privacy measures (considering the per-user sensitivity and global uniqueness of locations). We demonstrate the effectiveness of our approach by testing it using the real-world data collected from the well-established TA$Ker platform. More specifically, we show that one can in...
    Scheduling problems in manufacturing, logistics and project management have frequently been modeled using the framework of Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max). Due to the importance of these problems, providing scalable solution schedules for RCPSP/max problems is a topic of extensive research. However, all existing methods for solving RCPSP/max assume that durations of activities are known with certainty, an assumption that does not hold in real world scheduling problems where unexpected external events such as manpower availability, weather changes, etc. lead to delays or advances in completion of activities. Thus, in this paper, our focus is on providing a scalable method for solving RCPSP/max problems with durational uncertainty. To that end, we introduce the robust local search method consisting of three key ideas: (a) Introducing and studying the properties of two decision rule approximations used to compute start tim...
