Search | arXiv e-print repository

Quantifying the Cost of Learning in Queueing Systems

Authors: Daniel Freund, Thodoris Lykouris, Wentao Weng

Abstract: Queueing systems are widely applicable stochastic models with use cases in communication networks, healthcare, service systems, etc. Although their optimal control has been extensively studied, most existing approaches assume perfect knowledge of the system parameters. Of course, this assumption rarely holds in practice where there is parameter uncertainty, thus motivating a recent line of work on bandit learning for queueing systems. This nascent stream of research focuses on the asymptotic performance of the proposed algorithms. In this paper, we argue that an asymptotic metric, which focuses on late-stage performance, is insufficient to capture the intrinsic statistical complexity of learning in queueing systems which typically occurs in the early stage. Instead, we propose the Cost of Learning in Queueing (CLQ), a new metric that quantifies the maximum increase in time-averaged queue length caused by parameter uncertainty. We characterize the CLQ of a single queue multi-server system, and then extend these results to multi-queue multi-server systems and networks of queues. In establishing our results, we propose a unified analysis framework for CLQ that bridges Lyapunov and bandit analysis, provides guarantees for a wide range of algorithms, and could be of independent interest.

Submitted 27 October, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

Comments: A condensed version of this work was accepted for presentation at the Conference on Neural Information Processing Systems (NeurIPS 2023). Compared to the first version of the paper, the current version expands the comparison with related work

arXiv:2307.16315 [pdf, other]

Towards Practical Robustness Auditing for Linear Regression

Authors: Daniel Freund, Samuel B. Hopkins

Abstract: We investigate practical algorithms to find or disprove the existence of small subsets of a dataset which, when removed, reverse the sign of a coefficient in an ordinary least squares regression involving that dataset. We empirically study the performance of well-established algorithmic techniques for this task -- mixed integer quadratically constrained optimization for general linear regression problems and exact greedy methods for special cases. We show that these methods largely outperform the state of the art and provide a useful robustness check for regression problems in a few dimensions. However, significant computational bottlenecks remain, especially for the important task of disproving the existence of such small sets of influential samples for regression problems of dimension $3$ or greater. We make some headway on this challenge via a spectral algorithm using ideas drawn from recent innovations in algorithmic robust statistics. We summarize the limitations of known techniques in several challenge datasets to encourage further algorithmic innovation.

Submitted 30 July, 2023; originally announced July 2023.

arXiv:2301.10642 [pdf, other]

Group fairness in dynamic refugee assignment

Authors: Daniel Freund, Thodoris Lykouris, Elisabeth Paulson, Bradley Sturt, Wentao Weng

Abstract: Ensuring that refugees and asylum seekers thrive (e.g., find employment) in their host countries is a profound humanitarian goal, and a primary driver of employment is the geographic location within a host country to which the refugee or asylum seeker is assigned. Recent research has proposed and implemented algorithms that assign refugees and asylum seekers to geographic locations in a manner that maximizes the average employment across all arriving refugees. While these algorithms can have substantial overall positive impact, using data from two industry collaborators we show that the impact of these algorithms can vary widely across key subgroups based on country of origin, age, or educational background. Thus motivated, we develop a simple and interpretable framework for incorporating group fairness into the dynamic refugee assignment problem. In particular, the framework can flexibly incorporate many existing and future definitions of group fairness from the literature (e.g., maxmin, randomized, and proportionally-optimized within-group). Equipped with our framework, we propose two bid-price algorithms that maximize overall employment while simultaneously yielding provable group fairness guarantees. Through extensive numerical experiments using various definitions of group fairness and real-world data from the U.S. and the Netherlands, we show that our algorithms can yield substantial improvements in group fairness compared to an offline benchmark fairness constraints, with only small relative decreases ($\approx$ 1%-5%) in global performance.

Submitted 11 January, 2024; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2206.03324 [pdf, other]

Efficient decentralized multi-agent learning in asymmetric bipartite queueing systems

Authors: Daniel Freund, Thodoris Lykouris, Wentao Weng

Abstract: We study decentralized multi-agent learning in bipartite queueing systems, a standard model for service systems. In particular, N agents request service from K servers in a fully decentralized way, i.e., by running the same algorithm without communication. Previous decentralized algorithms are restricted to symmetric systems, have performance that is degrading exponentially in the number of servers, require communication through shared randomness and unique agent identities, and are computationally demanding. In contrast, we provide a simple learning algorithm that, when run decentrally by each agent, leads the queueing system to have efficient performance in general asymmetric bipartite queueing systems while also having additional robustness properties. Along the way, we provide the first provably efficient UCB-based algorithm for the centralized case of the problem.

Submitted 5 August, 2023; v1 submitted 5 June, 2022; originally announced June 2022.

Comments: To appear in Operations Research. A preliminary version of this work was accepted for presentation at the Conference on Learning Theory (COLT) 2022. Compared to the first version of the paper, the current version expands upon the related work and adds intuition on the technical content

arXiv:2111.00002 [pdf, other]

Fair Incentives for Repeated Engagement

Authors: Daniel Freund, Chamsi Hssaine

Abstract: We study a decision-maker's problem of finding optimal monetary incentive schemes for retention when faced with agents whose participation decisions (stochastically) depend on the incentive they receive. Our focus is on policies constrained to fulfill two fairness properties that preclude outcomes wherein different groups of agents experience different treatment on average. We formulate the problem as a high-dimensional stochastic optimization problem, and study it through the use of a closely related deterministic variant. We show that the optimal static solution to this deterministic variant is asymptotically optimal for the dynamic problem under fairness constraints. Though solving for the optimal static solution gives rise to a non-convex optimization problem, we uncover a structural property that allows us to design a tractable, fast-converging heuristic policy. Traditional schemes for retention ignore fairness constraints; indeed, the goal in these is to use differentiation to incentivize repeated engagement with the system. Our work (i) shows that even in the absence of explicit discrimination, dynamic policies may unintentionally discriminate between agents of different types by varying the type composition of the system, and (ii) presents an asymptotically optimal policy to avoid such discriminatory outcomes.

Submitted 29 July, 2024; v1 submitted 28 October, 2021; originally announced November 2021.

arXiv:2104.14740 [pdf, other]

Driver Positioning and Incentive Budgeting with an Escrow Mechanism for Ridesharing Platforms

Authors: Hao Yi Ong, Daniel Freund, Davide Crapis

Abstract: Drivers on the Lyft rideshare platform do not always know where the areas of supply shortage are in real time. This lack of information hurts both riders trying to find a ride and drivers trying to determine how to maximize their earnings opportunity. Lyft's Personal Power Zone (PPZ) product helps the company to maintain high levels of service on the platform by influencing the spatial distribution of drivers in real time via monetary incentives that encourage them to reposition their vehicles. The underlying system that powers the product has two main components: (1) a novel 'escrow mechanism' that tracks available incentive budgets tied to locations within a city in real time, and (2) an algorithm that solves the stochastic driver positioning problem to maximize short-run revenue from riders' fares. The optimization problem is a multiagent dynamic program that is too complicated to solve optimally for our large-scale application. Our approach is to decompose it into two subproblems. The first determines the set of drivers to incentivize and where to incentivize them to position themselves. The second determines how to fund each incentive using the escrow budget. By formulating it as two convex programs, we are able to use commercial solvers that find the optimal solution in a matter of seconds. Rolled out to all 320 cities in which Lyft operates in a little over a year, the system now generates millions of bonuses that incentivize hundreds of thousands of active drivers to optimally position themselves in anticipation of ride requests every week.
Together, the PPZ product and its underlying algorithms represent a paradigm shift in how Lyft drivers drive and generate earnings on the platform. Its direct business impact has been a 0.5% increase in incremental bookings, amounting to tens of millions of dollars per year.

Submitted 29 April, 2021; originally announced April 2021.

Comments: Forthcoming in INFORMS Journal on Applied Analytics

arXiv:1801.09749 [pdf, ps, other]

Deep Learning based Retinal OCT Segmentation

Authors: Mike Pekala, Neil Joshi, David E. Freund, Neil M. Bressler, Delia Cabrera DeBuc, Philippe M Burlina

Abstract: Our objective is to evaluate the efficacy of methods that use deep learning (DL) for the automatic fine-grained segmentation of optical coherence tomography (OCT) images of the retina. OCT images from 10 patients with mild non-proliferative diabetic retinopathy were used from a public (U. of Miami) dataset. For each patient, five images were available: one image of the fovea center, two images of the perifovea, and two images of the parafovea. For each image, two expert graders each manually annotated five retinal surfaces (i.e. boundaries between pairs of retinal layers). The first grader's annotations were used as ground truth and the second grader's annotations to compute inter-operator agreement. The proposed automated approach segments images using fully convolutional networks (FCNs) together with Gaussian process (GP)-based regression as a post-processing step to improve the quality of the estimates. Using 10-fold cross validation, the performance of the algorithms is determined by computing the per-pixel unsigned error (distance) between the automated estimates and the ground truth annotations generated by the first manual grader. We compare the proposed method against five state of the art automatic segmentation techniques. The results show that the proposed methods compare favorably with state of the art techniques, resulting in the smallest mean unsigned error values and associated standard deviations, and performance is comparable with human annotation of retinal layers from OCT when there is only mild retinopathy.
The results suggest that semantic segmentation using FCNs, coupled with regression-based post-processing, can effectively solve the OCT segmentation problem on par with human capabilities with mild retinopathy.

Submitted 29 January, 2018; originally announced January 2018.

arXiv:1611.09304 [pdf, other]

Minimizing Multimodular Functions and Allocating Capacity in Bike-Sharing Systems

Authors: Daniel Freund, Shane G. Henderson, David B. Shmoys

Abstract: The growing popularity of bike-sharing systems around the world has motivated recent attention to models and algorithms for their effective operation. Most of this literature focuses on their daily operation for managing asymmetric demand. In this work, we consider the more strategic question of how to (re-)allocate dock-capacity in such systems. We develop mathematical formulations for variations of this problem (either for service performance over the course of one day or for a long-run-average) and exhibit discrete convex properties in associated optimization problems. This allows us to design a practically fast polynomial-time allocation algorithm to compute an optimal solution for this problem, which can also handle practically motivated constraints, such as a limit on the number of docks moved in the system. We apply our algorithm to data sets from Boston, New York City, and Chicago to investigate how different dock allocations can yield better service in these systems. Recommendations based on our analysis have led to changes in the system design in Chicago and New York City. Beyond optimizing for improved quality of service through better allocations, our results also provide a metric to compare the impact of strategically reallocating docks and the rebalancing of bikes.

Submitted 14 March, 2022; v1 submitted 28 November, 2016; originally announced November 2016.

arXiv:1608.06819 [pdf, ps, other]

Pricing and Optimization in Shared Vehicle Systems: An Approximation Framework

Authors: Siddhartha Banerjee, Daniel Freund, Thodoris Lykouris

Abstract: Optimizing shared vehicle systems (bike/scooter/car/ride-sharing) is more challenging compared to traditional resource allocation settings due to the presence of \emph{complex network externalities} -- changes in the demand/supply at any location affect future supply throughout the system within short timescales. These externalities are well captured by steady-state Markovian models, which are therefore widely used to analyze such systems. However, using such models to design pricing and other control policies is computationally difficult since the resulting optimization problems are high-dimensional and non-convex. To this end, we develop a \emph{rigorous approximation framework} for shared vehicle systems, providing a unified approach for a wide range of controls (pricing, matching, rebalancing), objective functions (throughput, revenue, welfare), and system constraints (travel-times, welfare benchmarks, posted-price constraints). Our approach is based on the analysis of natural convex relaxations, and obtains as special cases existing approximate-optimal policies for limited settings, asymptotic-optimality results, and heuristic policies. The resulting guarantees are non-asymptotic and parametric, and provide operational insights into the design of real-world systems. In particular, for any shared vehicle system with $n$ stations and $m$ vehicles, our framework obtains an approximation ratio of $1+(n-1)/m$, which is particularly meaningful when $m/n$, the average number of vehicles per station, is large, as is often the case in practice.

Submitted 10 May, 2021; v1 submitted 24 August, 2016; originally announced August 2016.

Comments: The current version represents the content that will appear in Operations Research. A one-page abstract of the paper appeared at the 18th ACM Conference on Economics and Computation (EC 2017)
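
To give a feel for the ratio $1+(n-1)/m$ quoted in the abstract above, the following minimal sketch simply evaluates the formula. The function name `approximation_ratio` and the example system sizes are hypothetical choices made here for illustration; they do not come from the paper.

```python
from fractions import Fraction


def approximation_ratio(n: int, m: int) -> Fraction:
    """Evaluate 1 + (n - 1) / m exactly for n stations and m vehicles."""
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive integers")
    return 1 + Fraction(n - 1, m)


# Hypothetical system sizes, chosen only to show how the value shrinks
# toward 1 as the number of vehicles per station (m / n) grows.
for n, m in [(100, 100), (100, 1000), (100, 10000)]:
    ratio = approximation_ratio(n, m)
    print(f"n={n}, m={m}, m/n={m / n:g}, ratio={float(ratio):.4f}")
```

With the number of stations held fixed, the computed value moves toward 1 as the fleet grows, which matches the abstract's remark that the guarantee is most meaningful when $m/n$ is large.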

arXiv:1510.00738 [pdf, ps, other]

Rank Aggregation: New Bounds for MCx

Authors: Daniel Freund, David P. Williamson

Abstract: The rank aggregation problem has received significant recent attention within the computer science community. Its applications today range far beyond the original aim of building metasearch engines to problems in machine learning, recommendation systems and more. Several algorithms have been proposed for these problems, and in many cases approximation guarantees have been proven for them. However, it is also known that some Markov chain based algorithms (MC1, MC2, MC3, MC4) perform extremely well in practice, yet had no known performance guarantees. We prove supra-constant lower bounds on approximation guarantees for all of them. We also raise the lower bound for sorting by Copeland score from 3/2 to 2 and prove an upper bound of 11, before showing that in particular ways, MC4 can nevertheless be seen as a generalization of Copeland score.

Submitted 2 October, 2015; originally announced October 2015.

ACM Class: F.2.2; G.2.1
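
For readers unfamiliar with sorting by Copeland score, mentioned in the abstract above, here is a minimal sketch under one common definition: a candidate's score is the number of pairwise majority contests it wins, with a tie counted as half a win for each side. The tie convention, the function name `copeland_ranking`, and the example input are assumptions made here; the paper may use a different variant.

```python
from itertools import combinations


def copeland_ranking(rankings):
    """Aggregate full rankings (lists of the same candidates, best first).

    Returns (aggregated order, scores). Assumed convention: one point per
    pairwise majority win, half a point each for a pairwise tie.
    """
    candidates = list(rankings[0])
    # Position of each candidate in each input ranking (lower is better).
    positions = [{c: i for i, c in enumerate(r)} for r in rankings]
    score = {c: 0.0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_wins = sum(1 for pos in positions if pos[a] < pos[b])
        b_wins = len(rankings) - a_wins
        if a_wins > b_wins:
            score[a] += 1
        elif b_wins > a_wins:
            score[b] += 1
        else:
            score[a] += 0.5
            score[b] += 0.5
    # Stable sort: equal scores keep the order of the first input ranking.
    order = sorted(candidates, key=lambda c: -score[c])
    return order, score


# Hypothetical input: three voters ranking three candidates.
order, score = copeland_ranking([
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a", "c"],
])
print(order, score)
```

In this example `a` beats both other candidates in pairwise majority contests and `b` beats `c`, so the aggregated order is `a`, `b`, `c`.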

arXiv:1503.00158 [pdf, ps, other]

Contagious Sets in Dense Graphs

Authors: Daniel Freund, Matthias Poloczek, Daniel Reichman

Abstract: We study the activation process in undirected graphs known as bootstrap percolation: a vertex is active either if it belongs to a set of initially activated vertices or if at some point it had at least r active neighbors, for a threshold r that is identical for all vertices. A contagious set is a vertex set whose activation results in the entire graph being active. Let m(G,r) be the size of a smallest contagious set in a graph G on n vertices. We examine density conditions that ensure m(G,r) = r for all r >= 2. With respect to the minimum degree, we prove that such a smallest possible contagious set is guaranteed to exist if and only if G has minimum degree at least (k-1)/k * n. Moreover, we study the speed with which the activation spreads and provide tight upper bounds on the number of rounds it takes until all nodes are activated in such a graph. We also investigate what average degree asserts the existence of small contagious sets. For n >= k >= r, we denote by M(n,k,r) the maximum number of edges in an n-vertex graph G satisfying m(G,r)>k. We determine the precise value of M(n,k,2) and M(n,k,k), assuming that n is sufficiently large compared to k.

Submitted 17 November, 2015; v1 submitted 28 February, 2015; originally announced March 2015.

Comments: Extended version of the IWOCA'15 paper that generalizes the results on the minimum degree condition and the speed of the activation process to arbitrary values for the threshold parameter r

MSC Class: 05C35; 68R10

ACM Class: G.2.2
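
The bootstrap percolation process described in the abstract above can be sketched in a few lines. This is an illustration only, not the authors' code: the function names `activate` and `is_contagious`, the adjacency-dictionary representation, and the choice of synchronous rounds are assumptions made here.

```python
def activate(adj, seed, r):
    """Run bootstrap percolation with threshold r.

    adj  : dict mapping each vertex to an iterable of its neighbors
    seed : initially active vertices
    Returns (final active set, number of rounds until nothing changes),
    assuming all eligible vertices activate simultaneously in each round.
    """
    active = set(seed)
    rounds = 0
    while True:
        newly = {
            v for v in adj
            if v not in active
            and sum(1 for u in adj[v] if u in active) >= r
        }
        if not newly:
            return active, rounds
        active |= newly
        rounds += 1


def is_contagious(adj, seed, r):
    """A seed set is contagious if it eventually activates every vertex."""
    active, _ = activate(adj, seed, r)
    return len(active) == len(adj)


# Hypothetical examples: the complete graph K4 and a path on 4 vertices.
k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(is_contagious(k4, {0, 1}, r=2))    # seed of size r activates K4
print(is_contagious(path, {0, 1}, r=2))  # vertex 2 never sees 2 active neighbors
```

In the complete graph, any two seed vertices activate the rest in a single round with r = 2, which is the m(G,r) = r situation the abstract studies; on the path, the same seed stalls immediately.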

Showing 1–11 of 11 results for author: Freund, D