Approximating Optimum Online for Capacitated Resource Allocation
This work was done in part while the authors were visiting the Simons Institute for the Theory of Computing. Research supported in part by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project No. 437739576, NSF Awards CCF2209520, CCF2312156, and a gift from CISCO.
Abstract
We study online capacitated resource allocation, a natural generalization of online stochastic max-weight bipartite matching. This problem is motivated by ride-sharing and Internet advertising applications, where online arrivals may have the capacity to serve multiple offline users.
Our main result is a polynomial-time online algorithm which is (1/2 + κ)-approximate to the optimal online algorithm for a constant κ > 0. This can be contrasted to the (tight) 1/2-competitive algorithms to the optimum offline benchmark from the prophet inequality literature. Optimum online is a recently popular benchmark for online Bayesian problems which can use unbounded computation, but not “prophetic” knowledge of future inputs.
Our algorithm (which also works for the case of stochastic rewards) rounds a generalized LP relaxation from the unit-capacity case via a two-proposal algorithm, as in previous works in the online matching literature. A key technical challenge in deriving our guarantee is bounding the positive correlation among users introduced when rounding our LP relaxation online. Unlike in the case of unit capacities, this positive correlation is unavoidable for guarantees beyond 1/2. Conceptually, our results show that the study of optimum online as a benchmark can reveal problem-specific insights that are irrelevant to competitive analysis.
Contents
- 1 Introduction
- 2 Formal Problem Statement and Preliminaries
- 3 The Algorithm: A Two-Step Approach
- 4 Analysis: Beating a 1/2-Approximation
- 5 Analyzing the Sample-based Algorithm
- 6 Conclusion and Future Directions
- A Informative Examples and Observations
- B Deferred Proofs
- C Beyond Bernoulli Distributions
- D Stochastic Rewards
1 Introduction
We study an online capacitated allocation problem, in which users should be assigned to resources arriving online. Specifically, at each timestep , a new resource arrives and its capacity and the values for every user are sampled from a known distribution . Upon the arrival of a resource, we observe its realized capacity and values, and must irrevocably decide which users to allocate to it. Our goal is to maximize social welfare, i.e., the sum of the values of assigned user-resource pairs.
This problem naturally arises in a number of settings, for example in the context of ride-sharing: after a spike in demand (e.g. at the arrival of a flight, or at the end of a large concert), waiting passengers need to be assigned to cabs that become available online. Another example is online advertising, which initiated the vast literature on online Bayesian matching [FMMM09], where ads should be assigned to search queries arriving online. Further examples are abundant: the assignment of orders to trucks by a shipping fulfillment center, the procurement of goods for stores with limited inventories, etc. Our formulation goes beyond the intensely-studied setting where each online resource can be matched to at most one offline node (e.g. [BK10, HMZ11, MGS12, JL13, EFGT20, BSSX20, HS21, TWW22, HSY22]). In many cases resources have capacity larger than one; multiple passengers can share a cab and multiple ads can be displayed under a search query.
The literature has studied this problem from the “prophet inequality” perspective, designing algorithms which compare favorably to the optimum offline algorithm which sees all realizations upfront. In particular, for online capacitated allocation, it is possible to obtain 1/2 of the optimum offline benchmark [FGL15, DFKL20], and that is the best possible [KS78].
Still, comparing to the optimum offline algorithm as a benchmark might be too pessimistic in Bayesian settings. Its “prophetic” access to future realizations is unattainable for online algorithms (see [PPSW21] for further discussion). Therefore, a recent line of work (also including, e.g., [ANSS19, BDL22, DGR+23]) has shifted attention towards the following question: how well can we approximate the optimal (computationally unbounded) online algorithm in polynomial time?
In other words, how much must we lose when restricting to efficient algorithms instead of solving the optimal dynamic program? On the negative side, even for unit capacities it is PSPACE-hard to approximate the optimum online algorithm within some absolute constant [PPSW21].
Luckily, approximations strictly better than 1/2 exist for unit capacities: [PPSW21] gave a 0.51-approximate algorithm, later improved to 0.52 [SW21], 1 − 1/e [BDL22], and 0.652 [NSW23]. Motivated by this, we ask:
Can we obtain a better than 1/2-approximate algorithm to the optimal (computationally unbounded) online algorithm beyond unit-capacity allocations?
Our main result is to answer this question in the affirmative. In particular, we show that for online capacitated resource allocation problems, we can beat 1/2 by a constant.
Theorem 1.1.
For online capacitated allocation, there exists a polynomial-time (1/2 + κ)-approximation to the social welfare of the optimal online algorithm, for a constant κ > 0.
Interestingly, through the lens of prophet inequalities, the unit-capacity and the general capacity variant of the problem behave nearly identically. These variants (and more general ones) can all be handled by the same algorithmic template and techniques for the unit-capacity case directly carry over (for example, applying a 1/2-balanced online contention resolution scheme (OCRS) to each offline user). As we will discuss, studying capacitated resource allocation with the optimum online benchmark leads to technical challenges distinct from the unit-capacity case, and reveals differences that do not arise in competitive analysis. Our work hence gives evidence for the richness of studying optimum online as a benchmark.
We also provide an extension to the setting where allocations are probabilistically successful, motivated by our initial example of Internet advertising. As in the literature on online matching with stochastic rewards [MP12, HZ20, GU23], after displaying ads under a search request, typically the advertiser is only charged if the ad is eventually clicked. This is typically modeled as happening with known probability called the click-through-rate. We hence update our setting so that after allocating at most users to the resource , each user is successfully allocated with known probability . If an offline user is not successfully allocated, it remains available to be matched in future rounds; however, online arrivals do not get to adaptively pick new allocations.
Theorem 1.2.
For online capacitated allocation with stochastic rewards, there exists a polynomial-time (1/2 + κ)-approximation to the social welfare of the optimal online algorithm, for a constant κ > 0.
Note Theorem 1.1 is the special case of Theorem 1.2 in which all successes are deterministic, i.e. success probabilities for every , .
1.1 Our Techniques
Our algorithm rounds an LP relaxation online while introducing a controllable amount of positive correlation among offline users. For each online arrival , we apply two rounds of pivotal sampling to the unallocated offline nodes, to guarantee never “over-allocating” beyond its remaining capacity. In each round, we only randomly allocate a subset of this sampled group to avoid large positive correlation between users.
Throughout, in the main body of the paper, we focus on the special case where resource arrivals are “Bernoulli” (i.e., in step , resource with known capacity and known values arrives with probability and does not show up with probability .)
LP relaxation.
In order to bound the social welfare achieved by the optimum online algorithm, we will use a linear program (LP) relaxation with variables . The variables can be interpreted as the probabilities we would like an online algorithm to assign each user to resource (see Section 2). We require that at most users are allocated to resource in expectation, and also make use of an “online constraint” which does not hold for offline algorithms, as in [PPSW21, TT22]. In particular, for online algorithms, the arrival of resource and the event that user is unallocated at are independent. We account for stochastically successful allocations in this constraint and the LP’s objective via the independence of success along edges from an online algorithm’s allocation decisions.
A two-proposal algorithmic approach.
Our algorithm rounds an optimal LP solution online such that (i) for every resource , we do not allocate more users than its capacity, and (ii) every pair is successfully allocated with probability for a constant . To achieve guarantees (i) and (ii), we use a two-proposal algorithm inspired by the algorithm used for matching by [PPSW21]. For every resource , we run up to two rounds; in each we propose to a subset of users whose size does not exceed the remaining capacity and a random subset of users accepts.
While our LP relaxation and high-level framework are similar to [PPSW21], new ideas are needed for the specifics of the algorithm and (more importantly) its analysis. For example, when a new resource arrives, we would like to sample a subset of users such that is included with probability proportional to . In the matching case, summing over all users never exceeds one, and hence, the vector naturally forms a probability distribution over users. This is no longer true in our more general capacitated allocation problem. Naïvely sampling users independently with the given marginals has the issue that we might exceed the capacity of resource . We instead rely on the technique of pivotal sampling (also known as dependent rounding) to ensure that the sampled set of users never exceeds the capacity of resource and that sampled users are negatively correlated. Via the pivotal sampling subroutine we get a first proposal set of users. We note that in order to obtain this first proposal set, we apply the pivotal sampling in a history-agnostic way. That is, we may include previously successfully allocated users in the proposal set for resource at first. From the proposal set, we then randomly allocate each available user with some probability. This is essentially done according to a 1/2-balanced online contention resolution scheme (OCRS).
After this allocation process, there might remain a gap between the capacity of resource and the number of allocated users. We crucially exploit this gap by drawing a second proposal set of users by another call of the pivotal sampling subroutine with reduced marginal probabilities. The reduction is precisely to ensure that the capacity of resource is not exceeded. Afterwards, we probabilistically assign a subset of these users, with a carefully chosen downsampling function.
Analyzing the algorithm.
For the analysis, we distinguish for each pair whether it is assigned with probability at least already from the first proposal, or requires the second proposal to reach this threshold. In the first case, the analysis proceeds in a straightforward way via the calculations originally from the OCRS literature (see e.g. [EFGT20]).
For the remaining pairs , bounding their contribution to the social welfare requires analyzing the second proposal, and doing so is our main technical contribution. For the second proposal along to contribute to social welfare, clearly user needs to be unallocated just before arrives. Furthermore, we are required to reduce the marginal probability that is sampled in the second proposal depending on the number of allocated users in ’s first proposal. This number in turn depends on the availability of other users . In particular, if a user is already assigned before the arrival of resource , even when sampled as a first proposal, we cannot allocate the user. This increases the remaining capacity of resource , which is beneficial for the marginal reduction required in our second proposal.
Conversely, if we condition on other users being free before the arrival of resource , it leads to a larger decrease of the marginal probabilities in our second proposal. Still, this implies a decrease of the social welfare contribution of . The relevant technical question, then, is if we condition on being free before arrives (necessary for to contribute to social welfare), how much can the conditional probabilities of users being free increase? Equivalently, how significantly can the availabilities of offline users be correlated?
In the matching case this challenge was readily handled by showing negative correlation. While this is not possible for our problem (Section 1.2), we show that our algorithm obtains a good approximation if it can just avoid introducing “large” amounts of positive correlation. In the most technical part of our paper, we show our two-proposal algorithm achieves this by inductively tracking the availability of users over multiple rounds. In particular, we show the probability of both users being free at time is at most the product of the users’ individual probabilities of being free, multiplied by , for . Interestingly, the point at which we evaluate the function does not depend on user at all.
1.2 Capacitated Allocation Lacks Negative Correlation
Even in the case where every success probability equals 1, the potential positive correlation among offline users underlies the challenge for capacities exceeding 1. For example, a tempting naïve approach for general capacities is to directly reduce to the unit-capacity case: upon the arrival of a resource with capacity , model this as resources with unit capacities, and simply run the algorithms from prior work. Unfortunately, this fails; a crucial assumption of the relevant literature is that arrivals are independent across different rounds, and introducing positive correlation across arrivals can be extremely problematic for existing algorithms. For example, consider the natural generalization of the algorithm by [BDL22]: in round , let users propose to the arriving resource and allocate the proposing users with the highest values. Here the positive correlation introduced can create severe problems.
Observation 1.3.
For any , there exists an online capacitated allocation instance where the (generalized) algorithm of [BDL22] is no more than -approximate with respect to the welfare achieved by the optimal (computationally unbounded) online algorithm.
The formal proof can be found in Appendix A. The approach of [BDL22] is LP-based and one of the crucial steps is an upper bound on the probability that a subset of users is matched simultaneously. Intuitively speaking, their bound can be interpreted as a form of negative correlation among the offline users with respect to the LP variables. (It is possible to extend the algorithm of [BDL22] to one with the same approximation ratio which furthermore has full negative correlation between offline nodes.) Unfortunately, simple examples show that in our case positive correlation is required to go beyond an approximation ratio of 1/2.
Observation 1.4.
Any algorithm for online capacitated allocation which has an approximation ratio better than 1/2 with respect to (LPon) must create positive correlation between the events of offline users being available.
A formal version of the argument can be found in Appendix A. We note that the proof even rules out the “negative correlation with respect to the LP” shown by [BDL22], also used by follow-up work [NSW23]. In contrast to the line of work by [BDL22] and [NSW23], Papadimitriou et al. [PPSW21] gave a different algorithm for the unit-capacity case which operates in the mentioned “two-proposals framework” that has been successful for multiple problems in the online matching literature [FMMM09, MGS12]. Critically, their analysis shows that almost all of the matches create negative correlation of offline nodes (in fact, satisfying the very strong property of negative association). While our algorithm is inspired by the two-proposals framework, the example above demonstrates that there is no reasonable way to generalize this statement to the capacitated case while beating a 1/2-approximation.
1.3 Interlude on an Equivalent View: Online Combinatorial Auctions
Online capacitated resource allocation problems can also be interpreted in the context of online combinatorial auctions — a commonly studied setting in the prophet inequality literature, as in e.g. [FGL15, DFKL20, CC23] and many others. Here, online arrivals correspond to buyers and offline nodes are items. Our capacities translate to the assumption that each buyer has a -demand valuation function, interpolating between unit-demand and fully additive valuations.
In online capacitated allocation, we assume valuations are given upfront to the algorithm designer through a centralized planner. This view is (at first glance) less realistic for online combinatorial auctions: here we would expect buyers to report their own valuations, and would need to consider incentives. Luckily, applying recent work of [BHK+24], we can argue that our algorithm can be made dominant-strategy incentive-compatible (DSIC) if we bound the demand size of buyers by a constant (a reasonable assumption for motivating applications). In particular, Theorem 1.1 implies the following result which we formally prove in Section B.1.
Theorem 1.5.
Say every buyer samples a -demand valuation function, where is upper bounded by a constant. Then, for online combinatorial auctions, there exists a polynomial-time DSIC mechanism giving a -approximation to the social welfare of the optimal online algorithm.
Note that our main Theorem 1.1 does not require any upper bounds on the capacities . In particular, the capacities can be as large as the number of offline users. The upper bound on in the combinatorial auction interpretation is only required such that the reduction from [BHK+24] runs in polynomial time.
1.4 Additional Related Work
Online resource allocation problems have gained attention in the last decades due to a plethora of applications introduced by large marketplaces (see e.g. [Meh13]).
A particularly well-studied variety of such problems is online matching. As initiated by [KVV90], here we have a set of offline vertices and a set of vertices arriving online. Upon arrival, online nodes reveal a subset of offline nodes they could be matched to, and we can allocate at most one that is still available. [KVV90] give an online algorithm for this problem that achieves a (1 − 1/e)-approximation to the value of the best possible matching in hindsight. This guarantee was later extended to vertex-weighted instances, where offline vertices might have different values [AGKM11]. The case we consider where edges are only successful with known probability has also been studied in the literature, often going by online matching with stochastic rewards [MP12, MWZ15, HZ20, GU23, HJS+23]. When online nodes can adaptively attempt to “rematch” based on the successful status of edges, the problem is often called stochastic probing, and it has been studied in both online [BGL+12, AGM15, BMR20] and offline settings [CIK+09, Ada11, GKS19].
In the most general edge-weighted case, it is unfortunately impossible to obtain any constant-factor approximation for adversarial arrivals; a recent line of work studies the case where we relax the requirement of decisions being irrevocable [FHTZ20, GHH+21, BC21]. But in settings where allocations cannot be easily reversed, the only other option is to move beyond the pessimistic assumption of fully adversarial arrivals. The most natural way to do so is to consider the intermediate model of stochastic arrivals, a reasonable assumption for settings with large amounts of historical data available. There is a long line of work designing matching algorithms in such settings, including edge-weighted problems (e.g. [HMZ11, AHL12, BSSX16, EFGT20]) and vertex-weighted/unweighted problems (e.g. [FMMM09, MGS12, JL13, HS21, HSY22]). There is also very recent work studying correlated arrivals in online stochastic matching [AM23], showing guarantees of half against the offline benchmark when online nodes are independent across different types rather than arrival rounds.
Most of the literature on Bayesian online resource allocation problems focuses on competitive algorithms against the expected offline optimum, also called prophet inequalities. Originally introduced in the 70s and 80s by [KS78] and [SC84], statements of this form gained renewed attention in the past decades due to connections with mechanism design [HKS07, CHMS10, KW19]. In these mechanisms, a sequence of buyers arrives one-by-one and faces item prices, buying the most desirable feasible bundle. These mechanisms are incentive compatible and individually rational by design and lead to desirable approximation guarantees for the optimum achievable welfare. This explains the growth of the literature in this area in recent years [FGL15, DFKL20, DKL20, GW19, CCF+22, BK23]. For more details, we refer to the survey by Lucier [Luc17].
Typical problems studied in the literature are weighted bipartite matching (a.k.a. unit-demand combinatorial auctions) as well as its generalizations towards more general scenarios, such as XOS or subadditive valuations in combinatorial auctions [FGL15, DFKL20, DKL20, CC23]. Complementary work also considers feasibility constraints such as (poly-)matroids [DK15, KW19, CGKM20], knapsacks [DFKL20, JMZ22] and beyond [GHK+14, Rub16, BM19].
The paradigm of online contention resolution schemes (OCRS) has been an influential technique for proving prophet inequalities. Here, we start with an LP relaxation of the offline allocation problem and run a rounding procedure online while observing realizations one-by-one. Introduced by [FSZ16], this technique has been broadly applied, see e.g. [LS18, EFGT20, PRSW22, FLT+22, ACCB+23, MMG23]. The LP relaxation we use for our algorithm differs from standard OCRS settings as there are additional constraints in our LP which are only valid for online algorithms.
Online allocation has also been studied in the literature where offline nodes have capacities and can be allocated simultaneously in different rounds [Ala14, AHL13]. For example, [AHL12] study such a setting and derive competitive ratios against the offline benchmark which can be improved beyond 1/2 once there is a lower bound of at least 2 on the offline capacities. The literature has also considered the impact of reusability of offline nodes [FNS19, DSSX21, FNS22].
1.5 Paper Organization
In Section 2, we formally state our problem and review some preliminaries. In Section 3, we introduce our algorithm and argue that it is well-defined. Afterwards, in Section 4, we analyze the algorithm’s approximation ratio, the main technical contribution of our work. We conclude in Section 6 with some future directions suggested by our work. Appendix A contains a discussion of informative examples and observations for our problem. In Appendix B, we give proofs that are deferred from the main body.
In the first part of the main body of our paper, we prove a simpler result for ease of exposition; the remaining sections and appendices include the details required to prove our result in full generality. Our algorithm as stated in Section 3 requires an exponential-time computation; in Section 5 we analyze the natural Monte Carlo variant and hence provide a truly polynomial-time algorithm. Our algorithm in Section 3 also focuses on the special case of Bernoulli arrivals; in Appendix C we show how to extend our techniques to online arrivals with values and capacities drawn from general distributions. Finally, for simpler notation, when analyzing our algorithm we consider the special case where every success probability is one; in Appendix D we discuss the necessary changes to prove the result for arbitrary probabilities.
2 Formal Problem Statement and Preliminaries
In the following section, we will give a formal definition of a special case of our problem. For ease of exposition, in the first part of the main body of our paper we describe our algorithm and analysis for this special case, and list the additional details required to solve the general version only afterwards. We also will review some preliminaries including statements about our LP relaxation and the basics of pivotal sampling, an important ingredient for our algorithm.
Problem definition.
Recall that we defined the input to our problem as a set of users which are available offline. In addition, there is a set of resources which are revealed online in known order. In step , resource arrives (also noted as active) independently with known probability . In addition, value is user ’s value for being served by resource . Every user can be served by at most one resource; any resource can serve up to many users. We call the capacity of resource and emphasize that can be resource-specific, i.e. we allow different resources to have different capacities. Upon the arrival of resource , we observe the random realization if the resource is active, and can choose which users (if any) we would like to allocate to it, subject to the constraints that each user can be assigned to at most one resource and . If resource does not arrive, for convenience, we take .
Upon assigning , each is successfully allocated with probability independently. We denote the successful set by . More generally, the set denotes the set of successful allocations from some allocated set . Our objective is to maximize the expected social welfare, defined as .
Our goal is to design a polynomial-time approximation algorithm for this problem. An algorithm is a -approximation if for any instance of the problem, we have , where is the expected welfare achieved by the optimal online algorithm. The optimal online algorithm has unlimited computational power and also knows all distributions upfront, but only observes realizations one at a time and needs to make an irrevocable decision before observing the next realization. Formally, we can define via a Bellman equation. To this end, let denote the optimum gain achievable from resources with users available. Then, recursively we have
We recall that even in the case of unit capacities with deterministically successful assignments, it is PSPACE-hard to approximate the optimum online algorithm within some absolute constant factor [PPSW21].
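To make the recursive definition concrete, the following is a minimal sketch of the exponential-time dynamic program for tiny instances. It assumes Bernoulli arrivals and deterministically successful allocations (every success probability equal to one); the function name `optimum_online` and the argument layout (`q[t]` for the arrival probability, `c[t]` for the capacity, `v[t][i]` for the value of user `i` at resource `t`) are illustrative choices, not notation taken from the paper.

```python
from functools import lru_cache
from itertools import combinations


def optimum_online(q, c, v):
    """Expected welfare of the optimal online policy, by brute-force DP.

    Assumptions of this sketch: Bernoulli arrivals, success probability 1.
    q[t]    -- probability that resource t arrives
    c[t]    -- capacity of resource t
    v[t][i] -- value of user i for resource t
    """
    T = len(q)
    n = len(v[0]) if T else 0

    @lru_cache(maxsize=None)
    def opt(t, free):
        # free is a frozenset of users not yet allocated before step t
        if t == T:
            return 0.0
        skip = opt(t + 1, free)  # value if nobody is allocated at step t
        best = skip
        # Enumerate every subset of free users that respects the capacity.
        for size in range(1, min(c[t], len(free)) + 1):
            for chosen in combinations(sorted(free), size):
                gain = sum(v[t][i] for i in chosen)
                gain += opt(t + 1, free - frozenset(chosen))
                best = max(best, gain)
        # The decision is only available if the resource actually arrives.
        return q[t] * best + (1.0 - q[t]) * skip

    return opt(0, frozenset(range(n)))
```

The state space is exponential in the number of users, which is why this brute-force recursion only serves as a reference point for very small instances.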
LP relaxation.
We will use an LP relaxation of the optimum online algorithm which generalizes that for the unit-capacity and deterministic rewards case [PPSW21, BDL22, TT22]. It has a variable for every pair of a user and a resource .
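\begin{align*}
\text{(LPon)} \qquad \max \quad & \sum_{t} \sum_{i} v_{i,t} \cdot p_{i,t} \cdot x_{i,t} \\
\text{s.t.} \quad & \sum_{i} x_{i,t} \le q_t \cdot c_t && \text{for all } t \tag{1} \\
& 0 \le x_{i,t} \le q_t \cdot \Big( 1 - \sum_{t' < t} p_{i,t'} \cdot x_{i,t'} \Big) && \text{for all } i, t \tag{2}
\end{align*}
Here $x_{i,t}$ is the variable for user $i$ and resource $t$, $q_t$ is the arrival probability and $c_t$ the capacity of resource $t$, $v_{i,t}$ is the value of user $i$ for resource $t$, and $p_{i,t}$ is the corresponding success probability.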
This LP indeed relaxes the optimal online algorithm: set to be the marginal probability that this algorithm attempts to allocate to . Constraint (1) holds as any algorithm can only allocate at most users to resource if it arrives. Constraint (2) only holds for online algorithms: the event of users being not yet successfully allocated at step and the event of resource arriving are independent. We note it implies the natural constraint that . (Indeed, we can apply Constraint (2) to and observe )
Observation 2.1.
The optimum objective value of (LPon) upper bounds the gain of optimum online, i.e., .
For completeness the short formal proof is included in Section B.2.
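As an illustration of the two constraints, the sketch below checks a candidate fractional solution and evaluates the objective. It assumes Constraint (1) reads "the mass assigned to a resource is at most its arrival probability times its capacity" and Constraint (2) reads "the mass on a pair is at most the arrival probability times the probability that the user has not yet been successfully allocated", as the prose describes; the function name `lp_on_check` and the array layout are hypothetical.

```python
def lp_on_check(x, q, c, p, v, tol=1e-9):
    """Return (feasible, objective) for a candidate solution x.

    Assumed layout (illustrative, not the paper's notation):
    x[t][i] -- LP variable for user i and resource t
    q[t]    -- arrival probability of resource t
    c[t]    -- capacity of resource t
    p[t][i] -- success probability of allocating user i to resource t
    v[t][i] -- value of user i for resource t
    """
    T = len(q)
    n = len(x[0]) if T else 0
    used = [0.0] * n  # probability that user i was successfully allocated earlier
    feasible = True
    objective = 0.0
    for t in range(T):
        # Constraint (1): expected number of allocated users at most q_t * c_t.
        if sum(x[t]) > q[t] * c[t] + tol:
            feasible = False
        for i in range(n):
            # Constraint (2): arrival of t is independent of user i being free.
            if x[t][i] < -tol or x[t][i] > q[t] * (1.0 - used[i]) + tol:
                feasible = False
        for i in range(n):
            used[i] += p[t][i] * x[t][i]
            objective += v[t][i] * p[t][i] * x[t][i]
    return feasible, objective
```

For a single user and two unit-capacity resources that each arrive with probability 0.5, the solution (0.5, 0.25) passes this check, while (0.5, 0.5) violates the online constraint even though it would be feasible for a purely offline relaxation.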
Generalized problem definition.
In the above problem definition, we made the simplifying assumption that the resource arriving at time has a simple “Bernoulli” distribution determining if it is active or not. In the general model, in every round, a resource randomly realizes one of many possible pairs of valuation vectors to the users and capacities. Formally, in our general model, resource realizes one of possible capacities together with a vector of values , where each realization is sampled with probability . We highlight that capacities and values during a single round can be arbitrarily correlated, although across different rounds we assume independence. In Appendix C we argue that our LP, algorithm, and analysis extend to such general settings as well.
2.1 Pivotal Sampling
As a part of our online algorithm we invoke the randomized offline rounding framework of pivotal sampling (also called Srinivasan rounding and dependent rounding) [Sri01, GKPS06]. Imagine we are given marginals $x_1, \dots, x_n$ with each $x_i \in [0,1]$ and $\sum_{i} x_i \le k$ for some positive integer $k$. We would like to randomly select at most $k$ indices from $[n]$ such that $i$ is selected with probability $x_i$. Pivotal sampling selects such a subset while also guaranteeing strong negative correlation properties between individual indices. It does so by sequentially choosing a pair of fractional marginals, and applying a randomized “pivot” operation that makes at least one integral. We formally state some of the properties of the algorithm below which suffice for our analysis.
Theorem 2.2 (as in [Sri01]).
The pivotal sampling algorithm with input $x = (x_1, \dots, x_n)$ where $\sum_{i} x_i \le k$ efficiently produces a random subset of $[n]$, denoted $\mathsf{PS}(x)$, with the following properties:
- (P1) For every $i \in [n]$, we have $\Pr[i \in \mathsf{PS}(x)] = x_i$.
- (P2) The number of elements in $\mathsf{PS}(x)$ is always at most $k$.
- (P3) (Negative cylinder dependence) For any $S \subseteq [n]$, we have $\Pr\big[\bigwedge_{i \in S} (i \in \mathsf{PS}(x))\big] \le \prod_{i \in S} x_i$ and $\Pr\big[\bigwedge_{i \in S} (i \notin \mathsf{PS}(x))\big] \le \prod_{i \in S} (1 - x_i)$.
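The pivot operation can be sketched as follows. This is a minimal illustration of pairwise pivoting in a fixed index order, not necessarily the exact procedure of [Sri01]; the function name `pivotal_sample` and the tolerance `eps` are assumptions of the sketch. Each pivot keeps the sum of the two marginals unchanged and preserves each marginal in expectation.

```python
import random


def pivotal_sample(x, rng=random, eps=1e-12):
    """Sample a subset of indices with marginals x via pairwise pivots.

    rng needs a .random() method (the random module or a random.Random).
    Values within eps of 0 or 1 are treated as integral.
    """
    x = list(x)
    frac = [i for i, val in enumerate(x) if eps < val < 1.0 - eps]
    while len(frac) >= 2:
        i, j = frac[0], frac[1]
        a, b = x[i], x[j]
        if a + b <= 1.0:
            # Move all mass onto one index; the other drops to 0.
            if rng.random() < a / (a + b):
                x[i], x[j] = a + b, 0.0
            else:
                x[i], x[j] = 0.0, a + b
        else:
            # Round one index up to 1; the other keeps the leftover mass.
            if rng.random() < (1.0 - b) / (2.0 - a - b):
                x[i], x[j] = 1.0, a + b - 1.0
            else:
                x[i], x[j] = a + b - 1.0, 1.0
        still_fractional = [k for k in (i, j) if eps < x[k] < 1.0 - eps]
        frac = still_fractional + frac[2:]
    selected = {i for i, val in enumerate(x) if val >= 1.0 - eps}
    # At most one fractional entry can remain; round it as a Bernoulli.
    if frac and rng.random() < x[frac[0]]:
        selected.add(frac[0])
    return selected
```

Because every pivot preserves the total mass, a vector whose entries sum to at most an integer k can never yield more than k selected indices, which is the behaviour stated in (P2).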
3 The Algorithm: A Two-Step Approach
We begin with a short description of our algorithm, before presenting the pseudocode in Algorithm 1. First we fix some useful definitions: we say user is “allocated to ” if it is one of the at most users served by the resource, and “successfully allocated to ” if it is allocated to and is successful (recall this is with probability ). We say user is “free at ” or “available at ” (or “free”/“available”, if the context is clear) if just before the arrival of resource , user has not yet been successfully allocated to any previous resource.
Our algorithm uses an optimal solution to (LPon) as input. After observing if resource arrives, if so, we sample a set of at most users (denoting the first proposal for ) using pivotal sampling, such that each user is selected with marginal probability . For every user , if is still available, we toss a coin independently with probability , and allocate user to resource if this coin toss is successful.
After this procedure, we have a number of users allocated to resource , where is a random variable which can take values in . In order to make use of the remaining space in the demand size of resource , we allow to make a second proposal. Again via the pivotal sampling subroutine, this time with a reduced marginal probability of for every user , we sample a set of users , denoting the second proposal with size at most . Among these users, we consider only those for which , was free at , and was not yet allocated to . For each such user , we allocate to with probability . The factor is chosen in a way to ensure that , i.e., such that we don’t overmatch any .
Concerning the definition of , we note that the expectation is over the randomness in the arrivals and algorithm up to when it reaches Line 8 for arrival in Algorithm 1 (in particular, we consider “re-running” the algorithm as defined thus far on a fresh instance). The indicator refers to the event that was not successfully allocated to some and is also not allocated yet to (it could be the case that was allocated to some , and this was unsuccessful). This indicator is potentially correlated with the number of allocated users .
The in the definition of is for convenience only; in particular, it is thus easy to see that the algorithm is well-defined. As a crux of our analysis, we will show that using ensures that the in the definition of is actually redundant.
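The pivotal sampling subroutine used for both proposals can be illustrated with a minimal sketch. The code below is one standard sequential form of pivotal sampling, written under our own assumptions; it is not taken from Algorithm 1 or from [Sri01], and the names `pivotal_sample`, `marginals`, and `rng` are hypothetical. It shows the two properties the description above relies on: each index is included with exactly its marginal probability, and the sample size never exceeds the sum of the marginals rounded up.

```python
import random


def pivotal_sample(marginals, rng):
    """Sample a set of indices whose inclusion probabilities equal `marginals`.

    Two fractional entries "duel" until at most one fractional entry is left;
    that last entry is rounded by a single coin flip.
    """
    p = [float(x) for x in marginals]
    tol = 1e-12
    chosen = set()
    pivot = None  # index of the single entry that is currently fractional

    for i in range(len(p)):
        if p[i] <= tol:
            continue  # entry is (numerically) zero: never selected
        if p[i] >= 1.0 - tol:
            chosen.add(i)  # entry is (numerically) one: always selected
            continue
        if pivot is None:
            pivot = i
            continue
        a, b = p[pivot], p[i]
        if a + b <= 1.0:
            # One entry absorbs the whole mass a + b, the other drops to 0.
            if rng.random() < a / (a + b):
                p[pivot], p[i] = a + b, 0.0
            else:
                p[pivot], p[i] = 0.0, a + b
                pivot = i
        else:
            # One entry is rounded up to 1, the other keeps the excess mass.
            if rng.random() < (1.0 - b) / (2.0 - a - b):
                p[pivot], p[i] = 1.0, a + b - 1.0
                chosen.add(pivot)
                pivot = i
            else:
                p[pivot], p[i] = a + b - 1.0, 1.0
                chosen.add(i)
        # The surviving pivot may itself have become integral.
        if p[pivot] >= 1.0 - tol:
            chosen.add(pivot)
            pivot = None
        elif p[pivot] <= tol:
            pivot = None

    # Round the last fractional entry, if any, with one independent coin flip.
    if pivot is not None and rng.random() < p[pivot]:
        chosen.add(pivot)
    return chosen
```

Each duel preserves the expected value of both entries and the sum of the vector, which is why the marginals are exact and the number of selected indices is at most the rounded-up sum. Calling it with a seeded generator such as `random.Random(0)` makes runs reproducible.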
In the remainder of this section, we will argue that Algorithm 1 is well-defined and guarantees to respect the capacity constraints of online resources.
Observation 3.1.
Algorithm 1 is well-defined.
Proof.
Note first that in Line 4, our call to the pivotal sampling algorithm is well-defined as each marginal is in by LPon Constraint (2). Each as defined in Line 7 is clearly a probability by construction. Our second call to is similarly well-defined. Note that is always a probability: if , it implies that by definition. This in turn shows that is always in the interval .
Finally, note that user is allocated only if available, and hence never successfully allocated to two different resources (or to the same resource twice). ∎
We also have that our algorithm respects capacity constraints for each online arrival.
Observation 3.2.
The number of users allocated to resource by Algorithm 1 is always at most .
Proof.
We also note that every line except Line 12 can be implemented in polynomial time. Indeed, Line 2 can be run efficiently as (LPon) has polynomial size, and our calls to pivotal sampling can be implemented efficiently [Sri01].
Line 12 requires exponential time as written. For ease of presentation, in the next section we analyze this exponential-time algorithm. In Section 5 we show that we can replace this computation with a sample average and appeal to concentration bounds, while only losing an arbitrarily small in the approximation ratio. The main point of care is to argue that is bounded away from 0 so that we can get a close multiplicative approximation.
4 Analysis: Beating a 1/2-Approximation
Our main result is as follows.
Theorem 4.1.
For , the social welfare achieved by Algorithm 1 satisfies
This section is dedicated to the proof of our main result. As mentioned before, we analyze the algorithm which has access to the expectation exactly. Note that this requires exponential time; however, in Section 5 we show that our sampling-based estimation only results in an additional loss of in the approximation. To prove this, we will rely on a consequence of our analysis, namely that the quantity is always bounded away from zero by some constant. Using this, we can apply standard Chernoff-Hoeffding concentration bounds to get reasonably close to the exact within small multiplicative error.
To simplify the exposition, we will additionally assume that every allocation is successful, i.e., each success probability equals 1. In Appendix D we outline the necessary steps to generalize our analysis to the case of arbitrary success probabilities .
Outline.
Before diving into details, we outline the ingredients in our proof of Theorem 4.1. First, we note that by Observation 3.2, the size of (the set of users allocated to ) is always at most , so
Bounding the term naturally brings us into one of two cases. If is such that , the allocation of to can only happen in Line 7 of our algorithm, and consequently it is straightforward to bound the resulting welfare (which we do in Observation 4.4). We then turn to pairs with a subsampling probability ; for these, the analysis requires much more care. Again, we start by considering the contribution of allocating via a first proposal in Lemma 4.5 (i). Here the first proposal alone is not sufficient, and we must compensate for this via a suitable bound on the allocation probability via a second proposal. We do so by proving Lemma 4.5 (ii), which gives a sufficient lower bound on the contribution via a second proposal. This is the main technical contribution and will use lemmas analyzing the evolution of the correlation between offline users in Section 4.3.
Notation.
For convenience, we let Note that exactly when . We hence define as this threshold for after which the subsampling probability becomes one. If for resource and user we have , then we call the pair early. Otherwise, we call the pair late. In addition, we define as the set of all pairs such that user was allocated to resource in Line 7, and as the set of all pairs such that was allocated to in Line 14.
As is not allocated more than once in our algorithm, we quickly observe the following claim.
Observation 4.2.
For any resource , we have
To analyze the probabilities and , we consider two separate cases based on whether is early (Section 4.1) or late (Section 4.2).
4.1 Analysis for Early Pairs
It will be crucial to bound the probability of a user being free at time . We denote the event that user is free or available (i.e., not allocated) at the arrival of resource by . The following observation gives an expression of the probability with respect to the LP variables. It is crucial to note that if a pair is early, so is every pair with .
Observation 4.3.
For early pairs , we have .
Proof.
We proceed via induction on . Before the arrival of the first resource, the claim is trivially true, as all users are available with probability one. Afterwards, note that
(3) |
as ’s arrival, being included in , and the algorithm’s coin flip are mutually independent events. If is early, then , so we have
where we also use the induction hypothesis for the probability of the user being free at the arrival of resource . For early , we also clearly have , so
As a consequence we can bound the contribution of an early pair to and , as follows.
Observation 4.4.
For early pairs , and .
Thus for early pairs , our algorithm achieves the desired allocation probability.
4.2 Analysis for Late Pairs implies Theorem 4.1
For late pairs, we show the following lemma which will be sufficient to prove our main Theorem 4.1.
Lemma 4.5.
For late pairs , the following two statements hold:
-
(i)
, and
-
(ii)
.
We note that this immediately implies our main result.
Proof of Theorem 4.1.
Thus, it remains to prove Lemma 4.5. Our analysis here requires significantly more care as it must bound the gain from the second proposal. As the second proposal’s marginal probabilities are dependent on which offline users were allocated in the first proposal, a complete analysis must consider the correlation introduced.
4.2.1 Proof of Lemma 4.5 (i)
As for early pairs, the remainder of our proof will proceed by induction on . Thus, for every late pair with , by the inductive hypothesis we have . Recall also that for every early pair we know from Observation 4.4 that . Thus, we may assume that for the late pair being considered we have
(4) |
With this, bounding the probability of allocation along a first proposal is very straightforward.
Proof of Lemma 4.5 (i).
Note that
(Equation (3)) | ||||
(Equation 4) | ||||
4.2.2 Proof of Lemma 4.5 (ii)
We begin by bounding for late pairs , in the natural way which depends on the number of allocated users during the first proposal in Line 7. (Recall that this is because for second proposals, we reduce the marginal probabilities for the pivotal sampling algorithm by a factor of ). Note that for to be matched as a second proposal we need all of the following to happen: (i) should arrive, (ii) must be available and unallocated after Line 8, and included as a second proposal, and (iii) the potential match should survive the final downsampling by . This lets us observe
(5) |
For the second equality, we relied on Property (P1) of pivotal sampling, which guarantees that individual elements are sampled with exactly their marginal probability. Note that this marginal probability is random, and potentially correlated with .
Recall that . If the here is redundant, we are immediately done; this is concretized in the following observation.
Observation 4.6.
If , then
Thus it suffices to show that the hypothesis of this observation holds. In other words, for the remainder of the proof, the only thing we need to show is the following proposition.
Proposition 4.7.
For any late pair , we have .
As a first step, we start with the following lower bound on .
Lemma 4.8.
For late pairs ,
Proof of Lemma 4.8.
Note first that we can expand
Note that as the pair is late, we have . Hence, conditioned on being free and the arrival of resource , user is not allocated in Line 7 if and only if it is not contained in the set . This allows us to bound
To reason about the resulting expectation, we first apply the following bounding to remove the conditioning on :
In addition, note that as pair is late. Thus we get
In order to exploit the bound obtained in Lemma 4.8, we need to control . In particular, our goal is to show that is bounded away from by a multiplicative constant smaller than . If there was no conditioning on , it is easy to check that
The conditioning could however lead us into trouble in the following way: When facing the conditioning, we end up with the expression
If implies for every , and for every , then
where the right-hand side could equal . This, in particular, would make the second proposal in our algorithm completely useless as we would reduce the marginal probabilities for the pivotal sampling in Line 9 to (almost) zero. The most crucial part of our analysis is to demonstrate that this cannot happen, by bounding the possible positive correlation introduced between offline users.
Lemma 4.9.
For any distinct users and , and , for any we have
The proof of Lemma 4.9 is deferred to Section 4.3; in the remainder of this section we demonstrate why it implies our bound on the approximation ratio. We note that for (the value we choose in Algorithm 1), we have . As a concrete example, note that if and are both late with , this bound quantifies that we avoid perfect positive correlation between and .
Having Lemma 4.9, we can prove the bound on which we state formally in Corollary 4.10 via
(7) | ||||
The last inequality uses the fact that and upper bounds by . By the online constraint (2) and the property that for late pairs , we have that . Hence, we can conclude that
(8) | ||||
Although this appears quite loose if is larger than 1, in Section A.4 we show that a fine-grained bound in terms of only results in limited improvements in the analysis. Equation 8 implies the following corollary of our correlation bound.
Corollary 4.10.
Let . For any late we have
We are now able to conclude the proof of Lemma 4.5 (ii), as follows.
Proof of Lemma 4.5 (ii).
For convenience let , recalling that is a function of . Then, it suffices to show , or equivalently
For , we can confirm that the coefficient of on the right-hand side is positive, and hence it suffices to show this inequality when . This reduces to
which is readily confirmed by direct computation at . ∎
As a side remark, using Equation 9, we can observe that for our choice of , the expectation is bounded away from zero by a constant. In particular, for , we have that . This can be used to estimate via sampling with small multiplicative error, as we formalize in Section 5.
In order to finalize our proof of Lemma 4.5 (ii), it only remains to prove our bound on the correlation introduced between offline users, which we do in the following section.
4.3 Bounding the Correlation — Proof of Lemma 4.9
What remains to conclude the proof of our main Theorem 4.1 is to control the correlation of two users and to be free simultaneously, i.e., the bound from Lemma 4.9. To this end, we first state and prove Lemma 4.11 which uses the assumption that and are at most .
Lemma 4.11.
Define . For any distinct users and , and any time such that , we have
To prove this lemma we consider the function
which depends on our choice of . Note that . For this function, we can prove the following claim.
Claim 4.12.
For any distinct users and , and any time such that , we have
where .
In order to prove Lemma 4.11 from Claim 4.12, it suffices to note that is a monotone increasing function in , and hence, for all .
Proof of Claim 4.12.
We give a proof by induction. As and all users are available initially, the base case is clear. Assuming the claim is true for fixed , we will prove it for with the assumption .
Proof outline for the inductive step.
Our proof proceeds with the following steps:
- (S1) We find an upper bound for the probability that both and are not assigned to via a first proposal conditioned on being free.
- (S2) We compute , in order to apply the inductive hypothesis.
- (S3) We apply the induction hypothesis, and use Step (S2) to write our bound in terms of and .
- (S4) We argue that we can upper bound the coefficient in front of with .
Step (S1): Bounding the probability of not assigning both users via a first proposal.
As , they can only be matched as first proposals; hence the probability both and are free at time is
(10) |
The first term on the right-hand side of Equation 10 will later be bounded via the induction hypothesis. The second term can be equivalently written as
(11) | ||||
Now, observe that . The analogous equality holds for . Hence, it remains to get a suitable bound on the joint probability that both users and are assigned via a first proposal given they were both free. To this end, we make use of the negative cylinder dependence in pivotal sampling, observing
(Pivotal Sampling Property (P3)) | ||||
Combining all of the above, we can bound the conditional probability that neither nor is allocated to via a first proposal. In other words, the left-hand side of Equation 11 is at most
(12) | ||||
Step (S2): Comparing to .
To prepare for our use of the inductive hypothesis, we compute via a straightforward calculation:
(13) |
In the final line we used that is early. For , we analogously have
Step (S3): Applying the induction hypothesis.
Applying the induction hypothesis to Equation 10, plugging in Inequality (12) and using Equation 13, we can bound
(14) |
Here, the first inequality uses Inequality (12) from Step (S1), i.e., the upper bound on the probability of both users not being allocated via a first proposal. The second inequality applies the induction hypothesis for , and the last equality uses Equation 13 from Step (S2) for both users and and rearranges terms.
We now bound the second summand of (14), via the following inequality.
Fact 4.13.
For any we have
(15) |
Proof.
By Constraint (2) of the LP, we have that Thus it suffices to show that
which is equivalent to
As , the claim follows. ∎
We can apply Fact 4.13 to user and combine it with Equation 13 in order to bound the second summand via
Overall, we thus have
Step (S4): Upper bounding the coefficient by .
In order to complete the inductive step, we would like to show that
First, note that as we only consider early pairs, is always equal to , so we know Thus to conclude the proof, it suffices to show that
This is a consequence of our definition
In particular the following claim, whose proof can be found in Section B.3, completes the inductive step.
Claim 4.14.
For any with and as stated above, we have
This concludes the proof of Claim 4.12. ∎
Now, we can finally prove Lemma 4.9 which concludes the proof of our main Theorem 4.1. Let us restate Lemma 4.9 and prove it afterwards.
See Lemma 4.9.
Proof of Lemma 4.9.
We assume that both and ; if neither inequality holds the result is clear and follows directly from Lemma 4.11 while if just one holds the proof proceeds nearly identically with a slightly better guarantee.
Let denote the latest resource in such that and and similarly let denote the latest resource in such that and .
Let denote the event that is allocated to some arrival in and let denote the event that is allocated to some arrival in . By the hypothesis that for all , we have
An analogous upper bound holds for .
To simplify notation, let us assume for a moment that (if , simply swap the roles of and in the following line). We apply Lemma 4.11 to get
(via Lemma 4.11) | ||||
In this expression, we aim to combine the last two factors concerning the events if item is free at some point in time. To this end, observe that
where the last equality uses the same ideas as Step (S2) in the proof of Claim 4.12. So, overall, we have
(16) |
With this in mind, we are ready to prove the final statement as
(via Equation 16) | ||||
where in the last inequality we used and the last equality applies . ∎
5 Analyzing the Sample-based Algorithm
To update Algorithm 1 to run in polynomial time, instead of computing the exact value of , we estimate it with polynomially many samples. For simplicity, we present the algorithm and its analysis for Bernoulli arrivals when every success probability equals 1 (the relevant changes needed for the generalizations are described in Appendix C and Appendix D, respectively). The pseudocode is presented below; observe that we reduce the constant by an arbitrarily small in Line 1.
As before, the definition of is over the randomness in the arrivals and algorithm up to the point when it reaches Line 8 for arrival in Algorithm 1, with the previously computed values of for . In particular, we do not recalculate these, but rather inductively use them as defined previously. This is why we use the shorthand of “conditioning on ” when defining .
We start with the observation that our algorithm is unchanged for early pairs . In particular, the following lemmas still hold for Algorithm 2.
See Observation 4.4.
See Claim 4.12.
In the remainder of the analysis, we will need to track the errors incurred by sampling. Note that by the Chernoff-Hoeffding bound, if is bounded away from 0 then the empirical average will be within a close multiplicative factor.
Observation 5.1.
If then we have that with probability at least
Proof.
We straightforwardly bound
Thus with probability at least , we have
The observation follows directly. ∎
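The sample-size calculation behind an observation of this type can be sketched as follows. This is a minimal illustration under our own assumptions, not the paper's exact statement: we assume the estimated quantity is the mean of i.i.d. samples scaled to lie in [0, 1], that the mean is at least a known constant `lower_bound`, and we use the two-sided Hoeffding bound. The function names and parameters are hypothetical.

```python
import math
import random


def samples_for_multiplicative_error(eps, lower_bound, delta):
    """Number of [0, 1]-valued i.i.d. samples so that, with probability at
    least 1 - delta, the empirical mean is within a (1 +/- eps) factor of the
    true mean, assuming the true mean is at least `lower_bound`.

    Hoeffding: Pr[|avg - mu| >= t] <= 2 * exp(-2 * n * t^2). Choosing
    t = eps * lower_bound <= eps * mu turns additive into multiplicative error.
    """
    t = eps * lower_bound
    return math.ceil(math.log(2.0 / delta) / (2.0 * t * t))


def empirical_mean(draw, n, rng):
    """Average of n independent calls to draw(rng)."""
    return sum(draw(rng) for _ in range(n)) / n
```

The required number of samples grows like the inverse square of `lower_bound`, which is why the analysis needs the estimated expectation to be bounded away from zero by a constant: only then is the sample count polynomial for a fixed target accuracy.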
We now show inductively that our algorithm allocates each with probability close to the idealized value of from the exact (exponential-time) calculations. In particular, we show that we achieve a value of where the error accumulates only linearly in .
Lemma 5.2.
For any online arrival , with probability at least , we have for every that
(17) |
Note that once we have Lemma 5.2, it is immediate to bound the gain of Algorithm 2. In particular, the social welfare achieved by Algorithm 2 is with probability at least lower-bounded by
Note that for a realization of Algorithm 2, we can estimate its gain within a small multiplicative error factor by simulating it over polynomially-many independently sampled arrival sequences. Thus, this guarantee can be obtained with high probability, and it only remains to prove Lemma 5.2.
Proof of Lemma 5.2.
By induction on . We consider only the case where the lemma’s statement holds for all , and note this is with probability at least by the inductive hypothesis. Note that for any such that is early, we are done by Observation 4.4.
For convenience of notation, let denote the error accumulated up to time . Using this notation, we can apply the inductive hypothesis to bound
(18) |
Hence the probability late is allocated as a first pick satisfies
where we used the induction hypothesis for bounding .
By Equation (5) the probability is allocated as a second pick is given by
(19) |
As before, we aim to show that is bounded away from . Note that analogously to Equation 6, we have
(20) |
To bound the conditional expectation , we will (as before) upper bound the joint probability , analogously to Lemma 4.9. Here, the main contribution is from Claim 4.12; the probability mass from late edges does not greatly affect it for small , even when taking into account the possible error introduced by sampling. As our algorithm is unchanged along early edges, the proof from the body of the paper goes through in a very similar fashion, which we formalize below.
We assume that both and (if neither, or just one of these inequalities holds, the proof proceeds nearly identically with better bounds). Let denote the latest resource in such that and and similarly let denote the latest resource in such that and . Let denote the event that is allocated to some arrival in and let denote the event that is allocated to some arrival in . Using the hypothesis that for all , we have
An analogous upper bound holds for . For convenience, let us define . With this, we can bound
where in the last inequality, we first use a lower bound on and of and defined
Now, following the calculation of Equation 7, we have
For the final inequality, we are using by our hypothesis and substituting Using that as is late, we have
(21) |
as in Corollary 4.10.
Now, starting from Equation 20 and using Equation 21, we note
where the second inequality uses and the last inequality is a straightforward calculation for sufficiently small . The third inequality holds only for small and , and verifying it requires some slightly tedious calculations. For example, we can upper bound by noting that for sufficiently small and .
Thus
for We loosely bound for sufficiently small. So we just need to show Using that it suffices to show Recalling that , we note this reduces to a single-variable inequality in only . This is not easy to show directly, as it crucially is true for the magic constant , but can readily be shown by computer verification. Indeed, the RHS and LHS are easily seen to be 100-Lipschitz as functions of , say, so we confirm the RHS is at least larger than the LHS on a grid of points on .
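The grid-plus-Lipschitz verification described above can be sketched generically. The code below is an illustration under our own assumptions and does not reproduce the paper's actual inequality: `gap` stands for the difference RHS minus LHS as a function of the single variable, `lipschitz` is a Lipschitz constant for that difference (if both sides are L-Lipschitz, the difference is 2L-Lipschitz), and all names are hypothetical.

```python
def verified_positive_on_interval(gap, lo, hi, lipschitz, num_cells):
    """Certify gap(x) > 0 for all x in [lo, hi] from finitely many evaluations.

    Every point of [lo, hi] is within step / 2 of some grid point, so a
    `lipschitz`-Lipschitz function can drop by at most lipschitz * step / 2
    below its smallest grid value. If the smallest grid value exceeds that
    slack, the function is positive on the whole interval.
    """
    step = (hi - lo) / num_cells
    worst = min(gap(lo + k * step) for k in range(num_cells + 1))
    return worst > lipschitz * step / 2.0
```

A return value of `False` does not disprove the inequality; it only means the grid is too coarse for the given Lipschitz constant, or the margin is too small. A rigorous check would also need to account for floating-point rounding when evaluating `gap`, for example by demanding a slightly larger margin.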
Hence, is bounded away from and we can apply Observation 5.1: for any fixed such that is late, we have with probability at least that Note that in this case we have
Recall Note
where the final (loose) inequality follows as and . This implies
(Equation 19) | ||||
and similarly
Then, we have
where the last inequality uses that the coefficient of above is , which is non-positive. Hence we can bound
We also have the analogous upper bound
By the union bound, with probability at least these two bounds hold for all with late. Via the inductive hypothesis, our starting assumption occurred with probability at least . Hence, by a final application of the union bound, we have that our desired property for arrivals holds with probability at least
∎
6 Conclusion and Future Directions
We gave the first algorithm achieving an approximation ratio strictly better than 1/2 for capacitated online resource allocation, when comparing to the (computationally inefficient) optimum online algorithm. Our algorithm crucially limits the (necessary) positive correlation between offline users, which we analyzed via an inductive bound depending on the total LP flow sent to an individual user. This challenge does not arise in competitive analysis, and lends credence to the value of the optimum online as a complementary benchmark to the prophet.
Numerous directions for future research are suggested by our work. Can our guarantee of for be improved, perhaps by rounding stronger LPs? Is there a better tradeoff possible between the amount of positive correlation we introduce for early arrivals and the approximation ratio possible on late ones?
Finally, we believe the techniques developed for handling positive correlation may prove useful for future generalizations. The prophet inequalities literature has studied more general settings than capacitated allocation where the tight 1/2-guarantee is known [FGL15, DFKL20], and our work gives some evidence that it is possible to get an improved approximation ratio against the online benchmark for these problems as well.
References
- [ACCB+23] Vashist Avadhanula, Andrea Celli, Riccardo Colini-Baldeschi, Stefano Leonardi, and Matteo Russo. Fully dynamic online selection through online contention resolution schemes. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, AAAI’23/IAAI’23/EAAI’23. AAAI Press, 2023.
- [Ada11] Marek Adamczyk. Improved analysis of the greedy algorithm for stochastic matching. Information Processing Letters (IPL), 111(15):731–737, 2011.
- [AGKM11] Gagan Aggarwal, Gagan Goel, Chinmay Karande, and Aranyak Mehta. Online vertex-weighted bipartite matching and single-bid budgeted allocations. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1253–1264, 2011.
- [AGM15] Marek Adamczyk, Fabrizio Grandoni, and Joydeep Mukherjee. Improved approximation algorithms for stochastic matching. In Nikhil Bansal and Irene Finocchi, editors, Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings, volume 9294 of Lecture Notes in Computer Science, pages 1–12. Springer, 2015.
- [AHL12] Saeed Alaei, MohammadTaghi Hajiaghayi, and Vahid Liaghat. Online prophet-inequality matching with applications to ad allocation. In Proceedings of the 13th ACM Conference on Electronic Commerce (EC), pages 18–35, 2012.
- [AHL13] Saeed Alaei, MohammadTaghi Hajiaghayi, and Vahid Liaghat. The online stochastic generalized assignment problem. In Prasad Raghavendra, Sofya Raskhodnikova, Klaus Jansen, and José D. P. Rolim, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 16th International Workshop, APPROX 2013, and 17th International Workshop, RANDOM 2013, Berkeley, CA, USA, August 21-23, 2013. Proceedings, volume 8096 of Lecture Notes in Computer Science, pages 11–25. Springer, 2013.
- [Ala14] Saeed Alaei. Bayesian combinatorial auctions: Expanding single buyer mechanisms to many buyers. SIAM Journal on Computing (SICOMP), 43(2):930–972, 2014.
- [AM23] Ali Aouad and Will Ma. A nonparametric framework for online stochastic matching with correlated arrivals. In Kevin Leyton-Brown, Jason D. Hartline, and Larry Samuelson, editors, Proceedings of the 24th ACM Conference on Economics and Computation, EC 2023, London, United Kingdom, July 9-12, 2023, page 114. ACM, 2023.
- [ANSS19] Nima Anari, Rad Niazadeh, Amin Saberi, and Ali Shameli. Nearly optimal pricing algorithms for production constrained and laminar Bayesian selection. In Proceedings of the 20th ACM Conference on Economics and Computation (EC), pages 91–92, 2019.
- [BC21] Guy Blanc and Moses Charikar. Multiway online correlated selection. In Proceedings of the 62nd Symposium on Foundations of Computer Science (FOCS), pages 1277–1284, 2021.
- [BDL22] Mark Braverman, Mahsa Derakhshan, and Antonio Molina Lovett. Max-weight online stochastic matching: Improved approximations against the online benchmark. In David M. Pennock, Ilya Segal, and Sven Seuken, editors, EC ’22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11 - 15, 2022, pages 967–985. ACM, 2022.
- [BGL+12] Nikhil Bansal, Anupam Gupta, Jian Li, Julián Mestre, Viswanath Nagarajan, and Atri Rudra. When LP is the cure for your matching woes: Improved bounds for stochastic matchings. Algorithmica, 63(4):733–762, 2012.
- [BHK+24] Kiarash Banihashem, MohammadTaghi Hajiaghayi, Dariusz R Kowalski, Piotr Krysta, and Jan Olkowski. Power of posted-price mechanisms for prophet inequalities. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4580–4604. SIAM, 2024.
- [BK10] Bahman Bahmani and Michael Kapralov. Improved bounds for online stochastic matching. In Mark de Berg and Ulrich Meyer, editors, Algorithms - ESA 2010, 18th Annual European Symposium, Liverpool, UK, September 6-8, 2010. Proceedings, Part I, volume 6346 of Lecture Notes in Computer Science, pages 170–181. Springer, 2010.
- [BK23] Alexander Braun and Thomas Kesselheim. Simplified prophet inequalities for combinatorial auctions. In 2023 Symposium on Simplicity in Algorithms (SOSA), pages 381–389, 2023.
- [BM19] Jackie Baek and Will Ma. Prophet inequalities on the intersection of a matroid and a graph. CoRR, abs/1906.04899, 2019.
- [BMR20] Allan Borodin, Calum MacRury, and Akash Rakheja. Bipartite stochastic matching: Online, random order, and iid models. arXiv preprint arXiv:2004.14304, 2020.
- [BSSX16] Brian Brubach, Karthik Abinav Sankararaman, Aravind Srinivasan, and Pan Xu. New algorithms, better bounds, and a novel model for online stochastic matching. In Proceedings of the 24th Annual European Symposium on Algorithms (ESA), pages 24:1–24:16, 2016.
- [BSSX20] Brian Brubach, Karthik Abinav Sankararaman, Aravind Srinivasan, and Pan Xu. Online stochastic matching: New algorithms and bounds. Algorithmica, 82(10):2737–2783, 2020.
- [CC23] José Correa and Andrés Cristi. A constant factor prophet inequality for online combinatorial auctions. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, pages 686–697, New York, NY, USA, 2023. Association for Computing Machinery.
- [CCF+22] José Correa, Andrés Cristi, Andrés Fielbaum, Tristan Pollner, and S. Matthew Weinberg. Optimal item pricing in online combinatorial auctions. In Karen Aardal and Laura Sanità, editors, Integer Programming and Combinatorial Optimization, pages 126–139, Cham, 2022. Springer International Publishing.
- [CGKM20] Shuchi Chawla, Kira Goldner, Anna R. Karlin, and J. Benjamin Miller. Non-adaptive matroid prophet inequalities. CoRR, abs/2011.09406, 2020.
- [CHMS10] Shuchi Chawla, Jason D Hartline, David L Malec, and Balasubramanian Sivan. Multi-parameter mechanism design and sequential posted pricing. In Proceedings of the 42nd Annual ACM Symposium on Theory of Computing (STOC), pages 311–320, 2010.
- [CIK+09] Ning Chen, Nicole Immorlica, Anna R Karlin, Mohammad Mahdian, and Atri Rudra. Approximating matches made in heaven. In Proceedings of the 36th International Colloquium on Automata, Languages and Programming (ICALP), pages 266–278, 2009.
- [DFKL20] Paul Dütting, Michal Feldman, Thomas Kesselheim, and Brendan Lucier. Prophet inequalities made easy: Stochastic optimization by pricing nonstochastic inputs. SIAM Journal on Computing (SICOMP), 49(3), 2020.
- [DGR+23] Paul Dütting, Evangelia Gergatsouli, Rojin Rezvan, Yifeng Teng, and Alexandros Tsigonias-Dimitriadis. Prophet secretary against the online optimal. In Kevin Leyton-Brown, Jason D. Hartline, and Larry Samuelson, editors, Proceedings of the 24th ACM Conference on Economics and Computation, EC 2023, London, United Kingdom, July 9-12, 2023, pages 561–581. ACM, 2023.
- [DK15] Paul Dütting and Robert Kleinberg. Polymatroid prophet inequalities. In Nikhil Bansal and Irene Finocchi, editors, Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings, volume 9294 of Lecture Notes in Computer Science, pages 437–449. Springer, 2015.
- [DKL20] Paul Dütting, Thomas Kesselheim, and Brendan Lucier. An O(log log m) prophet inequality for subadditive combinatorial auctions. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 306–317. IEEE, 2020.
- [DSSX21] John P. Dickerson, Karthik A. Sankararaman, Aravind Srinivasan, and Pan Xu. Allocation problems in ride-sharing platforms: Online matching with offline reusable resources. ACM Trans. Econ. Comput., 9(3), June 2021.
- [EFGT20] Tomer Ezra, Michal Feldman, Nick Gravin, and Zhihao Gavin Tang. Online stochastic max-weight matching: prophet inequality for vertex and edge arrival models. In Proceedings of the 21st ACM Conference on Economics and Computation (EC), pages 769–787, 2020.
- [FGL15] Michal Feldman, Nick Gravin, and Brendan Lucier. Combinatorial auctions via posted prices. In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 123–135, 2015.
- [FHTZ20] Matthew Fahrbach, Zhiyi Huang, Runzhou Tao, and Morteza Zadimoghaddam. Edge-weighted online bipartite matching. In Proceedings of the 61st Symposium on Foundations of Computer Science (FOCS), 2020. To Appear.
- [FLT+22] Hu Fu, Pinyan Lu, Zhihao Gavin Tang, Abner Turkieltaub, Hongxun Wu, Jinzhao Wu, and Qianfan Zhang. Oblivious online contention resolution schemes. In Symposium on Simplicity in Algorithms (SOSA), pages 268–278, 2022.
- [FMMM09] Jon Feldman, Aranyak Mehta, Vahab Mirrokni, and S Muthukrishnan. Online stochastic matching: Beating 1-1/e. In Proceedings of the 50th Symposium on Foundations of Computer Science (FOCS), pages 117–126, 2009.
- [FNS19] Yiding Feng, Rad Niazadeh, and Amin Saberi. Linear programming based online policies for real-time assortment of reusable resources. SSRN Electronic Journal, 01 2019.
- [FNS22] Yiding Feng, Rad Niazadeh, and Amin Saberi. Near-optimal Bayesian online assortment of reusable resources. In Proceedings of the 23rd ACM Conference on Economics and Computation, EC ’22, pages 964–965, New York, NY, USA, 2022. Association for Computing Machinery.
- [FSZ16] Moran Feldman, Ola Svensson, and Rico Zenklusen. Online contention resolution schemes. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1014–1033, 2016.
- [GHH+21] Ruiquan Gao, Zhongtian He, Zhiyi Huang, Zipei Nie, Bijun Yuan, and Yan Zhong. Improved online correlated selection. In Proceedings of the 62nd Symposium on Foundations of Computer Science (FOCS), 2021. To Appear.
- [GHK+14] Oliver Göbel, Martin Hoefer, Thomas Kesselheim, Thomas Schleiden, and Berthold Vöcking. Online independent set beyond the worst-case: Secretaries, prophets, and periods. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part II, volume 8573 of Lecture Notes in Computer Science, pages 508–519. Springer, 2014.
- [GKPS06] Rajiv Gandhi, Samir Khuller, Srinivasan Parthasarathy, and Aravind Srinivasan. Dependent rounding and its applications to approximation algorithms. Journal of the ACM (JACM), 53(3):324–360, 2006.
- [GKS19] Buddhima Gamlath, Sagar Kale, and Ola Svensson. Beating greedy for stochastic bipartite matching. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2841–2854. SIAM, 2019.
- [GU23] Vineet Goyal and Rajan Udwani. Online matching with stochastic rewards: Optimal competitive ratio via path-based formulation. Oper. Res., 71(2):563–580, 2023.
- [GW19] Nikolai Gravin and Hongao Wang. Prophet inequality for bipartite matching: Merits of being simple and non adaptive. In Proceedings of the 20th ACM Conference on Economics and Computation (EC), pages 93–109, 2019.
- [HJS+23] Zhiyi Huang, Hanrui Jiang, Aocheng Shen, Junkai Song, Zhiang Wu, and Qiankun Zhang. Online matching with stochastic rewards: Advanced analyses using configuration linear programs. In Jugal Garg, Max Klimm, and Yuqing Kong, editors, Web and Internet Economics - 19th International Conference, WINE 2023, Shanghai, China, December 4-8, 2023, Proceedings, volume 14413 of Lecture Notes in Computer Science, pages 384–401. Springer, 2023.
- [HKS07] Mohammad Taghi Hajiaghayi, Robert Kleinberg, and Tuomas Sandholm. Automated online mechanism design and prophet inequalities. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI), pages 58–65, 2007.
- [HMZ11] Bernhard Haeupler, Vahab S. Mirrokni, and Morteza Zadimoghaddam. Online stochastic weighted matching: Improved approximation algorithms. In Ning Chen, Edith Elkind, and Elias Koutsoupias, editors, Internet and Network Economics - 7th International Workshop, WINE 2011, Singapore, December 11-14, 2011. Proceedings, volume 7090 of Lecture Notes in Computer Science, pages 170–181. Springer, 2011.
- [HS21] Zhiyi Huang and Xinkai Shu. Online stochastic matching, Poisson arrivals, and the natural linear program. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 682–693. ACM, 2021.
- [HSY22] Zhiyi Huang, Xinkai Shu, and Shuyi Yan. The power of multiple choices in online stochastic matching. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 91–103. ACM, 2022.
- [HZ20] Zhiyi Huang and Qiankun Zhang. Online primal dual meets online matching with stochastic rewards: configuration LP to the rescue. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, pages 1153–1164, New York, NY, USA, 2020. Association for Computing Machinery.
- [JL13] Patrick Jaillet and Xin Lu. Online stochastic matching: New algorithms with better bounds. Mathematics of Operations Research, 2013.
- [JMZ22] Jiashuo Jiang, Will Ma, and Jiawei Zhang. Tight guarantees for multi-unit prophet inequalities and online stochastic knapsack. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1221–1246, 2022.
- [KS78] Ulrich Krengel and Louis Sucheston. On semiamarts, amarts, and processes with finite value. Probability on Banach spaces, 4:197–266, 1978.
- [KVV90] Richard M Karp, Umesh V Vazirani, and Vijay V Vazirani. An optimal algorithm for on-line bipartite matching. In Proceedings of the 22nd Annual ACM Symposium on Theory of Computing (STOC), pages 352–358, 1990.
- [KW19] Robert Kleinberg and S Matthew Weinberg. Matroid prophet inequalities and applications to multi-dimensional mechanism design. Games and Economic Behavior, 113:97–115, 2019.
- [LS18] Euiwoong Lee and Sahil Singla. Optimal online contention resolution schemes via ex-ante prophet inequalities. In Proceedings of the 26th Annual European Symposium on Algorithms (ESA), pages 57:1–57:14, 2018.
- [Luc17] Brendan Lucier. An economic view of prophet inequalities. ACM SIGecom Exchanges, 16(1):24–47, 2017.
- [Meh13] Aranyak Mehta. Online matching and ad allocation. Foundations and Trends® in Theoretical Computer Science, 8(4):265–368, 2013.
- [MGS12] Vahideh H Manshadi, Shayan Oveis Gharan, and Amin Saberi. Online stochastic matching: Online actions based on offline statistics. Mathematics of Operations Research, 37(4):559–573, 2012.
- [MMG23] Calum MacRury, Will Ma, and Nathaniel Grammel. On (random-order) online contention resolution schemes for the matching polytope of (bipartite) graphs. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1995–2014, 2023.
- [MP12] Aranyak Mehta and Debmalya Panigrahi. Online matching with stochastic rewards. In Symposium on Foundations of Computer Science (FOCS), 2012.
- [MWZ15] Aranyak Mehta, Bo Waggoner, and Morteza Zadimoghaddam. Online stochastic matching with unequal probabilities. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA-15), pages 1388–1404, 2015.
- [NSW23] Joseph Naor, Aravind Srinivasan, and David Wajc. Online dependent rounding schemes. CoRR, abs/2301.08680, 2023.
- [PPSW21] Christos Papadimitriou, Tristan Pollner, Amin Saberi, and David Wajc. Online stochastic max-weight bipartite matching: Beyond prophet inequalities. In Proceedings of the 22nd ACM Conference on Economics and Computation (EC), pages 763–764, 2021.
- [PRSW22] Tristan Pollner, Mohammad Roghani, Amin Saberi, and David Wajc. Improved online contention resolution for matchings and applications to the gig economy. In Proceedings of the 23rd ACM Conference on Economics and Computation, EC ’22, pages 321–322, New York, NY, USA, 2022. Association for Computing Machinery.
- [Rub16] Aviad Rubinstein. Beyond matroids: secretary problem and prophet inequality with general constraints. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16, pages 324–332, New York, NY, USA, 2016. Association for Computing Machinery.
- [SC84] Ester Samuel-Cahn. Comparison of threshold stop rules and maximum for independent nonnegative random variables. The Annals of Probability, 12(4):1213–1216, 1984.
- [Sri01] Aravind Srinivasan. Distributions on level-sets with applications to approximation algorithms. In Proceedings of the 42nd Symposium on Foundations of Computer Science (FOCS), pages 588–597, 2001.
- [SW21] Amin Saberi and David Wajc. The greedy algorithm is not optimal for on-line edge coloring. In Proceedings of the 48th International Colloquium on Automata, Languages and Programming (ICALP), pages 109:1–109:18, 2021.
- [TT22] Alfredo Torrico and Alejandro Toriello. Dynamic relaxations for online bipartite matching. INFORMS Journal on Computing, 2022.
- [TWW22] Zhihao Gavin Tang, Jinzhao Wu, and Hongxun Wu. (Fractional) online stochastic matching via fine-grained offline statistics. In Proceedings of the 54th Annual ACM Symposium on Theory of Computing (STOC), pages 77–90, 2022.
Appendix A Informative Examples and Observations
In this section, we give some examples and observations which might help to gain a deeper understanding of the problem.
A.1 The Generalization of [BDL22] Fails
Given the attention previously dedicated to the unit-capacity case, we first ask how these algorithms perform for the capacitated problem. Previous works for matching have all used the LP relaxation (LPon) with , in the special case where each success probability equals 1. In the simplest case where every resource either (i) arrives with a fixed capacity and values, with probability or (ii) does not arrive, with probability , the algorithm works in the following way: in the case that resource arrives, every available user sends a proposal to with probability
an expression that is at most 1 by (LPon) Constraint (2). The resource is matched to the proposing user with highest value . [BDL22] show that this algorithm gives a -approximation against (LPon), and hence also the optimum online benchmark.
To account for capacities, we might naturally generalize this algorithm to match an arriving resource to the top proposing users. Surprisingly, this small modification drastically changes the algorithm’s performance.
See 1.3
Proof.
Take some such that , and consider an instance with users and two resources. The first resource has a capacity of (i.e., values are additive over all users), arrives with probability and values are for each user individually. The second resource is unit-capacity, arrives with probability 1, and values are for each user individually. All allocations are successful with probability 1.
The unique optimal solution to (LPon) sets for every pair incident to the first resource, and sets for every pair incident to the second resource. Thus, when running (the natural generalization of) [BDL22], every user proposes to the first resource if it arrives, and hence with probability all users are assigned in the first timestep. If the first resource does not arrive, exactly one user is allocated to the second unit-capacity resource. Hence the expected gain of the algorithm is . However clearly for this instance . ∎
A.2 Positive Correlation is Required
Next, we argue that we need to have positive correlation for general capacitated resource allocation.
See 1.4
Proof.
Let denote an indicator for user being free just before the arrival of resource . Consider resource with capacity two arriving with probability which is adjacent to two users with unit values. Imagine the LP sets a value of on each edge. To achieve an approximation factor of against LP, we are required to have that the expected number of users assigned to is at least . Equivalently, we must have
implying
However, because and can only be matched if arrives, we have
where the final inequality holds for sufficiently small . ∎
A.3 On the Gap of (LPon)
Example A.1.
There exists an instance of online capacitated allocation where
Proof.
Consider an instance with two offline users, and two stochastic arrivals. The first resource has capacity 2, and arrives with probability ; the second resource has capacity 1 and arrives with probability 1. Both resources have a value of 1 for each user; every edge is successful with probability 1.
The optimum online algorithm achieves a value of 2 if the first resource arrives, and a value of 1 otherwise, hence achieving in expectation. However, a feasible solution to (LPon) sets for every edge , hence achieving a value of 2. ∎
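The arithmetic behind this example can be checked mechanically. The sketch below is illustrative only: it assumes that the first resource arrives with probability 1/2 and that the feasible solution places 1/2 on every edge (both values are assumptions chosen to be consistent with the stated LP value of 2), and it encodes the capacity and online constraints of (LPon) in the form suggested by the proof of 2.1, with all success probabilities equal to 1. The names `q`, `c`, `x`, and `is_feasible` are hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: arrival probabilities. The value 1/2 for the first resource is
# illustrative and is not taken from the text.
q = [Fraction(1, 2), Fraction(1)]
c = [2, 1]          # capacities of the two resources
n_users = 2

# ASSUMPTION: candidate fractional solution with 1/2 on every edge.
x = [[Fraction(1, 2)] * n_users for _ in q]


def is_feasible(x, q, c):
    """Check the capacity and online constraints (success probabilities = 1)."""
    for t in range(len(q)):
        # Capacity: resource t serves at most c[t] users, and only if it arrives.
        if sum(x[t]) > q[t] * c[t]:
            return False
        for i in range(len(x[t])):
            # Online: user i must still be free when resource t arrives.
            used_before = sum(x[s][i] for s in range(t))
            if x[t][i] > q[t] * (1 - used_before):
                return False
    return True


# All values equal 1, so the LP objective is the sum of the x entries.
lp_value = sum(sum(row) for row in x)

# Optimum online: value 2 if the first resource arrives, value 1 otherwise.
opt_online = q[0] * 2 + (1 - q[0]) * 1

print(is_feasible(x, q, c), lp_value, opt_online, opt_online / lp_value)
```

Under these assumed numbers the candidate solution is feasible with value 2, while the optimum online value is 3/2, so the ratio is 3/4.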
A.4 A Bound Depending on .
As mentioned in Section 4.2.2, the bound following Equation 8 in the proof of 4.10 is not tight if all are strictly greater than one. Still, even though this step looks quite lossy at first glance, we are not losing much in our analysis by replacing with one. To see this, consider replacing the last inequality in the proof of 4.10 with a bound depending on . Doing so, we get
(22) |
As a consequence, in order to show the desired lower bound on , we can first use the same reasoning that we used to derive Equation 9, but use Inequality (22) instead:
Thus, the right-hand side needs to be at least as large as . In other words, we are required to show that
Hence we can take any such that
(23) |
As a consequence, we can now solve Equation 23 for in order to improve upon the constant of which we used initially, as a function of . In Table 1, we state these constants for , demonstrating that there is little loss in our analysis of Algorithm 1 when replacing with .
Capacity parameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
Constant | 0.0115 | 0.0126 | 0.0131 | 0.0133 | 0.0134 | 0.0135 | 0.01362 | 0.01367 | 0.01371 |
Appendix B Deferred Proofs
In this section, we provide proofs which were deferred from the main body.
B.1 Proof of Theorem 1.5
See 1.5
Proof.
We apply Theorem 19 of [BHK+24], as our problem of capacitated resource allocation can be viewed exactly as what they call a prophet inequalities problem. Using their notation, we take to be Algorithm 3, with expected social welfare . Note that Algorithm 3 is what [BHK+24] call “past-valuation-independent,” as its allocation decision for buyer depends only on the set of available items, the arriving valuation/capacity , and the LP solution calculated from knowledge of the input distributions. Note also that for each buyer , the outcome space (what [BHK+24] refer to as “”) is of size at most because is upper bounded by a constant. Finally, although our distribution over is not continuous, it is not hard to satisfy this assumption by adding a small amount of noise or a tiebreaking coordinate (as mentioned in [BHK+24]).
Hence, there is a pricing-based algorithm which uses many samples, runs in time and whose expected social welfare satisfies
∎
B.2 Proof of 2.1
See 2.1
Proof.
Define an indicator random variable for every pair , which is one if and only if the optimum online algorithm allocates user to resource . In addition, let be the indicator which is one if the assignment of the pair was successful; i.e. the independent Bernoulli coin flip with probability comes up heads.
Denote by . First, note that the welfare achieved by the optimum online algorithm is
coinciding with the objective of (LPon). Here the expectation is over the randomness in as well as the success probabilities for , and we crucially use that the successful realization of is independent of our decision to allocate along .
Also, observe that for any resource , we have if the resource does not arrive, and if the resource arrives, as any algorithm is allowed to allocate at most users to resource if the resource arrives. Hence
Finally, note that if resource arrives, the optimum online algorithm can only allocate user if it is available. For user to be available, it must not have been allocated to any previous resource whose independent coin flip was also successful. Crucially, for any online algorithm, the event that user is available at time is independent of the arrival of resource (this does not hold for an offline algorithm). Hence, we observe
As a consequence, is a feasible solution to (LPon) and hence, . ∎
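The feasibility argument above can be sanity-checked numerically on a toy instance. The sketch below does not compute the optimum online algorithm: it uses a simple greedy online policy (an assumption made purely for illustration), computes its exact allocation marginals by enumerating arrivals and success coin flips, and then verifies the capacity and online constraints in the form described in the proof. The function names and the instance are hypothetical.

```python
from itertools import product


def greedy_marginals(q, c, v, p):
    """Return x[t][i] = Pr[resource t arrives and greedy allocates user i to it]."""
    T, n = len(q), len(v[0])
    x = [[0.0] * n for _ in range(T)]

    def recurse(t, free, prob):
        if t == T:
            return
        # Branch 1: resource t does not arrive.
        recurse(t + 1, free, prob * (1 - q[t]))
        # Branch 2: resource t arrives; greedy picks the best c[t] free users.
        arrive = prob * q[t]
        chosen = sorted(free, key=lambda i: (-v[t][i] * p[t][i], i))[:c[t]]
        for i in chosen:
            x[t][i] += arrive
        # Enumerate the independent success flips of the chosen pairs.
        for flips in product([False, True], repeat=len(chosen)):
            pr, remaining = arrive, set(free)
            for i, success in zip(chosen, flips):
                pr *= p[t][i] if success else 1 - p[t][i]
                if success:
                    remaining.discard(i)
            recurse(t + 1, frozenset(remaining), pr)

    recurse(0, frozenset(range(n)), 1.0)
    return x


def satisfies_lpon(x, q, c, p, tol=1e-9):
    """Check the capacity and online constraints up to floating-point tolerance."""
    for t in range(len(q)):
        if sum(x[t]) > q[t] * c[t] + tol:
            return False
        for i in range(len(x[t])):
            matched_before = sum(p[s][i] * x[s][i] for s in range(t))
            if x[t][i] > q[t] * (1 - matched_before) + tol:
                return False
    return True


# Hypothetical toy instance: 3 users, 3 resources.
q = [0.5, 0.7, 1.0]
c = [2, 1, 1]
v = [[1.0, 2.0, 1.5], [3.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
p = [[0.9, 0.5, 1.0], [0.6, 0.8, 0.7], [1.0, 0.4, 0.5]]

x = greedy_marginals(q, c, v, p)
print(satisfies_lpon(x, q, c, p))
```

The check relies only on the policy being online, so replacing the greedy rule with any other rule that ignores future arrivals should leave the constraints satisfied.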
B.3 Proof of 4.14
See 4.14
Proof.
Plugging in the definition of , the claim is equivalent to
Multiplying out the left-hand side and subtracting on both sides, this is equivalent to
If , the claim is trivially true. If , we can divide both sides by to get
Multiplying both sides by , we get
Subtracting on both sides, we finally end up with
which is clear. ∎
Appendix C Beyond Bernoulli Distributions
When not restricting the model to Bernoulli arrivals, for every round , there is a known distribution over valuation vectors and a capacity . Upon the arrival of resource , it samples one index with probability , and realizes capacity and values over users. (We assume without loss of generality that all resources share the same space of valuation vectors and capacities, and we can set if realization is not feasible for resource . Also, we assume that resources always arrive, by adding a valuation vector containing only zeros with the probability of resource not arriving.) For ease of exposition, we discuss general arrivals in the case that each success probability , and describe the changes needed to handle arbitrary success probabilities in Appendix D.
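To illustrate the reduction just described, the following sketch embeds a Bernoulli arrival into the general model by adding an all-zero valuation vector that carries the non-arrival probability, and then samples a realization index. The representation as a list of `(probability, capacity, values)` triples and both function names are assumptions made for illustration.

```python
import random


def bernoulli_to_general(arrival_prob, capacity, values):
    """Embed a Bernoulli arrival as a distribution over realizations.

    Each realization is a (probability, capacity, values) triple. Non-arrival
    becomes an all-zero valuation vector, so the resource always "arrives".
    """
    zero_values = [0.0] * len(values)
    return [
        (arrival_prob, capacity, list(values)),
        (1.0 - arrival_prob, capacity, zero_values),
    ]


def sample_realization(distribution, rng):
    """Sample a realization index k according to the stored probabilities."""
    weights = [prob for prob, _, _ in distribution]
    k = rng.choices(range(len(distribution)), weights=weights, k=1)[0]
    _, capacity, values = distribution[k]
    return k, capacity, values


# Hypothetical resource: arrives with probability 0.3, capacity 2, three users.
dist = bernoulli_to_general(0.3, 2, [1.0, 4.0, 2.5])
rng = random.Random(0)
k, capacity, values = sample_realization(dist, rng)
```

Keeping the capacity unchanged in the zero-valued realization is a design choice of this sketch; since all values are zero, any allocation made in that realization contributes nothing.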
Generalized LP
We generalize LPon as follows.
(General-LPon), subject to Constraints (24), (25), and (26).
Analogously to 2.1, we can argue that also for general distributions, , i.e., General-LPon is a relaxation of the optimum online algorithm.
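For concreteness, a relaxation of this kind can be written as follows. This is a sketch under assumed notation rather than a verbatim restatement of the program: $x_{i,t,k}$ stands for the probability that resource $t$ realizes index $k$ and user $i$ is allocated to it, $q_{t,k}$ is the probability of index $k$, and $v_{i,t,k}$ and $c_{t,k}$ are the realized value and capacity. All success probabilities are taken to be 1, as in the rest of this appendix.

```latex
\begin{align*}
\max \quad & \sum_{t}\sum_{k}\sum_{i} v_{i,t,k}\, x_{i,t,k} \\
\text{s.t.} \quad
  & \sum_{i} x_{i,t,k} \le q_{t,k}\, c_{t,k}
    && \text{for all } t, k, \\
  & x_{i,t,k} \le q_{t,k}\Bigl(1 - \sum_{t' < t}\sum_{k'} x_{i,t',k'}\Bigr)
    && \text{for all } i, t, k, \\
  & x_{i,t,k} \ge 0
    && \text{for all } i, t, k.
\end{align*}
```

In this form, the first constraint mirrors the capacity argument and the second mirrors the independence of a user's availability from the current realization, in the same way as in the proof of 2.1.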
Generalized Algorithm.
In order to round any fractional LP solution to an integral one in an online fashion, we extend our Algorithm 1 as follows: In round , we see the realization of index . We replace all previous LP variables with the ones from the generalized LP for index and run the slightly modified Algorithm 3.
As in our Bernoulli case, observe that we choose so that the following holds: . Also, note that this algorithm can be implemented in polynomial time in the number of resources and users and the size of the support of the distributions. Concerning the computation of , we can observe that for our choice of , the generalized analysis also shows that any is lower bounded by a constant, as in the Bernoulli case. This can be used to estimate via samples with a multiplicative error as small as desired, implying a -approximate algorithm, following the logic of Section 5.
Generalized Analysis.
In order to prove the generalization of Theorem 4.1, most of the work lies in adapting the notation of the lemmas along the way. We do not give details for all lemmas but rather describe the key changes and how to overcome the obstacles that arise.
First, we extend and change several definitions such as or , as the set of assignments if the realized index is via a first or second proposal. The lemmas, observations and statements which referred to “ arriving” are now with respect to the event “ realizes index ”. For example, when talking about assigning to via a first proposal, we replace this by saying that we assign to via a first proposal when realized the valuation vector with index .
The proofs for the analysis of early pairs directly carry over after adapting the syntax. For late pairs, the generalization of the proof of Lemma 4.5 (i) is also straightforward, as is the combination of both analyses at the end.
We need to take some care in generalizing the proof of Lemma 4.5 (ii). The majority of the steps can be extended straightforwardly via syntactic generalization from Section 4 (or Section 5 with an estimate of the expectation in 13). In contrast, the proofs of the generalized versions of the correlation bound from Section 4.3, and in particular of 4.12, need some short updates. Note, however, that as 4.12 only concerns early pairs, it is not affected by the updates for a sample-based algorithm as in Section 5.
To see why 4.12 also holds in the more general variant, we go through its proof steps one-by-one. Concerning the generalization of Step (S1) we note that the probability of both users being free after time can still be decomposed as the product of the probability of both being free before times the conditional probability of assigning neither via a first proposal (as in Equation 10). Still, we are required to sum the latter conditional probabilities for all possible realizations of . Doing so, we first follow Steps (S1) and (S2) from the Bernoulli case. During Step (S3), we need to show that for two distinct users and resource , the following inequality holds:
(27) | ||||
In order to argue that this inequality is indeed true, we depart from the proof of the Bernoulli case by controlling the term via the online constraint for the user . By Constraint (26), we know that
Using this, we can bound
Plugging this into the left-hand side of Equation 27 and rearranging terms, we can conclude in a similar way as we did using Fact 4.13 in the Bernoulli case. Afterwards, Step (S4) of the correlation bound can again proceed via syntactic generalization which concludes the proof for general distributions.
Appendix D Stochastic Rewards
In Section 4 we assumed that every pair had a success probability . This was mainly for convenience of notation, as the guarantees for our algorithm carry over to the case of arbitrary success probabilities . The changes can furthermore be adapted to our sample-based algorithm (as in Section 5) and to our algorithm for non-Bernoulli arrivals (as in Appendix C), although for simplicity we start by extending the algorithm for Bernoulli arrivals without samples.
We recall that we say is allocated to if it is one of the at most items which we attempt to assign to , and we say it is successfully allocated to if and only if it is allocated and the independent success indicator comes up heads. Note that if for every we have that is allocated to with probability , then because of the independence of the success indicators we have that the expected welfare contribution of is and hence we achieve a -approximation to (LPon).
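Spelled out, the welfare computation in the previous paragraph reads as follows. The symbols are assumptions introduced for illustration: $A_{i,t}$ indicates that $i$ is allocated to $t$, $S_{i,t}$ is the independent success indicator with mean $p_{i,t}$, and $\alpha$ denotes the factor by which the allocation probability scales the LP value $x_{i,t}$.

```latex
\begin{align*}
\mathbb{E}\bigl[v_{i,t}\, A_{i,t}\, S_{i,t}\bigr]
  &= v_{i,t} \cdot \Pr[A_{i,t} = 1] \cdot \Pr[S_{i,t} = 1]
   = v_{i,t} \cdot \alpha\, x_{i,t} \cdot p_{i,t}, \\
\sum_{i,t} \mathbb{E}\bigl[v_{i,t}\, A_{i,t}\, S_{i,t}\bigr]
  &= \alpha \sum_{i,t} v_{i,t}\, p_{i,t}\, x_{i,t}.
\end{align*}
```

The first equality uses the independence of the success indicator from the allocation decision. The right-hand side of the second line is $\alpha$ times the objective value of the LP solution $x$, which yields the approximation guarantee when $x$ is optimal.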
If we naturally update our definition of (instead of ), many of the changes required to the analysis are syntactic. We inductively show that the probability is allocated is , and hence have as part of the inductive hypothesis that . Thus, the probability an early is allocated is precisely
The analysis for late pairs also generalizes syntactically, with the caveat that we must take care to consider how the independent affect the correlation bound of Lemma 4.9. Intuitively, as these Bernoullis are independent of our proposals and history, they should not contribute to worse positive correlation. This is formalized below.
We first consider the proof of 4.12. Our original proof (the grey line below) used the bound
In the new setting, with the independence of successful matches, we have instead
Hence, we will define and . The proof then proceeds identically with this syntactic change, and implies
With this change in place, the proof of Lemma 4.9 can be modified syntactically with the new definition of . Indeed, the only property we need is that should now denote the event that is successfully allocated to an arrival in (and similarly for ). Then, we compute
where we use the updated definition of .
We can also readily integrate these changes into our (sampling-based) algorithm for arrivals from general distributions. In particular, we have the following LP relaxation and algorithm.
(General-LPon-Stochastic), subject to Constraints (28), (29), and (30).
To analyze the algorithm, we can now generalize , so that and . Similarly, we can define . Using and , the arguments of Appendix C now generalize syntactically, as described above for the Bernoulli case. The stochastic rewards do not change the argument from Appendix C that is bounded away from by a constant, and hence can be computed efficiently within a multiplicative error factor when running the polynomial-time sample-based algorithm.