BY 4.0 license Open Access Published by De Gruyter Open Access May 23, 2022

Two hide-search games with rapid strategies for multiple parallel searches

  • Peter E. Creasey
From the journal Open Computer Science

Abstract

Making a rapid unpredictable decision from N choices of unequal value is a common control task. When the cost of predictability can be modelled as a penalty hidden under a single option by an intelligent adversary, then an optimal strategy can be found efficiently in O ( N log N ) steps using an approach described by Sakaguchi for a zero-sum hide-search game. In this work, we extend this to two games with multiple parallel predictions, either coordinated or drawn independently from the optimal distribution, both of which can be solved with the same scaling. An open-source code is provided online at https://github.com/pec27/rams.

1 Introduction

In scenarios that involve adversarial behaviour, it often pays to act in a manner that is not entirely predictable. Given a zero-sum competition with two adversaries and known payoffs (or expected payoffs) for every combination of finite choices, minimax probabilities can be found via linear programming [1].

For $N$ potential choices and $N$ predictions of that choice, the general approach requires us to enumerate all $N^2$ outcomes in the normal form. Solving the corresponding linear program can be performed via interior point methods in almost as little as $O(N^2)$ operations [2,3]. Parametric models for the payoffs only improve the computational scaling if the resultant linear program is amenable to a faster algorithm.

Real-time control applications frequently seek superior scalings. A common problem in artificial intelligence in video games (Game AI) is for simulated agents in an adversarial scenario to pick their next destination tactically from $N$ dynamically sampled locations. Choices are calculated from instantaneous measures of the local environment and positions of neighbouring agents, and must both emulate rational unpredictable behaviour and be computable for all agents within a few milliseconds [4,5].

One model that satisfies this requirement is the zero-sum hide-search game, following the approach of Sakaguchi [6][1]. Here the choice is that of a Hider who is rewarded for choosing site $i$ with payoff $r_i$, except when the Searcher (predictor) has also chosen $i$, in which case an additional strictly positive penalty $p_i$ is applied. This game has a normal form decomposable into a column-independent matrix minus a positive-definite diagonal matrix, i.e.

(1) $\Gamma_{ij} = \begin{cases} r_i, & i \neq j, \\ r_i - p_i, & i = j, \end{cases}$

is the payoff for the Hider. This imposes sufficient structure on the off-diagonal terms (the predictions are exact) that minimax strategies can be computed in $O(N \log N)$ steps, with bounds due to the ordering of the $r_i$.

For general extensions this scaling is sacrificed, for example in security games where the requirement of exact cover (prediction) is dropped (e.g. [11]), or multiple non-interchangeable search resources are introduced (e.g. [12]), both of which require the more general linear program to be solved. Allowing additional stages where sequential choices are restricted to neighbours becomes a search-evasion game [13,14], and whilst they have received considerable attention even some of the most trivial search games remain unsolved [15].

The contribution of this article is to extend (1) to two games with multiple parallel searches where optimal strategies, and their sampling, can be computed with the same algorithmic scaling. In the first, the Searcher is allowed to coordinate Y > 1 searches, with the penalty applied once if any of the searches succeed. The additional parameter Y modifies the value of the game and its strategies in a way that is not a simple re-scaling of the r i and p i . Drawing Y samples without replacement from the marginal distribution is performed via the method of ref. [16]. In the second “non-coordinated” variation, the Searcher draws Y independent search sites (i.e. with replacement) from an identical optimised distribution. An interesting aspect of this variation is that it introduces non-linearity, and follows a method resembling the resource allocation problems of refs [17] and [18].

The structure of this article is as follows. In Section 2, we provide some basic definitions, along with a proof of the single prediction case, and illustrate with a simple example. In Section 3, we extend this to multiple searches, where in Section 3.1, we treat the case of multiple coordinated predictions, and in Section 3.2 multiple predictions drawn independently from an identical distribution. We summarise in Section 4.

2 Basic definitions and the single search case

In this section, we describe the solution to the case with a single search, i.e. the problem in equation (1). We then use this to solve a game in Example 2.10 to illustrate the behaviour of the solution.

Let us follow the convention that the receiver of payoff $\Gamma_{ij}$ is named the Hider and the receiver of $-\Gamma_{ij}$ the Searcher, where the Hider and Searcher choose sites $i, j \in \{1, \ldots, N\}$, respectively, with respective probabilities $x_i$ and $y_j$ following the usual conditions that they are non-negative and sum to unity. The expected payoff of the game for the Hider is

(2) $V(x, y) = E[\Gamma] = \sum_i x_i (r_i - p_i y_i)$

and linearity w.r.t. $x$ and $y$ individually guarantees minimax, $V^\star = \min_y \max_x V = \max_x \min_y V$.
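As an informal check of (1) and (2), the following Python sketch builds the normal-form matrix explicitly and compares the bilinear form with the closed form. The function names and the three-site numbers are illustrative assumptions, not taken from the article or its companion code.

```python
def payoff_matrix(r, p):
    """Normal form (1): entry [i][j] is the Hider's payoff for hiding at i, search at j."""
    n = len(r)
    return [[r[i] - (p[i] if i == j else 0.0) for j in range(n)] for i in range(n)]


def expected_payoff_normal_form(x, y, gamma):
    """Bilinear form sum_ij x_i Gamma_ij y_j, costing O(N^2) operations."""
    return sum(x[i] * gamma[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def expected_payoff(x, y, r, p):
    """Closed form (2), costing O(N) operations."""
    return sum(xi * (ri - pi * yi) for xi, yi, ri, pi in zip(x, y, r, p))


# Hypothetical three-site game with arbitrary mixed strategies.
r = [1.0, 2.0, 3.0]
p = [2.0, 2.0, 4.0]
x = [0.2, 0.3, 0.5]
y = [0.1, 0.6, 0.3]
gamma = payoff_matrix(r, p)
bilinear = expected_payoff_normal_form(x, y, gamma)
closed = expected_payoff(x, y, r, p)
```

The two evaluations agree because the $y_j$ sum to unity, which is what lets the column-independent part of $\Gamma$ collapse to $r_i$.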

Theorem 2.1

(Single prediction) The minimax strategy for this game has expected payoff

(3) $V^\star = \max_i \dfrac{-1 + \sum_{j : r_j \ge r_i} r_j / p_j}{\sum_{k : r_k \ge r_i} 1 / p_k}$.

The minimax strategy for the Searcher is unique and has probability

(4) $y_j = \max\left( \dfrac{r_j - V^\star}{p_j}, 0 \right)$,

which is, in general, mixed.

The minimax strategy for the Hider is unique unless $r_i = V^\star$. In either the unique or degenerate case, however, the minimax strategies are those, and only those, which satisfy

(5) $x_i = \dfrac{K}{p_i} \times \begin{cases} 1, & r_i > V^\star, \\ \in [0, 1], & r_i = V^\star, \\ 0, & r_i < V^\star, \end{cases}$

with the normalisation $K$ fixed by summation of the $x_i$ to unity.

This is a slight generalisation of results described in [8, 8.1 “Scud Hunt”] and [9, 1.7.7], which describe this solution with a choice of zero for the $r_i = V^\star$ case of (5). Special cases such as $p_i = \beta r_i$ (assuming $\beta > 0$ and $r_i > 0$) or $r_i = p_i$ (also for $r_i > 0$) are treated in refs [7] and [10, Part II 3.7.17], respectively.

A useful result for solving such problems, which will be extensively used here, is the Gibbs lemma. This is a necessary first-order condition to find a maximum over $x_i$, with the constraint that the $x_i$ must be non-negative. For zero-sum games this can be written as:

$\dfrac{\partial E[\Gamma]}{\partial x_i} \begin{cases} = J, & x_i > 0, \\ \le J, & x_i = 0, \end{cases} \qquad \dfrac{\partial E[\Gamma]}{\partial y_i} \begin{cases} = -K, & y_i > 0, \\ \ge -K, & y_i = 0, \end{cases}$

for $x_i$ and $y_i$, respectively. We now describe the solution to Theorem 2.1.

Lemma 2.2

The set of supported sites of the Searcher are contained in those of the Hider, i.e.

(6) $y_i > 0 \Rightarrow x_i > 0$.

Proof

From the Gibbs lemma w.r.t. $y_i$ we have

(7) $p_i x_i \begin{cases} = K, & y_i > 0, \\ \le K, & y_i = 0, \end{cases}$

i.e. $K$ is the maximum value of $\{p_i x_i\}$. Since the $p_i$ are strictly positive, and at least some of the $x_i$ must be strictly positive (in order to sum to unity), then this maximum $K$ must be strictly positive. The first case thus gives

(8) $x_i = \dfrac{K}{p_i}, \quad y_i > 0$,

i.e. $x_i$ is strictly positive $\forall i$ s.t. $y_i > 0$. Conversely, $x_i = 0 \Rightarrow y_i = 0$.□

Lemma 2.3

The supported sites for the Searcher are contiguous over the largest $r_i$,

(9) $y_j > 0 \Rightarrow y_k > 0, \quad \forall k : r_k \ge r_j$,

i.e. the Searcher will only visit those sites with rewards above some $\tau$.

Proof

Assume this was not the case, i.e. there is some $r_k \ge r_j$ s.t. $y_j > 0$ and $y_k = 0$. From the Gibbs lemma w.r.t. $x_i$, we have

(10) $r_i - y_i p_i \begin{cases} = J, & x_i > 0, \\ \le J, & x_i = 0. \end{cases}$

From Lemma 2.2 we have $y_j > 0 \Rightarrow x_j > 0$, and so the upper equality in (10) applies for $j$, i.e. $r_j - y_j p_j = J$. Since $x_k$ is unknown, we have the (less strict) lower inequality from (10), i.e. $r_k - y_k p_k \le J$, and substitute $y_k = 0$ to get the inequality

$r_j - y_j p_j \ge r_k$.

However, since $r_k \ge r_j$ (our assumption), and $y_j p_j$ is strictly positive, we have a contradiction.□

For convenience let us define $L(\tau)$ as the set of indices of elements larger than or equal to $\tau$,

(11) $L(\tau) = \{ i : r_i \ge \tau \}$,

and a measure of these, $\mu(\tau)$, defined as the sum of the reciprocals of the $p_i$,

(12) $\mu(\tau) = \sum_{i \in L(\tau)} \dfrac{1}{p_i}$.

Corollary 2.4

The minimax strategy for $y$ must correspond to some $\tau \in \{r_i\}$, where

(13) $y_i(\tau) = \begin{cases} \dfrac{r_i - J_\tau}{p_i}, & r_i \ge \tau, \\ 0, & r_i < \tau, \end{cases}$

where

(14) $J_\tau = \dfrac{-1 + \sum_{j \in L(\tau)} r_j / p_j}{\mu(\tau)}$.

Proof

From Lemma 2.3 we know that $y_i > 0 \Rightarrow y_j > 0 \; \forall j : r_j \ge r_i$, and since at least some of the $y_i$ are non-zero, we can index the non-zero $y$ using

$\tau = \min_{i : y_i > 0} r_i$

to re-write the index set for non-zero $y$ using (11),

$\{ i : y_i(\tau) > 0 \} = \{ i : r_i \ge \tau \} = L(\tau)$.

Using the upper case of (10), $y_i = (r_i - J)/p_i$ for $x_i > 0$, for some constant $J$. From Lemma 2.2 we know all other $y$ are zero, i.e. this can be written as:

$y_i = \dfrac{r_i - J}{p_i}, \quad y_i > 0$.

Since the $y_i$ must sum to unity, we can find $J_\tau$ as in (14). Substitution back into the first case of (10) gives

$y_i(\tau) = \begin{cases} \dfrac{r_i - J_\tau}{p_i}, & r_i \ge \tau, \\ 0, & r_i < \tau. \end{cases}$

Lemma 2.5

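For $J_{\tau^\star}$ at minimax, and any $\tau < \tau^\star$, we have

(15) $J_{\tau^\star} \ge \tau$

and

(16) $J_{\tau^\star} \ge J_\tau$.

Proof

From (10) $J_{\tau^\star}$ is an upper bound for the $r_i - y_i p_i$, i.e.

$J_{\tau^\star} \ge r_i - y_i p_i$.

Now for $r_i < \tau^\star$ we have $y_i = 0$ (Lemma 2.3) and so we have (15),

$J_{\tau^\star} \ge r_i \quad \forall r_i < \tau^\star$.

Now decompose $J_\tau$ into

(17) $J_\tau = \dfrac{-1 + \sum_{i \in L(\tau)} r_i / p_i}{\mu(\tau)} = \alpha_{\tau^\star}^{\tau} J_{\tau^\star} + \dfrac{\sum_{i : \tau \le r_i < \tau^\star} r_i / p_i}{\mu(\tau)}$,

where we have defined $\alpha_{\tau^\star}^{\tau}$ as the ratio of measures

(18) $\alpha_{\tau^\star}^{\tau} = \dfrac{\mu(\tau^\star)}{\mu(\tau)}$

and substitute (15) into (17) to give

$J_\tau \le \alpha_{\tau^\star}^{\tau} J_{\tau^\star} + \dfrac{\sum_{i : \tau \le r_i < \tau^\star} J_{\tau^\star} / p_i}{\mu(\tau)} = \alpha_{\tau^\star}^{\tau} J_{\tau^\star} + (1 - \alpha_{\tau^\star}^{\tau}) J_{\tau^\star} = J_{\tau^\star}$.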

Corollary 2.6

Since at minimax we have $J_{\tau^\star} \ge r_i \; \forall r_i < \tau^\star$, then (13) can be re-written

(19) $y_i = \max\left( \dfrac{r_i - J_{\tau^\star}}{p_i}, 0 \right)$.

Lemma 2.7

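$J_{\tau^\star}$ at minimax $\Rightarrow J_{\tau^\star} \ge J_{\tau^+} \; \forall \tau^+ > \tau^\star$.

Proof

Using the same decomposition as (17) we have

(20) $J_{\tau^\star} = \alpha_{\tau^+}^{\tau^\star} J_{\tau^+} + \dfrac{\sum_{i : \tau^\star \le r_i < \tau^+} r_i / p_i}{\mu(\tau^\star)}$.

Since all the probabilities $y_i$ must be positive, it follows from (13) that

(21) $r_i \ge J_{\tau^\star} \quad \forall i : r_i \ge \tau^\star$

and so we have

$J_{\tau^\star} \ge \alpha_{\tau^+}^{\tau^\star} J_{\tau^+} + \dfrac{\sum_{i : \tau^\star \le r_i < \tau^+} J_{\tau^\star} / p_i}{\mu(\tau^\star)} = \alpha_{\tau^+}^{\tau^\star} J_{\tau^+} + (1 - \alpha_{\tau^+}^{\tau^\star}) J_{\tau^\star}$.

Now $\alpha_{\tau^+}^{\tau^\star} \in (0, 1)$ since $L(\tau^+) \subset L(\tau^\star)$ and $L(\tau^+) \ne \emptyset$, which gives

$J_{\tau^\star} \ge J_{\tau^+}$.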

Corollary 2.8

From Lemmas 2.5 and 2.7, minimax gives $J_{\tau^\star} \ge J_\tau \; \forall \tau$, i.e. it is maximal. We can thus write $J_{\tau^\star} = V^\star$, with $V^\star$ defined as in (3),

$V^\star = \max_i \dfrac{-1 + \sum_{j \in L(r_i)} r_j / p_j}{\mu(r_i)}$,

and (4) follows immediately from Corollary 2.6,

$y_i = \max\left( \dfrac{r_i - V^\star}{p_i}, 0 \right)$.

Lemma 2.9

The strategy for the Hider is at minimax iff it is of the form in (5),

$x_i = \dfrac{K}{p_i} \times \begin{cases} 1, & r_i > V^\star, \\ \in [0, 1], & r_i = V^\star, \\ 0, & r_i < V^\star, \end{cases}$

with the normalisation $K$ fixed by summation to unity.

The expected payoff of any of these strategies is $V^\star$.

Proof

For (5) to describe all the minimax strategies for the Hider we must show it is both necessary and sufficient.

First let us show that it is necessary. To do this we need to show that, given the Searcher’s optimal strategy, the Hider cannot improve its payoff.

If we fix the Searcher’s probabilities to be its optimal strategy, the payoff for the Hider can be written as:

(22) $E\left[ V \,\middle|\, y_i = \max\left( \dfrac{r_i - V^\star}{p_i}, 0 \right) \right] = \sum_{i : r_i \ge V^\star} x_i V^\star + \sum_{i : r_i < V^\star} x_i r_i$,

and by inspection we see that, since the probabilities must sum to unity, the Hider can maximise its payoff by transferring any probability from the sites $r_i < V^\star$ to those with $r_i \ge V^\star$, i.e. $x_i = 0 \; \forall r_i < V^\star$. In combination with (7) we see that these conditions are necessary. N.B. the payoff in this case is given by

(23) $E\left[ V \,\middle|\, y_i = \max\left( \dfrac{r_i - V^\star}{p_i}, 0 \right) \right] = V^\star$.

We now need to check that these are sufficient, i.e. if we fix the Hider’s strategy to (5), we must check that the Searcher cannot improve its payoff by deviating from its strategy. To this end, let us consider deviations to the minimax strategy of

(24) $\tilde{y}_i = \max\left( \dfrac{r_i - V^\star}{p_i}, 0 \right) + \xi_i$,

with the condition that the $\xi_i$ must sum to zero and that the resulting probability must be non-negative. This latter condition implies $\xi_i \ge 0 \; \forall i : r_i \le V^\star$.

The payoff for the Searcher of such a deviation can thus be written as:

(25) $E[-V \mid \tilde{y}] = -V^\star + \sum_{i : r_i > V^\star} K \xi_i + \sum_{i : r_i = V^\star} x_i p_i \xi_i$.

It thus follows that increases in the payoff correspond to positive $\xi_i$ for $i : x_i p_i = K$ and (by summation to zero) corresponding negative $\xi_i$ for $i : x_i p_i < K$. Since this latter set $\{ i : x_i p_i < K \} \subseteq \{ i : r_i \le V^\star \}$, however, then those $\xi_i$ must be non-negative and we have a contradiction. Thus, there is no better strategy for the Searcher, and those strategies in (5) are optimal.□

2.1 Remarks and example

The reason this game is soluble by hand is that the dependence of the payoff on the site choice of the Searcher is restricted entirely to whether it chooses the same location as the Hider. In terms of algorithm, we see from Theorem 2.1 that the solution is described by a maximum over cumulative sums. These can be performed in O ( N ) operations, though we must first rank the r i in O ( N log N ) steps, making this the asymptotic scaling.
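The sort-then-accumulate procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the companion code: the function name `solve_single_search` is hypothetical, floating-point arithmetic is used throughout, and for the Hider it takes the zero-weight choice on any site with $r_i = V^\star$, which (5) permits.

```python
def solve_single_search(r, p):
    """Return (value, y, x) for Theorem 2.1 using one sort and one pass of running sums."""
    n = len(r)
    order = sorted(range(n), key=lambda i: r[i], reverse=True)  # O(N log N) ranking
    value = float("-inf")
    sum_rp = 0.0  # running sum of r_j / p_j over the sites ranked so far
    mu = 0.0      # running sum of 1 / p_j, i.e. mu(tau) in (12)
    for k, i in enumerate(order):
        sum_rp += r[i] / p[i]
        mu += 1.0 / p[i]
        # Evaluate J_tau only once all sites tied at this reward are included.
        if k + 1 == n or r[order[k + 1]] != r[i]:
            value = max(value, (sum_rp - 1.0) / mu)
    y = [max((ri - value) / pi, 0.0) for ri, pi in zip(r, p)]  # equation (4)
    w = [1.0 / pi if ri > value else 0.0 for ri, pi in zip(r, p)]  # equation (5)
    total = sum(w)
    x = [wi / total for wi in w]
    return value, y, x


# Rewards and penalties of Example 2.10.
r = list(range(1, 11))
p = [ri + 10 for ri in r]
value, y, x = solve_single_search(r, p)
```

Everything after the sort is a single linear pass, so the ranking step dominates the cost, in line with the scaling stated above.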

A special case that may be of interest is when the Hider suffers a fixed (still strictly positive) penalty for choosing the same site as the Searcher, i.e. $p_i = P$. In this case, the Hider has probabilities uniform over $i : r_i > V^\star$, and zero for $i : r_i < V^\star$. The Searcher, on the other hand, has a probability structure very closely resembling $r_i$, since (4) gives $y_i = \max(r_i - V^\star, 0) / P$.
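As a worked instance of (3) under this constant-penalty assumption, write $n_i = |L(r_i)|$ for the number of sites with reward at least $r_i$ (a symbol introduced here purely for illustration). Then $\mu(r_i) = n_i / P$, and the value reduces to the following form.

```latex
V^\star
  = \max_i \frac{-1 + \frac{1}{P}\sum_{j \in L(r_i)} r_j}{n_i / P}
  = \max_i \left[ \frac{1}{n_i} \sum_{j \in L(r_i)} r_j \;-\; \frac{P}{n_i} \right]
```

In words, each candidate is the mean of the $n_i$ largest rewards less the penalty shared equally across those sites.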

Example 2.10

(Alice and Bob) Alice and Bob play a game where they each pick a number between 1 and 10. If they choose different numbers, then Bob gives Alice the value of her number in dollars; however, if they choose the same, Alice must give Bob 10 dollars. This corresponds to $r = (1, 2, \ldots, 10)$ and $p = (11, 12, 13, \ldots, 20)$. By (3) we have $V^\star = J_5 = 357{,}730 / 80{,}507 \approx 4.4434$, and the smallest number that either player should pick is 5. The solution is for Alice to pick $i \ge 5$ with probability

(26) $x_i = \dfrac{232{,}560}{80{,}507} \cdot \dfrac{1}{10 + i}$

(and zero for $i < 5$), and Bob to pick $i$ with probability

(27) $y_i = \max\left( \dfrac{i - J_5}{10 + i}, 0 \right)$,

which is plotted in Figure 1.
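The rational values quoted in Example 2.10 can be reproduced with exact arithmetic. The sketch below is a brute-force check rather than the efficient algorithm: it evaluates $J_\tau$ from (14) for every candidate $\tau$ using Python's `fractions` module, and all variable names are illustrative.

```python
from fractions import Fraction

r = [Fraction(i) for i in range(1, 11)]
p = [ri + 10 for ri in r]  # a match costs Alice 10 dollars, so r_i - p_i = -10


def j_tau(tau):
    """Equation (14) evaluated exactly over L(tau) = {i : r_i >= tau}."""
    idx = [i for i in range(len(r)) if r[i] >= tau]
    mu = sum(1 / p[i] for i in idx)
    return (sum(r[i] / p[i] for i in idx) - 1) / mu


value = max(j_tau(tau) for tau in r)  # equation (3)
y = [max((ri - value) / pi, Fraction(0)) for ri, pi in zip(r, p)]  # equation (27)
support = [i for i in range(len(r)) if r[i] > value]
k_norm = 1 / sum(1 / p[i] for i in support)  # normalisation K in (5)
x = [k_norm / p[i] if i in support else Fraction(0) for i in range(len(r))]  # (26)
payoff = sum(xi * (ri - pi * yi) for xi, yi, ri, pi in zip(x, y, r, p))  # (2)
```

Evaluating every $\tau$ separately costs $O(N^2)$ operations, which is acceptable for ten sites and keeps the check independent of the faster cumulative-sum route.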

Figure 1

Optimal strategies for the game given in Example 2.10 as an illustration of Theorem 2.1. Alice and Bob pick a number between 1 and 10. If they choose different numbers, then Bob gives Alice the value of her number in dollars; however, if they choose the same, Alice must give Bob 10 dollars. Left panel gives probabilities for the optimal strategies (26) and (27) for Alice (the Hider, blue circles) and Bob (the Searcher, red squares). Right panel shows $V^\star = 357{,}730 / 80{,}507$ (filled circle) as a maximum over $r_i$.

3 Extensions to multiple searches

Let us now consider the case where the Searcher can pick multiple sites, applying a penalty if any of the predictions are exactly correct, and the penalty is only applied once (the case where the penalty is linear in the number of correct predictions can be treated as a re-scaling of the problem in Section 2).

Let us assume there are $Y \in \mathbb{N}^+$ searches at sites $j_\kappa$ with $\kappa \in \{1, 2, \ldots, Y\}$. With only a slight abuse of notation we write the payoff for the Hider,

(28) $\Gamma_{ij}^{Y} = \begin{cases} r_i, & i \notin j, \\ r_i - p_i, & i \in j, \end{cases}$

and we additionally define

(29) $\nu_{\mathrm{low}} = \max_i (r_i - p_i)$,

a lower bound on the value of the game if the Hider were always caught.

Two things are immediately apparent, that optimal choices for the Searcher are disjoint, and that the expectation depends only on the marginal probabilities of a search of site j (i.e. it is independent of permutations). This “coordinated” problem is similar to combinatorial games such as [19, chapter 2] and we provide a solution in Section 3.1.

In Section 3.2, a second problem studied is where each search j κ is i.i.d., with distribution set by the Searcher. This is a separable non-linear problem, similar to the resource allocation problem of ref. [17].

Algorithms to solve these cases are composed of steps commonly described elsewhere (sampling a marginal distribution without replacement, solving piecewise differentiable monotonic functions etc.); however, testing an implementation for computational efficiency and for errors in special cases is not always trivial, therefore, we provide one in the following link: https://github.com/pec27/rams.

3.1 Y coordinated searches

For $Y$ coordinated searches let us define $\pi_j$ as the marginal (or inclusion) probability of a search at site $j$. The expected payoff for the Hider can then be written as:

(30) $E[\Gamma_{ij}^{Y}] = \sum_i x_i (r_i - p_i \pi_i)$,

with the requirement that $\pi_j \in [0, 1]$ and $\sum_j \pi_j = Y$.

Theorem 3.1

The value of this game to the Hider is

(31) $V_Y = \max(\nu_{\mathrm{low}}, \nu_Y)$,

where we have defined

(32) $\nu_Y = \max_i \dfrac{-Y + \sum_{j : r_j \ge r_i} r_j / p_j}{\sum_{k : r_k \ge r_i} 1 / p_k}$.

The optimal strategies for the Searcher are those, and only those, with inclusion probabilities for the searches that satisfy

(33) $\pi_i \ge \max\left( \dfrac{r_i - V_Y}{p_i}, 0 \right)$,

with the lower bound replaced by equality when $V_Y = \nu_Y$.

The probabilities for the Hider can be classified into two cases depending on which of $\nu_{\mathrm{low}}$ or $\nu_Y$ is greater, and these are described as follows.

(34) $x_i = \begin{cases} \dfrac{K}{p_i} \times \begin{cases} 1, & r_i > \nu_Y, \\ \in [0, 1], & r_i = \nu_Y, \\ 0, & r_i < \nu_Y, \end{cases} & \nu_{\mathrm{low}} \le \nu_Y, \\[2ex] \begin{cases} \in [0, 1], & r_i - p_i = \nu_{\mathrm{low}}, \\ 0, & r_i - p_i < \nu_{\mathrm{low}}, \end{cases} & \nu_{\mathrm{low}} > \nu_Y, \end{cases}$

requiring the $x_i$ sum to unity. In the case $\nu_Y \ge \nu_{\mathrm{low}}$ summation to unity is fixed via the normalisation factor $K$, analogous to (5). In the case $\nu_{\mathrm{low}} > \nu_Y$ note $\exists i : r_i - p_i = \nu_{\mathrm{low}}$ since $\nu_{\mathrm{low}}$ is the maximal value.
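The quantities in Theorem 3.1 can be evaluated with the same sort-and-accumulate pass as the single-search case. The Python sketch below is illustrative only: `solve_coordinated` is a hypothetical name, and the returned list holds the lower bounds of (33), which are the inclusion probabilities themselves only when $V_Y = \nu_Y$.

```python
def solve_coordinated(r, p, n_searches):
    """Return (V_Y, nu_low, nu_Y, lower bounds on pi_i) following (29) and (31)-(33)."""
    n = len(r)
    nu_low = max(ri - pi for ri, pi in zip(r, p))  # equation (29)
    order = sorted(range(n), key=lambda i: r[i], reverse=True)
    nu_y = float("-inf")
    sum_rp = 0.0
    mu = 0.0
    for k, i in enumerate(order):
        sum_rp += r[i] / p[i]
        mu += 1.0 / p[i]
        # Equation (32): note Y rather than 1 in the numerator.
        if k + 1 == n or r[order[k + 1]] != r[i]:
            nu_y = max(nu_y, (sum_rp - n_searches) / mu)
    value = max(nu_low, nu_y)  # equation (31)
    lower = [max((ri - value) / pi, 0.0) for ri, pi in zip(r, p)]  # equation (33)
    return value, nu_low, nu_y, lower


# Example 2.10 with two coordinated searches: the nu_Y branch applies.
r = list(range(1, 11))
p = [ri + 10 for ri in r]
value, nu_low, nu_y, pi = solve_coordinated(r, p, 2)

# A hypothetical two-site game in which both sites can be searched at once,
# so the Hider is always caught and the nu_low branch applies.
caught = solve_coordinated([10, 1], [1, 1], 2)
```

In the second game the lower bounds sum to less than $Y$, and the Searcher is free to distribute the remaining probability in any way that keeps every $\pi_i \le 1$.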

3.1.1 Proof

The procedure of this proof is to first try to solve the problem with one constraint removed and test if this solution also satisfies the removed constraint. In the case that it does not, we guess the solution set and prove stability.

Let us first consider the problem where we remove the constraint that $\pi_i \le 1 \; \forall i$. The solution to this is analogous to the problem with only one Searcher, with no upper bound on the $\pi_i$ except that indirectly imposed by the normalisation $\sum_i \pi_i = Y$. We denote the optimal solution for the inclusion probabilities for the searches of this reduced problem as

(35) $\bar{\pi}_i = \max\left( \dfrac{r_i - \nu_Y}{p_i}, 0 \right)$,

with $\nu_Y$ defined as in (32) (taking note of $Y$ in the numerator there to account for the normalisation).

We now check whether this satisfies the constraint $\pi_i \le 1$. Since $p_i > 0$, rearrangement of (35) gives us

(36) $\bar{\pi}_i \le 1 \Leftrightarrow r_i - p_i \le \nu_Y$,

and this is true for all $i$ iff

(37) $\nu_{\mathrm{low}} \le \nu_Y$,

with $\nu_{\mathrm{low}} = \max_i (r_i - p_i)$ as defined in (29).

The corresponding probability for the Hider is

(38) $\bar{x}_i = \dfrac{K}{p_i} \times \begin{cases} 1, & r_i > \nu_Y, \\ \in [0, 1], & r_i = \nu_Y, \\ 0, & r_i < \nu_Y, \end{cases}$

with $K$ chosen to fix summation to unity, which is the upper case of (34).

Let us now consider the solution when this is violated.

3.1.2 Solution for $\nu_{\mathrm{low}} > \nu_Y$

In this case, the Hider can guarantee a return of $\nu_{\mathrm{low}}$ by picking a site with maximal $r_i - p_i$. The strategy for the Searcher is to choose all of these sites with probability one, degenerate over the choice of the remainder.

Formally, let us begin with an ansatz for the inclusion probability distribution for the searches. We consider

(39) $\pi_i^{\nu} \in \left[ \max\left( \dfrac{r_i - \nu_{\mathrm{low}}}{p_i}, 0 \right), 1 \right] \quad \forall i$,

chosen s.t. $\sum_i \pi_i^{\nu} = Y$. Note this is always possible since $\max\left( \frac{r_i - \nu_{\mathrm{low}}}{p_i}, 0 \right) \le \bar{\pi}_i$ and by substitution

(40) $\sum_i \max\left( \dfrac{r_i - \nu_{\mathrm{low}}}{p_i}, 0 \right) \le Y$.

What is the best response $x$ to this probability distribution? Since the problem is linear we should maximise our probability at the sites of maximal expected value $r_i - p_i \pi_i^{\nu}$.

Note $r_i - p_i \pi_i^{\nu} \ge r_i - p_i$ and the LHS achieves its maximum at $\pi_i^{\nu} = 1$ and $r_i - p_i$ maximal, which occurs at $i : r_i - p_i = \nu_{\mathrm{low}}$. The best response strategy to $\pi^{\nu}$ is thus to distribute all probability over these maximal values, i.e.

(41) $x_i^{\nu} \begin{cases} \in [0, 1], & r_i - p_i = \nu_{\mathrm{low}}, \\ = 0, & r_i - p_i < \nu_{\mathrm{low}}, \end{cases}$

and the expected value of this strategy is $\nu_{\mathrm{low}}$ (since there is always a search with probability 1 at each site the Hider visits).

We now ask if there is a better strategy for the Searcher, and we can trivially see this is false, since it already catches the Hider every time (and there is no direct payoff dependency on the sites of the searches). Having shown these strategies satisfy minimax, let us proceed to verify that no other strategies do.

We check that for any other strategy for the Searcher, the Hider has a better response, and correspondingly for any other strategy for the Hider, the Searcher has a better response. Beginning with the Searcher, suppose $\exists j$ s.t.

(42) $\pi_j \in \left[ 0, \dfrac{r_j - \nu_{\mathrm{low}}}{p_j} \right)$,

i.e. at some site where the lower bound in (39) is strictly positive we choose a probability outside that interval. We see this is only possible for $j : r_j > \nu_{\mathrm{low}}$. Then let us set $x_i = \delta_{ij}$, and the expected payoff for the Hider lies in $(\nu_{\mathrm{low}}, r_j]$, i.e. it is $> \nu_{\mathrm{low}}$ and we are done.

Correspondingly for the Hider, we try $\tilde{x}_j > 0$ for $j : r_j - p_j < \nu_{\mathrm{low}}$. Note, however, that since $\nu_{\mathrm{low}} > \nu_Y$ then the Searcher has probability to ‘spare,’ i.e. every site can be made to have value (for the Hider) $\nu_{\mathrm{low}}$ without saturation of the probabilities (in the sense of (40)). The Searcher can trivially add another to this site, making the expected value $(1 - \tilde{x}) \nu_{\mathrm{low}} + \tilde{x} (r_j - p_j) < \nu_{\mathrm{low}}$ and the payoff for the Hider is reduced.

Finally, we note that in this case the expected payoff is $V_Y = \nu_{\mathrm{low}}$, and so substitution of this into (39) is equivalent to the inequality in (33), and this completes the proof of Theorem 3.1.

3.1.3 Algorithmic solution and remarks

Given the explicit formulae for the value and probabilities in Theorem 3.1, the only operation that requires a non-trivial algorithm is that of choosing Y sites from N without replacement given the { π i } . Such an algorithm is given by the splitting method of ref. [16], which recursively performs a (weighted) random decision between reducing the number of sites by 1, or performing all the remaining samples from a uniform distribution, dependent upon the largest and smallest π i (in the remaining sites). This process is guaranteed to complete in at most N steps, and as such we are bounded by ranking the π i , i.e. in O ( N log N ) steps.
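For readers who want a quick way to draw $Y$ distinct sites with prescribed inclusion probabilities, the sketch below uses systematic sampling, a simpler standard technique that is swapped in here in place of the splitting method of ref. [16]. It assumes every $\pi_i \le 1$ and that the $\pi_i$ sum to an integer; the function name and the example probabilities are illustrative.

```python
import random


def systematic_sample(pi, rng=None):
    """Pick round(sum(pi)) distinct sites so that site i is included with probability pi[i]."""
    if rng is None:
        rng = random.Random()
    size = round(sum(pi))
    if size == 0:
        return []
    last = max(i for i, prob in enumerate(pi) if prob > 0.0)
    u = rng.random()  # one uniform offset in [0, 1) shared by all picks
    chosen = []
    lo = 0.0
    for i, prob in enumerate(pi):
        # Site i owns [lo, hi); the final edge is pinned to `size` to absorb rounding.
        hi = float(size) if i == last else lo + prob
        # The next unused point is u + len(chosen); widths <= 1 allow one hit per site.
        if len(chosen) < size and lo <= u + len(chosen) < hi:
            chosen.append(i)
        lo = hi
    return chosen


# Hypothetical inclusion probabilities summing to Y = 2.
example_pi = [0.5, 0.25, 1.0, 0.25]
sample = systematic_sample(example_pi, random.Random(2022))
```

Systematic sampling reproduces the marginals $\pi_i$ but its joint distribution depends on the ordering of the sites; since the expected payoff (30) depends only on the marginals, this does not alter the value of the game.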

3.2 Non-coordinated searches

In this section, we apply the additional restriction to the game in (28) that the Searcher chooses $Y > 1$ sites via independent draws from distribution $y_i$ (with replacement). The expected payoff for the Hider is thus

(43) $E[\Gamma^{Y}] = \sum_i x_i \left( r_i - p_i + p_i (1 - y_i)^{Y} \right)$,

with constraints that the $x_i$ and $y_i$ be non-negative and sum to unity. The term $(1 - y_i)^{Y}$ appears as the probability that zero searches are performed on site $i$ (independence).

Minimax still applies to this problem since it is a sum of concave–convex functions (for $y_i \ge 0$) (see e.g. ref. [17]). Some remarks on the algorithmic solution to the following theorem are given in Section 3.2.2.

Theorem 3.2

The optimal distribution for the searches is given by

(44) $y_j = 1 - \min\left( 1 - \dfrac{r_j - V^\star}{p_j}, 1 \right)^{1/Y}$

to choose site $j$, with $V^\star$ being the value of this game for the Hider. $V^\star$ is the single root of the equation

(45) $\sum_i y_i(V^\star) = 1, \quad V^\star \in [\nu_{\mathrm{low}}, \max_k r_k)$.

The optimal strategy/strategies for the Hider depends on whether there is a pure solution for the Searcher, i.e. whether $\exists j : y_j = 1$. These are explicitly

(46) $x_i = \begin{cases} y_i, & \exists j : y_j = 1, \\[1ex] \dfrac{K_{NC}}{p_i} \times \begin{cases} (1 - y_i)^{1 - Y}, & r_i > V^\star, \\ \in [0, 1], & r_i = V^\star, \\ 0, & r_i < V^\star, \end{cases} & \text{otherwise}, \end{cases}$

with (in the lower case) $K_{NC}$ chosen to fix the sum of $x_i$ to unity.

3.2.1 Proof

The Gibbs lemma w.r.t. $y_i$ gives

(47) $Y x_i p_i (1 - y_i)^{Y - 1} \begin{cases} = K, & y_i > 0, \\ \le K, & y_i = 0. \end{cases}$

This gives us the following corollaries.

Corollary 3.3

$K \ge 0$.

Proof

Combining cases of (47) we have $K \ge Y x_i p_i (1 - y_i)^{Y - 1}$. Since all terms on the RHS are non-negative we have $K \ge 0$.□

Corollary 3.4

For $K = 0$, $y_i = 0 \Rightarrow x_i = 0$. This immediately follows from the lower case of (47) (we know $p_i$ and $Y$ are strictly positive).

Lemma 3.5

$K = 0 \Leftrightarrow \exists j : y_j = 1$.

Proof

First let us show $K = 0 \Rightarrow \exists j : y_j = 1$. If $K = 0$, then at any $i$ where $y_i = 0$ we have $x_i = 0$ from Corollary 3.4. For the upper case of (47), for the LHS to be zero we must have either $x_i = 0$ or $y_i = 1$. However, since $x_i$ must be non-zero for some $i$ (in order to sum to unity), then there must be at least one $i$ s.t. $y_i = 1$.

Now let us show $K = 0 \Leftarrow \exists j : y_j = 1$. At this $j$, $y_j > 0$ and so the upper case of (47) must apply. Substituting $y_j = 1$ (and knowing $Y > 1$) we have $K = 0$.□

Corollary 3.6

$\exists j : y_j = 1 \Rightarrow x_i = y_i \; \forall i$ (Pure solutions are equal).

Proof

By summation of the $y_i$ to unity we have $y_i$ either 0 or 1, and there is only one $j$ s.t. $y_j = 1$. From Corollary 3.4 we have $x_i = 0$ at $y_i = 0$, and since the $x_i$ must also sum to unity we must have $x_j = 1$, and so $x_i = y_i \; \forall i$.□

Corollary 3.7

$y_i > 0 \Rightarrow x_i > 0$ (Searcher does not choose sites the Hider does not visit)

Proof

When $K = 0$ we have $x_i = y_i$ from Lemma 3.5 and Corollary 3.6, and the result is trivial. When $K > 0$, $y_i > 0$ is the upper case of (47). For the LHS to be strictly positive we require $x_i > 0$.□

The Gibbs lemma w.r.t. x i gives

(48) $r_i - p_i + p_i (1 - y_i)^{Y} \begin{cases} = V^\star, & x_i > 0, \\ \le V^\star, & x_i = 0, \end{cases}$

where the constant on the RHS has been noted as $V^\star$ to fulfil the sum in (43).

Corollary 3.8

$r_i > V^\star \Rightarrow y_i > 0$

Proof

Suppose otherwise, i.e. $y_i = 0$. Substitution into the combined cases of (48) gives $r_i \le V^\star$ and we have a contradiction.□

Corollary 3.9

$r_i \le V^\star \Rightarrow y_i = 0$

Proof

Suppose otherwise, i.e. $y_i > 0$. By Corollary 3.7 we have $x_i > 0$ and the upper case of (48) applies, i.e. $r_i - p_i + p_i (1 - y_i)^{Y} = V^\star$. For $y_i > 0$ the LHS is $< r_i$ and we have $r_i > V^\star$, a contradiction.□

Lemma 3.10

$y_i$ can be written

(49) $y_i = \begin{cases} 1 - \left( 1 - \dfrac{r_i - V^\star}{p_i} \right)^{1/Y}, & r_i > V^\star, \\ 0, & r_i \le V^\star, \end{cases}$

defined over $V^\star \ge r_i - p_i$ (which by inspection is equivalent to (44)).

Proof

The lower case follows immediately from Corollary 3.9.

For the upper case, we have $r_i > V^\star \Rightarrow y_i > 0$ by Corollary 3.8. By Corollary 3.7 we have $x_i > 0$, and the upper case of (48) applies, i.e. $r_i - p_i + p_i (1 - y_i)^{Y} = V^\star$. Rearranging gives the formula for $y_i$.□

Lemma 3.11

Taking $y_i$ from (44) to be functions of $V$, i.e. $y_i(V)$, then they are monotonically decreasing, and strictly decreasing where $y_i > 0$.

Proof

By inspection they are monotonically decreasing. $y_i > 0$ over the domain $[r_i - p_i, r_i)$. Over the open interval $(r_i - p_i, r_i)$ we have $y_i \in (0, 1)$ and its derivative is

(50) $\dfrac{\partial y_i}{\partial V} = -\dfrac{1}{Y p_i} (1 - y_i)^{1 - Y} < 0$,

strictly negative for $p_i > 0$, $Y > 1$, hence $y_i$ is strictly decreasing over $[r_i - p_i, r_i)$.□

Lemma 3.12

The root of $\sum_i y_i(V) = 1$ lies in the interval $[\max_j (r_j - p_j), \max_k r_k)$, i.e. (45).

Proof

By substitution of $y_i \in [0, 1]$ into the combined case of (48) we have

(51) $V^\star \ge \max_i (r_i - p_i)$.

By substitution of $\max_k r_k$ into (44) we see

(52) $\sum_i y_i(\max_k r_k) = 0$,

and since the sum is a monotonically decreasing function, $\max_k r_k$ is a strict upper bound.□

Corollary 3.13

$V^\star$ is the unique root of $\sum_i y_i(V) = 1$.

Proof

By minimax we know some solution exists in the interval of Lemma 3.12. The arguments of the sum are monotonically decreasing, and to match the RHS there must at least be some $i$ for which $y_i(V^\star) > 0$. By Lemma 3.11, it is strictly decreasing here and consequently the sum also. Hence, the root is unique.□

Lemma 3.14

For the case $K > 0$, the solutions for the Hider are those and only those which satisfy the lower case of (46), i.e.

$x_i = \dfrac{K_{NC}}{p_i} \times \begin{cases} (1 - y_i)^{1 - Y}, & r_i > V^\star, \\ \in [0, 1], & r_i = V^\star, \\ 0, & r_i < V^\star, \end{cases}$

where $K_{NC} = K / Y$.

Proof

The upper two cases follow immediately from (47). The case $r_i < V^\star \Rightarrow y_i = 0$ (Corollary 3.9) and to simultaneously satisfy (48) we must have $x_i = 0$. Hence, these conditions are necessary.

For sufficiency, we can substitute the explicit expression for $x_i$ back into (43) and find the optimal $y_i$. Since that is a convex problem, however, we need to only verify that substitution of (44) satisfies the first-derivative conditions of (48) with value $V^\star$. This completes the proof of Theorem 3.2.□

3.2.2 Algorithmic solution and remarks

In the left panel of Figure 2 we illustrate root-finding for the monotonic piecewise-analytic function in (45) that is applied in the companion code. For the left-most point ν low , marked by the open circle, the derivative is singular (though the sum itself is finite). For all intervals right of this we have bounded derivative, and since they are convex and monotonically decreasing the Newton method from the left point has guaranteed convergence. For the left-most interval a binary search is performed until we have a new left-bound and the above can be applied.
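As a simplified stand-in for the Newton-plus-binary-search scheme described above, the root of (45) can also be bracketed by plain bisection, which needs only the monotonicity of Lemma 3.11. The Python sketch below is an illustration under that substitution and is not the companion code; the function names and the tolerance are assumptions, and the Hider's strategy takes zero weight on any site with $r_i = V^\star$.

```python
def search_probs(value, r, p, n_searches):
    """Equation (44): y_j(V) for a trial value V >= max_i (r_i - p_i)."""
    probs = []
    for rj, pj in zip(r, p):
        base = min(1.0 - (rj - value) / pj, 1.0)
        base = max(base, 0.0)  # rounding guard at the left edge V = nu_low
        probs.append(1.0 - base ** (1.0 / n_searches))
    return probs


def solve_non_coordinated(r, p, n_searches, tol=1e-13):
    """Bisect sum_i y_i(V) = 1 on [nu_low, max r], where the sum is non-increasing."""
    lo = max(rj - pj for rj, pj in zip(r, p))  # the sum is >= 1 here
    hi = max(r)                                # the sum is 0 here
    if sum(search_probs(lo, r, p, n_searches)) <= 1.0:
        return lo, search_probs(lo, r, p, n_searches)  # pure Searcher strategy
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if sum(search_probs(mid, r, p, n_searches)) > 1.0:
            lo = mid
        else:
            hi = mid
    value = 0.5 * (lo + hi)
    return value, search_probs(value, r, p, n_searches)


def hider_probs(value, y, r, p, n_searches):
    """Equation (46), with zero weight chosen on any site with r_i = V."""
    if any(yi >= 1.0 for yi in y):
        return list(y)  # pure case: x_i = y_i
    w = [(1.0 - yi) ** (1 - n_searches) / pi if ri > value else 0.0
         for yi, ri, pi in zip(y, r, p)]
    total = sum(w)
    return [wi / total for wi in w]


# Rewards and penalties of Example 2.10, with one and with three independent searches.
r = list(range(1, 11))
p = [ri + 10 for ri in r]
v1, y1 = solve_non_coordinated(r, p, 1)
v3, y3 = solve_non_coordinated(r, p, 3)
x3 = hider_probs(v3, y3, r, p, 3)
```

Setting the number of searches to one recovers the single-search value of Theorem 2.1, which gives a convenient consistency check. Bisection converges more slowly than Newton's method but sidesteps the singular derivative at $\nu_{\mathrm{low}}$.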

Figure 2
Left: illustrative solution $V^\star$ (filled circle) as the root of the sum of monotonic piecewise-analytic functions $\sum_i y_i(V^\star) = 1$ (solid line), (45). The leftmost point at which the sum is defined is at $\nu_{\mathrm{low}}$ (open circle). For simplicity we give
                           
                              
                              
                                 
                                    
                                       r
                                    
                                    
                                       1
                                    
                                 
                                 >
                                 
                                    
                                       r
                                    
                                    
                                       2
                                    
                                 
                                 
                                 >
                                 ⋯
                                 >
                                 
                                    
                                       r
                                    
                                    
                                       5
                                    
                                 
                              
                              {r}_{1}\gt {r}_{2}\hspace{0.33em}\gt \cdots \gt {r}_{5}
                           
                        . Right: the effect of the number of searches 
                           
                              
                              
                                 Y
                              
                              Y
                           
                         on the hider probability 
                           
                              
                              
                                 
                                    
                                       x
                                    
                                    
                                       i
                                    
                                 
                              
                              {x}_{i}
                           
                         as a function of 
                           
                              
                              
                                 
                                    
                                       r
                                    
                                    
                                       i
                                    
                                 
                              
                              {r}_{i}
                           
                        , described in (46), illustrated at fixed 
                           
                              
                              
                                 
                                    
                                       p
                                    
                                    
                                       i
                                    
                                 
                                 =
                                 P
                              
                              {p}_{i}=P
                           
                        . Red, magenta, blue lines correspond to 
                           
                              
                              
                                 Y
                                 =
                                 1
                              
                              Y=1
                           
                        , 2, 3 searches, respectively, decreasing the value of the game (
                           
                              
                              
                                 
                                    
                                       V
                                    
                                    
                                       1
                                    
                                 
                                 ,
                                 
                                    
                                       V
                                    
                                    
                                       2
                                    
                                 
                                 ,
                                 
                                    
                                       V
                                    
                                    
                                       3
                                    
                                 
                              
                              {V}_{1},{V}_{2},{V}_{3}
                           
                        ) and increasing the dependence on the reward over the supported 
                           
                              
                              
                                 
                                    
                                       r
                                    
                                    
                                       i
                                    
                                 
                                 ≥
                                 
                                    
                                       V
                                    
                                    
                                       ⋆
                                    
                                 
                              
                              {r}_{i}\ge {V}^{\star }
                           
                        .
Figure 2

Left: illustrative solution V as the root of 1 (filled circle of the sum of monotonic piecewise analytic functions i y i ( V ) , solid line, (45). The leftmost point at which the sum is defined is at ν low , open circle. For simplicity we give r 1 > r 2 > > r 5 . Right: the effect of the number of searches Y on the hider probability x i as a function of r i , described in (46), illustrated at fixed p i = P . Red, magenta, blue lines correspond to Y = 1 , 2, 3 searches, respectively, decreasing the value of the game ( V 1 , V 2 , V 3 ) and increasing the dependence on the reward over the supported r i V .

In terms of algorithmic complexity, solving for the root of a convex, strictly monotonically decreasing function $f$ over a finite interval to a fixed precision can be considered to take a constant number of evaluations of $f$ and its derivative (albeit typically more than 1). In our case, $f$ is the sum of up to $N$ terms and thus we have an $O(N)$ bound. The combined algorithm (including sorting the $r_i$) is thus $O(N \log N)$.
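The cost accounting in the paragraph above can be written out as a short sketch. The symbols $K$ (the number of root-finder evaluations at the chosen precision), $c_{\rm sort}$ and $c_f$ (per-element constants) are introduced here only for illustration and do not appear in the original; the key assumption is that $K$ depends on the precision but not on $N$.

```latex
T(N) \;=\; \underbrace{c_{\rm sort}\, N \log N}_{\text{sorting the } r_i}
     \;+\; \underbrace{K \, c_f \, N}_{K \text{ evaluations of } f \text{ and } f'}
     \;=\; O(N \log N)
     \qquad \text{for } K \text{ independent of } N .
```

Under this accounting the sort dominates for large $N$, and the root search contributes only a linear term.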

In the right panel of Figure 2, we illustrate the behaviour of the hider probabilities as a function of the reward $r_i$ and the number of searches $Y$. In all cases, the hider probability is non-zero only for rewards above the value of the game. As the number of searches increases (keeping the $r_i$, $p_i$ fixed), the value of the game falls as the hider is forced to pick less rewarding sites. For the single-search case ($Y = 1$), the hider probability is independent of $r_i$ above the value of the game, whilst for $Y > 1$ a positive dependence is acquired. This is a consequence of the non-coordination, since for the coordinated version a constant value is always possible (34).

A novel feature of this solution is that it has an analytic continuation to continuous $Y \in (1, \infty)$, with $x_i$ and $y_i$ computed in the usual way. This has no meaningful interpretation (the Searcher can only perform an integral number of searches) but does provide some intuition for the behaviour.

4 Summary

In this work, we addressed the decision problem of making a rapid unpredictable choice from $N$ unequal options using a hide-search game approach. We extended the single-search game to include multiple simultaneous searches, both with and without coordination. The game with coordinated searches is solved in terms of marginal probabilities, and we give explicit solutions in all cases. For the game with multiple non-coordinated searches, we describe the value implicitly as the single root of a monotonically decreasing piecewise-convex function. Unlike more general two-player zero-sum games, these games permit algorithms that compute and sample their mixed strategies in $O(N \log N)$ steps. We provide a complete open-source implementation of all three algorithms.

Acknowledgments

The author would like to thank Thomas S. Ferguson and Annika Lang for reading drafts of this article and their comments and support, and to thank Ali Khan and Graeme Leese for early discussions of the problem in Example 2.10. PEC is employed at Mercuna Developments, an AI middleware company registered in Scotland, number SC545088.

  1. Conflict of interest: Author states no conflict of interest.

  2. Data availability statement: Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

References

[1] J. von Neumann, O. Morgenstern, and A. Rubinstein, Theory of Games and Economic Behavior (60th Anniversary Commemorative Edition), Princeton, NJ, Princeton University Press, 1944. ISBN 9780691130613.

[2] P. M. Vaidya, “Speeding-up linear programming using fast matrix multiplication,” in: Proceedings of the 30th Annual Symposium on Foundations of Computer Science, SFCS ’89, IEEE Computer Society, USA, 1989, pp. 332–337. ISBN 0818619821, doi:10.1109/SFCS.1989.63499.

[3] S. Jiang, Z. Song, O. Weinstein, and H. Zhang, Faster Dynamic Matrix Inverse for Faster LPs. arXiv e-prints, art. arXiv:2004.07470, April 2020.

[4] M. Jack, “Tactical position selection,” in: Game AI Pro., S. Game, Ed., Chapter 26. Boca Raton, CRC Press, 2013, pp. 337–359. doi:10.1201/9780429054969-1.

[5] E. Johnson, “Guide to effective auto-generated spatial queries,” in: Game AI Pro 3, chapter 26, S. Rabin, Ed., Boca Raton, CRC Press, 2017, pp. 309–325. doi:10.4324/9781315151700-26.

[6] M. Sakaguchi, “Two-sided search games,” J. Operat. Res. Soc. Japan, vol. 16, no. 4, pp. 207–225, Dec 1973.

[7] M. Dresher, Games of Strategy: Theory and Applications. Englewood Cliffs, NJ, Prentice-Hall, 1961.

[8] A. Washburn, Two-Person Zero-Sum Games. 4th edition, New York, Springer, 2014. doi:10.1007/978-1-4614-9050-0.

[9] L. A. Petrosyan and N. A. Zenkevich, Game Theory. Hackensack, NJ, World Scientific, 2016, ISBN 9789814725385, doi:10.1142/9824.

[10] T. S. Ferguson, Game Theory, Second edition. Hackensack, NJ, World Scientific, 2014.

[11] C. Kiekintveld, M. Jain, J. Tsai, J. Pita, F. Ordóñez, and M. Tambe, “Computing optimal randomized resource allocations for massive security games,” in: Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), vol. 1, pp. 689–696, 2009, ISBN 9780981738161, doi:10.5555/1558013.1558108.

[12] J. Letchford and V. Conitzer, “Solving security games on graphs via marginal probabilities,” Proc. AAAI Conference Artif. Intell., vol. 27, no. 1, pp. 591–597, June 2013. doi:10.1609/aaai.v27i1.8688.

[13] K. T. Lee, “A firing game with time lag,” J. Optim. Theory Appl., vol. 41, no. 4, pp. 547–558, December 1983. ISSN 0022-3239, doi:10.1007/BF00934642.

[14] T. Nakai, “A sequential evasion-search game with a goal,” J. Operat. Res. Soc. Japan, vol. 29, no. 2, pp. 113–122, 1986, doi:10.15807/jorsj.29.113.

[15] S. Alpern, R. Fokkink, R. Lindelauf, and G.-J. Olsder, “The ‘Princess and Monster’ game on an interval,” SIAM J. Control Optimization, vol. 47, no. 3, pp. 1178–1190, 2008. doi:10.1137/060672054.

[16] J. C. Deville and Y. Tillé, “Unequal probability sampling without replacement through a splitting method,” Biometrika, vol. 85, pp. 89–101, March 1998, doi:10.1093/biomet/85.1.89.

[17] J. Croucher, “Application of the fundamental theorem of games to an example concerning antiballistic missile defense,” Naval Res. Logistics Quarter., vol. 22, pp. 197–203, March 1975, doi:10.1002/NAV.3800220117.

[18] V. J. Baston and A. Y. Garnaev, “A search game with a protector,” Naval Res. Logistics, vol. 47, no. 2, pp. 85–96, 2000. https://eprints.soton.ac.uk/29734/. doi:10.1002/(SICI)1520-6750(200003)47:2<85::AID-NAV1>3.0.CO;2-C.

[19] W. H. Ruckle, Geometric Games and their Applications, Pitman, 1983.

Received: 2022-03-18
Accepted: 2022-04-24
Published Online: 2022-05-23

© 2022 Peter E. Creasey, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 22.1.2025 from https://www.degruyter.com/document/doi/10.1515/comp-2022-0243/html