
Reducing the Filtering Effect in Public School Admissions: A Bias-aware Analysis for Targeted Interventions


Yuri Faenza, Columbia University, New York, NY, yf2414@columbia.edu
Swati Gupta, Georgia Institute of Technology, Atlanta, GA, swatig@gatech.edu
Aapeli Vuorinen, Columbia University, New York, NY, aapeli.vuorinen@columbia.edu
Xuan Zhang, Columbia University, New York, NY, xz2569@columbia.edu


Problem definition: Traditionally, New York City’s top 8 public schools have selected candidates solely based on their scores in the Specialized High School Admissions Test (SHSAT). These scores are known to be impacted by the socioeconomic status of students and the test preparation received in middle schools, leading to a massive filtering effect in the education pipeline. The classical mechanisms for assigning students to schools do not naturally address problems like school segregation and class diversity, which have worsened over the years. The scientific community, including policymakers, has reacted by incorporating group-specific quotas and proportionality constraints, with mixed results. The problem of finding effective and fair methods for broadening access to top-notch education is still unsolved.

Methodology/results: We take an operations approach to the problem different from most established literature, with the goal of increasing opportunities for students with high economic needs. Using data from the Department of Education (DOE) in New York City, we show that there is a shift in the distribution of scores obtained by students that the DOE classifies as “disadvantaged” (following criteria mostly based on economic factors). We model this shift as a “bias” that results from an underestimation of the true potential of disadvantaged students. We analyze the impact this bias has on an assortative matching market. We show that centrally planned interventions can significantly reduce the impact of bias through scholarships or training, when they target the segment of disadvantaged students with average performance.

Managerial implications: To make these interventions incentive compatible and individually fair, we propose a randomization-based policy for allocation of training resources to students, which is heavily targeted towards average performers. Our results challenge existing notions of scholarships in the current education system. We believe that these insights can guide policymakers in answering a critical question: how should one allocate limited funding across schools and students to maximally help disadvantaged students?


bias, admissions, interventions, assortative matching, randomized policies

1 Introduction

Bias and the disparity in opportunities are believed to play a major role in access to education at different levels (Quinn Capers et al. 2017). It is known that outcomes of middle school admissions dictate high school admissions, which in turn impact pathways to higher studies (Corcoran and Baker-Smith 2018). Selection however starts much earlier, with gifted and talented programs screening students as young as 4 years old; these tests often see few students from ethnic minorities succeeding (Shapiro 2019b). In this work, we are motivated by high school admissions in large cities such as New York City (NYC), which has an extensive public school system, with a current enrollment of over one million students. In particular, every year roughly 80,000 students wish to join one of the 700 high school programs. By far, the most sought after public schools are the so-called Specialized High Schools (SHSs). By law, these schools select candidates solely based on their score on the Specialized High School Admissions Test (SHSAT) (NYC DOE 2019). However, such scores are known to be impacted by the socioeconomic status of students (Lovaglia et al. 1998) and the test preparation received in middle schools (Corcoran and Baker-Smith 2018, Shapiro 2019a). Since ethnic minorities tend to cluster in middle schools of lower quality (Boschma and Brownstein 2016), they are already at a disadvantage in high school admissions, which is then reflected in under-representation in higher education programs (Ashkenas et al. 2017). The result is a massive filtering effect in high school admissions: 50% (resp. 80%) of the students admitted to the SHSs come from only the top 5% (resp. 15%) of the middle schools (Corcoran and Baker-Smith 2018).

The goal of this work is to investigate data-driven interventions at the middle school level to reduce the filtering effect. An extensive literature has focused on doing so by proposing changes to admissions policies themselves (see Section 1.2). However, there are substantial political and legislative hurdles to implement admissions criteria that take the disparate backgrounds of students into account. For example, in 2003, an attempt by the University of Michigan to add 12 points for “diversity” on a 150 point scale to promote admissions of underrepresented ethnic minorities was met with a lawsuit, which was ultimately decided not in favor of the university (Gratz v. Bollinger 2003). Moreover, a 2019 plan supported by the then mayor of New York City to eliminate the entrance exam to SHSs has failed to gain enough support, and was not approved by the New York State Senate (Shapiro and Wang 2019). Hence, finding a way to incorporate mechanisms that help disadvantaged students access high-quality education while keeping the procedure fair and legal is still a fundamental open problem.

In this work, we take a completely different operations perspective. We focus on centralized pre-admission interventions, such as targeted preparatory courses for selected students, that do not involve a change in the admissions criteria. We introduce a matching model of schools and students where some students (that we call disadvantaged) are not evaluated at their true potential, but at a strictly lower level. We then investigate both theoretically and empirically the impact of such differences in treatment, as well as interventions to counter it, in the form of vouchers to access additional training. Our main contribution is a randomized policy for voucher allocation that is individually fair, incentive compatible and, by targeting average disadvantaged students, can substantially reduce the mistreatment they experience, as measured by various metrics. We next present the setup, intermediate results, and experiments leading to our main contribution.

1.1 Contributions

In order to present our mathematical model, let us first look into the characteristics and the mechanism for SHSs admissions in NYC. SHSs admit students uniquely based on the student’s score on the highly competitive SHSAT. The NYC Department of Education (DOE) acknowledges that there is a disparity in students’ abilities to prepare for the test, and so classifies some students as disadvantaged based on criteria such as their household income and the middle school they attended (NYC DOE 2018). Following NYC DOE’s definition, we divide the students who took SHSAT into two groups: non-disadvantaged ($G_1$) and disadvantaged ($G_2$). We find that the distributions of the SHSAT scores [Footnote 1: We obtained this data from the NYC DOE under a non-disclosure agreement.] of the two groups (Figure 1(a)) are significantly different, but match closely (as measured by Wasserstein distance) if we scale the scores of disadvantaged students by a factor $\frac{1}{\beta} \approx \frac{1}{0.88} \approx 1.13$ (Figure 1(b)). Motivated by this observation, in this paper we consider a model where the true potential $Z$ of a student is sampled from the Pareto distribution [Footnote 2: The choice of the Pareto distribution to model potentials is inspired by a body of empirical work (see, e.g., Clauset et al. (2009)) on the achievements of individuals in many professions. As we observe later, it also gives a good approximation to the SHSAT score distribution beyond a certain threshold (see Figure 2).], while the perceived potential $\widehat{Z}$ is equal to $Z$ for $G_1$ students and to $\beta Z$ for some $\beta \in (0,1)$ for $G_2$ students. We are not the first to use a multiplicative model [Footnote 3: See Appendix 9 for a discussion on how the multiplicative model better fits our data compared to alternative models of bias.] to scale the potential of agents: such a model has been pioneered in the literature by Kleinberg and Raghavan (2018), where it was also motivated by (different) experimental evidence (Wenneras and Wold 2010). Following Kleinberg et al. (2017), we call $\beta$ the bias factor [Footnote 4: Broadly speaking, this bias factor captures cases when candidates with the same level of knowledge, skills, or abilities have dissimilar algorithmic or perceived qualification due to their group membership (measurement bias) or when prediction relationships (i.e., predictor and outcome) are not equivalent for members of different groups (predictive bias) (Drasgow 1984).].
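The fitting step described above can be illustrated with a minimal sketch. It is not the authors' procedure: the SHSAT data is under a non-disclosure agreement, so the example uses synthetic Pareto samples, and the function names `wasserstein_1d` and `estimate_beta`, the grid of candidate values, and the equal-sized groups are all assumptions made for illustration. It searches for the scaling factor that brings the rescaled $G_2$ scores closest to the $G_1$ scores in empirical 1-Wasserstein distance.

```python
import random

def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def estimate_beta(g1_scores, g2_scores, grid):
    """Return the candidate beta for which g2_scores / beta best matches g1_scores."""
    return min(
        grid,
        key=lambda b: wasserstein_1d(g1_scores, [s / b for s in g2_scores]),
    )

# Synthetic data only (assumption): true potentials are Pareto(1, alpha), and
# G2 scores are observed through a multiplicative bias true_beta.
rng = random.Random(0)
alpha, true_beta, n = 8.9, 0.88, 20000
g1 = [rng.paretovariate(alpha) for _ in range(n)]
g2 = [true_beta * rng.paretovariate(alpha) for _ in range(n)]

grid = [0.70 + 0.01 * k for k in range(31)]  # candidates 0.70, 0.71, ..., 1.00
beta_hat = estimate_beta(g1, g2, grid)
print(round(beta_hat, 2))
```

With real data the two groups would generally differ in size, in which case one would compare quantile functions rather than sorted samples of equal length.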

Thus, in the following, we assume that schools rank students based on perceived potentials $\widehat{Z}$ (e.g., using SHSAT scores). To be able to find tractable policies, we assume in our theoretical analysis that all students share the same ranking of schools (e.g., based on US News Rankings). Although this assumption abstracts out considerations that may be important for students (such as proximity of a school (Burgess et al. 2015), limits on the length of preference lists (Calsamiglia et al. 2010), or strong preferences of students for certain high schools [Footnote 5: For instance in the 2016-17 SHSAT cohort, 56% of students indicated Stuyvesant or Brooklyn Tech as their first preference, with 76% naming at least one of the two in their top two preferences.]), we also later argue experimentally, by dropping this assumption, that our qualitative results are robust to relaxations of our stylized model.

After investigating the impact that bias has on both disadvantaged and non-disadvantaged students, we propose interventions such as additional training and scholarships towards disadvantaged students, quantifying the effect of those mechanisms and identifying the population they should be targeted to. We discuss qualitative results for the New York City Specialized High Schools by first estimating the multiplicative bias and fitting the Pareto distribution to the data, and then evaluating the effect of the interventions devised in the theoretical model. Our key findings are as follows:

Figure 1: Distribution of SHSAT scores across students in group $G_1$ versus the distribution of SHSAT scores across students in group $G_2$ for the 2016-17 academic year. Panel (a): original SHSAT scores. Panel (b): scaled SHSAT scores. We estimate the bias factor $\beta \approx 0.88$.

  1.

    Asymmetric impact and minority effect: We observe that, under reasonable assumptions on the parameters, such as disadvantaged students being a minority, the impact of bias on $G_2$ (disadvantaged) students is much bigger than the slight advantage that $G_1$ students obtain (see Figure 3 and Figure 7). Moreover, at a societal level, the presence of bias excludes most disadvantaged students from top schools (see Appendix 11 of the e-Companion and Figure 9 therein), explaining a phenomenon often seen in the real world (Shapiro 2021).

  2.

    Deterministic Centralized Interventions: Since numerically rescaling student scores has been ruled unconstitutional in many past cases (Gratz v. Bollinger 2003), we next consider centralized interventions such as extra training, problem solving groups, team building exercises, and refer to these collectively as vouchers in Section 4. These are provided to disadvantaged students, who we assume will benefit from these vouchers and be able to reveal their true potential to all schools (e.g., by retaking the SHSAT exam after additional training). To measure the impact of the bias on disadvantaged students, we first define the mistreatment of a student as the difference in the ranking of the school the disadvantaged student gets matched to under bias compared to the unbiased setting. We then study mechanisms to allocate vouchers to reduce aggregate measures of mistreatment, which we interpret as measures of (group) fairness. We first show that under two such very different measures, maximum benefit is achieved by providing vouchers to average-performing (rather than top) disadvantaged students, assuming as said before that their abilities are Pareto-distributed. These findings challenge existing scholarship/aid allocation mechanisms, addressing one of the key questions facing policy makers on how to distribute resources.

  3.

    Incentive Compatible Voucher Distribution: We next observe that the deterministic allocation of vouchers to average performers creates an incentive for some top students to underperform. More generally, we show that the only deterministic policy that is incentive compatible distributes vouchers to all students whose perceived potential is above a threshold. However, this policy has a small impact on reducing mistreatment. Therefore, we discuss two random allocations of vouchers that are incentive compatible. In particular, one of them (that we term Proportional to Mistreatment or PropM) still favors mid-performing students and guarantees that the maximum expected mistreatment is lower than under the best deterministic policy. The other is incentive compatible under general potential distributions. These policies have the additional benefit of being individually fair, in the sense of a Lipschitz condition, so that the probability of receiving a voucher for students with similar potential is similar.

  4.

    Experimental validation: In Section 6, we validate our theoretical results on admissions data to the SHSs for the academic year 2016-17. We define disadvantaged students following NYC DOE criteria. Although our model assumes that students have homogeneous preferences, we compute the stable matching using their SHSAT scores and reported heterogeneous preferences. We find our key theoretical takeaways to be still valid; for instance, the shape of students’ mistreatment resembles the theoretical prediction (including the fact that average performers are the most mistreated students) and that our voucher distribution program improves the mistreatment across the board. We further show that the ranges of students to give vouchers to, obtained from applying the stylized model to the real data, are qualitatively similar to the best ranges found via naive grid search. This leads to our policy insights, which we discuss next.

  5.

    Policy Insights: Motivated by the goal of maximizing the impact of limited available resources, in this work we propose that additional training and vouchers should be offered to average performers rather than top performers. At a high level, the two assumptions that lead to this result are (1) the concentration of students who perform around the average, compared to a much smaller cohort who make up the top performers, and (2) that given enough opportunities and support, the performance of the two cohorts of the students would be indistinguishable, motivating a multiplicative model of bias which allows the scaled distributions to match closely. The key phenomenon that arises due to these assumptions is that a small deviation in an average disadvantaged student’s perceived performance leads to a significant drop in their school rank.

    (1) is a key characteristic of many common distributions, including the Pareto distribution investigated in detail in this paper. (2) is supported by education and policy literature which shows additional resources can positively impact low achieving student groups (Greenwald et al. 1996, Dee and Jacob 2011). Our work complements this line of work through mathematical analysis, to target limited resources effectively. The multiplicative model and debiasing effect therefore justify the design of our randomized policies to distribute the vouchers.

    The current rationale behind most scholarship programs is to reward top performers, driven by a desire for meritocracy, and to drive better top performance by creating this competition. Our analysis, on the other hand, suggests that if the goal is to maximize impact on disadvantaged students, more support should be given to average performers for maximal impact in terms of highest gains in improvement of matched schools, driven by a desire for more equitable outcomes. Moreover, this can be achieved with an incentive compatible randomized policy tailored to the distribution of potentials. However, in the case that nothing is known about the distribution of student potentials (so, in particular, condition (1) cannot be assumed), we show that the only policies guaranteed to be incentive compatible are the ones which allocate vouchers to students with high SHSAT performance with higher probability (e.g., threshold policies). It is worth noticing that the use of randomization is perfectly viable in school choice settings, and many current mechanisms explicitly rely on it [Footnote 6: For instance, the NYC Department of Education assigns to each student a 32-hexadecimal digit random number that is used to break ties when two students have the same priority at some school. Ties are extremely frequent, e.g., in admissions to early school years. In fact, most public Kindergartens in New York City prioritize students based on three criteria only: whether a sibling is currently attending the school; whether the student lives in the same zone of the school; whether the student lives in the same district of the school. This means that the random number plays a key role in admission decisions at this level.].

Figure 2: Distribution of true potentials (scaled SHSAT scores) of students who score high enough to receive an offer from a SHS. The best fitting Pareto distribution (i.e., the “theoretical pdf” curve) has parameter $\alpha = 8.9$.

The rest of the paper is organized as follows: In Section 2, we formally introduce our mathematical models for the continuous matching market and multiplicative bias. In Section 3 we analyze the effects of bias on both disadvantaged and non-disadvantaged students, introducing the key concept of displacement. We then consider deterministic policies for reducing such bias via a centralized approach in Section 4, quantifying their impact on students and discussing various notions of (group) fairness. In particular, in Section 4 we present two theorems that quantify the optimal deterministic debiasing sets under two different measures of fairness. In Section 5, we show that such deterministic policies fail to be incentive compatible and individually fair. We introduce the randomized assignment of vouchers to satisfy these fairness conditions while at the same time achieving a lower maximum mistreatment than the deterministic policies. In Section 6 we apply our policies to the real-world dataset of SHSs admissions for the 2016-17 cohort. We close with a discussion in Section 7.

1.2 Related Work

Various selection problems have been investigated in models with a multiplicative bias introduced by Kleinberg and Raghavan (2018) (e.g., Celis et al. (2020, 2021), Salem and Gupta (2023), Emelianov et al. (2020)) but, to the best of our knowledge, this paper is the first to investigate it in the role of school choice. There exists some literature such as Hastings et al. (2009), Laverde (2020) on understanding the impact of family backgrounds on student preferences, but this is orthogonal to the questions we study here. Our work complements existing work in the education and policy literature which shows additional resources can positively impact low achieving student groups (Dee and Jacob 2011, Greenwald et al. 1996).

The most common way to model admissions to schools is through a two-sided market, with the two sides being schools and students, respectively, and each agent having an ordered preference on the agents from the other side of the market that are considered acceptable. This model has been used to match doctors to hospitals by the National Residency Matching Program since the 1960s, and it has gained widespread notoriety when Abdulkadiroğlu et al. (2005) used it to reform the admissions process for New York City public high schools in 2003. Since then, admission decisions have been centralized and are (essentially) governed by the classical Gale-Shapley Deferred Acceptance algorithm (Gale and Shapley 1962). The simplicity of the algorithm, as well as the drastic improvement to the quality of matches it provides when compared to the pre-2003 method, have led to academic and public acclaim, and spurred applications in many other systems (see, e.g., Biró (2008)). However, this mechanism does not naturally address problems like school segregation and class diversity, which have worsened and become more and more of a concern in recent years (Kamada and Kojima 2024, Kucsera and Orfield 2014, Shapiro March 26, 2019, Shapiro and Lai June 03, 2019). The scientific community, including policy-makers, has reacted e.g. by incorporating in the mathematical model group-specific quotas, proportionality constraints (Biró et al. 2010, Nguyen and Vohra 2019, Tomoeda 2018), but there is evidence that adding such constraints may even hurt the very students they were meant to help (Backes 2012, Fershtman and Pavan 2021, Hafalir et al. 2013) or create legal challenges.
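For readers unfamiliar with the mechanism mentioned above, the following is a minimal sketch of student-proposing deferred acceptance in the special case where every school ranks students by a single score, as the SHSs do with the SHSAT. It is an illustration under stated assumptions, not the NYC implementation: the function name `deferred_acceptance`, the input layout, and the toy students and schools are hypothetical, scores are assumed to be distinct (no tie-breaking), and every listed school is assumed to have a capacity entry.

```python
def deferred_acceptance(student_prefs, scores, capacity):
    """Student-proposing deferred acceptance with score-based school priorities.

    student_prefs: dict student -> list of schools, most preferred first
    scores:        dict student -> number (higher is better, assumed distinct)
    capacity:      dict school -> number of seats
    Returns a dict student -> school, or None for unmatched students.
    """
    next_choice = {s: 0 for s in student_prefs}  # index of next school to try
    held = {c: [] for c in capacity}             # tentative admits per school
    free = list(student_prefs)                   # students who must propose
    while free:
        s = free.pop()
        prefs = student_prefs[s]
        if next_choice[s] >= len(prefs):
            continue  # list exhausted: s stays unmatched
        c = prefs[next_choice[s]]
        next_choice[s] += 1
        held[c].append(s)
        if len(held[c]) > capacity[c]:
            # The school keeps its highest scorers and rejects the lowest one.
            rejected = min(held[c], key=lambda x: scores[x])
            held[c].remove(rejected)
            free.append(rejected)
    match = {s: None for s in student_prefs}
    for c, students in held.items():
        for s in students:
            match[s] = c
    return match

# Toy instance (hypothetical): two one-seat schools, three students.
prefs = {"a": ["X", "Y"], "b": ["X", "Y"], "c": ["X"]}
scores = {"a": 90, "b": 80, "c": 70}
result = deferred_acceptance(prefs, scores, {"X": 1, "Y": 1})
print(result)
```

In this toy instance the top scorer obtains the seat at X, the second scorer falls to Y, and the third student, who listed only X, remains unmatched.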

There is a long line of work on affirmative action policies in theory and in practice (Abdulkadiroğlu 2005, Arcidiacono et al. 2011, Chade et al. 2014, Chan and Eyster 2003, Hafalir et al. 2013, Quinn Capers et al. 2017) and alternatives like the top 10% admissions criteria implemented in Texas (Texas Comptroller of Public Accounts 2024). In our work, we do not consider proposing substitute mechanisms such as the top 10% criteria, due to significant deviation from current practice. Moreover, it is unclear whether this would improve the status quo or worsen it (e.g., Long (2004) found a significant impact on the admissions of minorities if affirmative action policies for college admissions were replaced by top $x$% rules). In this work, we take a completely different approach to improve the performance of disadvantaged students by voucher distribution, and help mitigate the impact of disparities. This will naturally help with the downstream impacts in the education pipeline towards economic opportunities (Kannan et al. 2019, Coate and Loury 1993), since the evaluation criteria, i.e., SHSAT, and the threshold for admissions remains unchanged in this work. Further, test-optional policies typically studied in the context of college admissions are likely to not be adaptable for high school admissions (Liu and Garg 2021, Dessein et al. 2023), due to the state law in New York.

Further, to the best of our understanding, our work differs from statistical discrimination theories in economics (Phelps 1972) or taste-based discrimination (Becker 2010), since in our setting the sole admissions criterion is performance on the SHSAT, as necessitated by state law, disbarring the use of additional characteristics of candidates. In fact, our approach tries to minimize the impact of unequal opportunities, through pre-admission resources like after-school coaching or simply the means to retake the SHSAT exam. A recent study (Niu et al. 2022) shows the impact of being able to retake SAT exams and that reporting all the scores leads to more equitable outcomes as well as a more accurate signal for colleges. Our work is aligned with the latter result, in the sense of creating a more accurate signal for disadvantaged students. Further, a recent study (Garg et al. 2021) focused on the design of a fair admissions process by identifying conditions where standardized tests should be dropped, while our paper mostly focuses on pre-admission policies. Furthermore, changing the admissions criteria for SHSs in New York would require changing a state law (Hecht-Calandra Act), which is a significant hurdle to the implementation of such policies (see, for example, Chin 2022).

Lastly, any admissions policy is susceptible to manipulation by applicants. Recent work by Hu et al. (2019) has considered strategic behavior of students in a classification setting, where each student can expend some bounded amount of resources to improve their test-score performance and convert a “reject” decision to an “accept” decision. The school can provide subsidies to students to reveal their true potential. Hu et al. (2019) show cases in which providing a subsidy can make the group receiving the subsidy worse-off. Though our work considers a completely different model, we also find cases in which the voucher distribution can in fact worsen some fairness metrics over the disadvantaged groups, or in which students may be strategic; see the discussions in Section 4 and Section 5.1.

2 A continuous matching market

We introduce a stylized matching model, where students rank schools following a unique strict order, and schools rank students following a unique order. For tractability of results, both schools and students are assumed to be continuous sets, following a recent trend in the literature (see Appendix 8 for discussion). We let the student population be a set $\Theta$, and we associate to $\Theta$ a probability distribution on the potentials of students. For each student $\theta \in \Theta$, we use $Z(\theta)$ to denote their true potential. Unless otherwise stated, we assume $Z(\theta) \sim \text{Pareto}(1,\alpha)$, and thus, all students have potentials at least $1$.

Continuous Model with One Group:

Let us first consider a case when every student’s true potential is visible to the schools, i.e., there is only one group of students. Consider the cumulative distribution function (cdf) of the distribution of potentials $\text{Pareto}(1,\alpha)$, given by $F(t) = 1 - t^{-\alpha}$. The domain of $F(\cdot)$ is $[1,\infty)$ and the range of $F$ is $[0,1]$. Let $\mu: \Theta \rightarrow [0,1]$ be the function that ranks students based on their potentials, assuming rankings to be normalized between 0 and 1 (0 being the best, and 1 worst). Note that $\mu(\theta) = 1 - F(Z(\theta))$, since $F(Z(\theta))$ is the fraction of students whose potential is lower than student $\theta$’s potential $Z(\theta)$. Since $1 - F(\cdot)$ (i.e., the complementary cumulative distribution function, ccdf) will appear in our analysis often, we use the notation $\bar{F} = 1 - F$. Schools are also parametrized by $[0,1]$, thus we let a student $\theta$ be assigned to school ranked $\mu(\theta)$. Therefore, we will at times overload the notation $\mu$ to directly imply the assignment of students to schools. We will often want to associate a (set of) students who are assigned to a school in a way which ensures that the total probability mass is preserved under the assignment (similar to an inverse probability transform (Grimmett and Stirzaker 2020)).
Consider the assignment $\mu: \Theta \rightarrow [0,1]$, where $\mu(\theta) = \bar{F}(Z(\theta)) = 1 - F(Z(\theta))$. Then, for school $s \in [0,1]$, $\mu^{-1}(s) = Z^{-1}(F^{-1}(1-s))$ is the set of students assigned to school $s$. Note that the students who are matched to schools with ranks in $(s_1, s_2]$ (with $0 \leq s_1 < s_2 \leq 1$) are those with potentials in $[F^{-1}(1-s_2), F^{-1}(1-s_1))$, so their fraction is $F(F^{-1}(1-s_1)) - F(F^{-1}(1-s_2)) = s_2 - s_1$.

Example 2.1

A student Maya $\in \Theta$ scores $Z(\text{Maya}) = 1.4$, with their score sampled from $\text{Pareto}(1,3)$ (i.e., $\alpha = 3$). The fraction of students who are better than Maya is equal to $\bar{F}(Z(\text{Maya})) = 1 - (1 - 1/1.4^{3}) \approx 0.3644$, which is also the school (rank) $s$ Maya is assigned to in the continuous model.
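The computation in Example 2.1 can be reproduced with a short sketch. The helper names `pareto_ccdf` and `potential_at_rank` are hypothetical and introduced only for illustration; they implement $\bar{F}(z) = z^{-\alpha}$ and its inverse for the $\text{Pareto}(1,\alpha)$ distribution of the one-group model.

```python
def pareto_ccdf(z, alpha):
    """Fraction of students with potential above z under Pareto(1, alpha)."""
    if z < 1:
        raise ValueError("potentials are at least 1 in this model")
    return z ** (-alpha)

def potential_at_rank(s, alpha):
    """Inverse map: potential of the student assigned to school rank s in (0, 1]."""
    if not 0 < s <= 1:
        raise ValueError("school ranks lie in (0, 1]")
    return s ** (-1.0 / alpha)

maya_rank = pareto_ccdf(1.4, alpha=3)
print(round(maya_rank, 4))  # 0.3644, the school rank from Example 2.1
```

Applying `potential_at_rank` to `maya_rank` recovers the potential 1.4 up to floating-point error, which is the one-to-one correspondence between potentials and school ranks used throughout the continuous model.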

Continuous Model with Two Groups:

We now consider two groups of students: non-disadvantaged $G_1$ and disadvantaged $G_2$ students. We assume that the $G_2$ students constitute a $p$ fraction of the entire student population for some $p \in [0,1]$, and that their perceived potentials are biased by a constant multiplicative factor $\beta \in (0,1]$. For students in $G_1$, we let their perceived potentials be exactly their true potentials and they account for $1-p$ proportion of the population. We let $\widehat{Z}(\theta)$ denote the perceived potential of a student $\theta \in \Theta$. That is, if $\theta \in G_1$, then $\widehat{Z}(\theta) = Z(\theta)$; otherwise, $\widehat{Z}(\theta) = \beta Z(\theta)$. The cdfs for $G_1$ and $G_2$ students are $F_1$ and $F_2$ respectively:

$$F_{1}(t)=1-t^{-\alpha};\qquad F_{2}(t)=1-\beta^{\alpha}t^{-\alpha}.$$

Note that both functions $F_{1}(\cdot)$ and $F_{2}(\cdot)$ take perceived potentials as their argument. Moreover, the domain of $F_{1}$ is $[1,\infty)$, whereas the domain of $F_{2}$ is $[\beta,\infty)$. We denote their respective ccdfs by $\bar{F}_{1}$ and $\bar{F}_{2}$. Now for a student $\theta\in\Theta$, we let $\widehat{\mu}(\theta)$ be equal to the fraction of students whose perceived potentials are higher than that of $\theta$ in the two-group case. We get:

$$\widehat{\mu}(\theta)=\begin{cases}(1-p)\bar{F}_{1}(\widehat{Z}(\theta))+p\bar{F}_{2}(\widehat{Z}(\theta))&\text{ if }\theta\in G_{1},\\[2.84526pt] (1-p)\bar{F}_{1}(\widehat{Z}(\theta)\vee 1)+p\bar{F}_{2}(\widehat{Z}(\theta))&\text{ if }\theta\in G_{2},\end{cases}\tag{1}$$

where $\vee$ is the maximum operator. As before, we say that student $\theta$ is assigned to school $\widehat{\mu}(\theta)$. Note that, when $\beta=1$ (i.e., no bias), formula (1) computes $\mu(\theta)$: $\mu(\theta)=\bar{F}_{1}(Z(\theta))$ for $\theta\in\Theta$. We let $\mu(\theta)$ be the school that student $\theta$ gets assigned to without any bias, and let $\widehat{\mu}(\theta)$ be the school that student $\theta$ is actually assigned to due to bias. For an assignment $\gamma\in\{\mu,\widehat{\mu}\}$ and a school $s\in[0,1]$, we again let $\gamma^{-1}(s)$ be the “set” of students assigned to school $s$ under matching $\gamma$.

Formally, we define a matching in this market to be a surjective measurable function $\gamma$ from $\Theta$ to $[0,1]$ (i.e., students to schools), such that the mass of students mapped to a set of schools $S\subseteq[0,1]$ coincides with the standard Lebesgue measure $\nu$ of $S$. In formulas, any surjective function $\gamma$ from $\Theta$ to $[0,1]$ is a matching if

$$\nu(\gamma^{-1}(S)):=(1-p)\int_{\theta\in\gamma^{-1}(S)\cap G_{1}}dF_{1}(\widehat{Z}(\theta))+p\int_{\theta\in\gamma^{-1}(S)\cap G_{2}}dF_{2}(\widehat{Z}(\theta))$$

is equal to the standard Lebesgue measure of $S$ for all $S\subseteq[0,1]$. One can easily check that $\mu$ and $\widehat{\mu}$ defined above are matchings.

Example 2.2

Student scores are again sampled from $\text{Pareto}(1,3)$. Maya $\in G_{2}$ again scores $Z(\text{Maya})=1.4$. Lisa $\in G_{1}$ instead scores $Z(\text{Lisa})=1.3$. In the unbiased setting, Maya gets matched to school $\bar{F}(Z(\text{Maya}))=1-(1-1/1.4^{3})\approx 0.3644$, while Lisa gets matched to $\bar{F}(Z(\text{Lisa}))=1-(1-1/1.3^{3})\approx 0.4552$. Letting $\beta=0.9$, we have $\widehat{Z}(\text{Maya})=1.26$ and $\widehat{Z}(\text{Lisa})=1.3$. Letting $p=0.2$, we have that in the biased setting Maya and Lisa are matched to schools

$$\widehat{\mu}(\text{``Maya''})=0.4729\quad\text{and}\quad\widehat{\mu}(\text{``Lisa''})=0.4305,$$

respectively: Maya is matched to a significantly worse school, and Lisa to a slightly better one, than in the setting without bias. Note that Lisa has a smaller true potential than Maya but is assigned to a better school in the biased setting.
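The computation in this example can be sketched in a few lines of Python that evaluate formula (1) directly. This is only an illustrative sketch: the helper names (`ccdf_g1`, `ccdf_g2`, `biased_school`) are ours, and the printed values agree with those reported in the example up to rounding in the last digit.

```python
def ccdf_g1(t: float, alpha: float) -> float:
    """ccdf of perceived potentials for G1 (domain [1, inf))."""
    return 1.0 if t <= 1.0 else t ** (-alpha)


def ccdf_g2(t: float, alpha: float, beta: float) -> float:
    """ccdf of perceived potentials for G2 (domain [beta, inf))."""
    return 1.0 if t <= beta else (beta / t) ** alpha


def biased_school(z_true: float, in_g2: bool, p: float,
                  alpha: float, beta: float) -> float:
    """School (rank) assigned under bias, following formula (1)."""
    z_hat = beta * z_true if in_g2 else z_true  # perceived potential
    # max(z_hat, 1) plays the role of the "vee 1" term in formula (1).
    return ((1 - p) * ccdf_g1(max(z_hat, 1.0), alpha)
            + p * ccdf_g2(z_hat, alpha, beta))


p, alpha, beta = 0.2, 3.0, 0.9
maya = biased_school(1.4, in_g2=True, p=p, alpha=alpha, beta=beta)
lisa = biased_school(1.3, in_g2=False, p=p, alpha=alpha, beta=beta)
print(f"{maya:.3f} {lisa:.3f}")  # 0.473 0.430
```

Setting `beta = 1.0` in the same sketch recovers the unbiased ranks of both students.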

3 Impact on Students

Our first goal is to understand how much perceived bias affects agents in the market. In particular, we would like to answer the following question: what is the loss of efficiency for students$^{7}$ when all students $\theta\in\Theta$ are assigned to school $\widehat{\mu}(\theta)$ instead of $\mu(\theta)$? Formally, we define $\widehat{\mu}(\theta)-\mu(\theta)$ to be the displacement of a student $\theta\in\Theta$. Note that if $\theta\in G_{1}$, the displacement is non-positive, and if $\theta\in G_{2}$, it is non-negative. The displacement can be easily calculated using the formulae for $\mu$ and $\widehat{\mu}$ given in (1).

$^{7}$ In Appendix 11 of the e-Companion, we take the schools’ perspective and show that there is effectively no loss of efficiency for schools under this model, creating little incentive for them to intervene at the individual school level. We also measure there the diversity of the admitted cohort in our model.

Proposition 3.1

For any student $\theta\in G_{2}$, the displacement $\widehat{\mu}(\theta)-\mu(\theta)$ is given by:

$$\widehat{\mu}(\theta)-\mu(\theta)=\begin{cases}\displaystyle(1-p)\left(Z(\theta)\right)^{-\alpha}\left(\beta^{-\alpha}-1\right)&\text{ if }Z(\theta)\geq\frac{1}{\beta},\\[2.84526pt] \displaystyle(1-p)\left(1-\left(Z(\theta)\right)^{-\alpha}\right)&\text{ if }Z(\theta)\leq\frac{1}{\beta}.\end{cases}$$

For any student $\theta\in G_{1}$, we have $\widehat{\mu}(\theta)-\mu(\theta)=\left(-p+p\beta^{\alpha}\right)\left(Z(\theta)\right)^{-\alpha}$. Thus, the maximum displacement of $(1-p)(1-\beta^{\alpha})$ is experienced by a $G_{2}$ student with potential $1/\beta$, and the most significant negative displacement of $-p(1-\beta^{\alpha})$ is experienced by a $G_{1}$ student with potential $1$.

One can think of this result intuitively in the following way. Starting from the top school, $G_{1}$ students gradually take up more seats than they deserve, and thus gradually push $G_{2}$ students to worse schools than what they deserve. This process stops once all $G_{1}$ students are assigned to schools, and the only students that remain to be assigned are $G_{2}$ students. As a result, in lower-ranked schools, all students are $G_{2}$ students. Hence, the difference in ranks of the schools $G_{2}$ students are matched to decreases towards the end. Figure 3 gives a pictorial illustration of Proposition 3.1. From there, one can clearly see how the most mistreated students are average performers. This intuition will be fundamental in devising policies to counter the effect of bias.
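Proposition 3.1 can also be explored numerically. The sketch below encodes the two displacement formulas and scans a grid of potentials to locate the most displaced $G_{2}$ student. The function names and the grid are ours, and the parameter values $p=0.2$, $\alpha=3$, $\beta=0.9$ are borrowed from Example 2.2 purely for illustration.

```python
def displacement_g2(z: float, p: float, alpha: float, beta: float) -> float:
    """Displacement of a G2 student with true potential z (Proposition 3.1)."""
    if z >= 1.0 / beta:
        return (1 - p) * z ** (-alpha) * (beta ** (-alpha) - 1)
    return (1 - p) * (1 - z ** (-alpha))


def displacement_g1(z: float, p: float, alpha: float, beta: float) -> float:
    """Displacement of a G1 student with true potential z (non-positive)."""
    return (-p + p * beta ** alpha) * z ** (-alpha)


p, alpha, beta = 0.2, 3.0, 0.9

# Scan potentials in [1, 4] and find where the G2 displacement peaks.
grid = [1.0 + 0.001 * k for k in range(3001)]
worst = max(grid, key=lambda z: displacement_g2(z, p, alpha, beta))

print(worst, 1.0 / beta)                         # peak lies next to 1/beta
print(displacement_g2(1.0 / beta, p, alpha, beta),
      (1 - p) * (1 - beta ** alpha))             # both equal the stated maximum
```

Under these parameters the peak sits at the grid point closest to $1/\beta\approx 1.11$, consistent with the observation that the most mistreated students are average performers rather than the very top or the very bottom.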

Figure 3: Schools students “should” versus “actually” attend. The green dotted line is a line of slope one, representing the place a student should be placed if there is no bias in the system.

4 Deterministic Centralized Interventions

In this section, we discuss how a central administration (such as the Department of Education) can intervene to mitigate the effects of bias. We assume that these interventions allow a set of students, chosen by the central agency, to be debiased, i.e., to reveal their true potential. In practice, this can be achieved, for instance, by giving a limited number of students free vouchers for exam preparation classes, or by spending resources to build a community in which the students with vouchers learn as a group. Given a certain amount of available vouchers, we want to investigate which students these vouchers should be offered to. We first formally define the negative impact on $G_{2}$ students due to the presence of bias and then derive policies that optimize certain fairness measures. These policies are deterministic, and the decision of whether to assign a voucher to a student (hence, debias them) depends only on the potential of the student. In the next section, we will discuss randomized policies where the decisions will also depend on the outcome of a random coin flip.

Metric of Impact on Students:

Recall that $\mu(\theta)$ and $\widehat{\mu}(\theta)$ denote the school a student $\theta$ is assigned to in the unbiased and biased setting, respectively. Now let $\widetilde{\mu}:\Theta\rightarrow[0,1]$ be the ranking of students after bias mitigation. The mistreatment of a student $\theta\in G_{2}$ with respect to an assignment $\widetilde{\mu}$ is defined as the positive part of their displacement, that is, $m(\theta):=\max(0,\widetilde{\mu}(\theta)-\mu(\theta))$. In other words, the mistreatment is the drop in the rank of the school the student is matched to (if this drop is positive). A student $\theta$ has mistreatment equal to $0$ if they are assigned to a school at least as good as $\mu(\theta)$. In the following, we evaluate a voucher distribution by its effect on the mistreatment of $G_{2}$ students, since only $G_{2}$ students may experience strictly positive mistreatment. It is easy to see that after the interventions, no student $\theta\in G_{1}$ will be matched to a school worse than $\mu(\theta)$. This is because our interventions focus on helping (certain) $G_{2}$ students reveal their true potentials; hence, for any $G_{1}$ student $\theta$, no student with potential lower than $Z(\theta)$ can have a perceived potential higher than $\widehat{Z}(\theta)=Z(\theta)$.

Fairness Considerations:

Finding a set of students to allocate vouchers to is a resource allocation problem with natural fairness considerations that guide the choice of the measures to be optimized$^{8}$. For the cohort of disadvantaged students, we take the view of finding a distribution of vouchers so that the mistreatment across $G_{2}$ students is as balanced or equitable as possible. We analyze two representative fairness measures in this regard: (1) the total mistreatment across all students, and (2) the maximum mistreatment experienced in this cohort. The former is the continuous $L^{1}$ norm of the mistreatment after voucher allocation, or the positive area under the curve (PAUC). The latter is the continuous $L^{\infty}$ norm, and we refer to it with the shorthand “$mm$”.

$^{8}$ For a more detailed discussion of relevant philosophies of equality and decision-making, we refer the interested reader to the 1979 Tanner Lectures on Human Values (Sen (1979)).

\begin{align}
\sigma(\widetilde{\mu}) &:=\int_{\theta\in\Theta}m(\theta)\,dF_{1}(Z(\theta))=\|m(\theta)\|_{1}, \tag{2}\\
mm(\widetilde{\mu}) &:=\sup_{\theta\in\Theta}m(\theta)=\lim_{p\rightarrow\infty}\left(\int_{\theta\in\Theta}m(\theta)^{p}\,dF_{1}(Z(\theta))\right)^{1/p}=\|m(\theta)\|_{\infty}. \tag{3}
\end{align}

These notions of fairness have been axiomatically established and are well-studied in the literature. For example, the min-max notion of fairness has been considered in (Kumar and Kleinberg 2000), and the notion of positive area under the curve corresponds to average mistreatment of group $G_{2}$: it is a group notion of fairness considered in many fairness-related studies (Conitzer et al. 2019, Dwork and Ilvento 2018, Marsh and Schilling 1994). Since we will show that the solutions of the two extremes $L^{1}$ and $L^{\infty}$ target qualitatively similar sets of students$^{9}$, we expect the solution for any other $L^{p}$ norm to also behave similarly and restrict our analysis to $L^{1}$ and $L^{\infty}$ for simplicity.

$^{9}$ An $L^{p}$ norm with $p$ small generally measures the average of a function, whereas a large $p$ measures its “peakiness”, with $p=\infty$ equalling the essential supremum (for a further discussion on the relationship between $L^{p}$ spaces, see Folland (1999)).
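To make the two measures concrete, the following sketch approximates (2) and (3) for the status quo $\widetilde{\mu}=\widehat{\mu}$, i.e., before any vouchers are distributed. It rests on assumptions that are ours: the integral is taken over the potentials of $G_{2}$ students with respect to $F_{1}$, as (2) is written; the mistreatment is taken from Proposition 3.1; and we use the change of variable $u=Z^{-\alpha}$, under which $dF_{1}(Z)$ becomes the uniform measure on $(0,1)$. The function names and the midpoint rule are illustrative choices.

```python
def mistreatment_biased(u: float, p: float, beta_alpha: float) -> float:
    """Mistreatment under mu-hat of a G2 student with u = Z**(-alpha)."""
    if u <= beta_alpha:                      # equivalent to Z >= 1/beta
        return (1 - p) * u * (1 / beta_alpha - 1)
    return (1 - p) * (1 - u)                 # equivalent to Z <= 1/beta


def pauc_and_mm(p: float, alpha: float, beta: float, n: int = 100_000):
    """Midpoint-rule approximation of sigma (PAUC) and mm for mu-hat."""
    beta_alpha = beta ** alpha
    values = [mistreatment_biased((k + 0.5) / n, p, beta_alpha)
              for k in range(n)]
    sigma = sum(values) / n                  # L^1 norm: area under the curve
    mm = max(values)                         # L^infinity norm: worst case
    return sigma, mm


sigma, mm = pauc_and_mm(p=0.2, alpha=3.0, beta=0.9)
print(f"{sigma:.4f} {mm:.4f}")
```

With these illustrative parameters, the maximum returned by the sketch is close to $(1-p)(1-\beta^{\alpha})$, the maximum displacement identified in Proposition 3.1.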

Figure 4: Effect of bias after debiasing the optimal set of $G_{2}$ students given $\widehat{c}$. Panels: (a) maximum mistreatment; (b) PAUC.

Optimal Deterministic Strategies:

Before stating the results formally, we need to introduce some notation. A deterministic debiasing set (DDS) $T$ is a measurable subset of $[1,\infty)$. For $\widehat{c}\in[0,1]$, let $\mathcal{T}(\widehat{c})$ be the collection of all DDS $T$ such that $\int_{t\in T}dF_{1}(t)\leq\widehat{c}$. Here, $\widehat{c}$ denotes the amount of resources or vouchers, and each $T\in\mathcal{T}(\widehat{c})$ represents the potentials of $G_{2}$ students to whom vouchers are provided, in effect revealing their true potentials. That is, for $\theta\in G_{2}$ such that $Z(\theta)\in T$, we now have $\widehat{Z}(\theta)=Z(\theta)$ after the intervention. We show in Figure 4 how much the two fairness measures can be maximally improved as a function of the amount of resources $\widehat{c}$.

Let $\mu_{T}:\Theta\rightarrow[0,1]$ be the ranking of the students after $G_{2}$ students whose true potentials lie in $T$ have been debiased, and let $\mathcal{T}_{mm}(\widehat{c})$ be the collection of sets $T\in\mathcal{T}(\widehat{c})$ such that $\sup(\mu_{T}-\mu)$ is minimized. The next result gives an explicit characterization of these sets, assuming$^{10}$ $p<1-\beta^{\alpha}$.

$^{10}$ We refer to the end of Section 5.1 for a discussion on the various technical assumptions on data from Section 4 and Section 5.1.

Figure 5: Maximum mistreatment before and after optimal voucher correction. Panels: (a) optimal correction when $\widehat{c}$ is large; (b) optimal correction when $\widehat{c}$ is small.

Theorem 4.1

Assume $p<1-\beta^{\alpha}$. Then there exists a set $T=[Z_{1}^{*},Z_{2}^{*}]\in\mathcal{T}_{mm}(\widehat{c})$ such that all other sets in $\mathcal{T}_{mm}(\widehat{c})$ differ from $T$ on a set of measure zero. If $\widehat{c}\geq\frac{(1-p)(1-\beta^{\alpha})}{1-p+1-\beta^{\alpha}}$, then

$$Z_{1}^{*}=\left(\frac{(1-p)+\left(\frac{1}{\beta^{\alpha}}-1\right)\widehat{c}}{\frac{1}{\beta^{\alpha}}-p}\right)^{-\frac{1}{\alpha}}\quad\text{and}\quad Z_{2}^{*}=\left(\frac{(1-p)(1-\widehat{c})}{\frac{1}{\beta^{\alpha}}-p}\right)^{-\frac{1}{\alpha}},$$

and $mm(\mu_{[Z_{1}^{*},Z_{2}^{*}]})=(1-p)(1-\beta^{\alpha})\frac{1-\widehat{c}}{1-p\beta^{\alpha}}$, reduced from $mm(\widehat{\mu})=(1-p)(1-\beta^{\alpha})$. Conversely, if $\widehat{c}\leq\frac{(1-p)(1-\beta^{\alpha})}{1-p+1-\beta^{\alpha}}$, then:

$$Z_{1}^{*}=\left(\frac{(1-p-\widehat{c})\beta^{\alpha}}{1-p}+\widehat{c}\right)^{-\frac{1}{\alpha}}\quad\text{and}\quad Z_{2}^{*}=\left(\frac{(1-p-\widehat{c})\beta^{\alpha}}{1-p}\right)^{-\frac{1}{\alpha}},$$

and $mm(\mu_{[Z_{1}^{*},Z_{2}^{*}]})=(1-p-\widehat{c})(1-\beta^{\alpha})+p\widehat{c}$.
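The closed-form expressions in Theorem 4.1 are straightforward to evaluate. The sketch below is only an illustration of the stated formulas, not of the proof: the function name `optimal_mm_interval` and the helper `_potential` are ours, and the parameter values in the usage lines are borrowed from Example 2.2. A useful sanity check is that the interval uses exactly the budget, i.e., $\bar{F}_{1}(Z_{1}^{*})-\bar{F}_{1}(Z_{2}^{*})=\widehat{c}$.

```python
import math


def _potential(u: float, alpha: float) -> float:
    """Invert u = Z**(-alpha); u = 0 corresponds to an unbounded potential."""
    return math.inf if u == 0.0 else u ** (-1.0 / alpha)


def optimal_mm_interval(c_hat: float, p: float, alpha: float, beta: float):
    """Return (Z1*, Z2*, mm) as stated in Theorem 4.1."""
    b = beta ** alpha
    if not 0.0 <= c_hat <= 1.0:
        raise ValueError("c_hat must lie in [0, 1]")
    if not p < 1 - b:
        raise ValueError("Theorem 4.1 assumes p < 1 - beta**alpha")

    threshold = (1 - p) * (1 - b) / (1 - p + 1 - b)
    if c_hat >= threshold:                   # large-budget regime
        u2 = (1 - p) * (1 - c_hat) / (1 / b - p)
        u1 = ((1 - p) + (1 / b - 1) * c_hat) / (1 / b - p)
        mm = (1 - p) * (1 - b) * (1 - c_hat) / (1 - p * b)
    else:                                    # small-budget regime
        u2 = (1 - p - c_hat) * b / (1 - p)
        u1 = u2 + c_hat
        mm = (1 - p - c_hat) * (1 - b) + p * c_hat
    return _potential(u1, alpha), _potential(u2, alpha), mm


p, alpha, beta = 0.2, 3.0, 0.9
for c_hat in (0.0, 0.1, 0.5):
    z1, z2, mm = optimal_mm_interval(c_hat, p, alpha, beta)
    used = z1 ** (-alpha) - z2 ** (-alpha)   # mass of debiased potentials
    print(f"c_hat={c_hat:.1f}  T=[{z1:.4f}, {z2:.4f}]  mm={mm:.4f}  used={used:.4f}")
```

With $\widehat{c}=0$ the interval collapses to the single potential $1/\beta$ and the maximum mistreatment equals $(1-p)(1-\beta^{\alpha})$, matching the status quo value stated in the theorem.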

We include the proof of Theorem 4.1 in Appendix 12 of the e-Companion. Interestingly, our proof also shows that if vouchers are not distributed carefully, one may actually increase the maximum mistreatment and, more generally, shows which distribution of vouchers leads to an improvement of the status quo. A pictorial representation of Theorem 4.1 is given in Figure 5. The two sub-figures correspond to two choices of $\widehat{c}$. Moreover, Figure 4(a) shows how much $mm(\mu_{[Z_{1}^{*},Z_{2}^{*}]})$ decreases as $\widehat{c}$, the amount of resources, increases.

Next, consider minimizing the positive area under the curve (PAUC): this is the aggregate amount of mistreatment experienced by all $G_{2}$ students. In this case, we restrict our attention to debiasing $G_{2}$ students whose potentials are in a connected set. This is a justifiable implementation in practice: otherwise, a student might feel unfairly treated given that someone with a better potential as well as someone with a worse potential receives the voucher. This assumption also makes our analysis more tractable. In particular, let $\mathcal{T}^{c}(\widehat{c})\subseteq\mathcal{T}(\widehat{c})$ be the family of all connected sets in $\mathcal{T}(\widehat{c})$. That is, $\mathcal{T}^{c}(\widehat{c}):=\{[t_{1},t_{2}]:\bar{F}_{1}(t_{1})-\bar{F}_{1}(t_{2})\leq\widehat{c}\}$. In addition, let $\mathcal{T}^{c}_{auc}(\widehat{c})$ be the collection of sets $T\in\mathcal{T}^{c}(\widehat{c})$ such that $\sigma(\mu_{T}-\mu)$ is minimized. The next result gives an explicit description of the set $\mathcal{T}^{c}_{auc}(\widehat{c})$, again assuming that $p<1-\beta^{\alpha}$ and additionally that $p<0.5$.

Figure 6: PAUC before and after optimal voucher correction. (a) Optimal correction when $\widehat{c}$ is large. (b) Optimal correction when $\widehat{c}$ is small.
Theorem 4.2

Assume $p < 1-\beta^{\alpha}$ and $p < 0.5$. Then $\mathcal{T}^c_{auc}(\widehat{c})$ is made up of a unique set $T = [Z_1^*, Z_2^*]$. If $\widehat{c} \geq \frac{(1-p)(1-\beta^{\alpha})}{2-p-\beta^{\alpha}-p\beta^{\alpha}+p\beta^{2\alpha}}$, then:

$$Z_2^* = \bigg(\frac{(1-p)(1-\widehat{c})}{p\beta^{\alpha}+\frac{1}{\beta^{\alpha}}-2p}\bigg)^{-\frac{1}{\alpha}} \quad\text{ and }\quad Z_1^* = \bigg(\frac{(1-p)(1-\widehat{c})}{p\beta^{\alpha}+\frac{1}{\beta^{\alpha}}-2p}+\widehat{c}\bigg)^{-\frac{1}{\alpha}},$$

and $\sigma(\mu_{[Z_1^*,Z_2^*]}) = \frac{1}{2}(1-p)(1-\beta^{\alpha})\left(\frac{(\frac{1}{\beta^{\alpha}}-p)(1-\widehat{c})^{2}}{p\beta^{\alpha}+\frac{1}{\beta^{\alpha}}-2p}\right)$, down from $\sigma(\widehat{\mu}) = \frac{1}{2}(1-p)(1-\beta^{\alpha})$.
Conversely, if $\widehat{c} \leq \frac{(1-p)(1-\beta^{\alpha})}{2-p-\beta^{\alpha}-p\beta^{\alpha}+p\beta^{2\alpha}}$, then:

$$Z_2^* = \bigg(\frac{(p\beta^{\alpha}-1)\widehat{c}+(1-p)}{(1-p)\frac{1}{\beta^{\alpha}}}\bigg)^{-\frac{1}{\alpha}} \quad\text{ and }\quad Z_1^* = \bigg(\frac{(p\beta^{\alpha}-1)\widehat{c}+(1-p)}{(1-p)\frac{1}{\beta^{\alpha}}}+\widehat{c}\bigg)^{-\frac{1}{\alpha}},$$

and $\sigma(\mu_{[Z_1^*,Z_2^*]}) = \frac{1}{2}(1-p)(1-\widehat{c})^{2} - \frac{1}{2}\beta^{\alpha}\left(\frac{[(p\beta^{\alpha}-1)\widehat{c}+(1-p)]^{2}}{1-p} + p\widehat{c}^{2}\right)$.

The proof of Theorem 4.2 is given in Appendix 13 of the e-Companion. A pictorial representation of Theorem 4.2 is given in Figure 6. Again, two sub-figures are presented for two different choices of $\widehat{c}$. Figure 4(b) shows how much $\sigma(\mu_{[Z_1^*,Z_2^*]})$ decreases as $\widehat{c}$, the amount of resources, increases.
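As a numerical companion to Theorem 4.2, the following minimal sketch evaluates the two closed-form cases. It assumes, as the exponent $-\frac{1}{\alpha}$ in the theorem suggests, that the survival function is $\bar{F}(t) = t^{-\alpha}$ on $[1,\infty)$; the function name `pauc_optimal_interval` and the variable names are illustrative choices, not taken from the paper.

```python
def pauc_optimal_interval(alpha, beta, p, c_hat):
    """Evaluate the closed forms stated in Theorem 4.2.

    Returns (Z1, Z2, pauc_after, pauc_before), where [Z1, Z2] is the
    interval of true potentials to debias and the PAUC values are
    sigma(mu_[Z1, Z2]) and sigma(mu_hat) respectively.
    """
    a = beta ** alpha  # shorthand for beta^alpha
    if not (p < 1 - a and p < 0.5):
        raise ValueError("Theorem 4.2 assumes p < 1 - beta^alpha and p < 0.5")
    if not (0 < c_hat < 1):
        raise ValueError("c_hat must lie in (0, 1)")

    threshold = (1 - p) * (1 - a) / (2 - p - a - p * a + p * a * a)
    denom = p * a + 1 / a - 2 * p
    if c_hat >= threshold:
        # Large-budget case: u2 is the survival mass above Z2.
        u2 = (1 - p) * (1 - c_hat) / denom
        pauc = 0.5 * (1 - p) * (1 - a) * ((1 / a - p) * (1 - c_hat) ** 2 / denom)
    else:
        # Small-budget case.
        u2 = ((p * a - 1) * c_hat + (1 - p)) / ((1 - p) / a)
        pauc = 0.5 * (1 - p) * (1 - c_hat) ** 2 - 0.5 * a * (
            ((p * a - 1) * c_hat + (1 - p)) ** 2 / (1 - p) + p * c_hat ** 2
        )

    z2 = u2 ** (-1 / alpha)            # invert the assumed survival function
    z1 = (u2 + c_hat) ** (-1 / alpha)  # interval carries survival mass c_hat
    baseline = 0.5 * (1 - p) * (1 - a)
    return z1, z2, pauc, baseline


# Parameters used elsewhere in the paper: alpha = 3, beta = 0.8, p = 0.25.
small_budget = pauc_optimal_interval(3, 0.8, 0.25, 0.1)
large_budget = pauc_optimal_interval(3, 0.8, 0.25, 0.4)
```

Under these assumptions the interval always satisfies $\bar{F}(Z_1^*) - \bar{F}(Z_2^*) = \widehat{c}$, and the small-budget expression reduces to $\sigma(\widehat{\mu})$ as $\widehat{c} \to 0$, which gives two quick sanity checks on the formulas.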

We compare the optimal ranges of $G_2$ students to debias under the two measures of fairness, with parameters $\alpha = 3$, $\beta = 0.8$, and $p = 0.25$. We check the optimal intervals under both measures of fairness, and find on average a 95% overlap of the optimal intervals. In particular, both measures suggest that vouchers should be given to the average (middle performing) students. More details are given in Table 4 in the e-Companion.

Although these results highlight an important deviation from the current practice of prioritizing top-performing students for scholarships, we highlight in the next section two fundamental problems with such policies.

5 Incentive Compatible and Individually Fair Voucher Distribution

In Section 4, we characterized the optimal deterministic intervals for the distribution of vouchers under the maximum mistreatment and PAUC measures. In this section, we introduce two natural and desirable properties that the policies developed in Section 4 fail to have. Once we recognize these flaws, we show in Section 5.1 how we can shift from a deterministic voucher distribution policy to a randomized one in order to satisfy them.

Our first property is individual fairness (Dwork et al. 2012), which requires that similar individuals be treated similarly. While a formal definition of this concept is postponed to Section 5.1, we observe here that the policies developed in Section 4 fail to be individually fair, as individuals close to the boundary of the debiasing interval are treated very differently depending on whether they are inside or outside of it.

Our second property is incentive compatibility (see, e.g., Roughgarden (2010)). In general, it requires that no individual can benefit from misrepresenting their features. In our setting, we assume that a student can misrepresent themselves as appearing to have lower potential (e.g., intentionally achieving a lower score in a test) in order to be part of the set of students who can access vouchers. Recall that a DDS is a measurable set $T \subseteq [1,\infty)$. A DDS is incentive compatible if no student is assigned to a better school if they misreport their performance. Formally, assume that a voucher given to a disadvantaged student with reported perceived potential $\beta Z(\theta)$ will improve their performance up to $Z(\theta)$ [Footnote 11: This assumption is justified by the fact that additional training is usually commensurate with the (perceived) level of a student.]; then, a DDS $T$ is incentive compatible if for each $x \in [1,\infty) \setminus T$ and $x' \in T$ with $x > x'$, we have $\beta x \geq x'$.

Lemma 5.1

Assume $\beta \in [0,1)$ and let $T \neq \emptyset$ be an incentive compatible DDS. Then $T$ is of the form $\{\theta \in \Theta : Z(\theta) \geq \delta\}$ or $\{\theta \in \Theta : Z(\theta) > \delta\}$ for some value $\delta \in [1,\infty)$.

We defer the proof to Appendix 10. This lemma shows that, if we care about incentive compatibility and require that vouchers are distributed deterministically, then the only feasible mechanism is to debias all students that have potential above some threshold cutoff $\delta$. To overcome these flaws in deterministic policies, we next turn to randomization.
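To make the definition concrete, the sketch below brute-forces the incentive-compatibility condition ($\beta x \geq x'$ for every $x \notin T$, $x' \in T$ with $x > x'$) on a finite grid of potentials. The helper `find_ic_violation`, the grid, and the example sets are illustrative assumptions rather than anything from the paper, and a grid search can only exhibit violations, not prove their absence.

```python
def find_ic_violation(intervals, beta, grid):
    """Search a finite grid for a pair violating incentive compatibility.

    intervals: list of (lo, hi) closed intervals whose union is the DDS T
               (use float('inf') for an unbounded right end).
    Returns a witness (x, x_prime) with x outside T, x_prime in T,
    x > x_prime and beta * x < x_prime, or None if the grid has no such pair.
    """
    def in_T(x):
        return any(lo <= x <= hi for lo, hi in intervals)

    members = [x for x in grid if in_T(x)]
    outsiders = [x for x in grid if not in_T(x)]
    for x in outsiders:
        for x_prime in members:
            if x_prime < x and beta * x < x_prime:
                return (x, x_prime)
    return None


grid = [1 + i / 100 for i in range(201)]  # potentials from 1.00 to 3.00

# A bounded debiasing interval: students just above 1.5 gain by posing as 1.5.
bounded = find_ic_violation([(1.2, 1.5)], beta=0.8, grid=grid)

# A threshold set of the form described in Lemma 5.1.
threshold = find_ic_violation([(1.5, float("inf"))], beta=0.8, grid=grid)
```

On this grid the bounded interval yields a witness pair while the threshold set yields none, in line with Lemma 5.1.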

5.1 Randomized assignment of vouchers

We now introduce and study randomized mechanisms for the allocation of vouchers. For simplicity of notation, in this section (and in its proofs in Appendix 14 of the e-Companion), we abuse notation and identify a student $\theta$ with their true potential $Z(\theta)$.

A Randomized Voucher Program (RVP) is a measurable function $\rho : \Theta \to [0,1]$ that gives, for each $\theta \in \Theta$, the probability that a $G_2$ student with true potential $\theta$ is assigned a voucher. Observe that if $\rho(\theta) \in \{0,1\}$ for all $\theta \in \Theta$, then $\rho^{-1}(1)$ is a deterministic debiasing set (DDS) as in the definition in Section 4; likewise, given a DDS $T$ we can construct the RVP $\rho_T(\theta) := \mathbbm{1}_{\theta \in T}$ that matches this DDS.

The main class of RVPs investigated in this section is Proportional-to-Mistreatment (PropM), denoted by $\rho_m$ and defined as

$$\rho_m(\theta) := \frac{2\widehat{c}}{(1-\beta^{\alpha})(1-p)}\, m(\theta), \tag{4}$$

for some $\widehat{c} \in (0,1/2]$ (recall that $m(\theta)$ is the mistreatment of a student with real potential $\theta$ when no vouchers are distributed). It is easy to see that $\widehat{c}$ is the expected proportion of disadvantaged students that will get a voucher, that is, $\widehat{c} = \int_{\Theta} \rho_m \, dF$. Intuitively, $\rho_m$ assigns a larger probability of being debiased to students with a higher mistreatment.
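The normalization in (4) can be checked numerically. The sketch below is hedged on one point: the closed form used for $m(\theta)$ is defined outside this section, so here it is reconstructed as an assumption from the model as we read it (true potentials Pareto with scale 1 and shape $\alpha$, perceived potential $\beta\theta$ for $G_2$ students). That reconstruction integrates to $\sigma(\widehat{\mu}) = \frac{1}{2}(1-p)(1-\beta^{\alpha})$, matching Theorem 4.2, but it should be replaced by the paper's own definition where they differ. All function names are illustrative.

```python
def mistreatment(theta, alpha, beta, p):
    """Assumed closed form of m(theta) for a G2 student with true potential theta >= 1."""
    u = theta ** (-alpha)  # mass of students with higher true potential
    a = beta ** alpha
    return (1 - p) * (min(1.0, u / a) - u)


def prop_m(theta, alpha, beta, p, c_hat):
    """Voucher probability rho_m(theta) from equation (4)."""
    a = beta ** alpha
    return 2 * c_hat / ((1 - a) * (1 - p)) * mistreatment(theta, alpha, beta, p)


def expected_voucher_mass(alpha, beta, p, c_hat, n=100_000):
    """Midpoint-rule approximation of the integral of rho_m with respect to F.

    Uses the substitution u = theta^(-alpha), which is uniform on (0, 1)
    when theta is Pareto(alpha) with scale 1.
    """
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        theta = u ** (-1 / alpha)
        total += prop_m(theta, alpha, beta, p, c_hat)
    return total / n


mass = expected_voucher_mass(alpha=3, beta=0.8, p=0.25, c_hat=0.2)
peak = prop_m(1 / 0.8, alpha=3, beta=0.8, p=0.25, c_hat=0.2)
```

Under the assumed form of $m$, the largest voucher probability is attained at $\theta = 1/\beta$ and equals $2\widehat{c}$, which is one way to see why $\widehat{c}$ is restricted to $(0, 1/2]$.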

As we show next, under broadly applicable technical hypotheses on the parameters, PropMs satisfy many of the properties that deterministic voucher allocations fail to have. Moreover, they can lower the maximum expected mistreatment. To state these results formally, we first extend concepts from deterministic DDSs to randomized RVPs. We let $\mu_{\rho}(\theta)$ be the expected school a student with true potential $\theta \in \Theta$ is assigned to under $\rho$ and let $m_{\rho}(\theta) := \max(0, \mu_{\rho}(\theta) - \mu(\theta))$ be the mistreatment they experience under $\rho$. An RVP $\rho$ is incentive compatible if $\mu_{\rho}(\theta') \geq \mu_{\rho}(\theta)$ for all $\theta' < \theta$. That is, an RVP is incentive compatible if a student with true potential $\theta$ is not better off by manipulating themselves to appear as having a true potential $\theta' < \theta$.
We define the maximum mistreatment as $mm_{\rho} := \sup_{x \in \Theta} \{m_{\rho}(x)\}$.

We define individual fairness as a Lipschitz continuity condition on $\rho$. We say an RVP $\rho$ is $k$-individually fair if, for each $\theta, \theta' \in [1,\infty)$, $|\rho(\theta) - \rho(\theta')| \leq k|\theta - \theta'|$ (note that under this definition, any non-empty DDS $T \neq [1,\infty)$ is not $k$-individually fair for any $k$). We can now state the main result from this section, which is proved in Section 14.3 of the e-Companion. We let $mm^*(\widehat{c})$ be the maximum mistreatment achieved by the deterministic policy that minimizes the maximum mistreatment as a function of the available resources $\widehat{c}$, as computed in Theorem 4.1.
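The Lipschitz condition can be probed numerically with finite differences. The sketch below compares the steepest slope of a PropM on a grid with the constant $\frac{2\widehat{c}\alpha}{1-\beta^{\alpha}}$, and contrasts it with the indicator of a bounded DDS. It relies on an assumed closed form for $m(\theta)$ (Pareto potentials with scale 1, perceived potential $\beta\theta$), which is our reconstruction and not quoted from the paper; `max_slope`, the grid range, and the example interval are likewise hypothetical.

```python
def prop_m(theta, alpha, beta, p, c_hat):
    """rho_m(theta) from equation (4), using an ASSUMED closed form for m(theta)."""
    a = beta ** alpha
    u = theta ** (-alpha)
    m = (1 - p) * (min(1.0, u / a) - u)  # assumed mistreatment without vouchers
    return 2 * c_hat / ((1 - a) * (1 - p)) * m


def max_slope(rho, lo=1.0, hi=5.0, steps=40_000):
    """Largest finite-difference slope |rho(x + h) - rho(x)| / h on a uniform grid."""
    h = (hi - lo) / steps
    best = 0.0
    prev = rho(lo)
    for i in range(1, steps + 1):
        cur = rho(lo + i * h)
        best = max(best, abs(cur - prev) / h)
        prev = cur
    return best


def indicator(theta):
    """RVP induced by the bounded DDS [1.2, 1.5]."""
    return 1.0 if 1.2 <= theta <= 1.5 else 0.0


alpha, beta, p, c_hat = 3.0, 0.8, 0.25, 0.2
k_bound = 2 * c_hat * alpha / (1 - beta ** alpha)
k_prop_m = max_slope(lambda t: prop_m(t, alpha, beta, p, c_hat))
k_indicator = max_slope(indicator)  # equals 1 / h, so it grows as the grid is refined
```

For the PropM the observed slope stays below `k_bound` and approaches it near $\theta = 1$, whereas the indicator's slope is governed only by the grid spacing, reflecting that no finite $k$ works for a DDS.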

Theorem 5.2

Let $p \in [0,1/2]$ and $\rho_m$ be a PropM defined as in (4) for some $\widehat{c} \in (0,1/2]$. Then:

  1. $\rho_m$ is $\frac{2\widehat{c}\alpha}{1-\beta^{\alpha}}$-individually fair.

  2. $\rho_m$ is incentive compatible for

     $$\widehat{c} \leq \frac{1-p}{2\left[p(1-\beta^{\alpha}) + (1-p)(\beta^{-\alpha}-1)\right]}. \tag{5}$$

  3. Suppose $p < 1-\beta^{\alpha}$ and $\widehat{c} \leq \frac{(1-p)(1-\beta^{\alpha})}{1-p+1-\beta^{\alpha}}$. Then $mm_{\rho_m} \leq mm^*(\widehat{c})$ if

     $$\widehat{c} \geq 1 - \frac{p+1-\beta^{\alpha}}{4p(1-\beta^{\alpha})}. \tag{6}$$

(5) and (6) give complementary conditions on the amount of vouchers that can be given out. On one hand, (5) suggests that distributing too many vouchers prevents incentive compatibility of the PropM. In fact, a $\widehat{c}$ too large causes students performing just above the most mistreated student to be incentivized to artificially lower their score, as the absolute value of the derivative of the PropM becomes large around its maximum [Footnote 12: This observation can also be verified numerically.]. On the other hand, (6) suggests that we need to distribute enough vouchers to see the maximum expected mistreatment (i.e., $mm_{\rho_m}$) drop below the optimal deterministic one (i.e., $mm^*(\widehat{c})$). This is because the optimal deterministic policy debiases the most mistreated student straight away whereas the PropM distributes vouchers more widely, and so the maximum expected mistreatment does not immediately drop as significantly. As we discuss at the end of the section, both conditions are satisfied for a large range of parameters.
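The bounds in Theorem 5.2 are simple to evaluate for a given parameter choice. The following sketch, with the hypothetical helper `theorem_5_2_conditions`, computes the individual-fairness constant and checks conditions (5) and (6) together with the hypotheses of item 3. It requires $p > 0$ only to avoid a division by zero in (6); that restriction is an implementation choice, not part of the theorem.

```python
def theorem_5_2_conditions(alpha, beta, p, c_hat):
    """Evaluate the bounds of Theorem 5.2 for a PropM with budget c_hat."""
    if not (0 < beta < 1 and 0 < p <= 0.5 and 0 < c_hat <= 0.5):
        raise ValueError("expected 0 < beta < 1, 0 < p <= 1/2, 0 < c_hat <= 1/2")
    a = beta ** alpha

    ic_bound = (1 - p) / (2 * (p * (1 - a) + (1 - p) * (1 / a - 1)))  # bound (5)
    upper = (1 - p) * (1 - a) / ((1 - p) + (1 - a))                   # item 3 hypothesis
    lower = 1 - (p + 1 - a) / (4 * p * (1 - a))                       # bound (6)

    return {
        "fairness_constant": 2 * c_hat * alpha / (1 - a),
        "ic_bound": ic_bound,
        "incentive_compatible": c_hat <= ic_bound,
        "item3_range": (lower, upper),
        "beats_deterministic": p < 1 - a and lower <= c_hat <= upper,
    }


case_a = theorem_5_2_conditions(alpha=3, beta=0.8, p=0.25, c_hat=0.2)
case_b = theorem_5_2_conditions(alpha=8.9, beta=0.9, p=1 / 3, c_hat=0.25)
too_many = theorem_5_2_conditions(alpha=3, beta=0.8, p=0.25, c_hat=0.5)
```

Note that the flags only report whether the sufficient conditions of the theorem are met; a `False` value does not by itself show that the property fails.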

PropMs therefore represent a more robust and theoretically satisfying alternative to the deterministic voucher assignments developed in Section 4 to minimize the maximum mistreatment, while being at least as effective.

We conclude this section by observing that to design a non-trivial incentive compatible RVP, it is essential to have knowledge of the distribution of student potentials. We call an RVP $\rho$ Increasing-with-Potential (IwP) if $\rho(\theta) \geq \rho(\theta')$ for all $\theta > \theta'$. An IwP assigns a higher probability of being debiased to students with higher potential. It can therefore be interpreted as a randomized counterpart of the DDS from Lemma 5.1 (in particular, the DDS from Lemma 5.1 is IwP).

Consider now the more general version of the model from Section 2, where the potentials are allowed to be drawn from any continuous, integrable distribution $F$. All definitions of mistreatment, incentive compatibility, etc. naturally extend to this setting. We first show that, under mild technical conditions, IwPs are incentive compatible with respect to any $F$. This fact is proved in Section 14.4 of the e-Companion.

Lemma 5.3

Suppose $\rho$ is IwP and such that it is everywhere continuously differentiable except a countable set of isolated points where it has right-continuous jump discontinuities. Then, for any distribution of potentials $F$, $\rho$ is incentive compatible.

The following theorem gives a converse to the previous statement and it is also proved in Section 14.4 of the e-Companion.

Theorem 5.4

Suppose $\rho$ is an RVP. Let $\theta \in [1,\infty)$ such that $\rho$ is continuously differentiable in some neighborhood of $\theta$ but $\rho'(\theta) < 0$. Then, for any $\beta \in (0,1)$ and $p \in (0,1)$, there exists a continuous distribution $F$ such that, if true potentials are sampled from distribution $F$, $\rho$ is not incentive compatible.

Theorem 5.4 implies that without any information on the distribution of student potentials, the only voucher distribution policies that are incentive compatible are those that allocate vouchers in such a way that a student with a higher potential always has a higher chance of receiving a voucher. Examples of such policies are lotteries for students whose potential is above a certain threshold. Hence, if no information on the distribution of students' potential can be assumed, it is reasonable that policy-makers stick to a more conservative distribution of vouchers which rewards top-performing students.

Discussion on technical assumptions.

We now discuss the technical assumptions on the parameters of the model in Section 4 and Section 5. In Theorem 4.1, Theorem 4.2, and Theorem 5.2 we assume $p < 1-\beta^{\alpha}$. Note that the right hand side is equal to $F_2(1) - F_2(\beta)$, that is, the proportion of disadvantaged students whose perceived potential is less than $1$ (that of any non-disadvantaged student). The condition $p < 1-\beta^{\alpha}$ therefore requires that the proportion of disadvantaged students out of the whole student population is no more than the proportion of disadvantaged students that are perceived as being worse than any non-disadvantaged student. In Theorem 4.2, we further assume $p < 0.5$, and in Theorem 5.2 we assume both an upper and a lower bound on $\widehat{c}$. The conditions need to be checked and do not always hold, but we note that all conditions hold for many reasonable choices of $(\alpha, \beta, p, \widehat{c})$. For instance, they hold if $\beta = 0.8$, $\alpha = 3$, $p < 0.4$ (as in Figure 4 and Figure 5(b)), and $\widehat{c} \leq 1/4$; or if $\beta = 0.9$, $\alpha = 8.9$, $p \leq 1/3$, and $\widehat{c} \leq 1/4$ (as in our numerical experiments in the next section).

6 Experimental Case Study

Our theoretical analysis has shown that student mistreatment under various metrics can be substantially reduced by creating an intervention tailored to the distribution of student potential. We now use the student performance data from the NYC DOE, and compute the optimal ways to reduce student mistreatment, given their heterogeneous school preferences and distribution of performance. We show that our theoretical model provides a reasonable approximation despite some deviations from the data, and the qualitative results remain the same. Our theoretical analysis has therefore been instrumental in identifying effective debiasing policies for the real-world application, that can be optimized empirically depending on the actual data.

In New York City, there are eight SHSs that are considered to be among the top public high schools in the city. Admission to these schools is highly competitive, with an intake of only about 5,000 students every year (from a pool of 29,000 students who take the SHSAT). We remark that our model and the experimental data differ in two features. First, while schools' preferences over students are based strictly on the students' scores on the SHSAT, and thus all schools share the same preference list over students, students' preferences for schools are not unanimous. However, as already remarked in the introduction, in the 2016-17 SHSAT cohort, 56% of students indicated Stuyvesant or Brooklyn Tech as their first preference, with 76% naming at least one of the two in their top two preferences, showing some alignment among preferences. In order to apply our analysis, we compute a stable matching using the real preferences of students. Second, although the Pareto distribution was mostly chosen for ease of theoretical analysis, we find that it adequately fits the data (see Figure 2). Overall, we find further evidence that our assumptions are reasonable, since the empirical results on real-world data match the optimal target distribution of students predicted by our assortative model.

We describe our analysis using the dataset from the 2016-17 academic year. For each student, the dataset includes their SHSAT score (i.e., perceived potential), their preference over the schools, and whether they are in the $G_1$ group or the $G_2$ group (based on the DOE definition). From the dataset, we estimate $p = 0.319$ and the Pareto distribution parameter to be $\alpha = 8.9$. The true potentials are computed by inflating the test scores of disadvantaged students by a factor of $\frac{1}{\beta} \approx 1.13$ (see Figure 1). The dataset is then restricted to all students who would receive an offer under their true potential.
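The text does not say how $\alpha$ was estimated, so the following is only one standard possibility and not necessarily the authors' procedure: a maximum-likelihood estimate of the Pareto shape for scores normalized to lie in $[1,\infty)$ with scale fixed at 1, followed by the inflation step described above. The function names and the synthetic sample (used in place of the DOE data, which is not reproduced here) are hypothetical.

```python
import math
import random


def pareto_alpha_mle(potentials):
    """MLE of the Pareto shape for samples on [1, inf) with scale fixed at 1."""
    logs = [math.log(x) for x in potentials]
    return len(logs) / sum(logs)


def true_potential(score, disadvantaged, beta):
    """Inflate a G2 student's perceived potential by 1 / beta; leave G1 unchanged."""
    return score / beta if disadvantaged else score


# Synthetic stand-in for normalized scores, drawn by inverse-transform sampling.
rng = random.Random(0)
alpha_true = 8.9
sample = [(1.0 - rng.random()) ** (-1 / alpha_true) for _ in range(100_000)]
alpha_hat = pareto_alpha_mle(sample)

# With 1 / beta = 1.13, a perceived potential of 1.0 maps to a true potential of 1.13.
beta = 1 / 1.13
inflated = true_potential(1.0, disadvantaged=True, beta=beta)
```

How raw SHSAT scores should be normalized before fitting is a separate modelling choice that this sketch does not address.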

Figure 7: The displacement of $G_1$ and $G_2$ students in the SHSAT dataset from NYC DOE. Blue and magenta dots respectively show the displacement of disadvantaged and non-disadvantaged students when there is no intervention. In the top figure, the students in the debiased range (dashed lines) are offered vouchers; in the bottom figure, vouchers are offered with the probability given in Figure 8. In the bottom figure with the randomized voucher program, we plot the average displacement over 100 repetitions. In both figures, we plot the displacement of disadvantaged students whose assigned schools change, with red dots representing those going to more preferred schools and green dots for those going to less preferred schools.

First, we show empirically that without intervention, all $G_1$ students (magenta dots) have non-positive displacement and all $G_2$ students (blue dots) have non-negative displacement. These results are aligned with our analysis in Section 3. Furthermore, we consider deterministic and randomized interventions with $\widehat{c} = 0.17$. In Figure 7, top, deterministic vouchers are offered to students between the two dashed lines. All $G_2$ students with vouchers (red dots) have a displacement of at most zero, but some $G_2$ students (green dots) might get worse, particularly the ones who are scoring just slightly higher than the range to which vouchers are offered, as they are overtaken by some other $G_2$ students just below them. This highlights the non-incentive compatible nature of such deterministic policies.

Applying the randomized voucher program to this dataset requires further modifications. Since students have heterogeneous preferences, their mistreatment is also heterogeneous; in the worst case, two students with the same score may have different mistreatment. To produce an empirical PropM (see Figure 8), we divide the potentials of admitted students into 20 equal-sized intervals and compute the average mistreatment within each interval. The PropM is then normalized to be proportional to this average mistreatment. Since PropM is a randomized voucher program, we ran the experiment 100 times and took the average displacement, which is plotted in Figure 7. Our experiments show that the maximum mistreatment is reduced (in particular, it is similar to the maximum mistreatment after the theoretically optimal deterministic debiasing) and, more generally, that mistreatment is improved across the board, indicating a more equitable outcome than with deterministic vouchers. Note that due to the heterogeneity of preferences and the binned averaging, PropM is not in fact incentive compatible, as indicated by the $G_2$ students with larger displacement under intervention (green diamonds). However, unlike in the deterministic debiasing procedure, students with an incentive to underperform are interspersed with students with no incentive to underperform, making it harder for students to game the system. Moreover, it is not uncommon for theoretically incentive compatible mechanisms to exhibit some lack of incentive compatibility in practice. For instance, the NYC School Match mechanism curtails the preference lists of students to at most 12 schools (Abdulkadiroğlu et al. 2005), incentivizing students to be at least partially strategic.
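The binning step of the empirical PropM can be sketched as follows. This is a minimal illustration and not the code used for the experiments. The function names, the equal-width reading of "equal-sized intervals", the treatment of negative mistreatment as zero, and the normalization (expected share of students receiving a voucher equals `budget`) are all assumptions made here, and the data are synthetic.

```python
import random

def voucher_probabilities(potentials, mistreatment, budget, n_bins=20):
    """Per-student voucher probability, proportional to the bin-average mistreatment."""
    lo, hi = min(potentials), max(potentials)
    width = (hi - lo) / n_bins
    if width <= 0:
        raise ValueError("potentials must not all be equal")

    def bin_of(z):
        # The maximum potential is placed in the last bin.
        return min(int((z - lo) / width), n_bins - 1)

    sums, counts = [0.0] * n_bins, [0] * n_bins
    for z, m in zip(potentials, mistreatment):
        b = bin_of(z)
        sums[b] += max(m, 0.0)  # assumption: negative mistreatment carries no weight
        counts[b] += 1
    avg = [s / c if c else 0.0 for s, c in zip(sums, counts)]

    # Assumption: scale so the expected share of students with a voucher is `budget`.
    total = sum(a * c for a, c in zip(avg, counts))
    if total <= 0:
        raise ValueError("no positive mistreatment to target")
    scale = budget * len(potentials) / total
    # Probabilities are capped at 1; if the cap binds, less than the budget is spent.
    return [min(1.0, scale * avg[bin_of(z)]) for z in potentials]

def draw_vouchers(probs, rng):
    """One random realization of the voucher program."""
    return [rng.random() < q for q in probs]

# Synthetic toy data: mistreatment peaks at a score of 525 and vanishes above 575.
potentials = [500 + 0.1 * i for i in range(1000)]
mistreatment = [max(0.0, 1 - abs(z - 525) / 50) for z in potentials]
probs = voucher_probabilities(potentials, mistreatment, budget=0.17)
vouchers = draw_vouchers(probs, random.Random(0))
print("expected vouchers:", round(sum(probs), 2), "realized:", sum(vouchers))
```

Repeating `draw_vouchers` with different seeds and averaging the resulting displacements would correspond to the 100 repetitions mentioned above.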

Last, we consider the PAUC measure in two cases: low resources ($\widehat{c}=0.1$) and abundant resources ($\widehat{c}=0.4$). For both cases, we plug the values into the formula in Theorem 4.2 to obtain the theoretically optimal range of students to offer vouchers to, which is then shown to be close to the empirically optimal range obtained via grid search (see Table 1). Moreover, for both choices of $\widehat{c}$, the PAUC under our policies is substantially better than the one achieved by debiasing only the top students with total mass equal to $\widehat{c}$.

Figure 8: The average mistreatment of students as computed from empirical data. Note that it peaks at 525, which represents an average student (cf. the curve in Figure 5 representing mistreatment before intervention). The vertical line indicates the value of $1/\beta$, where the theoretical PropM debiasing policy would have its maximum probability of assigning a voucher.
| | $\widehat{c}=0.1$ | $\widehat{c}=0.4$ |
|---|---|---|
| theoretically optimal range | $[529.01, 547.27]$ | $[506.86, 583.09]$ |
| empirically optimal range | $[527.29, 543.17]$ | $[504.61, 561.31]$ |
Table 1: Comparing optimal ranges of students to offer vouchers obtained empirically and “theoretically” based on formula in Theorem 4.2, under two different amounts of resources.

7 Discussion

The qualitative takeaways from our work speak to a deeply ingrained systemic problem that limits access to opportunities: how can one understand the impact of bias on societal practices and systematically account for biases in the real world? Indeed, resources available for meaningful interventions in an existing system are limited, and there is resistance to change: for instance, a 2019 plan supported by New York City’s then mayor to eliminate the entrance exam to top public high schools failed to gain enough support and was not approved by the New York State government (Shapiro and Wang 2019). Thus, our focus is on understanding the impact of a minimally invasive use of targeted resources, as opposed to changing the matching mechanism itself.

From our analysis, we are able to highlight the following qualitative properties using simple models of bias and matching mechanism:

  1. Disparate Impact: The disparity in admissions is experienced much more by the disadvantaged group of students, compared to the marginal advantage for the rest.

  2. Interventions: A carefully-designed randomized voucher distribution program can counter some of the effects of bias, while also being incentive compatible and individually fair. We further showed empirically that our qualitative results remain unchanged when applied to our dataset (the admissions process for the 8 SHSs in New York).

  3. Resources: Additional resources centrally distributed to slightly above-average students overall in the system (e.g., top performers in schools with high economic need index) would maximally impact fairness measures considered in this work. Targeting resources to students based on their performance provides an important lever to policy makers to improve fairness of the system.

These takeaways are a first step, and in no way address all the systemic problems in the school admissions process, such as access to counselors, transport to schools, or familial support towards education. But they do help us understand the most impacted student groups, and provide a mathematical basis for policymakers to make changes to the allocation of city funds and scholarships. We have shared the results of this work and are in discussions with the Department of Education of New York City. Further, our analysis leads to open questions such as theoretically optimal interventions under other structured student preferences and qualitative analysis when the distribution of students’ potentials is not Pareto.


The authors are deeply indebted to the editors and the reviewers for the many comments and suggestions on an earlier version of the manuscript.


  • Abdulkadiroğlu (2005) Abdulkadiroğlu A (2005) College admissions with affirmative action. International Journal of Game Theory 33:535–549.
  • Abdulkadiroğlu et al. (2005) Abdulkadiroğlu A, Pathak PA, Roth AE (2005) The New York City high school match. American Economic Review 95(2):364–367.
  • Arcidiacono et al. (2011) Arcidiacono P, Aucejo EM, Fang H, Spenner KI (2011) Does affirmative action lead to mismatch? a new test and evidence. Quantitative Economics 2(3):303–333.
  • Arnosti (2019) Arnosti N (2019) A continuum model of stable matchings with finite capacities, talk at Simons Institute for the Theory of Computing.
  • Ashkenas et al. (2017) Ashkenas J, Park H, Pearce A (2017) Even with affirmative action, blacks and hispanics are more underrepresented at top colleges than 35 years ago. New York Times 1–18.
  • Azevedo and Leshno (2016) Azevedo EM, Leshno JD (2016) A supply and demand framework for two-sided matching markets. Journal of Political Economy 124(5):1235–1268.
  • Backes (2012) Backes B (2012) Do affirmative action bans lower minority college enrollment and attainment?: Evidence from statewide bans. Journal of Human resources 47(2):435–455.
  • Becker (2010) Becker GS (2010) The economics of discrimination (University of Chicago press).
  • Biró (2008) Biró P (2008) Student admissions in Hungary as Gale and Shapley envisaged. University of Glasgow Technical Report TR-2008-291 .
  • Biró et al. (2010) Biró P, Fleiner T, Irving RW, Manlove DF (2010) The college admissions problem with lower and common quotas. Theoretical Computer Science 411(34-36):3136–3153.
  • Boschma and Brownstein (2016) Boschma J, Brownstein R (2016) The concentration of poverty in american schools. The Atlantic 29.
  • Burgess et al. (2015) Burgess S, Greaves E, Vignoles A, Wilson D (2015) What parents want: School preferences and school choice. The Economic Journal 125(587):1262–1289.
  • Calsamiglia et al. (2010) Calsamiglia C, Haeringer G, Klijn F (2010) Constrained school choice: An experimental study. American Economic Review 100(4):1860–74.
  • Celis et al. (2021) Celis LE, Hays C, Mehrotra A, Vishnoi NK (2021) The effect of the rooney rule on implicit bias in the long term. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 678–689.
  • Celis et al. (2020) Celis LE, Mehrotra A, Vishnoi NK (2020) Interventions for ranking in the presence of implicit bias. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 369–380.
  • Chade et al. (2014) Chade H, Lewis G, Smith L (2014) Student portfolios and the college admissions problem. Review of Economic Studies 81(3):971–1002.
  • Chan and Eyster (2003) Chan J, Eyster E (2003) Does banning affirmative action lower college student quality? American Economic Review 93(3):858–872.
  • Chin (2022) Chin WW (2022) Equity and excellence, four years later. City Journal URL https://www.city-journal.org/article/equity-and-excellence-four-years-later, accessed: 2024-07-01.
  • Clauset et al. (2009) Clauset A, Shalizi CR, Newman ME (2009) Power-law distributions in empirical data. SIAM review 51(4):661–703.
  • Coate and Loury (1993) Coate S, Loury GC (1993) Will affirmative-action policies eliminate negative stereotypes? The American Economic Review 1220–1240.
  • Conitzer et al. (2019) Conitzer V, Freeman R, Shah N, Vaughan JW (2019) Group fairness for the allocation of indivisible goods. Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI).
  • Corcoran and Baker-Smith (2018) Corcoran SP, Baker-Smith EC (2018) Pathways to an elite education: Application, admission, and matriculation to new york city’s specialized high schools. Education Finance and Policy 13(2):256–279.
  • Dee and Jacob (2011) Dee TS, Jacob B (2011) The impact of no child left behind on student achievement. Journal of Policy Analysis and management 30(3):418–446.
  • Dessein et al. (2023) Dessein W, Frankel A, Kartik N (2023) Test-optional admissions. arXiv preprint arXiv:2304.07551 .
  • Drasgow (1984) Drasgow F (1984) Scrutinizing psychological tests: Measurement equivalence and equivalent relations with external variables are the central issues. .
  • Dwork et al. (2012) Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R (2012) Fairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Science conference, 214–226 (ACM).
  • Dwork and Ilvento (2018) Dwork C, Ilvento C (2018) Group fairness under composition.
  • Emelianov et al. (2020) Emelianov V, Gast N, Gummadi KP, Loiseau P (2020) On fair selection in the presence of implicit variance. Proceedings of the 21st ACM Conference on Economics and Computation, 649–675.
  • Fershtman and Pavan (2021) Fershtman D, Pavan A (2021) “soft” affirmative action and minority recruitment. American Economic Review: Insights 3(1):1–18.
  • Folland (1999) Folland GB (1999) Real analysis: modern techniques and their applications, volume 40 (John Wiley & Sons).
  • Gale and Shapley (1962) Gale D, Shapley LS (1962) College admissions and the stability of marriage. The American Mathematical Monthly 69(1):9–15.
  • Garg et al. (2021) Garg N, Li H, Monachou F (2021) Standardized tests and affirmative action: The role of bias and variance. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 261–261.
  • Gratz v. Bollinger (2003) Gratz v Bollinger (2003) Gratz v. bollinger, 539 u.s. 244 (2003). .
  • Greenwald et al. (1996) Greenwald R, Hedges LV, Laine RD (1996) The effect of school resources on student achievement. Review of educational research 66(3):361–396.
  • Grimmett and Stirzaker (2020) Grimmett G, Stirzaker D (2020) Probability and random processes (Oxford university press).
  • Hafalir et al. (2013) Hafalir IE, Yenmez MB, Yildirim MA (2013) Effective affirmative action in school choice. Theoretical Economics 8(2):325–363.
  • Hastings et al. (2009) Hastings J, Kane TJ, Staiger DO (2009) Heterogeneous preferences and the efficacy of public school choice. NBER Working Paper 2145:1–46.
  • Hu et al. (2019) Hu L, Immorlica N, Vaughan JW (2019) The disparate effects of strategic manipulation. Proceedings of the Conference on Fairness, Accountability, and Transparency, 259–268.
  • Kamada and Kojima (2024) Kamada Y, Kojima F (2024) Fair matching under constraints: Theory and applications. Review of Economic Studies 91(2):1162–1199.
  • Kannan et al. (2019) Kannan S, Roth A, Ziani J (2019) Downstream effects of affirmative action. Proceedings of the Conference on Fairness, Accountability, and Transparency, 240–248.
  • Kleinberg et al. (2017) Kleinberg J, Mullainathan S, Raghavan M (2017) Inherent trade-offs in the fair determination of risk scores. 8th Innovations in Theoretical Computer Science Conference (ITCS 2017) (Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik).
  • Kleinberg and Raghavan (2018) Kleinberg J, Raghavan M (2018) Selection problems in the presence of implicit bias. 9th Innovations in Theoretical Computer Science Conference (ITCS 2018) (Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik).
  • Kucsera and Orfield (2014) Kucsera J, Orfield G (2014) New York State’s extreme school segregation: Inequality, inaction and a damaged future .
  • Kumar and Kleinberg (2000) Kumar A, Kleinberg J (2000) Fairness measures for resource allocation. Proceedings 41st Annual Symposium on Foundations of Computer Science, 75–85 (IEEE).
  • Laverde (2020) Laverde M (2020) Unequal assignments to public schools and the limits of school choice. Unpublished working paper .
  • Liu and Garg (2021) Liu Z, Garg N (2021) Test-optional policies: Overcoming strategic behavior and informational gaps. Proceedings of the 1st ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, 1–13.
  • Long (2004) Long MC (2004) Race and college admissions: An alternative to affirmative action? review of Economics and Statistics 86(4):1020–1033.
  • Lovaglia et al. (1998) Lovaglia MJ, Lucas JW, Houser JA, Thye SR, Markovsky B (1998) Status processes and mental ability test scores. American Journal of Sociology 104(1):195–228.
  • Marsh and Schilling (1994) Marsh MT, Schilling DA (1994) Equity measurement in facility location analysis: A review and framework. European Journal of Operational Research 74(1):1–17.
  • Nguyen and Vohra (2019) Nguyen T, Vohra R (2019) Stable matching with proportionality constraints. Operations Research .
  • Niu et al. (2022) Niu M, Kannan S, Roth A, Vohra R (2022) Best vs. all: Equity and accuracy of standardized test score reporting. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 574–586.
  • NYC DOE (2018) NYC DOE (2018) Specialized high schools proposal. https://www.schools.nyc.gov/docs/default-source/default-document-library/specialized-high-schools-proposal.
  • NYC DOE (2019) NYC DOE (2019) 2019 NYC High School Directory. https://bigappleacademy.com/wp-content/uploads/2018/06/HSD_2019_ENGLISH_Web.pdf.
  • Phelps (1972) Phelps ES (1972) The statistical theory of racism and sexism. The american economic review 62(4):659–661.
  • Quinn Capers et al. (2017) Quinn Capers I, Clinchot D, McDougle L, Greenwald AG (2017) Implicit racial bias in medical school admissions. Academic Medicine 92(3):365–369.
  • Roth and Sotomayor (1992) Roth AE, Sotomayor M (1992) Two-sided matching. Handbook of game theory with economic applications 1:485–541.
  • Roughgarden (2010) Roughgarden T (2010) Algorithmic game theory. Communications of the ACM 53(7):78–86.
  • Salem and Gupta (2023) Salem J, Gupta S (2023) Secretary problems with biased evaluations using partial ordinal information. Management Science .
  • Sen (1979) Sen A (1979) Equality of what? The Tanner lecture on human values 1.
  • Shapiro (2019a) Shapiro E (2019a) Racist? fair? biased? asian-american alumni debate elite high school admissions. The New York Times Magazine .
  • Shapiro (2019b) Shapiro E (2019b) Should a single test decide a 4-year-old’s educational future? New York Times .
  • Shapiro (2021) Shapiro E (2021) Only 8 black students are admitted to stuyvesant high school. New York Times .
  • Shapiro (March 26, 2019) Shapiro E (March 26, 2019) Segregation has been the story of New York City’s schools for 50 years. The New York Times Magazine .
  • Shapiro and Lai (June 03, 2019) Shapiro E, Lai KKR (June 03, 2019) How New York’s elite public schools lost their black and hispanic students. The New York Times Magazine .
  • Shapiro and Wang (2019) Shapiro E, Wang V (2019) Amid racial divisions, mayor’s plan to scrap elite school exam fails. New York Times .
  • Texas Comptroller of Public Accounts (2024) Texas Comptroller of Public Accounts (2024) Top 10% rule. URL https://comptroller.texas.gov/programs/education/msp/funding/aid/state-programs/txttp.php, accessed: 2024-07-01.
  • Tomoeda (2018) Tomoeda K (2018) Finding a stable matching under type-specific minimum quotas. Journal of Economic Theory 176:81–117.
  • Wenneras and Wold (2010) Wenneras C, Wold A (2010) Nepotism and sexism in peer-review. Women, science, and technology, 64–70 (Routledge).

8 Discussion on discrete versus continuous models

Traditionally, matching markets are assumed to be discrete (Gale and Shapley 1962, Roth and Sotomayor 1992). In recent years, however, there has been interest in models where one or both sides of the market are continuous (Arnosti 2019, Azevedo and Leshno 2016). This is justified by the fact that, in many applications, markets are large, hence predictions in continuous markets often translate with a good degree of accuracy to discrete ones. Moreover, continuous markets are often analytically more tractable than discrete ones (see, again, Arnosti (2019), Azevedo and Leshno (2016)). Our case is no exception: the continuous model allows us to deduce precise mathematical formulae, while we show through experiments that those formulae are a good approximation to the discrete case. We also provide additional experiments evaluating the robustness of our results under relaxation of assumptions, such as that of a unique bias factor for all students in $G_2$. We remark that the goal of this study is not to provide a mechanism to admit students to schools, for which the assumption that all schools rank students identically and all students rank schools identically would be too simplistic. On the contrary, as we want to understand the impact of bias at a macroscopic level, we believe our approximation to be meaningful and useful, since in our model any reasonable mechanism would output the same assignment. Note that in the classical discrete model, when schools and students have unique rankings, there is only one stable assignment, which is also Pareto-optimal for students. A similar statement holds for the appropriate translations of those concepts to our model.

9 Discussion on alternate models of bias

We fitted two complementary models of bias to the 2016-17 SHSAT score distribution that forms the basis of our experiments on real data in Section 6: a multiplicative model and an additive model. In the multiplicative model we assume the perceived scores of $G_2$ students are given by $\widehat{Z}=\beta Z$ for some $\beta<1$, and in the additive model we assume $\widehat{Z}=Z+\beta$ for some $\beta<0$. We attempted to fit each model to the real SHSAT data of $G_1$ and $G_2$ students. To fit a model, we chose a measure of the similarity of distributions, then chose the value of $\beta$ that minimized this measure. We did this using both the Wasserstein distance and the Kullback-Leibler divergence as our measures of similarity. Each optimization problem (for each measure of similarity and each model) had a unique minimum, which was our best-fit $\beta$. We then compared the additive and multiplicative models based on the measure of similarity under the best-fit $\beta$. We found that under both measures, the best-fit $\beta$ gave a lower distance between adjusted $G_1$ scores and $G_2$ scores in the multiplicative model than in the additive model. Thus we conclude that the multiplicative model matches the data better. For our final analysis we chose the best-fit $\beta$ based on the Wasserstein distance (as opposed to the Kullback-Leibler divergence), as we believe this is the more appropriate measure of similarity of distributions in this setting.

| Model | Metric | Value of metric | Best-fit $\beta$ |
|---|---|---|---|
| Additive | KL-divergence | 0.0321 | -36.9 |
| Multiplicative | KL-divergence | 0.0293 | 0.902 |
| Additive | Wasserstein distance | 9.52 | -49.0 |
| Multiplicative | Wasserstein distance | 5.36 | 0.882 |
Table 2: Comparison of Additive and Multiplicative Models using KL-divergence and Wasserstein Distance Metrics
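The fitting procedure for the Wasserstein case can be sketched as follows. This is a hedged illustration, not the authors' code: the helper names `wasserstein_1d` and `best_fit` are hypothetical, a plain grid search stands in for whatever optimizer was actually used, and the data are synthetic with a planted multiplicative factor of 0.88, so the numbers it prints are unrelated to Table 2.

```python
import random

def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two empirical samples: integral of |F_x - F_y|."""
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(xs + ys)
    i = j = 0
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Advance the empirical CDF counters up to and including `left`.
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total

def best_fit(g1, g2, adjust, candidates):
    """Return (beta, distance) minimizing the distance between adjusted G1 scores and G2 scores."""
    return min(
        ((b, wasserstein_1d([adjust(x, b) for x in g1], g2)) for b in candidates),
        key=lambda pair: pair[1],
    )

# Synthetic scores (assumption): G2 scores are G1 scores scaled by a planted factor.
rng = random.Random(1)
g1 = [rng.gauss(550, 60) for _ in range(300)]
g2 = [0.88 * x for x in g1]

mult = best_fit(g1, g2, lambda x, b: b * x, [0.70 + 0.001 * k for k in range(301)])
add = best_fit(g1, g2, lambda x, b: x + b, [-100 + 0.5 * k for k in range(201)])
print("multiplicative (beta, distance):", mult)
print("additive (beta, distance):", add)
```

Comparing the two minimized distances, as in the last two rows of Table 2, is what decides between the models. On this synthetic data the multiplicative model wins by construction.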

10 Proof of Lemma 5.1

Assume $T$ is not of the form $\{\theta\in\Theta: Z(\theta)\geq\delta\}$ or $\{\theta\in\Theta: Z(\theta)>\delta\}$ for some $\delta$, and let $U$ be a connected and inclusionwise maximal subset of $T$ that is bounded. Take the smallest number $\delta_1\in[1,\infty)$ so that $Z(\theta)\leq\delta_1$ for all $\theta\in U$. Let $\overline{\theta}\in\Theta$ be such that $Z(\overline{\theta})=\delta_1$.

Assume first that $\overline{\theta}\in U$. Then, for each $\varepsilon_1>0$, there exists $\varepsilon\in(0,\varepsilon_1]$ such that $Z^{-1}(\delta_1+\varepsilon)\notin T$. Since $\beta<1$ and by continuity of $Z(\cdot)$, there exists $\varepsilon_2>0$ such that $\beta(\delta_1+\varepsilon)<\delta_1$ for all $\varepsilon<\varepsilon_2$. We can then take an appropriate $x\in[\delta_1,\delta_1+\varepsilon_2]\setminus T$ and $x^{\prime}=\delta_1$ to show that $T$ is not incentive compatible.

Next assume $\overline{\theta}\notin U$. In particular, we have $\overline{\theta}\notin T$. Similarly to the case above, we can find $\varepsilon>0$ such that $x^{\prime}=\delta_1-\varepsilon$ satisfies $Z^{-1}(x^{\prime})\in U\subseteq T$ and $\beta\delta_1<x^{\prime}$. Setting $x=\delta_1$, we deduce that $T$ is not incentive compatible. $\square$


Online Supplement

11 Impact on Schools

This appendix explores the school’s perspective: the impact of bias on utility (quality of accepted students) and diversity for schools, as well as school-driven interventions such as interviews. In the notation of the two-group model in Section 2, we define the utility $u_{\gamma}(s)$ of a school $s$ under matching $\gamma\in\{\mu,\widehat{\mu}\}$ as

$$u_{\gamma}(s):=\int_{\theta\in\gamma^{-1}(s)\cap G_{1}}Z(\theta)\,dF_{1}(\widehat{Z}(\theta))+\int_{\theta\in\gamma^{-1}(s)\cap G_{2}}Z(\theta)\,dF_{2}(\widehat{Z}(\theta)). \tag{7}$$

That is, the utility of a school is the average true potential of admitted students. Continuing from Example 2.2, let $s_M$ (resp. $s_L$) be the school Maya (resp. Lisa) is assigned to in the biased setting. Following Proposition 11.1, the utilities of $s_M, s_L$ in the unbiased setting are

$$u_{\mu}(s_{M})=1.283\quad\text{and}\quad u_{\mu}(s_{L})=1.324,$$

while in the biased setting, they are

$$u_{\widehat{\mu}}(s_{M})=1.280\quad\text{and}\quad u_{\widehat{\mu}}(s_{L})=1.320.$$

Hence, the change in the utilities of the two schools between the two settings is negligible. We develop the theory to validate these observations in this appendix.

We discuss first the impact of bias on the average true potential of students accepted by a school. Let $s\in[0,1]$ denote the school that is ranked in the $s\times 100\%$ position among the continuous range of schools. As the next proposition shows, the impact on the utilities of schools is negligible for all schools other than the lowest ranked schools. This is because for each school, although the average potential of assigned $G_1$ students is lower than it should be, its assigned $G_2$ students have much higher true potentials. Thus, the toll on the utility due to unqualified $G_1$ students is partially canceled out by the overqualified $G_2$ students and the net effect is minimal. On the other hand, some lower ranked schools that only admit $G_2$ students fare better in the biased setting (since they admit over-qualified $G_2$ candidates).

Proposition 11.1

For a school $s$, the utilities under the unbiased and biased models are, respectively,

$$u_{\mu}(s)=s^{-\frac{1}{\alpha}}\quad\text{and}\quad u_{\widehat{\mu}}(s)=\begin{cases}\displaystyle\frac{1-p+p\beta^{\alpha}}{1-p+p\beta^{\alpha+1}}\left(\frac{s}{1-p+p\beta^{\alpha}}\right)^{-\frac{1}{\alpha}}&\text{if }s\leq 1-p+p\beta^{\alpha},\\[2.84526pt] \displaystyle\left(\frac{s-(1-p)}{p}\right)^{-\frac{1}{\alpha}}&\text{if }s>1-p+p\beta^{\alpha}.\end{cases}$$

The key idea in the proof is to first compute the cutoffs at each school for each of the two groups, that is, the minimum perceived potential needed for a student to be matched to a given school. Once these are known, using Bayes’ rule, we deduce the minimum real potential needed by students of each group to attend the school. From the latter, we can immediately compute the average utility of each school.
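The closed forms stated in Proposition 11.1 are easy to evaluate numerically. The sketch below is a direct transcription of those two expressions; the function names and the parameter values ($\alpha=3$, $\beta=0.88$, $p=0.4$) are illustrative assumptions and are not the parameters behind Example 2.2.

```python
def utility_unbiased(s, alpha):
    """u_mu(s) = s^(-1/alpha) for a school at rank s in (0, 1]."""
    return s ** (-1.0 / alpha)

def utility_biased(s, alpha, beta, p):
    """u_mu_hat(s), the piecewise expression stated in Proposition 11.1."""
    threshold = 1.0 - p + p * beta ** alpha
    if s <= threshold:
        ratio = threshold / (1.0 - p + p * beta ** (alpha + 1.0))
        return ratio * (s / threshold) ** (-1.0 / alpha)
    # Schools ranked below the threshold fall in the second case of the proposition.
    return ((s - (1.0 - p)) / p) ** (-1.0 / alpha)

ALPHA, BETA, P = 3.0, 0.88, 0.4  # illustrative values only
for s in (0.05, 0.25, 0.5, 0.75):
    u, u_hat = utility_unbiased(s, ALPHA), utility_biased(s, ALPHA, BETA, P)
    print(f"s={s:.2f}  unbiased={u:.4f}  biased={u_hat:.4f}  ratio={u_hat / u:.4f}")
```

For schools above the threshold $1-p+p\beta^{\alpha}$, the ratio of biased to unbiased utility does not depend on $s$, since both expressions scale as $s^{-1/\alpha}$. With $\beta=1$ the threshold equals 1 and the biased expression reduces to the unbiased one.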

Proof 11.2

Proof of Proposition 11.1. In order for a student $\theta$ to be assigned to a school that is at least as good as $s$, their perceived potential $\widehat{Z}(\theta)$ needs to be high enough to satisfy $(1-p)\bar{F}_{1}(1\vee\widehat{Z}(\theta))+p\bar{F}_{2}(\widehat{Z}(\theta))\leq s$. That is, we need

\[
\widehat{Z}(\theta)\geq d(s):=\begin{cases}\displaystyle\left(\frac{s}{1-p+p\beta^{\alpha}}\right)^{-\frac{1}{\alpha}}&\text{if }s\leq 1-p+p\beta^{\alpha},\\[2.84526pt] \displaystyle\beta\left(\frac{s-(1-p)}{p}\right)^{-\frac{1}{\alpha}}&\text{if }s>1-p+p\beta^{\alpha}.\end{cases}
\]

We call $d(s)$ the cutoff for school $s$. With the cutoffs, we can compute the utilities of schools. We start with the formula for $u_{\widehat{\mu}}(s)$. First note that, by Bayes' rule, the probability that a given student with perceived potential $\widehat{Z}(\theta)\geq 1$ belongs to $G_{1}$ is $\frac{1-p}{1-p+p\beta^{\alpha+1}}$. Using Equation (1), observe that the $G_{2}$ student whose perceived potential is $1$ (i.e., whose true potential is $\frac{1}{\beta}$) is matched to school $1-p+p\beta^{\alpha}$. Thus, if $s\geq 1-p+p\beta^{\alpha}$, only $G_{2}$ students are assigned to $s$. Therefore, when $s\leq 1-p+p\beta^{\alpha}$,

\[
u_{\widehat{\mu}}(s)=\frac{1-p}{1-p+p\beta^{\alpha+1}}\,d(s)+\frac{p\beta^{\alpha+1}}{1-p+p\beta^{\alpha+1}}\,\frac{d(s)}{\beta}=\frac{1-p+p\beta^{\alpha}}{1-p+p\beta^{\alpha+1}}\left(\frac{s}{1-p+p\beta^{\alpha}}\right)^{-\frac{1}{\alpha}}.
\]
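For readers who want the intermediate step, the weighted average above can be expanded as follows. This is only a restatement of the algebra already in the proof; the shorthand $D$ is introduced here for readability and is not notation from the paper.

\[
\text{With } D:=1-p+p\beta^{\alpha+1}:\qquad
\frac{1-p}{D}\,d(s)+\frac{p\beta^{\alpha+1}}{D}\cdot\frac{d(s)}{\beta}
=\frac{(1-p)+p\beta^{\alpha}}{D}\,d(s),
\qquad
d(s)=\left(\frac{s}{1-p+p\beta^{\alpha}}\right)^{-\frac{1}{\alpha}}\ \text{ for } s\leq 1-p+p\beta^{\alpha}.
\]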

And when $s>1-p+p\beta^{\alpha}$, we have

\[
u_{\widehat{\mu}}(s)=d(s)/\beta=\left(\frac{s-(1-p)}{p}\right)^{-\frac{1}{\alpha}}.
\]

On the other hand, when there is no bias against $G_{2}$ students, we simply have $u_{\mu}(s)=s^{-\frac{1}{\alpha}}$. $\square$

As one readily observes from Proposition 11.1, the negative impact of bias on schools’ utility is negligible. Hence, from an operational perspective, it may be hard to convince schools to autonomously put in place mechanisms to alleviate the effect of bias given the limited impact on them.
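The closed forms in Proposition 11.1 are easy to evaluate numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and it assumes $s\in(0,1]$, $p\in(0,1)$, $\beta\in(0,1]$ and $\alpha>0$.

```python
def split_school(p, beta, alpha):
    """School index 1 - p + p*beta**alpha at which the piecewise formulas switch."""
    return 1.0 - p + p * beta**alpha


def cutoff(s, p, beta, alpha):
    """Cutoff d(s): minimum perceived potential needed to be matched to school s."""
    s0 = split_school(p, beta, alpha)
    if s <= s0:
        return (s / s0) ** (-1.0 / alpha)
    return beta * ((s - (1.0 - p)) / p) ** (-1.0 / alpha)


def utility_unbiased(s, alpha):
    """u_mu(s) = s^(-1/alpha), the utility of school s without bias."""
    return s ** (-1.0 / alpha)


def utility_biased(s, p, beta, alpha):
    """u_mu_hat(s), the utility of school s under bias (Proposition 11.1)."""
    s0 = split_school(p, beta, alpha)
    if s <= s0:
        scale = s0 / (1.0 - p + p * beta ** (alpha + 1))
        return scale * (s / s0) ** (-1.0 / alpha)
    return ((s - (1.0 - p)) / p) ** (-1.0 / alpha)


if __name__ == "__main__":
    # Illustrative parameter values only (not taken from the paper).
    p, beta, alpha = 0.3, 0.8, 2.0
    for s in (0.1, 0.5, 0.8):
        print(s, utility_unbiased(s, alpha), utility_biased(s, p, beta, alpha))
```

Setting `beta = 1.0` removes the bias, in which case `utility_biased` reduces to `utility_unbiased`; this is a convenient sanity check on the formulas.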

Figure 9: Proportion of $G_{2}$ students in higher ranked schools decreases significantly in the biased setting.

Let $pr(s)$ (resp. $\widehat{pr}(s)$) be the proportion of $G_{2}$ students assigned to school $s$ when there is no bias (resp. there is bias) against $G_{2}$ students. Since the distribution of potentials is the same for both $G_{1}$ and $G_{2}$ students, it is immediate that $pr(s)=p$ when there is no bias.

Proposition 11.3

Without bias, we have $pr(s)=p$. Under the biased setting, we have

\[
\widehat{pr}(s)=\begin{cases}\displaystyle\frac{p\beta^{\alpha}}{1-p+p\beta^{\alpha}}&\text{if }s\leq 1-p+p\beta^{\alpha},\\ 1&\text{if }s>1-p+p\beta^{\alpha}.\end{cases}
\]
Proof 11.4

Proof. The formula for $\widehat{pr}(s)$ follows from the analysis of the utility of schools in Proposition 11.1. $\square$

A visual comparison of $pr(s)$ and $\widehat{pr}(s)$ can be found in Figure 9 for different values of $\beta$ and $p$. In particular, we show that the proportion of $G_{2}$ students in higher ranked schools decreases significantly in the biased setting.
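The piecewise formula of Proposition 11.3 can be tabulated directly. The sketch below uses hypothetical function names and illustrative parameter values; it is not the code used to produce Figure 9.

```python
def g2_share_unbiased(p):
    """pr(s) = p: without bias every school sees the population share of G2."""
    return p


def g2_share_biased(s, p, beta, alpha):
    """pr_hat(s): share of G2 students at school s under bias (Proposition 11.3)."""
    s0 = 1.0 - p + p * beta**alpha  # school index where the formula switches
    if s <= s0:
        return p * beta**alpha / s0
    return 1.0


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    p, beta, alpha = 0.3, 0.8, 2.0
    for s in (0.2, 0.6, 0.95):
        print(s, g2_share_unbiased(p), g2_share_biased(s, p, beta, alpha))
```

With `beta = 1.0` the biased share coincides with `p`, and for any `beta < 1` the share in schools with `s <= s0` is strictly below `p`.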

12 Proof of Theorem 4.1 and related facts

12.1 Technical discussion

The main idea of the proof is to first assume that the set $T$ forms a connected set (i.e., a closed interval). Then, we can express $mm(\mu_{T})$ as a function of the endpoints of $T$ and work out the minimizing interval. We next drop the assumption that $T$ is connected and show that the optimal set of students to debias remains the same. The analysis we give is actually more general, and presents conditions under which vouchers lexicographically reduce the mistreatment of students. Interestingly, it also shows that, if vouchers are not distributed carefully, one may actually worsen the outcome of the most mistreated students.

12.2 A more general approach

The analysis we give actually leads to a more general statement than Theorem 4.1, and has the goal of investigating conditions under which giving vouchers can improve over the status quo. More formally, for bounded functions $f,g:G_{2}\rightarrow\mathbb{R}$, we write $f\succ g$ if we can partition $G_{2}$ in two sets $S,S^{\prime}$ (with possibly $S^{\prime}=\emptyset$) so that $f(\theta)=g(\theta)$ for $\theta\in S^{\prime}$ and $\sup_{\theta\in S}f(\theta)>\sup_{\theta\in S}g(\theta)$. Note that $\succ$ is transitive and antisymmetric, and can be interpreted as a continuous equivalent of the classical lexicographic ordering for discrete vectors.
In particular, if we let $f=\gamma-\mu$ and $g=\gamma^{\prime}-\mu$ for matchings $\gamma,\gamma^{\prime}$, then $\sup_{\theta\in G_{2}}(\gamma-\mu)(\theta)>\sup_{\theta\in G_{2}}(\gamma^{\prime}-\mu)(\theta)$ implies $f\succ g$ (taking $S=G_{2}$). Now suppose we debias the students in $T=[Z_{1},Z_{2}]$ for some $T\in\mathcal{T}(\widehat{c})$, and let $f:=\widehat{\mu}-\mu$, $g:=\mu_{T}-\mu$. Table 3 provides conditions under which $f\succ g$ (i.e., the intervention reduces the maximum mistreatment experienced by $G_{2}$ students).
In particular, it shows that for certain combinations of the data and the choice of $Z_{1}$ and $Z_{2}$, giving vouchers may actually lead to worse (according to $\succ$) matchings. One can check that under the assumption $p<1-\beta^{\alpha}$, all conditions given in Table 3 for the different cases are satisfied.

| Case | Subcase | Condition for $\widehat{\mu}-\mu\succ\mu_{T}-\mu$ |
|---|---|---|
| I. $\beta Z_{2}\geq Z_{1}$ | 1. $1\leq\beta Z_{1}$ | $\displaystyle p<1-\left(\frac{Z_{1}}{Z_{2}}\right)^{\alpha}$ |
| | 2. $\beta Z_{1}\leq 1\leq\beta Z_{2}$ | $\displaystyle p<1-\left(\frac{1}{\beta Z_{2}}\right)^{\alpha}$ |
| II. $\beta Z_{2}\leq Z_{1}$ | 1. $1\leq\beta Z_{1}$ | $\displaystyle p<1-\beta^{\alpha}$ |
| | 2. $\beta Z_{1}\leq 1\leq\beta Z_{2}$ | $\displaystyle p\left(\left(\frac{1}{Z_{1}}\right)^{\alpha}-\left(\frac{1}{Z_{2}}\right)^{\alpha}\right)<(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)\left(\beta^{\alpha}-\left(\frac{1}{Z_{2}}\right)^{\alpha}\right)$ |
| | 3. $\beta Z_{2}\leq 1$ | Not possible: $g\succ f$ in this case. |

Table 3: Sufficient conditions for $\widehat{\mu}-\mu\succ\mu_{T}-\mu$ by cases, where $T=[Z_{1},Z_{2}]$. Each strict inequality, when replaced with its non-strict counterpart, gives instead a necessary condition.
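The case analysis of Table 3 can be read as a small decision procedure. The following sketch is one way to encode it; the function names are hypothetical, boundary ties (e.g., $\beta Z_{2}=Z_{1}$) are resolved in favor of the first matching case as an arbitrary choice, and it assumes $1\leq Z_{1}<Z_{2}$.

```python
def table3_case(z1, z2, beta):
    """Return the Table 3 case label for the interval T = [z1, z2]."""
    if not (1.0 <= z1 < z2):
        raise ValueError("expected 1 <= z1 < z2")
    if beta * z2 >= z1:  # major case I
        # Here beta*z2 >= z1 >= 1, so only subcases 1 and 2 can occur.
        return "I.1" if beta * z1 >= 1.0 else "I.2"
    # major case II: beta*z2 < z1
    if beta * z1 >= 1.0:
        return "II.1"
    if beta * z2 >= 1.0:
        return "II.2"
    return "II.3"


def sufficient_condition_holds(z1, z2, p, beta, alpha):
    """Evaluate the sufficient condition of Table 3 for the interval [z1, z2]."""
    case = table3_case(z1, z2, beta)
    if case == "I.1":
        return p < 1.0 - (z1 / z2) ** alpha
    if case == "I.2":
        return p < 1.0 - (1.0 / (beta * z2)) ** alpha
    if case == "II.1":
        return p < 1.0 - beta**alpha
    if case == "II.2":
        lhs = p * (z1 ** (-alpha) - z2 ** (-alpha))
        rhs = (1.0 - p) * (beta ** (-alpha) - 1.0) * (beta**alpha - z2 ** (-alpha))
        return lhs < rhs
    return False  # case II.3: the table states g > f, so no condition can hold


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    print(table3_case(1.1, 1.5, 0.8), sufficient_condition_holds(1.1, 1.5, 0.3, 0.8, 2.0))
```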

In this first part of the proof, we proceed as follows. First, we assume that $T\in\mathcal{T}^{c}(\widehat{c})$. That is, we assume $T=[Z_{1},Z_{2}]$ with extreme points $Z_{1}<Z_{2}$. For simplicity, we let $\widetilde{\mu}$ denote $\mu_{T}$. We then compare $f:=\widehat{\mu}-\mu$ and $g:=\widetilde{\mu}-\mu$ using the relation $\succ$.

Note that, if we let $S$ be the set of students in $G_{2}$ with potential in $[Z_{1},Z_{2}/\beta]$ and $S^{\prime}:=G_{2}\setminus S$, we have $f(\theta)=g(\theta)$ for $\theta\in S^{\prime}$. That is, only $G_{2}$ students whose true potential lies in the interval $[Z_{1},Z_{2}/\beta]$ are affected by the intervention. Hence, $\sup_{\theta\in S}f>\sup_{\theta\in S}g$ if and only if $f\succ g$.
We divide the analysis into two major cases: the first case is when $\beta Z_{2}\geq Z_{1}$ (i.e., when $[\beta Z_{1},\beta Z_{2}]$ and $[Z_{1},Z_{2}]$ overlap) and the second case is when $\beta Z_{2}\leq Z_{1}$. For both major cases, we consider two subcases: $\beta Z_{1}\geq 1$ and $\beta Z_{1}\leq 1\leq\beta Z_{2}$. For the second major case, we also need to consider the subcase where $\beta Z_{2}\leq 1$. The results for all cases are summarized in Table 3.

Observation 12.1

If there is an interval $[Z_{1},Z_{2}]$ that is of either case I.2 or case II.2 such that $\mu_{[Z_{1},Z_{2}]}-\mu\prec\widehat{\mu}-\mu$ with $S=G_{2}$, then the optimal range must be of case I.2 or case II.2. This is because for any interval $[Z_{1}^{\prime},Z_{2}^{\prime}]$ that is not of case I.2 or case II.2, we have

\[
\sup_{\theta\in\Theta}\{\mu_{[Z_{1}^{\prime},Z_{2}^{\prime}]}-\mu\}\geq\sup_{\theta\in\Theta}\{\widehat{\mu}-\mu\}>\sup_{\theta\in\Theta}\{\mu_{[Z_{1},Z_{2}]}-\mu\}.
\]

As it turns out, indeed, the optimal range will be either case I.2 or case II.2, and exactly which one the optimal solution is depends on the amount of resources, i.e., the value of $\widehat{c}$.

We now show the first half of Theorem 4.1, i.e., we assume $\widehat{c}\geq\frac{(1-p)(1-\beta^{\alpha})}{1-p+1-\beta^{\alpha}}$. The proof steps are outlined below. Each step can be shown by simple algebra and is thus omitted.

  • (1). We first show that $[Z_{1}^{*},Z_{2}^{*}]$ is of case I.2. That is, we show $\beta Z_{2}^{*}\geq Z_{1}^{*}$ and $Z_{1}^{*}\leq\frac{1}{\beta}\leq Z_{2}^{*}$.

By writing out the formula for $\mu_{[Z_{1},Z_{2}]}-\mu$, one can see that for an interval $[Z_{1},Z_{2}]$ of case I.2 or case II.2, $\mu_{[Z_{1},Z_{2}]}-\mu$ increases on $[1,Z_{1}]$, decreases on $[Z_{2},\infty)$, and is non-positive on $[Z_{1},Z_{2}]$. This means $\sup_{\theta\in\Theta}\{\mu_{[Z_{1},Z_{2}]}-\mu\}$ is achieved either at $Z_{1}$ or at $Z_{2}$.

  • (2). Next, we show that $[Z_{1}^{*},Z_{2}^{*}]$ is an exact range, that is, $(\frac{1}{Z_{1}^{*}})^{\alpha}-(\frac{1}{Z_{2}^{*}})^{\alpha}=\widehat{c}$. Moreover, let $\theta_{1}^{*}$ and $\theta_{2}^{*}$ be the $G_{2}$ students whose potentials are $Z_{1}^{*}$ and $Z_{2}^{*}$ respectively. Then, $(\mu_{[Z_{1},Z_{2}]}-\mu)(\theta_{1}^{*})=(\mu_{[Z_{1},Z_{2}]}-\mu)(\theta_{2}^{*})$ and thus, they are both equal to $\sup_{\theta\in\Theta}\{\mu_{[Z_{1},Z_{2}]}-\mu\}$.

Together with the assumption $p<1-\beta^{\alpha}$, we have $\sup_{\theta\in\Theta}\{\mu_{[Z_{1}^{*},Z_{2}^{*}]}-\mu\}\leq\sup_{\theta\in\Theta}\{\widehat{\mu}-\mu\}$. Thus, due to Observation 12.1, it is sufficient to compare $[Z_{1}^{*},Z_{2}^{*}]$ only with intervals $[Z_{1},Z_{2}]$ of case I.2 and case II.2 (i.e., when $\beta Z_{1}\leq 1\leq\beta Z_{2}$). Since $[Z_{1}^{*},Z_{2}^{*}]$ is exact, we must either have $Z_{1}>Z_{1}^{*}$ or $Z_{2}<Z_{2}^{*}$.

  • (3).

    Lastly, we show that for any other feasible range $[Z_1, Z_2]$ of case I.2 or case II.2, we must have $\sup_{\theta\in\Theta}\{\mu_{[Z_1,Z_2]}-\mu\} > \sup_{\theta\in\Theta}\{\mu_{[Z_1^*,Z_2^*]}-\mu\}$. Let $\theta_1$ and $\theta_2$ be the $G_2$ students whose potentials are $Z_1$ and $Z_2$, respectively. It suffices to show

    i).

      if $Z_1 > Z_1^*$, then $(\mu_{[Z_1,Z_2]}-\mu)(\theta_1) > (\mu_{[Z_1^*,Z_2^*]}-\mu)(\theta_1^*)$;

    ii).

      if $Z_2 < Z_2^*$, then $(\mu_{[Z_1,Z_2]}-\mu)(\theta_2) > (\mu_{[Z_1^*,Z_2^*]}-\mu)(\theta_2^*)$.

For the second half of the theorem, we will follow similar steps and reasoning, outlined below.

  • (1).

    We first show that $[Z_1^*, Z_2^*]$ is of case II.2, that is, $\beta Z_2^* \leq Z_1^*$ and $Z_1^* \leq \frac{1}{\beta} \leq Z_2^*$.

  • (2).

    We check that $[Z_1^*, Z_2^*]$ is an exact range. Letting $\theta_1^*$ and $\theta_2^*$ be the $G_2$ students whose potentials are $Z_1^*$ and $Z_2^*$, respectively, we want to show that $(\mu_{[Z_1^*,Z_2^*]}-\mu)(\theta_1^*) = (\mu_{[Z_1^*,Z_2^*]}-\mu)(\theta_2^*)$, which implies that both are $\sup_{\theta\in\Theta}\{\mu_{[Z_1^*,Z_2^*]}-\mu\}$.

  • (3).

    We show $\mu_{[Z_1^*,Z_2^*]}-\mu \prec \widehat{\mu}-\mu$, which, unlike in the previous case, is not immediate from the assumption $p < 1-\beta^{\alpha}$.

Again, due to Observation 12.1, it is sufficient to compare $[Z_1^*, Z_2^*]$ only with regions $[Z_1, Z_2]$ of case I.2 and case II.2 (i.e., when $\beta Z_1 \leq 1 \leq \beta Z_2$).

  • (4).

    As before, we consider two cases, which is enough because $[Z_1^*, Z_2^*]$ is exact and one of the two cases is bound to happen. Again, let $\theta_1$ and $\theta_2$ be the $G_2$ students whose potentials are $Z_1$ and $Z_2$, respectively. We want to show

    i).

      if $Z_1 > Z_1^*$, then $(\mu_{[Z_1,Z_2]}-\mu)(\theta_1) > (\mu_{[Z_1^*,Z_2^*]}-\mu)(\theta_1^*)$,

    ii).

      otherwise, we must have $Z_2 < Z_2^*$, and then $(\mu_{[Z_1,Z_2]}-\mu)(\theta_2) > (\mu_{[Z_1^*,Z_2^*]}-\mu)(\theta_2^*)$.

Now let $T^* \in \mathcal{T}(\widehat{c})$ be the optimal solution without the restriction that sets in $\mathcal{T}(\widehat{c})$ are connected. We will show that $T^*$ differs from $[Z_1^*, Z_2^*]$ only on a set of measure zero. First, in order to have $\sup(\mu_{T^*}-\mu) \leq \sup(\mu_{[Z_1^*,Z_2^*]}-\mu) =: s$, in $T^*$ we must debias all students $\theta$ whose mistreatment $(\widehat{\mu}-\mu)(\theta)$ is greater than $s$.
That is, we must have $T_1^* := [Z_1^*, Z^{(1)}] \subseteq T^*$, where $Z^{(1)} := Z(\theta^{(1)}) \geq 1/\beta$ and $(\widehat{\mu}-\mu)(\theta^{(1)}) = s$. There is a $G_2$ student $\theta^{(2)}$ such that $Z^{(2)} := Z(\theta^{(2)}) > Z^{(1)}$ and $(\mu_{T_1^*}-\mu)(\theta^{(2)}) = s$.
We have moreover that $(\mu_{T_1^*}-\mu)(\theta) \geq s$ for all $\theta \in G_2$ such that $Z(\theta) \in [Z^{(1)}, Z^{(2)}]$. Thus, we must also have $[Z^{(1)}, Z^{(2)}] \subseteq T^*$. Let $T_2^* := [Z_1^*, Z^{(2)}]$.
We can repeat the argument and observe that there is a $G_2$ student $\theta^{(3)}$ such that $Z^{(3)} := Z(\theta^{(3)}) > Z^{(2)}$ and $(\mu_{T_2^*}-\mu)(\theta) \geq s$ for all $\theta \in G_2$ such that $Z(\theta) \in [Z^{(2)}, Z^{(3)}]$, and conclude that $T_3^* := [Z_1^*, Z^{(3)}]$ must be contained in $T^*$. Applying the same argument repeatedly, we have $\lim_{n\rightarrow\infty} Z(\theta^{(n)}) = Z_2^*$, and thus the claim follows.

13 Proof of Theorem 4.2 and related facts

Assume $T = [Z_1, Z_2]$ is the range of true potentials of $G_2$ students we want to debias. For simplicity, as in previous sections, let $\widetilde{\mu}$ denote $\mu_T$. In order to obtain the minimizer of $\sigma(\widetilde{\mu}-\mu)$, we first want to compute $\sigma(\widetilde{\mu}-\mu)$ for each of the cases in Table 3.

For $1 \leq t_1 \leq t_2 \in \mathbb{R}\cup\{+\infty\}$, let $\sigma_{t_1}^{t_2}(f) := \int_{t_1}^{t_2} \max(f(t),0)\,dF_1(t)$ for any function $f:[1,\infty]\rightarrow[0,1]$. When $t_1 = 1$ and $t_2 = \infty$, we simply write $\sigma(f)$, which is consistent with the earlier notation.
Note that with $\sigma(\widehat{\mu}-\mu)$ as a reference, it actually suffices to compute only $\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu)$, because minimizing $\sigma(\widetilde{\mu}-\mu)$ is equivalent to maximizing $\sigma_{Z_1}^{Z_2/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu)$, since $(\widehat{\mu}-\mu)(\theta) = (\widetilde{\mu}-\mu)(\theta)$ for all $\theta \in G_2$ with $Z(\theta) \notin [Z_1, Z_2/\beta]$.
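To make the definition of $\sigma_{t_1}^{t_2}(f)$ concrete, the following is a minimal numerical sketch. It assumes that $F_1$ is the Pareto distribution function $F_1(t) = 1 - t^{-\alpha}$ on $[1,\infty)$; this is suggested by the $(1/Z)^{\alpha}$ terms in the formulas below but is not restated in this section, so it should be read as an assumption. The function name `sigma` and its parameters are introduced here for illustration only.

```python
import math

def sigma(f, t1=1.0, t2=math.inf, alpha=2.0, n=10_000):
    """Approximate sigma_{t1}^{t2}(f) = int_{t1}^{t2} max(f(t), 0) dF_1(t).

    ASSUMPTION (not stated in this section): F_1(t) = 1 - t**(-alpha) on [1, inf).
    Substituting u = t**(-alpha) gives dF_1 = -du, so the integral becomes
    int_{u_lo}^{u_hi} max(f(u**(-1/alpha)), 0) du,
    with u_lo = t2**(-alpha) and u_hi = t1**(-alpha).
    """
    u_hi = t1 ** (-alpha)
    u_lo = 0.0 if math.isinf(t2) else t2 ** (-alpha)
    h = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n):
        u = u_lo + (i + 0.5) * h      # midpoint rule, so u > 0 always
        t = u ** (-1.0 / alpha)       # map back to the potential scale
        total += max(f(t), 0.0)       # only positive parts contribute
    return total * h
```

Under this assumption, a constant function $f \equiv 1$ integrates to $t_1^{-\alpha} - t_2^{-\alpha}$, which gives a quick way to check the routine before applying it to $\widehat{\mu}-\mu$ or $\widetilde{\mu}-\mu$.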

For each case, we give an explicit formula for $\sigma_{Z_1}^{Z_2/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu)$. These formulae can be obtained by simple integration, so the computations are omitted. In addition, we analyze how this value changes (increases or decreases) with respect to $Z_1$ and $Z_2$.

CASE I – Subcase 1. After integrating, we have

\[
\sigma_{Z_1}^{Z_2/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu) = \frac{(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)}{2}\left(\frac{1}{Z_1}\right)^{2\alpha}+\frac{p-p\beta^{\alpha}-\frac{1}{\beta^{\alpha}}+1}{2}\left(\frac{1}{Z_2}\right)^{2\alpha}.
\]

Now, to analyze how this quantity changes with $Z_1$ and $Z_2$, we first simplify some of the terms, which will also be used in later sections. Let $x = (\frac{1}{Z_2})^{\alpha} \in [0,1]$ and let $(\frac{1}{Z_1})^{\alpha} = c+x \in [0,1]$ for some $c \leq \widehat{c}$. Also, let $g(x,c) := \sigma_{Z_1}^{Z_2/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu)$. Then,

\[
g(x,c) = \frac{(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)}{2}(c+x)^{2}+\frac{p-p\beta^{\alpha}-\frac{1}{\beta^{\alpha}}+1}{2}x^{2}.
\]

First-order conditions (FOC) show that $g(x,c)$ increases as $x$ increases (or equivalently, as $Z_2$ decreases) and as $c$ increases (meaning that the constraint $(\frac{1}{Z_1})^{\alpha}-(\frac{1}{Z_2})^{\alpha} \leq \widehat{c}$ is effectively the equality $(\frac{1}{Z_1})^{\alpha}-(\frac{1}{Z_2})^{\alpha} = \widehat{c}$).
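As a sanity check on the first-order conditions for Case I, Subcase 1, the partial derivatives of $g$ can be written in closed form and compared against finite differences. The sketch below is illustrative only: the function names and the shorthand `b` for $\beta^{\alpha}$ are assumptions introduced here. The $c$-derivative is nonnegative whenever $0 < \beta^{\alpha} \leq 1$ and $p \leq 1$, while the sign of the $x$-derivative depends on the constraints defining this subcase and on the standing assumptions on $p$ and $\beta$, which are not restated in this section.

```python
def g_case_I1(x, c, p, b):
    """g(x, c) for Case I, Subcase 1; b is shorthand for beta**alpha."""
    coef_z1 = (1 - p) * (1 / b - 1) / 2          # multiplies (c + x)**2
    coef_z2 = (p - p * b - 1 / b + 1) / 2        # multiplies x**2
    return coef_z1 * (c + x) ** 2 + coef_z2 * x ** 2

def grad_case_I1(x, c, p, b):
    """Closed-form partial derivatives (dg/dx, dg/dc) of g_case_I1."""
    dg_dx = (1 - b) / b * ((1 - p) * c - p * (1 - b) * x)
    dg_dc = (1 - p) * (1 / b - 1) * (c + x)
    return dg_dx, dg_dc

def finite_diff(fn, x, c, p, b, eps=1e-6):
    """Central finite differences of fn in x and c, used as a numerical check."""
    ddx = (fn(x + eps, c, p, b) - fn(x - eps, c, p, b)) / (2 * eps)
    ddc = (fn(x, c + eps, p, b) - fn(x, c - eps, p, b)) / (2 * eps)
    return ddx, ddc
```

Calling `grad_case_I1` and `finite_diff(g_case_I1, ...)` at the same point should return matching pairs up to rounding, since $g$ is quadratic in both arguments.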

CASE I – Subcase 2. In this case, we have

\begin{align*}
\sigma_{Z_1}^{Z_2/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu) &= -\frac{1}{2}(1-p)\beta^{\alpha}+(1-p)\left(\left(\frac{1}{Z_1}\right)^{\alpha}-\frac{1}{2}\left(\frac{1}{Z_1}\right)^{2\alpha}\right) \\
&\quad +\frac{p-p\beta^{\alpha}-\frac{1}{\beta^{\alpha}}+1}{2}\left(\frac{1}{Z_2}\right)^{2\alpha}.
\end{align*}

Now, for the analysis, similarly, write

\[
g(x,c) = \text{const}+(1-p)\left((c+x)-\frac{1}{2}(c+x)^{2}\right)+\frac{p-p\beta^{\alpha}-\frac{1}{\beta^{\alpha}}+1}{2}x^{2}.
\]

Then, the FOC shows that $g(x,c)$ is an increasing function w.r.t. $c$, and that it is an increasing function w.r.t. $x$ on $[0, h_{\text{I}}(c)]$ and a decreasing function on $[h_{\text{I}}(c), 1]$, where

\[
h_{\text{I}}(c) = \frac{(1-p)(1-c)}{p\beta^{\alpha}+\frac{1}{\beta^{\alpha}}-2p}.
\]
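The threshold $h_{\text{I}}(c)$ is the stationary point of $g(\cdot,c)$: differentiating the Subcase 2 expression gives $\partial g/\partial x = (1-p)(1-c) - \left(p\beta^{\alpha}+\frac{1}{\beta^{\alpha}}-2p\right)x$, which vanishes exactly at $x = h_{\text{I}}(c)$. The following Python sketch checks this numerically. The function names, the shorthand `b` for $\beta^{\alpha}$, and the sample parameter ranges ($0 < b < 1$, $0 \leq p \leq 1$) are assumptions made for illustration, and the additive constant is dropped because it does not affect the maximizer.

```python
def g_case_I2(x, c, p, b):
    """g(x, c) for Case I, Subcase 2, without the additive constant."""
    coef_z2 = (p - p * b - 1 / b + 1) / 2        # multiplies x**2
    s = c + x                                    # s stands for (1 / Z_1)**alpha
    return (1 - p) * (s - 0.5 * s ** 2) + coef_z2 * x ** 2

def dg_dx_case_I2(x, c, p, b):
    """Partial derivative of g_case_I2 with respect to x (linear in x)."""
    return (1 - p) * (1 - c) - (p * b + 1 / b - 2 * p) * x

def h_I(c, p, b):
    """Stationary point of g_case_I2 in x, matching the formula for h_I(c)."""
    return (1 - p) * (1 - c) / (p * b + 1 / b - 2 * p)
```

For $0 < \beta^{\alpha} < 1$ and $0 \leq p \leq 1$, the denominator $p\beta^{\alpha}+\frac{1}{\beta^{\alpha}}-2p$ is positive, so $g(\cdot,c)$ is concave and the stationary point is a maximizer, in line with the increasing-then-decreasing behavior stated above.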

CASE II – Subcase 1. In this case, $\sigma_{Z_1}^{Z_2/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_1}^{Z_2/\beta}(\widetilde{\mu}-\mu)$ equals

\[
\left(\frac{1}{Z_1}\right)^{2\alpha}\left(\frac{(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)+p\beta^{\alpha}}{2}\right)-\left(\frac{1}{Z_2}\right)^{2\alpha}\left(\frac{(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)+p\beta^{\alpha}}{2}\right)+p\left(\frac{1}{Z_2}\right)^{2\alpha}-p\left(\frac{1}{Z_1}\right)^{\alpha}\left(\frac{1}{Z_2}\right)^{\alpha}.
\]

Now, for the analysis, let $A=\left[(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)+p\beta^{\alpha}\right]/2\geq 0$. Then,

\[
g(x,c)=A(c+x)^{2}-Ax^{2}+px^{2}-p(c+x)x.
\]

Checking the FOCs, we have that $g(x,c)$ is an increasing function w.r.t. $c$ and w.r.t. $x$.
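
For reference, the first-order expressions behind this claim can be written out from the displayed $g$, treating $x$ and $c$ as independent variables. The sign condition stated afterwards is a sketch that assumes $\beta\in(0,1)$ and $\alpha>0$; it is not taken from the authors' argument.

\[
\frac{\partial g}{\partial c}=2A(c+x)-px=2Ac+(2A-p)\,x,
\qquad
\frac{\partial g}{\partial x}=2A(c+x)-2Ax+2px-p(c+2x)=(2A-p)\,c.
\]

Both expressions are nonnegative for $c,x\geq 0$ whenever $2A\geq p$. Since $2A-p=(1-\beta^{\alpha})\left[(1-p)\beta^{-\alpha}-p\right]$, this holds exactly when $p\beta^{\alpha}\leq 1-p$, for instance whenever $p\leq 1/2$.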

CASE II – Subcase 2. We have

\[
\begin{aligned}
\sigma_{Z_{1}}^{Z_{2}/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_{1}}^{Z_{2}/\beta}(\widetilde{\mu}-\mu)={}&-\frac{1}{2}(1-p)\beta^{\alpha}+\left(\frac{1}{Z_{1}}\right)^{2\alpha}\left(\frac{-(1-p)+p\beta^{\alpha}}{2}\right)+(1-p)\left(\frac{1}{Z_{1}}\right)^{\alpha}\\
&-\left(\frac{1}{Z_{2}}\right)^{2\alpha}\left(\frac{(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)+p\beta^{\alpha}}{2}\right)+p\left(\frac{1}{Z_{2}}\right)^{2\alpha}-p\left(\frac{1}{Z_{1}}\right)^{\alpha}\left(\frac{1}{Z_{2}}\right)^{\alpha}.
\end{aligned}
\]

For the analysis, again let $A=\left[(1-p)\left(\frac{1}{\beta^{\alpha}}-1\right)+p\beta^{\alpha}\right]/2\geq 0$ and $B=\left[-(1-p)+p\beta^{\alpha}\right]/2<0$. Then,

\[
g(x,c)=\text{const}+B(c+x)^{2}+(1-p)(c+x)-Ax^{2}+px^{2}-p(c+x)x,
\]

which is increasing in $c$. In $x$, it is increasing on $[0,h_{\text{II}}(c)]$ and decreasing on $[h_{\text{II}}(c),1]$, where

\[
h_{\text{II}}(c)=\frac{(p\beta^{\alpha}-1)c+(1-p)}{(1-p)\frac{1}{\beta^{\alpha}}}.
\]
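
The expression for $h_{\text{II}}(c)$ can be recovered from the first-order condition in $x$. The following short derivation uses only the definitions of $A$, $B$ and $g$ given for this subcase, and assumes $p<1$ so that the denominator is nonzero.

\[
\frac{\partial g}{\partial x}=2B(c+x)+(1-p)-2Ax+2px-p(c+2x)=(2B-p)\,c+2(B-A)\,x+(1-p).
\]

Here $2B-p=p\beta^{\alpha}-1$ and $2(B-A)=-(1-p)\beta^{-\alpha}<0$, so $g$ is concave in $x$, and setting the derivative to zero gives

\[
x=\frac{(p\beta^{\alpha}-1)c+(1-p)}{(1-p)\beta^{-\alpha}}=h_{\text{II}}(c).
\]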

CASE II – Subcase 3. Lastly, we have that $\sigma_{Z_{1}}^{Z_{2}/\beta}(\widehat{\mu}-\mu)-\sigma_{Z_{1}}^{Z_{2}/\beta}(\widetilde{\mu}-\mu)$ equals

\[
\left(\frac{1}{Z_{1}}\right)^{2\alpha}B+(1-p)\left(\frac{1}{Z_{1}}\right)^{\alpha}-\left(\frac{1}{Z_{2}}\right)^{2\alpha}B-(1-p)\left(\frac{1}{Z_{2}}\right)^{\alpha}+p\left(\frac{1}{Z_{2}}\right)^{2\alpha}-p\left(\frac{1}{Z_{1}}\right)^{\alpha}\left(\frac{1}{Z_{2}}\right)^{\alpha}.
\]

For the analysis, write

\[
g(x,c)=B(c+x)^{2}+(1-p)(c+x)-Bx^{2}-(1-p)x+px^{2}-p(c+x)x.
\]

$g(x,c)$ is a decreasing function of $x$. The sign of $\frac{\partial g(x,c)}{\partial c}$ is not clear in this subcase. For the purpose of finding the minimizer of $\sigma(\widetilde{\mu}-\mu)$, however, this is not important: for a fixed value of $c$, $g(x,c)$ achieves its maximum when $x$ takes a value for which $[Z_{1},Z_{2}]$ falls in subcase 2 of either case I or case II.
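
Both statements can be read off the partial derivatives of the displayed $g$; the computation below is a sketch using only the definition of $B$ from Subcase 2.

\[
\frac{\partial g}{\partial x}=2B(c+x)+(1-p)-2Bx-(1-p)+2px-p(c+2x)=(2B-p)\,c=(p\beta^{\alpha}-1)\,c,
\qquad
\frac{\partial g}{\partial c}=2B(c+x)+(1-p)-px.
\]

The first expression is negative for every $c>0$ because $p\beta^{\alpha}<1$. In the second, the terms $2B(c+x)$ and $-px$ are nonpositive while $(1-p)$ is nonnegative, so its sign depends on the values of $c$ and $x$.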

Note that for a fixed value of $\widehat{c}$, as $Z_{1}$ gets larger (or equivalently as $Z_{2}$ gets larger, or as $x:=\left(\frac{1}{Z_{2}}\right)^{\alpha}$ gets smaller), the range $[Z_{1},Z_{2}]$ goes from case II to case I. In particular, for each value of $\widehat{c}$, this transition happens exactly when $\beta Z_{2}=Z_{1}$, that is, when

\[
\widehat{c}=\left(\frac{1}{\beta^{\alpha}}-1\right)\left(\frac{1}{Z_{2}}\right)^{\alpha}\quad\Leftrightarrow\quad\left(\frac{1}{Z_{2}}\right)^{\alpha}=\frac{\widehat{c}\beta^{\alpha}}{1-\beta^{\alpha}}.
\]
Figure 10: For fixed values of $\widehat{c}$, the difference $\sigma(\widehat{\mu}-\mu)-\sigma(\widetilde{\mu}-\mu)$ as a function of $Z_{1}$.

Now, for each fixed value of $\widehat{c}$, Figure 10 plots $\sigma(\widehat{\mu}-\mu)-\sigma(\widetilde{\mu}-\mu)$ against $Z_{1}$. It also shows how the interval $[Z_{1},Z_{2}]$ moves between cases as $Z_{1}$ increases.

With simple algebra, one can check that

\[
\widehat{c}=\frac{(1-p)(1-\beta^{\alpha})}{2-p-\beta^{\alpha}-p\beta^{\alpha}+p\beta^{2\alpha}}\quad\Rightarrow\quad h_{\text{I}}(\widehat{c})=h_{\text{II}}(\widehat{c})=\frac{\widehat{c}\beta^{\alpha}}{1-\beta^{\alpha}}.
\]

Therefore, when

\[
\widehat{c}\geq\frac{(1-p)(1-\beta^{\alpha})}{2-p-\beta^{\alpha}-p\beta^{\alpha}+p\beta^{2\alpha}},
\]

both $h_{\text{I}}(\widehat{c})$ and $h_{\text{II}}(\widehat{c})$ are no more than $\displaystyle\frac{\widehat{c}\beta^{\alpha}}{1-\beta^{\alpha}}$. Thus, the maximum value of $\sigma(\widehat{\mu}-\mu)-\sigma(\widetilde{\mu}-\mu)$ is achieved when $x=h_{\text{I}}(\widehat{c})$; and when

\[
\widehat{c}\leq\frac{(1-p)(1-\beta^{\alpha})}{2-p-\beta^{\alpha}-p\beta^{\alpha}+p\beta^{2\alpha}},
\]

both $h_{\text{I}}(\widehat{c})$ and $h_{\text{II}}(\widehat{c})$ are no less than $\displaystyle\frac{\widehat{c}\beta^{\alpha}}{1-\beta^{\alpha}}$. Thus, the maximum value of $\sigma(\widehat{\mu}-\mu)-\sigma(\widetilde{\mu}-\mu)$ is achieved when $x=h_{\text{II}}(\widehat{c})$.
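
The threshold can be checked numerically for $h_{\text{II}}$. The sketch below uses the parameter values reported for Table 4 ($\alpha=3$, $\beta=0.8$, $p=0.25$) as an illustrative choice; the function names are hypothetical, and $h_{\text{I}}$ is not evaluated because its formula is given elsewhere in the appendix.

```python
# Illustrative parameters (those reported for Table 4); any 0 < beta < 1, 0 < p < 1 would do.
alpha, beta, p = 3.0, 0.8, 0.25
b = beta ** alpha  # shorthand for beta^alpha


def h_II(c):
    """Stationary point in x of the Case II, Subcase 2 objective g(x, c)."""
    return ((p * b - 1.0) * c + (1.0 - p)) / ((1.0 - p) / b)


def transition(c):
    """Value of x = (1/Z_2)^alpha at which beta * Z_2 = Z_1."""
    return c * b / (1.0 - b)


# Threshold budget from the displayed formula.
c_star = (1.0 - p) * (1.0 - b) / (2.0 - p - b - p * b + p * b ** 2)

gap_at_threshold = h_II(c_star) - transition(c_star)
gap_above = h_II(c_star + 0.05) - transition(c_star + 0.05)
gap_below = h_II(c_star - 0.05) - transition(c_star - 0.05)
print(c_star, gap_at_threshold, gap_above, gap_below)
```

At the threshold the two quantities coincide up to floating-point error; above it $h_{\text{II}}$ falls below the transition point, and below it $h_{\text{II}}$ lies above, matching the two regimes described in the text.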

\begin{tabular}{cccc}
$\widehat{c}$ & $\mathcal{T}_{mm}(\widehat{c})=[Z_{1},Z_{2}]$ & $\mathcal{T}_{auc}(\widehat{c})=[Z_{1}^{\prime},Z_{2}^{\prime}]$ & $Z_{1}-Z_{1}^{\prime}$ \\
\hline
0.10 & [1.2252, 1.3111] & [1.2187, 1.3026] & 0.0065 \\
0.20 & [1.2022, 1.3861] & [1.1903, 1.3653] & 0.0119 \\
0.30 & [1.1802, 1.4803] & [1.1644, 1.4421] & 0.0158 \\
0.40 & [1.1461, 1.5584] & [1.1346, 1.5203] & 0.0115 \\
0.50 & [1.1156, 1.6560] & [1.1070, 1.6155] & 0.0086 \\
0.60 & [1.0881, 1.7839] & [1.0819, 1.7403] & 0.0063 \\
0.70 & [1.0632, 1.9635] & [1.0589, 1.9154] & 0.0043 \\
0.80 & [1.0404, 2.2476] & [1.0377, 2.1926] & 0.0026 \\
\end{tabular}
Table 4: Comparison of the optimal ranges of $G_{2}$ students to debias under the two measures of unfairness, with parameters $\alpha=3$, $\beta=0.8$, and $p=0.25$. We compute the optimal intervals under both measures of unfairness and find that they overlap by 95% on average.
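
The entries of Table 4 admit a simple consistency check. The sketch below assumes that the budget satisfies $\widehat{c}=(1/Z_{1})^{\alpha}-(1/Z_{2})^{\alpha}$, as the relation between $\widehat{c}$ and $x=(1/Z_{2})^{\alpha}$ at the case transition suggests; this reading is an assumption, and the variable names are hypothetical. It also recomputes the last column from the rounded endpoints.

```python
ALPHA = 3  # Table 4 uses alpha = 3

# (c_hat, T_mm = [Z1, Z2], T_auc = [Z1', Z2'], reported Z1 - Z1')
rows = [
    (0.10, (1.2252, 1.3111), (1.2187, 1.3026), 0.0065),
    (0.20, (1.2022, 1.3861), (1.1903, 1.3653), 0.0119),
    (0.30, (1.1802, 1.4803), (1.1644, 1.4421), 0.0158),
    (0.40, (1.1461, 1.5584), (1.1346, 1.5203), 0.0115),
    (0.50, (1.1156, 1.6560), (1.1070, 1.6155), 0.0086),
    (0.60, (1.0881, 1.7839), (1.0819, 1.7403), 0.0063),
    (0.70, (1.0632, 1.9635), (1.0589, 1.9154), 0.0043),
    (0.80, (1.0404, 2.2476), (1.0377, 2.1926), 0.0026),
]


def mass(interval, alpha=ALPHA):
    """(1/Z1)^alpha - (1/Z2)^alpha for an interval [Z1, Z2] (assumed budget formula)."""
    lo, hi = interval
    return lo ** -alpha - hi ** -alpha


budget_errors = []  # |mass of interval - c_hat| for both measures
shift_errors = []   # |recomputed Z1 - Z1' minus the reported value|
for c_hat, t_mm, t_auc, reported_shift in rows:
    budget_errors.append(abs(mass(t_mm) - c_hat))
    budget_errors.append(abs(mass(t_auc) - c_hat))
    shift_errors.append(abs((t_mm[0] - t_auc[0]) - reported_shift))

print(max(budget_errors), max(shift_errors))
```

Both maxima stay at the level expected from four-decimal rounding of the endpoints, so under this assumption the two measures spend the same budget on different, slightly shifted intervals.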

14 Proofs from Section 5.1

14.1 Auxiliary results for Section 5.1

Recall that, in this section, we consider the generalization of the model from Section 2 in which students' true potentials follow a generic continuous, integrable cdf $F$. Moreover, we write $[x]^{+}:=\max(0,x)$ for a number or a function $x$. Recall also that, as in Section 5.1, we abuse notation and identify a student $\theta$ with their potential $Z(\theta)$.

Lemma 14.1

Let $\rho$ be an RVP. Under any continuous distribution of potentials $F$, we have

\begin{align}
\mu_{\rho}(\theta)&=\rho(\theta)\left((1-p)\int_{\theta}^{\infty}dF+p\left[\int_{\theta}^{\theta/\beta}\rho\,dF+\int_{\theta/\beta}^{\infty}dF\right]\right)\notag\\
&\quad+(1-\rho(\theta))\left((1-p)\int_{\beta\theta}^{\infty}dF+p\left[\int_{\beta\theta}^{\theta}\rho\,dF+\int_{\theta}^{\infty}dF\right]\right),\tag{8}\\
m_{\rho}(\theta)&=\left[(1-\rho(\theta))(1-p)\int_{\beta\theta}^{\theta}dF+p\left[(1-\rho(\theta))\int_{\beta\theta}^{\theta}\rho\,dF-\rho(\theta)\int_{\theta}^{\theta/\beta}(1-\rho)\,dF\right]\right]^{+}.\tag{9}
\end{align}
Proof 14.2

Proof. Suppose a student appears to have potential $\tau$, possibly after having been debiased. Then under $\mu_{\rho}$, they will be matched to school $s(\tau)$ given by

\[
s(\tau)=(1-p)\int_{\tau}^{\infty}dF+p\left[\int_{\tau}^{\tau/\beta}\rho\,dF+\int_{\tau/\beta}^{\infty}dF\right],
\]

that is, they will appear after all non-disadvantaged students with true potential exceeding $\tau$; those disadvantaged students with potential exceeding $\tau/\beta$; and those disadvantaged students who receive a voucher and have potential in the interval $(\tau,\tau/\beta)$.

A student with true potential $\theta$ now receives a voucher with probability $\rho(\theta)$, so by the law of total expectation, we have

\[
\mu_{\rho}(\theta)=\rho(\theta)s(\theta)+(1-\rho(\theta))s(\beta\theta),
\]

which is exactly (8). Equation (9) follows from (8) and the definitions of displacement and $\mu(\theta)$. \Halmos
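
To make (8) concrete, the following sketch evaluates $s(\tau)$ and $\mu_{\rho}(\theta)$ in closed form under two illustrative assumptions that are not part of the lemma: $F$ is a Pareto cdf $F(t)=1-t^{-\alpha}$ on $[1,\infty)$, and $\rho$ is the indicator of an interval $[z_1,z_2]$. All function names and parameter values are hypothetical.

```python
ALPHA, BETA, P = 3.0, 0.8, 0.25  # illustrative parameters (those reported for Table 4)
INF = float("inf")


def tail(a):
    """Mass of potentials above a under the assumed Pareto cdf F(t) = 1 - t**-ALPHA on [1, inf)."""
    return min(1.0, a ** -ALPHA) if a > 0 else 1.0


def voucher_mass(a, b, z1, z2):
    """Integral of rho dF over [a, b] when rho is the indicator of [z1, z2]."""
    lo, hi = max(a, z1), min(b, z2)
    return tail(lo) - tail(hi) if lo < hi else 0.0


def school(tau, z1, z2):
    """s(tau): mass of students ranked ahead of a student with apparent potential tau."""
    ahead_disadvantaged = voucher_mass(tau, tau / BETA, z1, z2) + tail(tau / BETA)
    return (1 - P) * tail(tau) + P * ahead_disadvantaged


def mu_rho(theta, z1, z2):
    """Equation (8) for a disadvantaged student with true potential theta."""
    rho = 1.0 if z1 <= theta <= z2 else 0.0
    return rho * school(theta, z1, z2) + (1 - rho) * school(BETA * theta, z1, z2)


theta = 1.5
no_vouchers = mu_rho(theta, INF, INF)   # rho = 0 everywhere
all_vouchers = mu_rho(theta, 1.0, INF)  # rho = 1 everywhere
print(no_vouchers, all_vouchers)
```

With no vouchers the value reduces to $(1-p)\int_{\beta\theta}^{\infty}dF+p\int_{\theta}^{\infty}dF$, and with vouchers for everyone it reduces to $\int_{\theta}^{\infty}dF$, the rank the student would have in the absence of bias.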

We next report more expressions for $\mu_{\rho}$ and $\mu_{\rho}^{\prime}$, as they will be used in the upcoming proofs.

Proposition 14.3

Let $\rho$ be an RVP. For all $\theta\in\Theta$, we have

\begin{align}
\mu_{\rho}(\theta)&=-\rho(\theta)\left((1-p)\int_{\beta\theta}^{\theta}dF+p\left[\int_{\theta}^{\theta/\beta}(1-\rho)\,dF+\int_{\beta\theta}^{\theta}\rho\,dF\right]\right)\notag\\
&\quad+\left((1-p)\int_{\beta\theta}^{\infty}dF+p\left[\int_{\beta\theta}^{\theta}\rho\,dF+\int_{\theta}^{\infty}dF\right]\right).\tag{10}
\end{align}

Moreover, if $\mu_{\rho}$ is differentiable at $\theta$, we have

\begin{align}
\mu_{\rho}^{\prime}(\theta)&=-f(\theta)-\rho^{\prime}(\theta)\left[p\int_{\theta}^{\theta/\beta}(1-\rho)\,dF+(1-p)\int_{\beta\theta}^{\theta}dF+p\int_{\beta\theta}^{\theta}\rho\,dF\right]\notag\\
&\qquad-p\rho(\theta)\left[\frac{1}{\beta}(1-\rho(\theta/\beta))f(\theta/\beta)-(1-\rho(\theta))f(\theta)\right]\notag\\
&\qquad+(1-\rho(\theta))\left[(1-p)(f(\theta)-\beta f(\beta\theta))+p(f(\theta)\rho(\theta)-\beta f(\beta\theta)\rho(\beta\theta))\right].\tag{11}
\end{align}
Proof 14.4

Proof. (10) follows by simple rearrangement of (8), and (11) follows by differentiating (10) term by term. \Halmos
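
The rearrangement can be spelled out using $s(\tau)$ from the proof of Lemma 14.1; the steps below are a sketch of one way to organize it. Writing (8) as $\mu_{\rho}(\theta)=\rho(\theta)s(\theta)+(1-\rho(\theta))s(\beta\theta)$ gives

\[
\mu_{\rho}(\theta)=s(\beta\theta)-\rho(\theta)\left[s(\beta\theta)-s(\theta)\right],
\]

and, since $\int_{\theta}^{\infty}dF-\int_{\theta/\beta}^{\infty}dF=\int_{\theta}^{\theta/\beta}dF$,

\[
s(\beta\theta)-s(\theta)=(1-p)\int_{\beta\theta}^{\theta}dF+p\left[\int_{\beta\theta}^{\theta}\rho\,dF+\int_{\theta}^{\theta/\beta}(1-\rho)\,dF\right].
\]

Substituting the second display into the first yields the two terms of (10).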

Definition 14.5

The RVP that assigns no vouchers, denoted $\rho_{0}$, is defined by $\rho_{0}(\theta):=0$ for all $\theta\in[1,\infty)$. Note that $\mu_{\rho_{0}}(\theta)=(1-p)\int_{\beta\theta}^{\infty}dF+p\int_{\theta}^{\infty}dF$ and $m_{\rho_{0}}(\theta)=m(\theta)=(1-p)\int_{\beta\theta}^{\theta}dF$.

14.2 Necessary and sufficient conditions for incentive compatibility

In this section we develop necessary and sufficient conditions for incentive compatibility through the concept of well-behavedness and prove an important technical lemma.

Definition 14.6 (Well-behaved RVP)

We call an RVP $\rho$ well-behaved if it is everywhere continuously differentiable except for a set of isolated points where it has non-negative, right-continuous jump discontinuities.

Lemma 14.7 (Necessary and sufficient conditions for incentive compatibility)

Let $\rho$ be a well-behaved RVP and $F$ be an arbitrary continuous distribution of potentials. $\rho$ is incentive compatible with respect to $F$ if and only if, for all $\theta$ such that $\rho$ is continuously differentiable at $\theta$, we have $\rho^{\prime}(\theta)\geq 0$ or $\mu_{\rho}^{\prime}(\theta)\leq 0$.

Proof 14.8

Proof. Recall that ρ𝜌\rhoitalic_ρ is incentive compatible if μρsubscript𝜇𝜌\mu_{\rho}italic_μ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT is everywhere non-increasing. Observe from (10) in Proposition 14.3 that μρsubscript𝜇𝜌\mu_{\rho}italic_μ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT is continuous at θ𝜃\thetaitalic_θ if and only if ρ𝜌\rhoitalic_ρ is continuous at θ𝜃\thetaitalic_θ. On the other hand, if μρsubscript𝜇𝜌\mu_{\rho}italic_μ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT is not continuous at θ𝜃\thetaitalic_θ then it must have a negative jump-discontinuity caused by a positive jump-discontinuity of ρ𝜌\rhoitalic_ρ (since all other terms of (10) are positive). Further note that if μρsubscript𝜇𝜌\mu_{\rho}italic_μ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT is not continuously differentiable at θΘ𝜃Θ\theta\in\Thetaitalic_θ ∈ roman_Θ, then ρ𝜌\rhoitalic_ρ is not continuously differentiable at θ𝜃\thetaitalic_θ, βθ𝛽𝜃\beta\thetaitalic_β italic_θ or θ/β𝜃𝛽\theta/\betaitalic_θ / italic_β; so the set of points where μρsubscript𝜇𝜌\mu_{\rho}italic_μ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT is not continuously differentiable also forms an isolated set.

Consider any $\theta\in\Theta$ where $\mu_{\rho}$ is continuously differentiable; there, $\mu_{\rho}$ is non-increasing iff $\mu_{\rho}'(\theta)\leq 0$. By collecting the terms for $f(\theta)$, $f(\beta\theta)$ and $f(\theta/\beta)$ in (11), one can see that $\mu_{\rho}'(\theta)\leq 0$ if $\rho'(\theta)\geq 0$.

We have established that $\mu_{\rho}$ is continuous at all but an isolated set of negative jump-discontinuities, and that $\mu_{\rho}$ is continuously differentiable and non-increasing at all but an isolated set of points. $\mu_{\rho}$ is therefore everywhere non-increasing, as required. \Halmos

Lemma 14.9

Suppose $\rho$ is a well-behaved RVP such that for all $\theta$ where $\rho$ is continuously differentiable, we have $\rho'(\theta)\geq-\phi(\theta)$, with

\[
\phi(\theta):=\frac{\alpha(1-p)}{\theta\left[p(1-\beta^{\alpha})+(1-p)(\beta^{-\alpha}-1)\right]}.
\]

Then, $\rho$ is incentive compatible.

Proof 14.10

Proof. By Lemma 14.7, it suffices to show that $\mu_{\rho}'(\theta)\leq 0$ for $\theta$ such that $\rho'(\theta)$ exists and is continuous, and $\rho'(\theta)<0$. Now define

\[
\mathcal{L}=p\int_{\theta}^{\theta/\beta}(1-\rho)\,dF+(1-p)\int_{\beta\theta}^{\theta}\,dF+p\int_{\beta\theta}^{\theta}\rho\,dF
\quad\text{and}\quad
W=-\rho(\theta)(1-p)f(\theta)-(1-\rho(\theta))(1-p)\beta f(\beta\theta),
\]

and note that $\mathcal{L}\geq 0$, and $W\leq 0$. Simple calculations based on (11) in Proposition 14.3 show that $\mu_{\rho}'(\theta)\leq-\rho'(\theta)\mathcal{L}+W$. It is therefore enough to prove $-\rho'(\theta)\leq\frac{-W}{\mathcal{L}}$. Compute next

\begin{align*}
\mathcal{L} &\leq p\int_{\theta}^{\theta/\beta}\,dF+(1-p)\int_{\beta\theta}^{\theta}\,dF\leq\theta^{-\alpha}\left[p(1-\beta^{\alpha})+(1-p)(\beta^{-\alpha}-1)\right]\quad\text{and}\\
-W &\geq(1-p)\min\left\{f(\theta),\beta f(\beta\theta)\right\}=\frac{\alpha(1-p)}{\theta^{1+\alpha}}.
\end{align*}

This yields

\[
\frac{-W}{\mathcal{L}}\geq\frac{\alpha(1-p)}{\theta\left[p(1-\beta^{\alpha})+(1-p)(\beta^{-\alpha}-1)\right]}=\phi(\theta).
\]

We have shown $\frac{-W}{\mathcal{L}}\geq\phi(\theta)$, which combined with the assumption that $\phi(\theta)\geq-\rho'(\theta)$ completes the proof.\Halmos

14.3 Proof of Theorem 5.2: properties of PropMs

We next prove Lemmas 14.11, 14.13, and 14.19, which together constitute Theorem 5.2.

Lemma 14.11

The proportional-to-mistreatment RVP $\rho_{m}$ is $\frac{2\widehat{c}\alpha}{1-\beta^{\alpha}}$-individually fair.

Proof 14.12

Proof. $\rho_{m}$ is everywhere continuous and continuously differentiable on $\Theta$, except at $\theta=1/\beta$. $\rho_{m}$ is therefore Lipschitz for a constant given by the supremum of the absolute value of the derivative, which occurs at $\theta=1$ where $\rho_{m}'(\theta)=\frac{2\widehat{c}\alpha}{1-\beta^{\alpha}}$.\Halmos
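The Lipschitz constant can be sanity-checked numerically. The sketch below is illustrative only: it assumes, as the formulas in this section suggest, that $F$ is the Pareto distribution with density $f(\theta)=\alpha\theta^{-1-\alpha}$ on $\Theta=[1,\infty)$ and that $\rho_{m}(\theta)=\frac{2\widehat{c}}{1-\beta^{\alpha}}\int_{\beta\theta}^{\theta}dF$. The function names and parameter values are hypothetical choices made for the example.

```python
import numpy as np

def rho_m(theta, c_hat, alpha, beta):
    """Proportional-to-mistreatment RVP under the assumed Pareto(alpha) law on [1, inf)."""
    cdf = lambda t: 1.0 - np.maximum(t, 1.0) ** (-alpha)   # CDF is 0 below 1
    return 2.0 * c_hat / (1.0 - beta ** alpha) * (cdf(theta) - cdf(beta * theta))

def max_slope(c_hat, alpha, beta, n=400_001):
    """Largest absolute finite-difference slope of rho_m on a fine grid of [1, 20]."""
    theta = np.linspace(1.0, 20.0, n)
    vals = rho_m(theta, c_hat, alpha, beta)
    return float(np.max(np.abs(np.diff(vals) / np.diff(theta))))

# Illustrative parameters (assumptions, not values from the paper).
c_hat, alpha, beta = 0.2, 2.0, 0.8
lipschitz_bound = 2.0 * c_hat * alpha / (1.0 - beta ** alpha)
print(max_slope(c_hat, alpha, beta), lipschitz_bound)
```

The observed slope should approach the bound from below, with the steepest segment adjacent to $\theta=1$.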

Lemma 14.13

The proportional-to-mistreatment RVP $\rho_{m}$ is incentive compatible for

\[
\widehat{c}\leq\frac{1-p}{2\left[p(1-\beta^{\alpha})+(1-p)(\beta^{-\alpha}-1)\right]}.
\]
Proof 14.14

Proof. Applying Lemma 14.9, it suffices to show

\[
-\rho_{m}'(\theta)=2\alpha\widehat{c}\beta^{-\alpha}\theta^{-\alpha-1}\leq\phi(\theta)=\frac{\alpha(1-p)}{\theta\left[p(1-\beta^{\alpha})+(1-p)(\beta^{-\alpha}-1)\right]}
\tag{12}
\]

for $\theta\geq 1/\beta$ (since $\rho_{m}'(\theta)\geq 0$ for $\theta<1/\beta$). Note that this is tightest when $\theta=1/\beta$, which gives the condition for $\widehat{c}$. \Halmos
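For readers who want the last step spelled out, the following worked derivation shows how (12) reduces to the bound on $\widehat{c}$. The shorthand $D$ is introduced here only for brevity and does not appear elsewhere in the paper.

```latex
% Shorthand (assumption for this note): D := p(1-\beta^{\alpha}) + (1-p)(\beta^{-\alpha}-1) > 0.
\begin{align*}
2\alpha\widehat{c}\,\beta^{-\alpha}\theta^{-\alpha-1} \leq \frac{\alpha(1-p)}{\theta D}
  &\iff 2\widehat{c}\,(\beta\theta)^{-\alpha} \leq \frac{1-p}{D}
  && \text{(multiply both sides by } \theta/\alpha > 0\text{)}.
\end{align*}
% The left-hand side is decreasing in \theta, so on \theta \geq 1/\beta it is largest
% at \theta = 1/\beta, where (\beta\theta)^{-\alpha} = 1. The condition becomes
\[
2\widehat{c} \leq \frac{1-p}{D}
\quad\iff\quad
\widehat{c} \leq \frac{1-p}{2\left[p(1-\beta^{\alpha})+(1-p)(\beta^{-\alpha}-1)\right]}.
\]
```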

Lemma 14.15

For the proportional-to-mistreatment RVP $\rho_{m}$, we have

\[
\sup_{\theta\in\Theta}\,(1-\rho_{m}(\theta))\int_{\beta\theta}^{\theta}\,dF=(1-\beta^{\alpha})\xi(\widehat{c}),
\quad\text{where}\quad
\xi(\widehat{c}):=\begin{cases}
1-2\widehat{c}, & \widehat{c}\leq 1/4,\\
\frac{1}{8\widehat{c}}, & \widehat{c}>1/4.
\end{cases}
\]
Proof 14.16

Proof. With $F(a,b):=\int_{a}^{b}\,dF$, define $q:=\frac{2\widehat{c}}{1-\beta^{\alpha}}$ and $y(\theta):=F(\beta\theta,\theta)$. Now write

\[
(1-\rho_{m}(\theta))\int_{\beta\theta}^{\theta}\,dF=\left(1-\frac{2\widehat{c}}{1-\beta^{\alpha}}F(\beta\theta,\theta)\right)F(\beta\theta,\theta)=(1-qy(\theta))y(\theta).
\]

This is a quadratic in $y$ that increases from $y=0$ to its maximum at $y=1/(2q)$. Observe that $\max_{\theta}F(\beta\theta,\theta)=1-\beta^{\alpha}$, which is attained at $\theta=1/\beta$. This means that if $\widehat{c}\leq 1/4$, then

\[
y(\theta)=F(\beta\theta,\theta)\leq 1-\beta^{\alpha}\leq\frac{1-\beta^{\alpha}}{4\widehat{c}}=\frac{1}{2q},
\]

and the maximum of the quadratic over $y$ is realized at the maximum value of $y$. Thus,

\[
\sup_{\theta\in\Theta,\,\widehat{c}\leq 1/4}\,(1-qy(\theta))y(\theta)=\left(1-\frac{2\widehat{c}}{1-\beta^{\alpha}}(1-\beta^{\alpha})\right)(1-\beta^{\alpha})=(1-\beta^{\alpha})(1-2\widehat{c}).
\]

On the other hand, if $\widehat{c}>1/4$, then this expression reaches its maximum when the quadratic does, at $y=1/(2q)$, giving

\[
\sup_{\theta\in\Theta,\,\widehat{c}>1/4}\,(1-qy(\theta))y(\theta)=\left(1-q\frac{1}{2q}\right)\frac{1}{2q}=\left(1-\frac{1}{2}\right)\frac{1}{2q}=\frac{1-\beta^{\alpha}}{8\widehat{c}}.
\]

Combining the two completes the proof.\Halmos
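The closed form in Lemma 14.15 can be compared against a brute-force grid search. This is a minimal sketch under the assumption (suggested by the formulas in this section, but stated here as an assumption) that $F$ is Pareto with density $\alpha\theta^{-1-\alpha}$ on $\Theta=[1,\infty)$. The helper names, grid, and parameter values are hypothetical.

```python
import numpy as np

def pareto_mass(lo, hi, alpha):
    """Mass that the assumed Pareto(alpha) law on [1, inf) puts on [lo, hi]."""
    cdf = lambda t: 1.0 - np.maximum(t, 1.0) ** (-alpha)
    return cdf(hi) - cdf(lo)

def xi(c_hat):
    """Piecewise function xi from the statement of Lemma 14.15."""
    return 1.0 - 2.0 * c_hat if c_hat <= 0.25 else 1.0 / (8.0 * c_hat)

def sup_on_grid(c_hat, alpha, beta, n=200_001):
    """Grid approximation of sup_theta (1 - rho_m(theta)) * F(beta*theta, theta)."""
    theta = np.geomspace(1.0, 1e3, n)            # log-spaced grid on [1, 1000]
    y = pareto_mass(beta * theta, theta, alpha)  # y(theta) = F(beta*theta, theta)
    q = 2.0 * c_hat / (1.0 - beta ** alpha)
    return float(np.max((1.0 - q * y) * y))

# Illustrative parameters (assumptions, not values from the paper).
alpha, beta = 2.0, 0.8
for c_hat in (0.1, 0.25, 0.4):
    closed_form = (1.0 - beta ** alpha) * xi(c_hat)
    print(c_hat, sup_on_grid(c_hat, alpha, beta), closed_form)
```

The grid value can only undershoot the true supremum, so agreement up to a small tolerance in both regimes of $\widehat{c}$ is the expected outcome.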

Lemma 14.17

For the proportional-to-mistreatment RVP $\rho_{m}$, the maximum mistreatment $mm_{\rho_{m}}$ satisfies

\[
mm_{\rho_{m}}\leq(1-p(1-2\widehat{c}))(1-\beta^{\alpha})\xi(\widehat{c}),
\]

where $\xi(\cdot)$ is defined as in Lemma 14.15.

Proof 14.18

Proof. Abbreviate $\rho=\rho_{m}$. Apply $\rho(\theta)\leq\rho(1/\beta)$ to (9) and simplify to get

\begin{align*}
m_{\rho_{m}}(\theta) &\leq\left[(1-p)(1-\rho(\theta))\int_{\beta\theta}^{\theta}\,dF+p(1-\rho(\theta))\rho(1/\beta)\int_{\beta\theta}^{\theta}\,dF\right]^{+}\\
&=\left[(1+p(\rho(1/\beta)-1))(1-\rho(\theta))\int_{\beta\theta}^{\theta}\,dF\right]^{+}.
\end{align*}

The thesis follows by taking the supremum over $\theta\in\Theta$, applying Lemma 14.15, and substituting $\rho(1/\beta)=2\widehat{c}$.\Halmos

Recall that we let $mm^{*}(\widehat{c})$ be the maximum mistreatment achieved by the optimal policy from Theorem 4.1 with the amount of resources being $\widehat{c}$. We have

\[
mm^{*}(\widehat{c})=\begin{cases}
(1-p-\widehat{c})(1-\beta^{\alpha})+\widehat{c}p & \text{if }\widehat{c}\leq\frac{(1-p)(1-\beta^{\alpha})}{1-p+1-\beta^{\alpha}},\\
(1-p)(1-\beta^{\alpha})\frac{1-\widehat{c}}{1-p\beta^{\alpha}} & \text{otherwise.}
\end{cases}
\tag{13}
\]
Lemma 14.19

Suppose $p<1-\beta^{\alpha}$, $p\leq 1/2$, and $\widehat{c}\leq\frac{(1-p)(1-\beta^{\alpha})}{1-p+1-\beta^{\alpha}}$. If $\widehat{c}\geq 1-\frac{p+1-\beta^{\alpha}}{4p(1-\beta^{\alpha})}$, then $mm_{\rho_{m}}\leq mm^{*}(\widehat{c})$.

Proof 14.20

Proof. Let $Q:=mm^{*}(\widehat{c})-mm_{\rho_{m}}$; we need to show $Q\geq 0$. Using Theorem 4.1 and Lemma 14.17, compute

\[
Q=mm^{*}(\widehat{c})-mm_{\rho_{m}}\geq(1-p)(1-\beta^{\alpha})-\widehat{c}(1-\beta^{\alpha}-p)-(1+p(2\widehat{c}-1))(1-\beta^{\alpha})\xi(\widehat{c}).
\]

For $\widehat{c}\leq 1/4$, we now have

\[
Q\geq\widehat{c}\left[(1-4p(1-\widehat{c}))(1-\beta^{\alpha})+p\right].
\tag{14}
\]

If $p\leq\frac{1}{4}$, then the right-hand side of (14) is nonnegative, concluding the proof. Thus, assume $p>\frac{1}{4}$. Since $\widehat{c}>0$, we can drop the leading $\widehat{c}$, so for $Q\geq 0$, we need

\[
p(1-4(1-\beta^{\alpha})(1-\widehat{c}))\geq-(1-\beta^{\alpha}).
\]

Rearranging leads to the thesis.
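The rearrangement can be written out as follows. The shorthand $a:=1-\beta^{\alpha}$ is introduced here only to shorten the display; note that $a>p>1/4$ in this case, so dividing by $4pa$ preserves the direction of the inequality.

```latex
% Shorthand (assumption for this note): a := 1 - \beta^{\alpha}, with a > p > 1/4.
\begin{align*}
p\bigl(1-4a(1-\widehat{c})\bigr) \geq -a
  &\iff 4pa(1-\widehat{c}) \leq p + a \\
  &\iff 1-\widehat{c} \leq \frac{p+a}{4pa} \\
  &\iff \widehat{c} \geq 1-\frac{p+1-\beta^{\alpha}}{4p(1-\beta^{\alpha})},
\end{align*}
% which is exactly the lower bound on \widehat{c} assumed in Lemma 14.19.
```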

Consider next the case where $\widehat{c}\geq 1/4$. In this case we want to show the inequality

\[
(1-p)(1-\beta^{\alpha})-\widehat{c}(1-\beta^{\alpha}-p)-(1+p(2\widehat{c}-1))(1-\beta^{\alpha})\frac{1}{8\widehat{c}}\geq 0.
\]

Again since $\widehat{c}>0$, we can multiply by $\widehat{c}$ to get a quadratic in $\widehat{c}$; call the resulting expression $W(\widehat{c})$:

\[
W(\widehat{c})=\widehat{c}(1-p)(1-\beta^{\alpha})-\widehat{c}^{2}(1-\beta^{\alpha}-p)-(1+p(2\widehat{c}-1))(1-\beta^{\alpha})\frac{1}{8}.
\]

Since $p<1-\beta^{\alpha}$, $W''(\widehat{c})\leq 0$, hence this is a concave quadratic. One can verify that if $p\leq 1/2$ then $W(1/4)\geq 0$ and $W(1/2)\geq 0$, which means that $W$ must also be non-negative for $\widehat{c}\in[1/4,1/2]$, as required.\Halmos
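The two endpoint checks can be carried out explicitly. The computation below is a worked verification under the lemma's hypotheses, again with the shorthand $a:=1-\beta^{\alpha}\in(0,1]$, which is notation introduced only for this note.

```latex
% Shorthand (assumption for this note): a := 1 - \beta^{\alpha} \in (0, 1].
\begin{align*}
W(1/4) &= \tfrac{1}{4}(1-p)a - \tfrac{1}{16}(a-p) - \tfrac{1}{8}\bigl(1-\tfrac{p}{2}\bigr)a
        = \frac{a(1-3p)+p}{16},\\
W(1/2) &= \tfrac{1}{2}(1-p)a - \tfrac{1}{4}(a-p) - \tfrac{1}{8}a
        = \frac{a(1-4p)+2p}{8}.
\end{align*}
% If the coefficient of a is non-negative, both numerators are clearly non-negative.
% Otherwise, a \leq 1 and a negative coefficient give
%   a(1-3p) + p  \geq (1-3p) + p  = 1 - 2p \geq 0,
%   a(1-4p) + 2p \geq (1-4p) + 2p = 1 - 2p \geq 0,
% where the last inequality in each line uses p \leq 1/2.
```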

14.4 Increasing-with-Potential RVPs

Proof 14.21

Proof of Lemma 5.3. Directly from Lemma 14.7.\Halmos

Proof 14.22

Proof of Theorem 5.4. We claim that there exists $\delta>0$ with $\delta<\theta(1-\beta)$ such that on $I:=(\theta-\delta,\theta+\delta)$, the following properties hold for all $t\in I$: $\rho$ is continuous and differentiable at $t$; $\rho$ is monotonically decreasing at $t$; and $0<\rho(t)\leq(1+\rho(\theta))/2$. The existence of an interval that satisfies the first and second properties follows since $\rho$ is continuously differentiable in some neighborhood of $\theta$ and has strictly negative derivative. The third follows since $\rho$ has a strictly negative derivative at $\theta$, so it must be strictly bounded away from $0$ and $1$ itself, and then one can restrict $\delta$ to guarantee the same for $t$ close to $\theta$. Note also that $I\subset(\beta\theta,\theta/\beta)$.

Next, fix $\varepsilon>0$. One can then construct a distribution $f$ that satisfies the following conditions: $f$ is continuous and differentiable everywhere; $f(\theta)=\varepsilon$; $f(t)=0$ for $t\notin I$; and $\int_{\theta}^{\theta+\delta}f(t)\,dt\geq\frac{1}{2}$. This can be done for instance by constructing a piece-wise constant function that satisfies all but the first condition, then smoothing it out with an appropriate bump function via standard techniques.
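One possible piecewise-constant candidate, before any smoothing, can be sketched as follows. This is only an illustration of the kind of construction described above, not the authors' specific choice: the plateau layout, the function name `make_density`, and the numerical values are all assumptions, and the smoothing step is omitted.

```python
def make_density(theta, delta, eps):
    """Piecewise-constant density supported inside I = (theta - delta, theta + delta).

    Hypothetical layout: value eps on (theta - delta/4, theta + delta/4) and a tall
    plateau on [theta + delta/2, theta + 3*delta/4]; zero elsewhere.
    """
    if not (0 < eps * delta < 2):
        raise ValueError("this layout needs 0 < eps * delta < 2")
    # Plateau height chosen so that the total mass equals 1.
    h = (1.0 - eps * delta / 2.0) * 4.0 / delta

    def f(t):
        if abs(t - theta) < delta / 4.0:
            return eps          # small density around theta, so f(theta) = eps
        if theta + delta / 2.0 <= t <= theta + 3.0 * delta / 4.0:
            return h            # bulk of the mass sits to the right of theta
        return 0.0

    total_mass = eps * (delta / 2.0) + h * (delta / 4.0)
    right_mass = eps * (delta / 4.0) + h * (delta / 4.0)  # mass on [theta, theta + delta]
    return f, total_mass, right_mass

# Illustrative values; delta < theta * (1 - beta) must hold for the beta in use.
f, total_mass, right_mass = make_density(theta=2.0, delta=0.2, eps=0.01)
print(f(2.0), total_mass, right_mass)
```

In this layout the mass to the right of $\theta$ equals $1-\varepsilon\delta/4$, which exceeds $1/2$ whenever $\varepsilon\delta<2$, so shrinking $\varepsilon$ never violates the mass condition.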

From (11) of Proposition 14.3 we can compute

\begin{align*}
\mu_{\rho}'(\theta) &=-\rho'(\theta)\left(p\int_{\theta}^{\theta/\beta}(1-\rho)\,dF+(1-p)\int_{\beta\theta}^{\theta}\,dF+p\int_{\beta\theta}^{\theta}\rho\,dF\right)\\
&\qquad-\frac{1}{\beta}p\rho(\theta)f\left(\frac{\theta}{\beta}\right)\left(1-\rho\left(\frac{\theta}{\beta}\right)\right)-\beta(1-\rho(\theta))f(\beta\theta)(1-p(1-\rho(\beta\theta)))\\
&\qquad-f(\theta)(p(1-\rho(\theta))(1-2\rho(\theta))+\rho(\theta))\\
&=(-\rho'(\theta))\left(p\int_{\theta}^{\theta+\delta}(1-\rho)\,dF+(1-p)\int_{\theta-\delta}^{\theta}\,dF+p\int_{\theta-\delta}^{\theta}\rho\,dF\right)-\varepsilon(p(1-\rho(\theta))(1-2\rho(\theta))+\rho(\theta))\\
&\geq(-\rho'(\theta))p\int_{\theta}^{\theta+\delta}(1-\rho)\,dF-\varepsilon\\
&\geq\frac{1}{2}(-\rho'(\theta))p(1-\rho(\theta))\int_{\theta}^{\theta+\delta}\,dF-\varepsilon\\
&\geq\frac{1}{4}(-\rho'(\theta))p(1-\rho(\theta))-\varepsilon. \tag{15}
\end{align*}

Here we used the fact that $p(1-\rho(\theta))(1-2\rho(\theta))+\rho(\theta)\in[0,1]$.
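
One way to check this bound, sketched under the assumption that $p\in[0,1]$ and $\rho(\theta)\in[0,1]$ as elsewhere in the model, is to note that the expression is affine in $p$ and to evaluate it at the two endpoints. The shorthand $r=\rho(\theta)$ and the name $g$ are introduced here for illustration only:
\begin{align*}
g(p,r) &:= p(1-r)(1-2r)+r,\\
g(0,r) &= r\in[0,1],\\
g(1,r) &= 1-3r+2r^{2}+r = 1-2r(1-r)\in\left[\tfrac{1}{2},1\right].
\end{align*}
Since $g(\cdot,r)$ is affine in $p$, its value for any $p\in[0,1]$ lies between $g(0,r)$ and $g(1,r)$, and hence in $[0,1]$.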

Now the first term in (15) is strictly positive, so we can choose $\varepsilon$ strictly smaller than it to obtain $\mu_{\rho}'(\theta)>0$. Note that although $\rho$ might not be well behaved everywhere, it is well behaved on $I$, so we can apply Lemma 14.7 at this point to conclude that $\rho$ is not incentive compatible for $\theta$, completing the proof. \Halmos