1. Introduction
Paired comparisons are frequently used in ranking and rating problems. They are mainly applied when direct scaling is very uncertain, but comparing the objects to one another can provide more reliable statistical data. The range of possible applications is extremely broad; some examples are the following: education [1,2], sports [3,4,5], information retrieval [6], energy supply [7], the financial sector [8], management [9], and the food industry [10].
The most popular method is the Analytic Hierarchy Process (AHP), elaborated by Saaty [11,12] and developed further by others; see, for example, the detailed literature review in [13]. The method has many advantages: more than two options, several methods for evaluation, the possibility of incomplete comparisons, a simple condition for the uniqueness of the evaluation [14], the possibility of multi-level decisions [15], and the concept of consistency [16]. Nevertheless, although [17] proposed a very flexible and applicable composite indicator with an inferential approach consisting of multiple tests based on non-parametric techniques in a stochastic framework, in the case of AHP the lack of a stochastic framework rules out confidence intervals for the parameters and hypothesis tests concerning them.
Fundamentally different models of paired comparisons are the Thurstone-motivated stochastic models. The basic concept is the idea of latent random variables, presented in [18]. Thurstone assumed Gauss-distributed latent random variables and allowed two options in decisions, "worse" and "better". The method was later modified: Gauss distribution was replaced by logistic distribution in [19], and the resulting model is called the Bradley–Terry model (BTM). One of its main advantages is its simple mathematical formulae. Thurstone applied the least squares method for parameter estimation, whereas BTM applies maximum likelihood estimation (MLE) [20], and the uncomplicated formulae allow quick numerical methods for solving the optimization problems. The existence and uniqueness of the optimizer is a key issue in the case of ML estimation; the necessary and sufficient condition for it is proven in [21].
Ref. [22] handles the Bradley–Terry model, also allowing options such as a nonlinear logistic regression model, and finds the optimal regions of design. A detailed survey of paired comparisons, including regression models, is contained in [23].
The model was generalized to three options ("worse", "equal", and "better") in [24] for Gauss distribution and in [25] for logistic distribution. The latter paper applied maximum likelihood parameter estimation. Davidson made further modifications to the model concerning ties in [26]. For more than three options, generalizations can be found in [27] in the case of the Bradley–Terry model and in [28] in the case of Gauss distribution. In [29], it was proven that, for a broad set of cumulative distribution functions of the latent random variables, the models require the same conditions in order to evaluate the data uniquely: the strictly log-concave property of the probability density function is the crucial point of the uniqueness, while the assurance of the existence is hidden in the data structure. We mention that Gauss distribution and logistic distribution are included in the set of distributions having a strictly log-concave probability density function. Note that, due to the probabilistic background, the Thurstone-motivated models offer the opportunity of building in the home-field or first-mover advantage [30], testing hypotheses [31], and making forecasts [32]; therefore, they are worth investigating.
In [33], the author analyzed the structure of the comparisons, allowing both two and three options in choice. The author emphasized that not only the structure of the graph made from the compared pairs but also the results of the comparisons affect the existence of MLE. He applied data perturbations in the cases where comparisons were made but some outcomes did not occur. By these perturbations, the zero data values became positive, and these positive values guaranteed the strongly connected property of the directed graph constructed from the wins. However, these perturbations modified the data structure; therefore, it would be better to avoid them.
In [34], the authors investigated BTM with two options and provided estimations for the probability of the existence of MLE. The authors turned to the condition of Ford to check whether MLE exists uniquely or not. As the condition of Ford is a necessary and sufficient condition, it indicates explicitly whether MLE works or not. However, in the case of other distributions and/or more than two options, these investigations could not be performed due to the lack of a necessary and sufficient condition for the existence and uniqueness of MLE.
To continue this line of research, it would be conducive to have a (necessary and) sufficient condition for the existence and uniqueness. To the best of our knowledge, there is no such theorem in the research literature in the case of three options, and only two sufficient conditions are known. In this paper, we compare the known conditions, formulate their generalization, and prove it. Then, we compare the applicability of the different conditions from the following point of view: how often, and for what types of parameters, are they able to indicate the existence and uniqueness of MLE. We performed a large number of computer simulation repetitions for many parameter settings and use them to answer these questions.
The paper is organized as follows: In Section 2, the investigated model is described. In Section 3, we present new conditions under which the existence and uniqueness are fulfilled. The proof can be found in Appendix A. In Section 4, two real-life applications are presented. In Section 5, simulation results concerning the applicability are presented. Finally, a short summary is given.
2. The Investigated Model
Let $n$ denote the number of different objects to evaluate and let the numbers $1, 2, \ldots, n$ denote the objects themselves. We would like to evaluate (rank and rate) them with the help of the opinions of some persons called observers. We assume that every object has a latent random variable behind it, $\xi_i$, $i = 1, \ldots, n$. Let the number of the options in a choice be 3, namely, "worse", "equal", and "better", denoted by $C_1$, $C_2$, and $C_3$, respectively. The set of the real numbers $\mathbb{R}$ is the union of three intervals $I_1$, $I_2$, and $I_3$, which have no common points. Each option in judgment corresponds to one of these intervals. If the judgment comparing $i$ and $j$ is the option $C_k$, then the difference $\xi_i - \xi_j$ is situated in the interval $I_k$. The intervals are appointed by their initial points and endpoints, $-\infty$, $-d$, $d$, and $\infty$, so $I_1 = (-\infty, -d)$, $I_2 = [-d, d]$, and $I_3 = (d, \infty)$. The above intervals, together with the corresponding options, are presented in Figure 1.
We can write
$$\xi_i = m_i + \eta_i, \quad i = 1, \ldots, n, \qquad (1)$$
where $m_i$ are the strengths of the objects and $\eta_i$ are identically distributed random variables with expectation 0. The ranking of the expectations determines the ranking of the objects, and the differences in their values give information concerning the differences of the strengths. We want to estimate the expectations and the value of the border of "equal" ($d$) on the basis of the data. For that, we use maximum likelihood estimation.
The probabilities of the judgments $C_k$ can be determined based on the distribution of $\xi_i - \xi_j$ as follows [35]:
$$P(C_1 \text{ comparing } i \text{ and } j) = P(\xi_i - \xi_j < -d) = F(-d - (m_i - m_j)), \qquad (3)$$
$$P(C_2 \text{ comparing } i \text{ and } j) = P(-d \le \xi_i - \xi_j \le d) = F(d - (m_i - m_j)) - F(-d - (m_i - m_j)), \qquad (4)$$
$$P(C_3 \text{ comparing } i \text{ and } j) = P(d < \xi_i - \xi_j) = 1 - F(d - (m_i - m_j)), \qquad (5)$$
where $F$ is the (common) cumulative distribution function (c.d.f.) of $\eta_i - \eta_j$.
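To make the formulas concrete, the following minimal sketch computes the probabilities (3)-(5), assuming the logistic c.d.f. for $F$; all names are illustrative.

```python
import math

# Logistic c.d.f., one common choice for F in this model.
def logistic_cdf(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))

def judgment_probs(m_i: float, m_j: float, d: float, F=logistic_cdf):
    """Return (P("worse"), P("equal"), P("better")) for comparing i to j,
    following formulas (3)-(5)."""
    diff = m_i - m_j
    p_worse = F(-d - diff)                  # xi_i - xi_j < -d
    p_equal = F(d - diff) - F(-d - diff)    # -d <= xi_i - xi_j <= d
    p_better = 1.0 - F(d - diff)            # d < xi_i - xi_j
    return p_worse, p_equal, p_better

# Objects of equal strength: "worse" and "better" are equally likely.
print(judgment_probs(0.0, 0.0, d=0.5))
```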
Let us denote the number of observers by $r$. The judgment produced by the $u$th observer ($u = 1, \ldots, r$) concerning the comparison of $i$ and $j$ is encoded in the elements of a four-dimensional matrix. The third index corresponds to the options in choices, and $k = 1, 2, 3$ stand for the judgments "worse", "equal", and "better", respectively. The matrix $X$ has four dimensions, and its elements are
$$X_{i,j,k,u} = \begin{cases} 1, & \text{if the } u\text{th observer's judgment comparing } i \text{ and } j \text{ is } C_k, \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$
Of course, due to the symmetry, $X_{i,j,k,u} = X_{j,i,4-k,u}$. It expresses that if the $i$th object is "better" than the $j$th object, then the $j$th object is "worse" than the $i$th object, according to the judgment of the $u$th respondent.
Let $A_{i,j,k} = \sum_{u=1}^{r} X_{i,j,k,u}$ be the number of observations when objects $i$ and $j$ are compared and the judgment is $C_k$, and let $A$ denote the three-dimensional matrix containing the elements $A_{i,j,k}$. Of course, $A_{i,j,k} = A_{j,i,4-k}$.
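As an illustration of this data structure, the sketch below aggregates a list of individual judgments into the matrix $A$, using 0-based indices ($k = 0, 1, 2$ for "worse", "equal", "better", so the symmetry $A_{i,j,k} = A_{j,i,4-k}$ becomes `A[i,j,k] = A[j,i,2-k]`); the input format is an assumption for illustration only.

```python
import numpy as np

def build_A(judgments, n):
    """judgments: iterable of (i, j, k) meaning the comparison of i and j
    resulted in option k (0 = "worse", 1 = "equal", 2 = "better")."""
    A = np.zeros((n, n, 3), dtype=int)
    for i, j, k in judgments:
        A[i, j, k] += 1
        A[j, i, 2 - k] += 1   # 0-based form of the symmetry A[i,j,k] = A[j,i,4-k]
    return A

# One "equal" and one "better" judgment between objects 0 and 1:
A = build_A([(0, 1, 1), (0, 1, 2)], n=3)
print(A[0, 1], A[1, 0])   # [0 1 1] [1 1 0]
```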
The likelihood function is the probability of the sample as a function of the parameters. If the judgments are independent, the likelihood function is expressed as follows:
$$L(m_1, \ldots, m_n, d \mid A) = \prod_{i<j} \big(F(-d - (m_i - m_j))\big)^{A_{i,j,1}} \big(F(d - (m_i - m_j)) - F(-d - (m_i - m_j))\big)^{A_{i,j,2}} \big(1 - F(d - (m_i - m_j))\big)^{A_{i,j,3}}, \qquad (6)$$
which has to be maximized in $m_1, \ldots, m_n$ and $d$. We can notice that the function (6) depends on the differences of the expectations' coordinates; therefore, one of the coordinates, for example $m_1$, can be fixed.
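A sketch of the logarithm of the likelihood function (6), assuming the logistic c.d.f. and the count matrix `A` built as above; $m_1$ is fixed to 0, so the free parameters are $(m_2, \ldots, m_n, d)$.

```python
import numpy as np

def log_likelihood(params, A):
    """params = (m_2, ..., m_n, d) with m_1 fixed to 0; A is n x n x 3."""
    n = A.shape[0]
    m = np.concatenate(([0.0], params[:n - 1]))
    d = params[-1]
    F = lambda t: 1.0 / (1.0 + np.exp(-t))
    ll = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            diff = m[i] - m[j]
            p = (F(-d - diff),                 # "worse", formula (3)
                 F(d - diff) - F(-d - diff),   # "equal", formula (4)
                 1.0 - F(d - diff))            # "better", formula (5)
            for k in range(3):
                if A[i, j, k] > 0:
                    ll += A[i, j, k] * np.log(p[k])
    return ll
```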
3. Conditions for the Existence and Uniqueness
In [21], the author presents a necessary and sufficient condition for the existence and uniqueness of MLE if there are only two options for choice and $F$, the c.d.f. of $\eta_i - \eta_j$, is the logistic c.d.f. The condition is the following: for an arbitrary non-empty partition of the objects into $S$ and $\overline{S}$, there exists at least one element of $S$ that is "better" than an element of $\overline{S}$, and vice versa. In [26], the author states that this condition, supplemented with the condition "there is at least one tie ("equal")", is enough for having a unique maximizer in a modified Bradley–Terry model. The theorem assumes logistic distribution, and its proof uses this special form; therefore, the proof is valid only for the investigated special model. Now, we prove it for a broad set of cumulative distribution functions. We require the following properties:
$F$ is a c.d.f.; $F$ is three times continuously differentiable; its probability density function $f$ is symmetric; and the logarithm of $f$ is a strictly concave function on $\mathbb{R}$. Gauss and logistic distributions belong to this set, together with many others. Let us denote the set of these c.d.f.'s by $\mathcal{F}$.
First, we state the following generalization of Ford’s theorem:
Theorem 1. Let $F \in \mathcal{F}$, and suppose that there are only two options in the choice. Fix the value of the parameter $m_1 = 0$. The necessary and sufficient condition for the existence and uniqueness of MLE is the following: for an arbitrary non-empty partition of the objects into $S$ and $\overline{S}$, there exists at least one element of $S$ that is "better" than an element of $\overline{S}$, and vice versa.
The proof of sufficiency relies on the argumentation of Theorem 4, omitting the variable $d$. The steps used are (ST3), (ST5), and (ST6) in Appendix A. In the last step, the strictly concave property of the logarithm of the likelihood function can be concluded from the theory of logarithmically concave measures [36]. The necessity is obvious: if there were a partition without a "better" judgment from one subset to the other, then each element of this subset would be "worse" than the elements of the complement, and the extent of "worse" could not be estimated. The likelihood function would be monotone increasing, and consequently, the maximum would not be reached.
Returning to the case of three options, we formulate the conditions of Davidson as follows:
DC 1. There exists an index pair $(i, j)$ for which $A_{i,j,2} > 0$.
DC 2. For any non-empty partition of the objects into $S$ and $\overline{S}$, there exist at least two index pairs $(i_1, j_1)$ and $(i_2, j_2)$, $i_1, i_2 \in S$, $j_1, j_2 \in \overline{S}$, for which $A_{i_1,j_1,3} > 0$ and $A_{j_2,i_2,3} > 0$.
We shall refer to them as the set of conditions DC. Condition DC 1 expresses that there is a judgment "equal". Condition DC 2 coincides with the condition of Ford in [21] in the case of two options. It expresses that each of the two subsets contains at least one object that is "better" than an object in the complement.
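In practice, neither Ford's condition nor DC 2 has to be checked over all $2^n$ partitions: by a standard graph-theoretic reformulation, each is equivalent to the strong connectivity of the directed graph that has an edge from $i$ to $j$ whenever some observer judged $i$ "better" than $j$. A sketch of this check (illustrative code, with the 0-based option index used above):

```python
import numpy as np

def strongly_connected(n, arcs):
    """True iff the digraph on nodes 0..n-1 with the given arcs is strongly
    connected (node 0 reaches everything forwards and backwards)."""
    def reaches_all(adj):
        seen, stack = {0}, [0]
        while stack:
            for w in adj.get(stack.pop(), ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n
    fwd, rev = {}, {}
    for a, b in arcs:
        fwd.setdefault(a, []).append(b)
        rev.setdefault(b, []).append(a)
    return reaches_all(fwd) and reaches_all(rev)

def dc_holds(A):
    n = A.shape[0]
    dc1 = bool((A[:, :, 1] > 0).any())   # DC 1: at least one "equal" judgment
    better = {(i, j) for i in range(n) for j in range(n) if A[i, j, 2] > 0}
    return dc1 and strongly_connected(n, better)   # DC 2 via strong connectivity
```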
Theorem 2. Let $F \in \mathcal{F}$. If conditions DC 1 and DC 2 hold, then, fixing $m_1 = 0$, the likelihood function (6) reaches its maximal value and its argument is unique.
Theorem 2 follows from a more general statement, Theorem 4, which will be proven in Appendix A.
Now, we turn to another set of conditions that guarantees the existence and uniqueness of MLE. These conditions will be abbreviated by the initial letters MC.
MC 1. There is at least one index pair $(i, j)$ for which $A_{i,j,2} > 0$ holds.
MC 2. There is at least one index pair $(i, j)$ for which $A_{i,j,1} > 0$ and $A_{i,j,3} > 0$.
Let us define a graph as follows: the nodes are the objects to be compared, and there is an edge between two nodes $i$ and $j$ if $A_{i,j,2} > 0$ or ($A_{i,j,1} > 0$ and $A_{i,j,3} > 0$) holds.
MC 3. This graph is connected.
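Conditions MC can be checked directly; a sketch under the same 0-based conventions as before:

```python
import numpy as np

def mc_holds(A):
    n = A.shape[0]
    mc1 = bool((A[:, :, 1] > 0).any())                       # MC 1: an "equal" exists
    mc2 = bool(((A[:, :, 0] > 0) & (A[:, :, 2] > 0)).any())  # MC 2: a "better-worse" pair
    # MC 3: connectivity of the undirected graph defined above.
    edge = (A[:, :, 1] > 0) | ((A[:, :, 0] > 0) & (A[:, :, 2] > 0))
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in range(n):
            if edge[v, w] and w not in seen:
                seen.add(w)
                stack.append(w)
    return mc1 and mc2 and len(seen) == n
```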
Theorem 3 (Ref. [29]). Let $F \in \mathcal{F}$. If conditions MC 1, MC 2, and MC 3 hold, then, after fixing $m_1 = 0$, the likelihood function (6) attains its maximal value and the argument of the maximum is unique.
To clarify the relationship between conditions DC 1 and DC 2 on the one hand and MC 1, MC 2, and MC 3 on the other, we present two examples. In Example 1, DC 1 and DC 2 are satisfied but MC 2 and MC 3 are not. In Example 2, DC 2 is not satisfied but MC 1, MC 2, and MC 3 are. These examples expose that the sets of conditions DC and MC do not cover each other. Moreover, they show that MLE may exist uniquely even if DC 1 and DC 2, or MC 1, MC 2, and MC 3, do not hold. Therefore, we can see that neither the conditions DC nor the conditions MC are necessary conditions.
Example 1. Let $n = 3$ and $A_{1,2,2} = 1$, $A_{1,2,3} = 1$, $A_{2,3,3} = 1$, and $A_{3,1,3} = 1$ (see Figure 2). Now, both DC 1 and DC 2 hold, but MC 3 does not.
Example 2. Let $n = 3$ and $A_{2,3,2} = 1$, $A_{1,2,3} = 1$, and $A_{2,1,3} = 1$ (see Figure 3). Now, one can easily check that MC 1, MC 2, and MC 3 hold but DC 2 does not.
As a short explanation, the graph in Figure 2 represents the following comparison results. There are 4 comparisons among the objects "1", "2", and "3". There is an opinion according to which "1" is "equal" to "2" (denoted by 1- - -2; $A_{1,2,2} = 1$); moreover, there is an opinion according to which "1" is better than "2" (denoted by 1->2; $A_{1,2,3} = 1$). Furthermore, there is an opinion according to which "2" is better than "3" (denoted by 2->3; $A_{2,3,3} = 1$); finally, there is an opinion according to which "3" is better than "1" (denoted by 3->1; $A_{3,1,3} = 1$). Similarly, the graph of Figure 3 visualizes the following comparison results among the objects "1", "2", and "3": there is an opinion according to which "2" is "equal" to "3" (2- -3; $A_{2,3,2} = 1$), and there is an opinion according to which "1" is better than "2", and vice versa (1->2; 2->1; $A_{1,2,3} = A_{2,1,3} = 1$).
Theorems 2 and 3 can be generalized. Let us introduce the following set of conditions, denoted by SC:
SC 1. There is at least one index pair $(i, j)$ for which $A_{i,j,2} > 0$ holds.
Let us introduce a graph belonging to the results of the comparisons as follows: let it be a directed graph whose nodes are the objects, and let there be a directed edge from $i$ to $j$ if there is an opinion according to which $i$ is "better" than $j$, that is, $A_{i,j,3} > 0$. Now, we can formulate the following conditions:
SC 2. There is a cycle in this directed graph.
SC 3. For any non-empty partition of the objects into $S$ and $\overline{S}$, there exist at least two (not necessarily different) index pairs $(i_1, j_1)$ and $(i_2, j_2)$, $i_1, i_2 \in S$, $j_1, j_2 \in \overline{S}$, for which $A_{i_1,j_1,3} > 0$ and $A_{j_2,i_2,3} > 0$, or there exists an index pair $(i_3, j_3)$, $i_3 \in S$, $j_3 \in \overline{S}$, for which $A_{i_3,j_3,2} > 0$.
It is easy to see that condition SC 2 is more general than condition MC 2 and condition SC 3 is more general than condition DC 2. Condition SC 3 expresses that any subset and its complement are interconnected either by opinions "better" from one to the other and vice versa or by an opinion "equal". Here, condition DC 2 is replaced by a more general condition: besides the "better" opinions, the opinion "equal" is also an appropriate judgment for connection.
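Conditions SC can also be verified without enumerating partitions. SC 2 is a cycle test on the directed "better" graph. For SC 3, observe (our reformulation, not a statement from the text) that if every "equal" pair contributes arcs in both directions, then SC 3 holds exactly when the augmented digraph is strongly connected. A self-contained sketch:

```python
import numpy as np

def sc_holds(A):
    n = A.shape[0]
    sc1 = bool((A[:, :, 1] > 0).any())   # SC 1: at least one "equal" judgment
    better = {(i, j) for i in range(n) for j in range(n) if A[i, j, 2] > 0}
    equal = {(i, j) for i in range(n) for j in range(n) if A[i, j, 1] > 0}

    def adjacency(arcs):
        adj = {}
        for a, b in arcs:
            adj.setdefault(a, []).append(b)
        return adj

    def reaches_all(adj):
        seen, stack = {0}, [0]
        while stack:
            for w in adj.get(stack.pop(), ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n

    # SC 2: depth-first search for a directed cycle among the "better" arcs.
    adj, color = adjacency(better), {}
    def has_cycle(v):
        color[v] = 1                      # on the current search path
        for w in adj.get(v, ()):
            if color.get(w, 0) == 1 or (color.get(w, 0) == 0 and has_cycle(w)):
                return True
        color[v] = 2                      # fully explored
        return False
    sc2 = any(color.get(v, 0) == 0 and has_cycle(v) for v in range(n))

    # SC 3: strong connectivity of "better" arcs plus bidirected "equal" arcs
    # (equal is stored symmetrically, so both directions are already present).
    arcs = better | equal
    sc3 = reaches_all(adjacency(arcs)) and reaches_all(adjacency({(b, a) for a, b in arcs}))
    return sc1 and sc2 and sc3
```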
To analyze the relationships between the sets of conditions DC, MC, and SC, we can recognize that:
(A) DC 1, MC 1, and SC 1 coincide.
(B) If DC 2 holds, then so do SC 2 and SC 3.
(C) If MC 2 holds, so does SC 2.
(D) If MC 3 holds, so does SC 3.
Together, these show that conditions SC 1, SC 2, and SC 3 are a generalization of the conditions DC and MC. To show that SC is a strictly more general set of conditions, we present Example 3.
Example 3. Let $n = 4$, with the comparison results shown in Figure 4. In this case, neither condition DC 2 nor MC 2 holds, but SC 1, SC 2, and SC 3 do. Now, we state the following theorem.
Theorem 4. Let $F \in \mathcal{F}$. If conditions SC 1, SC 2, and SC 3 hold, then, after fixing $m_1 = 0$, the likelihood function (6) attains its maximum value and its argument is unique.
The proof of Theorem 4 can be found in Appendix A.
We note that Theorem 2 is a straightforward consequence of Theorem 4.
Unfortunately, conditions SC 1, SC 2, and SC 3 are not necessary conditions. One can prove that in the case of Example 4 there exists a unique maximizer of function (6), but SC 2 does not hold.
Example 4. Let $n = 3$, with the comparison results shown in Figure 5.
5. Comparisons of the Efficiency of the Conditions
In this section, we investigate, in some special situations, which sets of conditions (DC 1 and DC 2; MC 1, MC 2, and MC 3; SC 1, SC 2, and SC 3) are fulfilled, i.e., are able to detect the existence and the uniqueness of the maximizer.
From the applications' perspective, there are cases when the strengths of the objects to be ranked are close to each other and cases when they differ greatly. On the other hand, there are cases when the judgment "equal" is frequent and cases when it is rare. Referring to sports: in football and in chess a draw comes up often, but in handball only rarely.
The most general set of conditions is the set of conditions SC. These conditions are fulfilled most frequently among the three sets of conditions. Nevertheless, it is interesting to what extent it is more applicable than the other two sets. For that, we performed a large number of computer simulation repetitions with different parameter settings, and we investigated how frequently the conditions are satisfied and how frequently we experience that the maximum exists.
Monte-Carlo simulation is a widespread tool for demonstrating the efficiency of a method or finding optimal solutions [39,40,41]. Due to the wide variety of possible outcomes, we applied Monte-Carlo simulation for the investigations. The steps of the simulations are detailed as follows (a condensed sketch in code is given after the list):
1. Fix the expectations ($m_1, \ldots, m_n$) and the value of the parameter $d$. We used arithmetic sequences, i.e., $m_i = (i - 1) \cdot h$.
2. Fix the number of comparisons.
3. Generate randomly the pairs between which the comparisons exist.
4. Generate randomly the result of each comparison, according to the probabilities (3), (4), and (5).
5. Check whether the set of conditions DC, the set of conditions MC, and the set of conditions SC are satisfied in the case of the above generated random graph.
6. Optimize numerically the likelihood function (6) applying the data corresponding to the generated graph. Decide whether the optimization process is convergent or not.
7. Repeat steps 3–6 $N$ times.
8. Compute the relative frequencies of fulfillment of the sets of conditions DC, MC, and SC; moreover, compute the relative frequency of the iterations in which the numerical optimization is convergent.
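The following condensed sketch puts steps 1–8 together; it assumes the helper functions from the earlier sketches (`judgment_probs`, `dc_holds`, `mc_holds`, `sc_holds`) and a convergence test `mle_converges(A)`, sketched after the next paragraph; all default parameter values are illustrative.

```python
import random
import numpy as np

def simulate(n=8, h=0.5, d=0.5, n_comparisons=32, N=1000, seed=0):
    rng = random.Random(seed)
    m = [i * h for i in range(n)]             # step 1: arithmetic strengths, m_1 = 0
    counts = {"DC": 0, "MC": 0, "SC": 0, "MAX": 0}
    for _ in range(N):                        # step 7: repeat steps 3-6 N times
        A = np.zeros((n, n, 3), dtype=int)
        for _ in range(n_comparisons):        # steps 2-3: a fixed number of random pairs
            i, j = rng.sample(range(n), 2)
            p = judgment_probs(m[i], m[j], d) # step 4: outcome drawn from (3)-(5)
            k = rng.choices([0, 1, 2], weights=p)[0]
            A[i, j, k] += 1
            A[j, i, 2 - k] += 1
        counts["DC"] += dc_holds(A)           # step 5: check the three condition sets
        counts["MC"] += mc_holds(A)
        counts["SC"] += sc_holds(A)
        counts["MAX"] += mle_converges(A)     # step 6: does the optimization converge?
    return {key: c / N for key, c in counts.items()}   # step 8: relative frequencies
```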
During the simulations, logistic distribution was used in the likelihood function. The numerical optimization could be performed by statistical program packages (for example, MATLAB [42] or R), but for the sake of very quick optimizations we developed a program in C#, applying a modified fixed-point iteration to the partial derivatives of the logarithm of the likelihood function (6). After every 25 iterations, we checked the change in the variables rounded to five decimal places. If it was zero, we stopped the iteration. If the iteration process had not stopped after 1000 iterations, we took this as indicating the absence of a clear maximum position. Using this method, we calculated the objects' strengths and decided about the convergence of the iteration.
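The authors' fixed-point scheme is implemented in C# and is not reproduced here; as a generic stand-in, the sketch below maximizes the log-likelihood from the earlier sketch by simple gradient ascent with a numerical gradient, applying the stopping rule described above (variables rounded to five decimal places, checked every 25 iterations, at most 1000 iterations).

```python
import numpy as np

def mle_converges(A, step=0.05, eps=1e-6):
    """True if the iteration stabilizes, False if no clear maximizer is found."""
    n = A.shape[0]
    x = np.concatenate((np.zeros(n - 1), [1.0]))   # (m_2, ..., m_n, d); m_1 = 0
    def grad(x):
        g = np.zeros_like(x)
        for t in range(len(x)):
            e = np.zeros_like(x)
            e[t] = eps
            g[t] = (log_likelihood(x + e, A) - log_likelihood(x - e, A)) / (2 * eps)
        return g
    snapshot = np.round(x, 5)
    for it in range(1, 1001):
        x = x + step * grad(x)
        x[-1] = max(x[-1], 1e-6)   # keep d positive
        if it % 25 == 0:
            current = np.round(x, 5)
            if np.array_equal(current, snapshot):
                return True        # variables unchanged to five decimals: converged
            snapshot = current
    return False                   # 1000 iterations without stabilization
```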
We present the case of eight objects ($n = 8$). The number of comparisons could be any integer from 8 to 64; in the presented cases, it is 8, 16, 32, or 64. We present four parameter ensembles, called situations, which are shown in Table 3. The number of repetitions, $N$, was high.
In the presented situations, if the value of $h$ is small, then the strengths of the objects are close to each other. This implies that many "better–worse" pairs can be formed during the simulations. On the other hand, if the value of $h$ is large, the strengths of the objects are far from each other; then we can expect only a few "better–worse" pairs but a great number of "better" judgments. In terms of the number of "equal" judgments, if $d$ is large, then many "equal" judgments can be formed during the simulations, while only a few when $d$ is small. The set of conditions DC can make good use of the judgments "better", and it requires only a single "equal" judgment. The set of conditions MC, in contrast, can use the judgments "equal" for connections, together with the "better–worse" pairs. The conditions SC do not require "better–worse" pairs, only "better" judgments forming one cycle. We recall that a single "better–worse" pair is itself a cycle (of length two). The judgments "equal" are well-applicable for this set of conditions, too.
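For instance, using the `judgment_probs` sketch from Section 2 with the logistic c.d.f., these tendencies can be seen directly (the parameter values are illustrative):

```python
# Small h with large d makes "equal" dominant; large h with small d makes
# "better" dominant (outputs are (P(worse), P(equal), P(better))).
print(judgment_probs(0.1, 0.0, d=2.0))   # approx (0.11, 0.76, 0.13)
print(judgment_probs(2.0, 0.0, d=0.2))   # approx (0.10, 0.04, 0.86)
```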
Table 3 summarizes the situations with the presumable ratios of the "equal" judgments and the "better–worse" pairs. In addition, Table 4, Table 5, Table 6 and Table 7 contain the numerical results of the simulations. The situations are ordered by decreasing numbers of cases in which the maximal value exists. Column MAX contains the number of cases when the maximum exists. Columns DC/MAX, MC/MAX, and SC/MAX present the ratios of the cases when the sets of conditions DC, MC, and SC hold, respectively. We can see that, by increasing the number of comparisons, the number of cases when the maximal value exists and the ratios both increase. We can also see that the values in column SC/MAX are less than 1 on several occasions. This shows again that SC is not a necessary condition.
We performed $N$ simulations per situation. Table 4 presents the results in Situation I. In this case, we can see that the DC/MAX rate is lower than the MC/MAX rate. We could predict this because there are many "equal" judgments. The SC/MAX rate is high even for 16 comparisons; in the case of 16 comparisons, SC detects the existence of the maximum 3.5 times more often than MC and over 100 times more often than DC.
Table 5 presents the results of Situation II. In this case, the rate of "equal" judgments is low, which does not favor the set of conditions MC. This is also reflected in the ratio MC/MAX, which is much worse than the ratio DC/MAX. The set of conditions SC still stands out among the conditions.
Table 6 shows the results of Situation III. Here, the maximum values exist more rarely than in the previous two cases. In this case, the number of "equal" decisions is high, while the rate of the "better–worse" pairs is low, which is more favorable for MC than for DC, as we can see in Table 6. It can also be seen that the set of conditions DC is not as good at detecting the existence of the maximum as in the previous tables. SC stands out again from the other two sets of conditions. Nevertheless, SC is able to indicate the existence of the maximum only in 73% of the cases for 32 comparisons, compared to 99% in the previous situations. The set of conditions DC is almost useless: it succeeds only in 3.3% of the cases even when the number of comparisons equals 64. The set of conditions MC slowly catches up and improves, but for small numbers of comparisons (8, 16, 32) it is far behind the much better SC criteria.
Table 7 presents the results in Situation IV. In this case, the numbers of "equal" choices and "better–worse" pairs are small, which is unfavorable principally to MC. In this situation, SC detects the existence of the maximal value exceptionally well. DC detects it less well, but it still works better than MC.
In all situations, we found that, when few comparisons are made, SC is superior to the other conditions. As more and more comparisons were made, both other sets of conditions improved, but they always remained worse than SC. The clear conclusion from the four tables is that the set of conditions SC is much more effective than the others, especially for small numbers of comparisons.