
Median Statistics Estimate of the Distance to M87


Published 2024 March 5 © 2024. The Author(s). Published by IOP Publishing Ltd on behalf of the Astronomical Society of the Pacific (ASP). All rights reserved
Citation: Nicholas Rackers et al 2024 PASP 136 024101. DOI: 10.1088/1538-3873/ad220e


Abstract

de Grijs & Bono compiled 211 independent measurements of the distance to the galaxy M87 in the Virgo cluster from 15 different tracers and reported 31.03 ± 0.14 mag, the arithmetic mean of a subset of this compilation, as the best estimate of the distance. We compute three different central estimates, the arithmetic mean, the weighted mean, and the median, with the corresponding statistical uncertainties, for the full data set as well as for three sub-compilations. For all three central estimates the error distributions show that the data sets are significantly non-Gaussian. As a result, we conclude that the median is the most reliable of the three central estimates, since median statistics does not assume Gaussianity. We use median statistics to determine the systematic error on the distance by analyzing the scatter among the 15 tracer subgroup distances. From the 211 distance measurements, we recommend a summary M87 distance modulus of ${31.08}_{-0.04}^{+0.05}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag, or, combining the two errors in quadrature, ${31.08}_{-0.07}^{+0.06}$ mag, equivalent to 16.4 ± 0.5 Mpc, all at 68.27% significance.


Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The extragalactic distance ladder is essential to astrophysics and cosmology and must constantly be refined. As the galaxy M87 lies near the center of the closest galaxy cluster to us, Virgo, it is an important rung on the distance ladder and allows us to extend the ladder to more distant clusters such as Coma and Fornax. de Grijs & Bono (2019), hereafter deGB, compiled a list of 211 distance measurements to M87 obtained by 15 different distance tracers. They report a mean value of 31.03 ± 0.14 mag after reducing their data set to 24 measurements from 3 tracers which they believe to be well-calibrated and independent.

Using the mean as a central estimate, as deGB did, implicitly assumes Gaussianly distributed data, or at least a distribution whose deviations from an underlying symmetry, and whose outliers, are only at the noise level. However, we determine that the full compilation, as well as the two sub-compilations studied by deGB, are significantly non-Gaussian. 5 Non-Gaussian data compilations are not uncommon in astronomy: well-known examples include Hubble constant measurements (Gott et al. 2001; Chen et al. 2003; Chen & Ratra 2011; Calabrese et al. 2012; Bethapudi & Desai 2017; Zhang 2018), cosmological mass density estimates (Chen & Ratra 2003), distances to the SMC and LMC (de Grijs et al. 2014; Crandall & Ratra 2015), and measurements of the Solar radius and Galactic rotational velocity (de Grijs & Bono 2016, 2017; Camarillo et al. 2018a, 2018b; Rajan & Desai 2018; Bobylev & Bajkova 2021). For further examples see Crandall et al. (2015), Bailey (2017), Zhang (2017), Rajan & Desai (2020), and Zhang et al. (2022).

Following Crandall & Ratra (2014), Penton et al. (2018), and Yu et al. (2020), we analyze the deGB data sets using median statistics (Gott et al. 2001), which is free from assumptions about the distribution underlying the data set and its errors. Because median statistics does not take into account error bars on individual measurements, it is generally less constraining. Nevertheless, we believe the median to provide a more accurate estimate of the distance to M87 than methods which rely on the unsatisfied assumption of Gaussianity. In addition to using median statistics to estimate a more reliable statistical uncertainty in the M87 distance measurement, we also utilize it to estimate the systematic uncertainty in this measurement, based on the scatter in the M87 distance estimated using each of the 15 different tracer sub-groups in the deGB compilation. 6

The results of this median statistics analysis of the entire data set of 211 distance measurements provided by deGB yield an M87 distance modulus of ${31.08}_{-0.04}^{+0.04}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag at 68.27% significance. Combining the two errors in quadrature we get ${31.08}_{-0.07}^{+0.06}$ mag, or 16.4 ± 0.5 Mpc.
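For reference, a distance modulus μ converts to a physical distance via d = 10^{(μ−25)/5} Mpc. A one-line check of the quoted numbers (our own illustration, not code from the paper):

```python
def modulus_to_mpc(mu):
    """Convert a distance modulus (mag) to a distance in Mpc,
    using mu = 5 log10(d / 1 Mpc) + 25."""
    return 10.0 ** ((mu - 25.0) / 5.0)
```

For example, `modulus_to_mpc(31.08)` gives about 16.4 Mpc, matching the rounded summary value.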

In Section 2 we introduce the different data compilations studied. In Section 3 we summarize median statistics and outline the Gaussianity test we use. In Section 4 we study the Gaussianity of the deGB compilations and argue that our median statistics result is a better representation of the true distance to M87 than a more conventional mean analysis. We conclude in Section 5.

2. Data

deGB compiled 211 7  M87 distance measurements from the NASA/Astrophysics Data System that they found to be statistically independent. They included measurements both to M87 and to the geometric center of the Virgo cluster, since these values were found to be statistically indistinguishable. These were grouped into 15 tracers. 8 deGB selected 5 of these tracers, Cepheids, the planetary nebulae luminosity function (PNLF), surface brightness fluctuations (SBF), the tip of the red giant branch (TRGB) magnitude, and novae, which, unlike the other 10 tracers, they found to be internally consistent and to provide tight averages. These 5 tracers correspond to a total of 44 measurements. They adjusted the measurements from these 5 tracers to agree with their best estimate of the distance modulus to the LMC, 18.49 ± 0.09 mag (de Grijs et al. 2014). 9  This value is in good agreement with the median statistics estimate of 18.49 ± 0.13 mag found by Crandall & Ratra (2015). deGB's final recommended value is the arithmetic mean of a set of 24 10  points from three tracers: Cepheids, SBF, and TRGB. The PNLF and novae tracers were removed due to foreground- and background-biased outliers, respectively. The result of deGB's mean analysis of the 24 points from these 3 tracers is a distance modulus of 31.03 ± 0.14 mag, where the uncertainty is the 1σ sample standard deviation; we find an identical result.

In this paper we analyze the following four subsets of the deGB compilation and provide the best estimate for each subset with 1σ uncertainty, as explained in Section 3.1. All 15 refers to deGB's full list of 211 unadjusted data points from 15 tracers. All 15 without averages is the same but excludes the averages tracer. 11 Best 5 refers to the adjusted 44 measurements from 5 tracers from Table 1 of deGB that they determined to be internally consistent. Best 3 refers to the adjusted 24 data points from 3 tracers (excluding the Tammann et al. 2000 point) from Table 1 of deGB that they used to compute their favored summary measurement value.

3. Analysis

Conventional methods such as mean and χ2 analyses assume (Gott et al. 2001):

  1. Individual data points are statistically independent.
  2. There are no systematic effects.
  3. The errors are Gaussianly distributed.
  4. One knows the standard deviation of the errors.

Median statistics was developed by Gott et al. (2001) as a powerful alternative to mean and χ2 analyses. The essential idea is that the true value of the quantity being measured is the median of a set of repeated error-affected measurements as the number of measurements tends toward infinity. This follows from the assumptions that the data set contains only independent measurements and that it does not have any overall systematic error. 12 The individual measurement errors have no effect on the computation of the median of a data set. Thus assumptions 3 and 4 are not necessary for the application of median statistics. This is advantageous for analyzing non-Gaussian data sets, as for such cases any mean analysis is suspect due to the failure to satisfy assumption 3. Since the individual measurement errors are not taken into account (assumption 4 is dropped), a median statistics analysis will generally provide a less constraining central estimate than mean statistics. We argue that despite this, median statistics provides the most reliable central estimate in the case of a non-Gaussian data compilation.

We study the Gaussianity of the compilations described in Section 2. We do this by creating error distributions of the data from various central estimates and comparing them to the Gaussian probability distribution. Based on the results of this analysis, we argue that the median is the most accurate and reliable central estimate of the deGB compilations.

3.1. Computing the Central Estimate

To study the Gaussianity of a data set, we construct an error distribution using a central estimate. We compute the median, weighted mean, and arithmetic mean central estimates and create an error distribution for each.

The true median of a data set is defined as the median as the number of measurements, N, goes to infinity. Gott et al. (2001) showed that the probability that the true median lies between any two adjacent measurements Mi and Mi+1 is given by the appropriately normalized binomial distribution:

Equation (1): $P_i = \frac{2^{-N}\,N!}{i!\,(N-i)!},$

where $M_0 \equiv -\infty$ and $M_{N+1} \equiv +\infty$. To compute the uncertainty in our estimate of the median: (i) for every index, we find the probability that the true median lies between it and the subsequent index; (ii) we split the data into two halves, above and below the median of this distribution; (iii) we compute the area of each half; (iv) we iterate outward from the median until 68.27% of the area of a half is exceeded; and (v) the datum at the index which yields an area closest to 68.27% is then recorded as $M_{i_{+}}$ or $M_{i_{-}}$.

These indices are then used in the ordered data to construct the uncertainty range

$\sigma_{\rm med}^{+} = M_{i_{+}} - M_{\rm med}, \qquad \sigma_{\rm med}^{-} = M_{\rm med} - M_{i_{-}},$

where $M_{\rm med}$ is the median.
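The binomial construction above can be sketched in code. A minimal Python illustration, assuming only NumPy; the central-interval search here is a simplified variant of steps (i)-(v), not the paper's exact implementation:

```python
import numpy as np
from math import comb

def median_stats(data, cl=0.6827):
    """Median central estimate with roughly 68.27% limits built from the
    binomial probabilities P_i = 2**-N * N! / (i! (N-i)!) of Eq. (1)."""
    m = np.sort(np.asarray(data, dtype=float))
    n = len(m)
    med = np.median(m)
    # P_i: probability that the true median lies between M_i and M_{i+1}
    p = np.array([comb(n, i) / 2.0**n for i in range(n + 1)])
    cdf = np.cumsum(p)  # cdf[i] = P(true median < m[i])
    # central interval enclosing a fraction cl of the probability
    lo = m[min(np.searchsorted(cdf, (1 - cl) / 2), n - 1)]
    hi = m[min(np.searchsorted(cdf, (1 + cl) / 2), n - 1)]
    return med, med - lo, hi - med
```

Note that the interval endpoints land on actual measurements, so the errors are generically asymmetric, as in the results quoted later.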

The weighted-mean central estimate, while it has the added assumption of Gaussianity, has the benefit of using the reported uncertainties and so is a more constraining estimate than the median. Given a data set Mi ± σi we compute the weighted-mean central estimate as (Podariu et al. 2001)

Equation (2): $M_{\rm wm} = \dfrac{\sum_{i=1}^{N} M_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2},$

with standard deviation

Equation (3): $\sigma_{\rm wm} = \left(\sum_{i=1}^{N} 1/\sigma_i^2\right)^{-1/2}.$

We also employ the arithmetic mean central estimate

Equation (4): $M_{\rm m} = \dfrac{1}{N}\sum_{i=1}^{N} M_i,$

with standard error of the mean

Equation (5): $\sigma_{\rm m} = \dfrac{1}{\sqrt{N(N-1)}}\left[\sum_{i=1}^{N}\left(M_i - M_{\rm m}\right)^2\right]^{1/2}.$

These central estimates are used to construct error distributions which we will compare with a Gaussian in order to evaluate the Gaussianity of the data set.
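Equations (2)-(5) are compact enough to state directly in code; a sketch using NumPy (function and variable names are ours, not from the paper):

```python
import numpy as np

def weighted_mean(m, sigma):
    """Eqs. (2)-(3): inverse-variance weighted mean and its standard deviation."""
    m = np.asarray(m, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # weights 1/sigma_i^2
    return np.sum(w * m) / np.sum(w), np.sum(w) ** -0.5

def arithmetic_mean(m):
    """Eqs. (4)-(5): arithmetic mean and the standard error of the mean."""
    m = np.asarray(m, dtype=float)
    n = len(m)
    mu = np.mean(m)
    sem = np.sqrt(np.sum((m - mu) ** 2) / (n * (n - 1)))
    return mu, sem
```

The weighted mean uses the reported uncertainties, so it is generally the most constraining of the three estimates, at the cost of the Gaussianity assumption.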

3.2. Error Distributions

We create error distributions of our data in order to study their Gaussianity. The error distribution ${N}_{{\sigma }_{i}}$ is a measure of how many standard deviations any individual measurement deviates from the central estimate.

Given measurements Mi ± σi and a corresponding central estimate MCE ± σCE, the error distribution is

Equation (6): $N_{\sigma_i} = \dfrac{M_i - M_{\rm CE}}{\sqrt{\sigma_i^2 + \sigma_{\rm CE}^2}}.$

This expression assumes that the central estimate is not correlated with the data set. This assumption is not satisfied here, as our central estimates are computed directly from the data compilations. Finding such a formula for an error distribution of a correlated median is beyond the scope of this work.

The weighted-mean case has been solved in the case of correlation between the weighted-mean and the data set. 13 The error distribution using a weighted-mean central estimate that is correlated with the data set is

Equation (7): $N_{\sigma_i} = \dfrac{M_i - M_{\rm wm}}{\sqrt{\sigma_i^2 - \sigma_{\rm wm}^2}}.$

We often have asymmetric error bars on the median, in which case we slightly alter Equation (6): we use the upper error of the median, $\sigma_{\rm med}^{+}$, when $M_i > M_{\rm med}$, and the lower error, $\sigma_{\rm med}^{-}$, when $M_i < M_{\rm med}$,

Equation (8): $N_{\sigma_i} = \dfrac{M_i - M_{\rm med}}{\sqrt{\sigma_i^2 + (\sigma_{\rm med}^{\pm})^2}}.$

After symmetrizing the error distribution about 0, 14 we use the Kolmogorov–Smirnov (KS) test to study the Gaussianity of the data compilations.
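The construction of Equation (8) and the symmetrization of footnote 14 might be sketched as follows (a simplified illustration; the correlated weighted-mean case of Equation (7) is omitted):

```python
import numpy as np

def error_distribution(m, sigma, med, sig_lo, sig_hi):
    """Eq. (8): N_sigma_i about the median, using the upper median error
    when M_i > M_med and the lower median error when M_i < M_med."""
    m = np.asarray(m, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    sig_med = np.where(m > med, sig_hi, sig_lo)
    return (m - med) / np.sqrt(sigma**2 + sig_med**2)

def symmetrize(n_sigma):
    """Mirror |N_sigma| about zero, e.g. [1, 2, 3] -> [-3, -2, -1, 1, 2, 3]."""
    a = np.abs(np.asarray(n_sigma, dtype=float))
    return np.sort(np.concatenate([-a, a]))
```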

3.3. The Kolmogorov–Smirnov Test and Testing Gaussianity

The KS test is a statistical test that compares empirical error distributions with continuous probability density functions (PDFs). 15 The first step is calculating the D-statistic, which is the largest difference between the empirical cumulative distribution function and that of the relevant PDF.

The D-statistic is then used to compute z as given in Press et al. (2007)

Equation (9): $z = \left(\sqrt{N} + 0.12 + \dfrac{0.11}{\sqrt{N}}\right)D,$

which is used to compute the p-value

Equation (10): $p = 2\sum_{j=1}^{\infty}(-1)^{j-1}e^{-2j^{2}z^{2}}.$

Explicitly, the p-value is the probability of obtaining a D-statistic at least as large as the one measured if the data were actually drawn from the PDF of interest, so a larger p indicates better consistency with that PDF. Conventionally, if p ≥ 0.95 we consider these data consistent with having been drawn from the relevant PDF; if p < 0.95, we conclude that these data are not consistent with having been drawn from a Gaussian distribution.
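A sketch of Equations (9)-(10), assuming the Press et al. (2007) asymptotic form and truncating the infinite sum; the paper does not give implementation details, so treat this as illustrative:

```python
import numpy as np

def ks_p_value(d_stat, n, terms=100):
    """Eq. (9): z from the KS D statistic and sample size n;
    Eq. (10): the asymptotic KS probability, truncated after `terms` terms."""
    z = (np.sqrt(n) + 0.12 + 0.11 / np.sqrt(n)) * d_stat
    j = np.arange(1, terms + 1)
    p = 2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * j**2 * z**2))
    # the truncated alternating series can stray slightly outside [0, 1]
    return z, float(min(max(p, 0.0), 1.0))
```

As expected, a larger D (worse fit) yields a larger z and a smaller p.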

We also introduce a scale factor S that rescales the measurement errors,

Equation (11): $\sigma_i \rightarrow \sigma_i/S.$

S > 1 corresponds to decreasing the errors, which widens the error distribution ${N}_{{\sigma }_{i}}$, and S < 1 corresponds to increasing the errors, which narrows it. We then run the KS test, varying S from 0 to 10 to find S*, the value of S which optimizes the p-value. This allows us to compare the error distribution to a Gaussian. If S* > 1, that is, if the errors must be decreased for the error distribution to best fit a Gaussian, the distribution is narrower than a Gaussian and we conclude that the errors may have been overestimated. Similarly, if S* < 1, the distribution is broader than a Gaussian and we conclude that the errors may have been underestimated.
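The search for S* might look like the following sketch, applying Equation (11) as σ_i → σ_i/S (equivalently N_σ → S·N_σ) and using SciPy's one-sample KS test against a unit normal; the grid resolution is our choice, not the paper's:

```python
import numpy as np
from scipy import stats

def optimal_scale_factor(n_sigma, s_grid=None):
    """Scan S in (0, 10]; Eq. (11), sigma_i -> sigma_i / S, rescales the
    error distribution as N_sigma -> S * N_sigma.  Return the S* that
    maximizes the KS p-value against a standard normal, and that p*."""
    if s_grid is None:
        s_grid = np.linspace(0.1, 10.0, 991)
    pvals = [stats.kstest(s * np.asarray(n_sigma, dtype=float), "norm").pvalue
             for s in s_grid]
    i = int(np.argmax(pvals))
    return float(s_grid[i]), float(pvals[i])
```

For an error distribution narrower than a unit Gaussian (for example, one drawn with standard deviation 0.5), this returns S* near 2, i.e., the quoted errors would need to be roughly halved.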

3.4. Estimating Systematic Error

All uncertainties computed thus far have corresponded to statistical errors. We perform an analysis of the systematic error present in the All 15 data set, and in the All 15 without Averages data subset, using the procedure outlined in Chen & Ratra (2011). Within each tracer subgroup there is statistical uncertainty resulting in a spread of measurements, and between the tracers there could be systematic error resulting from the different techniques and calibrations.

We construct a new data set consisting of the median of each tracer subgroup. We perform a median statistics analysis on this new data set to find the median of medians and its associated uncertainty. If we assume that these medians differ only systematically from each other, this uncertainty corresponds to the systematic uncertainty of the entire group of tracers.
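Under the stated assumption, the procedure reduces to running median statistics on the list of subgroup medians. A self-contained sketch (the tracer data below are toy values, and the central-interval search is a simplified variant of the Gott et al. 2001 construction):

```python
import numpy as np
from math import comb

def systematic_error(tracer_groups, cl=0.6827):
    """Median of the per-tracer medians, with a binomial-probability
    confidence interval read as the systematic uncertainty (Sec. 3.4)."""
    meds = np.sort([np.median(g) for g in tracer_groups])
    n = len(meds)
    med = np.median(meds)
    # P_i that the true median lies between adjacent subgroup medians
    p = np.array([comb(n, i) / 2.0**n for i in range(n + 1)])
    cdf = np.cumsum(p)
    lo = meds[min(np.searchsorted(cdf, (1 - cl) / 2), n - 1)]
    hi = meds[min(np.searchsorted(cdf, (1 + cl) / 2), n - 1)]
    return med, med - lo, hi - med
```

With only 15 subgroup medians the binomial interval is coarse, which is why the quoted systematic errors are asymmetric.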

4. Results

We perform a median statistics analysis on deGB's compilation of 211 distance measurements. We calculate three central estimates and study the Gaussianity of the All 15 data set (comprising 211 measurements from 15 tracers), the Best 5 data set (comprising 44 adjusted measurements from 5 tracers), and the Best 3 data set (comprising 24 adjusted measurements from 3 tracers). These data sets are outlined in Section 2. The central estimates and results of the KS test for these data are shown in Table 1.

Table 1. Central Estimates and KS Test Results for the Four Data Compilations

Data Set              Central Estimate     Value (mag)                     p a     S*    p* b
All 15                Median               ${31.08}_{-0.04}^{+0.04}$       ≪0.1    2.2   0.83
                      Weighted Mean c      31.07 ± 0.01                    ≪0.1    2.4   0.63
                      Arithmetic Mean      30.97 ± 0.07                    ≪0.1    2.1   0.51
All 15 w/o Averages   Median               ${31.08}_{-0.04}^{+0.05}$       ≪0.1    2.1   0.85
                      Weighted Mean        31.06 ± 0.01                    ≪0.1    2.1   0.68
                      Arithmetic Mean      30.95 ± 0.08                    ≪0.1    1.9   0.49
Best 5 d              Median               ${31.02}_{-0.05}^{+0.04}$       0.20    1.4   1.0
                      Weighted Mean        31.03 ± 0.01                    0.039   1.8   1.0
                      Arithmetic Mean      31.04 ± 0.03                    0.057   1.6   1.0
Best 3 e              Median               ${31.025}_{-0.005}^{+0.04}$     0.26    1.7   1.0
                      Weighted Mean        31.05 ± 0.01                    0.109   1.9   1.0
                      Arithmetic Mean      31.03 ± 0.03                    0.30    1.8   1.0

Notes.

a The p-values in this column are for the unscaled error distribution, i.e., S = 1.
b The 1.0 values in the p* column all lie within the range (0.995, 0.999).
c There were some data without errors. These were set to the mean of the uncertainties for that tracer in order to perform the weighted mean analysis.
d Including the Tammann et al. (2000) measurement.
e Excluding the Tammann et al. (2000) measurement.


We find all four data sets to be inconsistent with Gaussianity. Therefore, we argue that the median provides the best central estimate of each of these compilations. As the optimal scale factor S* > 1 for all of these data sets, some of the errors may have been overestimated.

We report a median of ${31.08}_{-0.04}^{+0.04}$ mag for the All 15 data set, ${31.08}_{-0.04}^{+0.05}$ for the All 15 without averages data set, ${31.02}_{-0.05}^{+0.04}$ mag for the Best 5 data set, and ${31.03}_{-0.01}^{+0.04}$ mag for the Best 3 data set, where all errors are at a 68.27% confidence level.

In the case of the All 15 and All 15 without averages data sets, there are enough tracers to estimate the systematic error in the median of the entire data set, as outlined in Section 3.4. The results of this analysis are shown in Table 2. The median of the All 15 data set is ${31.08}_{-0.04}^{+0.04}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag and the median of the All 15 without averages data set is ${31.08}_{-0.04}^{+0.05}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag, at 68.27% significance. The systematic error due to the tracer differences is comparable to the statistical uncertainty on the median. Combining the two errors in quadrature we get ${31.08}_{-0.07}^{+0.06}$ mag, or 16.4 ± 0.5 Mpc, for both the All 15 and the All 15 without averages data sets.

Table 2. All 15 and All 15 Without Averages Systematic Uncertainty Analyses a

Tracer Type        N     Median (Shift) b   w/o Ave (Shift)   1σ Error (Width) c     w/o Ave 1σ Error (Width)
All Data           211   31.08              31.08             −0.04, +0.04 (0.08)    −0.05, +0.05 (0.09)
TFR                36    31.24 (0.16)       (0.16)            −0.16, +0.16 (0.32)    −0.16, +0.16 (0.32)
GCLF               32    31.11 (0.03)       (0.04)            −0.14, +0.04 (0.18)    −0.14, +0.04 (0.18)
Averages           21    31.08 (0.00)       ...               −0.02, +0.15 (0.17)    ...
SBF                18    31.12 (0.04)       (0.05)            −0.09, +0.03 (0.12)    −0.09, +0.03 (0.12)
SNe                18    31.65 (0.57)       (0.57)            −0.05, +0.05 (0.10)    −0.05, +0.05 (0.10)
Other Methods      15    30.90 (−0.18)      (−0.18)           −0.10, +0.25 (0.35)    −0.10, +0.25 (0.35)
PNLF               12    30.87 (−0.22)      (−0.21)           −0.02, +0.03 (0.05)    −0.02, +0.03 (0.05)
Faber-Jackson      11    31.14 (0.06)       (0.07)            −0.01, +0.34 (0.35)    −0.01, +0.34 (0.35)
Color-magnitude    11    30.84 (−0.24)      (−0.24)           −0.08, +0.06 (0.14)    −0.08, +0.06 (0.14)
Novae              8     31.40 (0.32)       (0.33)            ...                    ...
Hubble law         8     27.30 (−3.78)      (−3.78)           ...                    ...
Cepheids           7     31.02 (−0.06)      (−0.06)           ...                    ...
HII                6     31.20 (0.12)       (0.13)            ...                    ...
Group Member       5     30.50 (−0.58)      (−0.58)           ...                    ...
TRGB               3     31.05 (−0.03)      (−0.03)           ...                    ...
Subgroup Medians   15    31.08              31.08             −0.06, +0.04 (0.1)     −0.06, +0.04 (0.1)

Notes.

a All Data are the 211 data points including the Tammann et al. (2000) measurement. The next 15 rows are the individual tracer types; only for tracer types with more than 10 measurements do we show uncertainties in the error columns. The Subgroup Medians row shows the results of a median statistics analysis of the preceding 15 medians; its uncertainty is the reported systematic uncertainty.
b The shift is the difference between the tracer median and the All Data median in the first row.
c Error on the median in the preceding column.


The All 15 data set has the most measurements, enough to allow an estimate of the systematic uncertainty, and enough to make this data set the most robust against the effects of small-number statistics. Therefore, we choose the results of the analysis of this data set as our final reported value.

5. Conclusion

After analyzing the data sets compiled by deGB, we recommend the All 15 without averages median statistics M87 distance modulus of ${31.08}_{-0.04}^{+0.05}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag at 68.27% significance. Combining the two errors in quadrature we have ${31.08}_{-0.07}^{+0.06}$ mag, or 16.4 ± 0.5 Mpc. This estimate is consistent with deGB's result of 31.03 ± 0.14 mag based on the Best 3 data set. We argue that our reported value is more reliable than deGB's mean statistics result, since these data are not Gaussianly distributed, and the larger data set allowed us to estimate the systematic uncertainty and include it in our reported value.

Acknowledgments

We acknowledge helpful discussions with Jacob Peyton, Aman Singal, Shantanu Desai, and Gunasekar Ramakrishnan. This project was supported by funding from Kansas State University's REU program funded by NSF grant No. 1757778.

Footnotes

  • 5  

    Ramakrishnan & Desai (2023) have more thoroughly examined the Gaussianity of the full deGB compilation, as well as that of many of the 15 different tracer compilations, and also find that the full deGB compilation is non-Gaussian.

  • 6  

    Ramakrishnan & Desai (2023) have also performed a median statistics analysis of the deGB compilation, not based on the Gott et al. (2001) technique but on one that assumes only mildly non-Gaussian data and so get a different statistical error on the median. They do not estimate the systematic error on the median.

  • 7  

deGB use 213 measurements, but their tracer-organized database lists only 211. The 2 missing points do not statistically affect the results.

  • 8  

    Refer to Table 2 for a full list.

  • 9  

    The unadjusted measurements database sorted by tracer type can be found at https://astro-expat.info/Data/m87distbytracer.html.

  • 10  

deGB say they use 28 data points, hence 27 after they discard the Tammann et al. (2000) point (as discussed next); however, they list only 25, including the Tammann et al. (2000) point, in their Table 1, possibly due to confusion between their unadjusted and adjusted data points, of which there are 3 fewer. deGB discard one particular outlier of the Cepheid tracer, from Tammann et al. (2000), who reported this result under the assumption of the long distance scale. Currently, the short distance scale is the accepted framework (Freedman et al. 2001).

  • 11  

    So that we may determine what effects removing this tracer has on the results.

  • 12  

    There can be systematic error in different subsets of the data set, so long as the same systematic error does not affect a significant portion of the full data set.

  • 13  

    A derivation is shown in Camarillo et al. (2018b).

  • 14  

    For example: [1, 2, 3] → [−3, −2, −1, 1, 2, 3].

  • 15  

We also used the Anderson–Darling (AD) test for this purpose. The AD test results agree with our KS test findings and so are not recorded here.
