
Median Statistics Estimate of the Distance to M87


Published 2024 March 5 © 2024. The Author(s). Published by IOP Publishing Ltd on behalf of the Astronomical Society of the Pacific (ASP). All rights reserved
Citation: Nicholas Rackers et al 2024 PASP 136 024101. DOI: 10.1088/1538-3873/ad220e


Abstract

de Grijs & Bono compiled 211 independent measurements of the distance to the galaxy M87 in the Virgo cluster from 15 different tracers and reported 31.03 ± 0.14 mag, the arithmetic mean of a subset of this compilation, as the best estimate of the distance. We compute three different central estimates, the arithmetic mean, the weighted mean, and the median, with the corresponding statistical uncertainties, for the full data set as well as for three sub-compilations. For all three central estimates the error distributions show that the data sets are significantly non-Gaussian. As a result, we conclude that the median is the most reliable of the three central estimates, since median statistics does not assume Gaussianity. We use median statistics to determine the systematic error on the distance by analyzing the scatter among the 15 tracer subgroup distances. From the 211 distance measurements, we recommend a summary M87 distance modulus of ${31.08}_{-0.04}^{+0.05}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag, or, combining the two errors in quadrature, ${31.08}_{-0.07}^{+0.06}$ mag, equivalent to 16.4 ± 0.5 Mpc, all at 68.27% significance.


Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The extragalactic distance ladder is essential to astrophysics and cosmology and must constantly be refined. As the galaxy M87 lies near the center of the closest galaxy cluster to us, Virgo, it is an important rung on the distance ladder and allows us to extend the ladder to more distant clusters such as Coma and Fornax. de Grijs & Bono (2019), hereafter deGB, compiled a list of 211 distance measurements to M87 obtained by 15 different distance tracers. They report a mean value of 31.03 ± 0.14 mag after reducing their data set to 24 measurements from 3 tracers which they believe to be well-calibrated and independent.

Using the mean as a central estimate, as deGB did, implicitly assumes Gaussianly distributed data, or at least a distribution whose deviations from an underlying symmetry, and whose outliers, are only at the noise level. However, we determine that the full compilation, as well as the two sub-compilations studied by deGB, are significantly non-Gaussian. 5 Non-Gaussian data compilations are not uncommon in astronomy: well-known examples include Hubble constant measurements (Gott et al. 2001; Chen et al. 2003; Chen & Ratra 2011; Calabrese et al. 2012; Bethapudi & Desai 2017; Zhang 2018), cosmological mass density estimates (Chen & Ratra 2003), distances to the SMC and LMC (de Grijs et al. 2014; Crandall & Ratra 2015), and measurements of the Solar radius and Galactic rotational velocity (de Grijs & Bono 2016, 2017; Camarillo et al. 2018a, 2018b; Rajan & Desai 2018; Bobylev & Bajkova 2021). For further examples see Crandall et al. (2015), Bailey (2017), Zhang (2017), Rajan & Desai (2020), and Zhang et al. (2022).

Following Crandall & Ratra (2014), Penton et al. (2018), and Yu et al. (2020), we analyze the deGB data sets using median statistics (Gott et al. 2001), which is free from assumptions about the distribution underlying the data set and its errors. Because median statistics does not take into account error bars on individual measurements, it is generally less constraining. Nevertheless, we believe the median to provide a more accurate estimate of the distance to M87 than methods which rely on the unsatisfied assumption of Gaussianity. In addition to using median statistics to estimate a more reliable statistical uncertainty in the M87 distance measurement, we also utilize it to estimate the systematic uncertainty in this measurement, based on the scatter in the M87 distance estimated using each of the 15 different tracer sub-groups in the deGB compilation. 6

The results of this median statistics analysis of the entire data set of 211 distance measurements provided by deGB yield an M87 distance modulus of ${31.08}_{-0.04}^{+0.04}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag at 68.27% significance. Combining the two errors in quadrature we get ${31.08}_{-0.07}^{+0.06}$ mag, or 16.4 ± 0.5 Mpc.
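For reference, a distance modulus μ converts to a physical distance via d = 10^{(μ−25)/5} Mpc. A one-line check of the quoted numbers (our own illustration, not code from the paper):

```python
def modulus_to_mpc(mu):
    """Convert a distance modulus (mag) to a distance in Mpc,
    using mu = 5 log10(d / 1 Mpc) + 25."""
    return 10.0 ** ((mu - 25.0) / 5.0)
```

For example, `modulus_to_mpc(31.08)` gives about 16.4 Mpc, matching the rounded summary value.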

In Section 2 we introduce the different data compilations studied. In Section 3 we summarize median statistics and outline the Gaussianity test we use. In Section 4 we study the Gaussianity of the deGB compilations and argue that our median statistics result is a better representation of the true distance to M87 than a more conventional mean analysis. We conclude in Section 5.

2. Data

deGB compiled 211 7  M87 distance measurements from the NASA/Astrophysics Data System that they found to be statistically independent. They included measurements both to M87 and to the geometric center of the Virgo cluster, since these values were found to be statistically indistinguishable. These were grouped into 15 tracers. 8 deGB selected 5 of these tracers, Cepheids, the planetary nebulae luminosity function (PNLF), surface brightness fluctuations (SBF), the tip of the red giant branch (TRGB) magnitude, and novae, which, unlike the other 10 tracers, they found to be internally consistent and to provide tight averages. These 5 tracers correspond to a total of 44 measurements. They adjusted the measurements from these 5 tracers to agree with their best estimate of the distance modulus to the LMC, 18.49 ± 0.09 mag (de Grijs et al. 2014). 9  This value is in good agreement with the median statistics estimate of 18.49 ± 0.13 mag found by Crandall & Ratra (2015). deGB's final recommended value is the arithmetic mean of a set of 24 10  points from three tracers: Cepheids, SBF, and TRGB. The PNLF and novae tracers were removed due to foreground- and background-biased outliers, respectively. The result of deGB's mean analysis of the 24 points from these 3 tracers is a distance modulus of 31.03 ± 0.14 mag, where the uncertainty is the 1σ sample standard deviation; we find an identical result.

In this paper we analyze the following four subsets of the deGB compilation and provide the best estimate for each subset with 1σ uncertainty, as explained in Section 3.1. All 15 refers to deGB's full list of 211 unadjusted data points from 15 tracers. All 15 without averages is the same but excludes the averages tracer. 11 Best 5 refers to the adjusted 44 measurements from 5 tracers from Table 1 of deGB that they determined to be internally consistent. Best 3 refers to the adjusted 24 data points from 3 tracers (excluding the Tammann et al. 2000 point) from Table 1 of deGB that they used to compute their favored summary measurement value.

3. Analysis

Conventional methods such as mean and χ2 analyses assume (Gott et al. 2001):

  1. Individual data points are statistically independent.
  2. There are no systematic effects.
  3. The errors are Gaussianly distributed.
  4. One knows the standard deviation of the errors.

Median statistics was developed by Gott et al. (2001) as a powerful alternative to mean and χ2 analyses. The essential idea is that the true value of the quantity being measured is the median of a set of repeated error-affected measurements as the number of measurements tends toward infinity. This follows from the assumptions that the data set contains only independent measurements and that it does not have any overall systematic error. 12 The individual measurement errors have no effect on the computation of the median of a data set. Thus assumptions 3 and 4 are not necessary for the application of median statistics. This is advantageous for analyzing non-Gaussian data sets, as for such cases any mean analysis is suspect due to the failure to satisfy assumption 3. Since the individual measurement errors are not taken into account (assumption 4 is dropped), a median statistics analysis will generally provide a less constraining central estimate than mean statistics. We argue that despite this, median statistics provides the most reliable central estimate in the case of a non-Gaussian data compilation.

We study the Gaussianity of the compilations described in Section 2. We do this by creating error distributions of the data from various central estimates and comparing them to the Gaussian probability distribution. Based on the results of this analysis, we argue that the median is the most accurate and reliable central estimate of the deGB compilations.

3.1. Computing the Central Estimate

To study the Gaussianity of a data set, we construct an error distribution using a central estimate. We compute the median, weighted mean, and arithmetic mean central estimates and create an error distribution for each.

The true median of a data set is defined as the median as the number of measurements, N, goes to infinity. Gott et al. (2001) showed that the probability that the true median lies between any two adjacent measurements Mi and Mi+1 is given by the appropriately normalized binomial distribution:

Equation (1): $P_i = \frac{2^{-N}\,N!}{i!\,(N-i)!},$

where $M_0 \equiv -\infty$ and $M_{N+1} \equiv +\infty$. To compute the uncertainty in our estimate of the median: (i) for every index, we find the probability that the true median lies between it and the subsequent index; (ii) we split the data into two halves, above and below the median of this distribution; (iii) we compute the area of each half; (iv) we iterate outward from the median until 68.27% of the area of a half is exceeded; and (v) the datum at the index which yields an area closest to 68.27% is then recorded as $M_{i_{+}}$ or $M_{i_{-}}$.

These indices are then used in the ordered data to construct the uncertainty range

$\sigma_{\rm med}^{+} = M_{i_{+}} - M_{\rm med}, \qquad \sigma_{\rm med}^{-} = M_{\rm med} - M_{i_{-}},$

where $M_{\rm med}$ is the median.
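The binomial construction above can be sketched in code. A minimal Python illustration, assuming only NumPy; the central-interval search here is a simplified variant of steps (i)-(v), not the paper's exact implementation:

```python
import numpy as np
from math import comb

def median_stats(data, cl=0.6827):
    """Median central estimate with roughly 68.27% limits built from the
    binomial probabilities P_i = 2**-N * N! / (i! (N-i)!) of Eq. (1)."""
    m = np.sort(np.asarray(data, dtype=float))
    n = len(m)
    med = np.median(m)
    # P_i: probability that the true median lies between M_i and M_{i+1}
    p = np.array([comb(n, i) / 2.0**n for i in range(n + 1)])
    cdf = np.cumsum(p)  # cdf[i] = P(true median < m[i])
    # central interval enclosing a fraction cl of the probability
    lo = m[min(np.searchsorted(cdf, (1 - cl) / 2), n - 1)]
    hi = m[min(np.searchsorted(cdf, (1 + cl) / 2), n - 1)]
    return med, med - lo, hi - med
```

Note that the interval endpoints land on actual measurements, so the errors are generically asymmetric, as in the results quoted later.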

The weighted-mean central estimate, while it has the added assumption of Gaussianity, has the benefit of using the reported uncertainties and so is a more constraining estimate than the median. Given a data set Mi ± σi we compute the weighted-mean central estimate as (Podariu et al. 2001)

Equation (2): $M_{\rm wm} = \dfrac{\sum_{i=1}^{N} M_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2},$

with standard deviation

Equation (3): $\sigma_{\rm wm} = \left(\sum_{i=1}^{N} 1/\sigma_i^2\right)^{-1/2}.$

We also employ the arithmetic mean central estimate

Equation (4): $M_{\rm m} = \dfrac{1}{N}\sum_{i=1}^{N} M_i,$

with standard error of the mean

Equation (5): $\sigma_{\rm m} = \dfrac{1}{\sqrt{N(N-1)}}\left[\sum_{i=1}^{N}\left(M_i - M_{\rm m}\right)^2\right]^{1/2}.$

These central estimates are used to construct error distributions which we will compare with a Gaussian in order to evaluate the Gaussianity of the data set.
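Equations (2)-(5) are compact enough to state directly in code; a sketch using NumPy (function and variable names are ours, not from the paper):

```python
import numpy as np

def weighted_mean(m, sigma):
    """Eqs. (2)-(3): inverse-variance weighted mean and its standard deviation."""
    m = np.asarray(m, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # weights 1/sigma_i^2
    return np.sum(w * m) / np.sum(w), np.sum(w) ** -0.5

def arithmetic_mean(m):
    """Eqs. (4)-(5): arithmetic mean and the standard error of the mean."""
    m = np.asarray(m, dtype=float)
    n = len(m)
    mu = np.mean(m)
    sem = np.sqrt(np.sum((m - mu) ** 2) / (n * (n - 1)))
    return mu, sem
```

The weighted mean uses the reported uncertainties, so it is generally the most constraining of the three estimates, at the cost of the Gaussianity assumption.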

3.2. Error Distributions

We create error distributions of our data in order to study their Gaussianity. The error distribution ${N}_{{\sigma }_{i}}$ is a measure of how many standard deviations any individual measurement deviates from the central estimate.

Given measurements Mi ± σi and a corresponding central estimate MCE ± σCE, the error distribution is

Equation (6): $N_{\sigma_i} = \dfrac{M_i - M_{\rm CE}}{\sqrt{\sigma_i^2 + \sigma_{\rm CE}^2}}.$

This expression assumes that the central estimate is not correlated with the data set. This assumption is not satisfied here, as our central estimates are computed directly from the data compilations. Finding such a formula for an error distribution of a correlated median is beyond the scope of this work.

The weighted-mean case has been solved in the case of correlation between the weighted-mean and the data set. 13 The error distribution using a weighted-mean central estimate that is correlated with the data set is

Equation (7): $N_{\sigma_i} = \dfrac{M_i - M_{\rm wm}}{\sqrt{\sigma_i^2 - \sigma_{\rm wm}^2}}.$

We often have asymmetric error bars on the median, in which case we slightly alter Equation (6): we use the upper error of the median, $\sigma_{\rm med}^{+}$, when $M_i > M_{\rm med}$, and the lower error, $\sigma_{\rm med}^{-}$, when $M_i < M_{\rm med}$,

Equation (8): $N_{\sigma_i} = \dfrac{M_i - M_{\rm med}}{\sqrt{\sigma_i^2 + (\sigma_{\rm med}^{\pm})^2}}.$

After symmetrizing the error distribution about 0, 14 we use the Kolmogorov–Smirnov (KS) test to study the Gaussianity of the data compilations.
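The construction of Equation (8) and the symmetrization of footnote 14 might be sketched as follows (a simplified illustration; the correlated weighted-mean case of Equation (7) is omitted):

```python
import numpy as np

def error_distribution(m, sigma, med, sig_lo, sig_hi):
    """Eq. (8): N_sigma_i about the median, using the upper median error
    when M_i > M_med and the lower median error when M_i < M_med."""
    m = np.asarray(m, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    sig_med = np.where(m > med, sig_hi, sig_lo)
    return (m - med) / np.sqrt(sigma**2 + sig_med**2)

def symmetrize(n_sigma):
    """Mirror |N_sigma| about zero, e.g. [1, 2, 3] -> [-3, -2, -1, 1, 2, 3]."""
    a = np.abs(np.asarray(n_sigma, dtype=float))
    return np.sort(np.concatenate([-a, a]))
```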

3.3. The Kolmogorov–Smirnov Test and Testing Gaussianity

The KS test is a statistical test that compares empirical error distributions with continuous probability density functions (PDFs). 15 The first step is calculating the D-statistic, which is the largest difference between the empirical cumulative distribution function and that of the relevant PDF.

The D-statistic is then used to compute z as given in Press et al. (2007)

Equation (9): $z = \left(\sqrt{N} + 0.12 + \dfrac{0.11}{\sqrt{N}}\right)D,$

which is used to compute the p-value

Equation (10): $p = 2\sum_{j=1}^{\infty}(-1)^{j-1}e^{-2j^{2}z^{2}}.$

Explicitly, the p-value is the probability of obtaining a D-statistic at least as large as the one measured if the data were actually drawn from the PDF of interest, so a larger p indicates better consistency with that PDF. Conventionally, if p ≥ 0.95 we consider these data consistent with having been drawn from the relevant PDF; if p < 0.95, we conclude that these data are not consistent with having been drawn from a Gaussian distribution.
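A sketch of Equations (9)-(10), assuming the Press et al. (2007) asymptotic form and truncating the infinite sum; the paper does not give implementation details, so treat this as illustrative:

```python
import numpy as np

def ks_p_value(d_stat, n, terms=100):
    """Eq. (9): z from the KS D statistic and sample size n;
    Eq. (10): the asymptotic KS probability, truncated after `terms` terms."""
    z = (np.sqrt(n) + 0.12 + 0.11 / np.sqrt(n)) * d_stat
    j = np.arange(1, terms + 1)
    p = 2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * j**2 * z**2))
    # the truncated alternating series can stray slightly outside [0, 1]
    return z, float(min(max(p, 0.0), 1.0))
```

As expected, a larger D (worse fit) yields a larger z and a smaller p.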

We also introduce a scale factor S that rescales the measurement errors,

Equation (11): $\sigma_i \rightarrow \sigma_i/S.$

S > 1 corresponds to decreasing the errors, which widens the error distribution ${N}_{{\sigma }_{i}}$, and S < 1 corresponds to increasing the errors, which narrows it. We then run the KS test, varying S from 0 to 10 to find S*, the value of S which optimizes the p-value. This allows us to compare the error distribution to a Gaussian. If S* > 1, that is, if the errors must be decreased for the error distribution to best fit a Gaussian, the distribution is narrower than a Gaussian and we conclude that the errors may have been overestimated. Similarly, if S* < 1, the distribution is broader than a Gaussian and we conclude that the errors may have been underestimated.
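The search for S* might look like the following sketch, applying Equation (11) as σ_i → σ_i/S (equivalently N_σ → S·N_σ) and using SciPy's one-sample KS test against a unit normal; the grid resolution is our choice, not the paper's:

```python
import numpy as np
from scipy import stats

def optimal_scale_factor(n_sigma, s_grid=None):
    """Scan S in (0, 10]; Eq. (11), sigma_i -> sigma_i / S, rescales the
    error distribution as N_sigma -> S * N_sigma.  Return the S* that
    maximizes the KS p-value against a standard normal, and that p*."""
    if s_grid is None:
        s_grid = np.linspace(0.1, 10.0, 991)
    pvals = [stats.kstest(s * np.asarray(n_sigma, dtype=float), "norm").pvalue
             for s in s_grid]
    i = int(np.argmax(pvals))
    return float(s_grid[i]), float(pvals[i])
```

For an error distribution narrower than a unit Gaussian (for example, one drawn with standard deviation 0.5), this returns S* near 2, i.e., the quoted errors would need to be roughly halved.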

3.4. Estimating Systematic Error

All uncertainties computed thus far have corresponded to statistical errors. We perform an analysis of the systematic error present in the All 15 data set, and in the All 15 without Averages data subset, using the procedure outlined in Chen & Ratra (2011). Within each tracer subgroup there is statistical uncertainty resulting in a spread of measurements, and between the tracers there could be systematic error resulting from the different techniques and calibrations.

We construct a new data set consisting of the median of each tracer subgroup. We perform a median statistics analysis on this new data set to find the median of medians and its associated uncertainty. If we assume that these medians differ only systematically from each other, this uncertainty corresponds to the systematic uncertainty of the entire group of tracers.
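Under the stated assumption, the procedure reduces to running median statistics on the list of subgroup medians. A self-contained sketch (the tracer data below are toy values, and the central-interval search is a simplified variant of the Gott et al. 2001 construction):

```python
import numpy as np
from math import comb

def systematic_error(tracer_groups, cl=0.6827):
    """Median of the per-tracer medians, with a binomial-probability
    confidence interval read as the systematic uncertainty (Sec. 3.4)."""
    meds = np.sort([np.median(g) for g in tracer_groups])
    n = len(meds)
    med = np.median(meds)
    # P_i that the true median lies between adjacent subgroup medians
    p = np.array([comb(n, i) / 2.0**n for i in range(n + 1)])
    cdf = np.cumsum(p)
    lo = meds[min(np.searchsorted(cdf, (1 - cl) / 2), n - 1)]
    hi = meds[min(np.searchsorted(cdf, (1 + cl) / 2), n - 1)]
    return med, med - lo, hi - med
```

With only 15 subgroup medians the binomial interval is coarse, which is why the quoted systematic errors are asymmetric.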

4. Results

We perform a median statistics analysis on deGB's compilation of 211 distance measurements. We calculate three central estimates and study the Gaussianity of the All 15 data set (comprising 211 measurements from 15 tracers), the Best 5 data set (comprising 44 adjusted measurements from 5 tracers), and the Best 3 data set (comprising 24 adjusted measurements from 3 tracers). These data sets are outlined in Section 2. The central estimates and results of the KS test for these data are shown in Table 1.

Table 1. Central Estimates and KS Test Results for the Four Data Compilations

Data Set              Central Estimate     Value (mag)                     p a     S*    p* b
All 15                Median               ${31.08}_{-0.04}^{+0.04}$       ≪0.1    2.2   0.83
                      Weighted Mean c      31.07 ± 0.01                    ≪0.1    2.4   0.63
                      Arithmetic Mean      30.97 ± 0.07                    ≪0.1    2.1   0.51
All 15 w/o Averages   Median               ${31.08}_{-0.04}^{+0.05}$       ≪0.1    2.1   0.85
                      Weighted Mean        31.06 ± 0.01                    ≪0.1    2.1   0.68
                      Arithmetic Mean      30.95 ± 0.08                    ≪0.1    1.9   0.49
Best 5 d              Median               ${31.02}_{-0.05}^{+0.04}$       0.20    1.4   1.0
                      Weighted Mean        31.03 ± 0.01                    0.039   1.8   1.0
                      Arithmetic Mean      31.04 ± 0.03                    0.057   1.6   1.0
Best 3 e              Median               ${31.025}_{-0.005}^{+0.04}$     0.26    1.7   1.0
                      Weighted Mean        31.05 ± 0.01                    0.109   1.9   1.0
                      Arithmetic Mean      31.03 ± 0.03                    0.30    1.8   1.0

Notes.

a The p-values in this column are for the unscaled error distribution, i.e., S = 1.
b The 1.0 values in the p* column all lie within the range (0.995, 0.999).
c There were some data without errors. These were set to the mean of the uncertainties for that tracer in order to perform the weighted mean analysis.
d Including the Tammann et al. (2000) measurement.
e Excluding the Tammann et al. (2000) measurement.


We find all four data sets to be inconsistent with Gaussianity. Therefore, we argue that the median provides the best central estimate of each of these compilations. As the optimal scale factor S* > 1 for all of these data sets, some of the errors may have been overestimated.

We report a median of ${31.08}_{-0.04}^{+0.04}$ mag for the All 15 data set, ${31.08}_{-0.04}^{+0.05}$ for the All 15 without averages data set, ${31.02}_{-0.05}^{+0.04}$ mag for the Best 5 data set, and ${31.03}_{-0.01}^{+0.04}$ mag for the Best 3 data set, where all errors are at a 68.27% confidence level.

In the case of the All 15 and All 15 without averages data sets, there are enough tracers to estimate the systematic error in the median of the entire data set, as outlined in Section 3.4. The results of this analysis are shown in Table 2. The median of the All 15 data set is ${31.08}_{-0.04}^{+0.04}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag and the median of the All 15 without averages data set is ${31.08}_{-0.04}^{+0.05}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag, at 68.27% significance. The systematic error due to the tracer differences is comparable to the statistical uncertainty on the median. Combining the two errors in quadrature we get ${31.08}_{-0.07}^{+0.06}$ mag, or 16.4 ± 0.5 Mpc, for both the All 15 and the All 15 without averages data sets.

Table 2. All 15 and All 15 Without Averages Systematic Uncertainty Analyses a

Tracer Type        N     Median (Shift) b   w/o Ave (Shift)   1σ Error (Width) c     w/o Ave 1σ Error (Width)
All Data           211   31.08              31.08             −0.04, +0.04 (0.08)    −0.05, +0.05 (0.09)
TFR                36    31.24 (0.16)       (0.16)            −0.16, +0.16 (0.32)    −0.16, +0.16 (0.32)
GCLF               32    31.11 (0.03)       (0.04)            −0.14, +0.04 (0.18)    −0.14, +0.04 (0.18)
Averages           21    31.08 (0.00)       ...               −0.02, +0.15 (0.17)    ...
SBF                18    31.12 (0.04)       (0.05)            −0.09, +0.03 (0.12)    −0.09, +0.03 (0.12)
SNe                18    31.65 (0.57)       (0.57)            −0.05, +0.05 (0.10)    −0.05, +0.05 (0.10)
Other Methods      15    30.90 (−0.18)      (−0.18)           −0.10, +0.25 (0.35)    −0.10, +0.25 (0.35)
PNLF               12    30.87 (−0.22)      (−0.21)           −0.02, +0.03 (0.05)    −0.02, +0.03 (0.05)
Faber-Jackson      11    31.14 (0.06)       (0.07)            −0.01, +0.34 (0.35)    −0.01, +0.34 (0.35)
Color-magnitude    11    30.84 (−0.24)      (−0.24)           −0.08, +0.06 (0.14)    −0.08, +0.06 (0.14)
Novae              8     31.40 (0.32)       (0.33)            ...                    ...
Hubble law         8     27.30 (−3.78)      (−3.78)           ...                    ...
Cepheids           7     31.02 (−0.06)      (−0.06)           ...                    ...
HII                6     31.20 (0.12)       (0.13)            ...                    ...
Group Member       5     30.50 (−0.58)      (−0.58)           ...                    ...
TRGB               3     31.05 (−0.03)      (−0.03)           ...                    ...
Subgroup Medians   15    31.08              31.08             −0.06, +0.04 (0.1)     −0.06, +0.04 (0.1)

Notes.

a All Data are the 211 data points including the Tammann et al. (2000) measurement. The next 15 rows are the individual tracer types; only for tracer types with more than 10 measurements do we show uncertainties in the error columns. The Subgroup Medians row shows the results of a median statistics analysis of the preceding 15 medians; its uncertainty is the reported systematic uncertainty.
b The shift is the difference between the tracer median and the All Data median in the first row.
c Error on the median in the preceding column.


The All 15 data set has the most measurements, enough to allow an estimate of the systematic uncertainty, and enough to make this data set the most robust against the effects of small-number statistics. Therefore, we choose the results of the analysis of this data set as our final reported value.

5. Conclusion

After analyzing the data sets compiled by deGB, we recommend the All 15 without averages median statistics M87 distance modulus of ${31.08}_{-0.04}^{+0.05}$ (statistical) ${}_{-0.06}^{+0.04}$ (systematic) mag at 68.27% significance. Combining the two errors in quadrature we have ${31.08}_{-0.07}^{+0.06}$ mag, or 16.4 ± 0.5 Mpc. This estimate is consistent with deGB's result of 31.03 ± 0.14 mag based on the Best 3 data set. We argue that our reported value is more reliable than deGB's mean statistics result, since these data are not Gaussianly distributed, and the larger data set allowed us to estimate the systematic uncertainty and include it in our reported value.

Acknowledgments

We acknowledge helpful discussions with Jacob Peyton, Aman Singal, Shantanu Desai, and Gunasekar Ramakrishnan. This project was supported by funding from Kansas State University's REU program funded by NSF grant No. 1757778.

Footnotes

  • 5  

    Ramakrishnan & Desai (2023) have more thoroughly examined the Gaussianity of the full deGB compilation, as well as that of many of the 15 different tracer compilations, and also find that the full deGB compilation is non-Gaussian.

  • 6  

    Ramakrishnan & Desai (2023) have also performed a median statistics analysis of the deGB compilation, not based on the Gott et al. (2001) technique but on one that assumes only mildly non-Gaussian data and so get a different statistical error on the median. They do not estimate the systematic error on the median.

  • 7  

deGB use 213 measurements, but their tracer-organized database lists only 211. The 2 missing points do not statistically affect the results.

  • 8  

    Refer to Table 2 for a full list.

  • 9  

    The unadjusted measurements database sorted by tracer type can be found at https://astro-expat.info/Data/m87distbytracer.html.

  • 10  

deGB say they use 28 data points, hence 27 after they discard the Tammann et al. (2000) point (as discussed next); however, they list only 25, including the Tammann et al. (2000) point, in their Table 1, possibly due to confusion between their unadjusted and adjusted data points, of which there are 3 fewer. deGB discard one particular outlier of the Cepheid tracer, from Tammann et al. (2000), who reported this result under the assumption of the long distance scale. Currently, the short distance scale is the accepted framework (Freedman et al. 2001).

  • 11  

    So that we may determine what effects removing this tracer has on the results.

  • 12  

    There can be systematic error in different subsets of the data set, so long as the same systematic error does not affect a significant portion of the full data set.

  • 13  

    A derivation is shown in Camarillo et al. (2018b).

  • 14  

    For example: [1, 2, 3] → [−3, −2, −1, 1, 2, 3].

  • 15  

We also used the Anderson–Darling (AD) test for this purpose. The AD test results agree with our KS test findings and so are not recorded here.
