Bayesian jackknife tests with a small number of subsets: application to HERA 21 cm power spectrum upper limits
MJ Wilensky, F Kennedy, P Bull… - Monthly Notices of …, 2023 - academic.oup.com
Monthly Notices of the Royal Astronomical Society, 2023 • academic.oup.com
Abstract
We present a Bayesian jackknife test for assessing the probability that a data set contains biased subsets, and, if so, which of the subsets are likely to be biased. The test can be used to assess the presence and likely source of statistical tension between different measurements of the same quantities in an automated manner. Under certain broadly applicable assumptions, the test is analytically tractable. We also provide an open-source code, chiborg, that performs both analytic and numerical computations of the test on general Gaussian-distributed data. After exploring the information theoretical aspects of the test and its performance with an array of simulations, we apply it to data from the Hydrogen Epoch of Reionization Array (HERA) to assess whether different sub-seasons of observing can justifiably be combined to produce a deeper 21 cm power spectrum upper limit. We find that, with a handful of exceptions, the HERA data in question are statistically consistent and this decision is justified. We conclude by pointing out the wide applicability of this test, including to CMB experiments and the H0 tension.
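As a rough illustration of the kind of calculation the abstract describes, the sketch below enumerates every "which subsets are biased" hypothesis for a handful of Gaussian measurements of one quantity and computes the posterior probability of each. It is not the chiborg API, and it is not claimed to reproduce the paper's exact formulation. The function names (`log_gauss`, `jackknife_posterior`), the Gaussian prior on the shared mean, the zero-mean Gaussian prior on each bias, the equal prior probability for each subset being biased, and the toy data are all assumptions made for this example.

```python
import itertools
import numpy as np


def log_gauss(d, mean, cov):
    """Log density of a multivariate normal N(mean, cov) evaluated at d."""
    r = d - mean
    _, logdet = np.linalg.slogdet(cov)
    chi2 = r @ np.linalg.solve(cov, r)
    return -0.5 * (chi2 + logdet + len(d) * np.log(2.0 * np.pi))


def jackknife_posterior(d, noise_var, bias_var,
                        mean_prior_mu=0.0, mean_prior_var=1.0,
                        prior_biased=0.5):
    """Posterior over all 2^N bias configurations (hypothetical helper).

    Assumed model: d_i = mu + b_i + n_i, with mu ~ N(mean_prior_mu, mean_prior_var),
    n_i ~ N(0, noise_var[i]), and b_i ~ N(0, bias_var) if subset i is biased, else 0.
    Marginalising over mu and b_i analytically leaves a Gaussian in d.
    """
    d = np.asarray(d, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    n = len(d)
    hyps = list(itertools.product([0, 1], repeat=n))
    log_post = np.empty(len(hyps))
    for j, h in enumerate(hyps):
        h = np.array(h)
        # Noise on the diagonal, extra variance on flagged subsets,
        # plus a fully correlated term from the shared unknown mean.
        cov = np.diag(noise_var + bias_var * h) + mean_prior_var * np.ones((n, n))
        log_like = log_gauss(d, np.full(n, mean_prior_mu), cov)
        k = h.sum()
        log_prior = k * np.log(prior_biased) + (n - k) * np.log1p(-prior_biased)
        log_post[j] = log_like + log_prior
    # Normalise in log space for numerical stability.
    log_post -= np.logaddexp.reduce(log_post)
    return hyps, np.exp(log_post)


# Toy data (made up): three consistent subsets and one obvious outlier.
d = [0.1, -0.2, 0.05, 8.0]
hyps, probs = jackknife_posterior(d, noise_var=np.ones(4), bias_var=100.0)

best = hyps[int(np.argmax(probs))]      # most probable bias configuration
marginal = probs @ np.array(hyps)       # P(subset i is biased | data)
print("most probable configuration:", best)
print("per-subset bias probability:", np.round(marginal, 3))
```

Exhaustive enumeration is only practical because the number of subsets is small: the hypothesis count grows as 2^N. The per-subset marginals summarise which subsets are likely to be biased, while the probability assigned to the all-zeros configuration summarises whether the data set as a whole looks consistent under these assumed priors.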
Oxford University Press