Abstract
This paper considers control of the k-FWER (generalized familywise error rate) when testing grouped hypotheses. We derive weights for the p-values in each group by maximizing an objective function, namely the expected proportion of rejected hypotheses. This objective function uses not only the proportion of true null hypotheses but also the null and non-null distributions of the p-values in each group. When this information is known a priori, our weighted testing procedure controls the k-FWER for arbitrarily dependent p-values. When it is unknown and is estimated from the data, the procedure asymptotically controls the k-FWER under a weak dependence assumption on the p-values in each group. The new procedure is shown, both in theory and in simulations, to be more powerful than some existing procedures. For illustration, the proposed procedure is applied to the adequate yearly progress data.
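To make the idea of a weighted single-step test concrete, the following is a minimal sketch rather than the paper's procedure. It takes the group weights as given and does not compute the optimal weights described above. The function name `weighted_k_fwer` and the normalization (non-negative weights whose sum over all hypotheses is at most n) are assumptions made for illustration. Under that normalization the expected number of false rejections is at most kα, so Markov's inequality gives P(V ≥ k) ≤ α for arbitrarily dependent p-values.

```python
def weighted_k_fwer(pvals, groups, weights, k=1, alpha=0.05):
    """Reject H_i when p_i <= w_{g(i)} * k * alpha / n.

    Illustrative assumption (not taken from the paper): the weights are
    non-negative and their sum over all n hypotheses is at most n.
    """
    n = len(pvals)
    # Expand the per-group weights to one weight per hypothesis.
    w = [float(weights[g]) for g in groups]
    if any(x < 0 for x in w) or sum(w) > n + 1e-9:
        raise ValueError("weights must be non-negative and sum to at most n")
    base = k * alpha / n  # unweighted generalized Bonferroni threshold
    return [p <= x * base for p, x in zip(pvals, w)]


# Hypothetical data: two groups of two hypotheses each.
pvals = [0.001, 0.02, 0.04, 0.5]
groups = [0, 0, 1, 1]
weights = {0: 1.5, 1: 0.5}  # up-weight group 0, down-weight group 1
print(weighted_k_fwer(pvals, groups, weights, k=2, alpha=0.05))
# [True, True, False, False]
```

With all weights equal to 1 the rule reduces to the generalized Bonferroni threshold kα/n of Lehmann and Romano (2005).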
References
Basu P, Cai T, Das K, Sun W (2018) Weighted false discovery rate control in large-scale multiple testing. J Am Stat Assoc. https://doi.org/10.1080/01621459.2017.1336443
Benjamini Y, Cohen R (2017) Weighted false discovery rate controlling procedures for clinical trials. Biostatistics 18:91–104
Benjamini Y, Heller R (2007) False discovery rates for spatial signals. J Am Stat Assoc 102:1272–1281
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300
Benjamini Y, Hochberg Y (2000) On the adaptive control of the false discovery rate in multiple testing with independent statistics. J Educ Behav Stat 25:60–83
Cai T, Sun W (2009) Simultaneous testing of grouped hypotheses: finding needles in multiple haystacks. J Am Stat Assoc 104:1467–1481
Clements N, Sarkar SK, Guo W (2012) Astronomical transient detection controlling the false discovery rate. In: Feigelson E, Babu G (eds) Statistical challenges in modern astronomy V. Springer, New York, pp 383–396
Efron B (2008) Simultaneous inference: when should hypothesis testing problems be combined? Ann Appl Stat 2:197–223
Genovese C, Wasserman L (2004) A stochastic process approach to false discovery control. Ann Stat 32:1035–1061
Guo W, Romano JP (2007) A generalized Sidak-Holm procedure and control of generalized error rates under independence. Stat Appl Genet Mol Biol 6:3
Hu J, Zhao H, Zhou H (2010) False discovery rate control with groups. J Am Stat Assoc 105:1215–1227
Jin J (2008) Proportion of non-zero normal means: universal oracle equivalences and uniformly consistent estimators. J R Stat Soc Ser B 70:461–493
Jin J, Cai T (2007) Estimating the null and the proportion of nonnull effects in large-scale multiple comparisons. J Am Stat Assoc 102:495–506
Kellerer H, Pferschy U, Pisinger D (2004) Knapsack problems. Springer, Berlin
Lehmann EL, Romano JP (2005) Generalizations of the familywise error rate. Ann Stat 33:1138–1154
Liu Y, Sarkar SK, Zhao Z (2016) A new approach to multiple testing of grouped hypotheses. J Stat Plan Inference 179:1–14
Romano JP, Shaikh AM (2006) Stepup procedures for control of generalizations of the familywise error rate. Ann Stat 34:1850–1873
Sarkar SK (2007) Stepup procedures controlling generalized FWER and generalized FDR. Ann Stat 35:2405–2420
Sarkar SK (2008) Generalizing Simes’ test and Hochberg’s stepup procedure. Ann Stat 36:337–363
Storey JD, Taylor JE, Siegmund D (2004) Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J R Stat Soc Ser B 66:187–205
Sun W, Cai T (2007) Oracle and adaptive compound decision rules for false discovery rate control. J Am Stat Assoc 102:901–912
van der Laan MJ, Dudoit S, Pollard KS (2004) Augmentation procedures for control of the generalized family-wise error rate and tail probabilities for the proportion of false positives. Stat Appl Genet Mol Biol 3(1):15
Wang L, Xu X (2012) Step-up procedure controlling generalized family-wise error rate. Stat Probab Lett 82:775–782
Zhao H (2014) Adaptive FWER control procedure for grouped hypotheses. Stat Probab Lett 95:63–70
Zhao H, Zhang J (2014) Weighted \(p\)-value procedures for controlling FDR of grouped hypotheses. J Stat Plan Inference 151–152:90–106
Acknowledgements
The author thanks the reviewers and associate editor for their useful comments, which greatly improved the quality of the paper. The author also thanks Professor Wenguang Sun for sharing his data and code. The work was supported by the National Natural Science Foundation of China (Grant Nos. 11626227, 11671398) and the Fundamental Research Funds for the Central Universities (Grant No. 2015QS03).
Appendix
Proof of Proposition 1
Proof
and
where the last inequality follows from the convexity of \(F(\cdot /x)\), and equality holds if and only if \(a_{10}=\cdots =a_{G0}\). Thus, it remains only to verify that \(na_{T0}\ge n\sum _{g=1}^G\frac{n_{g1}}{n_{T1}}a_{g0}\). Since
where the last inequality follows from the Cauchy–Schwarz inequality, and equality holds if and only if \(n_{10}/n_1=\cdots =n_{G0}/n_G\), i.e. \(a_{10}=\cdots =a_{G0}\).
Thus, the proof is completed. \(\square \)
Remark 1
In the proof of Theorem 2 in Zhao (2014), it is stated that \(power(\varvec{\omega }^{**},\frac{k}{n}\alpha )=\sum _{g=1}^Ga_{g1}F(\frac{k\alpha }{na_{g0}})\). This is not true, and we correct it in formula (5).
1.1 Proof of Theorem 1
Proof
Let V denote the number of false rejections.
\(\square \)
1.2 Proof of Theorem 2
Proof
For simplicity, we drop the subscript n in \(V_n\) and \(k_n\) throughout the proof.
(1) Since \(\frac{k}{n}\alpha \rightarrow c\alpha \), following the proof of Lemma 3 in Zhao and Zhang (2014), we obtain
and
(2) Denote \(\varvec{\omega }_0({\hat{r}}, \frac{k}{n}\alpha )=(\omega _{10}({\hat{r}}, \frac{k}{n}\alpha ),\ldots ,\omega _{G0}({\hat{r}}, \frac{k}{n}\alpha ))'\), and \(\varvec{\omega }_0({\tilde{r}}, c\alpha )=(\omega _{10}({\tilde{r}}, c\alpha ),\ldots ,\omega _{G0}({\tilde{r}}, c\alpha ))'\). Similarly,
By the dominated convergence theorem, we have
Then for WT\((\varvec{\omega }_0({\hat{r}}, \frac{k}{n}\alpha ),\frac{k}{n}\alpha )\),
Immediately, \(\limsup _{n\rightarrow \infty } P(V\ge k)\le \alpha .\)
(3) Since \({\hat{\pi }}_{g0}{\mathop {\rightarrow }\limits ^{\mathrm {P}}}\pi _{g0}\in (0,1)\), \({\hat{\pi }}_{T0}=\sum _{g=1}^Ga_g{\hat{\pi }}_{g0}{\mathop {\rightarrow }\limits ^{\mathrm {P}}}\pi _{T0}=\sum _{g=1}^G\pi _g\pi _{g0}\), and \(\frac{1}{{\hat{\pi }}_{T0}}\frac{k}{n}\alpha {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\frac{1}{\pi _{T0}}c\alpha \). It is easy to see
By the dominated convergence theorem, we have
Arguing as for WT\((\varvec{\omega }_0({\hat{r}}, \frac{k}{n}\alpha ),\frac{k}{n}\alpha )\), we obtain
which means \(\limsup _{n\rightarrow \infty } P(V\ge k)\le \alpha .\)
(4) Since \(\frac{1}{{\hat{\pi }}_{g0}}\frac{k}{n}\alpha {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\frac{1}{\pi _{g0}}c\alpha \) for all g, the proof is similar, and is omitted.
\(\square \)
Cite this article
Wang, L. Weighted multiple testing procedure for grouped hypotheses with k-FWER control. Comput Stat 34, 885–909 (2019). https://doi.org/10.1007/s00180-018-0833-8
DOI: https://doi.org/10.1007/s00180-018-0833-8