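School of Statistics and Management, Academic Seminar Series 2015, No. 17
【Topic】 Incorporation of Sparsity Information in Large-scale Multiple Two-sample T Tests
【Speaker】 Professor Weidong Liu
Shanghai Jiao Tong University
【Time】 Friday, May 8, 2015, 10:30-11:30
【Venue】 Room 1208, School of Statistics and Management Building, Shanghai University of Finance and Economics
【Language】 English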
【Abstract】 Large-scale multiple two-sample Student's T testing problems often arise in the statistical analysis of scientific data. To detect the components in which two mean vectors differ, a well-known procedure is to apply the Benjamini and Hochberg (B-H) method to two-sample Student's T statistics in order to control the false discovery rate (FDR). In many applications, the mean vectors are expected to be sparse or asymptotically sparse. When dealing with this type of data, can we gain more power than a standard procedure such as the B-H method with Student's T statistics while keeping the FDR under control? The answer is yes. By exploiting the possible sparsity information in the mean vectors, we present an uncorrelated screening-based (US) FDR control procedure, which is shown to be more powerful than the B-H method. The US testing procedure depends on a novel construction of screening statistics, which are asymptotically uncorrelated with the two-sample Student's T statistics. It differs from some existing testing-following-screening methods (Reiner et al., 2007; Yekutieli, 2008), in which independence between screening and testing is crucial for controlling the FDR and often requires additional data or sample splitting. An inadequate splitting of samples may result in a loss rather than an improvement of statistical power. The US procedure, by contrast, is based on the original data and does not need to split the samples. Theoretical results show that the US testing procedure controls the FDR at the desired level asymptotically. Numerical studies indicate that the proposed procedure works well.
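
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of the standard B-H step-up rule applied to two-sample T statistics. It is not the US procedure from the talk, whose screening statistics are not specified in this announcement. The function names (welch_t, two_sided_p, benjamini_hochberg) are illustrative, the T statistic is taken in its unequal-variance (Welch) form, and the p-values use a normal approximation, which assumes reasonably large samples.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(x, y):
    """Two-sample T statistic in the unequal-variance (Welch) form."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def two_sided_p(t):
    """Two-sided p-value from a normal approximation (large-sample assumption)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def benjamini_hochberg(pvalues, alpha):
    """Indices rejected by the B-H step-up rule at nominal FDR level alpha."""
    m = len(pvalues)
    # Sort hypothesis indices by increasing p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


def bh_two_sample(group_x, group_y, alpha=0.05):
    """Apply B-H to per-component T tests; each argument is a list of samples per component."""
    pvalues = [two_sided_p(welch_t(x, y)) for x, y in zip(group_x, group_y)]
    return benjamini_hochberg(pvalues, alpha)
```

Here bh_two_sample takes one list of observations per component for each of the two groups and returns the indices of the components declared different. Because the rule is step-up, a component can be rejected even when its own p-value exceeds its threshold, provided a larger p-value further down the ordering meets its threshold.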


