Zhao Ren, University of Pittsburgh

School of Statistics Seminar Series
Event Location
150 Ford Hall

224 Church St SE
Minneapolis, MN 55455

Sparse Heteroskedastic PCA in High Dimensions

Principal component analysis (PCA) is one of the most commonly used techniques for dimension reduction and feature extraction. Although sparse PCA in high dimensions has been well studied, little is known about the case where the noise is heteroskedastic, which turns out to be common in many applications. We propose an iterative algorithm, called SparseHPCA, for the sparse PCA problem in the presence of heteroskedastic noise. The algorithm alternates between two steps: one updates the estimates of the sparse eigenvectors using orthogonal iteration with adaptive thresholding, and the other imputes the diagonal entries of the sample covariance matrix to reduce the estimation bias caused by heteroskedastic noise. Our procedure is computationally fast and provably optimal under the generalized spiked covariance model, assuming the leading eigenvectors are sparse. A comprehensive simulation study shows its robustness and effectiveness under various settings. Applying the new method to two high-dimensional genomics datasets, namely microarray and single-cell RNA sequencing (scRNA-seq) data, demonstrates its ability to preserve inherent cluster structures in downstream analyses. Additionally, we extend SparseHPCA to the sparse singular value decomposition (sparse SVD) problem in the presence of heteroskedastic noise, further showcasing its versatility.
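
The alternation between a thresholded eigenvector update and a diagonal imputation step can be illustrated with a toy sketch. This is not the SparseHPCA procedure from the talk: it assumes a single component (so orthogonal iteration reduces to power iteration), swaps adaptive thresholding for a fixed hard threshold, and imputes the diagonal from the current rank-one fit. The function name, the parameter `tau`, and the example matrix are hypothetical and chosen only for illustration.

```python
import math


def sparse_rank_one_hetero_pca(S, tau, n_iter=50):
    """Toy rank-one sketch: thresholded power iteration plus diagonal imputation.

    S is a symmetric p x p covariance matrix (list of lists); tau is a
    hypothetical fixed hard threshold, not an adaptive one.
    """
    p = len(S)
    # Zero the diagonal: it is inflated by unequal noise variances,
    # so only the off-diagonal entries are trusted at the start.
    M = [[0.0 if i == j else float(S[i][j]) for j in range(p)] for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # dense starting vector
    lam = 0.0
    for _ in range(n_iter):
        # Step 1: one power-iteration update, then hard thresholding.
        w = [sum(M[i][j] * v[j] for j in range(p)) for i in range(p)]
        w = [x if abs(x) > tau else 0.0 for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("threshold tau removed every coordinate")
        v = [x / norm for x in w]
        # Step 2: impute the diagonal from the rank-one fit lam * v v^T,
        # where lam is the Rayleigh quotient of the current matrix.
        lam = sum(v[i] * M[i][j] * v[j] for i in range(p) for j in range(p))
        for i in range(p):
            M[i][i] = lam * v[i] * v[i]
    return v, lam


# Illustrative input: a sparse spike plus very unequal noise variances on
# the diagonal, with a constant 0.05 off-diagonal perturbation standing in
# for sampling error (all numbers are made up).
u = [2 / 3, 2 / 3, 1 / 3, 0.0, 0.0, 0.0]
noise_var = [1.0, 10.0, 0.5, 8.0, 2.0, 0.1]
S = [[5.0 * u[i] * u[j] + (noise_var[i] if i == j else 0.05)
      for j in range(6)] for i in range(6)]

v_hat, lam_hat = sparse_rank_one_hetero_pca(S, tau=0.3)
support = [i for i, x in enumerate(v_hat) if x != 0.0]
print("estimated support:", support)
```

Because the input diagonal is discarded and then re-estimated from the low-rank fit, the unequal noise variances never enter the eigenvector update directly; the threshold is what keeps small off-diagonal perturbations from leaking into the estimated support.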

Bio

Zhao Ren is an Associate Professor in the Department of Statistics at the University of Pittsburgh. He obtained his Ph.D. in Statistics from Yale University in 2014. His research focuses on developing theories and methods for high-dimensional statistical inference, robust statistics, graphical models, and nonparametric function estimation. He is also interested in applications in genomics and protein structure modeling. More recently, his interests have expanded to include deep neural networks for nonparametric regression and the development of privacy-preserving statistical methods.
