Dr. Daniel Allan Kessler, University of North Carolina at Chapel Hill
224 Church St SE, Minneapolis, MN 55455
Selective inference after community detection on a single network
Abstract
Given a dataset consisting of a single realization of a network, we consider conducting inference on a parameter selected from the data. For instance, suppose that we perform community detection to identify sets of nodes with generally similar connectivity patterns. Inference on the connectivity within and between the estimated communities poses a challenge, since the communities are themselves estimated from the data. Furthermore, since only a single realization of the network is available, sample splitting is not possible. In this work, we show that it is possible to split a single realization of a network with independent edges on $p$ nodes into two (or more) networks involving the same $p$ nodes; the first network can then be used to select a data-driven parameter, and the second to conduct inference on that parameter. In the case of weighted networks with Poisson or Gaussian edges, we obtain two independent realizations of the network; by contrast, in the case of Bernoulli edges, the realizations are dependent, and so extra care is required. We establish the theoretical properties of our estimators, in the sense of confidence intervals that attain the nominal (selective) coverage, and demonstrate their utility in numerical simulations and in application to a dataset representing the relationships among members of a karate club. This is joint work with Daniela Witten and Ethan Ancell of the University of Washington.
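The splitting idea for Poisson edges can be illustrated with classical Poisson thinning: if an edge count A is Poisson with mean λ and we draw A1 | A ~ Binomial(A, ε), then A1 and A2 = A - A1 are independent Poisson variables with means ελ and (1 - ε)λ. The sketch below applies that standard fact edge by edge to an undirected network without self-loops. It is only an illustration under those assumptions: the function name `thin_poisson_network`, the parameter `eps`, and the use of NumPy are not from the talk, and the procedure presented there may differ in detail.

```python
import numpy as np


def thin_poisson_network(A, eps=0.5, seed=0):
    """Split a symmetric count adjacency matrix into two networks.

    Hypothetical helper for illustration. Assumes the edges of A are
    independent Poisson counts on an undirected network; the diagonal
    (self-loops) is ignored and left at zero in both outputs.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=np.int64)

    # Work on the upper triangle only so each undirected edge is thinned once.
    iu = np.triu_indices_from(A, k=1)

    # Binomial thinning: each unit of edge weight goes to network 1 w.p. eps.
    a1 = rng.binomial(A[iu], eps)

    A1 = np.zeros_like(A)
    A1[iu] = a1
    A1 = A1 + A1.T  # symmetrize

    A2 = np.zeros_like(A)
    A2[iu] = A[iu] - a1  # the remainder goes to network 2
    A2 = A2 + A2.T

    return A1, A2
```

In this sketch, `A1` would be passed to a community detection method to choose the parameter of interest, and `A2` would be used for inference on it. The Bernoulli case mentioned in the abstract does not split this way, since the resulting networks are dependent.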
Bio
Dan Kessler is an assistant professor in the Department of Statistics & Operations Research (STOR) with a joint appointment in the School of Data Science and Society (SDSS) at the University of North Carolina at Chapel Hill. He obtained his PhD in the Department of Statistics at the University of Michigan, where he was advised by Professor Liza Levina. Prior to joining UNC Chapel Hill, Dan was an NSF Mathematical Sciences Postdoctoral Research Fellow (NSF-MSPRF) in the Department of Statistics at the University of Washington, where he worked with Professor Daniela Witten. His research interests include the statistical analysis of networks, selective inference, high-dimensional statistics, human neuroimaging, computational and cognitive neuroscience, and high-performance computing.