Statistics Seminar: Antonio Linero

Theory and Practice for Bayesian Regression Tree Ensembles
January 23, 2019 - 3:30pm

108 Folwell Hall

Abstract

Ensembles of decision trees have become a standard component of the data analyst's toolkit; commonly used algorithms include random forests and boosted decision trees. In this talk, we investigate the properties of regression tree ensembles from a Bayesian standpoint. We focus on the interplay between theory and practice to study the properties of ensembles and obtain insights into (a) why decision tree ensembles are successful in practice and (b) where they might be improved. We provide validation for the long-held hypothesis that BART ensembles perform well due to their ability to detect low-order interactions, a property which describes many real-world settings. Further, we identify two areas in which BART ensembles can be expected to be suboptimal: under sparsity, and when the underlying regression function exhibits higher-order smoothness. We give theoretical support for these insights by establishing posterior contraction at near-optimal rates adaptively across a large family of function spaces, and provide empirical support by applying our methodology to benchmark datasets. We conclude by presenting extensions of our methodology which account for other interesting structures beyond sparsity and smoothness, and discuss how the insights we obtain can be extended to non-Bayesian decision tree ensembling methods.