Pragya Sur, “Precise high-dimensional asymptotics for AdaBoost via max-margins & min-norm interpolants”

August 9, 2021

When:
November 2, 2021 @ 12:00 pm – 1:00 pm

Join Zoom Meeting:

https://wse.zoom.us/j/99567504456?pwd=WkI2UlpGT3p6MldLS05VNkdmcGxiZz09


Pragya Sur, PhD

Assistant Professor

Statistics Department

Harvard University


“Precise high-dimensional asymptotics for AdaBoost via max-margins & min-norm interpolants”

Abstract: This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the max-min-L1-margin and the min-L1-norm interpolant. In turn, this will characterize the generalization error of AdaBoost, when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we will provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin. Our theory provides several insights into properties of AdaBoost; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation. Our statistical and computational arguments can handle (1) finite-rank spiked covariance models for the feature distribution and (2) variants of AdaBoost corresponding to general Lq-geometry, for q in [1,2]. This is based on joint work with Tengyuan Liang.
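For orientation, the two objects at the center of the abstract have standard definitions; the notation below (data (x_i, y_i), i = 1, …, n, with labels y_i in {−1, +1} and coefficient vector θ in R^p) is illustrative and may differ in details from the speaker's setup:

\[
\kappa_n \;=\; \max_{\|\theta\|_1 \le 1} \; \min_{1 \le i \le n} \; y_i \langle x_i, \theta \rangle,
\qquad
\hat{\theta} \;=\; \operatorname*{arg\,min} \bigl\{ \|\theta\|_1 \,:\, y_i \langle x_i, \theta \rangle \ge 1 \ \text{for all } i \bigr\}.
\]

When the data are separable (κ_n > 0), rescaling any margin-maximizing direction shows that the min-L1-norm interpolant has L1 norm exactly 1/κ_n, so the two objects determine each other; this is the link that lets an exact analysis of the max-L1-margin and the interpolant characterize the generalization error of AdaBoost at interpolation.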

Biography: Pragya Sur is an Assistant Professor in the Statistics Department at Harvard University. Her research broadly spans high-dimensional statistics, statistical machine learning, and robust inference and prediction for multi-study/multi-environment heterogeneous data. She is also interested in applications of large-scale statistical methods to computational neuroscience and genetics. Her research is currently supported by a William F. Milton Fund award and an NSF DMS award. Previously, she was a postdoctoral fellow at the Center for Research on Computation and Society, Harvard John A. Paulson School of Engineering and Applied Sciences. She received a Ph.D. in Statistics from Stanford University in 2019, where her thesis was awarded the Theodore W. Anderson Theory of Statistics Dissertation Award.
