High-dimensional statistics with systematically corrupted data*
Mon., May 12, 2014
3:00 - 4:00 PM
521 Cory Hall
Noisy and missing data are prevalent in many real-world statistical estimation problems. Popular techniques for handling nonidealities in data are often difficult to analyze theoretically and may terminate in substantially different local optima of a nonconvex objective function -- these problems are only exacerbated in high dimensions.
We present new methods for constructing high-dimensional regression estimators based on corrupted data and quantify the rate of convergence of these estimators to the true regression parameter. Although our approach also involves nonconvex optimization, we establish the rather surprising result that all stationary points of the relevant nonconvex functions are clustered around global minima with good statistical properties. Our theory also extends to certain classes of nonconvex regularizers that have attracted growing interest in the statistics community. We show that the nonconvex functions under consideration may be optimized efficiently using gradient methods.
Finally, we highlight some of our recent work on graphical model estimation that links the structure of the inverse covariance matrix to the edge set of the underlying graph when data are discrete-valued. We synthesize these ideas with our results in high-dimensional regression to obtain new algorithms for inferring the edge structure of a graphical model when data may be systematically corrupted and/or non-Gaussian.
This talk is based on joint work with Martin Wainwright and Peter Bühlmann.
UC Berkeley Networking