Academic Seminar


Inference for Stochastic Gradient Descent: Beyond Finite Variance

Posted: 2025-12-19

Stochastic gradient descent (SGD) plays a foundational role in large-scale machine learning, yet its behavior in realistic training environments often departs from what classical finite-variance theory assumes. Empirical studies indicate that gradient noise can be heavy-tailed, exhibiting occasional large fluctuations and markedly non-Gaussian behavior. Such behavior plays a critical role in shaping optimization dynamics and model calibration.
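The following toy sketch (not taken from the talk; the quadratic objective, step-size schedule, and noise distributions are illustrative assumptions) contrasts SGD trajectories under Gaussian gradient noise with trajectories under Student-t noise whose variance is infinite, showing the kind of occasional large jumps the abstract refers to.

```python
# Toy illustration (not from the talk): SGD on a 1-D quadratic f(x) = x^2 / 2,
# with additive gradient noise that is either Gaussian (finite variance) or
# Student-t with 1.5 degrees of freedom (infinite variance, heavy tails).
import numpy as np

rng = np.random.default_rng(0)

def run_sgd(noise_sampler, n_steps=10_000, step0=1.0, alpha=0.7):
    """Run SGD on f(x) = x^2 / 2 with step size step0 / t**alpha."""
    x = 5.0
    iterates = np.empty(n_steps)
    for t in range(1, n_steps + 1):
        grad = x + noise_sampler()          # true gradient is x; noise is additive
        x -= (step0 / t**alpha) * grad
        iterates[t - 1] = x
    return iterates

gaussian_run = run_sgd(lambda: rng.normal(scale=1.0))
heavy_tail_run = run_sgd(lambda: rng.standard_t(df=1.5))  # infinite variance

# Heavy-tailed noise produces occasional large single-step moves in the
# trajectory, even though both runs use the same step-size schedule.
print("final iterate (Gaussian noise):    ", gaussian_run[-1])
print("final iterate (heavy-tailed noise):", heavy_tail_run[-1])
print("largest single move (Gaussian):    ", np.max(np.abs(np.diff(gaussian_run))))
print("largest single move (heavy-tailed):", np.max(np.abs(np.diff(heavy_tail_run))))
```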

In this talk, I will discuss an approach to inference for SGD that goes beyond the traditional finite-variance paradigm. I will first describe how SGD behaves asymptotically under realistic noise conditions, emphasizing the key differences from the standard finite-variance viewpoint. I will then introduce a general methodology for uncertainty quantification and confidence-region construction for SGD-based solutions, applicable even when the underlying gradient noise lacks finite variance. This approach provides practical tools for assessing the reliability of stochastic optimization in modern learning systems.
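For context only, here is a minimal sketch of a classical replication-based confidence interval for a Polyak-Ruppert averaged SGD estimate. It is not the speaker's methodology: its validity rests on finite-variance gradient noise, which is exactly the assumption the talk moves beyond. The objective, noise model, and number of replications are illustrative assumptions.

```python
# Baseline sketch (not the speaker's method): a classical confidence interval for
# an SGD-based estimate via independent replications and Polyak-Ruppert averaging.
# Its justification relies on finite-variance gradient noise; the talk concerns
# inference procedures that remain applicable when that assumption fails.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def averaged_sgd(noise_sampler, n_steps=5_000, step0=1.0, alpha=0.7):
    """Averaged SGD on f(x) = (x - 2)^2 / 2; the true minimizer is x* = 2."""
    x, running_sum = 0.0, 0.0
    for t in range(1, n_steps + 1):
        grad = (x - 2.0) + noise_sampler()
        x -= (step0 / t**alpha) * grad
        running_sum += x
    return running_sum / n_steps   # Polyak-Ruppert average

# R independent replications of the averaged estimate.
R = 20
estimates = np.array([averaged_sgd(lambda: rng.normal(scale=1.0)) for _ in range(R)])

# Standard t-interval across replications (finite-variance baseline).
mean, sem = estimates.mean(), estimates.std(ddof=1) / np.sqrt(R)
half_width = stats.t.ppf(0.975, df=R - 1) * sem
print(f"95% CI for the minimizer: [{mean - half_width:.3f}, {mean + half_width:.3f}]")
```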

I will conclude with a discussion of practical implications and several open directions connecting optimization behavior with inference procedures.

Attachment: 杨文昊.pdf