Andrej Risteski

January 22, 2021

When:
February 16, 2021 @ 12:00 pm – 1:00 pm

Andrej Risteski, Assistant Professor, Machine Learning Department, Carnegie Mellon University

Title: Representational aspects of depth and conditioning in normalizing flows

Abstract: Normalizing flows are among the most popular paradigms in generative modeling, especially for images, primarily because we can efficiently evaluate the likelihood of a data point, allowing likelihood training via gradient descent. However, training normalizing flows comes with difficulties as well: models that produce good samples typically need to be extremely deep, and they are often poorly conditioned.
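The likelihood evaluation mentioned above comes from the change-of-variables formula: if an invertible flow f maps data to a simple base distribution, then log p(x) = log p_Z(f(x)) + log|det Jf(x)|. A minimal sketch of this computation, using a hypothetical one-dimensional affine map in place of a learned flow:

```python
import numpy as np

# Change of variables: log p_X(x) = log p_Z(f(x)) + log|det Jf(x)|.
# Here f(x) = a*x + b stands in for a learned invertible flow (hypothetical example).

def log_standard_normal(z):
    # log-density of the standard normal base distribution
    return -0.5 * (z**2 + np.log(2 * np.pi))

def flow_log_likelihood(x, a, b):
    z = a * x + b                 # forward pass into latent space
    log_det = np.log(np.abs(a))   # log|det Jf| for an elementwise affine map
    return log_standard_normal(z) + log_det

print(flow_log_likelihood(0.0, a=2.0, b=0.0))
```

Because this quantity is an explicit differentiable function of the flow's parameters, it can be maximized directly by gradient descent.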

In our paper, we tackle representational aspects around depth and conditioning of normalizing flows: both for general invertible architectures, and for a particular common architecture, affine couplings. We prove that affine coupling layers suffice to exactly represent a permutation or 1×1 convolution, as used in GLOW, showing that representationally the choice of partition is not a bottleneck for depth. We also show that shallow affine coupling networks are universal approximators in Wasserstein distance if ill-conditioning is allowed, and experimentally investigate related phenomena involving padding. Finally, we show a depth lower bound for general flow architectures with few neurons per layer and bounded Lipschitz constant.
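For readers unfamiliar with the architecture, an affine coupling layer partitions the input, leaves one half unchanged, and applies an affine transformation to the other half whose scale and shift are computed from the fixed half; this makes both inversion and the Jacobian log-determinant cheap. A minimal sketch, with small hypothetical functions standing in for the learned scale and translation networks:

```python
import numpy as np

# Affine coupling layer (the building block of Real NVP / GLOW).
# s and t are learned networks in practice; these stand-ins are hypothetical.

def s(h):  # hypothetical scale network
    return np.tanh(h)

def t(h):  # hypothetical translation network
    return 0.5 * h

def coupling_forward(x):
    x1, x2 = np.split(x, 2)
    y2 = x2 * np.exp(s(x1)) + t(x1)  # only half the coordinates are transformed
    log_det = np.sum(s(x1))          # Jacobian is triangular, so log-det is a sum
    return np.concatenate([x1, y2]), log_det

def coupling_inverse(y):
    y1, y2 = np.split(y, 2)
    x2 = (y2 - t(y1)) * np.exp(-s(y1))  # exact inverse in closed form
    return np.concatenate([y1, x2])

x = np.array([0.3, -1.2, 0.7, 2.0])
y, log_det = coupling_forward(x)
print(np.allclose(coupling_inverse(y), x))  # inversion is exact
```

Since a single layer fixes half the coordinates, many layers with alternating partitions must be composed to transform all of them, which is where the depth questions studied in the paper arise.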

Joint with Fred Koehler and Viraj Mehta.

Bio: Andrej Risteski is an Assistant Professor in the Machine Learning Department at Carnegie Mellon University. He received his PhD from the Computer Science Department at Princeton University under the advisement of Sanjeev Arora. His research interests lie in machine learning and statistics, spanning topics like representation learning, generative models, word embeddings, variational inference and MCMC, and non-convex optimization. The broad goal of his research is a principled and mathematical understanding of statistical and algorithmic phenomena and problems arising in modern machine learning.
