Recent advances in deep learning have led to many disruptive technologies, from automatic speech recognition systems to automated supermarkets to self-driving cars. However, the complex and large-scale nature of deep networks makes them hard to analyze, and therefore they are mostly used as black boxes without formal guarantees on their performance. For example, deep networks provide self-reported confidence scores, but these scores are frequently inaccurate and uncalibrated, and the networks remain prone to large mistakes on rare cases. Moreover, the design of deep networks remains an art and is largely driven by empirical performance on a dataset. As deep learning systems are increasingly employed in our daily lives, it becomes critical to understand whether their predictions satisfy certain desired properties.
Transferable, Hierarchical, Expressive, Optimal, Robust, and Interpretable NETworks (THEORINET) is an NSF-Simons Research Collaboration on the Mathematical and Scientific Foundations of Deep Learning (MoDL) whose goal is to develop a mathematical, statistical and computational framework that helps explain the success of current network architectures, understand their pitfalls, and guide the design of novel architectures with guaranteed confidence, robustness, interpretability, optimality, and transferability. THEORINET will also create new undergraduate and graduate programs in the foundations of data science and organize a series of collaborative research events, including semester research programs and summer schools on the foundations of deep learning.
THEORINET brings together a multidisciplinary team of mathematicians, statisticians, theoretical computer scientists, and electrical and biomedical engineers to develop the mathematical and scientific foundations of deep learning. The team is led by Prof. Rene Vidal, Director of the Mathematical Institute for Data Science (MINDS) at Johns Hopkins University, and includes faculty from Duke University, Johns Hopkins University, Stanford University, the Technical University of Berlin, the University of California at Berkeley, and the University of Pennsylvania.
THEORINET’s research agenda is divided into four main thrusts:
- Analysis: This thrust uses principles from approximation theory, information theory, statistical inference, and robust control to analyze properties of deep neural networks, such as expressivity, interpretability, confidence, fairness and robustness.
- Learning: This thrust uses principles from dynamical systems, non-convex and stochastic optimization, statistical learning theory, adaptive control, and high-dimensional statistics to design and analyze learning algorithms with guaranteed convergence, optimality and generalization properties.
- Design: This thrust uses principles from algebra, geometry, topology, graph theory and optimization to design and learn network architectures that capture algebraic, geometric and graph structures in both the data and the task.
- Transfer: This thrust uses principles from multiscale analysis and modeling, reinforcement learning, and Markov decision processes to design and study representations suitable for learning from and transferring to multiple tasks.
Education and Training
- THEORINET will train a new, diverse STEM workforce with data science skills that are essential for the global competitiveness of the U.S. economy.
- THEORINET will create new undergraduate and graduate research programs focused on the foundations of data science that include a series of collaborative research events.
- THEORINET will support women and members of underrepresented minority populations through an associated NSF-supported Research Experience for Undergraduates program in the foundations of data science.