Dissertations, Theses, and Capstone Projects

Date of Degree

9-2023

Document Type

Dissertation

Degree Name

Ph.D.

Program

Physics

Advisor

David Schwab

Committee Members

Delaram Kahrobaei

Pouyan Ghaemi

Vadim Oganesyan

John Terilla

Subject Categories

Artificial Intelligence and Robotics | Physics | Statistics and Probability

Keywords

Physics of learning, Stochastic thermodynamics, Representation learning, Information Theory, Machine learning

Abstract

This study examines the learning process in the framework of Probabilistic Parametric Models (PPMs) from a thermodynamic perspective. By exploring the core concepts of thermodynamics and their deep connection with information theory, we show how this interdisciplinary approach can contribute to machine learning.

In the first chapter, we establish the link between the learning problem in PPMs and a thermodynamic process by reframing the elements of learning in thermodynamic terms. We introduce novel information-theoretic measures that quantify both the information learned in parameter space and the overall performance of the PPM.

In the second chapter, we introduce a joint set of stochastic degrees of freedom comprising the model's generated samples and its parameters. By studying the stochastic dynamics of the parameters, we develop the concept of the parametric reservoir and show that it can serve as a heat reservoir during learning. Leveraging the Fluctuation Theorem for learning PPMs, we quantify the information flow that occurs throughout the learning process.

In the third chapter, we turn to the thermodynamics of generative models, taking Energy-Based Models (EBMs) as a representative example and analyzing their learning as a quench-and-relaxation process. Notably, we show that entropy production during learning is the primary source of learned information, and that this acquired information is stored in the parametric reservoir, which functions as a memory. Using thermodynamic principles, we devise practical methods to compute these information-theoretic quantities during EBM training.

In the final chapter, we shift our attention to discriminative models, examining two kinds of classifiers: discriminative and generative classifiers. We apply the thermodynamic framework to quantify the information learned by these classifiers. A significant outcome of our study is the derivation of a novel Information Bottleneck (IB) objective for classification, deduced directly from the cross-entropy loss function. This approach offers a potential resolution to certain contentious aspects of applying information-theoretic measures to deterministic neural networks.
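For context, the conventional Information Bottleneck objective (in the style of Tishby and co-workers) for a stochastic representation T of an input X with label Y is often written as below; this is the standard textbook form, not the specific objective the dissertation derives from the cross-entropy loss:

\mathcal{L}_{\mathrm{IB}} \;=\; I(X;T) \;-\; \beta \, I(T;Y),

where I(\cdot;\cdot) denotes mutual information and \beta > 0 trades off compression of the input against retention of label-relevant information.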

This work is embargoed and will be available for download on Tuesday, September 30, 2025

