Date of Degree
Artificial Intelligence and Robotics | Physics | Statistics and Probability
Physics of learning, Stochastic thermodynamics, Representation learning, Information Theory, Machine learning
This study examines the learning process in Parametric Probabilistic Models (PPMs) from a thermodynamic perspective. Drawing on the core concepts of thermodynamics and their close connection with information theory, we show how this interdisciplinary approach can contribute to machine learning. In the first chapter, we establish the link between the learning problem in PPMs and a thermodynamic process by reframing the elements of the learning process in thermodynamic terms. We introduce novel information-theoretic measures that provide insight into both the information learned in the parameter space and the overall performance of the PPM. In the second chapter, we introduce a joint set of stochastic degrees of freedom comprising the model's generated samples and its parameters. We develop the concept of the parametric reservoir by studying the stochastic dynamics of the parameters, and show that it can serve as a heat reservoir during learning. Using the Fluctuation Theorem for learning PPMs, we quantify the information flow that occurs throughout the learning process. In the third chapter, we turn to the thermodynamics of generative models, taking Energy-Based Models (EBMs) as a representative example. We study the thermodynamics of EBMs as a quench-and-relaxation process. We show that entropy production during learning is the primary source of learned information, and that this information is stored in the parametric reservoir, which functions as a memory. Using the principles of thermodynamics, we devise practical methods to compute these information-theoretic quantities during EBM training.
In the final chapter, we shift our attention to discriminative models, examining two kinds of classifiers: discriminative and generative. We apply the thermodynamic framework to quantify the information learned by these classifiers. A significant outcome is a novel Information Bottleneck (IB) objective for classification, derived directly from the cross-entropy loss function. This approach offers a potential resolution to certain contentious issues concerning the use of information-theoretic measures in deterministic neural networks.
Sadat Parsi, Shervin, "Thermodynamics of Learning With Parametric Probabilistic Models" (2023). CUNY Academic Works.
This work is embargoed and will be available for download on Tuesday, September 30, 2025