Ph. D. Thesis information

Contributions to approximate Bayesian inference for Machine Learning

Simón Rodríguez Santana

Supervised by D. Hernández-Lobato, D. Gomez-Ullate Oteiza

Universidad Complutense de Madrid. Madrid (Spain)

January 18th, 2022

Summary:

Machine learning (ML) methods can learn from data and then be used to make predictions on new data instances. However, some of the most popular ML methods cannot provide information about the uncertainty of their predictions, which may be crucial in many applications. The Bayesian framework for ML offers a natural way to formulate many ML methods, and it has the advantage of easily incorporating and reflecting different sources of uncertainty in the final predictive distribution. These sources include uncertainty related to, for example, the data, the chosen model, and its parameters. Moreover, they can be automatically balanced and aggregated using information from the observed data. Despite this advantage, exact Bayesian inference is intractable in most ML methods, and approximate inference techniques have to be used in practice. In this thesis, we propose a collection of methods for approximate inference, with specific applications in some popular approaches in supervised ML. First, we introduce neural networks (NNs), from their most basic concepts to some of their most popular architectures. Gaussian processes (GPs), a simple but important tool in Bayesian regression, are also reviewed. Sparse GPs are presented as a clever solution to improve the scalability of GPs by introducing new parameters: the inducing points. In the second half of the introductory part, we also describe Bayesian inference and extend the NN formulation using a Bayesian approach, which results in an NN model capable of outputting a predictive distribution. We will see why Bayesian inference is intractable in most ML approaches, and also describe sampling-based and optimization-based methods for approximate inference. The use of α-divergences is introduced next, leading to a generalization of certain methods for approximate inference.
Finally, we extend GPs to implicit processes (IPs), a more general class of stochastic processes that provides a flexible framework from which numerous models can be defined. Although promising, current IP-based ML methods fail to exploit their full potential because of the limitations of the approximations required in their formulation...

Spanish layman's summary:

Se proponen extensiones para modelos probabilísticos para disponer de predicciones con distribuciones de probabilidad más expresivas. Se formulan dos métodos Bayesianos, AADM y SIP, para mejorar la inferencia aproximada en redes neuronales y otros sistemas del tipo procesos implícitos.

English layman's summary:

This thesis proposes Bayesian extensions of probabilistic models to obtain more expressive predictive distributions. It details two methods, AADM and SIP, formulated to improve approximate inference in neural networks and in other systems based on implicit processes.

Descriptors: Artificial intelligence, Operations research, Statistics

Citation:
S. Rodríguez-Santana (2022), Contributions to approximate Bayesian inference for Machine Learning. Madrid (Spain).
