LM101-077: How to Choose the Best Model using BIC
Description
In this 77th episode of www.learningmachines101.com, we explain the proper semantic interpretation of the Bayesian Information Criterion (BIC) and emphasize how this interpretation is fundamentally different from that of AIC (Akaike Information Criterion) model selection methods. Briefly, BIC is used to estimate the probability of the training data given the probability model, while AIC is used to estimate out-of-sample prediction error. The probability of the training data given the model is called the "marginal likelihood". Using the marginal likelihood, one can calculate the probability of a model given the training data and then use this analysis to support selecting the most probable model, selecting a model that minimizes expected risk, and performing Bayesian model averaging. The assumptions required for BIC to be a valid approximation of the marginal likelihood are also discussed.
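To make the relationship concrete, here is a minimal sketch in Python (not from the episode) that compares polynomial regression models using BIC. The synthetic quadratic data, the Gaussian-noise polynomial model family, and the candidate degrees are all hypothetical choices for illustration; the sketch assumes equal prior probabilities over the models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: a noisy quadratic relationship.
n = 200
x = rng.uniform(-2, 2, size=n)
y = 1.0 + 0.5 * x - 1.5 * x**2 + rng.normal(scale=0.8, size=n)

def bic_for_degree(d):
    """BIC = k*ln(n) - 2*ln(L_hat) for a degree-d polynomial with Gaussian noise."""
    X = np.vander(x, d + 1)                  # design matrix, d+1 coefficients
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n               # maximum likelihood noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = d + 2                                # d+1 coefficients plus the variance
    return k * np.log(n) - 2 * log_lik

degrees = [1, 2, 3, 4]
bics = np.array([bic_for_degree(d) for d in degrees])

# -BIC/2 approximates the log marginal likelihood, so under equal model
# priors the model posterior is proportional to exp(-BIC/2).
log_post = -0.5 * bics
log_post -= log_post.max()                   # stabilize the exponentials
post = np.exp(log_post) / np.exp(log_post).sum()

for d, b, p in zip(degrees, bics, post):
    print(f"degree {d}: BIC = {b:8.1f}, approx posterior = {p:.3f}")
```

Because -BIC/2 approximates the log marginal likelihood, normalizing the exponentiated scores yields approximate model posteriors, which is what supports the model selection and Bayesian model averaging strategies mentioned above.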
More Episodes
This 86th episode of Learning Machines 101 discusses the problem of assigning probabilities to a possibly infinite set of observed outcomes in a space-time continuum which corresponds to our physical world. The machine learning algorithm uses information about the frequency of environmental...
Published 07/20/21
This 85th episode of Learning Machines 101 discusses formal convergence guarantees for a broad class of machine learning algorithms designed to minimize smooth non-convex objective functions using batch learning methods. Simple mathematical formulas are presented based upon research from the late...
Published 05/21/21