Bayesian ML, Overfitting, and Error Metrics

Statistics · Medium · Free problem
You're interviewing for a quant research role and the interviewer says: "Let's talk broad ML. Walk me through three things."

(a) How do Bayesian methods apply in machine learning? What's the connection between Bayesian priors and regularization? When would you prefer a full Bayesian approach over a point estimate?

(b) You've built a model that performs beautifully on the training set but falls apart out of sample. Walk me through your toolkit for preventing overfitting: not just a list of techniques, but when and why you'd pick each one.

(c) You need to choose an error metric for a new model. What are the main families of error measures for regression and classification, and how does the choice of metric change depending on the problem's cost structure?
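
As a concrete handle on the prior/regularization link asked about in part (a), here is a minimal sketch, assuming a linear model with Gaussian noise of known standard deviation `sigma` and an independent zero-mean Gaussian prior with standard deviation `tau` on each weight. Under those assumptions the posterior mean should coincide with the ridge estimate at penalty `lam = sigma**2 / tau**2`. The function names, the synthetic data, and the parameter values are all illustrative choices, not part of the problem.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Point estimate: minimize ||y - Xw||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def gaussian_posterior(X, y, sigma, tau):
    """Posterior over w for y = Xw + N(0, sigma^2) noise and prior w ~ N(0, tau^2 I).

    Returns the posterior mean and covariance.
    """
    d = X.shape[1]
    precision = X.T @ X / sigma**2 + np.eye(d) / tau**2
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / sigma**2
    return mean, cov


# Synthetic data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n, d = 50, 3
sigma, tau = 0.5, 1.0
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + sigma * rng.normal(size=n)

w_ridge = ridge_fit(X, y, lam=sigma**2 / tau**2)
post_mean, post_cov = gaussian_posterior(X, y, sigma, tau)

# The point estimate and the posterior mean agree up to floating-point error.
print(np.allclose(w_ridge, post_mean))

# The full Bayesian answer also carries a per-weight uncertainty.
print(np.sqrt(np.diag(post_cov)))
```

The ridge fit returns only the weights, while the posterior also returns a covariance. That extra uncertainty estimate is one reason to prefer the full Bayesian treatment when data are scarce or when downstream decisions depend on how confident the model is.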
