Gradient Boosting vs. Random Forests, Batch Normalization, and SGD Momentum
Explain the following machine learning concepts at the level of a quant or engineer who understands the basics but wants the real intuition:
**(a)** What is the key difference between Gradient Boosting and Random Forests? When would you choose one over the other?
**(b)** What is batch normalization in deep learning, and why does it help training?
**(c)** What is momentum in stochastic gradient descent, and why does it speed up convergence?
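As a concrete anchor for part (c), here is a minimal sketch of the heavy-ball momentum update on a toy problem. The objective, step size, momentum coefficient, and the helper name `steps_to_converge` are all illustrative assumptions, not part of the original question. It also uses the exact gradient rather than a noisy minibatch one, so it shows only the conditioning effect of momentum, not its noise-averaging effect.

```python
def steps_to_converge(lr, beta, tol=1e-3, max_steps=10_000):
    """Minimise the toy objective f(x, y) = 0.5 * (x**2 + 100 * y**2) from (1, 1).

    beta = 0 is plain gradient descent; beta > 0 adds heavy-ball momentum.
    Returns the number of updates until the iterate's norm drops below tol,
    or max_steps if that never happens.
    """
    x, y = 1.0, 1.0
    vx, vy = 0.0, 0.0  # velocity: exponentially decaying sum of past gradients
    for step in range(1, max_steps + 1):
        gx, gy = x, 100.0 * y  # exact gradient of the toy objective
        vx = beta * vx + gx    # accumulate gradient history
        vy = beta * vy + gy
        x -= lr * vx           # step along the velocity, not the raw gradient
        y -= lr * vy
        if (x * x + y * y) ** 0.5 < tol:
            return step
    return max_steps


# Same learning rate for both runs; only the momentum coefficient differs.
plain_steps = steps_to_converge(lr=0.01, beta=0.0)
momentum_steps = steps_to_converge(lr=0.01, beta=0.9)
print(plain_steps, momentum_steps)
```

The learning rate is capped by the steep direction (curvature 100), which leaves plain gradient descent crawling along the shallow direction (curvature 1). With momentum, consistent gradients along the shallow direction accumulate in the velocity, so under these assumed settings the momentum run should need noticeably fewer updates.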