Lecture 18: Deterministic, Unconstrained Optimization. The trade-off of approximation versus time. Newton's method: motivation from Taylor expansion; as gradient descent with adaptive step-size; pros and cons. Coordinate descent: a cycle of one-dimensional optimizations in place of a single multivariate optimization. Nelder-Mead/simplex method for derivative-free optimization. Peculiarities of optimizing statistical functionals: don't bother optimizing much within the margin of error; asymptotic calculations of that margin, using Taylor expansion and the rules for adding and multiplying variances.
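A minimal sketch of two of the ideas above, on toy objective functions of my own choosing (not taken from the lecture): one-dimensional Newton's method, read as gradient descent whose step size adapts to the local curvature, and coordinate descent on a two-variable quadratic where each one-dimensional minimization has a closed form.

```python
# 1-D Newton's method for minimization, viewed as gradient descent with
# an adaptive step size: x <- x - f'(x)/f''(x), i.e. step size 1/f''(x).
def newton_1d(fprime, fprime2, x, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        step = fprime(x) / fprime2(x)  # gradient scaled by inverse curvature
        x = x - step
        if abs(step) < tol:
            break
    return x

# Toy example: minimize f(x) = x^4 - 3x^3 + 2, whose minimum is at x = 9/4.
x_star = newton_1d(lambda x: 4 * x**3 - 9 * x**2,
                   lambda x: 12 * x**2 - 18 * x,
                   x=3.0)

# Coordinate descent: cycle through the coordinates, exactly minimizing
# over each one while holding the others fixed.  For the toy quadratic
# f(x, y) = (x-1)^2 + 2(y+1/2)^2 + x*y, each coordinate-wise minimizer
# is available in closed form (set the partial derivative to zero).
def coord_descent(sweeps=40):
    x, y = 0.0, 0.0
    for _ in range(sweeps):
        x = 1.0 - y / 2.0    # argmin over x with y held fixed
        y = -0.5 - x / 4.0   # argmin over y with x held fixed
    return x, y

x_cd, y_cd = coord_descent()  # converges to (10/7, -6/7)
```

Because the quadratic has a cross term, neither coordinate update alone finds the joint minimum; the sweeps contract toward it geometrically, which is the usual picture for coordinate descent on a well-conditioned problem.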
Optional reading: Francis Spufford, Red Plenty (cf.); Léon Bottou and Olivier Bousquet, "The Tradeoffs of Large Scale Learning"; Herbert Simon, The Sciences of the Artificial, especially chapters 5 and 8.
Posted at October 28, 2013 10:30 | permanent link