Conjugate gradients minimizer
Gareth.McCaughan at pobox.com
Thu Mar 27 23:26:13 CET 2003
Carl Banks wrote:
> > I concur. If it turns out that the overhead from BFGS
> > *is* too much for some reason, you might also consider
> > something intermediate between conjugate gradient and
> > BFGS, such as L-BFGS.
> > But it's not true that BFGS requires inverting a matrix
> > at each step. The matrix you maintain is an approximation
> > to the *inverse* of the Hessian, and no inversion is
> > necessary.
> Oh, silly me. I probably said that but just intended to mean "you
> have to store an NxN matrix." I never heard of L-BFGS, though. I
> admit I got my optimization knowledge from 20-year-old texts.
Nothing very wrong with that; some 20-year-old optimization
books are very good :-). If you're in the market for something
newer, I can highly recommend Nocedal & Wright's "Numerical
Optimization", which (inter alia) covers L-BFGS. (Not
surprising, since I think Nocedal invented it.)
The idea of L-BFGS is that instead of storing
the approximation to the inverse Hessian explicitly, you
store some information from each of the last <some-number>
iterations, which you use to reconstruct what the approximate
inverse Hessian would have been if those were the only
iterations that had ever happened. Choosing <some-number>
carefully lets you interpolate from a scheme very like
conjugate gradient (if you keep only a couple of iterations'
worth of information) through to something equivalent to
full BFGS (if you store as many pairs as your space has
dimensions).
Er, I'm not certain it's equivalent. If it's not, it's
probably at any rate about equally good. My books on
optimization are all at work, and I'm at home :-).
Gareth McCaughan
Gareth.McCaughan at pobox.com
.sig under construc