Conjugate gradients minimizer

Carl Banks imbosol-1048647475 at aerojockey.com
Tue Mar 25 23:09:35 EST 2003


Andrew Walkingshaw wrote:
> The numerical heavy lifting, as far as energy and force (==gradient)
> evaluation goes,  is going to be done in about 100,000 lines of
> F90 (a popular density functional theory code), and we're talking of the
> order of minutes per energy evaluation in the optimistic case; my hope
> was that Python wouldn't therefore be the rate-limiting step in practice.

Ah.  That's a good point.  Often in optimization, the objective
function is much more time-consuming than the optimizer itself, as it
is in this case.  I think of conjugate gradient as the kind of method
you use on fairly simple objective functions with an insanely large
number of variables, where the optimizer's own bookkeeping is a big
chunk of the total time.  That's what it's good for.

In your case, however, you probably don't need to worry much about
the inefficiencies of Python.


But let me recommend something else.  You have only a thousand
variables, which is really not a lot.  Conjugate gradient is best
suited for systems of millions of variables, because it doesn't have
to store a matrix.  It doesn't scale down well.
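
To make that concrete, here is roughly what a nonlinear conjugate
gradient loop looks like (a Fletcher-Reeves sketch using NumPy-style
arrays; the energy and gradient callables and the crude backtracking
line search are stand-ins for illustration, not your DFT interface):

    import numpy as np

    def conjugate_gradient(energy, gradient, x0, tol=1e-6, max_iter=1000):
        # Fletcher-Reeves nonlinear CG with a crude backtracking line search
        x = np.asarray(x0, dtype=float)
        g = gradient(x)
        d = -g                                   # start along steepest descent
        for _ in range(max_iter):
            if np.linalg.norm(g) < tol:
                break
            # halve the step until the energy actually drops (placeholder)
            alpha, e0 = 1.0, energy(x)
            while energy(x + alpha * d) > e0 and alpha > 1e-12:
                alpha *= 0.5
            x = x + alpha * d
            g_new = gradient(x)
            beta = g_new.dot(g_new) / g.dot(g)   # Fletcher-Reeves coefficient
            d = -g_new + beta * d                # only vectors kept, no matrix
            g = g_new
        return x

Note that the whole state is a handful of length-n vectors, which is
exactly why it scales to millions of variables.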

So, unless you have reason to believe conjugate gradient is especially
suited to this problem (indeed, I have heard of an atomic structure
problem where it was), or you plan to scale up to DNA crystals someday,
use the BFGS method instead.

BFGS is far more efficient than conjugate gradient for most systems.
The only catch is that it carries a dense n-by-n (inverse) Hessian
approximation and does matrix work at every step, which you really
don't want to do in pure Python.  It's also much more complicated.
But it solves almost everything faster.
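
For the curious, the heart of BFGS is a rank-two update of that n-by-n
inverse-Hessian approximation at each step.  Roughly like this sketch
(same placeholder line search as above; a toy, not a production
implementation):

    import numpy as np

    def bfgs(energy, gradient, x0, tol=1e-6, max_iter=500):
        # minimal BFGS: keep a dense inverse-Hessian approximation H
        x = np.asarray(x0, dtype=float)
        n = x.size
        H = np.eye(n)                      # the n-by-n matrix CG never stores
        g = gradient(x)
        for _ in range(max_iter):
            if np.linalg.norm(g) < tol:
                break
            d = -H.dot(g)                  # quasi-Newton search direction
            alpha, e0 = 1.0, energy(x)
            while energy(x + alpha * d) > e0 and alpha > 1e-12:
                alpha *= 0.5
            s = alpha * d
            x = x + s
            g_new = gradient(x)
            y = g_new - g
            ys = y.dot(s)
            if abs(ys) > 1e-12:            # skip update if curvature is bad
                rho = 1.0 / ys
                I = np.eye(n)
                # rank-two update of the inverse Hessian
                H = ((I - rho * np.outer(s, y)).dot(H).dot(I - rho * np.outer(y, s))
                     + rho * np.outer(s, s))
            g = g_new
        return x

With only a thousand variables that matrix is just 1000x1000, so the
per-step cost is modest.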


And, if you can calculate a Hessian matrix without much difficulty
(which I kind of doubt in your case), then you should use good ol'
Newton's method with restricted steps.
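
By restricted steps I mean something along these lines: take the full
Newton step when it lowers the energy, and shrink it when it doesn't.
A very rough sketch (the hessian callable returning the full
second-derivative matrix is the assumption that probably kills it for
your case):

    import numpy as np

    def newton_restricted(energy, gradient, hessian, x0, tol=1e-6, max_iter=100):
        # Newton's method with a simple step-length restriction
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = gradient(x)
            if np.linalg.norm(g) < tol:
                break
            step = np.linalg.solve(hessian(x), -g)   # full Newton step
            # shrink the step until the energy actually decreases
            alpha, e0 = 1.0, energy(x)
            while energy(x + alpha * step) > e0 and alpha > 1e-12:
                alpha *= 0.5
            x = x + alpha * step
        return x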


> What I'm doing is a list comprehension, which isn't exactly fast:
> 
> [n*x for x in self.components], which to my understanding basically
> unrolls to:
> 
> for inc in range(len(self.components)):
>        newarray.append(self.components[inc]*n)
> self.components = newarray
> 
> - but, as I said, my hope is that this isn't the slow step in any case.
> (If it turns out that it *is*, I'll probably give up and recode the
> entire thing in F90...)

According to conventional wisdom, map with a builtin function is
faster than a list comprehension, which in turn is a little faster
than map with a lambda (the per-element function call costs you).
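
If you want to check that on your own hardware, a quick timeit
comparison is easy enough (the 1000-element list and the scale factor
n below are made up for illustration):

    import timeit

    setup = "n = 2.5; components = [float(i) for i in range(1000)]"

    tests = [
        ("map+builtin", "list(map(n.__mul__, components))"),
        ("map+lambda",  "list(map(lambda x: n*x, components))"),
        ("list comp",   "[n*x for x in components]"),
    ]

    for name, stmt in tests:
        t = timeit.timeit(stmt, setup=setup, number=10000)
        print("%-12s %.3f s" % (name, t))

(n.__mul__ is the bound multiply method of the float, playing the role
of the builtin function here.)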


> Indeed; I'm a newbie (first-year PhD) materials physicist, and learning
> Fortran is moving extremely rapidly up my list of priorities.

Don't listen to anyone who says Fortran's dead.  People still use it
for what it's good at: numerical work.

(Too bad people don't use Perl only for what it's good at: ten-line
throwaway scripts.)


>> And having worked with some of them, I can tell you they do know what
>> they're doing.  (But God help them if they want to write regular
>> code.)
> 
> This is my experience too: the other problem seems to be that a lot of
> numerical codes are quick-hack piled on quick-hack, and as such they
> rapidly get to the point where no mortal mind would be capable of keeping
> all the corner cases in memory at once...

You're right.  I've seen some hideously bad numerical code, and also
code that was crystal clear despite having many hand optimizations.
The former was more common.

-- 
CARL BANKS



