[SciPy-User] Sometimes fmin_l_bfgs_b tests NaN parameters and then fails to converge

Yury V. Zaytsev yury at shurup.com
Mon Jan 3 11:47:08 EST 2011


Hi!

On Mon, 2011-01-03 at 09:46 -0500, josef.pktd at gmail.com wrote:

> In cases like this I often use smooth penalization when the optimizer
>  gets close to the boundary, or reparameterize. Anne Archibald has
>  several times written on this on the mailing lists.

I will search the list for more specific examples, thanks!
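
Just to check that I understand the idea, here is roughly what I have
in mind: a toy objective that is only defined for positive parameters,
and a wrapper that clips the parameters back into the feasible region
and adds a smooth quadratic penalty (the clipping value, the penalty
weight and the objective itself are of course made up):

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def objective(x):
    # Toy objective: only defined for x > 0, returns NaN otherwise.
    return np.sum(x * np.log(x))

def penalized(x, eps=1e-8, weight=1e3):
    # Clip the parameters back into the feasible region before
    # evaluating, and add a smooth quadratic penalty that grows
    # the further the optimizer strays below the lower bound.
    x_safe = np.clip(x, eps, None)
    penalty = weight * np.sum(np.minimum(x - eps, 0.0) ** 2)
    return objective(x_safe) + penalty

x0 = np.ones(5)
x_opt, f_opt, info = fmin_l_bfgs_b(penalized, x0, approx_grad=True)

Is that more or less what you mean by smooth penalization?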

> Another alternative is to use for example nlopt
> http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms
> From the description, the optimizers have been modified to not
> evaluate outside of the bounds, in contrast to the scipy optimizers.

Many thanks, I will have a look. This is the first time I have heard of
nlopt; it is probably worth trying out!
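
If I read the API description correctly, a bounded L-BFGS run there
would look roughly like the sketch below (untested, I am only going by
the documentation, so the exact calls may well be off):

import numpy as np
import nlopt

def f(x, grad):
    # NLopt expects the gradient array to be filled in place;
    # it is empty for derivative-free algorithms.
    if grad.size > 0:
        grad[:] = 2.0 * (x - 1.0)
    return float(np.sum((x - 1.0) ** 2))

n = 5
opt = nlopt.opt(nlopt.LD_LBFGS, n)
opt.set_lower_bounds(np.zeros(n))       # bounds, never evaluated outside
opt.set_upper_bounds(10.0 * np.ones(n))
opt.set_min_objective(f)
opt.set_xtol_rel(1e-6)
x_opt = opt.optimize(0.5 * np.ones(n))
print(x_opt, opt.last_optimum_value())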

> I don't know what openopt does.

Well, openopt does provide the same methods as SciPy; beyond that,
there are a few custom algorithms for bounded optimization (gsubg and
ralg) and connectors to a few more Fortran libraries, but the
documentation really leaves much to be desired. Also, the interface
didn't seem very convenient to me, and the installation is a bit
complicated, etc.

In particular, the descriptions of the optimizers all claim to do the
same thing better than any other, but there is no comprehensive
comparison or highlighting of specific features, such as whether they
evaluate outside of the bounds, as you mentioned :-(

So, not knowing the specifics of the algorithms, their limits,
advantages and disadvantages, you are pretty much left in the dark,
just trying things out in the hope that something will finally work...

> I have not used either of these two packages. I like fmin, it might be
> slower but it is robust.

I don't have anything against fmin, but first, I am not aware of a
bounded version of fmin being available in SciPy, and second, when I
tried out the unbounded version it really felt slow as hell... And when
you have thousands of parameters and thousands of concurrent
optimizations running, it really doesn't feel inspiring :-(

Actually, I tried it again and it really does seem to be much more
robust than BFGS, just as you claim. At least it almost converged to
the same point starting from two completely distinct points in
parameter space, whereas BFGS came up with completely different
results.

If only it were not THAT slow :-( ... Maybe I should try BFGS and then
use the resulting vector as a starting point for the downhill simplex
(DS). Why wouldn't some optimization genius write a hybrid DS that
uses gradients as additional guidance?..
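
Concretely, I mean something along these lines (with a throwaway
Rosenbrock-style function standing in for my real objective, and the
tolerances pulled out of thin air):

import numpy as np
from scipy.optimize import fmin, fmin_l_bfgs_b

def objective(x):
    # Stand-in for my real objective (Rosenbrock-style valley).
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2
                  + (1.0 - x[:-1]) ** 2)

x0 = np.zeros(10)

# A (fast) L-BFGS-B pass first, to get into the right neighbourhood...
x_bfgs, f_bfgs, info = fmin_l_bfgs_b(objective, x0, approx_grad=True)

# ...then polish with the slower but more robust downhill simplex,
# starting from the L-BFGS-B result.
x_final = fmin(objective, x_bfgs, xtol=1e-8, ftol=1e-8)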

Thanks!
 
-- 
Sincerely yours,
Yury V. Zaytsev



