[Numpy-discussion] "import numpy" performance

Chris Barker chris.barker at noaa.gov
Tue Jul 3 12:58:03 EDT 2012


On Mon, Jul 2, 2012 at 12:17 PM, Andrew Dalke <dalke at dalkescientific.com> wrote:
> In this email I propose a few changes which I think are minor
> and which don't really affect the external NumPy API but which
> I think could improve the "import numpy" performance by at
> least 40%.

+1 -- I think I remember that thread -- at the time, I was
experiencing some really, really slow import times myself -- it turned
out to be something really weird with my system (though I don't
remember exactly what), but numpy still is too big an import.
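
For anyone who wants to put a number on "too big an import", here is a
minimal sketch that times a fresh interpreter with and without the
import. The helper names (best_run_time, import_cost) are made up for
illustration, and the figures will vary a lot with disk cache and
system state, so treat the result as a rough indication only.

```python
import subprocess
import sys
import time


def best_run_time(code, repeats=3):
    """Best-of-N wall-clock seconds to run `code` in a fresh interpreter."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def import_cost(module_name, repeats=3):
    """Return (baseline, with_import) timings in seconds.

    baseline    -- bare interpreter startup
    with_import -- startup plus "import <module_name>"
    """
    baseline = best_run_time("pass", repeats)
    with_import = best_run_time("import " + module_name, repeats)
    return baseline, with_import
```

Calling import_cost("numpy") on a machine with numpy installed gives
the two timings; the difference is roughly what the import itself
costs on top of interpreter startup.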

Another note -- I ship stuff with py2exe and friends a fair bit --
numpy's "Import a whole bunch of stuff you may well not be using"
approach means I have to include all that stuff, or hack the heck out
of numpy -- not ideal.
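
A rough way to see how much gets dragged in is to list the modules
that appear in sys.modules after the import, in a fresh interpreter.
This is only a sketch: modules_pulled_in is a hypothetical helper, and
what a given freezing tool actually bundles may differ from this list.

```python
import subprocess
import sys

# Script run in a child interpreter: snapshot sys.modules before and
# after the import, then print the newly loaded module names.
_PROBE = (
    "import sys\n"
    "before = set(sys.modules)\n"
    "import {name}\n"
    "print('\\n'.join(sorted(set(sys.modules) - before)))\n"
)


def modules_pulled_in(module_name):
    """Return the names of modules loaded as a side effect of the import."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE.format(name=module_name)],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.split()
```

len(modules_pulled_in("numpy")) then gives a count of what a single
"import numpy" loads on a given installation.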

> 1) remove "add_newdocs" and put the docstrings in the C code
>  'add_newdocs' still needs to be there,
>
> The code says:
>
> # This is only meant to add docs to objects defined in C-extension modules.
> # The purpose is to allow easier editing of the docstrings without
> # requiring a re-compile.

+1 -- isn't it better for the docs to be with the code, anyway?

> 2) Don't optimistically assume that all submodules are
> needed. For example, some current code uses
>
>>>> import numpy
>>>> numpy.fft.ifft
> <function ifft at 0x10199f578>

+1 see above -- really, what fraction of code uses fft and polynomial, and ...
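
For reference, Python itself does not load a package's submodules
unless the package's __init__ imports them (or the user does). The
sketch below builds a throwaway package on disk to show that; the
names demo_core and extras are invented for the example and have
nothing to do with numpy's actual layout.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Write a tiny package whose __init__ does NOT import its submodule."""
    pkg_dir = os.path.join(root, "demo_core")
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("VERSION = '0.1'\n")
    with open(os.path.join(pkg_dir, "extras.py"), "w") as f:
        f.write("def helper(x):\n    return x\n")


def demo():
    """Return (loaded_before, attr_before, loaded_after, attr_after)."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            core = importlib.import_module("demo_core")
            loaded_before = "demo_core.extras" in sys.modules
            attr_before = hasattr(core, "extras")
            # The explicit import is what loads the submodule.
            importlib.import_module("demo_core.extras")
            loaded_after = "demo_core.extras" in sys.modules
            attr_after = hasattr(core, "extras")
        finally:
            # Clean up so the demo leaves no trace in this interpreter.
            sys.path.remove(root)
            for name in ("demo_core.extras", "demo_core"):
                sys.modules.pop(name, None)
    return loaded_before, attr_before, loaded_after, attr_after
```

In this sketch the submodule is absent until it is explicitly
imported, which is the behaviour a slimmed-down package would give.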

"namespaces are one honking great idea"

I appreciate the legacy, and the ease of use at the interpreter, but
it sure would be nice to clean this up -- maybe keep the legacy by
having a new import:

import just_numpy as np

that would import the core stuff, and offer the "extra" packages as
specific imports -- ideally, we'd deprecate the old way, and recommend
the explicit importing for the future, and some day have "numpy" and
"numpy_plus". (Kind of like pylab, I suppose)

lazy importing may work OK, too, though more awkward for py2exe and
friends, and perhaps a bit "magic" for my taste.
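
To make the lazy option concrete, here is a minimal sketch of one way
it could be done: a stand-in object that imports the real module on
first attribute access. LazyModule is a hypothetical class written for
this example, not something numpy provides, and real implementations
would need more care (repr, dir(), tab completion, thread safety).

```python
import importlib


class LazyModule:
    """Stand-in that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None  # filled in on first use

    def __getattr__(self, attr):
        # __getattr__ is only called when normal lookup fails, so
        # _name and _module (set in __init__) never reach this point.
        if self._module is None:
            # Import by string: tools that scan source for import
            # statements may not notice this dependency.
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Usage with a stdlib module: nothing is imported at this line...
lazy_json = LazyModule("json")
# ...the import happens here, on first attribute access.
encoded = lazy_json.dumps([1, 2])
```

The import-by-string is the part that tends to be awkward for
freezing tools, and the deferred side effect is the "magic" part.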

> 3) Especially: don't always import 'numpy.testing'

+1

> I have not worried about numpy import performance for
> 4 years. While I have been developing scientific software
> for 20 years, and in Python for 15 years, it has been
> in areas of biology and chemistry which don't use arrays.

remarkable -- I use arrays for everything! Most of them are not the
classic big arrays you process with LAPACK-type stuff ;-)

>   yeah, it's just using the homogenous array most of the time.

exactly -- I know Travis says: "if you're going to use numpy arrays,
use numpy", but they really are pretty darn handy even if you just use
them as containers.

Ben Root wrote:

> Not sure how this would impact projects like ipython that does tab-completion support,
> but I know that that would drive me nuts in my basic tab-completion setup I have for
> my regular python terminal.  Of course, in the grand scheme of things, that really
> isn't all that important, I don't think.

I do think it's important to support easy interactive use, IPython,
etc -- with nice tab completion, easy access to docstrings, etc. But
it should also be possible to not have all that where it isn't required
-- hence my "import numpy_plus" type proposal.

I never did get why the polynomial stuff was added to core numpy....

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


