[Numpy-discussion] import overhead of numpy.testing
Andrew Dalke
dalke at dalkescientific.com
Wed Aug 7 09:26:51 EDT 2013
On Aug 7, 2013, at 4:37 AM, Charles R Harris wrote:
> I haven't forgotten and intend to look at it before the next release.
Thanks!
On a related topic, last night I looked into deferring the
import for numpy.testing. This is the only other big place
where numpy's import overhead might be reduced without
breaking backwards compatibility.
I made a _DeferredTester [1] and replaced the 10 __init__.py
uses of:
from .testing import Tester
test = Tester().test
bench = Tester().bench
to use the _DeferredTester instead.
With that in place, the "import numpy" time (best of 5)
goes from 0.0796 seconds to 0.0741 seconds, a 7% speedup.
That 0.0796 includes the 0.02 seconds spent on the exec()
of the polynomial templates. Excluding that 0.02 seconds
from the baseline, the speedup would be 10%. [2]
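For context, the measurement can be reproduced with a harness along these lines (a sketch, not my exact script; each sample needs a fresh interpreter since imports are cached in-process, and absolute numbers will differ by machine):

```python
import subprocess
import sys
import time

def best_import_time(module, repeats=5):
    """Best-of-N wall-clock time for 'import <module>' run in a
    fresh interpreter (illustrative harness; numbers vary)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", "import " + module],
                       check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

Note that this includes interpreter startup, so subtracting a
"python -c pass" baseline isolates the import cost itself.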
Would this sort of approach be acceptable to NumPy?
If so, I could improve the patch to make it be acceptable.
The outstanding code issues to resolve before making
a pull request are:
1) The current wrapper uses *args and **kwargs to forward
any test() and bench() calls to the actual function.
As a result, parameter introspection doesn't work.
2) The current wrapper doesn't have a __doc__
3) The only way to fix 1) and 2) is to copy the signatures
and docstring from the actual Tester() implementation,
which causes code/docstring duplication.
4) I don't know if it's appropriate to add my _DeferredTester
to numpy.core vs. some other place in the code base.
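Issues 1) and 2) might be partially worked around by back-filling the
wrapper's __doc__ from the real callable on first use, keeping the
import deferred. Here is a sketch using a stdlib function in place of
Tester (the deferred() helper name is mine, not part of the patch); it
still leaves signature introspection unsolved:

```python
import importlib

def deferred(module_name, attr):
    """Wrap module_name.attr so the import happens only on first
    call; the target's __doc__ is copied onto the wrapper then."""
    def wrapper(*args, **kwargs):
        func = getattr(importlib.import_module(module_name), attr)
        wrapper.__doc__ = func.__doc__  # lazily fixes issue 2)
        return func(*args, **kwargs)
    return wrapper

sqrt = deferred("math", "sqrt")
assert sqrt.__doc__ is None     # nothing imported yet
assert sqrt(9.0) == 3.0         # math is imported here
assert "square root" in sqrt.__doc__
```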
If you want to see the patch, I followed the NumPy instructions at
http://docs.scipy.org/doc/numpy/dev/gitwash/git_development.html
and made an experimental fork at
https://github.com/adalke/numpy/tree/no-default-tester-import
I have no git/github experience beyond what I did for this
patch, so let me know if there are problems in what I did.
Cheers,
Andrew
dalke at dalkescientific.com
[1]
Inside of numpy/core/__init__.py I added
class _DeferredTester(object):
    def __init__(self, package_filename):
        import os
        self._package_path = os.path.dirname(package_filename)

    def test(self, *args, **kwargs):
        from ..testing import Tester
        return Tester(self._package_path).test(*args, **kwargs)

    def bench(self, *args, **kwargs):
        from ..testing import Tester
        return Tester(self._package_path).bench(*args, **kwargs)

    def get_test_and_bench(self):
        return self.test, self.bench
It's used like this:
from ..core import _DeferredTester
test, bench = _DeferredTester(__file__).get_test_and_bench()
That's admittedly ugly. It could also be:
test = _DeferredTester(__file__).test
bench = _DeferredTester(__file__).bench
[2]
Is an import speedup (on my laptop) of 0.0055 seconds
important? I obviously think so.
This time affects everyone who uses NumPy, even if
incidentally, as in my case. I don't actually use
NumPy, but I use a chemistry toolkit with a C++
core that imports NumPy in order to have access to
its array data structure, even though I don't make
use of that ability.
If there are 1,000,000 "import numpy"s per day, then
that's 90 minutes of savings per day.
Yes, I could switch to an SSD and the overhead
would decrease. But on the other hand, I've also worked
on a networked file system for a cluster where
"python -c pass" took over a second to run, because
Lustre is lousy with lots of metadata requests. (See
http://www.nas.nasa.gov/hecc/support/kb/Lustre-Best-Practices_226.html )
In that case we switched to a zip importer, but you
get my point that the 0.0055 seconds is also a function
of the filesystem time, and that performance varies.
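For completeness, the zip-importer workaround looks roughly like this
(the package name and paths are made up for the demo): a pure-Python
package inside a single zip means one file open for the importer
instead of per-module metadata lookups.

```python
import os
import sys
import tempfile
import zipfile

# Build a tiny one-module package inside a zip archive.
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "mypkg.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("mypkg/__init__.py", "x = 1\n")

# Putting the zip on sys.path lets the built-in zipimport handle it.
sys.path.insert(0, zip_path)
import mypkg
print(mypkg.x)  # -> 1
```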