[SciPy-Dev] Numba as a dependency for SciPy?

Ralf Gommers ralf.gommers at gmail.com
Mon Mar 5 23:06:11 EST 2018


Hi all,

Goal of this email: start a discussion to decide whether we'd be okay with
relying on Numba as a dependency, now or in 1-2 years' time.

Context: in https://github.com/pydata/sparse/issues/126 a discussion is
ongoing about whether to adopt Cython or Numba, with Numba being preferred
by the majority. That `sparse` package is meant to provide sparse *arrays*
that down the line should either replace our current sparse *matrices*
or at least be integrated in scipy.sparse in addition to them. See
https://github.com/scipy/scipy/issues/8162 and
https://github.com/hameerabbasi/sparse-ndarray-protocols for more details
on that.

Also related is the question from Serge Guelton some weeks ago about
whether we'd want to rely on Pythran:
https://mail.python.org/pipermail/scipy-dev/2018-January/022325.html

On that Pythran thread I commented that we'd want to take these aspects
into account:
- portability
- performance
- maturity
- maintenance status (active devs, how quickly bugs get fixed after a
release with an issue)
- ease of use (@jit vs. Pythran comments vs. translate to .pyx syntax)
- size of generated binaries
- templating support for multiple dtypes
- debugging and optimization experience/tools
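
To make the ease-of-use and multiple-dtype points concrete, here is a
minimal sketch (not taken from any of the linked discussions). The function
name `sum_of_squares` is made up for illustration, and the try/except
fallback is only there so the snippet also runs where Numba isn't
installed; it is not a proposal for how SciPy would handle the dependency.

```python
import numpy as np

try:
    from numba import njit  # Numba is needed at run time, not just at build time
except ImportError:
    def njit(func):
        # Fallback for this sketch only: leave the function as plain Python.
        return func

# With Pythran, the same pure-Python source would instead carry an export
# comment and be compiled ahead of time, roughly like:
#   #pythran export sum_of_squares(float64[])
# With Cython, the loop would be moved into a .pyx file with typed variables.

@njit
def sum_of_squares(x):
    # Plain loop over a 1-D array; Numba compiles it on first call.
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i] * x[i]
    return total

# The same source handles different dtypes; Numba compiles a separate
# specialization for each input type it sees.
result_f64 = sum_of_squares(np.arange(4, dtype=np.float64))
result_i32 = sum_of_squares(np.arange(4, dtype=np.int32))
```

In Cython, covering several dtypes with one function would typically mean
using fused types or a templating step, which is the comparison the
"templating support" item is about.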

Debugging is one aspect where I'd say Numba is still worse than
Cython; however, that's being addressed as we speak:
https://github.com/numba/numba/issues/2788

One thing I missed in the above list is dependencies: while our use of
Cython only adds a build-time dependency, Numba would add a run-time
dependency. Given that binary wheels and conda packages are available for
all major platforms, that's not a showstopper, but it matters.

Overall I'd say that:
- Numba is better than Cython at: performance, ease of use, size of
generated binaries, and templating support for multiple dtypes. Possibly
also maintenance status right now.
- Numba and Cython are about equally good at portability (I think; there's
not much data about exotic platforms for Numba).
- Cython is better than Numba at: maturity, debugging (though probably not
for much longer), dependencies.

I'm usually pretty conservative in these things, but considering the above
I'm leaning towards saying use of Numba should be allowed in the future.
The added run-time dependency is the one major downside that's going to
stay; however, compared to our Fortran headaches, that's a relatively small
issue.

Thoughts?

Cheers,
Ralf