[Numpy-discussion] Should I use pip install numpy in linux?

Julian Taylor jtaylor.debian at googlemail.com
Sat Jan 9 07:12:11 EST 2016


On 09.01.2016 04:38, Nathaniel Smith wrote:
> On Fri, Jan 8, 2016 at 7:17 PM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:
>> Doesn't building on CentOS 5 also mean using a quite old version of gcc?
> 
> Yes. IIRC CentOS 5 ships with gcc 4.4, and you can bump that up to gcc
> 4.8 by using the Redhat Developer Toolset release (which is gcc +
> special backport libraries to let it generate RHEL5/CentOS5-compatible
> binaries). (I might have one or both of those version numbers slightly
> wrong.)
> 
>> I've never tested this, but I've seen claims on the anaconda mailing list of
>> ~25% slowdowns compared to building from source or using system packages,
>> which was attributed to building using an older gcc that doesn't optimize as
>> well as newer versions.
> 
> I'd be very surprised if that were a 25% slowdown in general, as
> opposed to a 25% slowdown on some particular inner loop that happened
> to neatly match some new feature in a new gcc (e.g. something where
> the new autovectorizer kicked in). But yeah, in general this is just
> an inevitable trade-off when it comes to distributing binaries: you're
> always going to pay some penalty for achieving broad compatibility as
> compared to artisanally hand-tuned binaries specialized for your
> machine's exact OS version, processor, etc. Not much to be done,
> really. At some point the baseline for compatibility will switch to
> "compile everything on CentOS 6", and that will be better but it will
> still be worse than binaries that target CentOS 7, and so on and so
> forth.
> 

I have put in one gcc-specific optimization after another over the years,
so yes, using an ancient version will make many parts significantly
slower. That is not really a problem, though: updating a compiler is easy
even without Red Hat's devtoolset.
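To get a feel for how much this matters for a particular workload, a
quick micro-benchmark of a vectorizable operation against different numpy
builds (say, a downloaded binary vs. a locally compiled one) tells you
more than any general number. A minimal sketch; the array size and the
operation are arbitrary picks, not a proper benchmark:

import timeit

import numpy as np

# element-wise add is one of the loops with SIMD-specialized code paths,
# so it tends to show compiler differences fairly clearly
a = np.random.rand(1000000)
b = np.random.rand(1000000)

times = timeit.repeat(lambda: np.add(a, b), number=1000, repeat=5)
print("best of 5: %.4f s for 1000 adds" % min(times))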

At least as far as numpy is concerned, Linux binaries should not be a
very big problem. The only dependency where the version matters is glibc,
which has updated the interfaces we use (in a backward-compatible way)
many times.
But if we use an old enough baseline glibc (e.g. CentOS 5 or Ubuntu
10.04) we are fine, at a reasonable performance cost, basically only a
slower memcpy.
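Checking which glibc a target machine actually runs is easy from Python,
which helps when judging whether a binary built against such an old
baseline will load there. A small sketch (platform.libc_ver() may report
an empty string on non-glibc systems, and the ctypes call is glibc-only):

import ctypes
import platform

# what the platform module can work out from the running interpreter
print("libc according to platform: %s %s" % platform.libc_ver())

# asking glibc directly
libc = ctypes.CDLL(None)
libc.gnu_get_libc_version.restype = ctypes.c_char_p
print("glibc version: %s" % libc.gnu_get_libc_version().decode())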

Scipy, on the other hand, is a larger problem, as it contains C++ code.
Linux systems are now transitioning to C++11, which is in parts binary
incompatible with the old standard. A lot of testing is necessary there
to check whether we are affected.
How does Anaconda deal with C++11?
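Independent of how the distributors handle it, one quick way to see what
a given scipy build actually demands from libstdc++ is to dump the
versioned symbols its extension modules reference. It does not settle the
dual-ABI question by itself, but it shows which runtime versions a binary
will require at load time. A rough sketch, assuming binutils' objdump is
on the PATH; the output parsing is deliberately simplistic:

import os
import re
import subprocess

import scipy

# collect the GLIBCXX/CXXABI symbol versions required by every compiled
# extension module shipped with scipy
versions = set()
for root, _dirs, files in os.walk(os.path.dirname(scipy.__file__)):
    for name in files:
        if name.endswith(".so"):
            out = subprocess.check_output(
                ["objdump", "-T", os.path.join(root, name)])
            versions.update(
                re.findall(br"GLIBCXX_[0-9.]+|CXXABI_[0-9.]+", out))

for v in sorted(versions):
    print(v.decode())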
