On 03.07.2014 05:56, Sturla Molden wrote:
On 02/07/14 19:55, Chris Barker wrote:
Indeed -- the default (i.e. what you get with pip install numpy) should be SSE2 -- I'd much rather have a few folks with old hardware have to go through some hoops than have most people get something that is "much slower than MATLAB".
I think we should use SSE3 as default. It is already ten years old. Most users (99.999 %) who want binary wheels have an SSE3 capable CPU.
While it is true that pretty much all CPUs currently around have it, there is no technical requirement for even new CPUs to have SSE3. Unlike SSE2, you do not have to implement it to sell a compatible 64-bit CPU. Not even the new x32 ABI requires it. In practice I think we could easily get away with using SSE3 as the default, but I would still like to see whether it makes any performance difference in benchmarks. In my experience (which is exclusively on pre-Haswell machines) the horizontal operations it offers tend to be slower than other solutions.
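
Since SSE3 is not guaranteed on x86-64, a binary built with it would need some way to check for it. Below is a rough sketch of such a check on Linux, assuming the flags are read from /proc/cpuinfo-style text. The helper names `simd_flags` and `has_sse3` and the `SAMPLE` string are made up for illustration; this is not how numpy detects CPU features.

```python
def simd_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_sse3(flags):
    # Linux lists SSE3 as "pni" (Prescott New Instructions), not "sse3".
    # "ssse3" is a different, later extension and must not be confused with it.
    return "pni" in flags


# Hypothetical sample; on a real Linux machine, read /proc/cpuinfo instead.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 pni ssse3\n"

if __name__ == "__main__":
    flags = simd_flags(SAMPLE)
    print("SSE2:", "sse2" in flags, "SSE3:", has_sse3(flags))
```

On a real machine one would pass the contents of /proc/cpuinfo instead of `SAMPLE`; other operating systems need a different mechanism.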