[pypy-dev] Very slow Julia example on PyPy/numpy - can you help me understand why it is slow?

Sat Feb 22 19:48:49 CET 2014

On Sat, Feb 22, 2014 at 7:45 PM, Ronan Lamy <ronan.lamy at gmail.com> wrote:
> Hello Ian,
>
> On 20/02/14 20:40, Ian Ozsvald wrote:
>
>> Hi Armin. The point of the question was not to remove numpy but to
>> understand the behaviour :-) I've already done a set of benchmarks
>> with lists and with numpy, I've copied the results below. I'm using
>> the same Julia code throughout (there's a note about the code below).
>> PyPy on lists is indeed very compelling.
>>
>> One observation I've made of beginners (and I did the same) is that
>> iterating over numpy arrays seems natural until you learn it is
>> horribly slow. Then you learn to vectorise. Some of the current tools
>> handle the non-vectorised case really well and that's something I want
>> to mention.
>>
>> For Julia I've used lists and numpy. Using a numpy array rather than a
>> Python `array` makes sense as arrays won't hold a complex type (and messing
>> with decomposing the complex elements into two arrays gets even
>> sillier) and the example is still trivial for a reader to understand.
>> numpy arrays (and Python arrays) are good because they use much less
>> RAM than big lists. The reason why my example code above made lists
>> and then turned them into numpy arrays...that's because I was lazy and
>> hadn't finished tidying this demo (my bad!).
>
>
> I agree that your code looks rather sensible (at least, to people who
> haven't yet internalised all the "stupid" implementation details concerning
> arrays, lists, iteration and vectorisation). So it's a bit of a shame that
> PyPy doesn't do better.
>
>
>> I don't mind that my use of numpy is silly, I'm just curious to
>> understand why pypynumpy diverges from the results of the other
>> compiler technologies. The simple answer might be 'because pypynumpy
>> is young' and that'd be fine - at least I'd have an answer if someone
>> asks the question in my talk. If someone has more details, that'd be
>> really interesting too. Is there a fundamental reason why pypynumpy
>> couldn't do the example as fast as cython/numba/pythran?
>
>
> To answer such questions, the best way is to use the jitviewer
> (https://bitbucket.org/pypy/jitviewer ). Looking at the trace for the inner
> loop, I can see every operation on a scalar triggers a dict lookup to obtain
> its dtype. This looks like self-inflicted pain coming from the broken objspace
> abstraction rather than anything fundamental. Fixing that should improve
> speed by about an order of magnitude.
>
> Cheers,
> Ronan
>

Hi Ronan.

You can't blame objspace for everything ;-) It looks like it's easily
fixable. I'm in transit right now but I can fix it once I'm home. Ian,
please send more broken examples; they usually break for stupid
reasons!

Cheers,
fijal
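
For reference, the pattern discussed in this thread (a Julia set loop that reads one element at a time from either a list or a numpy array) can be sketched roughly as below. This is not Ian's actual benchmark code, which is not included in this message; the function names, grid size, iteration cap and the constant c are all assumptions chosen for illustration. With an ndarray, each `zs[i]` yields a numpy scalar, so every arithmetic operation in the inner loop goes through numpy's scalar machinery, which is where the per-operation dtype lookup Ronan saw in the trace would apply.

```python
import numpy as np

MAXITER = 300  # assumed iteration cap, not taken from the thread


def julia_counts(zs, c, maxiter=MAXITER):
    """Escape-time count for each starting point, one element at a time."""
    output = [0] * len(zs)
    for i in range(len(zs)):
        z = zs[i]  # Python complex for a list, numpy scalar for an ndarray
        n = 0
        while n < maxiter and abs(z) < 2.0:
            z = z * z + c
            n += 1
        output[i] = n
    return output


def make_grid(width):
    """Hypothetical helper: square grid of complex points on [-1.8, 1.8]^2."""
    step = 3.6 / (width - 1)
    return [complex(-1.8 + x * step, -1.8 + y * step)
            for y in range(width) for x in range(width)]


c = complex(-0.62772, -0.42193)  # illustrative constant (an assumption)
zs_list = make_grid(20)
zs_numpy = np.array(zs_list, dtype=np.complex128)

# Same non-vectorised loop, two container types.
counts_from_list = julia_counts(zs_list, c)
counts_from_numpy = julia_counts(zs_numpy, c)
```

Timing `julia_counts` on each container (for example with `time.time()` around the call) is one way to compare the list and numpy cases on a given interpreter.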