[pypy-dev] Copy of list

Wed Oct 7 19:01:11 CEST 2015

So I just tried it with the latest nightly build and `[:]` is now even
a bit faster than `list()`! Thank you once more!

On Tue, Sep 29, 2015 at 12:27 PM, Tuom Larsen <tuom.larsen at gmail.com> wrote:
> Hi Armin,
>
> thanks a lot, both for the explanation and the fix! I will try it soon.
>
> Have a nice day!
>
> Tuom
>
> PS: The speed difference came from a larger piece of code, which I tried
> to reproduce in a "minimal viable test case". Hence that `timeit`, where
> it showed up as well. But in any case, thanks a lot once more!
>
>
> On Tue, Sep 29, 2015 at 9:25 AM, Armin Rigo <arigo at tunes.org> wrote:
>> Hi Tuom,
>>
>> On Tue, Sep 29, 2015 at 7:31 AM, Tuom Larsen <tuom.larsen at gmail.com> wrote:
>>> Please, let me rephrase my question: currently I use `[:]` because it
>>> is faster in CPython (0.131 usec vs 0.269 usec per loop). I almost
>>> don't mind changing it to `list()` because of PyPy, but I was wondering
>>> what PyPy developers recommend. I don't understand why `[:]` is
>>> twice as slow as `list()` in PyPy, as it seems it should do the same
>>> thing (create a list and copy the content).
>>
>> Looking at the JIT logs, it is tripped up by an RPython function with a
>> loop in its slow path.  Fixed in 4e688540cfe9.
>>
>> There is still a bit of overhead.  For example, lst[:] is equivalent
>> to lst[0:9223372036854775807].  The general logic looks like this:
>> when doing lst[a:b], if b > len(lst) then replace b with len(lst).
>> Here this means checking whether 9223372036854775807 > len(lst)...  It
>> is not possible for the length of a list to be that huge, but this
>> knowledge is not codified explicitly.
>>
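
To make the clamping rule above concrete, here is a minimal Python sketch.
It is not PyPy's RPython code: `clamp_slice_bounds` is a hypothetical helper
that covers only non-negative bounds, and `sys.maxsize` stands in for the
large default stop value (9223372036854775807 on a 64-bit build).

```python
import sys


def clamp_slice_bounds(a, b, length):
    """Hypothetical helper: clamp non-negative slice bounds to a list length."""
    if b > length:      # the check described above: is b larger than len(lst)?
        b = length
    if a > b:           # keep the start from running past the stop
        a = b
    return a, b


lst = list(range(10))

# lst[:] behaves like a slice from 0 up to a huge default stop value.
start, stop = clamp_slice_bounds(0, sys.maxsize, len(lst))
copy = lst[start:stop]
```

The built-in `slice(None).indices(len(lst))` performs the same kind of
normalization in Python itself, including the negative-index cases that this
sketch leaves out.
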
>> Yes, we could improve that in the future.
>> But these are really advanced details.  You should write 'list()' or
>> '[:]', whichever feels more natural to you, or maybe whichever is faster
>> on CPython if it makes an important difference there.  Using 'timeit' to
>> measure microbenchmarks in PyPy may or may not give a useful result.
>> In this case it did only after you stopped using range() and only
>> because we don't have more advanced optimizations that realize that
>> the resulting list is not needed at all.  In general, you should not
>> rely on it.
>>
>> What you should do instead is measure how much time is spent in some
>> real loop of your algorithm, and compare it with variants.  (Make sure
>> every variant is run in its own process, otherwise the JITting of
>> similar pieces of code might interfere in unexpected ways.)  If you're
>> lucky you may be able to find a variant that is overall much faster.
>> If you're not, it means that what you're changing is not relevant for
>> performance.
>>
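
One way to follow the advice above is to launch a fresh interpreter for each
variant, so that JIT state from one variant cannot affect the other. The
sketch below is only an illustration under assumptions: the workload, the
names `VARIANTS`, `TEMPLATE`, `time_variant` and `compare`, and the loop
counts are all made up here, and it assumes Python 3 syntax. A real
measurement should time an actual loop from your own algorithm.

```python
import subprocess
import sys

# Hypothetical variants to compare; replace with real pieces of your code.
VARIANTS = {
    "slice": "copy = lst[:]",
    "list": "copy = list(lst)",
}

# Script run in the child process. It uses the copy on every iteration so
# the copy is less likely to be optimized away as dead code.
TEMPLATE = """
import time
lst = [i for i in range(1000)]
total = 0
start = time.perf_counter()
for _ in range({loops}):
    {stmt}
    total += len(copy)
print(time.perf_counter() - start)
"""


def time_variant(stmt, loops=2000, interpreter=sys.executable):
    """Run one variant in its own interpreter process; return elapsed seconds."""
    script = TEMPLATE.format(stmt=stmt, loops=loops)
    result = subprocess.run(
        [interpreter, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


def compare(loops=2000):
    """Time every variant, each in a separate process."""
    return {name: time_variant(stmt, loops) for name, stmt in VARIANTS.items()}
```

Calling `compare(loops=200000)` returns a dict of elapsed seconds per
variant. Passing the path of a PyPy binary as `interpreter` runs the same
script under PyPy instead of the interpreter running the harness.
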
>>
>> A bientôt,
>>
>> Armin.