[pypy-dev] The correct way to incrementally build list in RPython

Yichao Yu yyc1992 at gmail.com
Fri Jun 27 12:17:22 CEST 2014


On Fri, Jun 27, 2014 at 3:27 PM, Yichao Yu <yyc1992 at gmail.com> wrote:
> On Fri, Jun 27, 2014 at 3:07 PM, Armin Rigo <arigo at tunes.org> wrote:
>> Hi Yichao,
>>
>> On 27 June 2014 04:09, Yichao Yu <yyc1992 at gmail.com> wrote:
>>> On an attempt to fix the hash function of numpy.dtype, I came up with
>>> the patch in the attachment translated from hashdescr.c in the
>>> original numpy source. The patch passed the test when running in
>>> pyinteractive.py. However, when started to translate pypy, I get an
>>> ListChangeUnallowed error.
>>
>> That errors comes from space.newtuple(lst): it must be passed a list
>> "lst" that is "not resized", that is, a list on which we never do an
>> append() or pop() or similar.
>
> Hmmm. So the error actually happens when the compiler traces back
> function calls? I was trying to figure out what's wrong from inside my
> function T_T ...
>
>>
>> The standard fix is to do "lst = [None] * length" and then fill
>> "lst[index] = ...".  This only works if you know the final length in
>> advance.  A more heavy way to fix it is simply to call
>> space.newtuple(lst[:]), making a fresh copy of the list.  But in this
>> case, a better fix would be to avoid the list and the space.newtuple()
>> and space.hash() completely, and instead compute the hash value
>> incrementally, by copying the logic in
>> pypy/objspace/std/tupleobject.py:descr_hash().
>
> Thank you for your advices. I would definitely try that.

I've made a new version[1] and it seems to work.
I am using the hash function from `compute_hash` instead of the
improved one in tupleobject.py:descr_hash since the length is not
known in advance.

[1] https://bitbucket.org/pypy/pypy/pull-request/242/fix-hash-function-for-complicated/diff

>
>>
>> Similarly, I'm not sure you need to wrap self.kind, 0, self.elsize and
>> self.alignment and put them inside a tuple of length 4, when all you
>> could do is some random computation with these three numbers.

I am still creating a tuple out of these 5 numbers/strings since I
don't want to duplicate the hash functions for all of them. Can the
compiler deal with this case efficiently?

>
> I agree. As I said, this is basically copied from the c implementation
> and this is what they do over there..... (btw, it is actually a tuple
> of length 5 with endian being the second element ;P ).
>
>>
>>
>> A bientôt,
>>
>> Armin.
>
> Yichao Yu


More information about the pypy-dev mailing list