[Tutor] How to override getting items from a list for iteration
Dave Angel
davea at davea.name
Sun Feb 10 16:18:22 CET 2013
On 02/10/2013 10:10 AM, Dave Angel wrote:
> On 02/10/2013 09:32 AM, Walter Prins wrote:
>> Hello,
>>
>> I have a program where I'm overriding the retrieval of items from a list.
>> As background: the data held by the lists are calculated once but then
>> read potentially many times thereafter. To avoid needlessly
>> re-calculating the same value over and over, and to keep checking/caching
>> code out of the calculation logic, I created a subclass of list that
>> automatically calculates the value in a given slot if it has not yet been
>> calculated. (Put differently, I've implemented a kind of list-specific
>> caching/memoization with the intent that it should be transparent to the
>> client code.)
>>
>> The way I've implemented this so far was to simply override
>> list.__getitem__(self, key) to check whether the value needs to be
>> calculated and call a calculation method if required, after which the
>> value is returned as normal. On subsequent calls __getitem__ then
>> directly returns the value without calculating it again.
>>
>> This worked mostly fine. However, yesterday I ran into a slightly
>> unexpected problem: when the list contents are iterated over and values
>> retrieved that way rather than via [], __getitem__ is in fact *not*
>> called on the list to read the item values. Consequently I get back the
>> "not yet calculated" entries in the list, without the calculation routine
>> being called automatically as intended.
>>
>> Here's a test application that demonstrates the issue:
>>
>> class NotYetCalculated:
>>     pass
>>
>> class CalcList(list):
>>     def __init__(self, calcitem):
>>         super(CalcList, self).__init__()
>>         self.calcitem = calcitem
>>
>>     def __getitem__(self, key):
>>         """Override __getitem__ to call self.calcitem() if needed"""
>>         print "CalcList.__getitem__(): Enter"
>>         value = super(CalcList, self).__getitem__(key)
>>         if value is NotYetCalculated:
>>             print "CalcList.__getitem__(): calculating"
>>             value = self.calcitem(key)
>>             self[key] = value
>>         print "CalcList.__getitem__(): return"
>>         return value
>>
>> def calcitem(key):
>>     # Demo: return square of index
>>     return key*key
>>
>>
>> def main():
>>     # Create a list that calculates its contents via a given
>>     # method/fn once only
>>     l = CalcList(calcitem)
>>     # Extend with few entries to demonstrate issue:
>>     l.extend([NotYetCalculated, NotYetCalculated, NotYetCalculated,
>>               NotYetCalculated])
>>
>>     print "1) Directly getting values from list works as expected: __getitem__ is called:"
>>     print "Retrieving value [2]:\n", l[2]
>>     print
>>     print "Retrieving value [3]:\n", l[3]
>>     print
>>     print "Retrieving value [2] again (no calculation this time):\n", l[2]
>>     print
>>
>>     print "2) Retrieving values via an iterator doesn't work as expected:"
>>     print "(__getitem__ is not called and the code returns "
>>     print " NotYetCalculated entries without calling __getitem__. How do I fix this?)"
>>     print "List contents:"
>>     for x in l: print x
>>
>>
>> if __name__ == "__main__":
>>     main()
>>
>> To reiterate:
>>
>> What should happen: In test 2) above all entries should be automatically
>> calculated and output should be numbers only.
>>
>> What actually happens: In test 2) above the first 2 list entries
>> corresponding to list indexes 0 and 1 are output as "NotYetCalculated"
>> and calcitem is not called when required.
>>
>> What's the best way to fix this problem? Do I need to maybe override
>> another method, perhaps provide my own iterator implementation? For that
>> matter, why doesn't iterating over the list contents fall back to calling
>> __getitem__?
>>
>
> Implement your own __iter__() special method.
>
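A minimal sketch of that suggestion, written for Python 3 (the test program above uses Python 2 print statements). The class and function names follow the test program; the tracing prints are dropped, and routing __iter__ through self[index] is just one possible way to do it, so treat it as an illustration rather than the definitive fix.

```python
class NotYetCalculated:
    """Sentinel marking a slot whose value has not been computed yet."""


class CalcList(list):
    def __init__(self, calcitem):
        super().__init__()
        self.calcitem = calcitem

    def __getitem__(self, key):
        value = super().__getitem__(key)
        if value is NotYetCalculated:
            value = self.calcitem(key)
            self[key] = value  # cache so later reads skip the calculation
        return value

    def __iter__(self):
        # Go through self[index] so iteration also triggers the calculation.
        for index in range(len(self)):
            yield self[index]


def calcitem(key):
    # Demo: return square of index
    return key * key


l = CalcList(calcitem)
l.extend([NotYetCalculated] * 4)
print([x for x in l])
```

With __iter__ defined this way, the for loop sees computed values only, and each slot is still calculated at most once.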
> And consider whether you might need __setitem__(), __len__(),
> __setslice__(), __getslice__() and others.
>
> Maybe you'd be better off not inheriting from list at all, and just
> having an attribute that's a list. It doesn't sound like you're
> defining a very big subset of list, and overriding the methods you
> *don't* want seems to be more work than just implementing the ones you do.
>
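To illustrate the "attribute that's a list" idea, here is a rough Python 3 sketch. The `size` parameter and the `_items` attribute name are made up for this example, and only the handful of operations shown are supported; anything else a real list offers would have to be added deliberately.

```python
class NotYetCalculated:
    """Sentinel marking a slot whose value has not been computed yet."""


class CalcList:
    """Wraps a plain list instead of inheriting from list."""

    def __init__(self, calcitem, size=0):
        self.calcitem = calcitem
        # Hypothetical convenience: pre-fill with "not yet calculated" slots.
        self._items = [NotYetCalculated] * size

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        value = self._items[index]  # raises IndexError past the end
        if value is NotYetCalculated:
            value = self.calcitem(index)
            self._items[index] = value  # cache the computed value
        return value

    def __setitem__(self, index, value):
        self._items[index] = value


squares = CalcList(lambda index: index * index, size=4)
print(squares[2])
print([value for value in squares])
```

Because the wrapper defines no __iter__ of its own, the for loop ends up going through __getitem__, so every value it yields has been calculated.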
> A separate question: is this likely to be a sparse list? If it's very
> sparse, perhaps you'd consider using a dict, rather than a list attribute.
>
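If the data does turn out to be sparse, the dict variant might look roughly like this (Python 3). The class name `SparseCalcList` and the `length` parameter are invented for the sketch; it assumes the logical length is known up front and that only non-negative integer indexes are used.

```python
class SparseCalcList:
    """Fixed-length sequence that stores only the values computed so far."""

    def __init__(self, calcitem, length):
        self.calcitem = calcitem
        self._length = length
        self._cache = {}  # index -> computed value

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError("index out of range")
        if index not in self._cache:
            self._cache[index] = self.calcitem(index)
        return self._cache[index]


big = SparseCalcList(lambda index: index * index, length=1000000)
print(big[999999])
print(len(big._cache))
```

Only the slots that have actually been read take up memory, which is the point of preferring a dict when most entries are never touched.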
>
>
BTW, the answer to why iterating over the list contents doesn't call
__getitem__ is that list defines its own __iter__, presumably to do it
more efficiently.
And there is your clue that perhaps you don't want to inherit from list.
You don't want its more-efficient version, so in a class of your own all
you have to do is not implement an __iter__; iteration then falls back to
calling __getitem__ with 0, 1, 2, ... until IndexError is raised, and it
should just work.
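A small Python 3 experiment that shows the difference. Both class names and the `calls` attribute are made up for the demonstration; `calls` simply records which indexes reach __getitem__.

```python
class ViaGetitem:
    """Defines __getitem__ but no __iter__."""

    def __init__(self, size):
        self.size = size
        self.calls = []  # records every index passed to __getitem__

    def __getitem__(self, index):
        self.calls.append(index)
        if index >= self.size:
            raise IndexError(index)  # tells the for loop to stop
        return index * index


class ListSubclass(list):
    """Overrides __getitem__ but inherits list.__iter__."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __getitem__(self, index):
        self.calls.append(index)
        return super().__getitem__(index)


plain = ViaGetitem(3)
print([x for x in plain], plain.calls)

sub = ListSubclass([0, 1, 4])
print([x for x in sub], sub.calls)
```

The first line of output shows __getitem__ being called with 0 through 3 (the last call raises IndexError and ends the loop); the second shows an empty call list, because list.__iter__ never consults the overridden __getitem__.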
--
DaveA