[Tutor] How to override getting items from a list for iteration

Sun Feb 10 16:10:24 CET 2013

On 02/10/2013 09:32 AM, Walter Prins wrote:
> Hello,
>
> I have a program where I'm overriding the retrieval of items from a list.
> As background: the data held by the lists is calculated once but then
> potentially read many times. To avoid needlessly recalculating the same
> value over and over, and to keep checking/caching code out of the
> calculation logic, I created a subclass of list that automatically
> calculates the value in a given slot if it has not been calculated yet.
> (Put differently, I've implemented a kind of list-specific
> caching/memoization that is intended to be transparent to the client code.)
>
> The way I've implemented this so far was to simply override
> list.__getitem__(self, key) to check if the value needs to be calculated or
> not and call a calculation method if required, after which the value is
> returned as normal.  On subsequent calls __getitem__ then directly returns
> the value without calculating it again.
>
> This worked mostly fine. However, yesterday I ran into a slightly unexpected
> problem: when the list contents are iterated over and values are retrieved
> that way rather than via [], __getitem__ is in fact *not* called on the
> list to read the item values. Consequently I get back the "not yet
> calculated" entries, without the calculation routine being called
> automatically as intended.
>
> Here's a test application that demonstrates the issue:
>
> class NotYetCalculated:
>      pass
>
> class CalcList(list):
>      def __init__(self, calcitem):
>          super(CalcList, self).__init__()
>          self.calcitem = calcitem
>
>      def __getitem__(self, key):
>          """Override __getitem__ to call self.calcitem() if needed"""
>          print "CalcList.__getitem__(): Enter"
>          value = super(CalcList, self).__getitem__(key)
>          if value is NotYetCalculated:
>              print "CalcList.__getitem__(): calculating"
>              value = self.calcitem(key)
>              self[key] = value
>          print "CalcList.__getitem__(): return"
>          return value
>
> def calcitem(key):
>      # Demo: return square of index
>      return key*key
>
>
> def main():
>      # Create a list that calculates its contents via a given
>      # method/fn once only
>      l = CalcList(calcitem)
>      # Extend with a few entries to demonstrate the issue:
>      l.extend([NotYetCalculated, NotYetCalculated, NotYetCalculated,
>                NotYetCalculated])
>
>      print ("1) Directly getting values from list works as expected: "
>             "__getitem__ is called:")
>      print "Retrieving value [2]:\n", l[2]
>      print
>      print "Retrieving value [3]:\n", l[3]
>      print
>      print "Retrieving value [2] again (no calculation this time):\n", l[2]
>      print
>
>      print "2) Retrieving values via an iterator doesn't work as expected:"
>      print "(__getitem__ is not called and the code returns "
>      print " NotYetCalculated entries without calling __getitem__. How do I"
>      print " fix this?)"
>      print "List contents:"
>      for x in l: print x
>
>
> if __name__ == "__main__":
>      main()
>
> To reiterate:
>
> What should happen:  In test 2) above all entries should be automatically
> calculated and output should be numbers only.
>
> What actually happens: In test 2) above the first 2 list entries
> corresponding to list indexes 0 and 1 are output as "NotYetCalculated" and
> calcitem is not called when required.
>
> What's the best way to fix this problem?  Do I need to maybe override
> another method, perhaps provide my own iterator implementation?  For that
> matter, why doesn't iterating over the list contents fall back to calling
> __getitem__?
>

Implement your own __iter__() special method.
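
As for why iteration bypasses __getitem__: a for loop asks the object for
__iter__() first, and list supplies its own, which reads the stored items
directly. The fallback to calling __getitem__ with 0, 1, 2, ... only applies
to classes that define no __iter__ at all. A minimal sketch of the override,
assuming integer indexes only (slices are not handled), and written with
print() so it runs on Python 3 as well; the lambda and the name `squares` are
just for illustration:

```python
class NotYetCalculated(object):
    """Sentinel marking a slot whose value has not been computed yet."""


class CalcList(list):
    def __init__(self, calcitem):
        super(CalcList, self).__init__()
        self.calcitem = calcitem

    def __getitem__(self, key):
        # Integer indexes only; slices are not handled in this sketch.
        value = super(CalcList, self).__getitem__(key)
        if value is NotYetCalculated:
            value = self.calcitem(key)
            self[key] = value      # cache the result in the slot
        return value

    def __iter__(self):
        # Route every element through __getitem__ so that iteration
        # triggers the calculation too.
        for index in range(len(self)):
            yield self[index]


squares = CalcList(lambda key: key * key)
squares.extend([NotYetCalculated] * 4)
print([x for x in squares])   # [0, 1, 4, 9]
```

Other inherited methods (index(), count(), sorting, comparisons, slicing)
still look at the raw storage, so they can see NotYetCalculated entries.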

And consider whether you might need __setitem__(), __len__(), 
__setslice__(), __getslice__() and others.

Maybe you'd be better off not inheriting from list at all, and just 
having an attribute that's a list.  It doesn't sound like you're 
defining a very big subset of list, and overriding the methods you 
*don't* want seems to be more work than just implementing the ones you do.
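
A rough sketch of that alternative, under the assumption that the client code
only needs len(), indexing by integer, and iteration. The class name
LazyValues, the `size` argument and the `_items` attribute are made up for the
example:

```python
class NotYetCalculated(object):
    """Sentinel marking a slot whose value has not been computed yet."""


class LazyValues(object):
    """Holds a plain list as an attribute instead of inheriting from list."""

    def __init__(self, calcitem, size):
        self.calcitem = calcitem
        self._items = [NotYetCalculated] * size

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Integer indexes only; slices are not handled in this sketch.
        value = self._items[index]
        if value is NotYetCalculated:
            value = self.calcitem(index)
            self._items[index] = value   # cache for later reads
        return value

    def __iter__(self):
        # Go through __getitem__ so iteration calculates missing values.
        for index in range(len(self._items)):
            yield self[index]


values = LazyValues(lambda index: index * index, 4)
print(values[2])              # 4
print([v for v in values])    # [0, 1, 4, 9]
```

Because nothing is inherited, there is no list method left over that could
hand out an uncalculated entry behind your back; anything you have not
written simply does not exist on the object.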

A separate question:  is this likely to be a sparse list?  If it's very 
sparse, perhaps you'd consider using a dict, rather than a list attribute.
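
If it does turn out to be sparse, something along these lines might do. It is
only a sketch: the names SparseCalc, `size` and `_cache` are invented here,
and it assumes non-negative integer indexes with a known upper bound:

```python
class SparseCalc(object):
    """Dict-backed cache: only the slots actually computed use memory."""

    def __init__(self, calcitem, size):
        self.calcitem = calcitem
        self._size = size
        self._cache = {}          # index -> computed value

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        # Integer indexes in range(size) only; no negatives, no slices.
        if not 0 <= index < self._size:
            raise IndexError(index)
        try:
            return self._cache[index]
        except KeyError:
            value = self.calcitem(index)
            self._cache[index] = value
            return value


calls = []

def square(index):
    calls.append(index)           # record each real calculation
    return index * index

sparse = SparseCalc(square, 1000000)
print(sparse[1000])               # 1000000
print(sparse[1000])               # 1000000 again, served from the dict
print(calls)                      # [1000]: calculated only once
print(len(sparse._cache))         # 1: a single slot is actually stored
```

Note that iterating over such an object would calculate every slot up to
`size`, which rather defeats the point of a sparse container.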

-- 
DaveA