[Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result]

Alex Martelli python-list@python.org
Fri, 9 May 2003 09:49:33 +0200


Followups set to python-list since this is NOT appropriate subject matter
for python-dev.  Please continue the discussion on python-list, thanks.

On Thursday 08 May 2003 10:20 pm, Beat Bolli wrote:
   ...
>         count = {}
>         for word in wordlist:
>             count.setdefault(word, 0) += 1
>
> This, as I soon realized, didn't work, exactly because ints are immutable.

Actually it doesn't work because you cannot assign to a function call; the
fact that ints are immutable doesn't enter the picture.
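
A minimal sketch of that point, assuming a current Python 3 interpreter (the
exact error text varies between versions, so only the exception type is
checked here). The statement is rejected when it is compiled, before any int
is ever touched:

```python
# The offending statement from the quoted loop, as a string so that the
# rest of this snippet still compiles.
source = "count.setdefault(word, 0) += 1"

try:
    compile(source, "<example>", "exec")
except SyntaxError:
    # A function call is not a valid target for (augmented) assignment.
    rejected = True
else:
    rejected = False

print(rejected)  # True
```

The names count and word are never looked up, since the code never gets as
far as running.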

>         class Counter(int):
>             def inc(self):
>                 # to be defined
>                 self += 1??

HERE is where the fact that ints are immutable will bite.  If += mutated
self, this would work -- but it doesn't because ints are immutable.

> As you can see, I have a problem at the comment: how do I access the
> inherited int value??? I realized that this also wasn't going to work,

int(self) will "access the inherited int value" if I understand your meaning.
But it doesn't help you here.
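
Here is a small sketch of both points, written for Python 3. The return
statement in inc() is my addition, there only so the rebound value can be
inspected from outside:

```python
class Counter(int):
    def inc(self):
        # int defines no in-place add, so this is self = self + 1:
        # it rebinds the local name and leaves the object untouched.
        self += 1
        return self

c = Counter(5)
result = c.inc()

print(int(c))        # the inherited value, still 5
print(result)        # 6
print(type(result))  # a plain int, not a Counter
```

The caller's object never changes, and what inc() computes is not even a
Counter any more.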

> either. I finally used the perhaps idiomatic
>
>         count = {}
>         for word in wordlist:
>             count[word] = count.get(word, 0) + 1
>
> which of course is suboptimal, because the lookup is done twice. I decided

Yes.

> not to implement a proper Counter class for memory efficiency reasons. The

__slots__ fix your memory efficiency issues: that's the REASON they exist.
However, there's ANOTHER problem...:

> code would have been simple:
>
>         class Counter:
>             def __init__(self):
>                 self.n = 0
>             def inc(self):
>                 self.n += 1
>             def get(self):
>                 return self.n
>
>         count = {}
>         for word in wordlist:
>             count.setdefault(word, Counter()).inc()
>
> But to restate the core question: can class Counter be written as a
> subclass of int?

No (not meaningfully).

The performance tradeoff is tricky not because of memory considerations (which
__slots__ fix) but because the Counter() argument is evaluated before
setdefault is even called, so you're generating (and often throwing away) a
Counter instance EVERY time.  Witness:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
''' 'for w in words:'  '  count[w]=count.get(w,0)+1'
100000 loops, best of 3: 11.6 usec per loop

versus:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
class Cnt(object):
  __slots__=["n"]
  def __init__(self): self.n=0
  def inc(self): self.n+=1
''' 'for w in words:'  '  count.setdefault(w,Cnt()).inc()'
10000 loops, best of 3: 43.4 usec per loop

See?  It's not a speedup, but a slowdown of almost FOUR times (43.4 vs
11.6 usec) in this example.
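
To see where the time goes, here is a sketch of the same loop with a
module-level counter (created, which is my addition and not part of the
timed code) that records how many Cnt instances actually get built:

```python
created = 0  # illustrative instrumentation, not in the timed version

class Cnt(object):
    __slots__ = ["n"]
    def __init__(self):
        global created
        created += 1
        self.n = 0
    def inc(self):
        self.n += 1

words = "some are and some are not and some are irksome".split()
count = {}
for w in words:
    # Cnt() is evaluated before setdefault runs, whether or not w is
    # already a key.
    count.setdefault(w, Cnt()).inc()

print(len(words), len(count), created)  # 10 5 10
```

Ten words, five distinct keys, ten instances built: half of them are thrown
away immediately, and the share of waste grows with the number of repeated
words.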

If you want speed, go for speed:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
import psyco
psyco.full()
''' 'for w in words:'  '  count[w]=count.get(w,0)+1'
100000 loops, best of 3: 3.33 usec per loop

Now THIS is acceleration -- a speedup of over THREE times.  And without
any complication or abandonment of the idiomatic way of expression, too.

> Beat Bolli (please CC: me on replies, I'm not on the list)

Done.  But please use python-list for these discussions: python-dev is only
for discussion about development of *Python itself*.


Alex