[Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result]
Alex Martelli
python-list@python.org
Fri, 9 May 2003 09:49:33 +0200
Followups set to python-list since this is NOT an appropriate subject matter
for python-dev. Please continue the discussion on python-list, thanks.
On Thursday 08 May 2003 10:20 pm, Beat Bolli wrote:
...
> count = {}
> for word in wordlist:
>     count.setdefault(word, 0) += 1
>
> This, as I soon realized, didn't work, exactly because ints are immutable.
Actually it doesn't work because you cannot assign to a function call; the
fact that ints are immutable doesn't enter the picture.
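
A minimal sketch of that point, assuming nothing beyond the builtin compile(): the statement is rejected as a SyntaxError before any int is involved, while the subscript form of augmented assignment compiles. The names source and compiled are illustrative only.

```python
# Try to compile the statement from the quoted message without running it.
source = "count.setdefault(word, 0) += 1"

try:
    compile(source, "<example>", "exec")
    compiled = True
except SyntaxError:
    # Rejected at compile time: a function call is not a valid assignment target.
    compiled = False

# A subscription IS a valid augmented-assignment target, so this compiles.
compile("count[word] += 1", "<example>", "exec")
```
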
> class Counter(int):
>     def inc(self):
>         # to be defined
>         self += 1??
HERE is where the fact that ints are immutable will bite. If += mutated
self, this would work -- but it doesn't: because ints are immutable, += can
only rebind the local name self to a new object, leaving the original untouched.
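
A small sketch of what happens, assuming the quoted inc() is completed in the obvious way (the return statement is added here only so the rebinding can be observed):

```python
class Counter(int):
    def inc(self):
        # int defines no in-place add, so this is self = self + 1:
        # it rebinds the local name "self" and nothing else.
        self += 1
        return self  # added for illustration, not in the quoted code

c = Counter(5)
result = c.inc()
# c still compares equal to 5; result is a brand-new object equal to 6.
```

The caller's object never changes; only the local name inside inc() pointed at something new for a moment.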
> As you can see, I have a problem at the comment: how do I access the
> inherited int value??? I realized that this also wasn't going to work,
int(self) will "access the inherited int value" if I understand your meaning.
But it doesn't help you here.
> either. I finally used the perhaps idiomatic
>
> count = {}
> for word in wordlist:
>     count[word] = count.get(word, 0) + 1
>
> which of course is suboptimal, because the lookup is done twice. I decided
Yes.
> not to implement a proper Counter class for memory efficiency reasons. The
__slots__ fix your memory efficiency issues: that's the REASON they exist.
However, there's ANOTHER problem...:
> code would have been simple:
>
> class Counter:
>     def __init__(self):
>         self.n = 0
>     def inc(self):
>         self.n += 1
>     def get(self):
>         return self.n
>
> count = {}
> for word in wordlist:
>     count.setdefault(word, Counter()).inc()
>
> But to restate the core question: can class Counter be written as a
> subclass of int?
No (not meaningfully).
The performance tradeoff is tricky not because of memory considerations (which
__slots__ fix) but because the Counter() argument is evaluated before setdefault
is even called, so you're generating (and usually throwing away) a Counter
instance EVERY time through the loop. Witness:
[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
''' 'for w in words:' ' count[w]=count.get(w,0)+1'
100000 loops, best of 3: 11.6 usec per loop
versus:
[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
class Cnt(object):
    __slots__=["n"]
    def __init__(self): self.n=0
    def inc(self): self.n+=1
''' 'for w in words:' ' count.setdefault(w,Cnt()).inc()'
10000 loops, best of 3: 43.4 usec per loop
See? It's not a speedup, but a slowdown by about FOUR times in this
example.
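
To see where the time goes, here is a rough sketch that counts constructor calls. The created tally is an addition for illustration and is not part of the timed code above; everything else mirrors the Cnt class from the timing run.

```python
class Cnt(object):
    __slots__ = ["n"]
    created = 0  # class-level tally of constructor calls (illustrative only)

    def __init__(self):
        Cnt.created += 1
        self.n = 0

    def inc(self):
        self.n += 1

count = {}
words = "some are and some are not and some are irksome".split()
for w in words:
    # Cnt() is built before setdefault runs, whether or not w is already a key.
    count.setdefault(w, Cnt()).inc()

# One instance per word processed, but only one per distinct word is kept.
instances_built = Cnt.created
instances_kept = len(count)
```

For these ten words, ten instances get built and only five are kept; the rest are garbage the moment setdefault returns.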
If you want speed, go for speed:
[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
import psyco
psyco.full()
''' 'for w in words:' ' count[w]=count.get(w,0)+1'
100000 loops, best of 3: 3.33 usec per loop
Now THIS is acceleration -- a speedup of over THREE times. And without
any complication or abandonment of the idiomatic way of expression, too.
> Beat Bolli (please CC: me on replies, I'm not on the list)
Done. But please use python-list for these discussions: python-dev is only
for discussion about development of *Python itself*.
Alex