[Tutor] cross sum toying
Raymond Hettinger
python at rcn.com
Fri Oct 10 13:15:06 EDT 2003
> >>>>from itertools import imap
> >>>>def cross_sum(n, codezero=ord('0')):
> >>>>
> >>>>
> > s = str(n)
> > return sum(imap(ord, s)) - len(s) * codezero
> >
> >
> Nice idea! I played with it, as you suggested and, interestingly,
found
> out that it is exactly as fast as:
>
> def cross_sum7(n):
> s=str(n)
> return sum(map(ord, s)) - ord("0")*len(s)
>
> So why use itertools in this case?
I always use an iterator version unless a full listing is essential.
Likewise, I usually prefer xrange() to range(). sum() and imap() work
especially well together -- the memory consumption is constant rather
than linear as with map(). For small inputs, the performance difference
is negligible. For bigger inputs, building a long list and throwing it
away after one use is not only wasteful of memory, but it kills cache
performance once the memory requirements grow beyond the cache size.
IOW, the habit of using iterators results in apps that scale-up nicely.
Aside from memory consumption and cache utilization, map() is also at a
disadvantage when its arguments are do not have a __len__() method. In
the cross_sum() example, the string "s" dutifully reports its length so
that map can pre-allocate storage. However, in other apps, the inputs
can be general iterables (generators for example) which only offer
__iter__() but not __len__(). In that case, map's performance drops-off
each time it has to do a memory reallocation as the sequence length
grows.
One other thought: pre-computing ord("0") into a local variable ought to
produce a measurable time savings for repeated calls to cross_sum(). If
you want to be an overachiever, change the function definition to:
def cross_sum(n, sum=sum, imap=itertools.imap, codezero=ord("0"),
len=len, str=str):
s = str(n)
return sum(imap(ord, s)) - len(s) * codezero
Bonus question: Explain whether imap() would be preferred to map() in
the following definition:
def dotproduct(v1, v2):
return sum(imap(operator.mul, v1, v2))
For grins, time it against the more specialized function:
def integer_dot_product(v1, v2, sum=sum, imap=imap, mul=int.__mul__):
return sum(imap(mul, v1, v2))
Try repeated calls with short vector tuples. Time it again with longer
iterator inputs like: integer_dot_product(xrange(1,1000000,2),
xrange(0,1000000,2)).
Raymond Hettinger
More information about the Tutor
mailing list