Abuse of subject, was Re: Abuse of Big Oh notation
Peter Otten
__peter__ at web.de
Tue Aug 21 03:52:09 EDT 2012
wxjmfauth at gmail.com wrote:
> By chance and luckily, first attempt.
> c:\python32\python -m timeit "('€'*100+'€'*100).replace('€', 'œ')"
> 1000000 loops, best of 3: 1.48 usec per loop
> c:\python33\python -m timeit "('€'*100+'€'*100).replace('€', 'œ')"
> 100000 loops, best of 3: 7.62 usec per loop
OK, that is roughly a factor of 5. Let's see what I get:
$ python3.2 -m timeit '("€"*100+"€"*100).replace("€", "œ")'
100000 loops, best of 3: 1.8 usec per loop
$ python3.3 -m timeit '("€"*100+"€"*100).replace("€", "œ")'
10000 loops, best of 3: 9.11 usec per loop
That is a factor of 5, too. So I can replicate your measurement on an AMD64
Linux system with a self-built 3.3 versus the system 3.2.
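For anyone who would rather run the comparison from a script than from the shell, here is a minimal sketch using the standard timeit module. The helper name best_usec and the loop counts are my own choices for illustration, not anything from the measurements above. Run the same file under each interpreter and compare the printed numbers.

```python
import timeit

# The statement under test, as a string so timeit can compile it itself.
STMT = "('€'*100+'€'*100).replace('€', 'œ')"


def best_usec(stmt, number=100000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs.

    Hypothetical helper: it mimics the "best of 3" figure that
    `python -m timeit` reports, with a fixed loop count.
    """
    timings = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    print("%.3g usec per loop" % best_usec(STMT))
```

Absolute numbers will differ by machine and build; only the ratio between the two interpreters is of interest here.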
> Note
> The used characters are not members of the latin-1 coding
> scheme (btw an *unusable* coding).
> They are however characters in cp1252 and mac-roman.
You seem to imply that the slowdown is connected to the inability of latin-1
to encode "œ" and "€" (to take the examples relevant to the above
microbench). So let's repeat with latin-1 characters:
$ python3.2 -m timeit '("ä"*100+"ä"*100).replace("ä", "ß")'
100000 loops, best of 3: 1.76 usec per loop
$ python3.3 -m timeit '("ä"*100+"ä"*100).replace("ä", "ß")'
10000 loops, best of 3: 10.3 usec per loop
Hm, the slowdown is even a tad bigger (roughly a factor of 6). So we can
safely dismiss your theory that an unfortunate choice of the 8-bit encoding
is causing it. Do you agree?
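To double-check which of the four characters latin-1 can actually encode, a small sketch like the following should do. The helper name encodable is made up for illustration, and the codec names are the ones Python's codec registry uses.

```python
def encodable(ch, codec):
    """Return True if `ch` can be encoded with `codec` (hypothetical helper)."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    codecs_to_try = ("latin-1", "cp1252", "mac_roman")
    for ch in "€œäß":
        # Print code points instead of the characters themselves so the
        # output does not depend on the console encoding.
        ok = [name for name in codecs_to_try if encodable(ch, name)]
        print("U+%04X: %s" % (ord(ch), ", ".join(ok) or "none"))
```

Together with the timings above, this separates the two questions: whether a character fits into latin-1, and whether replace() got slower.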