Split a string by length
Yermat
loic at fejoz.net
Thu Mar 25 08:14:54 EST 2004
Peter Otten wrote:
> David McNab wrote:
> [...]
>
> $ timeit.py -s"from re import findall" "findall('.{2}', 'aabbccddee')"
> 100000 loops, best of 3: 9.28 usec per loop
> $ timeit.py -s"from chunks import chunks" "list(chunks('aabbccddee', 2))"
> 100000 loops, best of 3: 7.7 usec per loop
> $ timeit.py -s"from chunks import chunklist" "chunklist('aabbccddee', 2)"
> 100000 loops, best of 3: 7.01 usec per loop
>
> And finally the black sheep:
>
> $ timeit.py -s"from chunks import chunksiter" "[''.join(ch) for ch in chunksiter('aabbccddee', 2)]"
> 10000 loops, best of 3: 27.4 usec per loop
>
> To me the greatest difference seems to be that chunks() and chunklist/divide()
> can handle arbitrary sequence types while re.findall() is limited to strings.
> My attempt to tackle iterables is both slow and error-prone so far (you need
> to exhaust every chunk before you can safely advance to the next).
>
> Peter
>
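The chunks module itself is not shown in this message. As a rough sketch only, assuming slicing-based implementations (the function bodies below are guesses for illustration, not Peter's actual code), chunks() and chunklist() might look like this. It also shows why they work on any sliceable sequence, while re.findall() only accepts strings and silently drops a trailing partial chunk:

```python
import re


def chunks(seq, size):
    """Yield successive slices of `seq`, each at most `size` items long.

    Hypothetical implementation; the real chunks module is not shown here.
    """
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def chunklist(seq, size):
    """Return the same slices as chunks(), but built eagerly as a list."""
    return [seq[start:start + size] for start in range(0, len(seq), size)]


if __name__ == "__main__":
    # Works on strings...
    print(list(chunks("aabbccddee", 2)))
    # ...and on any other sliceable sequence; the short tail is kept.
    print(chunklist([1, 2, 3, 4, 5], 2))
    # The regex variant is string-only and drops an incomplete last chunk.
    print(re.findall(".{2}", "aabbc"))
```

Note that '.' does not match a newline by default, so the regex version would also behave differently on multi-line strings.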
Take care with that kind of comparison...
In particular, look at the last two timings below: the only difference
between them is how the list is constructed.
So what? "Beautiful is better than ugly".
Make your choice ;-)
timeit.py -s"import re" "f = re.compile('.{2}')" "f.findall('aabbccddee')"
100000 loops, best of 3: 7.67 usec per loop
timeit.py -s"from chunk import chunks" "list(chunks('aabbccddee', 2))"
100000 loops, best of 3: 7.3 usec per loop
timeit.py -s"from chunk import chunklist" "chunklist('aabbccddee', 2)"
100000 loops, best of 3: 6.3 usec per loop
timeit.py -s"from chunk import chunksiter" "list(chunksiter('aabbccddee', 2))"
100000 loops, best of 3: 11.8 usec per loop
timeit.py -s"from chunk import chunksiter" "[ x for x in chunksiter('aabbccddee', 2)]"
100000 loops, best of 3: 15.2 usec per loop
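The same kind of measurement can be run from inside Python with the standard timeit module. The sketch below is only an illustration: demo_chunks and time_statement are hypothetical names, and demo_chunks is a stand-in generator, not the chunksiter timed above. It compares the two ways of building the list, list(...) versus a list comprehension, on an otherwise identical generator:

```python
import timeit


def demo_chunks(seq, size):
    """Hypothetical stand-in for the generator under test (not the original)."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def time_statement(stmt, number=100_000, repeat=3):
    """Return the best per-loop time in microseconds, as timeit.py reports it."""
    timer = timeit.Timer(stmt, globals={"demo_chunks": demo_chunks})
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    for stmt in (
        "list(demo_chunks('aabbccddee', 2))",
        "[x for x in demo_chunks('aabbccddee', 2)]",
    ):
        print(f"{stmt}: {time_statement(stmt):.2f} usec per loop")
```

Both statements produce the same list, so any gap between the two figures comes from how the list is built rather than from the chunking itself. Absolute numbers will depend on the machine and the Python version.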
Yermat