Split a string by length

Yermat loic at fejoz.net
Thu Mar 25 08:14:54 EST 2004


Peter Otten wrote:
> David McNab wrote:
> [...]
> 
> $ timeit.py -s"from re import findall" "findall('.{2}', 'aabbccddee')"
> 100000 loops, best of 3: 9.28 usec per loop
> $ timeit.py -s"from chunks import chunks" "list(chunks('aabbccddee', 2))"
> 100000 loops, best of 3: 7.7 usec per loop
> $ timeit.py -s"from chunks import chunklist" "chunklist('aabbccddee', 2)"
> 100000 loops, best of 3: 7.01 usec per loop
> 
> And finally the black sheep:
> 
> $ timeit.py -s"from chunks import chunksiter" "[''.join(ch) for ch in chunksiter('aabbccddee', 2)]"
> 10000 loops, best of 3: 27.4 usec per loop
> 
> To me the greatest difference seems that chunks() and chunklist/divide() can
> handle arbitrary sequence types while re.findall() is limited to strings.
> My attempt to tackle iterables is both slow and errorprone so far (you need
> to exhaust every chunk before you can safely advance to the next).
> 
> Peter
> 
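
The definitions of chunks, chunklist and chunksiter are not quoted in this message, so here is only a rough sketch of what such helpers might look like. The bodies are assumptions written for a current Python 3, not the code that was actually timed. It also illustrates the pitfall Peter mentions: each chunk from the iterator version has to be consumed before moving on to the next one.

```python
from itertools import chain, islice

def chunks(seq, size):
    """Lazily yield consecutive slices of seq, each at most size items long."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def chunklist(seq, size):
    """Eager variant: build the complete list of slices at once."""
    return [seq[start:start + size] for start in range(0, len(seq), size)]

def chunksiter(iterable, size):
    """Hypothetical variant for arbitrary iterables.

    Yields one lazy sub-iterator per chunk. Every chunk shares the same
    underlying iterator, so a chunk must be exhausted before the next
    one is requested, otherwise items end up in the wrong chunk.
    """
    it = iter(iterable)
    for first in it:
        # Pull one item to check that something is left, then chain it
        # back in front of the next size - 1 items.
        yield chain([first], islice(it, size - 1))

# Slicing works for any sequence type, not just strings.
pairs = chunklist([1, 2, 3, 4, 5], 2)

# The iterator version only needs iteration, but each chunk is joined
# (and therefore exhausted) before the loop advances.
joined = [''.join(ch) for ch in chunksiter(iter('aabbccddee'), 2)]
```

The slicing versions keep the type of the input (str in, str out; list in, list out), which is why they are not limited to strings the way re.findall() is.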

Take care with that kind of comparison...
In particular, look at the last two comparisons! The only difference is the
construction of the list...

So what? "Beautiful is better than ugly."
Make your choice ;-)

timeit.py -s"import re" "f = re.compile('.{2}')" "f.findall('aabbccddee')"
100000 loops, best of 3: 7.67 usec per loop

timeit.py -s"from chunk import chunks" "list(chunks('aabbccddee', 2))"
100000 loops, best of 3: 7.3 usec per loop

timeit.py -s"from chunk import chunklist" "chunklist('aabbccddee', 2)"
100000 loops, best of 3: 6.3 usec per loop

timeit.py -s"from chunk import chunksiter" "list(chunksiter('aabbccddee', 2))"
100000 loops, best of 3: 11.8 usec per loop

timeit.py -s"from chunk import chunksiter" "[ x for x in chunksiter('aabbccddee', 2)]"
100000 loops, best of 3: 15.2 usec per loop
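
The same kind of comparison can also be run from a script with the timeit module. This is only a sketch: the chunks function is a stand-in I am assuming for illustration (not the code measured above), and the loop counts are arbitrary. The two candidates consume the same generator and differ only in how the list is built, so any gap between them comes from list construction alone. No output is shown because the numbers depend on the machine and the Python version.

```python
import timeit

def chunks(seq, size):
    """Stand-in generator (an assumption): yield slices of at most size items."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

DATA = 'aabbccddee'

def via_list():
    # Let list() drain the generator.
    return list(chunks(DATA, 2))

def via_comprehension():
    # Drain the same generator with an explicit loop in a comprehension.
    return [x for x in chunks(DATA, 2)]

LOOPS = 10000  # arbitrary; raise it for steadier numbers

for name, func in (("list()", via_list), ("comprehension", via_comprehension)):
    # Take the best of three runs, as the timeit command line does.
    best = min(timeit.repeat(func, number=LOOPS, repeat=3))
    print("%-14s %.2f usec per loop" % (name, best / LOOPS * 1e6))
```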


Yermat



