[Python-3000] find -> index patch

Nick Coghlan ncoghlan at gmail.com
Sun Aug 27 17:37:59 CEST 2006


Jack Diederich wrote:
> On Sat, Aug 26, 2006 at 07:51:03PM -0700, Guido van Rossum wrote:
>> On 8/26/06, Jack Diederich <jackdied at jackdied.com> wrote:
>>> After some benchmarking find() can't go away without really hurting readline()
>>> performance.
>> Can you elaborate? readline() is typically implemented in C so I'm not
>> sure I follow.
>>
> 
> A number of modules in Lib have readline() methods that currently use find().
> StringIO, httplib, tarfile, and others.
> 
> sprat:~/src/python-head/Lib# grep 'def readline' *.py | wc -l
> 30
> 
> Mainly I wanted to point out that find() solves a class of problems that
> can't be solved equally well with partition() (bad for large strings that
> want to preserve the separator) or index() (bad for large numbers of small
> strings and for frequent misses).  I wanted to reach the conclusion that 
> find() could be yanked out but as Fredrik opined it is still useful for a 
> subset of problems.
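
[Illustration of the trade-off described above: a minimal sketch, not code from any of the Lib modules mentioned. The three function names are hypothetical; only str.find(), str.index() and str.partition() are real. Each function pulls the first newline-terminated line out of a buffer and keeps the separator.]

```python
def first_line_find(buf, pos=0):
    """Return (line, next_pos) using find(); a miss is signalled by -1."""
    i = buf.find("\n", pos)
    if i < 0:
        return None, pos           # no complete line; nothing was copied
    return buf[pos:i + 1], i + 1   # one slice, separator kept


def first_line_index(buf, pos=0):
    """Same result using index(); a miss raises ValueError instead."""
    try:
        i = buf.index("\n", pos)
    except ValueError:             # exception handling runs on every miss
        return None, pos
    return buf[pos:i + 1], i + 1


def first_line_partition(buf):
    """Same line using partition(); head and tail are both new strings."""
    head, sep, tail = buf.partition("\n")
    if not sep:
        return None, buf
    return head + sep, tail        # re-attaching the separator copies again
```

[The find() and index() versions hand back a position, so the caller can keep scanning the same buffer; the partition() version hands back a copy of the remainder, which is the cost being objected to for large strings.]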

What about a version of partition that returned a 3-tuple of xrange objects 
indicating the indices of the partitions, instead of copies of the partitions? 
That would allow you to use the cleaner idiom without having to suffer the 
copying performance penalty.

Something like:

    line, newline, rest = s.partition_indices('\n', rest.start, rest.stop)
    if newline:
        yield s[line.start:newline.stop]
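
To make the idea concrete, here is a rough runnable sketch. It is only an assumption about how such a method might behave: partition_indices() does not exist on str, so it is written as a free function built on str.find(), it uses Python 3's range objects in place of xrange, and iter_lines() is a hypothetical caller added for illustration.

```python
def partition_indices(s, sep, start=0, stop=None):
    """Hypothetical helper: like str.partition(), but returns three range
    objects (head, separator, tail) of indices into s instead of copies."""
    if stop is None:
        stop = len(s)
    i = s.find(sep, start, stop)
    if i < 0:
        # Mirror str.partition() on a miss: all head, empty separator and tail.
        return range(start, stop), range(stop, stop), range(stop, stop)
    j = i + len(sep)
    return range(start, i), range(i, j), range(j, stop)


def iter_lines(s):
    """Yield each line of s with its newline, slicing s once per line."""
    rest = range(0, len(s))
    while True:
        line, newline, rest = partition_indices(s, "\n", rest.start, rest.stop)
        if not newline:            # an empty range is falsy: no separator found
            break
        yield s[line.start:newline.stop]
    if line:
        yield s[line.start:line.stop]   # trailing text with no newline
```

With this sketch, list(iter_lines("a\nb\n\nc")) gives ['a\n', 'b\n', '\n', 'c'], and the only string copies made are the yielded lines themselves.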

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

