[Python-3000] find -> index patch

Jack Diederich jack at psynchronous.com
Sun Aug 27 19:05:50 CEST 2006


On Mon, Aug 28, 2006 at 01:37:59AM +1000, Nick Coghlan wrote:
> Jack Diederich wrote:
> > On Sat, Aug 26, 2006 at 07:51:03PM -0700, Guido van Rossum wrote:
> >> On 8/26/06, Jack Diederich <jackdied at jackdied.com> wrote:
> >>> After some benchmarking find() can't go away without really hurting readline()
> >>> performance.
> >> Can you elaborate? readline() is typically implemented in C so I'm not
> >> sure I follow.
> >>
> > 
> > A number of modules in Lib have readline() methods that currently use find().
> > StringIO, httplib, tarfile, and others
> > 
> > sprat:~/src/python-head/Lib# grep 'def readline' *.py | wc -l
> > 30
> > 
> > Mainly I wanted to point out that find() solves a class of problems that
> > can't be solved equally well with partition() (bad for large strings that
> > want to preserve the separator) or index() (bad for large numbers of small
> > strings and for frequent misses).  I wanted to reach the conclusion that 
> > find() could be yanked out but as Fredrik opined it is still useful for a 
> > subset of problems.
> 
> What about a version of partition that returned a 3-tuple of xrange objects 
> indicating the indices of the partitions, instead of copies of the partitions? 
> That would allow you to use the cleaner idiom without having to suffer the 
> copying performance penalty.
> 
> Something like:
> 
>     line, newline, rest = s.partition_indices('\n', rest.start, rest.stop)
>     if newline:
>         yield s[line.start:newline.stop]
> 

What is with the sudden rush to solve all problems by using slice objects?
I've never used a slice object and I don't care to start now.  The above code
reads just fine as

i = s.find('\n', start, stop)
if i >= 0:
    yield s[start:i + 1]  # same slice as s[line.start:newline.stop] above
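
For illustration, here is a minimal sketch of that find() idiom written out as
a complete generator. The name iter_lines and its start/stop defaults are
assumptions made for the example; this is not code taken from StringIO,
httplib, tarfile, or any other Lib module.

```python
def iter_lines(s, start=0, stop=None):
    """Yield each line of s[start:stop], keeping the trailing newline."""
    if stop is None:
        stop = len(s)
    while start < stop:
        # find() returns -1 on a miss, so no exception handling is needed
        i = s.find('\n', start, stop)
        if i < 0:
            break
        # Only the line being returned is copied; the rest of s is untouched
        yield s[start:i + 1]
        start = i + 1
    if start < stop:
        # Trailing text that has no final newline
        yield s[start:stop]
```

Each pass copies only the line it yields, which is the property that matters
for large strings, and a miss costs one integer comparison rather than a
raised and caught exception.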

-Jack

