[Python-3000] find -> index patch
Jack Diederich
jack at psynchronous.com
Sun Aug 27 19:05:50 CEST 2006
On Mon, Aug 28, 2006 at 01:37:59AM +1000, Nick Coghlan wrote:
> Jack Diederich wrote:
> > On Sat, Aug 26, 2006 at 07:51:03PM -0700, Guido van Rossum wrote:
> >> On 8/26/06, Jack Diederich <jackdied at jackdied.com> wrote:
> >>> After some benchmarking find() can't go away without really hurting readline()
> >>> performance.
> >> Can you elaborate? readline() is typically implemented in C so I'm not
> >> sure I follow.
> >>
> >
> > A number of modules in Lib have readline() methods that currently use find().
> > StringIO, httplib, tarfile, and others
> >
> > sprat:~/src/python-head/Lib# grep 'def readline' *.py | wc -l
> > 30
> >
> > Mainly I wanted to point out that find() solves a class of problems that
> > can't be solved equally well with partition() (bad for large strings that
> > want to preserve the separator) or index() (bad for large numbers of small
> > strings and for frequent misses). I wanted to reach the conclusion that
> > find() could be yanked out but as Fredrik opined it is still useful for a
> > subset of problems.
>
> What about a version of partition that returned a 3-tuple of xrange objects
> indicating the indices of the partitions, instead of copies of the partitions?
> That would allow you to use the cleaner idiom without having to suffer the
> copying performance penalty.
>
> Something like:
>
> line, newline, rest = s.partition_indices('\n', rest.start, rest.stop)
> if newline:
>     yield s[line.start:newline.stop]
>
What is with the sudden rush to solve all problems by using slice objects?
I've never used a slice object and I don't care to start now. The above code
reads just fine as
i = s.find('\n', start, stop)
if i >= 0:
    yield s[:i]
-Jack
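
For readers comparing the two idioms discussed in this thread, here is a minimal sketch of a find()-based line scanner next to a partition()-based one. It is written in present-day Python 3 syntax, and the function names iter_lines_find and iter_lines_partition are hypothetical, chosen only for illustration. It also assumes the caller wants each line returned with its newline, so it slices s[start:i + 1] rather than s[:i].

```python
def iter_lines_find(s, start=0, stop=None):
    """Yield each line of s[start:stop], keeping the trailing newline.

    Hypothetical helper: scans with find(), so the only copies made
    are the yielded lines themselves.
    """
    if stop is None:
        stop = len(s)
    while start < stop:
        i = s.find('\n', start, stop)
        if i < 0:
            # No newline left: yield the unterminated tail and finish.
            yield s[start:stop]
            return
        yield s[start:i + 1]
        start = i + 1


def iter_lines_partition(s):
    """Yield the same lines using partition().

    Hypothetical helper: each call to partition() builds a new string
    for the remainder, which is the copying cost raised in the thread.
    """
    rest = s
    while rest:
        line, newline, rest = rest.partition('\n')
        yield line + newline
```

Both generators produce the same lines. The find() version tracks an integer offset into the original string, while the partition() version re-copies the unread remainder on every iteration, which matters for large inputs.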