[Python-Dev] Proof of the pudding: str.partition()

Josiah Carlson jcarlson at uci.edu
Thu Sep 1 00:55:51 CEST 2005


Steve Holden <steve at holdenweb.com> wrote:
> 
> Fredrik Lundh wrote:

> > the problem isn't the time it takes to unpack the return value, the problem is that
> > it takes time to create the substrings that you don't need.
> > 
> Indeed, and therefore the performance of rpartition is likely to get 
> worse as the length of the input string increases. I don't like to think 
> about all those strings being created just to be garbage-collected. Pity 
> the poor CPU ... :-)
> 
> > for some use cases, a naive partition-based solution is going to be a lot slower
> > than the old find+slice approach, no matter how you slice, index, or unpack the
> > return value.
> > 
> Yup. Then it gets down to statistical arguments about the distribution 
> of use cases and input lengths. If we had a type that represented a 
> substring of an existing string it might avoid the stress, but I'm not 
> sure I see that one flying.
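
To make the cost being discussed concrete, here is a minimal sketch. It assumes str.rpartition() behaves as it later shipped (returning a head, separator, tail triple); the two helper names are hypothetical and exist only for illustration. Both return the text after the last separator, but the partition version also builds a head string that is thrown away immediately.

```python
def tail_with_rpartition(s, sep):
    # rpartition builds three strings (head, separator, tail) even though
    # only the tail is wanted; the head copy grows with the input length.
    head, found, tail = s.rpartition(sep)
    return tail if found else None


def tail_with_rfind(s, sep):
    # rfind returns just an index, so the single slice below is the only
    # new string that gets created.
    i = s.rfind(sep)
    return s[i + len(sep):] if i >= 0 else None


path = "usr/local/lib/python/site-packages"
print(tail_with_rpartition(path, "/"))
print(tail_with_rfind(path, "/"))
```

Which version wins in practice depends on input lengths and how often the head is actually needed, which is the statistical argument mentioned above.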

What about buffer()?  Tack on some string methods and you get a string
slice-like instance with very low memory requirements.  Add on actual
comparisons of buffers and strings, and you can get nearly everything
desired with very low memory overhead.
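
A rough illustration of the low-memory slice idea. This is only a sketch: it uses memoryview, the later successor to buffer(), as a stand-in, and the helper name tail_after is hypothetical. The search still runs on the original object, but the returned "substring" is a view that copies nothing.

```python
def tail_after(data, sep):
    """Return a zero-copy view of everything after the last `sep`, or None."""
    i = data.rfind(sep)
    if i < 0:
        return None
    # Slicing a memoryview yields another view onto the same memory.
    return memoryview(data)[i + len(sep):]


payload = b"x" * 1_000_000 + b"|" + b"tail"
view = tail_after(payload, b"|")

print(view.nbytes)          # size of the viewed region only
print(view == b"tail")      # views compare by content against bytes
print(view.obj is payload)  # the view still refers to the original object
```

The view keeps the whole underlying object alive for as long as the view exists, so this trades copying cost for a longer lifetime of the source buffer.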

A bit of free thought brings me to the (half-baked) idea that if string
methods accepted any object which conformed to the buffer interface,
then mmap, buffer, array, ... instances could gain all of the really
convenient methods that make strings the objects to use in many cases.

If one wanted string methods on strings to keep returning strings, while
string methods on other objects supporting the buffer protocol returned
buffer objects, that seems reasonable (and probably a good idea).
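
A half-baked sketch of what that split might look like. Everything here is an assumption for illustration: the BufferView class is hypothetical, it leans on memoryview rather than buffer(), and its find() is a deliberately naive scan. The point is only that string-style methods on a buffer-protocol object can hand back views instead of new strings.

```python
from array import array


class BufferView:
    """Hypothetical wrapper giving string-style methods to any buffer object."""

    def __init__(self, obj):
        # Treat whatever supports the buffer protocol as a run of bytes.
        self._view = memoryview(obj).cast("B")

    def __len__(self):
        return len(self._view)

    def __eq__(self, other):
        if isinstance(other, BufferView):
            other = other._view
        return self._view == other

    def tobytes(self):
        return self._view.tobytes()

    def find(self, sep):
        # Naive scan; each comparison looks at a small view, not a copy.
        n, m = len(self._view), len(sep)
        for i in range(n - m + 1):
            if self._view[i:i + m] == sep:
                return i
        return -1

    def partition(self, sep):
        if not sep:
            raise ValueError("empty separator")
        i = self.find(sep)
        if i < 0:
            return self, BufferView(b""), BufferView(b"")
        j = i + len(sep)
        # All three parts are views onto the same underlying memory.
        return (BufferView(self._view[:i]),
                BufferView(self._view[i:j]),
                BufferView(self._view[j:]))


head, sep, tail = BufferView(array("B", b"key=value")).partition(b"=")
print(head.tobytes(), sep.tobytes(), tail.tobytes())
```

Plain strings would keep their existing methods and keep returning strings; only objects wrapped this way would get views back.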

 - Josiah

P.S. Pardon me if the idea is pure insanity, I haven't been getting much
sleep lately, and just got up from a nap that seems to have clouded my
judgement (I just put milk in my juice...).


