[Python-Dev] String views

Jim Jewett jimjjewett at gmail.com
Fri Sep 2 01:33:36 CEST 2005

Tim Delaney writes:
> One of the big disadvantages of string views is that they need to keep
> the original object around, no matter how big it is. But in the case of
> partition, much of the time the original string survives for at least a
> similar period to the partitions.

Michael Chermside writes:
> Didn't several of Raymond's examples use the idiom:

>    part_1, _, s = s.partition(first_sep)
>    part_2, _, s = s.partition(second_sep)
>    part_3, _, s = s.partition(third_sep)

Yes, but in those cases the entire original string was generally
still in use by one part_# or another, so there wasn't really any
wasted space.  The problem only shows up when a single 5-byte
string keeps a 10K buffer alive.  If that same buffer supports
2000 such strings, then everything is fine.
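
To make the 5-byte / 10K case concrete, here is a rough sketch that
uses today's memoryview as a stand-in for the proposed string view.
That is an assumption for illustration only; nothing like it exists
for str, and the names buf, view, kept_alive and break_even are made up.

```python
# Stand-in for the proposed string view: memoryview slices share memory
# with the object they were taken from (an analogy, not the proposal).
buf = bytes(10_000)            # the 10K buffer
view = memoryview(buf)[:5]     # a 5-byte view; no copy is made
del buf                        # drop our own name for the buffer

kept_alive = len(view.obj)     # view.obj is still the original 10K object
break_even = kept_alive // len(view)   # views of this size needed to use it all
```

With only the one view alive, nearly all of kept_alive is dead weight;
it takes break_even views of that size before the buffer is fully used.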

Skip writes:
> I'm skeptical about performance as well, but not for that reason.  A string
> object can have a referent field.  If not NULL, it refers to another string
> object which is INCREFed in the usual way.  At string deallocation, if the
> referent is not NULL, the referent is DECREFed.  If the referent is NULL,
> ob_sval is freed.

Michael Chermside writes:
> Won't work. A string may have multiple referents, so a single referent
> field isn't sufficient.

I think you're looking at it backwards.  A string would hold a reference
to a separate buffer (a series of characters) instead of embedding ob_sval,
just as a dictionary's ma_table can point to a separately allocated table
instead of its inline ma_smalltable.
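
A minimal sketch of that layout, in Python rather than C: each view
holds one reference to a shared buffer plus its own start and stop.
The StrView class and its attribute names are hypothetical, and this is
not how CPython strings are implemented.

```python
class StrView:
    """Hypothetical string view: a shared buffer plus start/stop offsets."""

    def __init__(self, buf, start=0, stop=None):
        self.buf = buf                  # reference to the shared characters
        self.start = start
        self.stop = len(buf) if stop is None else stop

    def __len__(self):
        return self.stop - self.start

    def __str__(self):
        # Characters are copied only when a real str is asked for.
        return self.buf[self.start:self.stop]

    def partition(self, sep):
        """Like str.partition, but all three pieces share self.buf."""
        if not sep:
            raise ValueError("empty separator")
        i = self.buf.find(sep, self.start, self.stop)
        if i < 0:                       # not found: (whole, empty, empty)
            i = j = self.stop
        else:
            j = i + len(sep)
        return (StrView(self.buf, self.start, i),
                StrView(self.buf, i, j),
                StrView(self.buf, j, self.stop))
```

Many views can point at one buffer, while each view has only the one
reference; the buffer goes away when the last view of it does.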

The catch (as Tim mentioned) is that the underlying series of characters 
might be much larger than *this* string needs.  If it isn't shared, then
the extra is wasted.

One way to deal with this might be to have the strings clean up when
they're used.  If the string's length multiplied by the number of
references to the buffer is much less than the size of the buffer, then the
string should make its own small copy.  Whether the complication is worth it,
I don't know.
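
The test itself could be sketched roughly as follows.  The function name
should_copy and the slack parameter are assumptions; slack is just one
way to put a number on "much less".

```python
def should_copy(view_len, buffer_refs, buffer_size, slack=4):
    """Return True if the view should make its own small copy.

    Estimates how much of the buffer is in use by assuming that every
    reference to the buffer covers about as many characters as this view.
    """
    estimated_use = view_len * buffer_refs
    # "Much less than" is taken to mean: smaller by a factor of slack.
    return estimated_use * slack < buffer_size
```

For the cases discussed above, should_copy(5, 1, 10_000) says to copy,
while should_copy(5, 2000, 10_000) says to keep sharing the buffer.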

