
On Thu, 22 May 2008, Gary Herron wrote:
> In fact, a slice is *not* always a copy! In at least some (simple) cases,
> a slice references the original string:
>
> >>> s = 'abc'
> >>> t = s[:]
> >>> s is t
> True
> >>> id(s)
> 3081872000L
> >>> id(t)
> 3081872000L
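For anyone who wants to repeat the check above, here is a minimal sketch. It assumes CPython, where a full slice of a plain str hands back the same object; as far as I know that is an implementation detail rather than a language guarantee, so other interpreters may differ. The variable names are just for illustration.

```python
# Assumption: run on CPython. The language itself does not promise
# that a full slice returns the very same object.
s = 'abc'
t = s[:]    # full slice: CPython returns s itself for a plain str
u = s[1:]   # partial slice: a separate string object holding 'bc'

full_slice_is_same_object = t is s
partial_slice_is_same_object = u is s

print(full_slice_is_same_object, partial_slice_is_same_object)
```

Note that the partial slice is a distinct object with its own copy of the characters, which is the case the rest of this message is about.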
I think the more interesting case is where the string objects are not the same object but use (portions of) the same underlying array in memory. If I understand correctly, Python does not do this, and I thought I read something about why not, but I can't remember the details.

Sharing contents is an obvious optimization which in some circumstances can dramatically reduce the amount of copying that goes on, but without a reasonably clever algorithm to decide when to let the underlying storage go (copying any part still in use), extremely bad behaviour can result - imagine reading in lots of long lines, then keeping just a short piece of each one. By contrast, the worst that can happen with no sharing is that performance and memory use are what you expect - the only "bad" is the apparent missed opportunity for optimization.

I wonder if a "shared slice" object would be useful? That is, an object which behaves like a string obtained from a slicing operation except that it shares storage. It could have a .release method to go ahead and copy the underlying storage.

One complexity comes to mind immediately - what happens if one takes a shared slice of a shared slice? Presumably it shares the original string's storage, but if the first shared slice is .released what happens to the second shared slice? It would be nice if it shared with the first shared slice, but keeping track of everything could get tricky.

I'd be interested in pointers to any existing discussion on this issue.

Trivia - right now there are *no* Google hits for 'python shared slice', although there are lots for 'python shared slices'. They don't appear to be talking about the same thing, however (without being exhaustive).

Isaac Morland                   CSCF Web Guru
DC 2554C, x36650                WWW Software Specialist
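To make the "shared slice" idea above a little more concrete, here is a rough sketch of the bookkeeping only. Everything in it is hypothetical: the class name SharedSlice, its attributes, and the choice that a slice of a slice refers straight back to the original string are my assumptions, not an existing Python API. Since pure Python cannot reach into a str's buffer, "sharing" here just means holding a reference to the original string plus offsets, and str() on the object still copies.

```python
class SharedSlice:
    """Hypothetical read-only slice of a str that defers copying.

    Illustration only: it keeps a reference to the base string and a
    pair of offsets, and copies the visible part when release() is called.
    """

    def __init__(self, base, start=0, stop=None):
        # Normalise the bounds the same way ordinary slicing does.
        start, stop, _ = slice(start, stop).indices(len(base))
        self._base = base
        self._start = start
        self._stop = max(start, stop)

    def __len__(self):
        return self._stop - self._start

    def __str__(self):
        # Materialises an ordinary (copied) string.
        return self._base[self._start:self._stop]

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                return str(self)[index]  # fall back to a plain copy
            # Assumed design choice: a slice of a slice points at the same
            # base string, not at this object, so the two are independent.
            return SharedSlice(self._base,
                               self._start + start,
                               self._start + stop)
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("SharedSlice index out of range")
        return self._base[self._start + index]

    def release(self):
        """Copy the visible part and drop the reference to the original."""
        copied = self._base[self._start:self._stop]
        self._base, self._start, self._stop = copied, 0, len(copied)


# Usage: keep a short piece of a long line, then take a slice of that.
line = "x" * 1000 + "KEY" + "y" * 1000
first = SharedSlice(line, 1000, 1010)
second = first[0:3]
first.release()   # first now holds only its own ten characters
print(str(first), str(second))
```

With this choice, releasing the first slice leaves the second untouched, but the second still keeps the whole original line alive until it is released too, which is exactly the bad case described above. Having the second slice share with the first instead would need the extra tracking mentioned in the message. For bytes data, the built-in memoryview does give slices that genuinely share storage.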