Steve Holden wrote:
> But it seems to me that the only major issue is the inability to provide 
> zero-byte terminators with this new representation.
  
I guess I wasn't clear in my description of the patch; sorry about that.

Like "lazy concatenation objects", "lazy slices" render when you call PyString_AsString() on them.  Before rendering, the lazy slice's ob_sval will be NULL. Afterwards it will point to a proper zero-terminated string, at which point the object behaves exactly like any other PyStringObject.
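In case it helps to picture the render-on-demand behaviour, here is a minimal sketch in plain C. It is not code from the patch: LazySlice, lazy_slice_new(), lazy_slice_as_string() and lazy_slice_free() are hypothetical stand-ins, and the struct is a simplified mock rather than the real PyStringObject layout. Only the ob_sval field name and the "NULL before rendering, zero-terminated afterwards" behaviour come from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock of a lazy slice; NOT the real PyStringObject layout. */
typedef struct {
    const char *source;  /* borrowed pointer to the parent string's bytes */
    size_t start;        /* offset of the slice within source */
    size_t length;       /* number of bytes in the slice */
    char *ob_sval;       /* NULL until the slice has been rendered */
} LazySlice;

/* Create an unrendered slice; no bytes are copied here. */
static LazySlice lazy_slice_new(const char *source, size_t start, size_t length)
{
    LazySlice s;
    assert(source != NULL);
    s.source = source;
    s.start = start;
    s.length = length;
    s.ob_sval = NULL;
    return s;
}

/* Plays the role of PyString_AsString() in this sketch: render on the
 * first call, then keep returning the same zero-terminated buffer.
 * Returns NULL if the allocation fails. */
static const char *lazy_slice_as_string(LazySlice *s)
{
    if (s->ob_sval == NULL) {
        char *buf = malloc(s->length + 1);
        if (buf == NULL)
            return NULL;
        memcpy(buf, s->source + s->start, s->length);
        buf[s->length] = '\0';  /* the terminator appears at render time */
        s->ob_sval = buf;
    }
    return s->ob_sval;
}

/* Release the rendered buffer, if any. */
static void lazy_slice_free(LazySlice *s)
{
    free(s->ob_sval);
    s->ob_sval = NULL;
}
```

With this mock, a slice of "hello world" starting at offset 6 with length 5 has a NULL ob_sval until lazy_slice_as_string() is called; after that it points at a private, zero-terminated copy of "world", and later calls return the same pointer.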

The only function that *might* return a non-terminated char * is PyString_AsUnterminatedString().  This function is static to stringobject.c--and I would be shocked if it were ever otherwise.

> If there were any reliable way to make sure these objects never got 
> passed to extension modules then I'd say "go for it".

If external Python extension modules are as well-behaved as the shipping Python source tree, there simply wouldn't be a problem.  Python source is delightfully consistent about using the macro PyString_AS_STRING() to get at the creamy char *center of a PyStringObject *.  When code religiously uses that macro (or calls PyString_AsString() directly), all it needs is a recompile with the current stringobject.h and it will Just Work.
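To make the difference between well-behaved and badly-behaved code concrete, here is a rough sketch in plain C. Everything in it is a mock: MockString, mock_render(), MOCK_AS_STRING(), count_char_good() and peeks_at_field() are hypothetical names, and the macro body is only my assumption about how an accessor macro could route through the rendering function. It is not copied from stringobject.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock object; NOT the real PyStringObject layout. */
typedef struct {
    const char *pending;  /* bytes that have not been rendered yet */
    size_t length;        /* number of bytes in pending */
    char *ob_sval;        /* NULL until rendered */
} MockString;

/* Render on first use; returns NULL if the allocation fails. */
static char *mock_render(MockString *op)
{
    assert(op != NULL);
    if (op->ob_sval == NULL) {
        op->ob_sval = malloc(op->length + 1);
        if (op->ob_sval != NULL) {
            memcpy(op->ob_sval, op->pending, op->length);
            op->ob_sval[op->length] = '\0';
        }
    }
    return op->ob_sval;
}

/* Assumed shape of an accessor macro that goes through the renderer. */
#define MOCK_AS_STRING(op) (mock_render(op))

/* Well-behaved code: only ever reaches the bytes through the macro,
 * so it works on rendered and unrendered objects alike. */
static size_t count_char_good(MockString *op, char c)
{
    const char *s = MOCK_AS_STRING(op);
    size_t n = 0;
    if (s == NULL)
        return 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}

/* Badly-behaved code: looks at the field directly, so on an
 * unrendered object it sees NULL instead of string data. */
static int peeks_at_field(const MockString *op)
{
    return op->ob_sval != NULL;
}
```

Code shaped like count_char_good() needs nothing more than a recompile against the new header. Code shaped like peeks_at_field() is the kind that would need fixing.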

I genuinely don't know how many external Python extension modules are well-behaved in this regard.  But in case it helps: I just checked PIL, NumPy, PyWin32, and SWIG, and all of them were well-behaved.

Apart from stringobject.c, there was exactly one spot in the Python source tree which made assumptions about the structure of PyStringObjects (Mac/Modules/macos.c).  It's in the block starting with the comment "This is a hack:".  Note that my patch does not fix this yet, so for now any code using that self-avowed "hack" will break.

Am I correct in understanding that changing the Python minor revision number (2.5 -> 2.6) requires external modules to recompile?  (It certainly does on Windows.)  If so, I could mitigate the problem by renaming ob_sval.  That way, code making explicit reference to it would fail to compile, which I feel is better than having unsafe code silently recompile and then break at run time.


Cheers,


larry