[Python-Dev] The "lazy strings" patch
Larry Hastings
larry at hastings.org
Mon Oct 23 16:58:25 CEST 2006
Steve Holden wrote:
> But it seems to me that the only major issue is the inability to provide
> zero-byte terminators with this new representation.
>
I guess I wasn't clear in my description of the patch; sorry about that.
Like "lazy concatenation objects", "lazy slices" render when you call
PyString_AsString() on them. Before rendering, the lazy slice's ob_sval
will be NULL. Afterwards it will point to a proper zero-terminated
string, at which point the object behaves exactly like any other
PyStringObject.
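To illustrate the render-on-demand idea, here is a minimal sketch in plain C. It is not the code from the patch: the LazySlice struct and the lazy_as_string() function are made-up stand-ins for a lazy slice and for PyString_AsString(), and the sval field only plays the role of ob_sval.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lazy slice; NOT the real PyStringObject layout. */
typedef struct {
    char *sval;          /* plays the role of ob_sval: NULL until rendered */
    const char *source;  /* buffer the slice refers to (not owned here) */
    size_t start;        /* offset of the slice within source */
    size_t length;       /* number of characters in the slice */
} LazySlice;

/* Plays the role of PyString_AsString(): render on first use, then
   keep returning the same zero-terminated buffer. */
static const char *lazy_as_string(LazySlice *s)
{
    if (s->sval == NULL) {
        char *buf = malloc(s->length + 1);
        if (buf == NULL)
            return NULL;                 /* out of memory: leave unrendered */
        memcpy(buf, s->source + s->start, s->length);
        buf[s->length] = '\0';           /* the terminator appears at render time */
        s->sval = buf;
    }
    return s->sval;
}
```

In this sketch, a caller that reads sval directly before the first lazy_as_string() call sees NULL, while a caller that goes through the function always gets a terminated string.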
The only function that *might* return a non-terminated char * is
PyString_AsUnterminatedString(). This function is static to
stringobject.c--and I would be shocked if it were ever otherwise.
> If there were any reliable way to make sure these objects never got
> passed to extension modules then I'd say "go for it".
If external Python extension modules are as well-behaved as the shipping
Python source tree, there simply wouldn't be a problem. Python source
is delightfully consistent about using the macro PyString_AS_STRING() to
get at the creamy char *center of a PyStringObject *. When code
religiously uses that macro (or calls PyString_AsString() directly), all
it needs is a recompile with the current stringobject.h and it will Just
Work.
I genuinely don't know how many external Python extension modules are
well-behaved in this regard. But in case it helps: I just checked PIL,
NumPy, PyWin32, and SWIG, and all of them were well-behaved.
Apart from stringobject.c, there was exactly one spot in the Python
source tree which made assumptions about the structure of
PyStringObjects (Mac/Modules/macos.c). It's in the block starting with
the comment "This is a hack:". Note that my patch does not fix this yet,
so for now any code that relies on that self-avowed "hack" will break.
Am I correct in understanding that changing the Python minor revision
number (2.5 -> 2.6) requires external modules to recompile? (It
certainly does on Windows.) If so, I could mitigate the problem by
renaming ob_sval. That way, code making explicit reference to it would
fail to compile, which I feel is better than silently recompiling unsafe
code.
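A rough sketch of what the rename would buy, again in plain C with made-up names: MiniString, MINI_AS_STRING(), mini_length() and the field name ob_sval_lazy are all hypothetical and only mimic the shape of PyStringObject and PyString_AS_STRING().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature string object; the real PyStringObject differs. */
typedef struct {
    size_t size;
    char ob_sval_lazy[16];   /* hypothetical new name for the old ob_sval field */
} MiniString;

/* Accessor macro in the spirit of PyString_AS_STRING(); this is the only
   place that has to change when the field is renamed. */
#define MINI_AS_STRING(op) ((op)->ob_sval_lazy)

/* Well-behaved caller: goes through the macro, so it recompiles unchanged. */
static size_t mini_length(MiniString *s)
{
    return strlen(MINI_AS_STRING(s));
}

/* Ill-behaved caller, left commented out because it no longer compiles:
 *     return strlen(s->ob_sval);
 * The compiler rejects it, since MiniString has no member named ob_sval.
 */
```

Code written against the macro keeps working after a recompile, and code that names the old field stops at compile time instead of misbehaving at run time.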
Cheers,
/larry/