[Python-Dev] The "lazy strings" patch
Paul Moore
p.f.moore at gmail.com
Mon Oct 23 17:42:35 CEST 2006
On 10/23/06, Larry Hastings <larry at hastings.org> wrote:
>
> Steve Holden wrote:
>
> > But it seems to me that the only major issue is the inability to provide
> > zero-byte terminators with this new representation.
>
> I guess I wasn't clear in my description of the patch; sorry about that.
>
> Like "lazy concatenation objects", "lazy slices" render when you call
> PyString_AsString() on them. Before rendering, the lazy slice's ob_sval
> will be NULL. Afterwards it will point to a proper zero-terminated string,
> at which point the object behaves exactly like any other PyStringObject.
I had picked up on this comment, and I have to say that I had been a
little surprised by the resistance to the change based on the "code
would break" argument, when you had made such a thorough attempt to
address this. Perhaps others had missed this point, though.
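For anyone who missed it, the render-on-demand behaviour described above can be modelled roughly in Python. This is only a toy sketch, not the patch's C code: the class and method names (LazySlice, as_string, is_rendered) are made up for illustration, and the _rendered attribute merely stands in for ob_sval being NULL until PyString_AsString() is called.

```python
class LazySlice:
    """Toy model of a lazy slice: remembers the parent string and the
    offsets, and copies the characters out only when first asked."""

    def __init__(self, parent, start, stop):
        self._parent = parent
        self._start = start
        self._stop = stop
        self._rendered = None  # stands in for ob_sval == NULL

    def is_rendered(self):
        return self._rendered is not None

    def as_string(self):
        # Stands in for PyString_AsString(): render on the first call,
        # then hand back the same rendered value on every later call.
        if self._rendered is None:
            self._rendered = self._parent[self._start:self._stop]
        return self._rendered


s = LazySlice("hello, world", 7, 12)
before = s.is_rendered()
text = s.as_string()
after = s.is_rendered()
```

In this model, code that only ever goes through as_string() never sees the unrendered state, which is the point being made about extensions that stick to PyString_AsString().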
> I genuinely don't know how many external Python extension modules are
> well-behaved in this regard. But in case it helps: I just checked PIL,
> NumPy, PyWin32, and SWIG, and all of them were well-behaved.
There's code out there which was written to the Python 1.4 API, and
has not been updated since (I know, I wrote some of it!). I wouldn't
call it "well-behaved" (it writes directly into the string's character
buffer), but I don't believe it would fail (it only uses
PyString_AsString to get the buffer address):
    /* Allocate a Python string object, with uninitialised contents. We
     * must do it this way, so that we can modify the string in place
     * later. See the Python source, Objects/stringobject.c for details.
     */
    result = PyString_FromStringAndSize(NULL, len);
    if (result == NULL)
        return NULL;

    /* Copy str into the new buffer, turning newlines into NUL bytes. */
    p = PyString_AsString(result);
    while (*str)
    {
        if (*str == '\n')
            *p = '\0';
        else
            *p = *str;
        ++p;
        ++str;
    }
> Am I correct in understanding that changing the Python minor revision
> number (2.5 -> 2.6) requires external modules to recompile? (It certainly
> does on Windows.) If so, I could mitigate the problem by renaming ob_sval.
> That way, code making explicit reference to it would fail to compile, which
> I feel is better than silently recompiling unsafe code.
I think you've covered pretty much all the possible backward
compatibility bases. A sufficiently evil extension could blow up, I
guess, but that's always going to be true.
OTOH, I don't have a comment on the desirability of the patch per se,
as (a) I've never been hit by the speed issue, and (b) I'm thoroughly
indoctrinated, so I always use ''.join() :-)
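For reference, the two styles being compared look roughly like this. It is a minimal sketch: the function names and sample data are made up for illustration, and it makes no claim about timings. Since strings are immutable, repeated += can end up copying the accumulated result on each step, whereas ''.join() builds the result in one go.

```python
def concat_with_plus(pieces):
    # The style the patch aims to speed up: build the result piece by piece.
    result = ""
    for piece in pieces:
        result += piece
    return result


def concat_with_join(pieces):
    # The traditional idiom: hand all the pieces to join() at once.
    return "".join(pieces)


pieces = ["spam", " ", "eggs", " ", "ham"]
plus_result = concat_with_plus(pieces)
join_result = concat_with_join(pieces)
```

Both produce the same string; the only difference under discussion is how much copying happens along the way.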
Paul.