[Python-Dev] just say no...
Fred L. Drake, Jr.
fdrake@acm.org
Fri, 12 Nov 1999 11:28:37 -0500 (EST)
M.-A. Lemburg writes:
> It's been in the proposal since version 0.1. The idea is to
> provide a decent way of making existing scripts Unicode-aware.
Ok, so I haven't read closely enough.
> This is what I intended to implement. The <defencbuf> buffer
> will be filled upon the first request to the UTF-8 encoding.
> "s" and "s#" are examples of such requests. The buffer will
> remain intact until the object is destroyed (since other code
> could store the pointer received via e.g. "s").
Right.
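
A rough way to picture the fill-once behaviour described above is the toy
model below. It is plain Python, not the C implementation; the class name
LazyUtf8, the method as_utf8(), and the encode_calls counter are all
hypothetical names made up for illustration. Only the <defencbuf> idea comes
from the proposal.

```python
class LazyUtf8:
    """Toy stand-in for a Unicode object with a lazily filled <defencbuf>."""

    def __init__(self, text):
        self.text = text
        self._defencbuf = None   # not built until UTF-8 is first requested
        self.encode_calls = 0    # illustration only: counts real encodes

    def as_utf8(self):
        # Models an "s" / "s#" request: encode once, then hand out the
        # same buffer until the object itself goes away.
        if self._defencbuf is None:
            self._defencbuf = self.text.encode("utf-8")
            self.encode_calls += 1
        return self._defencbuf


u = LazyUtf8("caf\u00e9")
first = u.as_utf8()
second = u.as_utf8()
```

Both calls return the very same buffer object, which is what lets a caller
hold on to the pointer it received for as long as the Unicode object lives.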
> Note that Unicode objects are a completely different beast ;-)
> String objects are not touched in any way by the proposal.
I wasn't suggesting the PyStringObject be changed, only that the
PyUnicodeObject could maintain a reference. Consider:
    s = fp.read()
    u = unicode(s, 'utf-8')
Here u would hold a reference to s, and the "s"/"s#" format codes would
return a pointer into s instead of rebuilding the UTF-8 form. I talked
myself out of this because it would be too easy to keep many more string
objects alive than were actually needed.
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives