[Python-Dev] just say no...

Fred L. Drake, Jr. fdrake@acm.org
Fri, 12 Nov 1999 11:28:37 -0500 (EST)


M.-A. Lemburg writes:
 > It's been in the proposal since version 0.1. The idea is to
 > provide a decent way of making existing scripts Unicode-aware.

  Ok, so I haven't read closely enough.

 > This is what I intended to implement. The <defencbuf> buffer
 > will be filled upon the first request to the UTF-8 encoding.
 > "s" and "s#" are examples of such requests. The buffer will
 > remain intact until the object is destroyed (since other code
 > could store the pointer received via e.g. "s").

  Right.
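
  For anyone skimming the thread, here is a rough Python model of the
caching behaviour described above.  It is only a sketch: the class
name, the as_utf8() method and the encode_calls counter are made up
for illustration, and the real proposal concerns a C-level buffer
handed out through "s"/"s#", not a Python method.

```python
class LazyUnicode:
    """Toy model of a Unicode object with a lazily built UTF-8 buffer."""

    def __init__(self, text):
        self.text = text
        # Hypothetical stand-in for the <defencbuf> slot; empty until needed.
        self._defencbuf = None
        # Counts how often the UTF-8 form is actually built (illustration only).
        self.encode_calls = 0

    def as_utf8(self):
        # Models an "s"/"s#" request: build the buffer on first use only.
        if self._defencbuf is None:
            self._defencbuf = self.text.encode("utf-8")
            self.encode_calls += 1
        # The same object is returned every time, so a caller that stored
        # the earlier result still sees valid data while self is alive.
        return self._defencbuf


u = LazyUnicode("h\u00e9llo")
first = u.as_utf8()
second = u.as_utf8()
```

  The buffer is built once and lives as long as the object does, which
is what makes it safe for other code to hang on to the pointer.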

 > Note that Unicode objects are a completely different beast ;-)
 > String objects are not touched in any way by the proposal.

  I wasn't suggesting the PyStringObject be changed, only that the
PyUnicodeObject could maintain a reference.  Consider:

        s = fp.read()
        u = unicode(s, 'utf-8')

u would now hold a reference to s, and "s"/"s#" would return a pointer
into s instead of rebuilding the UTF-8 form.  I talked myself out of
this because it would be too easy to keep a lot more string objects
around than were actually needed.
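
  A small sketch of the idea I dropped, and of why it worried me.  The
class and method names are hypothetical, modern bytes/str stand in for
the string and Unicode objects, and this is not part of the proposal.

```python
class SourceKeepingUnicode:
    """Toy model of a Unicode object that keeps its UTF-8 source alive."""

    def __init__(self, source_bytes):
        # Strong reference to the original string; this is the extra
        # object that stays around for as long as the Unicode object does.
        self.source = source_bytes
        self.text = source_bytes.decode("utf-8")

    def as_utf8(self):
        # Models an "s"/"s#" request: hand back the original data
        # instead of rebuilding the UTF-8 form.
        return self.source


s = "caf\u00e9".encode("utf-8")
u = SourceKeepingUnicode(s)
same_object = u.as_utf8() is s

# The downside: dropping our own name for a large input does not free it,
# because the Unicode object still refers to it.
big = b"x" * 1000000
kept = SourceKeepingUnicode(big)
del big
still_held = len(kept.source)
```

  No re-encoding is needed, but every such object pins its source
string in memory whether or not anyone ever asks for the UTF-8 form.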


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives