cached encoding (Re: [Python-Dev] Internationalization Toolkit)
Fredrik Lundh
fredrik@pythonware.com
Wed, 10 Nov 1999 09:24:16 +0100
Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> One specific question: in your discussion of typed strings, I'm not
> sure why you couldn't convert everything to Unicode and be done with
> it. I have a feeling that the answer is somewhere in your case study
> -- maybe you can elaborate?
Marc-Andre writes:
> Unicode objects should have a pointer to a cached (read-only) char
> buffer <defencbuf> holding the object's value using the current
> <default encoding>. This is needed for performance and internal
> parsing (see below) reasons. The buffer is filled when the first
> conversion request to the <default encoding> is issued on the object.
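
A minimal sketch of the lazily filled cache the quoted proposal describes, written in Python rather than C for brevity. The class name `CachedText`, the method `encoded()`, and the `encode_calls` counter are hypothetical names chosen for illustration; only `<defencbuf>` and `<default encoding>` come from the proposal, and UTF-8 is an assumed default.

```python
class CachedText:
    """Hypothetical stand-in for a Unicode object with a cached encoded buffer."""

    def __init__(self, value, default_encoding="utf-8"):
        self._value = value                        # the Unicode value itself
        self._default_encoding = default_encoding  # assumed default; not fixed by the proposal
        self._defencbuf = None                     # cache, empty until first requested
        self.encode_calls = 0                      # illustration only: counts real conversions

    def encoded(self):
        """Return the value in the default encoding, converting at most once."""
        if self._defencbuf is None:
            self.encode_calls += 1
            # bytes objects are immutable, which matches the "read-only" buffer
            self._defencbuf = self._value.encode(self._default_encoding)
        return self._defencbuf


text = CachedText("smörgås")
first = text.encoded()   # fills the cache
second = text.encoded()  # served from the cache, no second conversion
```

The sketch also shows the cost under discussion: every object carries an extra slot tied to one particular encoding, whether or not the application ever asks for it.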
keeping track of an external encoding is better left
to the application programmers -- I'm pretty sure that
different application builders will want to handle this
in radically different ways, depending on their
environment, underlying user interface toolkit, etc.
besides, this is how Tcl would have done it. Python's
not Tcl, and I think you need *very* good arguments
for moving in that direction.
</F>