[I18n-sig] Re: [Python-Dev] Pre-PEP: Python Character Model

Paul Prescod paulp@ActiveState.com
Wed, 07 Feb 2001 11:59:35 -0800


"Martin v. Loewis" wrote:
> 
> ...
> 
> So every s and s# conversion would trigger a copying of the
> string. How is that implemented? Currently, every Unicode object has a
> reference to a string object that is produced by converting to the
> default character set. Would it grow another reference to a string
> object that is carrying the Latin-1-conversion?

I'm not clear on the status of the concept of "default character set."
First, I think you mean "default character encoding". Second, I thought
that idea had been removed from the user's view at least, hadn't it? I was
thinking that we would use that slot to hold the char->ord->char
conversion (which you can interpret as Latin-1 or not depending on your
philosophy).
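
As a rough illustration of the char->ord->char mapping, here is a minimal
sketch in present-day Python 3 syntax (not the C API under discussion; both
helper names are hypothetical). Each character is stored as one byte with the
same ordinal, which happens to coincide with Latin-1 encoding for code points
below 256.

```python
def chars_to_bytes(text):
    """Store each character as one byte with the same ordinal (hypothetical helper)."""
    # bytes() raises ValueError if any ordinal falls outside range(256).
    return bytes(ord(ch) for ch in text)


def bytes_to_chars(data):
    """Inverse mapping: each byte becomes the character with that ordinal."""
    return "".join(chr(b) for b in data)


sample = "caf\u00e9"
raw = chars_to_bytes(sample)

# For code points below 256 the result matches Latin-1 encoding.
assert raw == sample.encode("latin-1")
# The round trip restores the original text.
assert bytes_to_chars(raw) == sample
```

Whether you call that result "Latin-1" or just "the ordinals" is the
philosophical question; the bytes are the same either way.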

> Certainly. Applications expect to write to the resulting memory, and
> expect to change the underlying string; this is valid only if one had
> been passing NULL to PyString_FromStringAndSize.

The documentation says that the PyString_AsString and PyString_AS_STRING
buffers must never be modified. I forgot that the "real" protocol is
that the buffer can be modified. We'll need to copy its contents back
to the Unicode string before the next operation that uses the Unicode
value. Not rocket science but somewhat tedious.
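
The copy-back step could look roughly like the sketch below. It is written in
present-day Python 3 rather than C, and the CachedText class and its method
names are hypothetical stand-ins for a Unicode object with a cached byte
buffer, not an existing API. The point is only the ordering: any operation
that reads the Unicode value first syncs from the buffer.

```python
class CachedText:
    """Hypothetical stand-in for a Unicode object with a cached byte buffer."""

    def __init__(self, text):
        self._text = text
        self._buffer = None  # mutable buffer handed out to callers, created lazily

    def as_buffer(self):
        # Build the one-byte-per-character buffer on first request.
        # Callers are allowed to modify it in place.
        if self._buffer is None:
            self._buffer = bytearray(ord(ch) for ch in self._text)
        return self._buffer

    def _sync(self):
        # Copy whatever the caller wrote into the buffer back into the text.
        if self._buffer is not None:
            self._text = "".join(chr(b) for b in self._buffer)

    def value(self):
        # Every operation that uses the Unicode value syncs first.
        self._sync()
        return self._text


t = CachedText("abc")
buf = t.as_buffer()
buf[0] = ord("X")          # caller writes through the buffer
assert t.value() == "Xbc"  # the change is visible on the next read
```

This sketch syncs unconditionally on every read; a real implementation would
presumably want some way to avoid the copy when nothing was written.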

 Paul Prescod