[Python-3000] C API cleanup str

Walter Dörwald walter at livinglogic.de
Mon Aug 6 11:14:21 CEST 2007


Martin v. Löwis wrote:

>>> I now tried, and it turned out that bytes.__reduce__ would break
>>> (again); I fixed it and changed it in r56755.
>>>
>>> It turned out that PyUnicode_FromString was even documented to
>>> accept latin-1.
>> Yes, that seemed to me to be the most obvious interpretation.
> 
> Unfortunately, this made creating and retrieving asymmetric:
> when you do PyUnicode_AsString, you'll get a UTF-8 string; when
> you do PyUnicode_FromString, you had to pass Latin-1. Making
> AsString also return Latin-1 would, of course, restrict the number of
> cases where it works.

True, UTF-8 seems to be the better choice. However, all spots in the C
source that call PyUnicode_FromString() pass only ASCII anyway, which
will probably be the most common case.
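
A rough sketch of the asymmetry, using Python's own codecs as a stand-in
for the two C functions (the variable names are purely illustrative and
not part of the C API): text retrieved as UTF-8 and fed back in as
Latin-1 only survives the round trip if it is pure ASCII.

```python
# Illustration only: str.encode/bytes.decode stand in for the C calls.
text = "Dörwald"

# An AsString-style call hands back UTF-8 bytes ...
utf8_bytes = text.encode("utf-8")
# ... and a FromString-style call that assumes Latin-1 misreads them.
misread = utf8_bytes.decode("latin-1")

# Pure ASCII is encoded identically in both, so it round-trips unchanged.
ascii_text = "Walter"
ascii_round_trip = ascii_text.encode("utf-8").decode("latin-1")

print(misread == text)                  # False
print(ascii_round_trip == ascii_text)   # True
```

With UTF-8 on both sides the first comparison would come out equal as
well, which is the symmetry argued for above.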

>>> While I was looking at it, I wondered why PyUnicode_FromStringAndSize
>>> allows a NULL first argument, creating a null-initialized Unicode
>>> object.
>> Because that's what PyString_FromStringAndSize() does.
> 
> I guessed that was the historic reason; I just wondered whether the
> rationale for having it in PyString_FromStringAndSize still applies
> to Unicode.
> 
>> So should NULL support be dropped from PyUnicode_FromStringAndSize()?
> 
> That's my proposal, yes.

At least this would give a clear error message in case someone passes NULL.

Servus,
   Walter
