[Python-3000] string C API

Sat Sep 16 07:59:45 CEST 2006

On Sep 15, 2006, at 7:04 PM, Jim Jewett wrote:

> On 9/15/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Jim Jewett wrote:
>
>>>> ... would be necessary to at least *scan* the string when it
>>>> was first created in order to ensure it can be decoded without  
>>>> errors
>
>>> What happens today with strings?  I think the answer is:
>>>     "Nothing.
>>>      They print something odd when printed.
>>>      They may raise errors when explicitly recoded to unicode."
>>> Why is this a problem?
>
>> We don't have 8-bit strings lying around in Py3k.
>
> Right.  But we do in Py 2.x, and the equivalent delayed errors have
> not been a serious problem.  I suppose that might change if everyone
> were actually using unicode, so that more stuff got converted
> eventually.  On the other hand, I'm not sure how many strings will
> *ever* need recoding, if we don't do it on construction.

Automatic conversion from str to unicode in Py2.x is annoying at
times, mostly because it is easy to miss at development time. Using
unicode throughout (explicit conversion to unicode at the application
boundary) solves that, but the problem would reappear if
unicode(somestr, someencoding) could return a value that raises a
UnicodeError when you try to access its value.
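
To make the difference concrete, here is a minimal sketch in Python 3
syntax. LazyText is a hypothetical class invented for illustration, not
an existing or proposed API; it only mimics a string whose decoding is
deferred until first access.

```python
# Illustrative sketch only. LazyText is hypothetical, not a real API.


def decode_at_boundary(raw: bytes, encoding: str) -> str:
    """Eager decoding: invalid input fails here, at the boundary."""
    return raw.decode(encoding)


class LazyText:
    """Hypothetical deferred decoding: construction never fails."""

    def __init__(self, raw: bytes, encoding: str) -> None:
        self._raw = raw
        self._encoding = encoding

    def value(self) -> str:
        # Any decoding error only surfaces here, on access.
        return self._raw.decode(self._encoding)


bad = b"caf\xe9"  # Latin-1 bytes, not valid UTF-8

# Eager: the error is raised where the data enters the program.
try:
    decode_at_boundary(bad, "utf-8")
    eager_failed_at_boundary = False
except UnicodeDecodeError:
    eager_failed_at_boundary = True

# Deferred: construction succeeds, the error shows up later.
lazy = LazyText(bad, "utf-8")
try:
    lazy.value()
    lazy_failed_on_access = False
except UnicodeDecodeError:
    lazy_failed_on_access = True
```

With eager decoding the failure is tied to the line that read the
input; with the deferred variant it can appear in unrelated code much
later, which is the easy-to-miss behaviour described above.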

Another reason for disliking your idea is that unicode/py3k-str is a  
sequence of unicode code points and should always behave like one to  
the user. A polymorphic string type is an optimization (and an  
unproven one at that) and shouldn't complicate the Python-level  
string API.
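
As a rough illustration of "always behaves like a sequence of code
points", the sketch below assumes Python 3.3 or later, where CPython
picks the internal storage width per string but the Python-level
behaviour stays the same. The sample text is made up for the example.

```python
# Assumes Python 3.3+; the sample string is arbitrary.
# It mixes ASCII, a Latin-1 range character, and a non-BMP character.
text = "na\u00efve \U0001F600"

# Length and indexing are in code points, whatever the storage is.
length = len(text)
third = text[2]
code_points = [ord(ch) for ch in text]
```

Nothing at this level reveals how the string is stored, which is the
property the Python-level API should keep.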

Ronald