[Python-3000] string C API
Ronald Oussoren
ronaldoussoren at mac.com
Sat Sep 16 07:59:45 CEST 2006
On Sep 15, 2006, at 7:04 PM, Jim Jewett wrote:
> On 9/15/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Jim Jewett wrote:
>
>>>> ... would be necessary to at least *scan* the string when it
>>>> was first created in order to ensure it can be decoded without
>>>> errors
>
>>> What happens today with strings? I think the answer is:
>>> "Nothing.
>>> They print something odd when printed.
>>> They may raise errors when explicitly recoded to unicode."
>>> Why is this a problem?
>
>> We don't have 8-bit strings lying around in Py3k.
>
> Right. But we do in Py 2.x, and the equivalent delayed errors have
> not been a serious problem. I suppose that might change if everyone
> were actually using unicode, so that more stuff got converted
> eventually. On the other hand, I'm not sure how many strings will
> *ever* need recoding, if we don't do it on construction.
Automatic conversion from str to unicode in Py2.x is annoying at
times, mostly because it is easy to miss at development time. Using
unicode throughout (explicit conversion to unicode at the application
boundary) solves that, but the problem would reappear if
unicode(somestr, someencoding) returned a value that might raise a
UnicodeError when you try to access its value.
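To make the difference concrete, here is a minimal sketch of the two behaviours, written in present-day Python 3 syntax rather than Py2.x. Both decode_at_boundary and LazyText are hypothetical names made up for this illustration; LazyText is only a stand-in for a string that defers decoding, not anything proposed in this thread.

```python
def decode_at_boundary(raw, encoding="utf-8"):
    """Eagerly decode bytes; invalid input fails here, at the boundary."""
    return raw.decode(encoding)


class LazyText:
    """Hypothetical lazily decoded string: keeps the bytes, decodes on use."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw
        self._encoding = encoding

    def __str__(self):
        # A decoding error surfaces only here, possibly far from the
        # code that read the data in the first place.
        return self._raw.decode(self._encoding)


bad = b"caf\xe9"  # Latin-1 encoded bytes, not valid UTF-8

# Eager decoding: the error is raised where the data enters the program.
try:
    decode_at_boundary(bad)
    eager_failed = False
except UnicodeDecodeError:
    eager_failed = True

# Lazy decoding: construction succeeds, the error waits for first access.
lazy = LazyText(bad)
try:
    str(lazy)
    lazy_failed_on_use = False
except UnicodeDecodeError:
    lazy_failed_on_use = True
```

With the eager version the failure is tied to the input code, which is where it is easiest to notice during development. With the lazy version the same bad bytes travel through the program and fail at whatever point first looks at the value.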
Another reason for disliking your idea is that unicode/py3k-str is a
sequence of unicode code points and should always behave like one to
the user. A polymorphic string type is an optimization (and an
unproven one at that) and shouldn't complicate the Python-level
string API.
Ronald