[Python-3000] Draft PEP for New IO system
Walter Dörwald
walter at livinglogic.de
Tue Feb 27 21:39:48 CET 2007
Guido van Rossum wrote:
> The encoding/decoding behavior should be no different from that of the
> encode() and decode() methods on unicode strings and byte arrays.
Except that it must work in incremental mode. The new (in 2.5)
incremental codecs should be usable for that.
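
To illustrate what "incremental mode" means here, a minimal sketch using
the incremental decoder API from the codecs module, written in Python 3
syntax. The sample string and the chunk boundary are made up for the
example; the point is only that a multi-byte sequence split across two
reads must be buffered rather than rejected.

```python
import codecs

# "ö" is two bytes in UTF-8; split the data in the middle of that sequence.
data = "Dörwald".encode("utf-8")   # b"D\xc3\xb6rwald"
chunks = [data[:2], data[2:]]      # b"D\xc3" and b"\xb6rwald"

# A plain bytes.decode() on the first chunk would fail; the incremental
# decoder keeps the dangling b"\xc3" in its buffer until the next call.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(chunk) for chunk in chunks]

# final=True flushes the decoder and reports any leftover partial sequence.
pieces.append(decoder.decode(b"", final=True))
result = "".join(pieces)
```

A text layer on top of a byte stream would presumably do the same thing
for each block it reads from the underlying stream.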
> Certainly no normalization of diacritics will be done; surrogate
> handling depends on the encoding and whether the unicode string
> implementation uses 16 or 32 bits per character.
>
> I agree that we need to be able to specify the error handling as well.
Should it be possible to change the error handling during the lifetime
of a stream? Then this change would have to be passed through to the
underlying codec.
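
At the codec level this is already possible: incremental decoder objects
keep the error handling name in an "errors" attribute that can be
reassigned. A small sketch in Python 3 syntax, with an invented input;
how a stream class would expose this is left open here.

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")

# In strict mode an invalid byte raises UnicodeDecodeError.
try:
    decoder.decode(b"\xff")
    raised = False
except UnicodeDecodeError:
    raised = True

# Switch the error handling on the same decoder object.
decoder.reset()
decoder.errors = "replace"

# The invalid byte is now replaced with U+FFFD instead of raising.
text = decoder.decode(b"a\xffb", final=True)
```

A stream that allows changing the error handling would only need to
forward the new value to the decoder (and encoder) it wraps.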
> UnicodeErrors may be raised.
Servus,
Walter
> On 2/27/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>> On 2/27/07, Adam Olsen <rhamph at gmail.com> wrote:
>>> On 2/26/07, Mike Verdone <mike.verdone at gmail.com> wrote:
>>>> Text I/O
>>>> ... operate on a per-character basis instead of a per-byte basis.
>>> "per-character" needs some clarification. I'm guessing this will only
>>> return entire code points, but the unicode type will expose them as
>>> code units, so it could be seen as both per-code-point and
>>> per-code-unit.
>> Does this just mean that you assume
>> (1) UTF-32
>> (2) surrogate pairs will show up as two characters
>> (3) diacritics may (or may not) show up separately from their base characters?
>>
>> This does suggest that error-correction should be specified (or at
>> least explicitly not specified). If the underlying input byte-stream
>> contains an invalid sequence, will the TextIO raise a
>> UnicodeDecodeError? Or will its error/replace/delete behavior be
>> settable?
>>
>> Does the Text class promise to catch things like an invalid
>> combination of surrogates?
>>
>> -jJ