[Python-3000] Draft PEP for New IO system

Walter Dörwald walter at livinglogic.de
Tue Feb 27 21:39:48 CET 2007


Guido van Rossum wrote:

> The encoding/decoding behavior should be no different from that of the
> encode() and decode() methods on unicode strings and byte arrays.

Except that it must work in incremental mode. The new (in 2.5) 
incremental codecs should be usable for that.
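
As a rough sketch of what "incremental mode" means here (written in
Python 3 syntax rather than 2.5, and with made-up sample data), a
decoder obtained from codecs.getincrementaldecoder() keeps the
trailing bytes of an incomplete sequence between calls, whereas a
one-shot decode() of the first chunk would raise UnicodeDecodeError:

```python
import codecs

# Sample data (an assumption for illustration): the two UTF-8 bytes
# of U+00E9 are split across two chunks, as could happen when a
# stream is read in fixed-size blocks.
data = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'
chunks = [data[:4], data[4:]]           # b'caf\xc3' and b'\xa9'

# The factory returns a class; calling it creates a decoder with its
# own buffer. Error handling defaults to "strict".
decoder = codecs.getincrementaldecoder("utf-8")()

pieces = [decoder.decode(chunk) for chunk in chunks]
# final=True tells the decoder that no more input follows, so any
# leftover incomplete sequence would be reported as an error.
pieces.append(decoder.decode(b"", final=True))
text = "".join(pieces)
```

The first call returns only "caf" and holds back the dangling lead
byte until the second chunk completes the character.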

> Certainly no normalization of diacritics will be done; surrogate
> handling depends on the encoding and whether the unicode string
> implementation uses 16 or 32 bits per character.
> 
> I agree that we need to be able to specify the error handling as well.

Should it be possible to change the error handling during the lifetime 
of a stream? Then this change would have to be passed through to the 
underlying codec.
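
For what it's worth, the incremental codec objects already expose the
error handling as a writable "errors" attribute, so a stream could in
principle forward such a change. A minimal sketch (Python 3 syntax,
invented sample bytes, and no claim about how the new IO layer would
actually wire this up):

```python
import codecs

# Start out strict, as a stream might by default.
decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")

bad = b"ab\xffcd"   # 0xff can never appear in valid UTF-8

try:
    decoder.decode(bad)
    raised = False
except UnicodeDecodeError:
    raised = True

# Switch the strategy on the same decoder object. reset() discards
# any buffered state so that decoding starts cleanly.
decoder.reset()
decoder.errors = "replace"
replaced = decoder.decode(bad, final=True)
```

With "replace" in effect the invalid byte comes back as U+FFFD
instead of raising.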

> UnicodeErrors may be raised.

Servus,
    Walter

> On 2/27/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>> On 2/27/07, Adam Olsen <rhamph at gmail.com> wrote:
>>> On 2/26/07, Mike Verdone <mike.verdone at gmail.com> wrote:
>>>> Text I/O
>>>> ... operate on a per-character basis instead of a per-byte basis.
>>> "per-character" needs some clarification.  I'm guessing this will only
>>> return entire code points, but the unicode type will expose them as
>>> code units, so it could be seen as both per-code-point and
>>> per-code-unit.
>> Does this just mean that you assume
>> (1) UTF32
>> (2) surrogate pairs will show up as two characters
>> (3) diacritics may (or may not) show up separately from their base characters?
>>
>> This does suggest that error-correction should be specified (or at
>> least explicitly not specified).  If the underlying input byte-stream
>> contains an invalid sequence, will the TextIO raise a
>> UnicodeDecodeError?  Or will its error/replace/delete behavior be
>> settable?
>>
>> Does the Text class promise to catch things like an invalid
>> combination of surrogates?
>>
>> -jJ