[Python-Dev] Python3 "complexity"

Lennart Regebro regebro at gmail.com
Fri Jan 10 04:42:04 CET 2014


On Fri, Jan 10, 2014 at 2:03 AM, Joao S. O. Bueno <jsbueno at python.org.br> wrote:
> On 9 January 2014 04:50, Lennart Regebro <regebro at gmail.com> wrote:
>> To be honest, you can define text as "A stream of bytes that are split
>> up in lines separated by a linefeed", and do some basic text
>> processing like that. Just very *basic*, but still. Replacing
>> characters. Extracting certain lines etc.
>
> That is, until you hit a character which has a byte with the same
> value as the ASCII newline in the middle of a multi-byte character.
>
> So, this approach is broken to start with.

For a very specific definition of broken, yes: it will fail with
UTF-16 or EBCDIC. But files in those encodings are, by the above
definition of "text files", not text files. :-)
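
A rough sketch of that distinction, in case it helps. The helper name
split_lines_bytes and the sample string are made up for illustration.
The sample uses U+010A because its UTF-16-LE encoding happens to
contain the byte 0x0A, while its UTF-8 encoding does not (UTF-8 never
uses bytes below 0x80 inside a multi-byte character):

```python
def split_lines_bytes(data):
    """Split raw bytes on the ASCII linefeed byte (0x0A), without decoding."""
    return data.split(b"\n")


# Hypothetical sample: two lines, the second starting with U+010A.
text = "one\n\u010a two\n"
expected = text.split("\n")

# UTF-8: the byte-level split lines up with the character-level split.
utf8_pieces = split_lines_bytes(text.encode("utf-8"))
utf8_ok = [piece.decode("utf-8") for piece in utf8_pieces] == expected

# UTF-16-LE: U+010A is encoded as the bytes 0A 01, so the byte-level
# split cuts through the middle of that character (and through the
# two-byte newline itself), producing extra, undecodable pieces.
utf16_pieces = split_lines_bytes(text.encode("utf-16-le"))
utf16_ok = len(utf16_pieces) == len(expected)

print("UTF-8 split matches: ", utf8_ok)
print("UTF-16 split matches:", utf16_ok)
```

So the "stream of bytes split on linefeed" definition holds up for
ASCII-compatible encodings such as UTF-8 or Latin-1, and only for
those.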

//Lennart

