[Python-ideas] Python 3000 TIOBE -3%
Joao S. O. Bueno
jsbueno at python.org.br
Sun Feb 12 22:32:53 CET 2012
On 11 February 2012 21:24, Paul Moore <p.f.moore at gmail.com> wrote:
> What I *don't* know is what those funny bits of
> mojibake I see in the text editor are.
So do yourself, and us, "the rest of the world", a favor and open the
file in binary mode.
Also, I'd suggest that you, and anyone else being picky about encodings, read
http://www.joelonsoftware.com/articles/Unicode.html so you can finally get it
into your mind that *** ASCII is not text *** .
It used to be text back when, for non-[A-Za-z] text to get there, someone
had to record a file on tape, pack it in their luggage, and take a
plane "overseas" to the U.S.A. That is not the case anymore, and that,
as far as I understand, is the reasoning behind Python 3 defaulting to Unicode.
Anyone can work "ignoring text" and treat bytes as bytes by opening a file
in binary mode. You can use "os.linesep" (encoded to bytes, since it is a
str in Python 3) instead of a hard-coded "\n" to deal with line breaks.
(Of course you might accidentally break a line inside a multi-byte character
in some encoding, since you prefer to ignore them altogether, but it should
be rare).
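
To make the point concrete, here is a rough sketch of the "bytes as bytes"
approach and of the pitfall mentioned in the parenthesis. The helper names
(join_lines, hard_wrap) and the sample word are made up for illustration;
only os.linesep and the bytes methods come from Python itself.

```python
import os

# os.linesep is a str in Python 3, so it must be encoded before mixing with bytes.
LINESEP = os.linesep.encode("ascii")


def join_lines(lines):
    """Join byte strings with the platform line separator, without decoding."""
    return LINESEP.join(lines)


def hard_wrap(data, width):
    """Cut a byte string every `width` bytes, knowing nothing about characters."""
    return [data[i:i + width] for i in range(0, len(data), width)]


# Hypothetical sample: 4 characters, but 6 bytes once encoded as UTF-8.
word = "ação".encode("utf-8")
chunks = hard_wrap(word, 2)

# The bytes themselves survive the round trip untouched...
rejoined = b"".join(chunks)

# ...but the first chunk ends in the middle of a two-byte character,
# so it is no longer valid UTF-8 on its own.
try:
    chunks[0].decode("utf-8")
    first_chunk_is_valid = True
except UnicodeDecodeError:
    first_chunk_is_valid = False
```

As long as nobody tries to decode the individual pieces, nothing is lost;
the trouble only shows up when a chunk that was cut mid-character gets
treated as text.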