[Python-Dev] Python3 "complexity"

Nick Coghlan ncoghlan at gmail.com
Thu Jan 9 08:11:06 CET 2014


On 9 January 2014 10:07, Ben Finney <ben+python at benfinney.id.au> wrote:
> Kristján Valur Jónsson <kristjan at ccpgames.com> writes:
>
>> Believe it or not, sometimes you really don't care about encodings.
>> Sometimes you just want to parse text files.
>
> Files don't contain text, they contain bytes. Bytes only become text
> when filtered through the correct encoding.
>
> Python should not guess the encoding if it's unknown. Without the right
> encoding, you don't get text, you get partial or complete gibberish.
>
> So, if what you want is to parse text and not get gibberish, you need to
> *tell* Python what the encoding is. That's a brute fact of the world of
> text in computing.

Set the mode to "rb" and process the file as binary. Done.

See http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html
for details.
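
A minimal sketch of what that can look like, assuming the file uses some
ASCII-compatible encoding so that delimiters such as "=" and "#" are
unambiguous at the byte level. The helper name parse_key_values, the
"key = value" format and the sample data are made up for illustration and
are not taken from the page linked above:

```python
import os
import tempfile


def parse_key_values(path):
    """Parse 'key = value' lines from a file opened in binary mode.

    Hypothetical helper: it assumes an ASCII-compatible encoding, so the
    byte-level delimiters b"=" and b"#" mean what they appear to mean.
    Keys and values are returned as bytes and never decoded.
    """
    result = {}
    with open(path, "rb") as f:          # "rb": no decoding is attempted
        for line in f:                   # binary files still iterate by line
            line = line.strip()          # strips ASCII whitespace only
            if not line or line.startswith(b"#"):
                continue                 # skip blank lines and comments
            key, sep, value = line.partition(b"=")
            if sep:                      # ignore lines without a delimiter
                result[key.strip()] = value.strip()
    return result


def demo():
    # Sample data with non-ASCII bytes. It happens to be Latin-1 here,
    # but the parser never needs to know that.
    data = "# comment\nname = Kristj\u00e1n\ncity=Reykjav\u00edk\n".encode("latin-1")
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        return parse_key_values(tmp.name)
    finally:
        os.unlink(tmp.name)
```

The non-ASCII bytes pass through untouched. The trade-off is that everything
stays as bytes, so anything that has to be shown to a user as text still
needs a real encoding at that point.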

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia