[Python-Dev] Python3 "complexity"
Nick Coghlan
ncoghlan at gmail.com
Thu Jan 9 08:11:06 CET 2014
On 9 January 2014 10:07, Ben Finney <ben+python at benfinney.id.au> wrote:
> Kristján Valur Jónsson <kristjan at ccpgames.com> writes:
>
>> Believe it or not, sometimes you really don't care about encodings.
>> Sometimes you just want to parse text files.
>
> Files don't contain text, they contain bytes. Bytes only become text
> when filtered through the correct encoding.
>
> Python should not guess the encoding if it's unknown. Without the right
> encoding, you don't get text, you get partial or complete gibberish.
>
> So, if what you want is to parse text and not get gibberish, you need to
> *tell* Python what the encoding is. That's a brute fact of the world of
> text in computing.
Set the mode to "rb" and process the file as binary. Done.
See http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html
for details.
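A minimal sketch of what that can look like, assuming the file uses an ASCII-compatible encoding so that newlines and ASCII markers such as "#" appear as the same byte values regardless of the exact encoding. The helper names count_lines_with_prefix and demo are hypothetical, made up purely for illustration:

```python
import os
import tempfile


def count_lines_with_prefix(path, prefix):
    """Count the lines of the file at `path` that start with the bytes `prefix`."""
    count = 0
    # "rb" yields bytes objects; no decoding is attempted, so no encoding is needed.
    with open(path, "rb") as f:
        for line in f:  # binary files are iterated line by line, split on b"\n"
            if line.startswith(prefix):
                count += 1
    return count


def demo():
    """Write a small file containing a byte that is invalid UTF-8, then parse it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            # 0xff can never appear in valid UTF-8; binary mode does not care.
            f.write(b"# comment \xff\nkey=value\n# another\n")
        return count_lines_with_prefix(path, b"#")
    finally:
        os.remove(path)
```

Note that every pattern has to be a bytes literal (b"#", not "#"), since Python 3 will not mix str and bytes implicitly. This only works when the structural characters you care about are ASCII; it would not be appropriate for something like UTF-16.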
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia