[Python-ideas] Python 3 open() text files: make encoding parameter optional for cross-platform scripts

Chris Angelico rosuav at gmail.com
Mon Jun 10 00:13:01 CEST 2013


On Mon, Jun 10, 2013 at 7:52 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
> There's definitely a case to be made for implementing some kind of Notepad-like heuristics in Python. It would be great to be able to do this at the interactive interpreter:
>
> line = text.partition('\n')[0]
> for encoding in codecs.guess(text)[:10]:
>     print(encoding, line.decode(encoding))
>
> In fact, if you wrote that and pushed it to PyPI I'd start using it today, and maybe even lobbying for its inclusion in the stdlib.
>
> But I wouldn't want open to use it, and I don't think you would either.

Hang on, you can't partition it on the Unicode string '\n' while it's
still a bytes object :) But I agree, this would be a neat feature. It ought
to be able to guess ASCII or UTF-8 with near-certainty, UTF-16 if it
has a BOM, and other things heuristically. Would help a lot when I'm
trying to answer Nikos on python-list...
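
For illustration only, here is a rough sketch of what such a guesser might
look like. guess_encodings() is a made-up name (there is no codecs.guess()
in the stdlib), and the candidate list and its ordering are my assumptions,
not a proposal for how the real thing should rank encodings. Note that it
partitions on b'\n', since the data is still bytes at that point.

```python
import codecs

# Hypothetical helper: nothing like this exists in the codecs module.
def guess_encodings(data):
    """Return candidate encoding names for a bytes object, best guess first."""
    # A BOM is the strongest signal. UTF-32 is checked before UTF-16
    # because the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return [name]

    # No BOM: keep every strict codec that decodes without error.
    # ASCII and UTF-8 rarely accept data that was not meant for them.
    candidates = []
    for name in ("ascii", "utf-8", "cp1252"):
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        candidates.append(name)

    # latin-1 maps every byte to a code point, so it never fails;
    # it goes last as a fallback of last resort (an assumed ordering).
    candidates.append("latin-1")
    return candidates


# Usage in the spirit of the quoted snippet, but on bytes.
data = "na\u00efve caf\u00e9\nsecond line\n".encode("utf-8")
line = data.partition(b"\n")[0]
for encoding in guess_encodings(data)[:10]:
    print(encoding, line.decode(encoding))
```

A real implementation would need actual heuristics (byte frequencies,
BOM-less UTF-16 detection and so on) for the "other things" part; this only
covers the easy cases. It would also misread UTF-16-LE text whose first
character after the BOM is U+0000 as UTF-32.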

ChrisA

