[Python-Dev] Python3 "complexity"

Stephen J. Turnbull stephen at xemacs.org
Fri Jan 10 08:28:34 CET 2014


Chris Angelico writes:

 > I'm not saying that chardet is bad, but I *am* saying, and I stand
 > by this, that an auto-detect option on file open is a bad idea.

I have used it by default in Emacs and XEmacs since 1990, and I
certainly haven't experienced it as a bad idea at *any* time in more
than two decades.  Of course, it shouldn't be the default in Python,
for two reasons: (1) Emacsen are invariably interactive and so very
flexible about error recovery, which is not so for Python, and (2)
Emacsen can generally assume that the files they open are more or less
text in the first place, which again is not true for Python.

 > Would you want a parameter to the open() builtin

It's not a parameter, it's a particular value for the encoding
parameter.

 > that tries to read the file as an image, or an audio file, or a
 > document, or an executable, and automatically decodes it to a
 > PIL.Image, an mm.wave, etc,

Emacsen do that, too.  It's not the sayonara Grand Slam in the 7th
game of the World Series spectacular win that text encoding detection
is, but it is very useful much of the time.

What it comes down to for all of the above is "consenting adults."
Python should *not* do any guessing by default, but if the programmer
or user explicitly requests a guess with "encoding=chardet", why in the
world would you want Python to do anything but give it the old college
try?  Of course any Python-supplied guesser should take a very
pessimistic approach and error unless it's quite certain, but

 > or execute the code and return its stdout, all entirely
 > automatically?

Now *that* is a really bad idea.  You shouldn't mix it with the
others.  (I'll also concede that many file formats -- PostScript, I'm
looking at you -- require special care to avoid arbitrary code
execution.)
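
To make the "consenting adults" point concrete, here is a minimal sketch
of an opt-in, pessimistic guesser.  Everything in it is an assumption
for illustration: the names guess_encoding, open_text and
EncodingGuessError are hypothetical, the sentinel value "guess" merely
stands in for the "encoding=chardet" spelling above, and the heuristic
(a BOM, or bytes that validate as strict UTF-8) is just one conservative
choice, not what chardet or Emacs actually do.

```python
import codecs

# Hypothetical names throughout; none of this is an existing Python or
# chardet API.  UTF-32 BOMs are listed before UTF-16 ones because the
# UTF-32-LE BOM begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


class EncodingGuessError(ValueError):
    """Raised when the guesser is not confident enough to pick an encoding."""


def guess_encoding(data: bytes) -> str:
    """Return an encoding name only on strong evidence; otherwise raise."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        # Strict decoding: any invalid byte sequence is treated as "not sure".
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        raise EncodingGuessError("not confident about the encoding") from None
    return "utf-8"


def open_text(path, encoding=None):
    """Like open() in text mode, but guess only if explicitly asked to."""
    if encoding == "guess":
        # The caller opted in, so inspect the raw bytes first.
        with open(path, "rb") as f:
            encoding = guess_encoding(f.read())
    # With encoding=None this is plain open(): no guessing by default.
    return open(path, "r", encoding=encoding)
```

With this sketch, open_text("notes.txt") never guesses, while
open_text("notes.txt", encoding="guess") either picks an encoding it has
good evidence for or raises EncodingGuessError instead of silently
mis-decoding.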
