[Python-Dev] deleting setdefaultencoding in site.py is evil

exarkun at twistedmatrix.com exarkun at twistedmatrix.com
Tue Aug 25 18:23:05 CEST 2009


On 04:08 pm, chris at simplistix.co.uk wrote:
>Hi All,
>
>Would anyone object if I removed the deletion of 
>sys.setdefaultencoding in site.py?
>
>I'm guessing "yes!" so thought I'd state my reasons now:
>
>This deletion appears to be pretty flimsy; reload(sys) and you have it 
>back. Which is lucky, because I need it after it's been deleted...

The ability to change the default encoding is a misfeature.  There's 
essentially no way to write correct Python code in the presence of this 
feature.
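
To make that concrete, here is a rough sketch (an illustration only, not 
code from any real project).  In Python 2, mixing a str with a unicode 
object decodes the str implicitly using the process-wide default 
encoding.  The helper implicit_concat below is hypothetical: it stands in 
for that implicit step by taking the encoding as an explicit argument, so 
the sketch also runs on Python 3.

```python
def implicit_concat(byte_str, text, default_encoding):
    # Hypothetical stand-in for Python 2's `byte_str + text`: the byte
    # string is decoded with the process-wide default encoding first.
    return byte_str.decode(default_encoding) + text

data = b"caf\xc3\xa9"  # "cafe" with an acute accent, encoded as UTF-8

# With the stock "ascii" default, the implicit decode fails loudly.
try:
    implicit_concat(data, u"!", "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# With other defaults, the same expression silently gives different text.
as_utf8 = implicit_concat(data, u"!", "utf-8")
as_latin1 = implicit_concat(data, u"!", "latin-1")
```

The same line of code raises, produces the intended text, or produces 
mojibake depending only on a global setting that the author of that line 
cannot control.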

Using setdefaultencoding is never the sensible way to deal with encoded 
strings.  Actually exposing this function in the sys module would lead 
all kinds of people who haven't fully grasped the way str, unicode, and 
encodings work to do horrible things and create broken programs.  It's 
bad enough that it's already possible to get this function back with the 
reload(sys) trick.
>
>Why? Well, because you can no longer put sitecustomize.py in a project- 
>specific location (http://bugs.python.org/issue1734860) and because for 
>some projects the only way I can deal with encoded strings sensibly is 
>to use setdefaultencoding, in my case at the start of a script 
>generated by zc.buildout's zc.recipe.egg (I *know* all the encodings in 
>this project are utf-8, but I don't want to go playing whack-a-mole 
>with whatever modules this rather large project uses that haven't been 
>made properly unicode aware).
>
>Yes, it needs to be used as early as possible, and the docs should say 
>this, but deleting it seems to be petty in terms of stopping its use 
>when sitecustomize.py is too early and too system-wide and spraying 
>.decode('utf-8')'s all over a code base made up of a load of eggs 
>managed by buildout simply isn't feasible...
>
>Thoughts?

It may be a major task, but the best thing you can do is find each str 
and unicode operation in the software you're working with and make them 
correct with respect to your inputs and outputs.  Flipping a giant 
switch for the entire process is just going to change which things are 
wrong.
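
A minimal sketch of what "correct with respect to your inputs and 
outputs" might look like, assuming (as you say of your project) that the 
external data really is UTF-8.  The names read_name and write_name are 
made up for the example; the point is only that bytes are decoded once 
on the way in, encoded once on the way out, and everything in between 
works on unicode.

```python
def read_name(raw_bytes, encoding="utf-8"):
    # Input boundary (hypothetical helper): bytes come in, text goes on.
    return raw_bytes.decode(encoding)

def write_name(text, encoding="utf-8"):
    # Output boundary (hypothetical helper): text is encoded exactly once.
    return text.encode(encoding)

raw = b"caf\xc3\xa9"             # bytes as they might arrive from a file or socket
name = read_name(raw)            # unicode text from here on
greeting = u"Hello, " + name     # text-only operation, no implicit decode
out = write_name(greeting)       # bytes again, ready to be written out
```

Nothing here depends on the process-wide default encoding, so it behaves 
the same whether or not anyone has called setdefaultencoding.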

Jean-Paul

