preferred way to set encoding for print

_wolf wolfgang.lipp at gmail.com
Tue Sep 15 09:28:06 EDT 2009


hi folks,

i am doing my first steps in the wonderful world of python 3.

some things are good.
some things have to be relearned.
some things drive me crazy.

sadly, i'm working on a windows box. which, in germany, entails that
python thinks it to be a good idea to take cp1252 as the default
encoding.

so just coz i got my box in germany means i can never print out a
chinese character? say what?

i have no troubles with people configuring their python installation
to use any encoding in the world, but wouldn't it have been less of a
surprise to just assume utf-8 for any file in/output? after all, it is
already the default for python source files as far as i understand.
someone might think they're clever to sniff into the system and make
the somewhat educated guess that this dude's using cp1252 for his
files. but they would be wrong.

so: how can i tell python, in a configuration or using a setting in
sitecustomize.py, or similar, to use utf-8 as a default encoding?
there used to be a trick to say `reload(sys); sys.setdefaultencoding('utf-8')`,
but that has no effect in py3.0.1. also, i cannot set
`sys.stdout.encoding`; is there a way to re-open that stream with a
different encoding?
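
to make the question concrete, the kind of re-opening i have in mind might
look like the sketch below. it assumes the text stream exposes its byte
stream as `.buffer` and that wrapping that in a fresh `io.TextIOWrapper` is
an acceptable way to do it. `utf8_writer` is a name made up for
illustration, and whether the windows console then displays the bytes
correctly is a separate matter that this does not address.

```python
import io
import sys


def utf8_writer(stream):
    """Return a new text wrapper over stream's byte buffer that encodes as utf-8.

    Hypothetical helper: assumes `stream` is a text stream with a `.buffer`
    attribute, as sys.stdout normally is in python 3.
    """
    return io.TextIOWrapper(
        stream.buffer,          # the underlying binary stream
        encoding="utf-8",       # encode everything as utf-8, whatever the locale says
        line_buffering=True,    # flush on newline so output shows up promptly
    )
```

usage would then be something like `sys.stdout = utf8_writer(sys.stdout)`
early in the program (or in sitecustomize.py), after which `print()` writes
utf-8 bytes.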

in all, i believe it is quite unsettling to me to see that, on my py3
installation,

sys.getdefaultencoding() == 'utf-8'
sys.stdout.encoding == 'cp1252'
locale.getlocale() == (None, None)
locale.getdefaultlocale() == ('de_DE', 'cp1252')

which to me makes as much sense as a blackcurrant tart thrown into
space. worse,

locale.setlocale( locale.LC_ALL, locale.getdefaultlocale() )

results in

locale.Error: unsupported locale setting

this bloody thing doesn't accept its *own* output. attempts to feed
that locale beast with anything but the empty string or 'C' were all
doomed. it would take a very patient and eloquent person to explain
that in a credible fashion to me. my word for this is, 'broken'.
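
for anyone who wants to reproduce this, here is a rough sketch of how one
could probe which locale names `locale.setlocale` accepts on a given box.
`accepted_locales` is a made-up helper name, and the candidate list is
whatever you choose to pass in; the results will differ per platform.

```python
import locale


def accepted_locales(candidates, category=locale.LC_ALL):
    """Return the candidate locale names that setlocale() accepts here."""
    saved = locale.setlocale(category)  # no second argument: query only
    accepted = []
    try:
        for name in candidates:
            try:
                locale.setlocale(category, name)
            except locale.Error:
                continue  # this platform rejects the name
            accepted.append(name)
    finally:
        locale.setlocale(category, saved)  # restore the original setting
    return accepted
```

for example, `accepted_locales(["", "C", "de_DE", "de_DE.cp1252"])` lists
whichever of those names the platform is willing to take.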

i would very much like to rid myself of these considerations. just say
it's all utf-8, wash'n'go.

my attempts at changing python's mind using the locale module have
failed so far. otherwise, i for one don't want to touch that locale
thing with a very long pole. as far as i can see, it does not work as
documented. the platform dependencies are also a clear OFF LIMITS sign
to me.

any suggestions?

cheers,

~flow



