Swedish characters in Python strings

Magnus Heino magnus.heino at pleon.sigma.se
Sun Oct 13 09:39:50 EDT 2002


> check the locale settings; to minimize the pain, make sure you use
> an 8-bit encoding (e.g ISO-8859-1) and not a designed-for-internal-
> use-only variable-width encoding like UTF-8.

Still, all new RH8 installs use UTF-8 by default, and there must be a good
reason for that. I guess it's something they will keep doing for a while now...
 
> with UTF-8, your operating system is messing things up before Python
> gets a chance to look at the characters (most likely, Python gets 6
> characters from the keyboard, and sends 6 characters to the console).

Is that the reason in this case? This is RH8.0, en_US.utf-8

Python 2.2.1 (#1, Aug 30 2002, 12:15:30)
[GCC 3.2 20020822 (Red Hat Linux Rawhide 3.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import MP3Info
>>> title = getattr(MP3Info.MP3Info(open('file.mp3', 'rb')), 'title')
>>> title
'K\xf6ttbullar i n\xe4san'
>>> print title
K?ttbullar i n?san
>>> type(title)
<type 'str'>
>>> print title
K?ttbullar i n?san
>>> print u'K\xf6ttbullar i n\xe4san'.encode('utf-8')
Köttbullar i näsan
>>>
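
A minimal sketch of what seems to be going on in the session above, assuming
the tag bytes are ISO-8859-1 (the session does not confirm this, but \xf6 and
\xe4 are "ö" and "ä" in that encoding). It is written in current Python 3
syntax, and the variable names are only illustrative. In 2.2 the equivalent
would be roughly unicode(title, 'iso-8859-1').encode('utf-8').

```python
# Bytes as shown by the interpreter above; assumed to be ISO-8859-1.
raw = b'K\xf6ttbullar i n\xe4san'

# The same bytes are not valid UTF-8, which would explain why a UTF-8
# terminal shows "?" when they are written out unchanged.
try:
    raw.decode('utf-8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decode with the encoding the bytes were written in ...
text = raw.decode('iso-8859-1')

# ... then encode to what the terminal expects (UTF-8 under en_US.utf-8).
utf8_bytes = text.encode('utf-8')

print(valid_utf8)
print(repr(utf8_bytes))
```

Each of the two non-ASCII characters becomes a two-byte sequence in UTF-8, so
the encoded string is two bytes longer than the original.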

> (avoiding RedHat 8.0 might also help.  based on the kind of bugs I've
> experienced this far, 8.0 might qualify as the worst unix-like operating
> system ever released...)

Besides this stuff, I think it's really nice..

--

  /Magnus


