unicode to string conversion

Skip Montanaro skip at pobox.com
Thu May 8 18:13:48 EDT 2003


    Jeff> Huh, doesn't work here
    Jeff> >>> u = u'questa \xe8 bella'
    Jeff> >>> s = u.encode("iso-8859-1")
    Jeff> >>> print s
    Jeff> questa [] bella

    Jeff> where [] is a box-shaped character displayed for an invalid byte
    Jeff> sequence.

    Jeff> On my system, I must write
    Jeff> >>> print u.encode("utf")
    Jeff> questa è bella
    Jeff> to get the proper result

Yup.  I was unable to guess what your terminal encoding was, but since I saw
an è in my email and I can display iso-8859-1, I jumped to the conclusion
that so could the OP.  It's impossible to guess correctly for everyone.
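
To make the mismatch concrete, here is a small sketch of what probably
happened on Jeff's side, assuming his terminal expects UTF-8 (his message
suggests this but does not say so outright). It is written so that it also
runs on Python 3, where encode() returns a bytes object; the variable names
are only for illustration.

```python
u = u'questa \xe8 bella'

# The same character becomes different bytes under each encoding.
latin1_bytes = u.encode('iso-8859-1')   # b'questa \xe8 bella'
utf8_bytes = u.encode('utf-8')          # b'questa \xc3\xa8 bella'

# A terminal that expects UTF-8 has to decode whatever bytes it receives.
# The lone 0xE8 byte from the Latin-1 encoding is not a valid UTF-8
# sequence, which is consistent with the box character Jeff saw.
try:
    latin1_bytes.decode('utf-8')
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

The reverse mismatch fails differently: a Latin-1 terminal given the UTF-8
bytes would show two characters where one was intended, rather than a box.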

    Jeff> It *may* be that the encoding returned by
    Jeff>     locale.getdefaultlocale()[1]
    Jeff> is the one that should be used (and it is on my system), or it may
    Jeff> be that the OP only needs the value to work on a single computer
    Jeff> and can determine the right encoding through educated guessing.

Some programs (like Apple's Terminal app) allow the user to specify the
display encoding.  I doubt the choice the user makes there would be
reflected in subsequent calls to locale.getdefaultlocale().
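
Given that uncertainty, one cautious approach is to treat any detected
encoding as a guess and let the caller override it. The sketch below is
only that: guess_output_encoding and encode_for_output are hypothetical
helper names, sys.stdout.encoding is an extra source not mentioned in this
thread, and locale.getpreferredencoding() stands in for
locale.getdefaultlocale()[1]. None of these can see a setting that lives
only inside the terminal program.

```python
import locale
import sys


def guess_output_encoding(stream=None):
    """Best-effort guess at the encoding the output stream expects."""
    if stream is None:
        stream = sys.stdout
    # Some streams advertise an encoding; it may be missing or None.
    encoding = getattr(stream, 'encoding', None)
    if not encoding:
        # Fall back to the locale's preferred encoding (an assumption:
        # the terminal may have been set to something else by the user).
        encoding = locale.getpreferredencoding(False)
    return encoding or 'ascii'


def encode_for_output(text, encoding=None):
    """Encode text for display, replacing characters that do not fit."""
    if encoding is None:
        encoding = guess_output_encoding()
    # 'replace' substitutes '?' instead of raising UnicodeEncodeError.
    return text.encode(encoding, 'replace')
```

Passing the encoding explicitly, for example
encode_for_output(u, 'iso-8859-1'), covers the case where the user knows
what the terminal is set to and the guess is wrong.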

Skip




