I like Unicode more than I used to...

Terry Hancock hancock at anansispaceworks.com
Thu Feb 20 00:06:06 EST 2003


Hmm -- made me want to check it out.

But why doesn't this work:
>>> print u'\u4378'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)
>>> 

This does (I did a little poking around the docs after the above):
>>> print u"\u2122".encode('utf-8')
?

Hmm. Why do I need to do that? Is there no way for Python to figure out how to
print a Unicode string when I'm running in a Unicode-capable terminal?  Also, is
there a list somewhere of the encodings that the "".encode() method understands?
I was unable to find one.  I just guessed that "utf-8" would work from the above
example.  Is the set of encodings extensible in Python, or is it compiled in?
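
To make the question concrete, here is a minimal sketch of the explicit-encode
approach. It assumes a current Python 3 interpreter rather than the 2.2 session
shown above, and encode_for_terminal is a hypothetical helper name, not
something from the standard library. The alias table in encodings.aliases is
one place to look for encoding names, though it is not guaranteed to be a
complete list.

```python
import sys
from encodings.aliases import aliases


def encode_for_terminal(text, fallback="utf-8"):
    """Hypothetical helper: encode text for whatever stdout claims to accept."""
    # sys.stdout.encoding can be missing or None (e.g. a replaced stream);
    # falling back to UTF-8 is an assumption, not a rule.
    encoding = getattr(sys.stdout, "encoding", None) or fallback
    # "replace" swaps in '?' for characters the target encoding lacks,
    # instead of raising an error.
    return text.encode(encoding, "replace")


# The ASCII codec cannot represent U+4378, which is the failure seen above.
try:
    u"\u4378".encode("ascii")
    ascii_failed = False
except UnicodeError:
    ascii_failed = True

# Encoding explicitly as UTF-8 always succeeds and yields bytes.
tm_bytes = u"\u2122".encode("utf-8")

# Module names of the encodings reachable through the alias table.
known_encodings = sorted(set(aliases.values()))
```

Whether the bytes then display correctly still depends on the terminal really
using the encoding that was chosen.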

This is what I've seen on this so far:
http://www.python.org/doc/2.2.1/whatsnew/node8.html
http://www.python.org/doc/2.2.1/lib/string-methods.html#l2h-116
http://www.python.org/doc/2.2.1/lib/module-codecs.html#l2h-803

The last one does describe a way to register new codecs in the codecs module --
can the string method use any codec registered there?  If so, how do you use
it?
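
As a rough illustration of what I am asking, the sketch below registers a toy
codec and then reaches it through the string method. It assumes a current
Python 3 interpreter, and the codec name "demoupper" and the _demo_* functions
are made up for the example. codecs.register, codecs.lookup and
codecs.CodecInfo are the real entry points in the codecs module.

```python
import codecs


def _demo_encode(text, errors="strict"):
    # Toy behaviour: upper-case the text, then encode it as ASCII.
    # A codec's encode function returns (bytes, number of characters consumed).
    return text.upper().encode("ascii", errors), len(text)


def _demo_decode(data, errors="strict"):
    # A codec's decode function returns (text, number of bytes consumed).
    return bytes(data).decode("ascii", errors), len(data)


def _demo_search(name):
    # Search functions receive the requested encoding name and return a
    # CodecInfo for names they handle, or None for everything else.
    if name == "demoupper":
        return codecs.CodecInfo(_demo_encode, _demo_decode, name="demoupper")
    return None


codecs.register(_demo_search)

# Once registered, the codec is reachable by name from the string method.
encoded = u"hello".encode("demoupper")

# Built-in codecs are found through the same lookup mechanism.
utf8_info = codecs.lookup("utf-8")

# Names that no search function recognises raise LookupError.
try:
    codecs.lookup("nosuchcodecxyz")
    unknown_found = True
except LookupError:
    unknown_found = False
```

So, at least in this sketch, the string method and codecs.lookup go through the
same registry.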

Cheers,
Terry

-- 
Anansi Spaceworks
http://www.anansispaceworks.com



