[Python-Dev] Shouldn't I be able to print Unicode objects?
M.-A. Lemburg
mal@lemburg.com
Tue, 05 Jun 2001 23:00:23 +0200
Skip Montanaro wrote:
>
> mal> Please see Lib/site.py for details on how to enable all these
> mal> goodies -- it's all there, just disabled and meant for super-users
> mal> only ;-)
>
> Okay, I found the encoding section. I changed the encoding variable
> assignment to be
>
> encoding = "latin1"
>
> and now the degree sign print works. What other side-effects will that have
> besides on printed representations? It appears I can create (but not see
> properly?) variable names containing latin1 characters:
>
> >>> ümlaut = "ümlaut"
Huh? That should not be possible! Python literals are still
ASCII.
>>> ümlaut = 'ümlaut'
  File "<stdin>", line 1
    ümlaut = 'ümlaut'
    ^
SyntaxError: invalid syntax
> >>> print locals().keys()
> ['orca', 'dir', '__doc__', 'rlcompleter', 'missionb', 'version', 'dirpat', 'xmlrpclib', 'belugab', '__builtin__', 'beluga', 'readline', '__name__', 'orcab', 'addpath', 'Writer', 'atexit', 'sys', 'dolphinb', 'mission', 'pprint', 'dolphin', '__builtins__', 'mlaut', 'help']
>
> I am having trouble printing some strings containing latin1 characters:
>
> >>> print ümlaut
> mlaut
> >>> type("ümlaut")
> <type 'string'>
> >>> type(string.letters)
> <type 'string'>
> >>> print "ümlaut"
> mlaut
> >>> print string.letters
> abcdefghijklmnopqrstuvwxyzµßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ
> >>> print string.letters[55:]
> üýþÿABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ
>
> The above was pasted from Python running in a shell session in XEmacs, which
> is certainly latin1-aware. Why did I have trouble seeing the ü in some
> situations, but not in others?
No idea what's going on there... the encoding parameter should
not have any effect on printing normal 8-bit strings. It only
defines the standard encoding used in coercion and auto-conversion
from Unicode to 8-bit strings and vice-versa.
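
To illustrate the distinction, here is a minimal sketch. It is written with explicit encode() calls so that it also runs on a current Python, where the implicit conversion described above no longer exists. The helper coerce_to_bytes and its default_encoding parameter are hypothetical stand-ins for the setting in site.py, not part of Python itself.

```python
# Models the Unicode -> 8-bit conversion that the default encoding governs.
# `coerce_to_bytes` is a hypothetical helper used only for illustration.

def coerce_to_bytes(text, default_encoding="ascii"):
    """Encode text as an implicit conversion would, using the default encoding."""
    return text.encode(default_encoding)

degree = "\u00b0"  # DEGREE SIGN, the character from the original question

# With an ASCII default, converting a non-ASCII character fails.
try:
    coerce_to_bytes(degree)
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# With a latin1 default, the same conversion succeeds and yields one byte.
latin1_bytes = coerce_to_bytes(degree, "latin1")

# An 8-bit string is not touched by the setting: its bytes are already fixed,
# and how they look on screen depends only on the terminal that displays them.
raw = b"\xfcmlaut"
```

This is why changing the encoding makes printing the Unicode degree sign work, while it should not change how an ordinary 8-bit string such as "ümlaut" is written to the terminal.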
> Are the ramifications of all this encoding stuff documented somewhere?
The basic things can be found in Misc/unicode.txt, on the i18n sig
page and some resources on the web. I'll give a talk in Bordeaux about
Unicode too, which will probably provide some additional help
as well.
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/