eval and unicode
Laszlo Nagy
gandalf at shopzeus.com
Fri Mar 21 04:54:34 EDT 2008
Hi Jonathan,
I think I made it too complicated and I did not concentrate on the
question. I could answer your post point by point, but instead I'm going
to explain the problem formally:
>>> s = '\xdb' # This is a byte, without encoding specified.
>>> s.decode('latin1')
u'\xdb' # The above byte decoded in latin1 encoding
>>> s.decode('latin2')
u'\u0170' # The same byte decoded in latin2 encoding
>>> expr = 'u"' + s + '"' # Create an expression for eval
>>> expr
'u"\xdb"' # expr is not a unicode string - it is a binary string and it
          # has no encoding assigned.
>>> print repr(eval(expr)) # Eval it
u'\xdb' # What? Why was it decoded as 'latin1'? Why not 'latin2'? Why
        # not 'ascii'?
>>> eval( "# -*- coding: latin2 -*-\n" + expr)
u'\u0170' # You can specify the encoding for eval, that is cool.
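
For comparison, here is a minimal sketch of the same ambiguity written
for Python 3, where a byte string is a bytes object. It is only an
illustration and not part of the session above; the variable names are
made up. It shows that the byte itself carries no meaning until a codec
is chosen:

```python
# Python 3 sketch: one byte, two possible characters.
raw = b"\xdb"  # a single byte with no encoding attached

as_latin1 = raw.decode("latin1")  # U+00DB, U WITH CIRCUMFLEX
as_latin2 = raw.decode("latin2")  # U+0170, U WITH DOUBLE ACUTE

print(ascii(as_latin1))
print(ascii(as_latin2))

# ASCII cannot represent this byte at all, so decoding fails loudly
# instead of guessing.
ascii_error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = exc
    print("ascii:", exc.reason)
```
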
I hope it is clear now. Inside eval, a unicode object was created from
a binary string. I just discovered that PEP 0263 can be used to specify
the source encoding for eval. But there is still a problem: eval should
not assume that the expression is in any particular encoding. When it
sees something like '\xdb', it should raise a SyntaxError - the same
error that you get when running a .py file containing the same expression:
>>> file('test.py','wb+').write(expr + "\n")
>>> ^D
gandalf at saturnus:~$ python test.py
File "test.py", line 1
SyntaxError: Non-ASCII character '\xdb' in file test.py on line 1, but
no encoding declared; see http://www.python.org/peps/pep-0263.html for
details
Otherwise the interpretation of the expression is ambiguous. Is there
any good reason why eval assumed a particular encoding in the above
example?
Sorry for the misunderstanding - my English is not perfect. I hope it is
clear now.
My problem is solved anyway. Anytime I need to eval an expression, I'm
going to specify the encoding manually with # -*- coding: XXX -*-. It is
good to know that it works for eval and its counterparts. And it is
unambiguous. :-)
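
The workaround could be wrapped in a small helper, roughly like the
sketch below. It is written for Python 3, where eval accepts a bytes
object as source. The name eval_with_encoding is hypothetical, not an
existing function. It assumes the expression arrives as raw bytes and
that the caller knows which encoding those bytes are in:

```python
def eval_with_encoding(expr, encoding):
    """Evaluate the byte string `expr` as source written in `encoding`.

    Hypothetical helper: it prepends a PEP 263 coding line so the
    parser is told how to decode non-ASCII bytes instead of guessing.
    """
    cookie = ("# -*- coding: %s -*-\n" % encoding).encode("ascii")
    return eval(cookie + expr)


# A string literal that contains one raw, undecoded byte.
expr = b'"' + b"\xdb" + b'"'

print(ascii(eval_with_encoding(expr, "latin1")))
print(ascii(eval_with_encoding(expr, "latin2")))
```

The same bytes give two different strings depending on the declared
encoding, so the caller has to state the encoding explicitly.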
Best,
Laszlo