hasattr(), getattr(), and doubtless other built-in functions don't accept Unicode strings at all:
    >>> import sys
    >>> hasattr(sys, u'abc')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: hasattr, argument 2: expected string, unicode found
Is this a bug or a feature? I'd say bug; the Unicode should be coerced using the default ASCII encoding, and an exception raised if that isn't possible. --amk
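A minimal sketch of the behaviour proposed above (coerce via ASCII, raise if that is not possible). It is written for a modern Python 3 interpreter, where attribute names are already Unicode `str`, so it only mimics the ASCII check rather than reproducing the 2.0 internals. The helper name `hasattr_ascii` is hypothetical and not part of any Python release.

```python
import sys


def hasattr_ascii(obj, name):
    """Hypothetical helper: accept a text attribute name only if it is pure ASCII.

    Mirrors the proposal in this thread: coerce using the ASCII encoding
    and let an exception propagate when the name cannot be encoded.
    """
    # str.encode raises UnicodeEncodeError for any non-ASCII character.
    name.encode("ascii")
    return hasattr(obj, name)


found_path = hasattr_ascii(sys, "path")   # an attribute that exists
found_abc = hasattr_ascii(sys, "abc")     # an attribute that does not exist
```

With this sketch, a name such as `"caf\u00e9"` raises `UnicodeEncodeError` instead of being looked up.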
Andrew Kuchling writes:
> Is this a bug or a feature? I'd say bug; the Unicode should be coerced using the default ASCII encoding, and an exception raised if that isn't possible.

I agree. Marc-Andre, what do you think?

  -Fred

--
Fred L. Drake, Jr. <fdrake at beopen.com>
BeOpen PythonLabs Team Member
"Fred L. Drake, Jr." wrote:
> Andrew Kuchling writes:
> > Is this a bug or a feature? I'd say bug; the Unicode should be coerced using the default ASCII encoding, and an exception raised if that isn't possible.
>
> I agree. Marc-Andre, what do you think?

Sounds OK to me. The only question is where to apply the patch:

1. in hasattr()
2. in PyObject_GetAttr()

I'd opt for the second solution (it should allow both string and Unicode objects as attribute names). hasattr() would then have to be changed to use the "O" parser marker.

What do you think?

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
> hasattr(), getattr(), and doubtless other built-in functions don't accept Unicode strings at all:
>
>     >>> import sys
>     >>> hasattr(sys, u'abc')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     TypeError: hasattr, argument 2: expected string, unicode found
>
> Is this a bug or a feature? I'd say bug; the Unicode should be coerced using the default ASCII encoding, and an exception raised if that isn't possible.

Agreed.

There are probably a bunch of things that need to be changed before this works, though: getattr() and friends require a string, then call PyObject_GetAttr(), which also checks for a string unless the object supports tp_getattro -- but that's only true for classes and instances.

Also, should we convert the string to 8-bit, or should we allow Unicode attribute names?

It seems there's no easy fix -- better address this after 2.0 is released.

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)
Guido van Rossum wrote:
> > hasattr(), getattr(), and doubtless other built-in functions don't accept Unicode strings at all:
> >
> >     >>> import sys
> >     >>> hasattr(sys, u'abc')
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in ?
> >     TypeError: hasattr, argument 2: expected string, unicode found
> >
> > Is this a bug or a feature? I'd say bug; the Unicode should be coerced using the default ASCII encoding, and an exception raised if that isn't possible.
>
> Agreed.
>
> There are probably a bunch of things that need to be changed before this works, though: getattr() and friends require a string, then call PyObject_GetAttr(), which also checks for a string unless the object supports tp_getattro -- but that's only true for classes and instances.
>
> Also, should we convert the string to 8-bit, or should we allow Unicode attribute names?
Attribute names will have to be 8-bit strings (at least in 2.0). The reason here is that attributes are normally Python identifiers which are plain ASCII and stored as 8-bit strings in the namespace dictionaries, i.e. there's no way to add Unicode attribute names other than by assigning directly to __dict__. Note that keyword lookups already automatically convert Unicode lookup strings to 8-bit using the default encoding. The same should happen here, IMHO.
> It seems there's no easy fix -- better address this after 2.0 is released.

Why wait for 2.1?

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
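The point about `__dict__` can be illustrated with a small sketch. It runs on a modern Python 3 interpreter, so it says nothing about the 8-bit string storage in 2.0; it only shows that a name which cannot be written as an identifier can still reach an instance namespace by direct dictionary assignment. The names `Namespace` and `odd_name` are made up for this example.

```python
class Namespace:
    """Empty class whose instances get an ordinary per-instance __dict__."""


ns = Namespace()

# A name that can never be written as `ns.<name>` in source code.
odd_name = "not an identifier"

# Bypass attribute syntax by writing straight into the namespace dictionary.
vars(ns)[odd_name] = 42

# getattr() takes the name as a string, so the lookup still succeeds.
value = getattr(ns, odd_name)
is_identifier = odd_name.isidentifier()
```

Here `value` ends up as 42 even though `is_identifier` is false, which is the "only via `__dict__`" route mentioned above.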
participants (4)
- Andrew Kuchling
- Fred L. Drake, Jr.
- Guido van Rossum
- M.-A. Lemburg