[Python-Dev] hasattr() and Unicode strings

M.-A. Lemburg mal@lemburg.com
Fri, 08 Sep 2000 14:09:03 +0200


Guido van Rossum wrote:
> 
> > hasattr(), getattr(), and doubtless other built-in functions
> > don't accept Unicode strings at all:
> >
> > >>> import sys
> > >>> hasattr(sys, u'abc')
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > TypeError: hasattr, argument 2: expected string, unicode found
> >
> > Is this a bug or a feature?  I'd say bug; the Unicode should be
> > coerced using the default ASCII encoding, and an exception raised if
> > that isn't possible.
> 
> Agreed.
> 
> There are probably a bunch of things that need to be changed before
> this works though; getattr() c.s. require a string, then call
> PyObject_GetAttr() which also checks for a string unless the object
> supports tp_getattro -- but that's only true for classes and
> instances.
> 
> Also, should we convert the string to 8-bit, or should we allow
> Unicode attribute names?

Attribute names will have to be 8-bit strings (at least in 2.0).

The reason here is that attributes are normally Python identifiers
which are plain ASCII and stored as 8-bit strings in the namespace
dictionaries, i.e. there's no way to add Unicode attribute names
other than by assigning directly to __dict__.
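
To illustrate the namespace-dictionary point, here is a rough sketch
written for present-day Python, where all attribute names are Unicode
str objects, so it shows only the mechanism and not the 2.0 behaviour.
Holder and the key "not an identifier" are hypothetical, chosen for the
example:

```python
class Holder:
    """Hypothetical empty class used only for illustration."""


h = Holder()

# Ordinary attribute assignment stores the identifier as a key in the
# instance's namespace dictionary.
h.spam = 1
assert h.__dict__["spam"] == 1

# Writing to __dict__ directly bypasses identifier syntax, so a key
# that could never appear after a dot can still end up in the namespace.
odd = "not an identifier"
assert not odd.isidentifier()
h.__dict__[odd] = 2

# getattr() simply looks the name up as a dictionary key.
assert getattr(h, odd) == 2
```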

Note that keyword lookups already automatically convert Unicode
lookup strings to 8-bit using the default encoding. The same should
happen here, IMHO.
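
To make the proposed coercion concrete, here is a minimal sketch in
present-day Python (str is the Unicode type, bytes the 8-bit one). The
names coerce_attr_name and hasattr_coerced are hypothetical, ASCII is
assumed as the default encoding, and this is not the interpreter's
actual C code:

```python
import sys


def coerce_attr_name(name, encoding="ascii"):
    """Hypothetical helper: return the attribute name as an 8-bit string."""
    if isinstance(name, bytes):
        return name
    if isinstance(name, str):
        # Raises UnicodeEncodeError if the name is not representable
        # in the assumed default encoding.
        return name.encode(encoding)
    raise TypeError(
        "attribute name must be a string, %s found" % type(name).__name__
    )


def hasattr_coerced(obj, name):
    """hasattr() variant that applies the coercion before the lookup."""
    key = coerce_attr_name(name)
    # Modern Python keys namespaces by str, so decode back for the lookup.
    return hasattr(obj, key.decode("ascii"))


print(hasattr_coerced(sys, "path"))   # attribute exists
print(hasattr_coerced(sys, "abc"))    # attribute does not exist

try:
    hasattr_coerced(sys, "caf\u00e9")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

The point of the sketch is only the order of operations: coerce first,
fail loudly if the name cannot be encoded, then do the normal lookup.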
 
> It seems there's no easy fix -- better address this after 2.0 is
> released.

Why wait for 2.1?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/