[Python-3000] PyUnicodeObject implementation

Stefan Behnel stefan_ml at behnel.de
Sun Sep 7 16:58:13 CEST 2008


Hi,

Guido van Rossum wrote:
> On Sun, Sep 7, 2008 at 12:15 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> Antoine Pitrou wrote:
>>> Also note that Marc-André Lemburg (one of the authors of the unicode
>>> implementation) is opposed to that change. See the discussion in the bug tracker
>>> issue for the details.
>> From a Cython perspective, I find the lack of efficient subclassing after such
>> a change particularly striking. That seriously bit me in Py2 when I tried
>> making XML text content a bit more intelligent in lxml (i.e. make it remember
>> what XML element it originated from). Having the same problem for unicode in
>> Py3 doesn't sound like a good idea to me.
> 
> Can you explain this a bit more? I presume you're talking about
> subclassing in C

Yes, I mentioned Cython above.


> I do note that the mechanisms that exist for supporting adding a __dict__
> to a str (in 2.x; or bytes in 3.x) or a tuple could be extended for other
> purposes.

I never looked into these, but this does not sound like it would impact
subclassing.


> Also, please explain why instead of subclassing you couldn't use a
> wrapper class? (I.e. use containment instead of inheritance.)

Because users will expect that the return values can be passed into anything
that accepts a string, which covers far more cases than a wrapper class could
intercept. There are tons of C-level APIs inside and outside of Python itself
that require strings for certain operations and will not accept any other
object. Just think of passing a wrapper object as the type name of a newly
created type.
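
To make that last example concrete, here is a minimal sketch in plain Python.
The names ElementText and TextWrapper are made up for illustration and are not
lxml's actual classes; the only assumption is that type() checks its first
argument at the C level, as CPython does.

```python
class ElementText(str):
    """Hypothetical str subclass that remembers where it came from."""

    def __new__(cls, value, parent=None):
        self = super().__new__(cls, value)
        self.parent = parent  # extra state carried along with the string
        return self


class TextWrapper:
    """Hypothetical wrapper: contains a str instead of inheriting from it."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent

    def __str__(self):
        return self.value


# A str subclass passes the C-level type check for the type name.
Sub = type(ElementText("Node", parent="root"), (), {})

# A wrapper object is rejected, even though it defines __str__.
try:
    type(TextWrapper("Node", parent="root"), (), {})
    wrapper_accepted = True
except TypeError:
    wrapper_accepted = False
```

The subclass instance goes through unchanged, while the wrapper only works
where the caller remembers to convert it with str() first, which is exactly
what users should not have to do.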

Stefan


