[Tutor] class decorator question
Steven D'Aprano
steve at pearwood.info
Sun Oct 6 04:52:04 CEST 2013
On Sat, Oct 05, 2013 at 12:26:14PM -0700, Albert-Jan Roskam wrote:
> >> On http://lucumr.pocoo.org/2013/5/21/porting-to-python-3-redux/ I saw
> >> a very cool and useful example of a class decorator. It (re)implements
> >> __str__ and __unicode__ in case Python 2 is used. For Python 3, the
> >> decorator does nothing. I wanted to generalize this decorator so the
> >> __str__ method under Python 2 encodes the string to an arbitrary
> >> encoding. This is what I've created: http://pastebin.com/vghD1bVJ.
> >>
> >> It works, but the code is not very easy to understand, I am afraid.
> >
> >It's easy to understand, it's just doing it the wrong way. It creates
> >a subclass of your class, which it shouldn't do.
>
> Why not? Because it's an unusual coding pattern? Or is it inefficient?
It is both of those things. (Well, the inefficiency is minor.) My
main objection is that it is inelegant, like using a screwdriver as
a chisel instead of using a chisel -- even when it's "good enough",
it's not something you want other people to see you doing if you
care about looking like a craftsman :-)
Another issue is to do with naming. In your example, you decorate Test.
What that means in practice is that you create a new class, Klass(Test),
throw away Test, and bind Klass to the top-level name Test. So in effect
you're doing this:
    class Test: ...          # The undecorated version.
    class Klass(Test): ...   # Subclass it inside the decorator.
    Test = Klass             # Throw away the original and re-use the variable name.
But classes, like functions, have *two* names. They have the name they
are bound to, the variable name (*usually* one of these, but sometimes
zero or two or more). And they have their own internal name:
    Test.__name__
    => returns "Klass"
This will make debugging unnecessarily confusing. If you use your
decorator three times:
    @implements_to_string
    class Spam: ...

    @implements_to_string
    class Eggs: ...

    @implements_to_string
    class Cheese: ...
instances of all three of Spam, Eggs and Cheese will claim to be
instances of "Klass".
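A minimal sketch of the effect. The decorator body here is a simplified stand-in for the subclassing approach, not the pastebin code itself:

```python
def implements_to_string(cls):
    # Simplified stand-in: the real decorator would also add methods.
    class Klass(cls):
        pass
    return Klass


@implements_to_string
class Spam(object):
    pass


@implements_to_string
class Eggs(object):
    pass


# Two distinct classes, but both report the same internal name.
names = [Spam.__name__, Eggs.__name__, type(Spam()).__name__]
print(names)
```

Every entry in the list is "Klass", even though Spam and Eggs are different classes.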
Now there is a simple work-around for this: inside the decorator, call
    Klass.__name__ = cls.__name__
before returning. But that leads to another issue, where instances of
the parent, undecorated, class (if any!) and instances of the child,
decorated, class both claim to be from the same "Test" class. This is
more of a theoretical concern, since you're unlikely to be instantiating
the undecorated parent class.
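To see the ambiguity, here is a sketch that keeps a reference to the undecorated class (the name Original is mine, purely for illustration) and applies the decorator by hand:

```python
def implements_to_string(cls):
    class Klass(cls):
        pass
    Klass.__name__ = cls.__name__   # the work-around
    return Klass


class Test(object):
    pass


Original = Test                      # hang on to the undecorated class
Test = implements_to_string(Test)    # same effect as the @ syntax

parent_name = type(Original()).__name__
child_name = type(Test()).__name__
print(parent_name, child_name, Test is Original)
```

Both names come out as "Test" although the two classes are not the same object. On Python 3.3 and later, note also that the class repr is built from __qualname__, which this work-around leaves untouched.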
> I subclassed because I needed the encoding value in the decorator.
> But subclassing may indeed have been overkill.
Yes :-)
The encoding value isn't actually defined until long after the decorator
has finished doing its work, after the class is decorated, and an
instance is created. So there is no encoding value used in the decorator
itself. The decorator can trivially refer to the encoding value, so long
as that doesn't actually get executed until after an instance is
created:
    def decorate(cls):
        def spam(self):
            print(self.encoding)
        cls.spam = spam
        return cls
works fine without subclassing.
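A usage sketch, assuming a hypothetical Reader class that receives its encoding at instantiation. I return the value instead of printing it so that the result is easy to check:

```python
def decorate(cls):
    def spam(self):
        # self.encoding is looked up only when spam() is called,
        # long after the decorator itself has finished running.
        return self.encoding
    cls.spam = spam
    return cls


@decorate
class Reader(object):   # hypothetical example class
    def __init__(self, encoding):
        self.encoding = encoding


result = Reader("latin-1").spam()
print(result, Reader.__name__)
```

The class keeps its own name, and the injected method sees whatever encoding the instance was given.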
> >Here's a better
> >approach: inject the appropriate methods into the class directly. Here's
> >a version for Python 3:
[...]
> >This avoids overwriting __str__ if it is already defined, and likewise
> >for __bytes__.
>
> Doesn't a class always have __str__ implementation?
No. Where is the __str__ implementation here?
    class X:
        pass
This class defines no methods at all. Its *superclass*, object in Python
3, defines methods such as __str__. But you'll notice that I didn't call
    hasattr(cls, '__str__')
since that will return True, due to object having a __str__ method. I
called
    '__str__' in cls.__dict__
which only returns True if cls explicitly defines a __str__ method.
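The difference between the two checks, as a short sketch (class Y is an extra example of mine):

```python
class X(object):
    pass


class Y(object):
    def __str__(self):
        return "Y instance"


inherited = hasattr(X, "__str__")       # found on object via inheritance
own_x = "__str__" in X.__dict__         # X itself defines nothing
own_y = "__str__" in Y.__dict__         # Y defines its own __str__
print(inherited, own_x, own_y)
```

This prints True False True: hasattr cannot tell an inherited method from one the class defines itself, while the __dict__ test can.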
> Nice, thanks Steven. I made a couple of versions after reading your
> advice. The main change is that I still had to somehow retrieve the
> encoding value from the class to be decorated (decoratee?). I simply
> stored it in __dict__. Here is the second version that I created:
> http://pastebin.com/te3Ap50C. I tested it in Python 2 and 3.
Not sufficiently :-) Your test class has problems. See below.
> The Test
> class contains __str__ and __unicode__ which are renamed and redefined
> by the decorator if Python 3 (or 4, or..) is used.
>
>
> General question: I am using pastebin now. Is that okay, given that
> this is not part of the "memory" of the Python Tutor archive? It might
> be annoying if people search the archives and get 404s if they try to
> follow these links. Just in case I am also pasting the code below:
In my opinion, no it's not okay, particularly if your code is short
enough to be posted here.
Just because a person has access to this mailing list doesn't
necessarily mean they have access to pastebin. It might be blocked. The
site might be down. They might object to websites that require
Javascript (pastebin doesn't *require* it, but it's only a matter of
time...). Or they may simply be too busy/lazy to follow the link.
> from __future__ import print_function
> import sys
>
> def decorate(cls):
>     print("decorate called")
>     if sys.version_info[0] > 2:
>         cls.__dict__["__str__"].__name__ = '__bytes__'
>         cls.__dict__["__unicode__"].__name__ = '__str__'
>         cls.__bytes__ = cls.__dict__["__str__"]
>         cls.__str__ = cls.__dict__["__unicode__"]
>     return cls
I thought your aim was to write something that was cross-version and
that added default __str__ and __unicode__ methods to the class if they
didn't already exist? [looks back at the original code...] Ah no, my
mistake, I misunderstood.
The above requires the caller to write their classes using the Python 2
style __str__ and __unicode__ methods. __unicode__ isn't even mandatory
in Python 2, but your decorator won't work without it!
As given, your decorator:
- does nothing in Python 2, even if the caller didn't define __str__
or __unicode__ methods;
- fails in Python 3 if the class doesn't define a __unicode__ method;
- does the wrong thing in Python 3 if the class already has correctly
working __str__ and __bytes__ methods;
- doesn't help you if you have a Python 3 style class and want to use
it in Python 2;
- doesn't work well if the decorated class inherits its __str__ and
__unicode__ methods from a parent class.
Admittedly, that last one is tricky, thanks to everything inheriting
from object.
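The Python 3 failure is easy to reproduce. This sketch copies the quoted decorator (minus the print call) and applies it to a hypothetical class written in correct Python 3 style:

```python
import sys


def decorate(cls):
    # Same logic as the quoted decorator, without the print call.
    if sys.version_info[0] > 2:
        cls.__dict__["__str__"].__name__ = '__bytes__'
        cls.__dict__["__unicode__"].__name__ = '__str__'
        cls.__bytes__ = cls.__dict__["__str__"]
        cls.__str__ = cls.__dict__["__unicode__"]
    return cls


class Py3Style(object):   # hypothetical class, already correct for Python 3
    def __str__(self):
        return "text"

    def __bytes__(self):
        return b"text"


try:
    decorate(Py3Style)
except KeyError as exc:
    error = exc           # Python 3: there is no __unicode__ to look up
else:
    error = None          # Python 2: the decorator did nothing
print(repr(error))
```

Under Python 3 the lookup of "__unicode__" raises KeyError, and by then the first assignment has already renamed the working __str__ function to "__bytes__", so the class is left half-modified.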
> @decorate
> class Test(object):
>
>     def __init__(self):
>         self.__dict__["encoding"] = self.encoding
Why are you doing that? What is the outcome you are hoping for, and why
do you think it is necessary?
>     def __str__(self):
>         return "str called".encode(self.encoding)
>
>     def __unicode__(self):
>         return "unicode called"
These are wrong! Worse, you have multiple errors that cancel each
other out -- sometimes, two wrongs do make a right.
In Python 2: calling encode on a byte-string is permitted, but is the
wrong thing to do. By accident, it (usually?) works, but you shouldn't
do it. So there's your first wrong.
When converted to Python 3, the __str__ method becomes __bytes__, and is
supposed to return bytes. Now the "str called" literal is Unicode, and
encode will work, returning bytes. But it only works because of the
first wrong -- if you re-write __str__ to use b"str called", or to call
"str called".decode, your Python 3 __bytes__ method will fail.
In Python 2, __unicode__ ought to return a unicode string, u"unicode
called". By accident, if you return a byte string, Python will decode it
using ASCII, and it seems to work. But it's still wrong, and it's
particularly likely to go wrong if the __unicode__ method does any,
well, Unicode stuff.
When converted to __str__ by the decorator, the ex-__unicode__ method
will work, but only because you used a (Python2) byte-string literal
"..." inside it. If you wrote a u"Unicode string", it would fail in
Python 3.1 or 3.2 (but work in 3.3 and better).
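The Python 3 side of the encode/decode point can be checked directly. A small sketch (the variable names are mine); the comments note what I would expect on each version:

```python
# On Python 3, bytes objects have no encode method and text strings
# have no decode method. On Python 2, the str type has both.
bytes_can_encode = hasattr(b"str called", "encode")
text_can_decode = hasattr("str called", "decode")

# Encoding a plain "..." literal gives bytes on Python 3. On Python 2 the
# literal is already a byte string, which is the first "wrong" above.
encoded = "str called".encode("utf-8")

print(bytes_can_encode, text_can_decode, encoded)
```

So a __str__ method "fixed" for Python 2 by using b"str called" or by calling .decode stops working once the decorator turns it into a Python 3 __bytes__ method.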
>     @property
>     def encoding(self):
>         """In reality this method extracts the encoding from a file"""
>         return "utf-8" # rot13 no longer exists in Python3
Why would you do that?
Why not just supply the encoding when you initialise the instance?
    def __init__(self, encoding):
        self.encoding = encoding
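For illustration, a Python 3 style sketch of a class that takes its encoding this way. The Document name and the sample text are made up, and this is only one possible layout:

```python
class Document(object):   # hypothetical example class, Python 3 style
    def __init__(self, encoding):
        self.encoding = encoding

    def __str__(self):
        return "caf\xe9"                        # text, with one non-ASCII character

    def __bytes__(self):
        return str(self).encode(self.encoding)  # bytes in the chosen encoding


utf8_bytes = bytes(Document("utf-8"))
latin1_bytes = bytes(Document("latin-1"))
print(utf8_bytes, latin1_bytes)
```

The same text comes out as different bytes depending on the encoding handed to each instance, with no property and no writing into __dict__.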
> if __name__ == "__main__":
>     t = Test()
>     if sys.version_info[0] == 2:
>         print(unicode(t))
>     print(str(t))
This is insufficient testing. In Python 2, you need to test both
unicode(t) and str(t). In Python 3, you need to test both str(t) and
bytes(t).
It may turn out that, by accident, all four tests work for the given
Test class. But that's not going to apply to everything.
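A rough sketch of a fuller check that exercises both conversions on whichever version is running. The Sample class and the check_conversions helper are hypothetical names of mine, and the non-ASCII text is there so that an accidental ASCII round-trip cannot hide a mistake:

```python
import sys

PY2 = sys.version_info[0] == 2
text_type = unicode if PY2 else str   # `unicode` is only evaluated on Python 2


class Sample(object):
    """Hypothetical class with correct methods for the running version."""
    encoding = "utf-8"

    def _as_text(self):
        return u"caf\xe9"   # the u prefix needs Python 2 or Python 3.3+

    def _as_bytes(self):
        return self._as_text().encode(self.encoding)

    if PY2:
        __unicode__, __str__ = _as_text, _as_bytes
    else:
        __str__, __bytes__ = _as_text, _as_bytes


def check_conversions(obj, encoding):
    """Call both conversions and check the types that come back."""
    if PY2:
        text, raw = unicode(obj), str(obj)
    else:
        text, raw = str(obj), bytes(obj)
    assert isinstance(text, text_type)
    assert isinstance(raw, bytes)       # bytes is an alias of str on Python 2
    assert raw.decode(encoding) == text
    return text, raw


text, raw = check_conversions(Sample(), "utf-8")
```

Checking the returned types, not just that the calls succeed, is what catches the "two wrongs" cases described earlier.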
--
Steven