[Tutor] class decorator question

Steven D'Aprano steve at pearwood.info
Sun Oct 6 04:52:04 CEST 2013


On Sat, Oct 05, 2013 at 12:26:14PM -0700, Albert-Jan Roskam wrote:

> >> On http://lucumr.pocoo.org/2013/5/21/porting-to-python-3-redux/ I saw 
> >> a very cool and useful example of a class decorator. It (re)implements 
> >> __str__ and __unicode__ in case Python 2 is used. For Python 3, the 
> >> decorator does nothing. I wanted to generalize this decorator so the 
> >> __str__ method under Python 2 encodes the string to an arbitrary 
> >> encoding. This is what I've created: http://pastebin.com/vghD1bVJ.
> >> 
> >> It works, but the code is not very easy to understand, I am afraid. 
> >
> >It's easy to understand, it's just doing it the wrong way. It creates 
> >a subclass of your class, which it shouldn't do. 
> 
> Why not? Because it's an unusual coding pattern? Or is it inefficient?

It is both of those things. (Well, the inefficiency is minor.) My 
main objection is that it is inelegant, like using a screwdriver as 
a chisel instead of using a chisel -- even when it's "good enough", 
it's not something you want other people to see you doing if you 
care about looking like a craftsman :-)

Another issue is to do with naming. In your example, you decorate Test. 
What that means in practice is that you create a new class, Klass(Test), 
throw away Test, and bind Klass to the top-level name Test. So in effect 
you're doing this:

class Test(object): pass  # The undecorated version.

class Klass(Test): pass  # Subclass it inside the decorator.

Test = Klass  # Throw away the original and re-use the variable name.

But classes, like functions, have *two* names. They have the name they 
are bound to, the variable name (*usually* one of these, but sometimes 
zero or two or more). And they have their own internal name:

Test.__name__
=> returns "Klass"


This will make debugging unnecessarily confusing. If you use your 
decorator three times:

@implements_to_string
class Spam(object): pass

@implements_to_string
class Eggs(object): pass

@implements_to_string
class Cheese(object): pass


instances of all three of Spam, Eggs and Cheese will claim to be 
instances of "Klass".
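
To see the effect for yourself, here is a minimal sketch. The decorator body 
is a hypothetical stand-in that only does the subclassing step, not the real 
implements_to_string from the pastebin:

```python
def implements_to_string(cls):
    # Hypothetical stand-in: subclass the decorated class and return
    # the subclass, as the decorator under discussion does.
    class Klass(cls):
        pass
    return Klass

@implements_to_string
class Spam(object):
    pass

@implements_to_string
class Eggs(object):
    pass

# Both top-level names are now bound to classes internally named "Klass".
names = [Spam.__name__, Eggs.__name__, type(Spam()).__name__]
print(names)  # ['Klass', 'Klass', 'Klass']
```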

Now there is a simple work-around for this: inside the decorator, call

Klass.__name__ = cls.__name__ 

before returning. But that leads to another issue, where instances of 
the parent, undecorated, class (if any!) and instances of the child, 
decorated, class both claim to be from the same "Test" class. This is 
more of a theoretical concern, since you're unlikely to be instantiating 
the undecorated parent class.
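
A short sketch of the work-around and the ambiguity it leaves behind. Again 
the decorator is a hypothetical stand-in, and the name Original exists only 
so this example can still reach the undecorated class:

```python
def implements_to_string(cls):
    # Hypothetical stand-in for the subclassing decorator.
    class Klass(cls):
        pass
    Klass.__name__ = cls.__name__  # the work-around
    return Klass

class Test(object):
    pass

Original = Test                    # keep a reference to the parent class
Test = implements_to_string(Test)  # same effect as the @ syntax

parent_name = type(Original()).__name__
child_name = type(Test()).__name__
print(parent_name, child_name)     # Test Test
```

Two different classes now report the same name. On Python 3.3 and later, 
repr(Test) is built from __qualname__, so it would still mention Klass unless 
that attribute is copied as well.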


> I subclassed because I needed the encoding value in the decorator.
> But subclassing may indeed have been overkill.

Yes :-)

The encoding value isn't actually defined until long after the decorator 
has finished doing its work, after the class is decorated and an 
instance is created. So no encoding value is used in the decorator 
itself. The decorator can trivially refer to the encoding value, so long 
as that code doesn't actually get executed until after an instance is 
created:

def decorate(cls):
    def spam(self):
        print(self.encoding)
    cls.spam = spam
    return cls

works fine without subclassing.
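
For example, assuming a class that sets encoding in __init__ (the Test class 
here is made up for illustration, and spam returns the value instead of 
printing it so the result can be checked):

```python
def decorate(cls):
    def spam(self):
        # self.encoding is looked up only when spam() is called,
        # long after the decorator has returned.
        return self.encoding
    cls.spam = spam
    return cls

@decorate
class Test(object):
    def __init__(self, encoding):
        self.encoding = encoding

result = Test("utf-8").spam()
print(result, Test.__name__)  # utf-8 Test
```

The class keeps its own name because nothing was subclassed or replaced.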



> >Here's a better 
> >approach: inject the appropriate methods into the class directly. Here's 
> >a version for Python 3:
[...]
> >This avoids overwriting __str__ if it is already defined, and likewise 
> >for __bytes__.
> 
> Doesn't a class always have __str__ implementation?

No. Where is the __str__ implementation here?

class X:
    pass

This class defines no methods at all. Its *superclass*, object in Python 
3, defines methods such as __str__. But you'll notice that I didn't call 

    hasattr(cls, '__str__') 

since that will return True, due to object having a __str__ method. I 
called

    '__str__' in cls.__dict__

which only returns True if cls explicitly defines a __str__ method.
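
A quick sketch of the difference between the two checks, using two throwaway 
classes:

```python
class X(object):
    pass

class Y(object):
    def __str__(self):
        return "a Y instance"

# hasattr follows inheritance, so it finds object.__str__ on X.
inherited = hasattr(X, '__str__')
# The class __dict__ only lists what the class itself defines.
x_defines = '__str__' in X.__dict__
y_defines = '__str__' in Y.__dict__
print(inherited, x_defines, y_defines)  # True False True
```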


> Nice, thanks Steven. I made a couple of versions after reading your 
> advice. The main change is that I still had to somehow retrieve the 
> encoding value from the class to be decorated (decoratee?). I simply 
> stored it in __dict__. Here is the second version that I created: 
> http://pastebin.com/te3Ap50C. I tested it in Python 2 and 3. 

Not sufficiently :-) Your test class has problems. See below.



> The Test 
> class contains __str__ and __unicode__ which are renamed and redefined 
> by the decorator if Python 3 (or 4, or..) is used.
> 
> 
> General question: I am using pastebin now. Is that okay, given that 
> this is not part of the "memory" of the Python Tutor archive? It might 
> be annoying if people search the archives and get 404s if they try to 
> follow these links. Just in case I am also pasting the code below:

In my opinion, no it's not okay, particularly if your code is short 
enough to be posted here.

Just because a person has access to this mailing list doesn't 
necessarily mean they have access to pastebin. It might be blocked. The 
site might be down. They might object to websites that require 
Javascript (pastebin doesn't *require* it, but it's only a matter of 
time...). Or they may simply be too busy/lazy to follow the link.


> from __future__ import print_function
> import sys
>     
> def decorate(cls):
>     print("decorate called")
>     if sys.version_info[0] > 2:
>         cls.__dict__["__str__"].__name__ = '__bytes__'
>         cls.__dict__["__unicode__"].__name__ = '__str__'
>         cls.__bytes__ = cls.__dict__["__str__"]
>         cls.__str__ = cls.__dict__["__unicode__"]  
>     return cls

I thought your aim was to write something that was cross-version and 
that added default __str__ and __unicode__ methods to the class if they 
didn't already exist? [looks back at the original code...] Ah no, my 
mistake, I misunderstood.

The above requires the caller to write their classes using the Python 2 
style __str__ and __unicode__ methods. __unicode__ isn't even mandatory 
in Python 2, but your decorator won't work without it!

As given, your decorator:

- does nothing in Python 2, even if the caller didn't define __str__ 
  or __unicode__ methods;

- fails in Python 3 if the class doesn't define a  __unicode__ method;

- does the wrong thing in Python 3 if the class already has correctly 
  working __str__ and __bytes__ methods;

- doesn't help you if you have a Python 3 style class and want to use
  it in Python 2;

- doesn't work well if the decorated class inherits its __str__ and 
  __unicode__ methods from a parent class.


Admittedly, that last one is tricky, thanks to everything inheriting 
from object.
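
To illustrate the "don't clobber what is already there" point, here is a 
rough Python 3 only sketch. It is not the version elided above, and the names 
add_bytes, Greeting and Raw are invented for the example. It assumes 
instances carry an encoding attribute:

```python
def add_bytes(cls):
    """Hypothetical decorator: inject a default __bytes__, no subclassing."""
    # Only add the method if the class itself does not define one.
    if '__bytes__' not in cls.__dict__:
        def __bytes__(self):
            return str(self).encode(self.encoding)
        cls.__bytes__ = __bytes__
    return cls

@add_bytes
class Greeting(object):
    def __init__(self, encoding):
        self.encoding = encoding
    def __str__(self):
        return "caf\u00e9"

@add_bytes
class Raw(object):
    def __bytes__(self):  # already correct, so it must be left alone
        return b"raw"

greeting_bytes = bytes(Greeting("utf-8"))
raw_bytes = bytes(Raw())
print(greeting_bytes, raw_bytes)
```

This handles only one of the cases listed above. A class that inherits 
__bytes__ from a parent would still get a new one injected, which may or may 
not be what you want.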


> @decorate
> class Test(object):
> 
>     def __init__(self):
>         self.__dict__["encoding"] = self.encoding

Why are you doing that? What is the outcome you are hoping for, and why 
do you think it is necessary?


>     def __str__(self):
>         return "str called".encode(self.encoding)
> 
>     def __unicode__(self):
>         return "unicode called"

These are wrong! Worse, you have multiple errors that cancel each 
other out -- sometimes, two wrongs do make a right.

In Python 2: calling encode on a byte-string is permitted, but is the 
wrong thing to do. By accident, it (usually?) works, but you shouldn't 
do it. So there's your first wrong.

When converted to Python 3, the __str__ method becomes __bytes__, and is 
supposed to return bytes. Now the "str called" literal is Unicode, and 
encode will work, returning bytes. But it only works because of the 
first wrong -- if you re-write __str__ to use b"str called", or to call 
"str called".decode, your Python 3 __bytes__ method will fail.

In Python 2, __unicode__ ought to return a unicode string, u"unicode 
called". By accident, if you return a byte string, Python will decode it 
using ASCII, and it seems to work. But it's still wrong, and it's 
particularly likely to go wrong if the __unicode__ method does any, 
well, Unicode stuff. 

When converted to __str__ by the decorator, the ex-__unicode__ method 
will work, but only because you used a (Python2) byte-string literal 
"..." inside it. If you wrote a u"Unicode string", it would fail in 
Python 3.0 to 3.2 (but work in 3.3 and later, where the u prefix is 
accepted again).
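
The Python 3 half of the encode story can be checked directly. This is a 
small sketch for Python 3 only, and the variable names are made up:

```python
text = "str called"              # a text (Unicode) string in Python 3
encoded = text.encode("utf-8")   # fine: text -> bytes

try:
    b"str called".encode("utf-8")
    bytes_can_encode = True
except AttributeError:
    # In Python 3, bytes objects have decode() but no encode().
    bytes_can_encode = False

print(type(encoded).__name__, bytes_can_encode)  # bytes False
```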


>     @property
>     def encoding(self):
>         """In reality this method extracts the encoding from a file"""
>         return "utf-8" # rot13 no longer exists in Python3

Why would you do that?

Why not just supply the encoding when you initialise the instance?

    def __init__(self, encoding):
        self.encoding = encoding


> if __name__ == "__main__":
>     t = Test()
>     if sys.version_info[0] == 2:
>         print(unicode(t))
>     print(str(t))

This is insufficient testing. In Python 2, you need to test both 
unicode(t) and str(t). In Python 3, you need to test both str(t) and 
bytes(t).

It may turn out that, by accident, all four tests work for the given 
Test class. But that's not going to apply to everything.
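
As a rough idea of what the Python 3 half of such a test could look like, 
here is a sketch. This Test class is a hand-written Python 3 style class 
invented for the example, not your decorated one, and it deliberately 
includes a non-ASCII character:

```python
class Test(object):
    def __init__(self, encoding):
        self.encoding = encoding
    def __str__(self):
        return "unicode called \u2713"
    def __bytes__(self):
        return str(self).encode(self.encoding)

t = Test("utf-8")

# Check both conversions, including the type of each result.
assert isinstance(str(t), str)
assert isinstance(bytes(t), bytes)
# The two must agree once the bytes are decoded again.
assert bytes(t).decode(t.encoding) == str(t)
```

Under Python 2 the equivalent checks would be on unicode(t) and str(t). 
Non-ASCII test data matters, because pure ASCII strings hide most of the 
accidental successes described above.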


 
-- 
Steven

