OT: Performance vs. Clarity vs. Convention
Forgive me if this is slightly off-topic for this list, but since we've been talking about migration guides and coding idioms and tweaking performance and such, I've got a few questions I'd like to ask. I'll start with an actual code sample. This is a very simple class that's part of an xhtml toolkit I'm writing.

    class Comment:

        def __init__(self, content=''):
            self.content = content

        def __call__(self, content=''):
            o = self.__class__(content)
            return str(o)

        def __str__(self):
            return '<!-- %s -->' % self.content

        def __repr__(self):
            return repr(self.__str__())

When I look at this, I see certain decisions I've made and I'm wondering if I've made the best decisions. I'm wondering how to balance performance against clarity and proper coding conventions.

1. In the __call__ I save a reference to the object. Instead, I could simply:

    return str(self.__class__(content))

Is there much of a performance impact by explicitly naming intermediate references? (I need some of Tim Peters's performance testing scripts.)

2. I chose the slightly indirect str(o) instead of o.__str__(). Is this slower? Is one style preferred over the other and why?

3. I used a format string, '<!-- %s -->' % self.content, where I could just as easily have concatenated '<!-- ' + self.content + ' -->' instead. Is one faster than the other?

4. Is there any documentation that covers these kinds of issues where there is more than one way to do something? I'd like to have some foundation for making these decisions. As you can probably guess, I usually hate having more than one way to do anything. ;-)

---
Patrick K. O'Brien
Orbtech
    class Comment:
        def __init__(self, content=''):
            self.content = content
        def __call__(self, content=''):
            o = self.__class__(content)
            return str(o)
        def __str__(self):
            return '<!-- %s -->' % self.content
        def __repr__(self):
            return repr(self.__str__())
When I look at this, I see certain decisions I've made and I'm wondering if I've made the best decisions. I'm wondering how to balance performance against clarity and proper coding conventions.
1. In the __call__ I save a reference to the object. Instead, I could simply:
return str(self.__class__(content))
Is there much of a performance impact by explicitly naming intermediate references? (I need some of Tim Peters's performance testing scripts.)
Since o is a "fast local" (all locals are fast locals except when a function uses exec or import *), it is very fast. The load and store of fast locals are about the fastest opcodes around. I am more worried about the inefficiency of instantiating self.__class__ and then throwing it away after calling str() on it. You could factor out the body of __str__ into a separate method so that you can invoke it from __call__ without creating an instance.
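A minimal sketch of the factoring suggested above, assuming the formatting logic moves into a helper that takes the content as an argument. The method name render is hypothetical and does not come from the original toolkit.

```python
class Comment:

    def __init__(self, content=''):
        self.content = content

    def render(self, content):
        # Hypothetical helper: holds what used to be the body of __str__,
        # but formats whatever content it is given.
        return '<!-- %s -->' % content

    def __call__(self, content=''):
        # No throwaway instance: format the argument directly.
        return self.render(content)

    def __str__(self):
        return self.render(self.content)


comment = Comment()
print(comment('called like a function'))
print(str(Comment('used like an object')))
```

With this layout, calling the shared instance never reads or changes its own content attribute, so each call still behaves like a plain function.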
2. I chose the slightly indirect str(o) instead of o.__str__(). Is this slower? Is one style preferred over the other and why?
str(o) is preferred. I would say that you should never call __foo__ methods directly except when you're overriding a base class's __foo__ method.
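To illustrate the one exception named above, here is a small sketch in which a subclass overrides __str__ and calls the base class's __str__ directly, while ordinary callers go through str(). The class names Base and Derived are made up for the example.

```python
class Base:
    def __str__(self):
        return 'base'


class Derived(Base):
    def __str__(self):
        # Overriding: a direct call to the base class's __str__ is the
        # accepted use of a double-underscore method by name.
        return 'derived(' + Base.__str__(self) + ')'


d = Derived()
# Everywhere else, prefer the built-in over the special method.
print(str(d))
```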
3. I used a format string, '<!-- %s -->' % self.content, where I could just as easily have concatenated '<!-- ' + self.content + ' -->' instead. Is one faster than the other?
You could time it. My personal belief is that for more than one + operator, %s is faster.
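One way to time it, sketched with the standard library's timeit module (which appeared in Python releases later than this thread, so its use here is an assumption about the reader's interpreter). The sample string and the repetition count are arbitrary choices, and the numbers will vary by machine and Python version.

```python
import timeit

content = 'hello'  # arbitrary sample value, not from the original post


def with_format():
    return '<!-- %s -->' % content


def with_concat():
    return '<!-- ' + content + ' -->'


# Both spellings must produce the same string before timing means anything.
assert with_format() == with_concat()

format_time = timeit.timeit(with_format, number=100000)
concat_time = timeit.timeit(with_concat, number=100000)

print('format: %.4f s' % format_time)
print('concat: %.4f s' % concat_time)
```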
4. Is there any documentation that covers these kinds of issues where there is more than one way to do something? I'd like to have some foundation for making these decisions. As you can probably guess, I usually hate having more than one way to do anything. ;-)
I'm not aware of documentation, and I think you should give yourself some credit for having a personal opinion. Study the standard library and you'll get an idea of what's "done" and what's "not done".
BTW I have another gripe about your example.
    def __str__(self):
        return '<!-- %s -->' % self.content
    def __repr__(self):
        return repr(self.__str__())
This definition of __repr__ makes no sense to me -- all it does is add string quotes around the contents of the string (and escape non-printing characters and quotes if there are any). That is confusing, because it will appear to the reader as if the object is a string. You probably should write __repr__ = __str__ instead.
--Guido van Rossum (home page: http://www.python.org/~guido/)
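A short sketch of the difference being described, using two stand-in classes whose names (QuotedRepr and SharedRepr) are invented for the comparison.

```python
class QuotedRepr:
    def __init__(self, content=''):
        self.content = content

    def __str__(self):
        return '<!-- %s -->' % self.content

    def __repr__(self):
        # The definition from the original post: repr() of the string.
        return repr(self.__str__())


class SharedRepr:
    def __init__(self, content=''):
        self.content = content

    def __str__(self):
        return '<!-- %s -->' % self.content

    # The suggested alternative: reuse __str__ unchanged.
    __repr__ = __str__


print(repr(QuotedRepr('note')))  # prints '<!-- note -->' with the quotes
print(repr(SharedRepr('note')))  # prints <!-- note --> without quotes
```

At the interactive prompt the first form is indistinguishable from a plain string, which is the confusion the reply points out.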
[Guido van Rossum]
I am more worried about the inefficiency of instantiating self.__class__ and then throwing it away after calling str() on it. You could factor out the body of __str__ into a separate method so that you can invoke it from __call__ without creating an instance.
Some more code from the module might help explain this design decision. I'm still sort of toying with this to see if I like it. The basic idea here is that I'm trying to support both DOM-like xhtml structures as well as simple function-like callables that return strings. When the instance is called it needs a fresh state in order to better mimic a true function. It isn't immediately obvious to me how I might refactor this to avoid instantiating a throwaway.

    class Element:

        def __init__(self, klass, id, style, title):
            self.name = self.__class__.__name__.lower()
            self.attrs = {
                'class': klass,  # Space-separated list of classes.
                'id': id,        # Document-wide unique id.
                'style': style,  # Associated style info.
                'title': title,  # Advisory title/amplification.
                }

        def attrstring(self):
            attrs = self.attrs.keys()
            attrs.sort()  # Sorting is only cosmetic, not required.
            l = []  # List of formatted attribute/value pairs.
            for attr in attrs:
                value = self.attrs[attr]
                if value is not None and value != '':
                    l += ['%s="%s"' % (attr, convert(value))]
            s = ' ' + ' '.join(l)  # Prepend a single space.
            return s.rstrip()  # Reduce to an empty string if no attrs.

        def __str__(self):
            pass

        def __repr__(self):
            return repr(self.__str__())

    class EmptyElement(Element):

        def __init__(self, klass=None, id=None, style=None, title=None):
            Element.__init__(self, klass, id, style, title)

        def __call__(self, klass=None, id=None, style=None, title=None):
            o = self.__class__(klass, id, style, title)
            return str(o)

        def __str__(self):
            attrstring = self.attrstring()
            return '<%s%s />\n' % (self.name, attrstring)

    class SimpleElement(Element):

        def __init__(self, content='', klass=None, id=None, style=None, title=None):
            self.content = content
            Element.__init__(self, klass, id, style, title)

        def __call__(self, content='', klass=None, id=None, style=None, title=None):
            o = self.__class__(content, klass, id, style, title)
            return str(o)

        def __str__(self):
            attrstring = self.attrstring()
            return '<%s%s>\n%s\n</%s>\n' % \
                   (self.name, attrstring, convert(self.content), self.name)

    class Br(EmptyElement): pass
    class Hr(EmptyElement): pass
    class P(SimpleElement): pass

    # The following singleton instances are callable, returning strings.
    # They can be used like simple functions to return properly tagged contents.

    br = Br()
    comment = Comment()
    hr = Hr()
    p = P()
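One possible answer to the refactoring question above, sketched for EmptyElement only. It assumes the attribute formatting can be moved into a helper that accepts any attribute dict. The helper name format_attrs is hypothetical, and convert() here is a stand-in, since the real one is not shown in the post.

```python
def convert(value):
    # Stand-in for the toolkit's convert(), which the post does not show.
    return str(value)


class Element:

    def __init__(self, klass=None, id=None, style=None, title=None):
        self.name = self.__class__.__name__.lower()
        self.attrs = {'class': klass, 'id': id, 'style': style, 'title': title}

    def format_attrs(self, attrs):
        # Hypothetical helper: same output as attrstring(), but for any dict.
        pairs = ['%s="%s"' % (name, convert(attrs[name]))
                 for name in sorted(attrs)
                 if attrs[name] is not None and attrs[name] != '']
        return (' ' + ' '.join(pairs)).rstrip()


class EmptyElement(Element):

    def __call__(self, klass=None, id=None, style=None, title=None):
        # Format the call arguments directly; no throwaway instance.
        attrs = {'class': klass, 'id': id, 'style': style, 'title': title}
        return '<%s%s />\n' % (self.name, self.format_attrs(attrs))

    def __str__(self):
        return '<%s%s />\n' % (self.name, self.format_attrs(self.attrs))


class Br(EmptyElement):
    pass


br = Br()
print(br(klass='x', id='y'))
```

The call path still starts from a fresh set of attributes on every call, so the singleton keeps mimicking a plain function.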
BTW I have another gripe about your example.
    def __str__(self):
        return '<!-- %s -->' % self.content
    def __repr__(self):
        return repr(self.__str__())
This definition of __repr__ makes no sense to me -- all it does is add string quotes around the contents of the string (and escape non-printing characters and quotes if there are any). That is confusing, because it will appear to the reader as if the object is a string.
Yes. This was a conscious design choice for this particular application. Maybe there is a better way, and maybe I'm not being too Pythonic, but I'm not particularly troubled by this even though I know I'm "breaking the rules". I guess I don't mind if there is more than one way to do something, as long as one way is the Python way and the other way is my way. ;-)
---
Patrick K. O'Brien
Orbtech
On Wed, Jun 05, 2002, Patrick K. O'Brien wrote:
    class Element:
        def __str__(self): pass
Dunno about other people's opinions, but I have a strong distaste for creating methods whose body contains pass. I always use "raise NotImplementedError".
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"In the end, outside of spy agencies, people are far too trusting and willing to help." --Ira Winkler
[Aahz]
On Wed, Jun 05, 2002, Patrick K. O'Brien wrote:
    class Element:
        def __str__(self): pass
Dunno about other people's opinions, but I have a strong distaste for creating methods whose body contains pass. I always use "raise NotImplementedError".
I agree. That's a bad habit of mine that I need to change. Thanks for the reminder.
---
Patrick K. O'Brien
Orbtech
    class Element:
        def __str__(self): pass
Dunno about other people's opinions, but I have a strong distaste for creating methods whose body contains pass. I always use "raise NotImplementedError".
But that has different semantics!
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Thu, Jun 06, 2002, Guido van Rossum wrote:
    class Element:
        def __str__(self): pass
Dunno about other people's opinions, but I have a strong distaste for creating methods whose body contains pass. I always use "raise NotImplementedError".
But that has different semantics!
Yes, exactly. My point was that one rarely wants the semantics of "pass" for method definitions, and that goes double or triple for the special methods such as __str__. Consider what happens to an application that calls str() on this object and gets back a None instead of a string. Blech -- errors should never pass silently.
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"I had lots of reasonable theories about children myself, until I had some." --Michael Rios
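A minimal sketch of the convention argued for above: the base class raises NotImplementedError from __str__, and a subclass supplies the real implementation. The error message text and the simplified Br class are illustrative, not taken from the toolkit.

```python
class Element:
    def __str__(self):
        # Fail loudly and name the problem, instead of returning None.
        raise NotImplementedError('subclasses must define __str__')


class Br(Element):
    def __str__(self):
        return '<br />\n'


try:
    str(Element())
except NotImplementedError as exc:
    message = str(exc)
    print('base class refused: %s' % message)

print(str(Br()))
```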
[Guido van Rossum]
def __str__(self): pass
Dunno about other people's opinions, but I have a strong distaste for creating methods whose body contains pass. I always use "raise NotImplementedError".
But that has different semantics!
In this particular case, the program blows up anyway if this method is ever called, so you might as well return a meaningful exception!

    Python 2.2 (#14, May 28 2002, 14:11:27)
    [GCC 2.95.2 19991024 (release)] on sunos5
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class C:
    ...     def __str__(self):
    ...         pass
    ...
    >>> c = C()
    >>> str(c)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: __str__ returned non-string (type NoneType)
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
Yes. This was a conscious design choice for this particular application. Maybe there is a better way, and maybe I'm not being too Pythonic, but I'm not particularly troubled by this even though I know I'm "breaking the rules".
Maybe you shouldn't ask for advice if you have it all worked out already? :-)
--Guido van Rossum (home page: http://www.python.org/~guido/)
participants (4)
- Aahz
- Greg Ewing
- Guido van Rossum
- Patrick K. O'Brien